Patchwork Fix ipa-split handling of non-ssa vars; allow splitting of functions returning agregate

login
register
mail settings
Submitter Jan Hubicka
Date June 30, 2010, 7:08 p.m.
Message ID <20100630190818.GM6233@kam.mff.cuni.cz>
Download mbox | patch
Permalink /patch/57449/
State New
Headers show

Comments

Jan Hubicka - June 30, 2010, 7:08 p.m.
Hi,
this patch makes ipa-split smarter about non-ssa vars (so there are only 3
splits disqualified on compilation of tramp3d).  ipa-split collects all auto
non-ssa vars used in split part and tries to prove that header does not use any
of them.  There is bug in conditional of continue statement, so it compares
split part with split part that makes it to disqualify any splitting having
non-ssa locals.

Fixing this is not enough to enable splitting of non-trivial functions
returning aggregates (as one in testcase bellow).  THe problem is that
aggregate is constructed in temporary and same var is used in split part and in
header (in path leading directly to return) preventing the split.  THis patch
strengthens the check to try to prove that any variable used in split part
is not used in header in any BB whose execution might lead to split part.
Still with alias oracle we might work harder, but it is not completely easy
to do and the code would look similar. I guess we want to
 1) collect vars we worry about (same way as we do now)
 2) If there are any of those, walk all VOPS in split part seeeing
    if they reach to header
 3) Also try to prove that address taken in header can not escape
    to split part.
I am not sure this is worth the effort and thus I am deferring this for later.

Finally we won't split any function where both header and split part returns,
since return block does not pass requirement of find_return_bb.  It is of the form
 <retval> = tmpaggregate;
 return <retval>

So pattern matching in find_return_bb needs to be improved.

Bootstrapped/regtested x86_64-linux.  I am testing now x86 too (because of the NRV
Richard Guenther - July 1, 2010, 3:53 p.m.
On Wed, 30 Jun 2010, Jan Hubicka wrote:

> Hi,
> this patch makes ipa-split smarter about non-ssa vars (so there are only 3
> splits disqualified on compilation of tramp3d).  ipa-split collects all auto
> non-ssa vars used in split part and tries to prove that header does not use any
> of them.  There is bug in conditional of continue statement, so it compares
> split part with split part that makes it to disqualify any splitting having
> non-ssa locals.
> 
> Fixing this is not enough to enable splitting of non-trivial functions
> returning aggregates (as one in testcase bellow).  THe problem is that
> aggregate is constructed in temporary and same var is used in split part and in
> header (in path leading directly to return) preventing the split.  THis patch
> strengthens the check to try to prove that any variable used in split part
> is not used in header in any BB whose execution might lead to split part.
> Still with alias oracle we might work harder, but it is not completely easy
> to do and the code would look similar. I guess we want to
>  1) collect vars we worry about (same way as we do now)
>  2) If there are any of those, walk all VOPS in split part seeeing
>     if they reach to header
>  3) Also try to prove that address taken in header can not escape
>     to split part.
> I am not sure this is worth the effort and thus I am deferring this for later.
> 
> Finally we won't split any function where both header and split part returns,
> since return block does not pass requirement of find_return_bb.  It is of the form
>  <retval> = tmpaggregate;
>  return <retval>
> 
> So pattern matching in find_return_bb needs to be improved.
> 
> Bootstrapped/regtested x86_64-linux.  I am testing now x86 too (because of the NRV
> differences).  OK if it passes?
> 
> I will work on DECL_BY_REFERENCE return values next.

Ok.

Thanks,
Richard.

> Honza
> 
> 	* ipa-split.c (verify_non_ssa_vars): Break out from ...; perform DFS walk
> 	backwards from entry_bb to check only those basic block of header
> 	that might lead to execution of split part.
> 	(consider_split) ... here.
> 	(find_return_bb): Allow assignment in return BB.
> 	(find_retval): New.
> 	(split_function): Fix name of cloned function; take care of updating return
> 	value in return_bb containing move.
> 
> 	* gcc.dg/tree-ssa/ipa-split-5.c: New function.
> Index: testsuite/gcc.dg/tree-ssa/ipa-split-5.c
> ===================================================================
> --- testsuite/gcc.dg/tree-ssa/ipa-split-5.c	(revision 0)
> +++ testsuite/gcc.dg/tree-ssa/ipa-split-5.c	(revision 0)
> @@ -0,0 +1,37 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -fdump-tree-fnsplit -fdump-tree-optimized" } */
> +
> +struct a {int a,b;};
> +struct a make_me_big (int a);
> +struct a split_me (int a)
> +{
> +  struct a retval;
> +  if (__builtin_expect (a!=0,1))
> +    {
> +      retval.a = 0;
> +      retval.b = 0;
> +      return retval;
> +    }
> +  else
> +    {
> +      struct a retval = make_me_big (a);
> +      retval = make_me_big (a);
> +      retval = make_me_big (a);
> +      retval = make_me_big (a);
> +      retval = make_me_big (a);
> +      retval = make_me_big (a);
> +      return retval;
> +    }
> +}
> +int val;
> +test()
> +{
> +  split_me (val);
> +  split_me (val);
> +  split_me (val);
> +  split_me (val);
> +}
> +/* { dg-final { scan-tree-dump-times "Splitting function" 1 "fnsplit"} } */
> +/* { dg-final { cleanup-tree-dump "fnsplit" } } */
> +/* { dg-final { scan-tree-dump "part" "optimized"} } */
> +/* { dg-final { cleanup-tree-dump "optimized" } } */
> Index: ipa-split.c
> ===================================================================
> --- ipa-split.c	(revision 161615)
> +++ ipa-split.c	(working copy)
> @@ -159,6 +159,91 @@ dump_split_point (FILE * file, struct sp
>    dump_bitmap (file, current->ssa_names_to_pass);
>  }
>  
> +/* Look for all BBs in header that might lead to split part and verify that
> +   they are not defining any of SSA vars used by split part. 
> +   Parameters are the same as for consider_split.  */
> +
> +static bool
> +verify_non_ssa_vars (struct split_point *current, bitmap non_ssa_vars,
> +		     basic_block return_bb)
> +{
> +  bitmap seen = BITMAP_ALLOC (NULL);
> +  VEC (basic_block,heap) *worklist = NULL;
> +  edge e;
> +  edge_iterator ei;
> +  bool ok = true;
> +  
> +  FOR_EACH_EDGE (e, ei, current->entry_bb->preds)
> +    if (e->src != ENTRY_BLOCK_PTR
> +	&& !bitmap_bit_p (current->split_bbs, e->src->index))
> +      {
> +        VEC_safe_push (basic_block, heap, worklist, e->src);
> +	bitmap_set_bit (seen, e->src->index);
> +      }
> +  
> +  while (!VEC_empty (basic_block, worklist))
> +    {
> +      gimple_stmt_iterator bsi;
> +      basic_block bb = VEC_pop (basic_block, worklist);
> +
> +      FOR_EACH_EDGE (e, ei, bb->preds)
> +	if (e->src != ENTRY_BLOCK_PTR
> +	    && !bitmap_bit_p (seen, e->src->index))
> +	  {
> +	    gcc_checking_assert (!bitmap_bit_p (current->split_bbs,
> +					        e->src->index));
> +	    VEC_safe_push (basic_block, heap, worklist, e->src);
> +	    bitmap_set_bit (seen, e->src->index);
> +	  }
> +      for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi))
> +	{
> +	  if (is_gimple_debug (gsi_stmt (bsi)))
> +	    continue;
> +	  if (walk_stmt_load_store_addr_ops
> +	      (gsi_stmt (bsi), non_ssa_vars, test_nonssa_use,
> +	       test_nonssa_use, test_nonssa_use))
> +	    {
> +	      ok = false;
> +	      goto done;
> +	    }
> +	}
> +      for (bsi = gsi_start_phis (bb); !gsi_end_p (bsi); gsi_next (&bsi))
> +	{
> +	  if (walk_stmt_load_store_addr_ops
> +	      (gsi_stmt (bsi), non_ssa_vars, test_nonssa_use,
> +	       test_nonssa_use, test_nonssa_use))
> +	    {
> +	      ok = false;
> +	      goto done;
> +	    }
> +	}
> +      FOR_EACH_EDGE (e, ei, bb->succs)
> +	{
> +	  if (e->dest != return_bb)
> +	    continue;
> +	  for (bsi = gsi_start_phis (return_bb); !gsi_end_p (bsi);
> +	       gsi_next (&bsi))
> +	    {
> +	      gimple stmt = gsi_stmt (bsi);
> +	      tree op = gimple_phi_arg_def (stmt, e->dest_idx);
> +
> +	      if (!is_gimple_reg (gimple_phi_result (stmt)))
> +		continue;
> +	      if (TREE_CODE (op) != SSA_NAME
> +		  && test_nonssa_use (stmt, op, non_ssa_vars))
> +		{
> +		  ok = false;
> +		  goto done;
> +		}
> +	    }
> +	}
> +    }
> +done:
> +  BITMAP_FREE (seen);
> +  VEC_free (basic_block, heap, worklist);
> +  return ok;
> +}
> +
>  /* We found an split_point CURRENT.  NON_SSA_VARS is bitmap of all non ssa
>     variables used and RETURN_BB is return basic block.
>     See if we can split function here.  */
> @@ -292,63 +377,12 @@ consider_split (struct split_point *curr
>    /* When there are non-ssa vars used in the split region, see if they
>       are used in the header region.  If so, reject the split.
>       FIXME: we can use nested function support to access both.  */
> -  if (!bitmap_empty_p (non_ssa_vars))
> +  if (!bitmap_empty_p (non_ssa_vars)
> +      && !verify_non_ssa_vars (current, non_ssa_vars, return_bb))
>      {
> -      basic_block bb;
> -      FOR_EACH_BB (bb)
> -	{
> -	  gimple_stmt_iterator bsi;
> -	  if (!bitmap_bit_p (current->split_bbs, bb->index))
> -	    continue;
> -	  for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi))
> -	    {
> -	      if (is_gimple_debug (gsi_stmt (bsi)))
> -		continue;
> -	      if (walk_stmt_load_store_addr_ops
> -		  (gsi_stmt (bsi), non_ssa_vars, test_nonssa_use,
> -		   test_nonssa_use, test_nonssa_use))
> -		{
> -		  if (dump_file && (dump_flags & TDF_DETAILS))
> -		    fprintf (dump_file,
> -			     "  Refused: split part has non-ssa uses\n");
> -		  return;
> -		}
> -	    }
> -	  for (bsi = gsi_start_phis (bb); !gsi_end_p (bsi); gsi_next (&bsi))
> -	    {
> -	      if (walk_stmt_load_store_addr_ops
> -		  (gsi_stmt (bsi), non_ssa_vars, test_nonssa_use,
> -		   test_nonssa_use, test_nonssa_use))
> -		{
> -		  if (dump_file && (dump_flags & TDF_DETAILS))
> -		    fprintf (dump_file,
> -			     "  Refused: split part has non-ssa uses\n");
> -		  return;
> -		}
> -	    }
> -	  FOR_EACH_EDGE (e, ei, bb->succs)
> -	    {
> -	      if (e->dest != return_bb)
> -		continue;
> -	      for (bsi = gsi_start_phis (return_bb); !gsi_end_p (bsi);
> -		   gsi_next (&bsi))
> -		{
> -		  gimple stmt = gsi_stmt (bsi);
> -		  tree op = gimple_phi_arg_def (stmt, e->dest_idx);
> -
> -		  if (!is_gimple_reg (gimple_phi_result (stmt)))
> -		    continue;
> -		  if (TREE_CODE (op) != SSA_NAME
> -		      && test_nonssa_use (stmt, op, non_ssa_vars))
> -		    {
> -		      if (dump_file && (dump_flags & TDF_DETAILS))
> -			fprintf (dump_file,
> -				 "  Refused: split part has non-ssa uses\n");
> -		      return;
> -		    }
> -		}
> -	    }
> -	  }
> +      if (dump_file && (dump_flags & TDF_DETAILS))
> +	fprintf (dump_file,
> +		 "  Refused: split part has non-ssa uses\n");
>        return;
>      }
>    if (dump_file && (dump_flags & TDF_DETAILS))
> @@ -379,10 +413,20 @@ consider_split (struct split_point *curr
>      }
>  }
>  
> -/* Return basic block containing RETURN statement, or EXIT_BLOCK_PTR if none
> -   found. 
> +/* Return basic block containing RETURN statement.  We allow basic blocks
> +   of the form:
> +   <retval> = tmp_var;
> +   return <retval>
> +   but return_bb can not be more complex than this.
> +   If nothing is found, return EXIT_BLOCK_PTR.
> +
>     When there are multiple RETURN statement, chose one with return value,
>     since that one is more likely shared by multiple code paths.
> +
> +   Return BB is special, because for function splitting it is the only
> +   basic block that is duplicated in between header and split part of the
> +   function.
> +
>     TODO: We might support multiple return blocks.  */
>  
>  static basic_block
> @@ -399,16 +443,29 @@ find_return_bb (void)
>  	bool found_return = false;
>  	tree retval = NULL_TREE;
>  
> -	for (bsi = gsi_start_bb (e->src); !gsi_end_p (bsi); gsi_next (&bsi))
> -	  if (gimple_code (gsi_stmt (bsi)) != GIMPLE_RETURN
> -	      && gimple_code (gsi_stmt (bsi)) != GIMPLE_LABEL
> -	      && !is_gimple_debug (gsi_stmt (bsi)))
> -	    break;
> -	  else if (gimple_code (gsi_stmt (bsi)) == GIMPLE_RETURN)
> -	    {
> -	      found_return = true;
> -	      retval = gimple_return_retval (gsi_stmt (bsi));
> -	    }
> +	for (bsi = gsi_last_bb (e->src); !gsi_end_p (bsi); gsi_prev (&bsi))
> +	  {
> +	    gimple stmt = gsi_stmt (bsi);
> +	    if (gimple_code (stmt) == GIMPLE_LABEL
> +		|| is_gimple_debug (stmt))
> +	      ;
> +	    else if (gimple_code (stmt) == GIMPLE_ASSIGN
> +		     && found_return
> +		     && gimple_assign_single_p (stmt)
> +		     && (auto_var_in_fn_p (gimple_assign_rhs1 (stmt),
> +					   current_function_decl)
> +			 || is_gimple_min_invariant
> +			      (gimple_assign_rhs1 (stmt)))
> +		     && retval == gimple_assign_lhs (stmt))
> +	      ;
> +	    else if (gimple_code (stmt) == GIMPLE_RETURN)
> +	      {
> +		found_return = true;
> +		retval = gimple_return_retval (stmt);
> +	      }
> +	    else
> +	      break;
> +	  }
>  	if (gsi_end_p (bsi) && found_return)
>  	  {
>  	    if (retval)
> @@ -420,6 +477,20 @@ find_return_bb (void)
>    return return_bb;
>  }
>  
> +/* Given return basicblock RETURN_BB, see where return value is really
> +   stored.  */
> +static tree
> +find_retval (basic_block return_bb)
> +{
> +  gimple_stmt_iterator bsi;
> +  for (bsi = gsi_start_bb (return_bb); !gsi_end_p (bsi); gsi_next (&bsi))
> +    if (gimple_code (gsi_stmt (bsi)) == GIMPLE_RETURN)
> +      return gimple_return_retval (gsi_stmt (bsi));
> +    else if (gimple_code (gsi_stmt (bsi)) == GIMPLE_ASSIGN)
> +      return gimple_assign_lhs (gsi_stmt (bsi));
> +  return NULL;
> +}
> +
>  /* Callback for walk_stmt_load_store_addr_ops.  If T is non-ssa automatic
>     variable, mark it as used in bitmap passed via DATA. 
>     Return true when access to T prevents splitting the function.  */
> @@ -844,7 +915,7 @@ split_function (struct split_point *spli
>  				     NULL, NULL,
>  				     args_to_skip,
>  				     split_point->split_bbs,
> -				     split_point->entry_bb, "_part");
> +				     split_point->entry_bb, "part");
>    /* For usual cloning it is enough to clear builtin only when signature
>       changes.  For partial inlining we however can not expect the part
>       of builtin implementation to have same semantic as the whole.  */
> @@ -891,17 +962,15 @@ split_function (struct split_point *spli
>        e->probability = REG_BR_PROB_BASE;
>        if (return_bb != EXIT_BLOCK_PTR)
>  	{
> -	  gimple return_stmt = gsi_stmt (gsi_last_bb (return_bb));
> -	  gcc_assert (gimple_code (return_stmt) == GIMPLE_RETURN);
> -
> -	  if ((real_retval = retval = gimple_return_retval (return_stmt))
> +	  real_retval = retval = find_retval (return_bb);
> +	  if (real_retval
>  	      && !is_gimple_min_invariant (retval)
>  	      && (TREE_CODE (retval) != SSA_NAME
>  		  || !SSA_NAME_IS_DEFAULT_DEF (retval)))
>  	    {
>  	      gimple_stmt_iterator psi;
>  
> -	      /* See if there is PHI definind return value.  */
> +	      /* See if there is PHI defining return value.  */
>  	      for (psi = gsi_start_phis (return_bb);
>  		   !gsi_end_p (psi); gsi_next (&psi))
>  		if (is_gimple_reg (gimple_phi_result (gsi_stmt (psi))))
> @@ -917,8 +986,20 @@ split_function (struct split_point *spli
>  		    add_phi_arg (gsi_stmt (psi), retval, e, UNKNOWN_LOCATION);
>  		  else if (TREE_CODE (retval) == SSA_NAME)
>  		    {
> -		      gimple_return_set_retval (return_stmt, retval);
> -		      update_stmt (return_stmt);
> +		      gimple_stmt_iterator bsi;
> +		      for (bsi = gsi_start_bb (return_bb); !gsi_end_p (bsi);
> +			   gsi_next (&bsi))
> +			if (gimple_code (gsi_stmt (bsi)) == GIMPLE_RETURN)
> +			  {
> +			    gimple_return_set_retval (gsi_stmt (bsi), retval);
> +			    break;
> +			  }
> +			else if (gimple_code (gsi_stmt (bsi)) == GIMPLE_ASSIGN)
> +			  {
> +			    gimple_assign_set_rhs1 (gsi_stmt (bsi), retval);
> +			    break;
> +			  }
> +		      update_stmt (gsi_stmt (bsi));
>  		    }
>  		}
>  	      gimple_call_set_lhs (call, retval);
> 
>

Patch

differences).  OK if it passes?

I will work on DECL_BY_REFERENCE return values next.

Honza

	* ipa-split.c (verify_non_ssa_vars): Break out from ...; perform DFS walk
	backwards from entry_bb to check only those basic block of header
	that might lead to execution of split part.
	(consider_split) ... here.
	(find_return_bb): Allow assignment in return BB.
	(find_retval): New.
	(split_function): Fix name of cloned function; take care of updating return
	value in return_bb containing move.

	* gcc.dg/tree-ssa/ipa-split-5.c: New function.
Index: testsuite/gcc.dg/tree-ssa/ipa-split-5.c
===================================================================
--- testsuite/gcc.dg/tree-ssa/ipa-split-5.c	(revision 0)
+++ testsuite/gcc.dg/tree-ssa/ipa-split-5.c	(revision 0)
@@ -0,0 +1,37 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-tree-fnsplit -fdump-tree-optimized" } */
+
+struct a {int a,b;};
+struct a make_me_big (int a);
+struct a split_me (int a)
+{
+  struct a retval;
+  if (__builtin_expect (a!=0,1))
+    {
+      retval.a = 0;
+      retval.b = 0;
+      return retval;
+    }
+  else
+    {
+      struct a retval = make_me_big (a);
+      retval = make_me_big (a);
+      retval = make_me_big (a);
+      retval = make_me_big (a);
+      retval = make_me_big (a);
+      retval = make_me_big (a);
+      return retval;
+    }
+}
+int val;
+test()
+{
+  split_me (val);
+  split_me (val);
+  split_me (val);
+  split_me (val);
+}
+/* { dg-final { scan-tree-dump-times "Splitting function" 1 "fnsplit"} } */
+/* { dg-final { cleanup-tree-dump "fnsplit" } } */
+/* { dg-final { scan-tree-dump "part" "optimized"} } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Index: ipa-split.c
===================================================================
--- ipa-split.c	(revision 161615)
+++ ipa-split.c	(working copy)
@@ -159,6 +159,91 @@  dump_split_point (FILE * file, struct sp
   dump_bitmap (file, current->ssa_names_to_pass);
 }
 
+/* Look for all BBs in header that might lead to split part and verify that
+   they are not defining any of SSA vars used by split part. 
+   Parameters are the same as for consider_split.  */
+
+static bool
+verify_non_ssa_vars (struct split_point *current, bitmap non_ssa_vars,
+		     basic_block return_bb)
+{
+  bitmap seen = BITMAP_ALLOC (NULL);
+  VEC (basic_block,heap) *worklist = NULL;
+  edge e;
+  edge_iterator ei;
+  bool ok = true;
+  
+  FOR_EACH_EDGE (e, ei, current->entry_bb->preds)
+    if (e->src != ENTRY_BLOCK_PTR
+	&& !bitmap_bit_p (current->split_bbs, e->src->index))
+      {
+        VEC_safe_push (basic_block, heap, worklist, e->src);
+	bitmap_set_bit (seen, e->src->index);
+      }
+  
+  while (!VEC_empty (basic_block, worklist))
+    {
+      gimple_stmt_iterator bsi;
+      basic_block bb = VEC_pop (basic_block, worklist);
+
+      FOR_EACH_EDGE (e, ei, bb->preds)
+	if (e->src != ENTRY_BLOCK_PTR
+	    && !bitmap_bit_p (seen, e->src->index))
+	  {
+	    gcc_checking_assert (!bitmap_bit_p (current->split_bbs,
+					        e->src->index));
+	    VEC_safe_push (basic_block, heap, worklist, e->src);
+	    bitmap_set_bit (seen, e->src->index);
+	  }
+      for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi))
+	{
+	  if (is_gimple_debug (gsi_stmt (bsi)))
+	    continue;
+	  if (walk_stmt_load_store_addr_ops
+	      (gsi_stmt (bsi), non_ssa_vars, test_nonssa_use,
+	       test_nonssa_use, test_nonssa_use))
+	    {
+	      ok = false;
+	      goto done;
+	    }
+	}
+      for (bsi = gsi_start_phis (bb); !gsi_end_p (bsi); gsi_next (&bsi))
+	{
+	  if (walk_stmt_load_store_addr_ops
+	      (gsi_stmt (bsi), non_ssa_vars, test_nonssa_use,
+	       test_nonssa_use, test_nonssa_use))
+	    {
+	      ok = false;
+	      goto done;
+	    }
+	}
+      FOR_EACH_EDGE (e, ei, bb->succs)
+	{
+	  if (e->dest != return_bb)
+	    continue;
+	  for (bsi = gsi_start_phis (return_bb); !gsi_end_p (bsi);
+	       gsi_next (&bsi))
+	    {
+	      gimple stmt = gsi_stmt (bsi);
+	      tree op = gimple_phi_arg_def (stmt, e->dest_idx);
+
+	      if (!is_gimple_reg (gimple_phi_result (stmt)))
+		continue;
+	      if (TREE_CODE (op) != SSA_NAME
+		  && test_nonssa_use (stmt, op, non_ssa_vars))
+		{
+		  ok = false;
+		  goto done;
+		}
+	    }
+	}
+    }
+done:
+  BITMAP_FREE (seen);
+  VEC_free (basic_block, heap, worklist);
+  return ok;
+}
+
 /* We found an split_point CURRENT.  NON_SSA_VARS is bitmap of all non ssa
    variables used and RETURN_BB is return basic block.
    See if we can split function here.  */
@@ -292,63 +377,12 @@  consider_split (struct split_point *curr
   /* When there are non-ssa vars used in the split region, see if they
      are used in the header region.  If so, reject the split.
      FIXME: we can use nested function support to access both.  */
-  if (!bitmap_empty_p (non_ssa_vars))
+  if (!bitmap_empty_p (non_ssa_vars)
+      && !verify_non_ssa_vars (current, non_ssa_vars, return_bb))
     {
-      basic_block bb;
-      FOR_EACH_BB (bb)
-	{
-	  gimple_stmt_iterator bsi;
-	  if (!bitmap_bit_p (current->split_bbs, bb->index))
-	    continue;
-	  for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi))
-	    {
-	      if (is_gimple_debug (gsi_stmt (bsi)))
-		continue;
-	      if (walk_stmt_load_store_addr_ops
-		  (gsi_stmt (bsi), non_ssa_vars, test_nonssa_use,
-		   test_nonssa_use, test_nonssa_use))
-		{
-		  if (dump_file && (dump_flags & TDF_DETAILS))
-		    fprintf (dump_file,
-			     "  Refused: split part has non-ssa uses\n");
-		  return;
-		}
-	    }
-	  for (bsi = gsi_start_phis (bb); !gsi_end_p (bsi); gsi_next (&bsi))
-	    {
-	      if (walk_stmt_load_store_addr_ops
-		  (gsi_stmt (bsi), non_ssa_vars, test_nonssa_use,
-		   test_nonssa_use, test_nonssa_use))
-		{
-		  if (dump_file && (dump_flags & TDF_DETAILS))
-		    fprintf (dump_file,
-			     "  Refused: split part has non-ssa uses\n");
-		  return;
-		}
-	    }
-	  FOR_EACH_EDGE (e, ei, bb->succs)
-	    {
-	      if (e->dest != return_bb)
-		continue;
-	      for (bsi = gsi_start_phis (return_bb); !gsi_end_p (bsi);
-		   gsi_next (&bsi))
-		{
-		  gimple stmt = gsi_stmt (bsi);
-		  tree op = gimple_phi_arg_def (stmt, e->dest_idx);
-
-		  if (!is_gimple_reg (gimple_phi_result (stmt)))
-		    continue;
-		  if (TREE_CODE (op) != SSA_NAME
-		      && test_nonssa_use (stmt, op, non_ssa_vars))
-		    {
-		      if (dump_file && (dump_flags & TDF_DETAILS))
-			fprintf (dump_file,
-				 "  Refused: split part has non-ssa uses\n");
-		      return;
-		    }
-		}
-	    }
-	  }
+      if (dump_file && (dump_flags & TDF_DETAILS))
+	fprintf (dump_file,
+		 "  Refused: split part has non-ssa uses\n");
       return;
     }
   if (dump_file && (dump_flags & TDF_DETAILS))
@@ -379,10 +413,20 @@  consider_split (struct split_point *curr
     }
 }
 
-/* Return basic block containing RETURN statement, or EXIT_BLOCK_PTR if none
-   found. 
+/* Return basic block containing RETURN statement.  We allow basic blocks
+   of the form:
+   <retval> = tmp_var;
+   return <retval>
+   but return_bb can not be more complex than this.
+   If nothing is found, return EXIT_BLOCK_PTR.
+
    When there are multiple RETURN statement, chose one with return value,
    since that one is more likely shared by multiple code paths.
+
+   Return BB is special, because for function splitting it is the only
+   basic block that is duplicated in between header and split part of the
+   function.
+
    TODO: We might support multiple return blocks.  */
 
 static basic_block
@@ -399,16 +443,29 @@  find_return_bb (void)
 	bool found_return = false;
 	tree retval = NULL_TREE;
 
-	for (bsi = gsi_start_bb (e->src); !gsi_end_p (bsi); gsi_next (&bsi))
-	  if (gimple_code (gsi_stmt (bsi)) != GIMPLE_RETURN
-	      && gimple_code (gsi_stmt (bsi)) != GIMPLE_LABEL
-	      && !is_gimple_debug (gsi_stmt (bsi)))
-	    break;
-	  else if (gimple_code (gsi_stmt (bsi)) == GIMPLE_RETURN)
-	    {
-	      found_return = true;
-	      retval = gimple_return_retval (gsi_stmt (bsi));
-	    }
+	for (bsi = gsi_last_bb (e->src); !gsi_end_p (bsi); gsi_prev (&bsi))
+	  {
+	    gimple stmt = gsi_stmt (bsi);
+	    if (gimple_code (stmt) == GIMPLE_LABEL
+		|| is_gimple_debug (stmt))
+	      ;
+	    else if (gimple_code (stmt) == GIMPLE_ASSIGN
+		     && found_return
+		     && gimple_assign_single_p (stmt)
+		     && (auto_var_in_fn_p (gimple_assign_rhs1 (stmt),
+					   current_function_decl)
+			 || is_gimple_min_invariant
+			      (gimple_assign_rhs1 (stmt)))
+		     && retval == gimple_assign_lhs (stmt))
+	      ;
+	    else if (gimple_code (stmt) == GIMPLE_RETURN)
+	      {
+		found_return = true;
+		retval = gimple_return_retval (stmt);
+	      }
+	    else
+	      break;
+	  }
 	if (gsi_end_p (bsi) && found_return)
 	  {
 	    if (retval)
@@ -420,6 +477,20 @@  find_return_bb (void)
   return return_bb;
 }
 
+/* Given return basicblock RETURN_BB, see where return value is really
+   stored.  */
+static tree
+find_retval (basic_block return_bb)
+{
+  gimple_stmt_iterator bsi;
+  for (bsi = gsi_start_bb (return_bb); !gsi_end_p (bsi); gsi_next (&bsi))
+    if (gimple_code (gsi_stmt (bsi)) == GIMPLE_RETURN)
+      return gimple_return_retval (gsi_stmt (bsi));
+    else if (gimple_code (gsi_stmt (bsi)) == GIMPLE_ASSIGN)
+      return gimple_assign_lhs (gsi_stmt (bsi));
+  return NULL;
+}
+
 /* Callback for walk_stmt_load_store_addr_ops.  If T is non-ssa automatic
    variable, mark it as used in bitmap passed via DATA. 
    Return true when access to T prevents splitting the function.  */
@@ -844,7 +915,7 @@  split_function (struct split_point *spli
 				     NULL, NULL,
 				     args_to_skip,
 				     split_point->split_bbs,
-				     split_point->entry_bb, "_part");
+				     split_point->entry_bb, "part");
   /* For usual cloning it is enough to clear builtin only when signature
      changes.  For partial inlining we however can not expect the part
      of builtin implementation to have same semantic as the whole.  */
@@ -891,17 +962,15 @@  split_function (struct split_point *spli
       e->probability = REG_BR_PROB_BASE;
       if (return_bb != EXIT_BLOCK_PTR)
 	{
-	  gimple return_stmt = gsi_stmt (gsi_last_bb (return_bb));
-	  gcc_assert (gimple_code (return_stmt) == GIMPLE_RETURN);
-
-	  if ((real_retval = retval = gimple_return_retval (return_stmt))
+	  real_retval = retval = find_retval (return_bb);
+	  if (real_retval
 	      && !is_gimple_min_invariant (retval)
 	      && (TREE_CODE (retval) != SSA_NAME
 		  || !SSA_NAME_IS_DEFAULT_DEF (retval)))
 	    {
 	      gimple_stmt_iterator psi;
 
-	      /* See if there is PHI definind return value.  */
+	      /* See if there is PHI defining return value.  */
 	      for (psi = gsi_start_phis (return_bb);
 		   !gsi_end_p (psi); gsi_next (&psi))
 		if (is_gimple_reg (gimple_phi_result (gsi_stmt (psi))))
@@ -917,8 +986,20 @@  split_function (struct split_point *spli
 		    add_phi_arg (gsi_stmt (psi), retval, e, UNKNOWN_LOCATION);
 		  else if (TREE_CODE (retval) == SSA_NAME)
 		    {
-		      gimple_return_set_retval (return_stmt, retval);
-		      update_stmt (return_stmt);
+		      gimple_stmt_iterator bsi;
+		      for (bsi = gsi_start_bb (return_bb); !gsi_end_p (bsi);
+			   gsi_next (&bsi))
+			if (gimple_code (gsi_stmt (bsi)) == GIMPLE_RETURN)
+			  {
+			    gimple_return_set_retval (gsi_stmt (bsi), retval);
+			    break;
+			  }
+			else if (gimple_code (gsi_stmt (bsi)) == GIMPLE_ASSIGN)
+			  {
+			    gimple_assign_set_rhs1 (gsi_stmt (bsi), retval);
+			    break;
+			  }
+		      update_stmt (gsi_stmt (bsi));
 		    }
 		}
 	      gimple_call_set_lhs (call, retval);