Patchwork Finish up PR rtl-optimization/44194

login
register
mail settings
Submitter Richard Sandiford
Date Sept. 16, 2012, 10:52 a.m.
Message ID <87392i2q9j.fsf@talisman.home>
Download mbox | patch
Permalink /patch/184122/
State New
Headers show

Comments

Richard Sandiford - Sept. 16, 2012, 10:52 a.m.
Eric Botcazou <ebotcazou@adacore.com> writes:
> This is the PR about the useless spilling to memory of structures that are 
> returned in registers.  It was essentially addressed last year by Easwaran with 
> an enhancement of the RTL DSE pass, but Easwaran also noted that we still spill 
> to memory in the simplest cases, e.g. gcc.dg/pr44194-1.c, because expand_call
> creates a temporary on the stack to store the value returned in registers...
>
> The attached patch solves this problem by copying the value into pseudos 
> instead by means of emit_group_move_into_temps.  This is sufficient to get rid 
> of the remaining memory accesses for gcc.dg/pr44194-1.c on x86-64 for example,
> but not on strict-alignment platforms like SPARC64.

Thanks for doing this.  It'll obviously help n32 and n64 long doubles too.

I hit one problem building libgfortran for mips64-linux-gnu.
The calls.c change was:
Eric Botcazou - Sept. 16, 2012, 3:03 p.m.
> I hit one problem building libgfortran for mips64-linux-gnu.
> The calls.c change was:
> 
> Index: calls.c
> ===================================================================
> --- calls.c	(revision 191198)
> +++ calls.c	(working copy)
> @@ -3272,16 +3272,8 @@ expand_call (tree exp, rtx target, int i
>        else if (GET_CODE (valreg) == PARALLEL)
>  	{
>  	  if (target == 0)
> -	    {
> -	      /* This will only be assigned once, so it can be readonly.  */
> -	      tree nt = build_qualified_type (rettype,
> -					      (TYPE_QUALS (rettype)
> -					       | TYPE_QUAL_CONST));
> -
> -	      target = assign_temp (nt, 1, 1);
> -	    }
> -
> -	  if (! rtx_equal_p (target, valreg))
> +	    target = emit_group_move_into_temps (valreg);
> +	  else if (!rtx_equal_p (target, valreg))
>  	    emit_group_store (target, valreg, rettype,
>  			      int_size_in_bytes (rettype));
> 
>  	  /* We can not support sibling calls for this case.  */
>  	  sibcall_failure = 1;
> 
> But if we're trying to use a sibcall, we go through this loop twice,
> and the second iteration has to cope with a PARALLEL target created
> by the first.  How about the patch below?  Tested on mips64-linux-gnu,
> where a full testrun looked good.

Fine with me, thanks.  I'm a little puzzled that I didn't run into this myself.

> In some ways it's a bit silly to emit anything at all in the first
> iteration, given that we then go on to set sibcall_failure.  It's not the
> kind of loop we can just continue out of though.  Also, your patch
> probably means that we only need to set sibcall_failure for the
> emit_group_store case, although I've not tested that.

Good point, I'll give it a try.

Patch

Index: calls.c
===================================================================
--- calls.c	(revision 191198)
+++ calls.c	(working copy)
@@ -3272,16 +3272,8 @@  expand_call (tree exp, rtx target, int i
       else if (GET_CODE (valreg) == PARALLEL)
 	{
 	  if (target == 0)
-	    {
-	      /* This will only be assigned once, so it can be readonly.  */
-	      tree nt = build_qualified_type (rettype,
-					      (TYPE_QUALS (rettype)
-					       | TYPE_QUAL_CONST));
-
-	      target = assign_temp (nt, 1, 1);
-	    }
-
-	  if (! rtx_equal_p (target, valreg))
+	    target = emit_group_move_into_temps (valreg);
+	  else if (!rtx_equal_p (target, valreg))
 	    emit_group_store (target, valreg, rettype,
 			      int_size_in_bytes (rettype));
 
 	  /* We can not support sibling calls for this case.  */
 	  sibcall_failure = 1;

But if we're trying to use a sibcall, we go through this loop twice,
and the second iteration has to cope with a PARALLEL target created
by the first.  How about the patch below?  Tested on mips64-linux-gnu,
where a full testrun looked good.

In some ways it's a bit silly to emit anything at all in the first iteration,
given that we then go on to set sibcall_failure.  It's not the kind of loop
we can just continue out of though.  Also, your patch probably means that
we only need to set sibcall_failure for the emit_group_store case,
although I've not tested that.

Richard


gcc/
	* calls.c (expand_call): Use emit_group_move for PARALLEL->PARALLEL
	moves.

Index: gcc/calls.c
===================================================================
--- gcc/calls.c	2012-09-15 11:15:46.000000000 +0100
+++ gcc/calls.c	2012-09-15 11:15:46.522857695 +0100
@@ -3273,7 +3273,13 @@  expand_call (tree exp, rtx target, int i
 	{
 	  if (target == 0)
 	    target = emit_group_move_into_temps (valreg);
-	  else if (!rtx_equal_p (target, valreg))
+	  else if (rtx_equal_p (target, valreg))
+	    ;
+	  else if (GET_CODE (target) == PARALLEL)
+	    /* Handle the result of a emit_group_move_into_temps
+	       call in the previous pass.  */
+	    emit_group_move (target, valreg);
+	  else
 	    emit_group_store (target, valreg, rettype,
 			      int_size_in_bytes (rettype));