diff mbox

Use fld b; fld a; instead of fld a; fld b; fxch %st(1) in reg-stack (PR target/70465)

Message ID 20170125210604.GS1867@tucnak
State New
Headers show

Commit Message

Jakub Jelinek Jan. 25, 2017, 9:06 p.m. UTC
Hi!

This patch adds a little optimization, if %st and %st(1) are results of
memory loads and we need to exchange them, it is shorter (and on older
machines probably also cheaper) to swap the two loads.

This triggers 26783 in x86_64-linux and i686-linux bootstraps+regtests.
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Initially I thought I should also adjust debug insns in between i1 and i2
if they refer to i387 stack registers, but further look at regstack shows
that nothing does that and that such debug insns are wrong-debug for
multiple reasons.  In particular, e.g. emit_swap_insn alone emits the
fxch right after i1 or at the start of bb, without trying to adjust
debug insns in between that and insn (we should swap %st with the other reg
in those).  More importantly, at least if the debug regnos actually mean
what they stand in the assembly (i.e. the first one %st, the second one
%st(1) etc.), then we would need to ensure that every insn that pops or
pushes anything onto the i387 stack should update all DWARF expressions
that refer to those registers.  If that is really the case, the easiest
solution for this might be either to reset all debug insns that refer to %st
to %st(7) registers (easy), or replace all those registers in debug insns
with debug temporaries (8 debug temporaries for each pre-regstack register)
and then on i387 pushes or pops emit debug insns for those temporaries,
changing how they are mapped to the actual hw registers and let
var-tracking.c handle the rest.  I guess I should file a PR about this.

2017-01-25  Jakub Jelinek  <jakub@redhat.com>

	PR target/70465
	* reg-stack.c (emit_swap_insn): Instead of fld a; fld b; fxchg %st(1);
	emit fld b; fld a; if possible.

	* gcc.target/i386/pr70465.c: New test.


	Jakub

Comments

Jeff Law Jan. 25, 2017, 9:43 p.m. UTC | #1
On 01/25/2017 02:06 PM, Jakub Jelinek wrote:
> Hi!
>
> This patch adds a little optimization, if %st and %st(1) are results of
> memory loads and we need to exchange them, it is shorter (and on older
> machines probably also cheaper) to swap the two loads.
>
> This triggers 26783 in x86_64-linux and i686-linux bootstraps+regtests.
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> Initially I thought I should also adjust debug insns in between i1 and i2
> if they refer to i387 stack registers, but further look at regstack shows
> that nothing does that and that such debug insns are wrong-debug for
> multiple reasons.  In particular, e.g. emit_swap_insn alone emits the
> fxch right after i1 or at the start of bb, without trying to adjust
> debug insns in between that and insn (we should swap %st with the other reg
> in those).  More importantly, at least if the debug regnos actually mean
> what they stand in the assembly (i.e. the first one %st, the second one
> %st(1) etc.), then we would need to ensure that every insn that pops or
> pushes anything onto the i387 stack should update all DWARF expressions
> that refer to those registers.  If that is really the case, the easiest
> solution for this might be either to reset all debug insns that refer to %st
> to %st(7) registers (easy), or replace all those registers in debug insns
> with debug temporaries (8 debug temporaries for each pre-regstack register)
> and then on i387 pushes or pops emit debug insns for those temporaries,
> changing how they are mapped to the actual hw registers and let
> var-tracking.c handle the rest.  I guess I should file a PR about this.
>
> 2017-01-25  Jakub Jelinek  <jakub@redhat.com>
>
> 	PR target/70465
> 	* reg-stack.c (emit_swap_insn): Instead of fld a; fld b; fxchg %st(1);
> 	emit fld b; fld a; if possible.
>
> 	* gcc.target/i386/pr70465.c: New test.
So please comment on the general approach you're taking here.  I have a 
pretty good sense of what you're doing, mostly because I pondered 
something similar.  But I doubt others coming across the code would see 
the overall structure as quickly.


>
> --- gcc/reg-stack.c.jj	2017-01-25 11:38:54.924320927 +0100
> +++ gcc/reg-stack.c	2017-01-25 12:12:08.459045376 +0100
> @@ -887,6 +887,52 @@ emit_swap_insn (rtx_insn *insn, stack_pt
>  	  && REG_P (i1src) && REGNO (i1src) == FIRST_STACK_REG
>  	  && find_regno_note (i1, REG_DEAD, FIRST_STACK_REG) == NULL_RTX)
>  	return;
> +
> +      if (REG_P (i1dest)
> +	  && REGNO (i1dest) == FIRST_STACK_REG
> +	  && MEM_P (SET_SRC (i1set))
> +	  && !side_effects_p (SET_SRC (i1set))
> +	  && hard_regno == FIRST_STACK_REG + 1
> +	  && i1 != BB_HEAD (current_block))
So I'd bring that inner comment to before this conditional.  It gives 
the motivation for the entire block of code.


> +	{
> +	  rtx_insn *i2 = NULL;
> +	  rtx i2set;
> +	  rtx_insn *tmp = PREV_INSN (i1);
> +	  rtx_insn *limit = PREV_INSN (BB_HEAD (current_block));
> +	  while (tmp != limit)
> +	    {
> +	      if (LABEL_P (tmp)
> +		  || CALL_P (tmp)
> +		  || NOTE_INSN_BASIC_BLOCK_P (tmp)
> +		  || (NONJUMP_INSN_P (tmp)
> +		      && stack_regs_mentioned (tmp)))
> +		{
> +		  i2 = tmp;
> +		  break;
> +		}
> +	      tmp = PREV_INSN (tmp);
> +	    }
So a comment before the while loop.  You're basically looking for the 
previous insn (relative to I1) that involves stack regs within the same 
block.  It might also be worth noting that I1 is known to push a value 
onto the FP register stack.



> +	  if (i2 != NULL_RTX
> +	      && (i2set = single_set (i2)) != NULL_RTX)
> +	    {
> +	      /* Instead of fld a; fld b; fxch %st(1); just
> +		 use fld b; fld a; if possible.  */
> +	      rtx i2dest = *get_true_reg (&SET_DEST (i2set));
> +	      if (REG_P (i2dest)
> +		  && REGNO (i2dest) == FIRST_STACK_REG
> +		  && MEM_P (SET_SRC (i2set))
> +		  && !side_effects_p (SET_SRC (i2set))
> +		  && !modified_between_p (SET_SRC (i1set), i2, i1))
And here we're trying to verify that the insn found above (I2) is 
pushing another value onto the FP register stack and that the value in 
I2 is not modified between I1 and I2.

So ISTM with just some comment improvements, this is fine for the trunk.

jeff
diff mbox

Patch

--- gcc/reg-stack.c.jj	2017-01-25 11:38:54.924320927 +0100
+++ gcc/reg-stack.c	2017-01-25 12:12:08.459045376 +0100
@@ -887,6 +887,52 @@  emit_swap_insn (rtx_insn *insn, stack_pt
 	  && REG_P (i1src) && REGNO (i1src) == FIRST_STACK_REG
 	  && find_regno_note (i1, REG_DEAD, FIRST_STACK_REG) == NULL_RTX)
 	return;
+
+      if (REG_P (i1dest)
+	  && REGNO (i1dest) == FIRST_STACK_REG
+	  && MEM_P (SET_SRC (i1set))
+	  && !side_effects_p (SET_SRC (i1set))
+	  && hard_regno == FIRST_STACK_REG + 1
+	  && i1 != BB_HEAD (current_block))
+	{
+	  rtx_insn *i2 = NULL;
+	  rtx i2set;
+	  rtx_insn *tmp = PREV_INSN (i1);
+	  rtx_insn *limit = PREV_INSN (BB_HEAD (current_block));
+	  while (tmp != limit)
+	    {
+	      if (LABEL_P (tmp)
+		  || CALL_P (tmp)
+		  || NOTE_INSN_BASIC_BLOCK_P (tmp)
+		  || (NONJUMP_INSN_P (tmp)
+		      && stack_regs_mentioned (tmp)))
+		{
+		  i2 = tmp;
+		  break;
+		}
+	      tmp = PREV_INSN (tmp);
+	    }
+	  if (i2 != NULL_RTX
+	      && (i2set = single_set (i2)) != NULL_RTX)
+	    {
+	      /* Instead of fld a; fld b; fxch %st(1); just
+		 use fld b; fld a; if possible.  */
+	      rtx i2dest = *get_true_reg (&SET_DEST (i2set));
+	      if (REG_P (i2dest)
+		  && REGNO (i2dest) == FIRST_STACK_REG
+		  && MEM_P (SET_SRC (i2set))
+		  && !side_effects_p (SET_SRC (i2set))
+		  && !modified_between_p (SET_SRC (i1set), i2, i1))
+		{
+		  remove_insn (i1);
+		  SET_PREV_INSN (i1) = NULL_RTX;
+		  SET_NEXT_INSN (i1) = NULL_RTX;
+		  set_block_for_insn (i1, NULL);
+		  emit_insn_before (i1, i2);
+		  return;
+		}
+	    }
+	}
     }
 
   /* Avoid emitting the swap if this is the first register stack insn
--- gcc/testsuite/gcc.target/i386/pr70465.c.jj	2017-01-25 12:24:25.183041154 +0100
+++ gcc/testsuite/gcc.target/i386/pr70465.c	2017-01-25 12:23:59.000000000 +0100
@@ -0,0 +1,12 @@ 
+/* PR target/70465 */
+/* { dg-do compile { target ia32 } } */
+/* { dg-options "-O2 -mfpmath=387 -fomit-frame-pointer" } */
+/* { dg-final { scan-assembler-not "fxch\t%st.1" } } */
+
+double
+atan2 (double y, double x)
+{
+  double res = 0.0;
+  asm ("fpatan" : "=t" (res) : "u" (y), "0" (x) : "st(1)");
+  return res;
+}