Message ID | 20170125210604.GS1867@tucnak |
---|---|
State | New |
Headers | show |
On 01/25/2017 02:06 PM, Jakub Jelinek wrote: > Hi! > > This patch adds a little optimization, if %st and %st(1) are results of > memory loads and we need to exchange them, it is shorter (and on older > machines probably also cheaper) to swap the two loads. > > This triggers 26783 in x86_64-linux and i686-linux bootstraps+regtests. > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > Initially I thought I should also adjust debug insns in between i1 and i2 > if they refer to i387 stack registers, but further look at regstack shows > that nothing does that and that such debug insns are wrong-debug for > multiple reasons. In particular, e.g. emit_swap_insn alone emits the > fxch right after i1 or at the start of bb, without trying to adjust > debug insns in between that and insn (we should swap %st with the other reg > in those). More importantly, at least if the debug regnos actually mean > what they stand in the assembly (i.e. the first one %st, the second one > %st(1) etc.), then we would need to ensure that every insn that pops or > pushes anything onto the i387 stack should update all DWARF expressions > that refer to those registers. If that is really the case, the easiest > solution for this might be either to reset all debug insns that refer to %st > to %st(7) registers (easy), or replace all those registers in debug insns > with debug temporaries (8 debug temporaries for each pre-regstack register) > and then on i387 pushes or pops emit debug insns for those temporaries, > changing how they are mapped to the actual hw registers and let > var-tracking.c handle the rest. I guess I should file a PR about this. > > 2017-01-25 Jakub Jelinek <jakub@redhat.com> > > PR target/70465 > * reg-stack.c (emit_swap_insn): Instead of fld a; fld b; fxchg %st(1); > emit fld b; fld a; if possible. > > * gcc.target/i386/pr70465.c: New test. So please comment on the general approach you're taking here. I have a pretty good sense of what you're doing, mostly because I pondered something similar. But I doubt others coming across the code would see the overall structure as quickly. > > --- gcc/reg-stack.c.jj 2017-01-25 11:38:54.924320927 +0100 > +++ gcc/reg-stack.c 2017-01-25 12:12:08.459045376 +0100 > @@ -887,6 +887,52 @@ emit_swap_insn (rtx_insn *insn, stack_pt > && REG_P (i1src) && REGNO (i1src) == FIRST_STACK_REG > && find_regno_note (i1, REG_DEAD, FIRST_STACK_REG) == NULL_RTX) > return; > + > + if (REG_P (i1dest) > + && REGNO (i1dest) == FIRST_STACK_REG > + && MEM_P (SET_SRC (i1set)) > + && !side_effects_p (SET_SRC (i1set)) > + && hard_regno == FIRST_STACK_REG + 1 > + && i1 != BB_HEAD (current_block)) So I'd bring that inner comment to before this conditional. It gives the motivation for the entire block of code. > + { > + rtx_insn *i2 = NULL; > + rtx i2set; > + rtx_insn *tmp = PREV_INSN (i1); > + rtx_insn *limit = PREV_INSN (BB_HEAD (current_block)); > + while (tmp != limit) > + { > + if (LABEL_P (tmp) > + || CALL_P (tmp) > + || NOTE_INSN_BASIC_BLOCK_P (tmp) > + || (NONJUMP_INSN_P (tmp) > + && stack_regs_mentioned (tmp))) > + { > + i2 = tmp; > + break; > + } > + tmp = PREV_INSN (tmp); > + } So a comment before the while loop. You're basically looking for the previous insn (relative to I1) that involves stack regs within the same block. It might also be worth noting that I1 is known to push a value onto the FP register stack. > + if (i2 != NULL_RTX > + && (i2set = single_set (i2)) != NULL_RTX) > + { > + /* Instead of fld a; fld b; fxch %st(1); just > + use fld b; fld a; if possible. */ > + rtx i2dest = *get_true_reg (&SET_DEST (i2set)); > + if (REG_P (i2dest) > + && REGNO (i2dest) == FIRST_STACK_REG > + && MEM_P (SET_SRC (i2set)) > + && !side_effects_p (SET_SRC (i2set)) > + && !modified_between_p (SET_SRC (i1set), i2, i1)) And here we're trying to verify that the insn found above (I2) is pushing another value onto the FP register stack and that the value in I2 is not modified between I1 and I2. So ISTM with just some comment improvements, this is fine for the trunk. jeff
--- gcc/reg-stack.c.jj 2017-01-25 11:38:54.924320927 +0100 +++ gcc/reg-stack.c 2017-01-25 12:12:08.459045376 +0100 @@ -887,6 +887,52 @@ emit_swap_insn (rtx_insn *insn, stack_pt && REG_P (i1src) && REGNO (i1src) == FIRST_STACK_REG && find_regno_note (i1, REG_DEAD, FIRST_STACK_REG) == NULL_RTX) return; + + if (REG_P (i1dest) + && REGNO (i1dest) == FIRST_STACK_REG + && MEM_P (SET_SRC (i1set)) + && !side_effects_p (SET_SRC (i1set)) + && hard_regno == FIRST_STACK_REG + 1 + && i1 != BB_HEAD (current_block)) + { + rtx_insn *i2 = NULL; + rtx i2set; + rtx_insn *tmp = PREV_INSN (i1); + rtx_insn *limit = PREV_INSN (BB_HEAD (current_block)); + while (tmp != limit) + { + if (LABEL_P (tmp) + || CALL_P (tmp) + || NOTE_INSN_BASIC_BLOCK_P (tmp) + || (NONJUMP_INSN_P (tmp) + && stack_regs_mentioned (tmp))) + { + i2 = tmp; + break; + } + tmp = PREV_INSN (tmp); + } + if (i2 != NULL_RTX + && (i2set = single_set (i2)) != NULL_RTX) + { + /* Instead of fld a; fld b; fxch %st(1); just + use fld b; fld a; if possible. */ + rtx i2dest = *get_true_reg (&SET_DEST (i2set)); + if (REG_P (i2dest) + && REGNO (i2dest) == FIRST_STACK_REG + && MEM_P (SET_SRC (i2set)) + && !side_effects_p (SET_SRC (i2set)) + && !modified_between_p (SET_SRC (i1set), i2, i1)) + { + remove_insn (i1); + SET_PREV_INSN (i1) = NULL_RTX; + SET_NEXT_INSN (i1) = NULL_RTX; + set_block_for_insn (i1, NULL); + emit_insn_before (i1, i2); + return; + } + } + } } /* Avoid emitting the swap if this is the first register stack insn --- gcc/testsuite/gcc.target/i386/pr70465.c.jj 2017-01-25 12:24:25.183041154 +0100 +++ gcc/testsuite/gcc.target/i386/pr70465.c 2017-01-25 12:23:59.000000000 +0100 @@ -0,0 +1,12 @@ +/* PR target/70465 */ +/* { dg-do compile { target ia32 } } */ +/* { dg-options "-O2 -mfpmath=387 -fomit-frame-pointer" } */ +/* { dg-final { scan-assembler-not "fxch\t%st.1" } } */ + +double +atan2 (double y, double x) +{ + double res = 0.0; + asm ("fpatan" : "=t" (res) : "u" (y), "0" (x) : "st(1)"); + return res; +}