[RFC] Fixing instability of -fschedule-insns for x86

Submitted by Uros Bizjak on Sept. 4, 2012, 5:34 p.m.


Message ID CAFULd4YdfQ_-auaqJNZNbmKVmL0k_fTVO-Bmxx6z4KOT6WewsQ@mail.gmail.com
State New

Commit Message

On Mon, Aug 13, 2012 at 9:39 PM, Igor Zamyatin <izamyatin@gmail.com> wrote:

> Main idea of this activity is mostly to provide user a possibility to
> safely turn on first scheduler for his codes. In some cases this could
> positively affect performance, especially for in-order Atom.
> It would be great to hear some feedback from the community about the change.

I don't think it is necessary to set a dependence for CALL_INSN
arguments. It seems to me that it is enough to set the scheduling
priority of moves to hard registers to zero, so that they are scheduled
as late as possible, presumably just before the call insn.

The attached patch builds on your idea of setting priorities of moves
from hard registers to pseudos to maximum (these are moves from
function arguments, they should be scheduled as soon as possible to
free hard registers). Please note that it is enough to handle only
likely spilled hard regs (for moves both from and to these registers),
since these regs cause all the trouble.
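
To make the two kinds of moves concrete, here is a small test-case sketch.
It assumes the x86-64 System V calling convention, where the first integer
arguments are passed in hard registers such as rdi, rsi and rdx; the names
`combine` and `caller` are hypothetical and not part of the patch. Before
reload, one would expect the RTL for `caller` to start with moves from
argument hard registers to pseudos and to end with moves to argument hard
registers just before the call.

```c
/* Hypothetical callee.  The noinline attribute (a GCC extension) keeps
   the call, and therefore the outgoing argument moves, in place.  */
__attribute__ ((noinline)) static long
combine (long a, long b, long c)
{
  return a * 100 + b * 10 + c;
}

long
caller (long x, long y, long z)
{
  /* Incoming arguments: before reload these are typically copied from
     the argument hard registers into pseudos.  Scheduling those copies
     early frees the hard registers.  */
  long s = x + y;
  long t = y - z;

  /* Outgoing arguments: s, t and x are moved into argument hard
     registers.  Scheduling those moves late keeps the hard register
     lifetimes short.  */
  return combine (s, t, x);
}
```

Compiling such a file with -O2 -fschedule-insns and inspecting the first
scheduling dump is one way to see where these moves end up.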

The patch assumes that likely spilled hard regs didn't propagate to
other instructions, and that other hard registers didn't propagate to
operands with "wrong" constraints (recent x86 improvement).

Unfortunately, the patch doesn't fix PR 54472 (the spill failure with
the selective scheduler). No matter what TARGET_SCHED_ADJUST_PRIORITY
returns, the offending move to the ax register always gets scheduled
before the problematic string instruction. The patch, however, builds
on the "promise" from the documentation that:


     This hook adjusts the integer scheduling priority PRIORITY of
     INSN.  It should return the new priority.  Increase the priority to
     execute INSN earlier, reduce the priority to execute INSN later.
     Do not define this hook if you do not need to adjust the
     scheduling priorities of insns.
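
As a rough illustration of that promise, the sketch below orders a toy
ready list by descending priority. It is not GCC code: `struct toy_insn`,
`by_priority_desc` and `toy_order_ready_list` are hypothetical names, and
the real scheduler ranks ready insns by more criteria than priority alone
and only after their dependences are satisfied.

```c
#include <stdlib.h>

/* Hypothetical ready-list entry; not a GCC data structure.  */
struct toy_insn
{
  const char *name;
  int priority;
};

/* qsort comparator: a higher priority sorts first, i.e. the insn is
   issued earlier.  */
static int
by_priority_desc (const void *a, const void *b)
{
  const struct toy_insn *x = a;
  const struct toy_insn *y = b;

  return (y->priority > x->priority) - (y->priority < x->priority);
}

/* Order the N ready insns so that the highest priority comes first.  */
static void
toy_order_ready_list (struct toy_insn *ready, size_t n)
{
  qsort (ready, n, sizeof *ready, by_priority_desc);
}
```

In this model, a move from an argument register that was given the
maximum priority ends up at the front of the list, and a move to an
argument register that was given priority 0 ends up at the back.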

The patch is in RFC state, but it has survived a fair amount of
-fschedule-insns testing on current mainline, with and without
-fsched-pressure added.


Patch

Index: i386.c
--- i386.c	(revision 190932)
+++ i386.c	(working copy)
@@ -24314,6 +24314,49 @@  ix86_sched_reorder(FILE *dump, int sched_verbose,
   return issue_rate;
 }
+/* Before reload, adjust priority of moves to/from likely spilled
+   hard registers.  This reduces hard register life times and consequently
+   the chance of spill failures for enclosed instructions.  */
+static int
+ix86_adjust_priority (rtx insn, int priority)
+{
+  rtx set;
+  if (reload_completed)
+    return priority;
+  if (!NONJUMP_INSN_P (insn))
+    return priority;
+  set = single_set (insn);
+  if (set)
+    {
+      rtx tmp;
+      /* Set priority of moves from likely spilled hard registers to maximum,
+	 to schedule them as soon as possible.  These are moves from
+	 function argument registers at the top of the function entry.  */
+      tmp = SET_SRC (set);
+      if (REG_P (tmp)
+	  && HARD_REGISTER_P (tmp)
+	  && ! TEST_HARD_REG_BIT (fixed_reg_set, REGNO (tmp))
+	  && targetm.class_likely_spilled_p (REGNO_REG_CLASS (REGNO (tmp))))
+	return current_sched_info->sched_max_insns_priority;
+      /* Set priority of moves to likely spilled hard registers to minimum,
+	 to schedule them as late as possible.  These are moves to
+	 function argument registers before function call.  */
+      tmp = SET_DEST (set);
+      if (REG_P (tmp)
+	  && HARD_REGISTER_P (tmp)
+	  && ! TEST_HARD_REG_BIT (fixed_reg_set, REGNO (tmp))
+	  && targetm.class_likely_spilled_p (REGNO_REG_CLASS (REGNO (tmp))))
+	return 0;
+    }
+  return priority;
+}
 /* Model decoder of Core 2/i7.
@@ -39608,6 +39651,8 @@  ix86_enum_va_list (int idx, const char **pname, tr
 #define TARGET_SCHED_REASSOCIATION_WIDTH ix86_reassociation_width
 #define TARGET_SCHED_REORDER ix86_sched_reorder
+#undef TARGET_SCHED_ADJUST_PRIORITY
+#define TARGET_SCHED_ADJUST_PRIORITY ix86_adjust_priority
 /* The size of the dispatch window is the total number of bytes of
    object code allowed in a window.  */