[SMS,DDG] Correctly delete anti-dep edges for renaming

Message ID CAJnFk2eGjN4ELUXxE4A74AsyR73AxAJNxDhdcRwRHeSvJdHWSA@mail.gmail.com
State New
Headers show

Commit Message

zhroma@ispras.ru Dec. 7, 2011, 3:35 p.m.
This letter contains second separate patch for ddg.c
It prevents removing some edges in data dependency graph
when renaming is allowed.

2011/10/12 Ayal Zaks <ayal.zaks@gmail.com>:
>>> The other bug happens only with -fmodulo-sched-allow-regmoves.  Here we
>>> eliminate some anti-dependence edges in data dependency graph in order to
>>> resolve them later by adding some register moves (renaming instructions).  But
>>> in situation as in example above compiler gives an ICE because it can't create
>>> a register move, when regR is hardware flag register.  So we have to know which
>>> register(s) cause anti-dependency in order to understand whether we can ignore
>>> it.  I can't find any easy way to gather this information, so I create my own
>>> structures to store this info and had implemented my own hooks for
>>> sched_analyze function.  This leads to more complex interconnection between
>>> ddg.c and modulo-sched.c.
> Well, having ddg.c's/create_ddg_dep_from_intra_loop_link() consult
> "Register sets from modulo scheduler structures" to determine if an
> anti-dep can be eliminated, seems awkward. One should be able to build
> a ddg independent of any modulo scheduler structures.

I think now I found a proper way to gather information about
the register(s) causing anti-dependency (see the patch attached).

> One simple solution is to keep all anti-dep edges of registers that
> are clobbered anywhere in the loop. This might be overly conservative,
> but perhaps not so if "On x86_64 it is clobbered by most of arithmetic
> instructions".
> If an anti-dep between a use and a dep of a clobber register is to be
> removed, a dependence should be created between the use and a
> clobbering instruction following the def. Hope this is clear.
> This too should be solved by a dedicated (separate) patch, for easier digestion.
> Presumably, the ddg already has all intra-iteration edges preventing
> motion of clobbering instructions within an iteration, and we are only
> missing inter-iteration deps or deps surfaced by eliminating
> anti-deps, right?
> You might consider introducing a new type of dependence for such
> edges, "Clobber", if it would help.

Ayal, I suppose here we have some misunderstanding.
This is not directly related to the problem of clobbers discussed previously.
Here I talk about the need to find out the register causing the
anti-dependence (for instance, we need to check whether it can be renamed).
Currently, SMS simply attempts to take the RHS of the second instruction (via
single_set()), and if it's a register, SMS assumes it's a register causing the
dependency. This breaks in a following scenario:

insn1: ... (read flags)
insn2: regA = regB - regC; update flags

Here, SMS would wrongly take regA as the dependency register, while it was in
fact the flags register. So, I rewrote this part in a different way and make
a separate patch.

Roman Zhuykov


2011-12-07  Roman Zhuykov  <zhroma@ispras.ru>
	* ddg.c (create_ddg_dep_from_intra_loop_link): Correclty determine
	which register causes anti-dependency.  Prevent removing anti-dep
	edge from ddg when hard_register_p.

diff --git a/gcc/ddg.c b/gcc/ddg.c
index 2b1cfe8..0a3726f 100644
--- a/gcc/ddg.c
+++ b/gcc/ddg.c
@@ -204,23 +204,34 @@  create_ddg_dep_from_intra_loop_link (ddg_ptr g, ddg_node_ptr src_node,
       && (t == ANTI_DEP && dt == REG_DEP)
       && !autoinc_var_is_used_p (dest_node->insn, src_node->insn))
-      rtx set;
+      df_ref *def_rec;
+      bool can_delete_dep = true;
-      set = single_set (dest_node->insn);
-      /* TODO: Handle registers that REG_P is not true for them, i.e.
+      /* TODO: Handle !REG_P register correctly, i.e.
          subregs and special registers.  */
-      if (set && REG_P (SET_DEST (set)))
-        {
-          int regno = REGNO (SET_DEST (set));
-          df_ref first_def;
-          struct df_rd_bb_info *bb_info = DF_RD_BB_INFO (g->bb);
+      for (def_rec = DF_INSN_DEFS (dest_node->insn); *def_rec; def_rec++)
+	{
+	  df_ref first_def, def = *def_rec, *use_rec;
+	  unsigned regno = DF_REF_REGNO (def);
+	  struct df_rd_bb_info *bb_info = DF_RD_BB_INFO (g->bb);
+	  bool src_uses = false;
-          first_def = df_bb_regno_first_def_find (g->bb, regno);
-          gcc_assert (first_def);
+	  for (use_rec = DF_INSN_USES (src_node->insn); *use_rec; use_rec++)
+	    if (regno == DF_REF_REGNO (*use_rec))
+	      src_uses = true;
-          if (bitmap_bit_p (&bb_info->gen, DF_REF_ID (first_def)))
-            return;
-        }
+	  if (!src_uses)
+	    continue;
+	  first_def = df_bb_regno_first_def_find (g->bb, regno);
+	  gcc_assert (first_def);
+	  if (HARD_REGISTER_NUM_P (regno)
+	      || !bitmap_bit_p (&bb_info->gen, DF_REF_ID (first_def)))
+	    can_delete_dep = false;
+	}
+      if (can_delete_dep)
+	return;
    latency = dep_cost (link);