diff mbox

C6X port 5/11: Track predication conditions more accurately

Message ID 4DC95BA3.504@codesourcery.com
State New
Headers show

Commit Message

Bernd Schmidt May 10, 2011, 3:37 p.m. UTC
The scheduler knows that insns with different COND_EXEC conditions don't
conflict and can be scheduled independently. Unfortunately, sched-deps.c
does not try to keep the conditions valid as it progresses. For example,

[B0] A0 = [A1]
     B0 = something
[!B0] [A2] = A0

The first and third insns have opposite conditions, so the scheduler
decides they are independent. For most targets this isn't a problem, as
the insn in the middle will produce enough dependencies to ensure the
right order. However, on C6X, order alone isn't sufficient due to the
exposed pipeline: we also need to ensure that the latencies are observed.


Bernd
* sched-int.h (struct _haifa_deps_insn_data): New members cond
	and reverse_cond.
	(INSN_COND, INSN_REVERSE_COND): New macros.
	* sched-deps.c (deps_analyze_insn): Call sched_get_condition_with_rev
	once.
	(sched_get_condition_with_rev): Cache the results, and look them up
	if possible.
	(sched_analyze_insn): Destroy INSN_COND of previous insns if they
	are clobbered by the current insn.

Comments

Alexander Monakov May 11, 2011, 10:45 a.m. UTC | #1
On Tue, 10 May 2011, Bernd Schmidt wrote:

> The scheduler knows that insns with different COND_EXEC conditions don't
> conflict and can be scheduled independently. Unfortunately, sched-deps.c
> does not try to keep the conditions valid as it progresses. For example,
> 
> [B0] A0 = [A1]
>      B0 = something
> [!B0] [A2] = A0
> 
> The first and third insns have opposite conditions, so the scheduler
> decides they are independent. For most targets this isn't a problem, as
> the insn in the middle will produce enough dependencies to ensure the
> right order. However, on C6X, order alone isn't sufficient due to the
> exposed pipeline: we also need to ensure that the latencies are observed.
> 

> @@ -2871,6 +2895,21 @@ sched_analyze_insn (struct deps_desc *de
>  	      }
>  	  }
>  
> +      INIT_REG_SET (&set_or_clobbered);
> +      bitmap_ior (&set_or_clobbered, reg_pending_clobbers, reg_pending_sets);
> +      EXECUTE_IF_SET_IN_REG_SET (&set_or_clobbered, 0, i, rsi)
> +	{
> +	  struct deps_reg *reg_last = &deps->reg_last[i];
> +	  rtx list;
> +	  for (list = reg_last->uses; list; list = XEXP (list, 1))
> +	    {
> +	      rtx other = XEXP (list, 0);
> +	      if (INSN_COND (other) != const_true_rtx
> +		  && refers_to_regno_p (i, i + 1, INSN_COND (other), NULL))
> +		INSN_COND (other) = const_true_rtx;
> +	    }
> +	}
> +
>        /* If the current insn is conditional, we can't free any
>  	 of the lists.  */
>        if (sched_has_condition_p (insn))           

Could the above be conditional on whether the target CPU is exposed-pipeline?
I'm concerned this may degrade scheduling for other targets in some cases.

I wonder if sharing predicate RTXes between setters and users could also be a
solution.  I even thought (until moments ago) GCC was already doing that, but
apparently I was mistaken, ifcvt.c explicitly does copy_rtx.

Thanks.

Alexander
Steve Ellcey May 31, 2011, 6:24 p.m. UTC | #2
Bernd,

This patch (r174336) is causing me many testsuite failures on IA64.
Tests like gcc.c-torture/compile/20010408-1.c are dying with a 
seg fault in vinsn_detach.

#0  0x55b8760:0 in vinsn_detach (vi=0xf)
#1  0x55bda30:0 in clear_expr (expr=0x7fffeef0)
#2  0x562ea50:0 in schedule_expr_on_boundary (bnd=0x4040cf14, 
    expr_vliw=0x4040c4ac, seqno=-1)
#3  0x562fab0:0 in fill_insns (fence=0x4040bdec, seqno=-1, 
    scheduled_insns_tailpp=0x7fffeff0)
#4  0x563d3b0:0 in schedule_on_fences (fences=0x4040bde8, max_seqno=1, 
    scheduled_insns_tailpp=0x7fffeff0)
#5  0x563eb70:0 in sel_sched_region_2 (orig_max_seqno=22)
#6  0x563f1b0:0 in sel_sched_region_1 ()
#7  0x56403f0:0 in sel_sched_region (rgn=9)
#8  0x5640980:0 in run_selective_scheduling ()
#9  0x613d0f0:0 in ia64_reorg ()

I am still trying to get more information but I was wondering if you
have already seen this problem or if anyone else has reported it?

Steve Ellcey
sje@cup.hp.com
Andrey Belevantsev May 31, 2011, 7:59 p.m. UTC | #3
On 31.05.2011 22:24, Steve Ellcey wrote:
> Bernd,
>
> This patch (r174336) is causing me many testsuite failures on IA64.
> Tests like gcc.c-torture/compile/20010408-1.c are dying with a
> seg fault in vinsn_detach.
I will look at it tomorrow.  Bernd, Steve, please let us know about any 
issues with sel-sched code so we can help.

Andrey

>
> #0  0x55b8760:0 in vinsn_detach (vi=0xf)
> #1  0x55bda30:0 in clear_expr (expr=0x7fffeef0)
> #2  0x562ea50:0 in schedule_expr_on_boundary (bnd=0x4040cf14,
>      expr_vliw=0x4040c4ac, seqno=-1)
> #3  0x562fab0:0 in fill_insns (fence=0x4040bdec, seqno=-1,
>      scheduled_insns_tailpp=0x7fffeff0)
> #4  0x563d3b0:0 in schedule_on_fences (fences=0x4040bde8, max_seqno=1,
>      scheduled_insns_tailpp=0x7fffeff0)
> #5  0x563eb70:0 in sel_sched_region_2 (orig_max_seqno=22)
> #6  0x563f1b0:0 in sel_sched_region_1 ()
> #7  0x56403f0:0 in sel_sched_region (rgn=9)
> #8  0x5640980:0 in run_selective_scheduling ()
> #9  0x613d0f0:0 in ia64_reorg ()
>
> I am still trying to get more information but I was wondering if you
> have already seen this problem or if anyone else has reported it?
>
> Steve Ellcey
> sje@cup.hp.com
>
Steve Ellcey May 31, 2011, 8:43 p.m. UTC | #4
On Tue, 2011-05-31 at 23:59 +0400, Andrey Belevantsev wrote:
> On 31.05.2011 22:24, Steve Ellcey wrote:
> > Bernd,
> >
> > This patch (r174336) is causing me many testsuite failures on IA64.
> > Tests like gcc.c-torture/compile/20010408-1.c are dying with a
> > seg fault in vinsn_detach.
> I will look at it tomorrow.  Bernd, Steve, please let us know about any 
> issues with sel-sched code so we can help.
> 
> Andrey

Has anything been decided about how to fix PR 48673?  I am still seeing
gfortran.dg/sms-1.f and sms-2.f fail on IA64.

Steve Ellcey
sje@cup.hp.com
Andrey Belevantsev June 1, 2011, 8:18 a.m. UTC | #5
On 31.05.2011 23:59, Andrey Belevantsev wrote:
>
> On 31.05.2011 22:24, Steve Ellcey wrote:
>> Bernd,
>>
>> This patch (r174336) is causing me many testsuite failures on IA64.
>> Tests like gcc.c-torture/compile/20010408-1.c are dying with a
>> seg fault in vinsn_detach.
> I will look at it tomorrow. Bernd, Steve, please let us know about any
> issues with sel-sched code so we can help.
I cannot reproduce this with today's trunk with a cross either to 
ia64-linux or ia64-hpux, can you give me a test case with compiler options 
etc.?

Andrey

>
> Andrey
>
>>
>> #0 0x55b8760:0 in vinsn_detach (vi=0xf)
>> #1 0x55bda30:0 in clear_expr (expr=0x7fffeef0)
>> #2 0x562ea50:0 in schedule_expr_on_boundary (bnd=0x4040cf14,
>> expr_vliw=0x4040c4ac, seqno=-1)
>> #3 0x562fab0:0 in fill_insns (fence=0x4040bdec, seqno=-1,
>> scheduled_insns_tailpp=0x7fffeff0)
>> #4 0x563d3b0:0 in schedule_on_fences (fences=0x4040bde8, max_seqno=1,
>> scheduled_insns_tailpp=0x7fffeff0)
>> #5 0x563eb70:0 in sel_sched_region_2 (orig_max_seqno=22)
>> #6 0x563f1b0:0 in sel_sched_region_1 ()
>> #7 0x56403f0:0 in sel_sched_region (rgn=9)
>> #8 0x5640980:0 in run_selective_scheduling ()
>> #9 0x613d0f0:0 in ia64_reorg ()
>>
>> I am still trying to get more information but I was wondering if you
>> have already seen this problem or if anyone else has reported it?
>>
>> Steve Ellcey
>> sje@cup.hp.com
>>
Steve Ellcey June 1, 2011, 3:40 p.m. UTC | #6
On Wed, 2011-06-01 at 12:18 +0400, Andrey Belevantsev wrote:
> On 31.05.2011 23:59, Andrey Belevantsev wrote:
> >
> > On 31.05.2011 22:24, Steve Ellcey wrote:
> >> Bernd,
> >>
> >> This patch (r174336) is causing me many testsuite failures on IA64.
> >> Tests like gcc.c-torture/compile/20010408-1.c are dying with a
> >> seg fault in vinsn_detach.
> > I will look at it tomorrow. Bernd, Steve, please let us know about any
> > issues with sel-sched code so we can help.
> I cannot reproduce this with today's trunk with a cross either to 
> ia64-linux or ia64-hpux, can you give me a test case with compiler options 
> etc.?
> 
> Andrey

gcc.c-torture/compile/20010408-1.c was the test case that the stack
trace was from.  That test failed on IA64 HP-UX with just the -O3
option.  It looks like that passes in 64 bit mode though (-mlp64) so
that is probably why it also doesn't fail on Linux.

It looks like the failures on IA64 Linux require -g or
-fomit-frame-pointer in addition to -O3 in order to have the failure.

So for example, gcc.c-torture/execute/20020402-3.c, fails on IA64
Linux with -O3 -fomit-frame-pointer or with -O3 -g.

Steve Ellcey
sje@cup.hp.com
diff mbox

Patch

Index: gcc/sched-deps.c
===================================================================
--- gcc.orig/sched-deps.c
+++ gcc/sched-deps.c
@@ -489,13 +489,27 @@  deps_may_trap_p (const_rtx mem)
 
 /* Find the condition under which INSN is executed.  If REV is not NULL,
    it is set to TRUE when the returned comparison should be reversed
-   to get the actual condition.  */
+   to get the actual condition.
+   We only do actual work the first time we come here for an insn; the
+   results are cached in INSN_COND and INSN_REVERSE_COND.  */
 static rtx
 sched_get_condition_with_rev (const_rtx insn, bool *rev)
 {
   rtx pat = PATTERN (insn);
   rtx src;
 
+  if (INSN_COND (insn) == const_true_rtx)
+    return NULL_RTX;
+
+  if (INSN_COND (insn) != NULL_RTX)
+    {
+      if (rev)
+	*rev = INSN_REVERSE_COND (insn);
+      return INSN_COND (insn);
+    }
+
+  INSN_COND (insn) = const_true_rtx;
+  INSN_REVERSE_COND (insn) = false;
   if (pat == 0)
     return 0;
 
@@ -503,7 +517,10 @@  sched_get_condition_with_rev (const_rtx
     *rev = false;
 
   if (GET_CODE (pat) == COND_EXEC)
-    return COND_EXEC_TEST (pat);
+    {
+      INSN_COND (insn) = COND_EXEC_TEST (pat);
+      return COND_EXEC_TEST (pat);
+    }
 
   if (!any_condjump_p (insn) || !onlyjump_p (insn))
     return 0;
@@ -511,7 +528,10 @@  sched_get_condition_with_rev (const_rtx
   src = SET_SRC (pc_set (insn));
 
   if (XEXP (src, 2) == pc_rtx)
-    return XEXP (src, 0);
+    {
+      INSN_COND (insn) = XEXP (src, 0);
+      return XEXP (src, 0);
+    }
   else if (XEXP (src, 1) == pc_rtx)
     {
       rtx cond = XEXP (src, 0);
@@ -522,6 +542,8 @@  sched_get_condition_with_rev (const_rtx
 
       if (rev)
 	*rev = true;
+      INSN_COND (insn) = cond;
+      INSN_REVERSE_COND (insn) = true;
       return cond;
     }
 
@@ -2841,6 +2863,8 @@  sched_analyze_insn (struct deps_desc *de
     }
   else
     {
+      regset_head set_or_clobbered;
+
       EXECUTE_IF_SET_IN_REG_SET (reg_pending_uses, 0, i, rsi)
 	{
 	  struct deps_reg *reg_last = &deps->reg_last[i];
@@ -2871,6 +2895,21 @@  sched_analyze_insn (struct deps_desc *de
 	      }
 	  }
 
+      INIT_REG_SET (&set_or_clobbered);
+      bitmap_ior (&set_or_clobbered, reg_pending_clobbers, reg_pending_sets);
+      EXECUTE_IF_SET_IN_REG_SET (&set_or_clobbered, 0, i, rsi)
+	{
+	  struct deps_reg *reg_last = &deps->reg_last[i];
+	  rtx list;
+	  for (list = reg_last->uses; list; list = XEXP (list, 1))
+	    {
+	      rtx other = XEXP (list, 0);
+	      if (INSN_COND (other) != const_true_rtx
+		  && refers_to_regno_p (i, i + 1, INSN_COND (other), NULL))
+		INSN_COND (other) = const_true_rtx;
+	    }
+	}
+
       /* If the current insn is conditional, we can't free any
 	 of the lists.  */
       if (sched_has_condition_p (insn))
@@ -3245,6 +3284,10 @@  deps_analyze_insn (struct deps_desc *dep
   if (sched_deps_info->start_insn)
     sched_deps_info->start_insn (insn);
 
+  /* Record the condition for this insn.  */
+  if (NONDEBUG_INSN_P (insn))
+    sched_get_condition_with_rev (insn, NULL);
+
   if (NONJUMP_INSN_P (insn) || DEBUG_INSN_P (insn) || JUMP_P (insn))
     {
       /* Make each JUMP_INSN (but not a speculative check)
Index: gcc/sched-int.h
===================================================================
--- gcc.orig/sched-int.h
+++ gcc/sched-int.h
@@ -716,6 +716,17 @@  struct _haifa_deps_insn_data
      search in 'forw_deps'.  */
   deps_list_t resolved_forw_deps;
 
+  /* If the insn is conditional (either through COND_EXEC, or because
+     it is a conditional branch), this records the condition.  NULL
+     for insns that haven't been seen yet or don't have a condition;
+     const_true_rtx to mark an insn without a condition, or with a
+     condition that has been clobbered by a subsequent insn.  */
+  rtx cond;
+
+  /* True if the condition in 'cond' should be reversed to get the actual
+     condition.  */
+  unsigned int reverse_cond : 1;
+
   /* Some insns (e.g. call) are not allowed to move across blocks.  */
   unsigned int cant_move : 1;
 };
@@ -891,6 +902,8 @@  extern VEC(haifa_deps_insn_data_def, hea
 #define INSN_RESOLVED_FORW_DEPS(INSN) (HDID (INSN)->resolved_forw_deps)
 #define INSN_HARD_BACK_DEPS(INSN) (HDID (INSN)->hard_back_deps)
 #define INSN_SPEC_BACK_DEPS(INSN) (HDID (INSN)->spec_back_deps)
+#define INSN_COND(INSN)	(HDID (INSN)->cond)
+#define INSN_REVERSE_COND(INSN) (HDID (INSN)->reverse_cond)
 #define CANT_MOVE(INSN)	(HDID (INSN)->cant_move)
 #define CANT_MOVE_BY_LUID(LUID)	(VEC_index (haifa_deps_insn_data_def, h_d_i_d, \
                                             LUID)->cant_move)