Patchwork [SMS] Fix violation of memory dependence

login
register
mail settings
Submitter Revital Eres
Date June 13, 2011, 7:29 a.m.
Message ID <BANLkTi=a=dUCO_3LyEcP1HufCyOE-r0CsA@mail.gmail.com>
Download mbox | patch
Permalink /patch/100137/
State New
Headers show

Comments

Revital Eres - June 13, 2011, 7:29 a.m.
Hello,

The attached patch fixes violation of memory dependencies. The
problematic scenario happens when -fmodulo-sched-allow-regmoves flag
is set and certain anti-dep edges are not created.

For example, consider the following three instructions and the edges
between them.  When -fmodulo-sched-allow-regmoves is set the edge (63 -
Anti, 0 -> 64) is not created. (probably due to transitivity)

Insn 63)  r168 = MEM[176]
Out edges: (63 - Anti, 0 -> 64)
In edges: (64 - True, 1 -> 63), (68 - True, 1 -> 63)

insn 64)  176 = 176 + 4
Out edges: (64 - True, 1 -> 63), (64 - True, 0-> 68)
In edges: (63 - Anti, 0 -> 64)

insn 68)  MEM[176 – 4] =  193
Out edges: (68 - True, 1 -> 63)
In edges: (64 - True, 0-> 68)

This anti-dep edge is on the path from one memory instruction to another
--- from 63 to 68; such that removing the edge caused a violation of
the memory dependencies as insn 63 was scheduled after insn 68.

This patch adds intra edges between every two memory instructions in
this case.  It fixes recent bootstrap failure on ARM. (with SMS flags)

The patch was tested as follows:
On ppc64-redhat-linux regtest as well as bootstrap with SMS flags
enabling SMS also on loops with stage count 1.  Regtested on SPU.
On arm-linux-gnueabi bootstrap c language with SMS
flags enabling SMS also on loops with stage count 1
and currently regression testing on c,c++.

OK for mainline once regtest on arm-linux-gnueabi completes?

Thanks,
Revital

Changelog:

gcc/
        * ddg.c (add_intra_loop_mem_dep): New function.
        (build_intra_loop_deps): Call it.

testsuite/
        * gcc.dg/sms-9.c: New file.
Ayal Zaks - June 14, 2011, 7:28 a.m.
Revital Eres <revital.eres@linaro.org> wrote on 13/06/2011 10:29:06 AM:

> From: Revital Eres <revital.eres@linaro.org>

> To: Ayal Zaks/Haifa/IBM@IBMIL

> Cc: gcc-patches@gcc.gnu.org, Patch Tracking <patches@linaro.org>

> Date: 13/06/2011 10:29 AM

> Subject: [PATCH, SMS] Fix violation of memory dependence

>

> Hello,

>

> The attached patch fixes violation of memory dependencies. The

> problematic scenario happens when -fmodulo-sched-allow-regmoves flag

> is set and certain anti-dep edges are not created.

>

> For example, consider the following three instructions and the edges

> between them.  When -fmodulo-sched-allow-regmoves is set the edge (63 -

> Anti, 0 -> 64) is not created. (probably due to transitivity)

>

> Insn 63)  r168 = MEM[176]

> Out edges: (63 - Anti, 0 -> 64)

> In edges: (64 - True, 1 -> 63), (68 - True, 1 -> 63)

>

> insn 64)  176 = 176 + 4

> Out edges: (64 - True, 1 -> 63), (64 - True, 0-> 68)

> In edges: (63 - Anti, 0 -> 64)

>

> insn 68)  MEM[176 – 4] =  193

> Out edges: (68 - True, 1 -> 63)

> In edges: (64 - True, 0-> 68)

>

> This anti-dep edge is on the path from one memory instruction to another

> --- from 63 to 68; such that removing the edge caused a violation of

> the memory dependencies as insn 63 was scheduled after insn 68.

>

> This patch adds intra edges between every two memory instructions in

> this case.  It fixes recent bootstrap failure on ARM. (with SMS flags)

>

> The patch was tested as follows:

> On ppc64-redhat-linux regtest as well as bootstrap with SMS flags

> enabling SMS also on loops with stage count 1.  Regtested on SPU.

> On arm-linux-gnueabi bootstrap c language with SMS

> flags enabling SMS also on loops with stage count 1

> and currently regression testing on c,c++.

>

> OK for mainline once regtest on arm-linux-gnueabi completes?

>


Yes, this is a straightforward fix to a wrong-code bug, as discussed
offline. Other alternatives that might introduce less edges:
o connect predecessors of u with v, and u with successors of v, when
removing edge (u,v). Maybe there are other cases which rely on transitivity
(?).
o have a version of sched_analyze that avoids creating register anti-deps
to begin with, and thus will create memory-deps in the absence of
transitivity.


>>         * ddg.c (add_intra_loop_mem_dep): New function.


You could check first thing if (from->cuid == to->cuid), for code clarity.

Nice catch,
Ayal.


> Thanks,

> Revital

>

> Changelog:

>

> gcc/

>         * ddg.c (add_intra_loop_mem_dep): New function.

>         (build_intra_loop_deps): Call it.

>

> testsuite/

>         * gcc.dg/sms-9.c: New file.

> [attachment "patch_fix_regmoves_12_6.txt" deleted by Ayal Zaks/Haifa/IBM]
Revital Eres - June 14, 2011, 8:29 a.m.
Hello,

> Yes, this is a straightforward fix to a wrong-code bug, as discussed
> offline. Other alternatives that might introduce less edges:
> o connect predecessors of u with v, and u with successors of v, when
> removing edge (u,v). Maybe there are other cases which rely on transitivity
> (?).

Right. as discussed off-line I will further think if we are currently
cover all the cases.
>
>>>         * ddg.c (add_intra_loop_mem_dep): New function.
>
> You could check first thing if (from->cuid == to->cuid), for code clarity.

I will address this point separately and commit the current version of
the patch as is if that's OK.

Thanks,
Revital
Revital Eres - June 14, 2011, 9:09 a.m.
Hello,

>> You could check first thing if (from->cuid == to->cuid), for code clarity.
>
> I will address this point separately and commit the current version of
> the patch as is if that's OK.

Re-thinking about that, I'll prepare a new version of the patch which
addresses this and re-send it.

Sorry for the confusion,
Revital
Ayal Zaks - June 15, 2011, 10:18 p.m.
> OK for mainline once regtest on arm-linux-gnueabi completes?

ok.


+  else
+    {
+      if (mem_read_insn_p (to->insn))
+	return;
+      else

better do
   else if (!mem_read_insn_p (to->insn))

+	create_ddg_dep_no_link (g, from, to, ANTI_DEP, MEM_DEP, 0);
+    }

Ayal.


Revital Eres <revital.eres@linaro.org> wrote on 15/06/2011 11:45:15 AM:

> From: Revital Eres <revital.eres@linaro.org>
> To: Ayal Zaks/Haifa/IBM@IBMIL
> Cc: gcc-patches@gcc.gnu.org, Patch Tracking <patches@linaro.org>
> Date: 15/06/2011 11:45 AM
> Subject: Re: [PATCH, SMS] Fix violation of memory dependence
>
> Hello,
>
> >>>         * ddg.c (add_intra_loop_mem_dep): New function.
> >
> > You could check first thing if (from->cuid == to->cuid), for code
clarity.
>
> Attached is the new version of the patch which addresses this.
>
> The patch was re-tested as follows:
> On ppc64-redhat-linux regtest as well as bootstrap with SMS flags
> enabling SMS also on loops with stage count 1.  Regtested on SPU.
> On arm-linux-gnueabi bootstrap c language with SMS
> flags enabling SMS also on loops with stage count 1
> and currently regression testing on c,c++.
>
> OK for mainline once regtest on arm-linux-gnueabi completes?
>
> Thanks,
> Revital
>
> gcc/
>         * ddg.c (add_intra_loop_mem_dep): New function.
>         (build_intra_loop_deps): Call it.
>
> testsuite/
>         * gcc.dg/sms-9.c: New file.
> [attachment "patch_sms_14_6.txt" deleted by Ayal Zaks/Haifa/IBM]
Revital Eres - June 16, 2011, 4:04 a.m.
Hello,

> better do
>   else if (!mem_read_insn_p (to->insn))
>
> +       create_ddg_dep_no_link (g, from, to, ANTI_DEP, MEM_DEP, 0);
> +    }

Done. Committed to -r175090.

Thanks,
Revital

Patch

Index: ddg.c

===================================================================
--- ddg.c	(revision 174906)

+++ ddg.c	(working copy)

@@ -390,6 +390,36 @@  insns_may_alias_p (rtx insn1, rtx insn2)

 			 &PATTERN (insn2));
 }
 
+/* Given two nodes, analyze their RTL insns and add intra-loop mem deps

+   to ddg G.  */

+static void

+add_intra_loop_mem_dep (ddg_ptr g, ddg_node_ptr from, ddg_node_ptr to)

+{

+

+  if (!insns_may_alias_p (from->insn, to->insn))

+    /* Do not create edge if memory references have disjoint alias sets.  */

+    return;

+

+  if (mem_write_insn_p (from->insn))

+    {

+      if (mem_read_insn_p (to->insn))

+	create_ddg_dep_no_link (g, from, to,

+				DEBUG_INSN_P (to->insn)

+				? ANTI_DEP : TRUE_DEP, MEM_DEP, 0);

+      else if (from->cuid != to->cuid)

+	create_ddg_dep_no_link (g, from, to,

+				DEBUG_INSN_P (to->insn)

+				? ANTI_DEP : OUTPUT_DEP, MEM_DEP, 0);

+    }

+  else

+    {

+      if (mem_read_insn_p (to->insn))

+	return;

+      else if (from->cuid != to->cuid)

+	create_ddg_dep_no_link (g, from, to, ANTI_DEP, MEM_DEP, 0);

+    }

+}

+

 /* Given two nodes, analyze their RTL insns and add inter-loop mem deps
    to ddg G.  */
 static void
@@ -477,10 +507,22 @@  build_intra_loop_deps (ddg_ptr g)

 	      if (DEBUG_INSN_P (j_node->insn))
 		continue;
 	      if (mem_access_insn_p (j_node->insn))
- 		/* Don't bother calculating inter-loop dep if an intra-loop dep

-		   already exists.  */

+		{

+		  /* Don't bother calculating inter-loop dep if an intra-loop dep

+		     already exists.  */

 	      	  if (! TEST_BIT (dest_node->successors, j))
 		    add_inter_loop_mem_dep (g, dest_node, j_node);
+		  /* If -fmodulo-sched-allow-regmoves

+		     is set certain anti-dep edges are not created.

+		     It might be that these anti-dep edges are on the

+		     path from one memory instruction to another such that

+		     removing these edges could cause a violation of the

+		     memory dependencies.  Thus we add intra edges between

+		     every two memory instructions in this case.  */

+		  if (flag_modulo_sched_allow_regmoves

+		      && !TEST_BIT (dest_node->predecessors, j))

+		    add_intra_loop_mem_dep (g, j_node, dest_node);

+		}

             }
         }
     }
Index: testsuite/gcc.dg/sms-9.c

===================================================================
--- testsuite/gcc.dg/sms-9.c	(revision 0)

+++ testsuite/gcc.dg/sms-9.c	(revision 0)

@@ -0,0 +1,60 @@ 

+/* { dg-do run } */

+/* { dg-options "-O2 -fmodulo-sched -fno-auto-inc-dec -O2 -fmodulo-sched-allow-regmoves" } */

+

+#include <stdlib.h>

+#include <stdarg.h>

+

+struct df_ref_info

+{

+  unsigned int *begin;

+  unsigned int *count;

+};

+

+extern void *memset (void *s, int c, __SIZE_TYPE__ n);

+

+

+__attribute__ ((noinline))

+int

+df_reorganize_refs_by_reg_by_insn (struct df_ref_info *ref_info,

+			           int num, unsigned int start)

+{

+  unsigned int m = num;

+  unsigned int offset = 77;

+  unsigned int r;

+

+  for (r = start; r < m; r++)

+    {

+      ref_info->begin[r] = offset;

+      offset += ref_info->count[r];

+      ref_info->count[r] = 0;

+    }

+

+  return offset;

+}

+

+int

+main ()

+{

+  struct df_ref_info temp;

+  int num = 100;

+  unsigned int start = 5;

+  int i, offset;

+

+  temp.begin = malloc (100 * sizeof (unsigned int));

+  temp.count = malloc (100 * sizeof (unsigned int));

+

+  memset (temp.begin, 0, sizeof (unsigned int) * num);

+  memset (temp.count, 0, sizeof (unsigned int) * num);

+

+  for (i = 0; i < num; i++)

+    temp.count[i] = i + 1;

+

+  offset = df_reorganize_refs_by_reg_by_insn (&temp, num, start);

+

+  if (offset != 5112)

+    abort ();

+

+  free (temp.begin);

+  free (temp.count);

+  return 0;

+}