Patchwork [PR51752] publication safety violations in loop invariant motion pass

login
register
mail settings
Submitter Aldy Hernandez
Date March 9, 2012, 9:48 p.m.
Message ID <4F5A7AB0.6020709@redhat.com>
Download mbox | patch
Permalink /patch/145764/
State New
Headers show

Comments

Aldy Hernandez - March 9, 2012, 9:48 p.m.
> Note that partial PRE (enabled at -O3) can insert expressions into paths
> that did _not_ execute the expression.  For regular PRE you are right.
>
> Richard.

I've thought about this some more, and Torvald's comment makes a lot of 
sense.  PRE can make things completely redundant, and a later pass may 
move things before its publication.  I believe it's wise to disable PRE 
inside transactions as originally discussed.  The attached patch does so.

Below is an example (from my patch) with an explanation of what may go 
wrong.

Torvald is this what you were thinking of?

Richards, is this OK for the 4.7 branch and trunk (pending tests)?

Thanks.
Aldy

+	  /* Non local loads in a transaction cannot be hoisted out,
+	     because they may make partially redundant expressions
+	     totally redundant, which a later pass may move before its
+	     publication by another thread.
+
+	     For example:
+
+	       __transaction_atomic {
+	         if (flag)
+		   y = x + 4;
+		 else
+		   // stuff
+		 z = x + 4;
+	       }
+
+	     PRE can rewrite this into:
+
+	       __transaction_atomic {
+	         if (flag) {
+	           tmp = x + 4;
+		   y = tmp;
+		 } else {
+		   // stuff
+		   tmp = x + 4;
+		 }
+		 z = tmp;
+	       }
+
+	     A later pass can move the now totally redundant [x + 4]
+	     before its publication predicated by "flag":
+
+	       __transaction_atomic {
+	         tmp = x + 4;
+		 if (flag) {
+		 } else {
+		   // stuff
+		 }
+		 z = tmp;
+	  */
PR middle-end/51752
	* tree-ssa-pre.c (compute_avail): Disable PRE movements inside a
	transaction.
	(execute_pre): Compute transaction bits.
Torvald Riegel - March 10, 2012, 2:14 p.m.
On Fri, 2012-03-09 at 15:48 -0600, Aldy Hernandez wrote:
> Torvald is this what you were thinking of?

Yes, but with an exit in the else branch or something that can cause x
not being read after the condition.  I _suppose_ that your original
example would be an allowed transformation but just because x would be
read anyway independently of flag's value; we can assume data-race
freedom, and thus we must be able to read x in a data-race-free way even
if flag is false, so flag's value actually doesn't matter.

What about modifying the example like below?  In this case, if flag2 is
true, flag's value will matter and we can't move the load to x before
it.  Will PRE still introduce "tmp = x + 4" in such an example?

Torvald

> +	       __transaction_atomic {
> +	         if (flag)
> +		   y = x + 4;
> +		 else
> +		   // stuff
                     if (flag2)
                       return;
> +		 z = x + 4;
> +	       }
> +
> +	     PRE can rewrite this into:
> +
> +	       __transaction_atomic {
> +	         if (flag) {
> +	           tmp = x + 4;
> +		   y = tmp;
> +		 } else {
> +		   // stuff
> +		   tmp = x + 4;
                     if (flag2)
                       return;
> +		 }
> +		 z = tmp;
> +	       }
> +
> +	     A later pass can move the now totally redundant [x + 4]
> +	     before its publication predicated by "flag":
> +
> +	       __transaction_atomic {
> +	         tmp = x + 4;
> +		 if (flag) {
> +		 } else {
> +		   // stuff
                     if (flag2)
                       return;
> +		 }
> +		 z = tmp;
> +	  */
Richard Guenther - March 12, 2012, 11:57 a.m.
On Fri, Mar 9, 2012 at 10:48 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
>
>> Note that partial PRE (enabled at -O3) can insert expressions into paths
>> that did _not_ execute the expression.  For regular PRE you are right.
>>
>> Richard.
>
>
> I've thought about this some more, and Torvald's comment makes a lot of
> sense.  PRE can make things completely redundant, and a later pass may move
> things before its publication.  I believe it's wise to disable PRE inside
> transactions as originally discussed.  The attached patch does so.
>
> Below is an example (from my patch) with an explanation of what may go
> wrong.
>
> Torvald is this what you were thinking of?
>
> Richards, is this OK for the 4.7 branch and trunk (pending tests)?

+	  if (flag_tm
+	      && gimple_in_transaction (stmt)
+	      && gimple_assign_single_p (stmt))
+	    {
+	      tree rhs = gimple_assign_rhs1 (stmt);
+	      if (DECL_P (rhs) && is_global_var (rhs))
+		continue;

this does not cover a read like 'a.b', nor a '*p' read.  I think in some other
thread I sketched some ref_may_refer_to_global function, maybe you
remember.

Richard.

> Thanks.
> Aldy
>
> +         /* Non local loads in a transaction cannot be hoisted out,
> +            because they may make partially redundant expressions
> +            totally redundant, which a later pass may move before its
> +            publication by another thread.
> +
> +            For example:
> +
> +              __transaction_atomic {
> +                if (flag)
> +                  y = x + 4;
> +                else
> +                  // stuff
> +                z = x + 4;
> +              }
> +
> +            PRE can rewrite this into:
> +
> +              __transaction_atomic {
> +                if (flag) {
> +                  tmp = x + 4;
> +                  y = tmp;
> +                } else {
> +                  // stuff
> +                  tmp = x + 4;
> +                }
> +                z = tmp;
> +              }
> +
> +            A later pass can move the now totally redundant [x + 4]
> +            before its publication predicated by "flag":
> +
> +              __transaction_atomic {
> +                tmp = x + 4;
> +                if (flag) {
> +                } else {
> +                  // stuff
> +                }
> +                z = tmp;
> +         */
Aldy Hernandez - March 12, 2012, 4:47 p.m.
On 03/10/12 08:14, Torvald Riegel wrote:
> On Fri, 2012-03-09 at 15:48 -0600, Aldy Hernandez wrote:
>> Torvald is this what you were thinking of?
>
> Yes, but with an exit in the else branch or something that can cause x
> not being read after the condition.  I _suppose_ that your original
> example would be an allowed transformation but just because x would be
> read anyway independently of flag's value; we can assume data-race
> freedom, and thus we must be able to read x in a data-race-free way even
> if flag is false, so flag's value actually doesn't matter.
>
> What about modifying the example like below?  In this case, if flag2 is
> true, flag's value will matter and we can't move the load to x before
> it.  Will PRE still introduce "tmp = x + 4" in such an example?
>
> Torvald
>
>> +	       __transaction_atomic {
>> +	         if (flag)
>> +		   y = x + 4;
>> +		 else
>> +		   // stuff
>                       if (flag2)
>                         return;
>> +		 z = x + 4;
>> +	       }

Hmmm, by adding the exit, PRE introduces the read of "x + 4" correctly 
*after* the read of flag2, so something like this:

__transaction_atomic {
	if (flag) {
		tmp = x + 4;
		y = tmp;
	} else {
		if (flag2)
			return;
		tmp = x + 4;
	}
	z = tmp;

So... by your logic, this is allowed because the read of "x" would 
happen anyway (it is not inserted in the "flag2 != 0" case).

I'm back to having no testcase, so perhaps I should drop this patch 
until we can come up with a PRE testcase that actually triggers a 
publication safety violation.

Richi, you mentioned partial PRE inserting code into code paths that 
previously did not have reads.  Do you have an example?  As much as I 
tried, I could not trigger a partial PRE, but that may be because I 
don't understand the algorithm very well.

Thanks guys.
Aldy

Patch

Index: testsuite/gcc.dg/tm/pr51752.c
===================================================================
--- testsuite/gcc.dg/tm/pr51752.c	(revision 0)
+++ testsuite/gcc.dg/tm/pr51752.c	(revision 0)
@@ -0,0 +1,23 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fgnu-tm -fdump-tree-pre -O2" } */
+
+int flag, hoist, y, z, george;
+
+void
+foo (void)
+{
+  __transaction_atomic
+  {
+    if (george)
+      {
+	if (flag)
+	  y = hoist + 4;
+	else
+	  flag = 888;
+	z = hoist + 4;
+      }
+  }
+}
+
+/* { dg-final { scan-tree-dump-times "pretmp.*= hoist" 0 "pre" } } */
+/* { dg-final { cleanup-tree-dump "pre" } } */
Index: tree-ssa-pre.c
===================================================================
--- tree-ssa-pre.c	(revision 184935)
+++ tree-ssa-pre.c	(working copy)
@@ -3986,6 +3986,54 @@  compute_avail (void)
 	      || stmt_could_throw_p (stmt))
 	    continue;
 
+	  /* Non local loads in a transaction cannot be hoisted out,
+	     because they may make partially redundant expressions
+	     totally redundant, which a later pass may move before its
+	     publication by another thread.
+
+	     For example:
+
+	       __transaction_atomic {
+	         if (flag)
+		   y = x + 4;
+		 else
+		   // stuff
+		 z = x + 4;
+	       }
+
+	     PRE can rewrite this into:
+
+	       __transaction_atomic {
+	         if (flag) {
+	           tmp = x + 4;
+		   y = tmp;
+		 } else {
+		   // stuff
+		   tmp = x + 4;
+		 }
+		 z = tmp;
+	       }
+
+	     A later pass can move the now totally redundant [x + 4]
+	     before its publication predicated by "flag":
+
+	       __transaction_atomic {
+	         tmp = x + 4;
+		 if (flag) {
+		 } else {
+		   // stuff
+		 }
+		 z = tmp;
+	  */
+	  if (flag_tm
+	      && gimple_in_transaction (stmt)
+	      && gimple_assign_single_p (stmt))
+	    {
+	      tree rhs = gimple_assign_rhs1 (stmt);
+	      if (DECL_P (rhs) && is_global_var (rhs))
+		continue;
+	    }
+
 	  switch (gimple_code (stmt))
 	    {
 	    case GIMPLE_RETURN:
@@ -4896,6 +4944,9 @@  execute_pre (bool do_fre)
   init_pre (do_fre);
   scev_initialize ();
 
+  if (flag_tm)
+    compute_transaction_bits ();
+
   /* Collect and value number expressions computed in each basic block.  */
   compute_avail ();