Message ID | 4F5A7AB0.6020709@redhat.com |
---|---|
State | New |
Headers | show |
On Fri, 2012-03-09 at 15:48 -0600, Aldy Hernandez wrote: > Torvald is this what you were thinking of? Yes, but with an exit in the else branch or something that can cause x not being read after the condition. I _suppose_ that your original example would be an allowed transformation but just because x would be read anyway independently of flag's value; we can assume data-race freedom, and thus we must be able to read x in a data-race-free way even if flag is false, so flag's value actually doesn't matter. What about modifying the example like below? In this case, if flag2 is true, flag's value will matter and we can't move the load to x before it. Will PRE still introduce "tmp = x + 4" in such an example? Torvald > + __transaction_atomic { > + if (flag) > + y = x + 4; > + else > + // stuff if (flag2) return; > + z = x + 4; > + } > + > + PRE can rewrite this into: > + > + __transaction_atomic { > + if (flag) { > + tmp = x + 4; > + y = tmp; > + } else { > + // stuff > + tmp = x + 4; if (flag2) return; > + } > + z = tmp; > + } > + > + A later pass can move the now totally redundant [x + 4] > + before its publication predicated by "flag": > + > + __transaction_atomic { > + tmp = x + 4; > + if (flag) { > + } else { > + // stuff if (flag2) return; > + } > + z = tmp; > + */
On Fri, Mar 9, 2012 at 10:48 PM, Aldy Hernandez <aldyh@redhat.com> wrote: > >> Note that partial PRE (enabled at -O3) can insert expressions into paths >> that did _not_ execute the expression. For regular PRE you are right. >> >> Richard. > > > I've thought about this some more, and Torvald's comment makes a lot of > sense. PRE can make things completely redundant, and a later pass may move > things before its publication. I believe it's wise to disable PRE inside > transactions as originally discussed. The attached patch does so. > > Below is an example (from my patch) with an explanation of what may go > wrong. > > Torvald is this what you were thinking of? > > Richards, is this OK for the 4.7 branch and trunk (pending tests)? + if (flag_tm + && gimple_in_transaction (stmt) + && gimple_assign_single_p (stmt)) + { + tree rhs = gimple_assign_rhs1 (stmt); + if (DECL_P (rhs) && is_global_var (rhs)) + continue; this does not cover a read like 'a.b', nor a '*p' read. I think in some other thread I sketched some ref_may_refer_to_global function, maybe you remember. Richard. > Thanks. > Aldy > > + /* Non local loads in a transaction cannot be hoisted out, > + because they may make partially redundant expressions > + totally redundant, which a later pass may move before its > + publication by another thread. > + > + For example: > + > + __transaction_atomic { > + if (flag) > + y = x + 4; > + else > + // stuff > + z = x + 4; > + } > + > + PRE can rewrite this into: > + > + __transaction_atomic { > + if (flag) { > + tmp = x + 4; > + y = tmp; > + } else { > + // stuff > + tmp = x + 4; > + } > + z = tmp; > + } > + > + A later pass can move the now totally redundant [x + 4] > + before its publication predicated by "flag": > + > + __transaction_atomic { > + tmp = x + 4; > + if (flag) { > + } else { > + // stuff > + } > + z = tmp; > + */
On 03/10/12 08:14, Torvald Riegel wrote: > On Fri, 2012-03-09 at 15:48 -0600, Aldy Hernandez wrote: >> Torvald is this what you were thinking of? > > Yes, but with an exit in the else branch or something that can cause x > not being read after the condition. I _suppose_ that your original > example would be an allowed transformation but just because x would be > read anyway independently of flag's value; we can assume data-race > freedom, and thus we must be able to read x in a data-race-free way even > if flag is false, so flag's value actually doesn't matter. > > What about modifying the example like below? In this case, if flag2 is > true, flag's value will matter and we can't move the load to x before > it. Will PRE still introduce "tmp = x + 4" in such an example? > > Torvald > >> + __transaction_atomic { >> + if (flag) >> + y = x + 4; >> + else >> + // stuff > if (flag2) > return; >> + z = x + 4; >> + } Hmmm, by adding the exit, PRE introduces the read of "x + 4" correctly *after* the read of flag2, so something like this: __transaction_atomic { if (flag) { tmp = x + 4; y = tmp; } else { if (flag2) return; tmp = x + 4; } z = tmp; So... by your logic, this is allowed because the read of "x" would happen anyway (it is not inserted in the "flag2 != 0" case). I'm back to having no testcase, so perhaps I should drop this patch until we can come up with a PRE testcase that actually triggers a publication safety violation. Richi, you mentioned partial PRE inserting code into code paths that previously did not have reads. Do you have an example? As much as I tried, I could not trigger a partial PRE, but that may be because I don't understand the algorithm very well. Thanks guys. Aldy
Index: testsuite/gcc.dg/tm/pr51752.c =================================================================== --- testsuite/gcc.dg/tm/pr51752.c (revision 0) +++ testsuite/gcc.dg/tm/pr51752.c (revision 0) @@ -0,0 +1,23 @@ +/* { dg-do compile } */ +/* { dg-options "-fgnu-tm -fdump-tree-pre -O2" } */ + +int flag, hoist, y, z, george; + +void +foo (void) +{ + __transaction_atomic + { + if (george) + { + if (flag) + y = hoist + 4; + else + flag = 888; + z = hoist + 4; + } + } +} + +/* { dg-final { scan-tree-dump-times "pretmp.*= hoist" 0 "pre" } } */ +/* { dg-final { cleanup-tree-dump "pre" } } */ Index: tree-ssa-pre.c =================================================================== --- tree-ssa-pre.c (revision 184935) +++ tree-ssa-pre.c (working copy) @@ -3986,6 +3986,54 @@ compute_avail (void) || stmt_could_throw_p (stmt)) continue; + /* Non local loads in a transaction cannot be hoisted out, + because they may make partially redundant expressions + totally redundant, which a later pass may move before its + publication by another thread. + + For example: + + __transaction_atomic { + if (flag) + y = x + 4; + else + // stuff + z = x + 4; + } + + PRE can rewrite this into: + + __transaction_atomic { + if (flag) { + tmp = x + 4; + y = tmp; + } else { + // stuff + tmp = x + 4; + } + z = tmp; + } + + A later pass can move the now totally redundant [x + 4] + before its publication predicated by "flag": + + __transaction_atomic { + tmp = x + 4; + if (flag) { + } else { + // stuff + } + z = tmp; + */ + if (flag_tm + && gimple_in_transaction (stmt) + && gimple_assign_single_p (stmt)) + { + tree rhs = gimple_assign_rhs1 (stmt); + if (DECL_P (rhs) && is_global_var (rhs)) + continue; + } + switch (gimple_code (stmt)) { case GIMPLE_RETURN: @@ -4896,6 +4944,9 @@ execute_pre (bool do_fre) init_pre (do_fre); scev_initialize (); + if (flag_tm) + compute_transaction_bits (); + /* Collect and value number expressions computed in each basic block. */ compute_avail ();