diff mbox

Fix -fcrossjumping at -O1 (PR rtl-optimization/48156)

Message ID 20110318161312.GC30899@tyan-ft48-01.lab.bos.redhat.com
State New
Headers show

Commit Message

Jakub Jelinek March 18, 2011, 4:13 p.m. UTC
Hi!

The testcase below is miscompiled on x86_64-linux.
The problem is that try_head_merge_bb uses df_get_bb_dirty
to see if it can use df_get_live_out () info
(through simulate_backwards_to_point) reliably, but as at
-O1 the live problem isn't computed, only lr problem,
df_get_live_out () returns the lr out bitmap, but
df_get_bb_dirty which only looks at df_live problem always
returns false.  Thus, if the merge_bb's to compute live_union
from for can_move_insns_across has dirty lr solution caused by earlier
successful crossjumping in the same pass, we happily use its out of data
lr bitmap.

Fixed by returning if the bb has dirty lr problem if live problem
isn't computed.  df_get_bb_dirty is only used here in crossjumping and
in ifcvt, but in the latter we add the live problem at the start of
ifcvt and remove it at the end of ifcvt for -O1.

Bootstrapped/regtested on x86_64-linux, ok for trunk and 4.6.1?

While it is a wrong-code bug, I doubt many people are using
-O -fcrossjumping
options together, and -fcrossjumping is only the default for -O2+,
where live problem is added by default.

2011-03-18  Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/48156
	* df-core.c (df_get_bb_dirty): Use df_lr if df_live is NULL.

	* gcc.dg/pr48156.c: New test.


	Jakub

Comments

Kenneth Zadeck March 18, 2011, 4:23 p.m. UTC | #1
I believe that this is not the right way to go.

if someone specifies -fcrossjumping, then the pass should turn on live 
for the duration of the pass just as ifcvt does.    If they ask for 
crossjumping you should give them crossjumping and not some crippled 
version of it.

kenny



On 03/18/2011 12:13 PM, Jakub Jelinek wrote:
> Hi!
>
> The testcase below is miscompiled on x86_64-linux.
> The problem is that try_head_merge_bb uses df_get_bb_dirty
> to see if it can use df_get_live_out () info
> (through simulate_backwards_to_point) reliably, but as at
> -O1 the live problem isn't computed, only lr problem,
> df_get_live_out () returns the lr out bitmap, but
> df_get_bb_dirty which only looks at df_live problem always
> returns false.  Thus, if the merge_bb's to compute live_union
> from for can_move_insns_across has dirty lr solution caused by earlier
> successful crossjumping in the same pass, we happily use its out of data
> lr bitmap.
>
> Fixed by returning if the bb has dirty lr problem if live problem
> isn't computed.  df_get_bb_dirty is only used here in crossjumping and
> in ifcvt, but in the latter we add the live problem at the start of
> ifcvt and remove it at the end of ifcvt for -O1.
>
> Bootstrapped/regtested on x86_64-linux, ok for trunk and 4.6.1?
>
> While it is a wrong-code bug, I doubt many people are using
> -O -fcrossjumping
> options together, and -fcrossjumping is only the default for -O2+,
> where live problem is added by default.
>
> 2011-03-18  Jakub Jelinek<jakub@redhat.com>
>
> 	PR rtl-optimization/48156
> 	* df-core.c (df_get_bb_dirty): Use df_lr if df_live is NULL.
>
> 	* gcc.dg/pr48156.c: New test.
>
> --- gcc/df-core.c.jj	2010-12-14 08:11:39.000000000 +0100
> +++ gcc/df-core.c	2011-03-18 14:22:43.000000000 +0100
> @@ -1400,10 +1400,16 @@ df_mark_solutions_dirty (void)
>   bool
>   df_get_bb_dirty (basic_block bb)
>   {
> -  if (df&&  df_live)
> -    return bitmap_bit_p (df_live->out_of_date_transfer_functions, bb->index);
> -  else
> -    return false;
> +  if (df)
> +    {
> +      if (df_live)
> +	return bitmap_bit_p (df_live->out_of_date_transfer_functions,
> +			     bb->index);
> +      else if (df_lr)
> +	return bitmap_bit_p (df_lr->out_of_date_transfer_functions,
> +			     bb->index);
> +    }
> +  return false;
>   }
>
>
> --- gcc/testsuite/gcc.dg/pr48156.c.jj	2011-03-18 14:57:34.000000000 +0100
> +++ gcc/testsuite/gcc.dg/pr48156.c	2011-03-18 14:57:14.000000000 +0100
> @@ -0,0 +1,45 @@
> +/* PR rtl-optimization/48156 */
> +/* { dg-do run } */
> +/* { dg-options "-O -fcrossjumping --param min-crossjump-insns=1" } */
> +
> +extern void abort (void);
> +
> +static int __attribute__ ((noinline, noclone))
> +equals (int s1, int s2)
> +{
> +  return s1 == s2;
> +}
> +
> +static int __attribute__ ((noinline, noclone))
> +bar (void)
> +{
> +  return 1;
> +}
> +
> +static void __attribute__ ((noinline, noclone))
> +baz (int f, int j)
> +{
> +  if (f != 4 || j != 2)
> +    abort ();
> +}
> +
> +void
> +foo (int x)
> +{
> +  int i = 0, j = bar ();
> +
> +  if (x == 1)
> +    i = 2;
> +
> +  if (j&&  equals (i, j))
> +    baz (8, i);
> +  else
> +    baz (4, i);
> +}
> +
> +int
> +main ()
> +{
> +  foo (1);
> +  return 0;
> +}
>
> 	Jakub
Paolo Bonzini March 18, 2011, 4:25 p.m. UTC | #2
On Fri, Mar 18, 2011 at 17:23, Kenneth Zadeck <zadeck@naturalbridge.com> wrote:
> I believe that this is not the right way to go.
>
> if someone specifies -fcrossjumping, then the pass should turn on live for
> the duration of the pass just as ifcvt does.    If they ask for crossjumping
> you should give them crossjumping and not some crippled version of it.

Then we should have both patches, but df_lr is not always being used
as a fallback for df_live, and that is a bug.

Paolo
Jakub Jelinek March 18, 2011, 4:31 p.m. UTC | #3
On Fri, Mar 18, 2011 at 12:23:11PM -0400, Kenneth Zadeck wrote:
> I believe that this is not the right way to go.
> 
> if someone specifies -fcrossjumping, then the pass should turn on
> live for the duration of the pass just as ifcvt does.    If they ask
> for crossjumping you should give them crossjumping and not some
> crippled version of it.

Such (untested) patch is in
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48156#c7
But Paolo just added comment there that he prefers this version.

Is the live problem so much more useful for crossjumping than lr problem?
All it wants to prove is if it can safely move a couple of instructions
across some other instructions, and the live_union bitmap that is computed
using the live/lr problem is used to find out if registers set by the
moved instructions are actually live at the end of those instructions
or not.  What would be an example where live problem would allow optimizing
more than lr?

	Jakub
Michael Matz March 18, 2011, 4:34 p.m. UTC | #4
Hi,

On Fri, 18 Mar 2011, Kenneth Zadeck wrote:

> I believe that this is not the right way to go.
> 
> if someone specifies -fcrossjumping, then the pass should turn on live for the
> duration of the pass just as ifcvt does.    If they ask for crossjumping you
> should give them crossjumping and not some crippled version of it.

OTOH the function is called df_get_bb_dirty, not 
df_get_bb_dirty_but_only_for_df_live.  So irrespective of how 
-fcrossjumping should behave with -O1 the patch seems to be a good idea.


Ciao,
Michael.
Kenneth Zadeck March 18, 2011, 4:35 p.m. UTC | #5
I could be happy with both patches.

On 03/18/2011 12:25 PM, Paolo Bonzini wrote:
> On Fri, Mar 18, 2011 at 17:23, Kenneth Zadeck<zadeck@naturalbridge.com>  wrote:
>> I believe that this is not the right way to go.
>>
>> if someone specifies -fcrossjumping, then the pass should turn on live for
>> the duration of the pass just as ifcvt does.    If they ask for crossjumping
>> you should give them crossjumping and not some crippled version of it.
> Then we should have both patches, but df_lr is not always being used
> as a fallback for df_live, and that is a bug.
>
> Paolo
Paolo Bonzini March 18, 2011, 5:45 p.m. UTC | #6
On Fri, Mar 18, 2011 at 17:31, Jakub Jelinek <jakub@redhat.com> wrote:
> On Fri, Mar 18, 2011 at 12:23:11PM -0400, Kenneth Zadeck wrote:
>> I believe that this is not the right way to go.
>>
>> if someone specifies -fcrossjumping, then the pass should turn on
>> live for the duration of the pass just as ifcvt does.    If they ask
>> for crossjumping you should give them crossjumping and not some
>> crippled version of it.
>
> Such (untested) patch is in
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48156#c7
> But Paolo just added comment there that he prefers this version.
>
> Is the live problem so much more useful for crossjumping than lr problem?
> All it wants to prove is if it can safely move a couple of instructions
> across some other instructions, and the live_union bitmap that is computed
> using the live/lr problem is used to find out if registers set by the
> moved instructions are actually live at the end of those instructions
> or not.  What would be an example where live problem would allow optimizing
> more than lr?

LIVE == LR except when you have uninitialized uses.

Unless it is needed for correctness, I see no reason to prefer LIVE to
LR at -O1.

Paolo
Kenneth Zadeck March 18, 2011, 6:18 p.m. UTC | #7
i see your point.    I had forgotten that crossjumping is not really a 
pass, it is more of an infectious agent.

On 03/18/2011 01:45 PM, Paolo Bonzini wrote:
> On Fri, Mar 18, 2011 at 17:31, Jakub Jelinek<jakub@redhat.com>  wrote:
>> On Fri, Mar 18, 2011 at 12:23:11PM -0400, Kenneth Zadeck wrote:
>>> I believe that this is not the right way to go.
>>>
>>> if someone specifies -fcrossjumping, then the pass should turn on
>>> live for the duration of the pass just as ifcvt does.    If they ask
>>> for crossjumping you should give them crossjumping and not some
>>> crippled version of it.
>> Such (untested) patch is in
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48156#c7
>> But Paolo just added comment there that he prefers this version.
>>
>> Is the live problem so much more useful for crossjumping than lr problem?
>> All it wants to prove is if it can safely move a couple of instructions
>> across some other instructions, and the live_union bitmap that is computed
>> using the live/lr problem is used to find out if registers set by the
>> moved instructions are actually live at the end of those instructions
>> or not.  What would be an example where live problem would allow optimizing
>> more than lr?
> LIVE == LR except when you have uninitialized uses.
>
> Unless it is needed for correctness, I see no reason to prefer LIVE to
> LR at -O1.
>
yes, but i think that the reason this is a pr is that it seems to be 
needed for correctness.
i certainly am not advocating using live at O1.    it is more 
expensive.    however, we do have the ability to turn on some problem 
only in one pass, and that was what i was suggesting.   you make 
crossjumping require LIVE, and you will get LIVE for that pass, not for 
the entire compilation.
> Paolo
Paolo Bonzini March 19, 2011, 9:19 a.m. UTC | #8
On Fri, Mar 18, 2011 at 19:18, Kenneth Zadeck <zadeck@naturalbridge.com> wrote:
> yes, but i think that the reason this is a pr is that it seems to be needed
> for correctness.

Note that df_get_bb_dirty is defaulting to "return false", not
"abort".  This is what tricked crossjumping and caused the bug.

If I can get hold of my old SPEC2000 copy, I'd like to remove LIVE
completely from most -O2 passes (certainly those running after
initialization of uninitialized regs!) in 4.7.

Paolo
Kenneth Zadeck March 19, 2011, 12:08 p.m. UTC | #9
On 03/19/2011 05:19 AM, Paolo Bonzini wrote:
> On Fri, Mar 18, 2011 at 19:18, Kenneth Zadeck<zadeck@naturalbridge.com>  wrote:
>> yes, but i think that the reason this is a pr is that it seems to be needed
>> for correctness.
> Note that df_get_bb_dirty is defaulting to "return false", not
> "abort".  This is what tricked crossjumping and caused the bug.
>
where does the compiler fail if you change it to abort, (after doing the 
check from jakub's patch)?
> If I can get hold of my old SPEC2000 copy, I'd like to remove LIVE
> completely from most -O2 passes (certainly those running after
> initialization of uninitialized regs!) in 4.7.
>
i had thought that there were other reasons for doing live throughout 
-O2 and higher, but i have forgotten.   it has been a long time.
> Paolo
Jakub Jelinek March 19, 2011, 12:15 p.m. UTC | #10
On Sat, Mar 19, 2011 at 08:08:48AM -0400, Kenneth Zadeck wrote:
> 
> On 03/19/2011 05:19 AM, Paolo Bonzini wrote:
> >On Fri, Mar 18, 2011 at 19:18, Kenneth Zadeck<zadeck@naturalbridge.com>  wrote:
> >>yes, but i think that the reason this is a pr is that it seems to be needed
> >>for correctness.
> >Note that df_get_bb_dirty is defaulting to "return false", not
> >"abort".  This is what tricked crossjumping and caused the bug.
> >
> where does the compiler fail if you change it to abort, (after doing
> the check from jakub's patch)?

With neither of the patches of mine for this issue, df_get_bb_dirty
would ICE with added gcc_assert (df && df_live);
The point is that df_get_bb_dirty was silently returning false even for
blocks which had lr not recumputed when live wasn't computed at all.

So live isn't needed for crossjumping correctness, assuming df_get_bb_dirty
is compatible with what df_get_live_out does (the first patch).

	Jakub
Kenneth Zadeck March 19, 2011, 12:48 p.m. UTC | #11
On 03/19/2011 08:15 AM, Jakub Jelinek wrote:
> On Sat, Mar 19, 2011 at 08:08:48AM -0400, Kenneth Zadeck wrote:
>> On 03/19/2011 05:19 AM, Paolo Bonzini wrote:
>>> On Fri, Mar 18, 2011 at 19:18, Kenneth Zadeck<zadeck@naturalbridge.com>   wrote:
>>>> yes, but i think that the reason this is a pr is that it seems to be needed
>>>> for correctness.
>>> Note that df_get_bb_dirty is defaulting to "return false", not
>>> "abort".  This is what tricked crossjumping and caused the bug.
>>>
>> where does the compiler fail if you change it to abort, (after doing
>> the check from jakub's patch)?
> With neither of the patches of mine for this issue, df_get_bb_dirty
> would ICE with added gcc_assert (df&&  df_live);
> The point is that df_get_bb_dirty was silently returning false even for
> blocks which had lr not recumputed when live wasn't computed at all.
>
> So live isn't needed for crossjumping correctness, assuming df_get_bb_dirty
> is compatible with what df_get_live_out does (the first patch).
>
> 	Jakub
i think that there are two separate questions here:

1) should your original patch go in as you did it, or should it go in 
with the last "return false" be an abort?

2) which dataflow problems should be used at what points during 
optimization at any level.   There is, and will always be, a question of 
precision vs cost.   Live is both more expensive and more precise.    
The questions about the second patch with respect to crossjumping is: 
does the extra precision either make any difference? and is required for 
correctness?    My guess is that it most likely is not required for 
correctness and may not even make any difference.  However, I never 
looked closely at what cross jumping did - i get the feeling that Paolo 
has more experience here so I defer to him on this.

I do believe in the principal that if someone specifies something like 
cross jumping explicitly, that they get substantially the same 
optimization at whatever -O level they run it.   So if live does make 
any difference in the quality of the optimization, then we should turn 
on live to run it.

However, if it does not, then some form of your first patch is all that 
is necessary.

kenny
Jakub Jelinek March 19, 2011, 1:01 p.m. UTC | #12
On Sat, Mar 19, 2011 at 08:48:55AM -0400, Kenneth Zadeck wrote:
> i think that there are two separate questions here:
> 
> 1) should your original patch go in as you did it, or should it go
> in with the last "return false" be an abort?

bool
df_get_bb_dirty (basic_block bb)
{
  return bitmap_bit_p ((df_live
			? df_live : df_lr)->out_of_date_transfer_functions,
		       bb->index);
}

would IMHO work equally well (no need to assert IMHO, it isn't much
different from just segfaulting).  From what I can see, df_get_bb_dirty
is called from ifcvt (both passes, but both are guarded with optimize > 0
and happen while df is computed) and crossjumping (run with -fcrossjumping
even at -O0, but after pass_df_initialize_no_opt, so df_lr is computed too
at least, if not even df_live).
I'd need to bootstrap/regtest it of course, but I don't expect any failures.

> 2) which dataflow problems should be used at what points during
...
> However, if it does not, then some form of your first patch is all
> that is necessary.

I believe it doesn't make any difference, but I'll let Paolo comment on it.
I could try to write a patch that would check for any differences
(basically compute for -O2 -fcrossjumping live_union from both DF_LR_OUT (bb)
and df_get_live_out (bb) and if it is different, run can_move_*across twice
and see whether it makes any difference).

	Jakub
Kenneth Zadeck March 19, 2011, 1:42 p.m. UTC | #13
On 03/19/2011 09:01 AM, Jakub Jelinek wrote:
> On Sat, Mar 19, 2011 at 08:48:55AM -0400, Kenneth Zadeck wrote:
>> i think that there are two separate questions here:
>>
>> 1) should your original patch go in as you did it, or should it go
>> in with the last "return false" be an abort?
> bool
> df_get_bb_dirty (basic_block bb)
> {
>    return bitmap_bit_p ((df_live
> 			? df_live : df_lr)->out_of_date_transfer_functions,
> 		       bb->index);
> }
>
> would IMHO work equally well (no need to assert IMHO, it isn't much
> different from just segfaulting).  From what I can see, df_get_bb_dirty
> is called from ifcvt (both passes, but both are guarded with optimize>  0
> and happen while df is computed) and crossjumping (run with -fcrossjumping
> even at -O0, but after pass_df_initialize_no_opt, so df_lr is computed too
> at least, if not even df_live).
> I'd need to bootstrap/regtest it of course, but I don't expect any failures.
>
i think that this is the right way to go.    my view is not just what 
the code currently does, but also from the perspective of the way i want 
this to work as the back ends evolve.
>> 2) which dataflow problems should be used at what points during
> ...
>> However, if it does not, then some form of your first patch is all
>> that is necessary.
> I believe it doesn't make any difference, but I'll let Paolo comment on it.
> I could try to write a patch that would check for any differences
> (basically compute for -O2 -fcrossjumping live_union from both DF_LR_OUT (bb)
> and df_get_live_out (bb) and if it is different, run can_move_*across twice
> and see whether it makes any difference).
>
yes, see if paolo wants to add anything here.
> 	Jakub
diff mbox

Patch

--- gcc/df-core.c.jj	2010-12-14 08:11:39.000000000 +0100
+++ gcc/df-core.c	2011-03-18 14:22:43.000000000 +0100
@@ -1400,10 +1400,16 @@  df_mark_solutions_dirty (void)
 bool
 df_get_bb_dirty (basic_block bb)
 {
-  if (df && df_live)
-    return bitmap_bit_p (df_live->out_of_date_transfer_functions, bb->index);
-  else
-    return false;
+  if (df)
+    {
+      if (df_live)
+	return bitmap_bit_p (df_live->out_of_date_transfer_functions,
+			     bb->index);
+      else if (df_lr)
+	return bitmap_bit_p (df_lr->out_of_date_transfer_functions,
+			     bb->index);
+    }
+  return false;
 }
 
 
--- gcc/testsuite/gcc.dg/pr48156.c.jj	2011-03-18 14:57:34.000000000 +0100
+++ gcc/testsuite/gcc.dg/pr48156.c	2011-03-18 14:57:14.000000000 +0100
@@ -0,0 +1,45 @@ 
+/* PR rtl-optimization/48156 */
+/* { dg-do run } */
+/* { dg-options "-O -fcrossjumping --param min-crossjump-insns=1" } */
+
+extern void abort (void);
+
+static int __attribute__ ((noinline, noclone))
+equals (int s1, int s2)
+{
+  return s1 == s2;
+}
+
+static int __attribute__ ((noinline, noclone))
+bar (void)
+{
+  return 1;
+}
+
+static void __attribute__ ((noinline, noclone))
+baz (int f, int j)
+{
+  if (f != 4 || j != 2)
+    abort ();
+}
+
+void
+foo (int x)
+{
+  int i = 0, j = bar ();
+
+  if (x == 1)
+    i = 2;
+
+  if (j && equals (i, j))
+    baz (8, i);
+  else
+    baz (4, i);
+}
+
+int
+main ()
+{
+  foo (1);
+  return 0;
+}