Patchwork Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)

login
register
mail settings
Submitter Steven Bosscher
Date Nov. 25, 2012, 10:37 p.m.
Message ID <CABu31nPq7f9L-jj8LOerZ7jxe1WK+wGJWPPcdh6g+CdeWxB2jw@mail.gmail.com>
Download mbox | patch
Permalink /patch/201580/
State New
Headers show

Comments

Steven Bosscher - Nov. 25, 2012, 10:37 p.m.
On Sun, Nov 25, 2012 at 9:44 PM, Steven Bosscher wrote:
> On Sat, Nov 24, 2012 at 2:09 AM, Steven Bosscher wrote:
>> On Fri, Nov 23, 2012 at 11:45 PM, Steven Bosscher wrote:
>>> Removing the note is easier but it may hurt optimization later on:
>>> CSE2 puts the note back in but this introduces a pass ordering
>>> dependency. Perhaps that's not a big problem, but I'm not sure...
>>
>> Hmm, actually CSE2 fails to put the note back, I guess because it's a
>> local CSE pass only...
>
> Here's another possible fix: Just punt on all REG_EQUAL notes for the
> IV. It's a bit of a big hammer, but there are no code changes in my
> set of cc1-i files (compiled with -funroll-loops) on x86_64 and
> powerpc64 so apparently it's not all that important to keep the notes.
>
> Comments on this one?

Or rather this one. Same hammer, different color. It turns out that
the rtlanal.c change caused problems, so I've got to use a home-brewn
equivalent of remove_reg_equal_equiv_notes_for_regno...

Test case is unchanged so I'm omitting it here.

Ciao!
Steven


gcc/
	* loop-unroll.c (struct iv_to_split): Add new 'orig_var' member.
	(analyze_iv_to_split_insn): Record it.
	(maybe_strip_eq_note_for_split_iv): New function to remove REG_EQUAL
	notes that refer to IVs that are being split.
	(apply_opt_in_copies): Use maybe_strip_eq_note_for_split_iv.  Twice.
	Use FOR_BB_INSNS_SAFE.

testsuite/
	* gcc.dg/pr55006.c: New test.
Dominique Dhumieres - Nov. 26, 2012, 2:38 p.m.
> Or rather this one. Same hammer, different color. It turns out that
> the rtlanal.c change caused problems, so I've got to use a home-brewn
> equivalent of remove_reg_equal_equiv_notes_for_regno...

This does not fix pr55006 on x86_64-apple-darwin10;-(while the patch in
http://gcc.gnu.org/ml/gcc-patches/2012-11/msg01607.html
does).

> Test case is unchanged so I'm omitting it here.

This test passes on x86_64-apple-darwin10 for a clean trunk.

Thanks for looking at the problem.

Dominique
Steven Bosscher - Nov. 26, 2012, 3:45 p.m.
On Mon, Nov 26, 2012 at 3:38 PM, Dominique Dhumieres <dominiq@lps.ens.fr> wrote:
>> Or rather this one. Same hammer, different color. It turns out that
>> the rtlanal.c change caused problems, so I've got to use a home-brewn
>> equivalent of remove_reg_equal_equiv_notes_for_regno...
>
> This does not fix pr55006 on x86_64-apple-darwin10;-(while the patch in
> http://gcc.gnu.org/ml/gcc-patches/2012-11/msg01607.html
> does).

Does it fail on x86_64-linux also? If not as-is, can you try with -fPIC?


>> Test case is unchanged so I'm omitting it here.
>
> This test passes on x86_64-apple-darwin10 for a clean trunk.

Yes, I know. As you can probably tell from the patch, I've been
working on an older revision because I can't reproduce the problem on
the trunk. It's a very elusive problem.

Ciao!
Steven
Eric Botcazou - Nov. 27, 2012, 9:55 a.m.
> Or rather this one. Same hammer, different color. It turns out that
> the rtlanal.c change caused problems, so I've got to use a home-brewn
> equivalent of remove_reg_equal_equiv_notes_for_regno...
> 
> Test case is unchanged so I'm omitting it here.
> 
> Ciao!
> Steven
> 
> 
> gcc/
> 	* loop-unroll.c (struct iv_to_split): Add new 'orig_var' member.
> 	(analyze_iv_to_split_insn): Record it.
> 	(maybe_strip_eq_note_for_split_iv): New function to remove REG_EQUAL
> 	notes that refer to IVs that are being split.
> 	(apply_opt_in_copies): Use maybe_strip_eq_note_for_split_iv.  Twice.
> 	Use FOR_BB_INSNS_SAFE.

That's fine with me, thanks.  You might want to defer applying it until the 
reason why it isn't apparently sufficient for aermod.f90 is understood though.
Steven Bosscher - Nov. 27, 2012, 10:34 a.m.
On Tue, Nov 27, 2012 at 10:55 AM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> Or rather this one. Same hammer, different color. It turns out that
>> the rtlanal.c change caused problems, so I've got to use a home-brewn
>> equivalent of remove_reg_equal_equiv_notes_for_regno...
>>
>> Test case is unchanged so I'm omitting it here.
>>
>> Ciao!
>> Steven
>>
>>
>> gcc/
>>       * loop-unroll.c (struct iv_to_split): Add new 'orig_var' member.
>>       (analyze_iv_to_split_insn): Record it.
>>       (maybe_strip_eq_note_for_split_iv): New function to remove REG_EQUAL
>>       notes that refer to IVs that are being split.
>>       (apply_opt_in_copies): Use maybe_strip_eq_note_for_split_iv.  Twice.
>>       Use FOR_BB_INSNS_SAFE.
>
> That's fine with me, thanks.  You might want to defer applying it until the
> reason why it isn't apparently sufficient for aermod.f90 is understood though.

Thanks. Yes, I'm first going to try and figure out why this doesn't
fix aermod for Dominique. I suspect the problem is in variable
expansion in the unroller. A previous patch of mine updates REG_EQUAL
notes in variable expansion, but it doesn't clean up notes that refer
to maybe dead registers.

I have to say, I've got a uncomfortable feeling about REG_EQUAL notes
not being allowed to refer to dead registers. In the case at hand, the
code basically does:

S1: r1 = r2 + 0x4  // r2 is the reg from split_iv, r1 was the original IV
S2: r3 = mem[r1]
S3: if ... goto L1;
S4: r4 = r3 // REG_EQUAL mem[r1]
L1:
S5: r1 = r2 + 0x8

At S4, r1 is not live in the usual meaning of register liveness, but
the DEF from S1 still reaches the REG_EQUAL note. This situation is
not only created by loop unrolling. At least gcse.c (PRE),
loop-invariant.c, cse.c, dce.c and combine.c can create situations
like the above. ifcvt.c jumps through hoops to avoid it (see e.g.
PR46315, which you fixed).

Most of the problems are papered over by occasional calls to
df_note_add_problem from some passes, so that df_remove_dead_eq_notes
cleans up any notes referencing dead registers. But if we really want
to declare this kind of REG_EQUAL note reference to a dead register
invalid RTL (which is something at least you, Paolo, and me agree on),
and we expect passes to clean up their own mess by either updating or
removing any notes they invalidate, then df_remove_dead_eq_notes
shouldn't be necessary for correctness because all notes it encounters
should be valid (being updated by passes).

Removing df_remove_dead_eq_notes breaks bootstrap on the four targets
I tried it on (x86_64, powerpc64, ia64, and hppa). That is, breaks
*without* -funroll-loops, and *without* -fweb. With a hack to disable
pass_initialize_regs, bootstrap still fails, but other files are
miscompiled. With another hack on top to disable CSE2, bootstrap still
fails, and yet other files are miscompiled.

What I'm really trying to say, is that even when I fix this particular
issue with aermod (which is apparently 3 issues: the one I fixed last
month for x86_64, the one fixed with the patch in this thread for
SPARC, and the yet-to-be-identified problem Dominique is still seeing
on darwin10) then it's likely other, similar bugs will show up. Bugs
that are hard to reproduce, dependent on the target's RTL, and
difficult to debug as they are usually wrong-code bugs on larger test
cases.

We really need a more robust solution. I've added Jeff and rth to the
CC, looking for opinions/thoughts/suggestions/$0.02s.

Ciao!
Steven
Steven Bosscher - Nov. 27, 2012, noon
On Tue, Nov 27, 2012 at 11:34 AM, Steven Bosscher wrote:
>>> gcc/
>>>       * loop-unroll.c (struct iv_to_split): Add new 'orig_var' member.
>>>       (analyze_iv_to_split_insn): Record it.
>>>       (maybe_strip_eq_note_for_split_iv): New function to remove REG_EQUAL
>>>       notes that refer to IVs that are being split.
>>>       (apply_opt_in_copies): Use maybe_strip_eq_note_for_split_iv.  Twice.
>>>       Use FOR_BB_INSNS_SAFE.
>>
>> That's fine with me, thanks.  You might want to defer applying it until the
>> reason why it isn't apparently sufficient for aermod.f90 is understood though.
>
> Thanks. Yes, I'm first going to try and figure out why this doesn't
> fix aermod for Dominique. I suspect the problem is in variable
> expansion in the unroller. A previous patch of mine updates REG_EQUAL
> notes in variable expansion, but it doesn't clean up notes that refer
> to maybe dead registers.

Well, that's not it. But what I said below:


> I have to say, I've got a uncomfortable feeling about REG_EQUAL notes
> not being allowed to refer to dead registers.

applies even more after analyzing Dominique's bug.

At the start of RTL PRE we have:

 2000 {r719:DI=r719:DI+0x4;clobber flags:CC;}
      REG_UNUSED: flags:CC

where r719:DI+0x4 is found to be partially redundant. So PRE optimizes this to:

 2889 r719:DI=r1562:DI
      REG_EQUAL: r719:DI+0x4
...
 2913 r1562:DI=r719:DI+0x4

This self-reference in the REG_EQUAL note is the problem. The SET
kills r719, but there is a USE in the PRE'ed expression that keeps
r719 alive. But this use is copy-propagated away in CPROP2:

 2913 r1562:DI=r1562:DI+0x4

Now that USE of r719 is a use of a dead register, rendering the
REG_EQUAL invalid. From there on the problem is the same as the ones I
had to "fix" in loop-unroll.c. First the webizer puts the USE in a
different web and renames the USE to r1591:

 2889 r719:DI=r1562:DI
      REG_EQUAL: r1591:DI+0x4

CSE2 uses this and the equivalence of r719 and r1562 to "optimize" the
PRE expression:

 2913 r1562:DI=r1591:DI+0x8

Then init-regs notices that this reg is quasi-used uninitialized:

 3122 r1591:DI=0
 2913 r1562:DI=r1591:DI+0x8
      REG_DEAD: r1591:DI

And combine finally produces:

 2913 r1562:DI=0x8


I'm not sure what to name as the "root cause" for all of this. It all
starts with PRE creating a REG_EQUAL note that references the register
that's SET in the insn the note is attached to, but the register is
live after the insn so from that point of view the note is not
invalid. CPROP2 kills r179 by propagating it out and making the note
invalid. I think CPROP2 could do that to any register, and if passes
should make sure REG_EQUAL notes are updated or removed then this is a
bug in RTL CPROP.

Very, very messy...

Ciao!
Steven
Paolo Bonzini - Nov. 27, 2012, 2:32 p.m.
Il 27/11/2012 13:00, Steven Bosscher ha scritto:
> It all
> starts with PRE creating a REG_EQUAL note that references the register
> that's SET in the insn the note is attached to, but the register is
> live after the insn so from that point of view the note is not
> invalid.

This note seems very very weird.  For one thing, it becomes invalid on
the very instruction where it is created.  I would say that it should
not be there.

Paolo
Eric Botcazou - Nov. 27, 2012, 4:57 p.m.
> This note seems very very weird.  For one thing, it becomes invalid on
> the very instruction where it is created.  I would say that it should
> not be there.

Agreed.

Patch

Index: loop-unroll.c
===================================================================
--- loop-unroll.c	(revision 193394)
+++ loop-unroll.c	(working copy)
@@ -74,6 +74,7 @@  along with GCC; see the file COPYING3.
 struct iv_to_split
 {
   rtx insn;		/* The insn in that the induction variable occurs.  */
+  rtx orig_var;		/* The variable (register) for the IV before split.  */
   rtx base_var;		/* The variable on that the values in the further
 			   iterations are based.  */
   rtx step;		/* Step of the induction variable.  */
@@ -1833,6 +1834,7 @@  analyze_iv_to_split_insn (rtx insn)
   /* Record the insn to split.  */
   ivts = XNEW (struct iv_to_split);
   ivts->insn = insn;
+  ivts->orig_var = dest;
   ivts->base_var = NULL_RTX;
   ivts->step = iv.step;
   ivts->next = NULL;
@@ -2253,6 +2255,32 @@  combine_var_copies_in_loop_exit (struct
   emit_insn_after (seq, insn);
 }

+/* Strip away REG_EQUAL notes for IVs we're splitting.
+
+   Updating REG_EQUAL notes for IVs we split is tricky: We
+   cannot tell until after unrolling, DF-rescanning, and liveness
+   updating, whether an EQ_USE is reached by the split IV while
+   the IV reg is still live.  See PR55006.
+
+   ??? We cannot use remove_reg_equal_equiv_notes_for_regno,
+   because RTL loop-iv requires us to defer rescanning insns and
+   any notes attached to them.  So resort to old techniques...  */
+
+static void
+maybe_strip_eq_note_for_split_iv (struct opt_info *opt_info, rtx insn)
+{
+  struct iv_to_split *ivts;
+  rtx note = find_reg_equal_equiv_note (insn);
+  if (! note)
+    return;
+  for (ivts = opt_info->iv_to_split_head; ivts; ivts = ivts->next)
+    if (reg_mentioned_p (ivts->orig_var, note))
+      {
+	remove_note (insn, note);
+	return;
+      }
+}
+
 /* Apply loop optimizations in loop copies using the
    data which gathered during the unrolling.  Structure
    OPT_INFO record that data.
@@ -2293,9 +2321,8 @@  apply_opt_in_copies (struct opt_info *op
 					unrolling);
       bb->aux = 0;
       orig_insn = BB_HEAD (orig_bb);
-      for (insn = BB_HEAD (bb); insn != NEXT_INSN (BB_END (bb)); insn = next)
+      FOR_BB_INSNS_SAFE (bb, insn, next)
         {
-          next = NEXT_INSN (insn);
 	  if (!INSN_P (insn)
 	      || (DEBUG_INSN_P (insn)
 		  && TREE_CODE (INSN_VAR_LOCATION_DECL (insn)) == LABEL_DECL))
@@ -2313,6 +2340,8 @@  apply_opt_in_copies (struct opt_info *op
           /* Apply splitting iv optimization.  */
           if (opt_info->insns_to_split)
             {
+	      maybe_strip_eq_note_for_split_iv (opt_info, insn);
+
               ivts = (struct iv_to_split *)
 		htab_find (opt_info->insns_to_split, &ivts_templ);

@@ -2378,6 +2407,8 @@  apply_opt_in_copies (struct opt_info *op
           ivts_templ.insn = orig_insn;
           if (opt_info->insns_to_split)
             {
+	      maybe_strip_eq_note_for_split_iv (opt_info, orig_insn);
+
               ivts = (struct iv_to_split *)
 		htab_find (opt_info->insns_to_split, &ivts_templ);
               if (ivts)