diff mbox series

Fix RTL thread_jump for doloop branches (PR rtl-optimization/89634)

Message ID 20190309075833.GS7611@tucnak
State New
Headers show
Series Fix RTL thread_jump for doloop branches (PR rtl-optimization/89634) | expand

Commit Message

Jakub Jelinek March 9, 2019, 7:58 a.m. UTC
Hi!

The following testcase used to be miscompiled by RTL jump threading
on s390x-linux with -march=zEC12 at -O2 before r269302 which made it latent.
The problem is we had:
(jump_insn 26 25 87 4 (parallel [
            (set (pc)
                (if_then_else (eq (reg/v:DI 5 %r5 [orig:74 e ] [74])
                        (const_int 1 [0x1]))
                    (label_ref 43)
                    (pc)))
            (clobber (reg:CC 33 %cc))
        ]) "rh1686696.c":16:7 1263 {*cmp_and_br_signed_di}
     (int_list:REG_BR_PROB 118111604 (nil))
 -> 43)
as the sole non-note non-code_label insn in basic block 4, and
(jump_insn 68 67 134 12 (parallel [
            (set (pc)
                (if_then_else (ne (reg/v:DI 5 %r5 [orig:74 e ] [74])
                        (const_int 1 [0x1]))
                    (label_ref:DI 23)
                    (pc)))
            (set (reg/v:DI 5 %r5 [orig:74 e ] [74])
                (plus:DI (reg/v:DI 5 %r5 [orig:74 e ] [74])
                    (const_int -1 [0xffffffffffffffff])))
            (clobber (scratch:DI))
            (clobber (reg:CC 33 %cc))
        ]) "rh1686696.c":13:3 1922 {doloop_di}
     (int_list:REG_BR_PROB 904381916 (nil))
 -> 23)
as the sole non-note non-code_label insn in basic block 12, the doloop
conditional jump branching to the bb with the above jump_insn 26.
thread_jump sees one condition being %r5 == 1 and the other %r5 != 1,
performs the cselib assisted checks if the b sequence is really a noop
(but that starts with the original complete bb as a base and only does
nonequal processing in the e->src bb) and decided to change insn 68
to jump after the jump_insn 26 because the comparison of %r5 against 1
has been done already.

It didn't take into account that %r5 has been modified in the doloop_di
insn though, so the conditions are comparing different values.

I have thought about different approaches, below is the simplest one,
just punt if modified_in_p.  Another possibility would be to handle the
BB_END (e->src) insn in the second rather than first loop and do the
nonequal handling in there, but I couldn't figure out what kind of insns
it would handle better than this more simple approach.  It wouldn't handle
hypothetical condjump instruction which compares one register and modifies
another one, as it would add that register into nonequal and unless that
register has been unconditionally set to something else, would keep it
in nonequal and bail out.

Bootstrapped/regtested on {x86_64,i686}-linux (where it didn't check much
because those targets don't have doloops) and on r269253 trunk (i.e. before
the r269302 change) on {powerpc64le,s390x}-linux (which both have doloop
insns).  Ok for trunk?

2019-03-09  Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/89634
	* cfgcleanup.c (thread_jump): Punt if registers mentioned in cond1
	are modified in BB_END (e->src) instruction.

	* gcc.c-torture/execute/pr89634.c: New test.


	Jakub

Comments

Richard Sandiford March 9, 2019, 8:23 a.m. UTC | #1
Jakub Jelinek <jakub@redhat.com> writes:
> Hi!
>
> The following testcase used to be miscompiled by RTL jump threading
> on s390x-linux with -march=zEC12 at -O2 before r269302 which made it latent.
> The problem is we had:
> (jump_insn 26 25 87 4 (parallel [
>             (set (pc)
>                 (if_then_else (eq (reg/v:DI 5 %r5 [orig:74 e ] [74])
>                         (const_int 1 [0x1]))
>                     (label_ref 43)
>                     (pc)))
>             (clobber (reg:CC 33 %cc))
>         ]) "rh1686696.c":16:7 1263 {*cmp_and_br_signed_di}
>      (int_list:REG_BR_PROB 118111604 (nil))
>  -> 43)
> as the sole non-note non-code_label insn in basic block 4, and
> (jump_insn 68 67 134 12 (parallel [
>             (set (pc)
>                 (if_then_else (ne (reg/v:DI 5 %r5 [orig:74 e ] [74])
>                         (const_int 1 [0x1]))
>                     (label_ref:DI 23)
>                     (pc)))
>             (set (reg/v:DI 5 %r5 [orig:74 e ] [74])
>                 (plus:DI (reg/v:DI 5 %r5 [orig:74 e ] [74])
>                     (const_int -1 [0xffffffffffffffff])))
>             (clobber (scratch:DI))
>             (clobber (reg:CC 33 %cc))
>         ]) "rh1686696.c":13:3 1922 {doloop_di}
>      (int_list:REG_BR_PROB 904381916 (nil))
>  -> 23)
> as the sole non-note non-code_label insn in basic block 12, the doloop
> conditional jump branching to the bb with the above jump_insn 26.
> thread_jump sees one condition being %r5 == 1 and the other %r5 != 1,
> performs the cselib assisted checks if the b sequence is really a noop
> (but that starts with the original complete bb as a base and only does
> nonequal processing in the e->src bb) and decided to change insn 68
> to jump after the jump_insn 26 because the comparison of %r5 against 1
> has been done already.
>
> It didn't take into account that %r5 has been modified in the doloop_di
> insn though, so the conditions are comparing different values.
>
> I have thought about different approaches, below is the simplest one,
> just punt if modified_in_p.  Another possibility would be to handle the
> BB_END (e->src) insn in the second rather than first loop and do the
> nonequal handling in there, but I couldn't figure out what kind of insns
> it would handle better than this more simple approach.  It wouldn't handle
> hypothetical condjump instruction which compares one register and modifies
> another one, as it would add that register into nonequal and unless that
> register has been unconditionally set to something else, would keep it
> in nonequal and bail out.

Yeah.  Looks like we'd have to do something overly complicated to
separate the sets for the final EXECUTE_IF_SET_IN_REG_SET, and at
that point it's equivalent to the simple way.

> Bootstrapped/regtested on {x86_64,i686}-linux (where it didn't check much
> because those targets don't have doloops) and on r269253 trunk (i.e. before
> the r269302 change) on {powerpc64le,s390x}-linux (which both have doloop
> insns).  Ok for trunk?
>
> 2019-03-09  Jakub Jelinek  <jakub@redhat.com>
>
> 	PR rtl-optimization/89634
> 	* cfgcleanup.c (thread_jump): Punt if registers mentioned in cond1
> 	are modified in BB_END (e->src) instruction.
>
> 	* gcc.c-torture/execute/pr89634.c: New test.

OK, thanks.

Richard
diff mbox series

Patch

--- gcc/cfgcleanup.c.jj	2019-01-22 10:11:59.652429658 +0100
+++ gcc/cfgcleanup.c	2019-03-08 17:47:11.997954352 +0100
@@ -308,6 +308,11 @@  thread_jump (edge e, basic_block b)
       || !rtx_equal_p (XEXP (cond1, 1), XEXP (cond2, 1)))
     return NULL;
 
+  /* Punt if BB_END (e->src) is doloop-like conditional jump that modifies
+     the registers used in cond1.  */
+  if (modified_in_p (cond1, BB_END (e->src)))
+    return NULL;
+
   /* Short circuit cases where block B contains some side effects, as we can't
      safely bypass it.  */
   for (insn = NEXT_INSN (BB_HEAD (b)); insn != NEXT_INSN (BB_END (b));
--- gcc/testsuite/gcc.c-torture/execute/pr89634.c.jj	2019-03-08 17:48:08.161045920 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr89634.c	2019-03-08 17:47:52.151304876 +0100
@@ -0,0 +1,40 @@ 
+/* PR rtl-optimization/89634 */
+
+static unsigned long *
+foo (unsigned long *x)
+{
+  return x + (1 + *x);
+}
+
+__attribute__((noipa)) unsigned long
+bar (unsigned long *x)
+{
+  unsigned long c, d = 1, e, *f, g, h = 0, i;
+  for (e = *x - 1; e > 0; e--)
+    {
+      f = foo (x + 1);
+      for (i = 1; i < e; i++)
+	f = foo (f);
+      c = *f;
+      if (c == 2)
+	d *= 2;
+      else
+	{
+	  i = (c - 1) / 2 - 1;
+	  g = (2 * i + 1) * (d + 1) + (2 * d + 1);
+	  if (g > h)
+	    h = g;
+	  d *= c;
+	}
+    }
+  return h;
+}
+
+int
+main ()
+{
+  unsigned long a[18] = { 4, 2, -200, 200, 2, -400, 400, 3, -600, 0, 600, 5, -100, -66, 0, 66, 100, __LONG_MAX__ / 8 + 1 };
+  if (bar (a) != 17)
+    __builtin_abort ();
+  return 0;
+}