diff mbox

Fix undefined label problem after crossjumping (PR rtl-optimization/64536)

Message ID 20150109091014.GR1405@tucnak.redhat.com
State New
Headers show

Commit Message

Jakub Jelinek Jan. 9, 2015, 9:10 a.m. UTC
Hi!

The following testcase is miscompiled on s390x.  The problem is that there
is massive cross-jumping going on, and after that post_order_compute
decides to call tidy_fallthru_edges, including on an edge from a bb ending
with a table jump to a bb with now a single successor where all the
jump_table_data entries point to.  rtl_tidy_fallthru_edge happily removes
the tablejump and anything in between the current bb and next bb (i.e.
jump_table_data + its code_label + barrier if any), but doesn't care about
any possible uses of the code_label (on the testcase e.g. the label
reference is hoisted before the loop).
Now, if I try some artificial testcase like:
int a;
#define A(n) case n: a++; break;
#define B(n) A(n##0) A(n##1) A(n##2) A(n##3) A(n##4) A(n##5) A(n##6) A(n##7) A(n##8) A(n##9)
#define C(n) B(n##0) B(n##1) B(n##2) B(n##3) B(n##4) B(n##5) B(n##6) B(n##7) B(n##8) B(n##9)

void
foo (int x)
{
  switch (x)
    {
    C(1)
    }
}
say on x86_64, tidy_fallthru_edges isn't called at all (it would be only if
there are unrelated unreachable blocks in the function at the same time),
so I think spending time on trying to handle tablejump_p right in
rtl_tidy_fallthru_edge is wasteful, my preference (for both 4.9 and trunk)
would be as the patch below just not handle tablejump_p in that function,
and for trunk perhaps try to handle it elsewhere where it will be optimized
even for the above testcase (somewhere in cfgcleanup?).
Also, eventually it would be really nice if tree-ssa-tail-merge.c could
handle this already at the GIMPLE level.

Thoughts on this?

Bootstrapped/regtested on x86_64-linux and i686-linux on the trunk and on
{x86_64,i686,ppc64,ppc64le,s390,s390x,armv7hl}-linux on 4.9 branch.

2015-01-09  Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/64536
	* cfgrtl.c (rtl_tidy_fallthru_edge): Don't remove tablejumps.

	* gcc.dg/pr64536.c: New test.


	Jakub

Comments

Richard Biener Jan. 9, 2015, 9:36 a.m. UTC | #1
On Fri, 9 Jan 2015, Jakub Jelinek wrote:

> Hi!
> 
> The following testcase is miscompiled on s390x.  The problem is that there
> is massive cross-jumping going on, and after that post_order_compute
> decides to call tidy_fallthru_edges, including on an edge from a bb ending
> with a table jump to a bb with now a single successor where all the
> jump_table_data entries point to.  rtl_tidy_fallthru_edge happily removes
> the tablejump and anything in between the current bb and next bb (i.e.
> jump_table_data + its code_label + barrier if any), but doesn't care about
> any possible uses of the code_label (on the testcase e.g. the label
> reference is hoisted before the loop).
> Now, if I try some artificial testcase like:
> int a;
> #define A(n) case n: a++; break;
> #define B(n) A(n##0) A(n##1) A(n##2) A(n##3) A(n##4) A(n##5) A(n##6) A(n##7) A(n##8) A(n##9)
> #define C(n) B(n##0) B(n##1) B(n##2) B(n##3) B(n##4) B(n##5) B(n##6) B(n##7) B(n##8) B(n##9)
> 
> void
> foo (int x)
> {
>   switch (x)
>     {
>     C(1)
>     }
> }
> say on x86_64, tidy_fallthru_edges isn't called at all (it would be only if
> there are unrelated unreachable blocks in the function at the same time),
> so I think spending time on trying to handle tablejump_p right in
> rtl_tidy_fallthru_edge is wasteful, my preference (for both 4.9 and trunk)
> would be as the patch below just not handle tablejump_p in that function,
> and for trunk perhaps try to handle it elsewhere where it will be optimized
> even for the above testcase (somewhere in cfgcleanup?).

I wonder why post_order_compute calls tidy_fallthru_edges at all - won't
that break the just computed postorder?

Other than that, why doesn't can't the issue show up with non-table-jumps?

What does it take to preserve (all) the labels?

Richard.

> Also, eventually it would be really nice if tree-ssa-tail-merge.c could
> handle this already at the GIMPLE level.
> 
> Thoughts on this?
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux on the trunk and on
> {x86_64,i686,ppc64,ppc64le,s390,s390x,armv7hl}-linux on 4.9 branch.
> 
> 2015-01-09  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR rtl-optimization/64536
> 	* cfgrtl.c (rtl_tidy_fallthru_edge): Don't remove tablejumps.
> 
> 	* gcc.dg/pr64536.c: New test.
> 
> --- gcc/cfgrtl.c.jj	2015-01-05 13:07:12.000000000 +0100
> +++ gcc/cfgrtl.c	2015-01-08 17:03:18.511218340 +0100
> @@ -1782,10 +1782,14 @@ rtl_tidy_fallthru_edge (edge e)
>      if (INSN_P (q))
>        return;
>  
> +  q = BB_END (b);
> +  /* Don't remove table jumps here.  */
> +  if (tablejump_p (q, NULL, NULL))
> +    return;
> +
>    /* Remove what will soon cease being the jump insn from the source block.
>       If block B consisted only of this single jump, turn it into a deleted
>       note.  */
> -  q = BB_END (b);
>    if (JUMP_P (q)
>        && onlyjump_p (q)
>        && (any_uncondjump_p (q)
> --- gcc/testsuite/gcc.dg/pr64536.c.jj	2015-01-08 17:13:32.218929003 +0100
> +++ gcc/testsuite/gcc.dg/pr64536.c	2015-01-08 17:28:56.758428958 +0100
> @@ -0,0 +1,67 @@
> +/* PR rtl-optimization/64536 */
> +/* { dg-do link } */
> +/* { dg-options "-O2" } */
> +/* { dg-additional-options "-fPIC" { target fpic } } */
> +
> +struct S { long q; } *h;
> +long a, b, g, j, k, *c, *d, *e, *f, *i;
> +long *baz (void)
> +{
> +  asm volatile ("" : : : "memory");
> +  return e;
> +}
> +
> +void
> +bar (int x)
> +{
> +  int y;
> +  for (y = 0; y < x; y++)
> +    {
> +      switch (b)
> +	{
> +	case 0:
> +	case 2:
> +	  a++;
> +	  break;
> +	case 3:
> +	  a++;
> +	  break;
> +	case 1:
> +	  a++;
> +	}
> +      if (d)
> +	{
> +	  f = baz ();
> +	  g = k++;
> +	  if (&h->q)
> +	    {
> +	      j = *f;
> +	      h->q = *f;
> +	    }
> +	  else
> +	    i = (long *) (h->q = *f);
> +	  *c++ = (long) f;
> +	  e += 6;
> +	}
> +      else
> +	{
> +	  f = baz ();
> +	  g = k++;
> +	  if (&h->q)
> +	    {
> +	      j = *f;
> +	      h->q = *f;
> +	    }
> +	  else
> +	    i = (long *) (h->q = *f);
> +	  *c++ = (long) f;
> +	  e += 6;
> +	}
> +    }
> +}
> +
> +int
> +main ()
> +{
> +  return 0;
> +}
> 
> 	Jakub
> 
>
Richard Biener Jan. 9, 2015, 9:37 a.m. UTC | #2
On Fri, 9 Jan 2015, Richard Biener wrote:

> On Fri, 9 Jan 2015, Jakub Jelinek wrote:
> 
> > Hi!
> > 
> > The following testcase is miscompiled on s390x.  The problem is that there
> > is massive cross-jumping going on, and after that post_order_compute
> > decides to call tidy_fallthru_edges, including on an edge from a bb ending
> > with a table jump to a bb with now a single successor where all the
> > jump_table_data entries point to.  rtl_tidy_fallthru_edge happily removes
> > the tablejump and anything in between the current bb and next bb (i.e.
> > jump_table_data + its code_label + barrier if any), but doesn't care about
> > any possible uses of the code_label (on the testcase e.g. the label
> > reference is hoisted before the loop).
> > Now, if I try some artificial testcase like:
> > int a;
> > #define A(n) case n: a++; break;
> > #define B(n) A(n##0) A(n##1) A(n##2) A(n##3) A(n##4) A(n##5) A(n##6) A(n##7) A(n##8) A(n##9)
> > #define C(n) B(n##0) B(n##1) B(n##2) B(n##3) B(n##4) B(n##5) B(n##6) B(n##7) B(n##8) B(n##9)
> > 
> > void
> > foo (int x)
> > {
> >   switch (x)
> >     {
> >     C(1)
> >     }
> > }
> > say on x86_64, tidy_fallthru_edges isn't called at all (it would be only if
> > there are unrelated unreachable blocks in the function at the same time),
> > so I think spending time on trying to handle tablejump_p right in
> > rtl_tidy_fallthru_edge is wasteful, my preference (for both 4.9 and trunk)
> > would be as the patch below just not handle tablejump_p in that function,
> > and for trunk perhaps try to handle it elsewhere where it will be optimized
> > even for the above testcase (somewhere in cfgcleanup?).
> 
> I wonder why post_order_compute calls tidy_fallthru_edges at all - won't
> that break the just computed postorder?
> 
> Other than that, why doesn't can't the issue show up with non-table-jumps?
> 
> What does it take to preserve (all) the labels?

That is, just remove q, not everything between q and BB_HEAD (c)?

> Richard.
> 
> > Also, eventually it would be really nice if tree-ssa-tail-merge.c could
> > handle this already at the GIMPLE level.
> > 
> > Thoughts on this?
> > 
> > Bootstrapped/regtested on x86_64-linux and i686-linux on the trunk and on
> > {x86_64,i686,ppc64,ppc64le,s390,s390x,armv7hl}-linux on 4.9 branch.
> > 
> > 2015-01-09  Jakub Jelinek  <jakub@redhat.com>
> > 
> > 	PR rtl-optimization/64536
> > 	* cfgrtl.c (rtl_tidy_fallthru_edge): Don't remove tablejumps.
> > 
> > 	* gcc.dg/pr64536.c: New test.
> > 
> > --- gcc/cfgrtl.c.jj	2015-01-05 13:07:12.000000000 +0100
> > +++ gcc/cfgrtl.c	2015-01-08 17:03:18.511218340 +0100
> > @@ -1782,10 +1782,14 @@ rtl_tidy_fallthru_edge (edge e)
> >      if (INSN_P (q))
> >        return;
> >  
> > +  q = BB_END (b);
> > +  /* Don't remove table jumps here.  */
> > +  if (tablejump_p (q, NULL, NULL))
> > +    return;
> > +
> >    /* Remove what will soon cease being the jump insn from the source block.
> >       If block B consisted only of this single jump, turn it into a deleted
> >       note.  */
> > -  q = BB_END (b);
> >    if (JUMP_P (q)
> >        && onlyjump_p (q)
> >        && (any_uncondjump_p (q)
> > --- gcc/testsuite/gcc.dg/pr64536.c.jj	2015-01-08 17:13:32.218929003 +0100
> > +++ gcc/testsuite/gcc.dg/pr64536.c	2015-01-08 17:28:56.758428958 +0100
> > @@ -0,0 +1,67 @@
> > +/* PR rtl-optimization/64536 */
> > +/* { dg-do link } */
> > +/* { dg-options "-O2" } */
> > +/* { dg-additional-options "-fPIC" { target fpic } } */
> > +
> > +struct S { long q; } *h;
> > +long a, b, g, j, k, *c, *d, *e, *f, *i;
> > +long *baz (void)
> > +{
> > +  asm volatile ("" : : : "memory");
> > +  return e;
> > +}
> > +
> > +void
> > +bar (int x)
> > +{
> > +  int y;
> > +  for (y = 0; y < x; y++)
> > +    {
> > +      switch (b)
> > +	{
> > +	case 0:
> > +	case 2:
> > +	  a++;
> > +	  break;
> > +	case 3:
> > +	  a++;
> > +	  break;
> > +	case 1:
> > +	  a++;
> > +	}
> > +      if (d)
> > +	{
> > +	  f = baz ();
> > +	  g = k++;
> > +	  if (&h->q)
> > +	    {
> > +	      j = *f;
> > +	      h->q = *f;
> > +	    }
> > +	  else
> > +	    i = (long *) (h->q = *f);
> > +	  *c++ = (long) f;
> > +	  e += 6;
> > +	}
> > +      else
> > +	{
> > +	  f = baz ();
> > +	  g = k++;
> > +	  if (&h->q)
> > +	    {
> > +	      j = *f;
> > +	      h->q = *f;
> > +	    }
> > +	  else
> > +	    i = (long *) (h->q = *f);
> > +	  *c++ = (long) f;
> > +	  e += 6;
> > +	}
> > +    }
> > +}
> > +
> > +int
> > +main ()
> > +{
> > +  return 0;
> > +}
> > 
> > 	Jakub
> > 
> > 
> 
>
Jakub Jelinek Jan. 9, 2015, 9:54 a.m. UTC | #3
On Fri, Jan 09, 2015 at 10:36:09AM +0100, Richard Biener wrote:
> I wonder why post_order_compute calls tidy_fallthru_edges at all - won't
> that break the just computed postorder?

Dunno, but I think it shouldn't break anything, the function doesn't remove
any blocks, just in the typical case of an unconditional jump to the next bb
or conditional jump to the next bb (if only successor) removes the jump and
makes the edge EDGE_FALLTHRU.

> Other than that, why doesn't can't the issue show up with non-table-jumps?

I think tablejumps are the only case where (at least during jump2)
code_labels live in between the basic blocks, not inside of them.

> What does it take to preserve (all) the labels?

Then we'd need to remove all the instructions in between the two basic
blocks (as we currently do), but move any code_labels from there first to
the start of the next basic block.  Probably better just call tablejump_p
with non-NULL args and move precisely that code_label that it sets.

But, as I said, we'd still not optimize it if tidy_fallthru_edges is not
called, so we'd need to do it at another place too.

	Jakub
Richard Biener Jan. 9, 2015, 10:15 a.m. UTC | #4
On Fri, 9 Jan 2015, Jakub Jelinek wrote:

> On Fri, Jan 09, 2015 at 10:36:09AM +0100, Richard Biener wrote:
> > I wonder why post_order_compute calls tidy_fallthru_edges at all - won't
> > that break the just computed postorder?
> 
> Dunno, but I think it shouldn't break anything, the function doesn't remove
> any blocks, just in the typical case of an unconditional jump to the next bb
> or conditional jump to the next bb (if only successor) removes the jump and
> makes the edge EDGE_FALLTHRU.
> 
> > Other than that, why doesn't can't the issue show up with non-table-jumps?
> 
> I think tablejumps are the only case where (at least during jump2)
> code_labels live in between the basic blocks, not inside of them.
> 
> > What does it take to preserve (all) the labels?
> 
> Then we'd need to remove all the instructions in between the two basic
> blocks (as we currently do), but move any code_labels from there first to
> the start of the next basic block.  Probably better just call tablejump_p
> with non-NULL args and move precisely that code_label that it sets.
> 
> But, as I said, we'd still not optimize it if tidy_fallthru_edges is not
> called, so we'd need to do it at another place too.

Ok, I see.  I still wonder why we call tidy_fallthru_edges from
postorder_compute.  If we delete unreachable blocks that means
we at most remove incoming edges to a block - that should never
change any other edges fallthru status...?

Does just removing that call there work and make the bug latent again?

Richard.
Richard Biener Jan. 9, 2015, 10:59 a.m. UTC | #5
On Fri, 9 Jan 2015, Jakub Jelinek wrote:

> On Fri, Jan 09, 2015 at 11:15:14AM +0100, Richard Biener wrote:
> > On Fri, 9 Jan 2015, Jakub Jelinek wrote:
> > 
> > > On Fri, Jan 09, 2015 at 10:36:09AM +0100, Richard Biener wrote:
> > > > I wonder why post_order_compute calls tidy_fallthru_edges at all - won't
> > > > that break the just computed postorder?
> > > 
> > > Dunno, but I think it shouldn't break anything, the function doesn't remove
> > > any blocks, just in the typical case of an unconditional jump to the next bb
> > > or conditional jump to the next bb (if only successor) removes the jump and
> > > makes the edge EDGE_FALLTHRU.
> > > 
> > > > Other than that, why doesn't can't the issue show up with non-table-jumps?
> > > 
> > > I think tablejumps are the only case where (at least during jump2)
> > > code_labels live in between the basic blocks, not inside of them.
> > > 
> > > > What does it take to preserve (all) the labels?
> > > 
> > > Then we'd need to remove all the instructions in between the two basic
> > > blocks (as we currently do), but move any code_labels from there first to
> > > the start of the next basic block.  Probably better just call tablejump_p
> > > with non-NULL args and move precisely that code_label that it sets.
> > > 
> > > But, as I said, we'd still not optimize it if tidy_fallthru_edges is not
> > > called, so we'd need to do it at another place too.
> > 
> > Ok, I see.  I still wonder why we call tidy_fallthru_edges from
> > postorder_compute.  If we delete unreachable blocks that means
> > we at most remove incoming edges to a block - that should never
> > change any other edges fallthru status...?
> 
> The call has been added by
> http://gcc.gnu.org/ml/gcc-patches/2006-08/msg00095.html
> and is only done if post_order_compute is called with the special flag,
> supposedly that replaced explicit delete_unreachable_blocks or similar.
> And, if you remove unreachable blocks, if they are in between some bb
> and its single successor, then indeed that is something that should be
> tidied, as we don't have to jump around nothing.

Ah, indeed.

> If you want, I can try instead of disabling it for tablejumps
> just move the label.

Yeah, I'd prefer that - it can't be too difficult, no?

> Still, I think we should be able to optimize it somewhere else too
> (we can remove the tablejumps not just if all jump_table_data entries
> point to next_bb, but even when they point to some completely different bb,
> as long as it is a single_succ_p).  And ideally also optimize it at GIMPLE,
> but guess that is GCC 6 material.

cfgcleanup material, similar for GIMPLE I guess.

Richard.
Jakub Jelinek Jan. 9, 2015, 11 a.m. UTC | #6
On Fri, Jan 09, 2015 at 11:15:14AM +0100, Richard Biener wrote:
> On Fri, 9 Jan 2015, Jakub Jelinek wrote:
> 
> > On Fri, Jan 09, 2015 at 10:36:09AM +0100, Richard Biener wrote:
> > > I wonder why post_order_compute calls tidy_fallthru_edges at all - won't
> > > that break the just computed postorder?
> > 
> > Dunno, but I think it shouldn't break anything, the function doesn't remove
> > any blocks, just in the typical case of an unconditional jump to the next bb
> > or conditional jump to the next bb (if only successor) removes the jump and
> > makes the edge EDGE_FALLTHRU.
> > 
> > > Other than that, why doesn't can't the issue show up with non-table-jumps?
> > 
> > I think tablejumps are the only case where (at least during jump2)
> > code_labels live in between the basic blocks, not inside of them.
> > 
> > > What does it take to preserve (all) the labels?
> > 
> > Then we'd need to remove all the instructions in between the two basic
> > blocks (as we currently do), but move any code_labels from there first to
> > the start of the next basic block.  Probably better just call tablejump_p
> > with non-NULL args and move precisely that code_label that it sets.
> > 
> > But, as I said, we'd still not optimize it if tidy_fallthru_edges is not
> > called, so we'd need to do it at another place too.
> 
> Ok, I see.  I still wonder why we call tidy_fallthru_edges from
> postorder_compute.  If we delete unreachable blocks that means
> we at most remove incoming edges to a block - that should never
> change any other edges fallthru status...?

The call has been added by
http://gcc.gnu.org/ml/gcc-patches/2006-08/msg00095.html
and is only done if post_order_compute is called with the special flag,
supposedly that replaced explicit delete_unreachable_blocks or similar.
And, if you remove unreachable blocks, if they are in between some bb
and its single successor, then indeed that is something that should be
tidied, as we don't have to jump around nothing.

If you want, I can try instead of disabling it for tablejumps
just move the label.

Still, I think we should be able to optimize it somewhere else too
(we can remove the tablejumps not just if all jump_table_data entries
point to next_bb, but even when they point to some completely different bb,
as long as it is a single_succ_p).  And ideally also optimize it at GIMPLE,
but guess that is GCC 6 material.

	Jakub
diff mbox

Patch

--- gcc/cfgrtl.c.jj	2015-01-05 13:07:12.000000000 +0100
+++ gcc/cfgrtl.c	2015-01-08 17:03:18.511218340 +0100
@@ -1782,10 +1782,14 @@  rtl_tidy_fallthru_edge (edge e)
     if (INSN_P (q))
       return;
 
+  q = BB_END (b);
+  /* Don't remove table jumps here.  */
+  if (tablejump_p (q, NULL, NULL))
+    return;
+
   /* Remove what will soon cease being the jump insn from the source block.
      If block B consisted only of this single jump, turn it into a deleted
      note.  */
-  q = BB_END (b);
   if (JUMP_P (q)
       && onlyjump_p (q)
       && (any_uncondjump_p (q)
--- gcc/testsuite/gcc.dg/pr64536.c.jj	2015-01-08 17:13:32.218929003 +0100
+++ gcc/testsuite/gcc.dg/pr64536.c	2015-01-08 17:28:56.758428958 +0100
@@ -0,0 +1,67 @@ 
+/* PR rtl-optimization/64536 */
+/* { dg-do link } */
+/* { dg-options "-O2" } */
+/* { dg-additional-options "-fPIC" { target fpic } } */
+
+struct S { long q; } *h;
+long a, b, g, j, k, *c, *d, *e, *f, *i;
+long *baz (void)
+{
+  asm volatile ("" : : : "memory");
+  return e;
+}
+
+void
+bar (int x)
+{
+  int y;
+  for (y = 0; y < x; y++)
+    {
+      switch (b)
+	{
+	case 0:
+	case 2:
+	  a++;
+	  break;
+	case 3:
+	  a++;
+	  break;
+	case 1:
+	  a++;
+	}
+      if (d)
+	{
+	  f = baz ();
+	  g = k++;
+	  if (&h->q)
+	    {
+	      j = *f;
+	      h->q = *f;
+	    }
+	  else
+	    i = (long *) (h->q = *f);
+	  *c++ = (long) f;
+	  e += 6;
+	}
+      else
+	{
+	  f = baz ();
+	  g = k++;
+	  if (&h->q)
+	    {
+	      j = *f;
+	      h->q = *f;
+	    }
+	  else
+	    i = (long *) (h->q = *f);
+	  *c++ = (long) f;
+	  e += 6;
+	}
+    }
+}
+
+int
+main ()
+{
+  return 0;
+}