diff mbox series

shrink-wrapping: Fix up prologue block discovery [PR103860]

Message ID 20211230094332.GZ2646553@tucnak
State New
Headers show
Series shrink-wrapping: Fix up prologue block discovery [PR103860] | expand

Commit Message

Jakub Jelinek Dec. 30, 2021, 9:43 a.m. UTC
Hi!

The following testcase is miscompiled, because a prologue which
contains subq $8, %rsp instruction is emitted at the start of
a basic block which contains conditional jump that depends on
flags register set in an earlier basic block, the prologue instruction
then clobbers those flags.
Normally this case is checked by can_get_prologue predicate, but this
is done only at the start of the loop.  If we update pro later in the
loop (because some bb shouldn't be duplicated) and then don't push
anything further into vec and the vec is already empty (this can happen
when the new pro is already in bb_with bitmask and either has no successors
(that is the case in the testcase where that bb ends with a trap) or
all the successors are already in bb_with, then the loop doesn't iterate
further and can_get_prologue will not be checked.

The following simple patch makes sure we call can_get_prologue even after
the last former iteration when vec is already empty and only break from
the loop afterwards (and only if the updating of pro done because of
!can_get_prologue didn't push anything into vec again).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-12-30  Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/103860
	* shrink-wrap.c (try_shrink_wrapping): Make sure can_get_prologue is
	called on pro even if nothing further is pushed into vec.

	* gcc.dg/pr103860.c: New test.


	Jakub

Comments

Segher Boessenkool Dec. 30, 2021, 10:08 a.m. UTC | #1
Hi!

On Thu, Dec 30, 2021 at 10:43:32AM +0100, Jakub Jelinek wrote:
> The following testcase is miscompiled, because a prologue which
> contains subq $8, %rsp instruction is emitted at the start of
> a basic block which contains conditional jump that depends on
> flags register set in an earlier basic block, the prologue instruction
> then clobbers those flags.
> Normally this case is checked by can_get_prologue predicate, but this
> is done only at the start of the loop.  If we update pro later in the
> loop (because some bb shouldn't be duplicated) and then don't push
> anything further into vec and the vec is already empty (this can happen
> when the new pro is already in bb_with bitmask and either has no successors
> (that is the case in the testcase where that bb ends with a trap) or
> all the successors are already in bb_with, then the loop doesn't iterate
> further and can_get_prologue will not be checked.
> 
> The following simple patch makes sure we call can_get_prologue even after
> the last former iteration when vec is already empty and only break from
> the loop afterwards (and only if the updating of pro done because of
> !can_get_prologue didn't push anything into vec again).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

That looks good, and very simple, thanks!

git blame says I wrote 69.9% of shrink-wrap.c, but I am not maintainer
of it, so I cannot approve your patch -- but it is fine afaics.


Segher


> 2021-12-30  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR rtl-optimization/103860
> 	* shrink-wrap.c (try_shrink_wrapping): Make sure can_get_prologue is
> 	called on pro even if nothing further is pushed into vec.
> 
> 	* gcc.dg/pr103860.c: New test.
Jeff Law Dec. 30, 2021, 12:52 p.m. UTC | #2
On 12/30/2021 3:08 AM, Segher Boessenkool wrote:
> Hi!
>
> On Thu, Dec 30, 2021 at 10:43:32AM +0100, Jakub Jelinek wrote:
>> The following testcase is miscompiled, because a prologue which
>> contains subq $8, %rsp instruction is emitted at the start of
>> a basic block which contains conditional jump that depends on
>> flags register set in an earlier basic block, the prologue instruction
>> then clobbers those flags.
>> Normally this case is checked by can_get_prologue predicate, but this
>> is done only at the start of the loop.  If we update pro later in the
>> loop (because some bb shouldn't be duplicated) and then don't push
>> anything further into vec and the vec is already empty (this can happen
>> when the new pro is already in bb_with bitmask and either has no successors
>> (that is the case in the testcase where that bb ends with a trap) or
>> all the successors are already in bb_with, then the loop doesn't iterate
>> further and can_get_prologue will not be checked.
>>
>> The following simple patch makes sure we call can_get_prologue even after
>> the last former iteration when vec is already empty and only break from
>> the loop afterwards (and only if the updating of pro done because of
>> !can_get_prologue didn't push anything into vec again).
>>
>> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> That looks good, and very simple, thanks!
>
> git blame says I wrote 69.9% of shrink-wrap.c, but I am not maintainer
> of it, so I cannot approve your patch -- but it is fine afaics.
Well, we should probably fix the maintainer status of shrink-wrap.c.

In the mean time, based on Segher's review, this is fine for the trunk.

jeff
diff mbox series

Patch

--- gcc/shrink-wrap.c.jj	2021-06-07 09:24:57.731689612 +0200
+++ gcc/shrink-wrap.c	2021-12-29 19:16:28.114148496 +0100
@@ -781,7 +781,7 @@  try_shrink_wrapping (edge *entry_edge, r
   unsigned max_grow_size = get_uncond_jump_length ();
   max_grow_size *= param_max_grow_copy_bb_insns;
 
-  while (!vec.is_empty () && pro != entry)
+  while (pro != entry)
     {
       while (pro != entry && !can_get_prologue (pro, prologue_clobbered))
 	{
@@ -791,6 +791,9 @@  try_shrink_wrapping (edge *entry_edge, r
 	    vec.quick_push (pro);
 	}
 
+      if (vec.is_empty ())
+	break;
+
       basic_block bb = vec.pop ();
       if (!can_dup_for_shrink_wrapping (bb, pro, max_grow_size))
 	while (!dominated_by_p (CDI_DOMINATORS, bb, pro))
--- gcc/testsuite/gcc.dg/pr103860.c.jj	2021-12-29 19:19:23.047696170 +0100
+++ gcc/testsuite/gcc.dg/pr103860.c	2021-12-29 19:19:08.360902059 +0100
@@ -0,0 +1,31 @@ 
+/* PR rtl-optimization/103860 */
+/* { dg-do run } */
+/* { dg-options "-O3" } */
+/* { dg-additional-options "-fPIC" { target fpic } } */
+
+static int d, *e;
+int f;
+
+__attribute__((noinline)) signed char
+foo (signed char b, signed char c)
+{
+  return b + c;
+}
+
+int
+main ()
+{
+  signed char l;
+  for (l = -1; l; l = foo (l, 1))
+    {
+      while (d < 0)
+	;
+      if (d > 0)
+	{
+	  f = 0;
+	  *e = 0;
+	}
+    }
+  d = 0;
+  return 0;
+}