Patchwork PR 52175: another dbr_schedule-induced cfi ICE

login
register
mail settings
Submitter Richard Sandiford
Date Feb. 11, 2012, 9:02 a.m.
Message ID <87k43t68us.fsf@firetop.home>
Download mbox | patch
Permalink /patch/140773/
State New
Headers show

Comments

Richard Sandiford - Feb. 11, 2012, 9:02 a.m.
This patch fixes PR rtl-optimization/52175, which is another case of
dbr_schedule mishandling the result of shrink-wrapping.  There is code
to allow:

        add     rN,rN,CONST

to be put into a delay slot by adding a compensating:

        add     rN,rN,-CONST

to the opposite thread.  (Which has caused headaches before because
it increases code size.)  In the testcase, we end up doing this for
a frame-related stack allocation: we allocate stack in the delay
slot, then immediately deallocate it again on the opposite thread.
This triggers an ICE because the deallocation is not marked as
frame-related, so it looks to the cfi code as though we can reach
the same point with two different CFAs.

We don't really want the optimisers to add new frame-related
annotations, and it's difficult to do that anyway in the general
case where the annotation is based on an instruction that had
REG_FRAME_RELATED_EXPR notes attached.  I also can't really think
of many cases where it would be worthwhile.  So this patch stops
us applying for the optimisation in that case, just like we did
for PR 51471.

I strongly suspect that doing this optimisation for the stack pointer is
independently wrong -- e.g. if an alloca is protected by a bounds check --
but since that part isn't a regression, I'm not tackling it here.

And, to really over-egg this one-liner, I thought I'd better
mention why I fixed it this way.  The code reads:

  /* If we haven't found anything for this delay slot and it is very
     likely that the branch will be taken, see if the insn at our target
     increments or decrements a register with an increment that does not
     depend on the destination register.  If so, try to place the opposite
     arithmetic insn after the jump insn and put the arithmetic insn in the
     delay slot.  If we can't do this, return.  */
  if (delay_list == 0 && likely
      && new_thread && !ANY_RETURN_P (new_thread)
      && NONJUMP_INSN_P (new_thread)
      && GET_CODE (PATTERN (new_thread)) != ASM_INPUT
      && asm_noperands (PATTERN (new_thread)) < 0)
    {
      rtx pat = PATTERN (new_thread);
      rtx dest;
      rtx src;

      trial = new_thread;
      pat = PATTERN (trial);

      if (!NONJUMP_INSN_P (trial)
	  || GET_CODE (pat) != SET
	  || ! eligible_for_delay (insn, 0, trial, flags)
	  || can_throw_internal (trial))
	return 0;

And there's obviously some redundancy between the two if blocks here.
So which is the right way to handle the unoptimisable case?
Returning, or falling through?  I think it's falling through,
so that we still redirect the jump in cases where we have skipped
a redundant insn without actually filling a delay slot.  I therefore
added the check to the first "if" statement rather than the second.

Despite all that, this felt obvious enough to self-approve.
Tested on various mips* targets and applied.

Richard


gcc/
	PR rtl-optimization/52175
	* reorg.c (fill_slots_from_thread): Don't apply add/sub optimization
	to frame-related instructions.

gcc/testsuite/
	PR rtl-optimization/52175
	* gcc.c-torture/compile/pr52175.c: New test.

Patch

Index: gcc/reorg.c
===================================================================
--- gcc/reorg.c	2012-02-07 22:28:46.000000000 +0000
+++ gcc/reorg.c	2012-02-07 22:40:58.000000000 +0000
@@ -2937,6 +2937,7 @@  fill_slots_from_thread (rtx insn, rtx co
   if (delay_list == 0 && likely
       && new_thread && !ANY_RETURN_P (new_thread)
       && NONJUMP_INSN_P (new_thread)
+      && !RTX_FRAME_RELATED_P (new_thread)
       && GET_CODE (PATTERN (new_thread)) != ASM_INPUT
       && asm_noperands (PATTERN (new_thread)) < 0)
     {
Index: gcc/testsuite/gcc.c-torture/compile/pr52175.c
===================================================================
--- /dev/null	2012-02-08 18:03:45.062776123 +0000
+++ gcc/testsuite/gcc.c-torture/compile/pr52175.c	2012-02-08 19:13:26.000000000 +0000
@@ -0,0 +1,25 @@ 
+void bad (void);
+char *foo (char *src, char **last)
+{
+  char *dst;
+  int ch;
+  dst = src = (src ? src : *last);
+
+  if (*src == 0)
+    return 0;
+
+  while (src[0])
+    {
+      if (!src[1])
+	{
+	  bad ();
+	  break;
+	}
+      *dst = *src;
+      dst += 1;
+      src += 2;
+    }
+  *last = src;
+  *dst = 0;
+  return *last;
+}