diff mbox

Combiner fix for PR79910

Message ID 95129f0b-df37-a90f-af88-3d2392b88be4@redhat.com
State New
Headers show

Commit Message

Bernd Schmidt March 15, 2017, 3 p.m. UTC
On 03/15/2017 12:09 AM, Bernd Schmidt wrote:
>  I'll retest with your
> suggestion and with the bitmap creation conditional on i1 being nonnull.

Like this (also had to throw in a bitmap_empty_p). Retested as before. Ok?


Bernd

Comments

Jeff Law March 15, 2017, 5:07 p.m. UTC | #1
On 03/15/2017 09:00 AM, Bernd Schmidt wrote:
> On 03/15/2017 12:09 AM, Bernd Schmidt wrote:
>>  I'll retest with your
>> suggestion and with the bitmap creation conditional on i1 being nonnull.
>
> Like this (also had to throw in a bitmap_empty_p). Retested as before. Ok?
Yea, not surprised you needed the bitmap_empty_p.

OK with or without the debug counter -- your decide on whether or not to 
include it.

jeff
Jeff Law March 17, 2017, 3:10 p.m. UTC | #2
On 03/15/2017 11:07 AM, Jeff Law wrote:
> On 03/15/2017 09:00 AM, Bernd Schmidt wrote:
>> On 03/15/2017 12:09 AM, Bernd Schmidt wrote:
>>>  I'll retest with your
>>> suggestion and with the bitmap creation conditional on i1 being nonnull.
>>
>> Like this (also had to throw in a bitmap_empty_p). Retested as before.
>> Ok?
> Yea, not surprised you needed the bitmap_empty_p.
>
> OK with or without the debug counter -- your decide on whether or not to
> include it.
Given your note that you had planned to remove the dbg_cnt stuff, I went 
ahead and remove the dbg_cnt stuff, and installed the patch (and testcase).

Jeff
Segher Boessenkool March 17, 2017, 10:14 p.m. UTC | #3
Thanks for not cc:ing me on any of this.

On Wed, Mar 15, 2017 at 04:00:21PM +0100, Bernd Schmidt wrote:
> +/* Set up a set of registers used in an insn.  Called through note_uses,
> +   arguments as described for that function.  */
> +
> +static void
> +record_used_regs (rtx *xptr, void *data)
> +{
> +  bitmap set = (bitmap)data;

Space after cast, throughout.

> +  int i, j;
> +  enum rtx_code code;
> +  const char *fmt;
> +  rtx x = *xptr;
> +
> +  /* repeat is used to turn tail-recursion into iteration since GCC
> +     can't do it when there's no return value.  */
> + repeat:

This comment is incorrect.

> +	  /* If we are about to do the last recursive call
> +	     needed at this level, change it into iteration.
> +	     This function is called enough to be worth it.  */

This function is called enough that it should not be called at all.

> +  /* For combinations that may result in two insns, we have to gather
> +     some extra information about registers used, so that we can
> +     update all relevant LOG_LINKS later.  */

Please just refuse to do the combination in such cases instead.

> +	/* See comments above where we calculate the bitmap.  */
> +	EXECUTE_IF_SET_IN_BITMAP ((bitmap)new_regs_in_i2,
> +				  LAST_VIRTUAL_REGISTER, i, iter)

Why do you need a cast here at all?

> +	  {
> +	    rtx reg = regno_reg_rtx[i];
> +	    rtx_insn *other;
> +	    for (other = NEXT_INSN (i2); other != i3; other = NEXT_INSN (other))
> +	      if (NONDEBUG_INSN_P (other)
> +		  && (reg_overlap_mentioned_p (reg, PATTERN (other))
> +		      || (CALL_P (other) && find_reg_fusage (other, USE, reg))))
> +		{
> +		  if (dump_file)
> +		    fprintf (dump_file,
> +			     "found extra use of reg %d at insn %d\n", i,
> +			     INSN_UID (other));
> +		  insn_link **plink;
> +		  for (plink = &LOG_LINKS (other);
> +		       *plink;
> +		       plink = &(*plink)->next)
> +		    {
> +		      insn_link *link = *plink;
> +		      if (link->regno == i)
> +			{
> +			  *plink = link->next;
> +			  link->next = i3links;
> +			  i3links = link;
> +			  break;
> +			}
> +		    }
> +		  break;
> +		}
> +	  }

This should be a separate function.

So, no, I'm not okay with this.  It is very expensive, it is doing open
heart surgery on combine's internal structures (in a way that may or may
not work), and all that to combine some insns in a case that should not
exist anyway.


Segher
Jeff Law March 17, 2017, 10:23 p.m. UTC | #4
On 03/17/2017 04:14 PM, Segher Boessenkool wrote:
> Thanks for not cc:ing me on any of this.
There's really no need for getting upset.  Bernd posted the message to 
the patches list.  That's sufficient.

>
> On Wed, Mar 15, 2017 at 04:00:21PM +0100, Bernd Schmidt wrote:
>> +/* Set up a set of registers used in an insn.  Called through note_uses,
>> +   arguments as described for that function.  */
>> +
>> +static void
>> +record_used_regs (rtx *xptr, void *data)
>> +{
>> +  bitmap set = (bitmap)data;
>
> Space after cast, throughout.
A nit.  Bernd, can you please fix this.

>
>> +  int i, j;
>> +  enum rtx_code code;
>> +  const char *fmt;
>> +  rtx x = *xptr;
>> +
>> +  /* repeat is used to turn tail-recursion into iteration since GCC
>> +     can't do it when there's no return value.  */
>> + repeat:
>
> This comment is incorrect.
Depends on the situation.  GCC's ability to turn tail recursion into 
iteration is not good.

>
>> +	  /* If we are about to do the last recursive call
>> +	     needed at this level, change it into iteration.
>> +	     This function is called enough to be worth it.  */
>
> This function is called enough that it should not be called at all.
I disagree.  We have a code gen bug. We need a solution.  Feel free to 
come up with something better.

>
>> +  /* For combinations that may result in two insns, we have to gather
>> +     some extra information about registers used, so that we can
>> +     update all relevant LOG_LINKS later.  */
>
> Please just refuse to do the combination in such cases instead.
?!? That would essentially mean we can't do 3->2 combinations.

>
>> +	/* See comments above where we calculate the bitmap.  */
>> +	EXECUTE_IF_SET_IN_BITMAP ((bitmap)new_regs_in_i2,
>> +				  LAST_VIRTUAL_REGISTER, i, iter)
>
> Why do you need a cast here at all?
>
>> +	  {
>> +	    rtx reg = regno_reg_rtx[i];
>> +	    rtx_insn *other;
>> +	    for (other = NEXT_INSN (i2); other != i3; other = NEXT_INSN (other))
>> +	      if (NONDEBUG_INSN_P (other)
>> +		  && (reg_overlap_mentioned_p (reg, PATTERN (other))
>> +		      || (CALL_P (other) && find_reg_fusage (other, USE, reg))))
>> +		{
>> +		  if (dump_file)
>> +		    fprintf (dump_file,
>> +			     "found extra use of reg %d at insn %d\n", i,
>> +			     INSN_UID (other));
>> +		  insn_link **plink;
>> +		  for (plink = &LOG_LINKS (other);
>> +		       *plink;
>> +		       plink = &(*plink)->next)
>> +		    {
>> +		      insn_link *link = *plink;
>> +		      if (link->regno == i)
>> +			{
>> +			  *plink = link->next;
>> +			  link->next = i3links;
>> +			  i3links = link;
>> +			  break;
>> +			}
>> +		    }
>> +		  break;
>> +		}
>> +	  }
>
> This should be a separate function.
Bernd, can you pull this out into its own function.

>
> So, no, I'm not okay with this.  It is very expensive, it is doing open
> heart surgery on combine's internal structures (in a way that may or may
> not work), and all that to combine some insns in a case that should not
> exist anyway.
Please don't be dramatic.  If you've got a better suggestion for how to 
fix this, be my guest.  I don't like the compile-time cost either, but I 
don't see a better solution.

Jeff
diff mbox

Patch

Index: gcc/combine.c
===================================================================
--- gcc/combine.c	(revision 245685)
+++ gcc/combine.c	(working copy)
@@ -104,6 +104,7 @@  along with GCC; see the file COPYING3.
 #include "valtrack.h"
 #include "rtl-iter.h"
 #include "print-rtl.h"
+#include "dbgcnt.h"
 
 /* Number of attempts to combine instructions in this function.  */
 
@@ -2559,6 +2560,57 @@  can_split_parallel_of_n_reg_sets (rtx_in
   return true;
 }
 
+/* Set up a set of registers used in an insn.  Called through note_uses,
+   arguments as described for that function.  */
+
+static void
+record_used_regs (rtx *xptr, void *data)
+{
+  bitmap set = (bitmap)data;
+  int i, j;
+  enum rtx_code code;
+  const char *fmt;
+  rtx x = *xptr;
+
+  /* repeat is used to turn tail-recursion into iteration since GCC
+     can't do it when there's no return value.  */
+ repeat:
+  if (x == 0)
+    return;
+
+  code = GET_CODE (x);
+  if (REG_P (x))
+    {
+      unsigned regno = REGNO (x);
+      unsigned end_regno = END_REGNO (x);
+      while (regno < end_regno)
+	bitmap_set_bit (set, regno++);
+      return;
+    }
+
+  /* Recursively scan the operands of this expression.  */
+
+  for (i = GET_RTX_LENGTH (code) - 1, fmt = GET_RTX_FORMAT (code); i >= 0; i--)
+    {
+      if (fmt[i] == 'e')
+	{
+	  /* If we are about to do the last recursive call
+	     needed at this level, change it into iteration.
+	     This function is called enough to be worth it.  */
+	  if (i == 0)
+	    {
+	      x = XEXP (x, 0);
+	      goto repeat;
+	    }
+
+	  record_used_regs (&XEXP (x, i), data);
+	}
+      else if (fmt[i] == 'E')
+	for (j = 0; j < XVECLEN (x, i); j++)
+	  record_used_regs (&XVECEXP (x, i, j), data);
+    }
+}
+
 /* Try to combine the insns I0, I1 and I2 into I3.
    Here I0, I1 and I2 appear earlier than I3.
    I0 and I1 can be zero; then we combine just I2 into I3, or I1 and I2 into
@@ -2742,6 +2794,27 @@  try_combine (rtx_insn *i3, rtx_insn *i2,
 
   added_links_insn = 0;
 
+  /* For combinations that may result in two insns, we have to gather
+     some extra information about registers used, so that we can
+     update all relevant LOG_LINKS later.  */
+  auto_bitmap i2_regset, i3_regset, links_regset;
+  if (i1)
+    {
+      note_uses (&PATTERN (i2), record_used_regs, (bitmap)i2_regset);
+      note_uses (&PATTERN (i3), record_used_regs, (bitmap)i3_regset);
+      insn_link *ll;
+      FOR_EACH_LOG_LINK (ll, i3)
+	bitmap_set_bit (links_regset, ll->regno);
+      FOR_EACH_LOG_LINK (ll, i2)
+	bitmap_set_bit (links_regset, ll->regno);
+      if (i1)
+	FOR_EACH_LOG_LINK (ll, i1)
+	  bitmap_set_bit (links_regset, ll->regno);
+      if (i0)
+	FOR_EACH_LOG_LINK (ll, i0)
+	  bitmap_set_bit (links_regset, ll->regno);
+    }
+
   /* First check for one important special case that the code below will
      not handle.  Namely, the case where I1 is zero, I2 is a PARALLEL
      and I3 is a SET whose SET_SRC is a SET_DEST in I2.  In that case,
@@ -4004,6 +4077,12 @@  try_combine (rtx_insn *i3, rtx_insn *i2,
 	}
     }
 
+  if (!dbg_cnt (combine))
+    {
+      undo_all ();
+      return 0;
+    }
+
   /* If it still isn't recognized, fail and change things back the way they
      were.  */
   if ((insn_code_number < 0
@@ -4051,6 +4130,33 @@  try_combine (rtx_insn *i3, rtx_insn *i2,
       return 0;
     }
 
+  auto_bitmap new_regs_in_i2;
+  if (newi2pat)
+    {
+      /* We need to discover situations where we introduce a use of a
+	 register into I2, where none of the existing LOG_LINKS contain
+	 a reference to it.  This can happen if previously I3 referenced
+	 the reg, and there is an additional use between I2 and I3.  We
+	 must remove the LOG_LINKS entry from that additional use and
+	 distribute it along with our own ones.  */
+	note_uses (&newi2pat, record_used_regs, (bitmap)new_regs_in_i2);
+	bitmap_and_compl_into (new_regs_in_i2, i2_regset);
+	bitmap_and_compl_into (new_regs_in_i2, links_regset);
+
+	/* Here, we first look for situations where a hard register use
+	   moved, and just give up.  This should happen approximately
+	   never, and it's not worth it to deal with possibilities like
+	   multi-word registers.  Later, when fixing up LOG_LINKS, we
+	   deal with the case where a pseudo use moved.  */
+	if (!bitmap_empty_p (new_regs_in_i2)
+	    && prev_nonnote_insn (i3) != i2
+	    && bitmap_first_set_bit (new_regs_in_i2) < FIRST_PSEUDO_REGISTER)
+	  {
+	    undo_all ();
+	    return 0;
+	  }
+    }
+
   if (MAY_HAVE_DEBUG_INSNS)
     {
       struct undo *undo;
@@ -4494,6 +4600,45 @@  try_combine (rtx_insn *i3, rtx_insn *i2,
 			    NULL_RTX, NULL_RTX, NULL_RTX);
       }
 
+    if (newi2pat)
+      {
+	bitmap_iterator iter;
+	unsigned int i;
+
+	/* See comments above where we calculate the bitmap.  */
+	EXECUTE_IF_SET_IN_BITMAP ((bitmap)new_regs_in_i2,
+				  LAST_VIRTUAL_REGISTER, i, iter)
+	  {
+	    rtx reg = regno_reg_rtx[i];
+	    rtx_insn *other;
+	    for (other = NEXT_INSN (i2); other != i3; other = NEXT_INSN (other))
+	      if (NONDEBUG_INSN_P (other)
+		  && (reg_overlap_mentioned_p (reg, PATTERN (other))
+		      || (CALL_P (other) && find_reg_fusage (other, USE, reg))))
+		{
+		  if (dump_file)
+		    fprintf (dump_file,
+			     "found extra use of reg %d at insn %d\n", i,
+			     INSN_UID (other));
+		  insn_link **plink;
+		  for (plink = &LOG_LINKS (other);
+		       *plink;
+		       plink = &(*plink)->next)
+		    {
+		      insn_link *link = *plink;
+		      if (link->regno == i)
+			{
+			  *plink = link->next;
+			  link->next = i3links;
+			  i3links = link;
+			  break;
+			}
+		    }
+		  break;
+		}
+	  }
+      }
+
     distribute_links (i3links);
     distribute_links (i2links);
     distribute_links (i1links);