diff mbox

[PR43920,6/9] Cross-jumping - Use reg-notes.

Message ID 4D95E724.4030600@codesourcery.com
State New
Headers show

Commit Message

Tom de Vries April 1, 2011, 2:54 p.m. UTC
On 03/31/2011 11:16 PM, Tom de Vries wrote:
> On 03/31/2011 08:52 PM, Jeff Law wrote:
> 
>> On 03/31/11 12:42, Tom de Vries wrote:
>>> Uses regnotes to analyze whether we can replace insn a by insn b, even
>>> if we cannot replace insn b by insn a. Uses this info in crossjumping.
> 
>> Shouldn't this be using single_set rather than digging through PATTERN,
>> then verifying both are SETs, etc.?
>>
>> Otherwise don't you miss most of the benefit on architectures where most
>> insns clobber the flags register in a PARALLEL with the SET?
> 
> I see what you mean about missing these insns currently.
> 
> I guess I will have to check that the non-SET part of the PARALLEL is
> identical between the 2 insns.
> 
> I'll update the patch to handle this case.

changes compared to previous posting:
- add ChangeLog.
- use single_set
- add equal_different_set_p and use it in can_replace_by

Retested on x86_64.

Thanks,
- Tom

Comments

Tom de Vries April 4, 2011, 4:13 p.m. UTC | #1
Hi Jeff,

On 04/01/2011 04:54 PM, Tom de Vries wrote:
> On 03/31/2011 11:16 PM, Tom de Vries wrote:
>> On 03/31/2011 08:52 PM, Jeff Law wrote:
>>
>>> On 03/31/11 12:42, Tom de Vries wrote:
>>>> Uses regnotes to analyze whether we can replace insn a by insn b, even
>>>> if we cannot replace insn b by insn a. Uses this info in crossjumping.
>>
>>> Shouldn't this be using single_set rather than digging through PATTERN,
>>> then verifying both are SETs, etc.?
>>>
>>> Otherwise don't you miss most of the benefit on architectures where most
>>> insns clobber the flags register in a PARALLEL with the SET?
>>
>> I see what you mean about missing these insns currently.
>>
>> I guess I will have to check that the non-SET part of the PARALLEL is
>> identical between the 2 insns.
>>
>> I'll update the patch to handle this case.
> 
> changes compared to previous posting:
> - add ChangeLog.
> - use single_set
> - add equal_different_set_p and use it in can_replace_by
> 
> Retested on x86_64.
> 

Do these changes (
http://gcc.gnu.org/ml/gcc-patches/2011-04/msg00038.html ) address your
concerns? Is the patch OK for trunk now?

Thanks,
- Tom
Jeff Law April 6, 2011, 5:41 p.m. UTC | #2
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 04/01/11 08:54, Tom de Vries wrote:
> On 03/31/2011 11:16 PM, Tom de Vries wrote:
>> On 03/31/2011 08:52 PM, Jeff Law wrote:
>>
>>> On 03/31/11 12:42, Tom de Vries wrote:
>>>> Uses regnotes to analyze whether we can replace insn a by insn b, even
>>>> if we cannot replace insn b by insn a. Uses this info in crossjumping.
>>
>>> Shouldn't this be using single_set rather than digging through PATTERN,
>>> then verifying both are SETs, etc.?
>>>
>>> Otherwise don't you miss most of the benefit on architectures where most
>>> insns clobber the flags register in a PARALLEL with the SET?
>>
>> I see what you mean about missing these insns currently.
>>
>> I guess I will have to check that the non-SET part of the PARALLEL is
>> identical between the 2 insns.
>>
>> I'll update the patch to handle this case.
> 
> changes compared to previous posting:
> - add ChangeLog.
> - use single_set
> - add equal_different_set_p and use it in can_replace_by
> 
> Retested on x86_64.


> 	PR target/43920
> 	* cfgcleanup.c (equal_different_set_p, can_replace_by, merge_dir): New
> 	function.
> 	(old_insns_match_p): Change return type.  Replace return false/true with
> 	return dir_none/dir_both.  Use can_replace_by.
> 	(flow_find_cross_jump): Add dir_p parameter.  Init replacement direction
> 	from dir_p.  Register replacement direction in dir, last_dir and
> 	afterlast_dir.	Handle new return type of old_insns_match_p using
> 	merge_dir.  Return replacement direction in dir_p.
> 	(flow_find_head_matching_sequence, outgoing_edges_match): Handle new
> 	return type of old_insns_match_p.
> 	(try_crossjump_to_edge): Add argument to call to flow_find_cross_jump.
> 	* ifcvt.c ( cond_exec_process_if_block): Add argument to call to
> 	flow_find_cross_jump.
> 	* basic-block.h (enum replace_direction): New type.
> 	(flow_find_cross_jump): Add parameter to declaration.
OK

Jeff
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJNnKW/AAoJEBRtltQi2kC7TC8IAJdB1hgkPmmC787EUBBycPCF
/ROYeWMZ62WyVqOD+eTVFXvv6v4s0XjPHQgS+zANBQPdvA3L8V2ugFYy66SWmQZj
1NSplCrBMRhS9Fu9M8uEWjvuVEUhqxOYLnPKXqeW/gD8UEHt2+gMLAGGFI4pxQRS
L+caqVMGvNvVZqMNAUTU7FLQfT1Zo50sBvbvm9w/GfjSVNC/dmkHRqf4Ta0oIDW/
Zm5oyX4FWzun8NbW+scaQlsxAiEA5xoxzXyGlLnj9UGCTiEeaYIsgg+SyYO8CeO0
o3FpsRfe+jMSK170cd7+mufPktjmCuAdiWQa2M7W6R04AOvdOV7DxNglHMHXljg=
=NrNE
-----END PGP SIGNATURE-----
diff mbox

Patch

2011-04-01  Tom de Vries  <tom@codesourcery.com>

	PR target/43920
	* cfgcleanup.c (equal_different_set_p, can_replace_by, merge_dir): New
	function.
	(old_insns_match_p): Change return type.  Replace return false/true with
	return dir_none/dir_both.  Use can_replace_by.
	(flow_find_cross_jump): Add dir_p parameter.  Init replacement direction
	from dir_p.  Register replacement direction in dir, last_dir and
	afterlast_dir.	Handle new return type of old_insns_match_p using
	merge_dir.  Return replacement direction in dir_p.
	(flow_find_head_matching_sequence, outgoing_edges_match): Handle new
	return type of old_insns_match_p.
	(try_crossjump_to_edge): Add argument to call to flow_find_cross_jump.
	* ifcvt.c ( cond_exec_process_if_block): Add argument to call to
	flow_find_cross_jump.
	* basic-block.h (enum replace_direction): New type.
	(flow_find_cross_jump): Add parameter to declaration.

diff -u gcc/cfgcleanup.c gcc/cfgcleanup.c
--- gcc/cfgcleanup.c	(working copy)
+++ gcc/cfgcleanup.c	(working copy)
@@ -72,7 +72,7 @@ 
 static bool try_crossjump_to_edge (int, edge, edge);
 static bool try_crossjump_bb (int, basic_block);
 static bool outgoing_edges_match (int, basic_block, basic_block);
-static bool old_insns_match_p (int, rtx, rtx);
+static enum replace_direction old_insns_match_p (int, rtx, rtx);
 
 static void merge_blocks_move_predecessor_nojumps (basic_block, basic_block);
 static void merge_blocks_move_successor_nojumps (basic_block, basic_block);
@@ -950,27 +950,143 @@ 
 }
 
 
+ /* Checks if patterns P1 and P2 are equivalent, apart from the possibly
+    different single sets S1 and S2.  */
+
+static bool
+equal_different_set_p (rtx p1, rtx s1, rtx p2, rtx s2)
+{
+  int i;
+  rtx e1, e2;
+
+  if (p1 == s1 && p2 == s2)
+    return true;
+
+  if (GET_CODE (p1) != PARALLEL || GET_CODE (p2) != PARALLEL)
+    return false;
+
+  if (XVECLEN (p1, 0) != XVECLEN (p2, 0))
+    return false;
+
+  for (i = 0; i < XVECLEN (p1, 0); i++)
+    {
+      e1 = XVECEXP (p1, 0, i);
+      e2 = XVECEXP (p2, 0, i);
+      if (e1 == s1 && e2 == s2)
+        continue;
+      if (reload_completed
+          ? rtx_renumbered_equal_p (e1, e2) : rtx_equal_p (e1, e2))
+        continue;
+
+        return false;
+    }
+
+  return true;
+}
+
+/* Examine register notes on I1 and I2 and return:
+   - dir_forward if I1 can be replaced by I2, or
+   - dir_backward if I2 can be replaced by I1, or
+   - dir_both if both are the case.  */
+
+static enum replace_direction
+can_replace_by (rtx i1, rtx i2)
+{
+  rtx s1, s2, d1, d2, src1, src2, note1, note2;
+  bool c1, c2;
+
+  /* Check for 2 sets.  */
+  s1 = single_set (i1);
+  s2 = single_set (i2);
+  if (s1 == NULL_RTX || s2 == NULL_RTX)
+    return dir_none;
+
+  /* Check that the 2 sets set the same dest.  */
+  d1 = SET_DEST (s1);
+  d2 = SET_DEST (s2);
+  if (!(reload_completed
+        ? rtx_renumbered_equal_p (d1, d2) : rtx_equal_p (d1, d2)))
+    return dir_none;
+
+  /* Find identical req_equiv or reg_equal note, which implies that the 2 sets
+     set dest to the same value.  */
+  note1 = find_reg_equal_equiv_note (i1);
+  note2 = find_reg_equal_equiv_note (i2);
+  if (!note1 || !note2 || !rtx_equal_p (XEXP (note1, 0), XEXP (note2, 0))
+      || !CONST_INT_P (XEXP (note1, 0)))
+    return dir_none;
+
+  if (!equal_different_set_p (PATTERN (i1), s1, PATTERN (i2), s2))
+    return dir_none;
+
+  /* Although the 2 sets set dest to the same value, we cannot replace
+       (set (dest) (const_int))
+     by
+       (set (dest) (reg))
+     because we don't know if the reg is live and has the same value at the
+     location of replacement.  */
+  src1 = SET_SRC (s1);
+  src2 = SET_SRC (s2);
+  c1 = CONST_INT_P (src1);
+  c2 = CONST_INT_P (src2);
+  if (c1 && c2)
+    return dir_both;
+  else if (c2)
+    return dir_forward;
+  else if (c1)
+    return dir_backward;
+
+  return dir_none;
+}
+
+/* Merges directions A and B.  */
+
+static enum replace_direction
+merge_dir (enum replace_direction a, enum replace_direction b)
+{
+  /* Implements the following table:
+        |bo fw bw no
+     ---+-----------
+     bo |bo fw bw no
+     fw |-- fw no no
+     bw |-- -- bw no
+     no |-- -- -- no.  */
+
+  if (a == b)
+    return a;
+
+  if (a == dir_both)
+    return b;
+  if (b == dir_both)
+    return a;
+
+  return dir_none;
+}
+
-/* Return true if I1 and I2 are equivalent and thus can be crossjumped.  */
+/* Examine I1 and I2 and return:
+   - dir_forward if I1 can be replaced by I2, or
+   - dir_backward if I2 can be replaced by I1, or
+   - dir_both if both are the case.  */
 
-static bool
+static enum replace_direction
 old_insns_match_p (int mode ATTRIBUTE_UNUSED, rtx i1, rtx i2)
 {
   rtx p1, p2;
 
   /* Verify that I1 and I2 are equivalent.  */
   if (GET_CODE (i1) != GET_CODE (i2))
-    return false;
+    return dir_none;
 
   /* __builtin_unreachable() may lead to empty blocks (ending with
      NOTE_INSN_BASIC_BLOCK).  They may be crossjumped. */
   if (NOTE_INSN_BASIC_BLOCK_P (i1) && NOTE_INSN_BASIC_BLOCK_P (i2))
-    return true;
+    return dir_both;
 
   p1 = PATTERN (i1);
   p2 = PATTERN (i2);
 
   if (GET_CODE (p1) != GET_CODE (p2))
-    return false;
+    return dir_none;
 
   /* If this is a CALL_INSN, compare register usage information.
      If we don't check this on stack register machines, the two
@@ -991,15 +1107,15 @@ 
       rtx n2 = find_reg_note (i2, REG_EH_REGION, 0);
 
       if (!n1 && n2)
-	return false;
+	return dir_none;
 
       if (n1 && (!n2 || XEXP (n1, 0) != XEXP (n2, 0)))
-	return false;
+	return dir_none;
 
       if (!rtx_equal_p (CALL_INSN_FUNCTION_USAGE (i1),
 			CALL_INSN_FUNCTION_USAGE (i2))
 	  || SIBLING_CALL_P (i1) != SIBLING_CALL_P (i2))
-	return false;
+	return dir_none;
     }
 
 #ifdef STACK_REGS
@@ -1028,15 +1144,15 @@ 
 	  SET_HARD_REG_BIT (i2_regset, REGNO (XEXP (note, 0)));
 
       if (!hard_reg_set_equal_p (i1_regset, i2_regset))
-	return false;
+	return dir_none;
     }
 #endif
 
   if (reload_completed
       ? rtx_renumbered_equal_p (p1, p2) : rtx_equal_p (p1, p2))
-    return true;
+    return dir_both;
 
-  return false;
+  return can_replace_by (i1, i2);
 }
 
 /* When comparing insns I1 and I2 in flow_find_cross_jump or
@@ -1063,18 +1179,32 @@ 
 }
 
 /* Look through the insns at the end of BB1 and BB2 and find the longest
-   sequence that are equivalent.  Store the first insns for that sequence
-   in *F1 and *F2 and return the sequence length.
+   sequence that are either equivalent, or allow forward or backward
+   replacement.  Store the first insns for that sequence in *F1 and *F2 and
+   return the sequence length.
+
+   DIR_P indicates the allowed replacement direction on function entry, and
+   the actual replacement direction on function exit.  If NULL, only equivalent
+   sequences are allowed.
 
    To simplify callers of this function, if the blocks match exactly,
    store the head of the blocks in *F1 and *F2.  */
 
 int
-flow_find_cross_jump (basic_block bb1, basic_block bb2, rtx *f1, rtx *f2)
+flow_find_cross_jump (basic_block bb1, basic_block bb2, rtx *f1, rtx *f2,
+                      enum replace_direction *dir_p)
 {
   rtx i1, i2, last1, last2, afterlast1, afterlast2;
   int ninsns = 0;
   rtx p1;
+  enum replace_direction dir, last_dir, afterlast_dir;
+
+  if (dir_p)
+    dir = *dir_p;
+  else
+    dir = dir_both;
+  afterlast_dir = dir;
+  last_dir = afterlast_dir;
 
   /* Skip simple jumps at the end of the blocks.  Complex jumps still
      need to be compared for equivalence, which we'll do below.  */
@@ -1111,7 +1241,8 @@ 
       if (i1 == BB_HEAD (bb1) || i2 == BB_HEAD (bb2))
 	break;
 
-      if (!old_insns_match_p (0, i1, i2))
+      dir = merge_dir (dir, old_insns_match_p (0, i1, i2));
+      if (dir == dir_none || (!dir_p && dir != dir_both))
 	break;
 
       merge_memattrs (i1, i2);
@@ -1123,6 +1254,8 @@ 
 
 	  afterlast1 = last1, afterlast2 = last2;
 	  last1 = i1, last2 = i2;
+	  afterlast_dir = last_dir;
+	  last_dir = dir;
 	  p1 = PATTERN (i1);
 	  if (!(GET_CODE (p1) == USE || GET_CODE (p1) == CLOBBER))
             ninsns++;
@@ -1136,7 +1269,7 @@ 
   /* Don't allow the insn after a compare to be shared by
      cross-jumping unless the compare is also shared.  */
   if (ninsns && reg_mentioned_p (cc0_rtx, last1) && ! sets_cc0_p (last1))
-    last1 = afterlast1, last2 = afterlast2, ninsns--;
+    last1 = afterlast1, last2 = afterlast2, last_dir = afterlast_dir, ninsns--;
 #endif
 
   /* Include preceding notes and labels in the cross-jump.  One,
@@ -1162,7 +1295,9 @@ 
       *f2 = last2;
     }
 
+  if (dir_p)
+    *dir_p = last_dir;
   return ninsns;
 }
 
       /* Ignore notes.  */
@@ -1226,7 +1361,7 @@ 
 	      && nehedges1 != nehedges2))
 	break;
 
-      if (!old_insns_match_p (0, i1, i2))
+      if (old_insns_match_p (0, i1, i2) != dir_both)
 	break;
 
       merge_memattrs (i1, i2);
@@ -1455,7 +1590,8 @@ 
 		  rr.update_label_nuses = false;
 		  for_each_rtx (&BB_END (bb1), replace_label, &rr);
 
-		  match = old_insns_match_p (mode, BB_END (bb1), BB_END (bb2));
+		  match = (old_insns_match_p (mode, BB_END (bb1), BB_END (bb2))
+			   == dir_both);
 		  if (dump_file && match)
 		    fprintf (dump_file,
 			     "Tablejumps in bb %i and %i match.\n",
@@ -1477,7 +1613,7 @@ 
 
   /* First ensure that the instructions match.  There may be many outgoing
      edges so this test is generally cheaper.  */
-  if (!old_insns_match_p (mode, BB_END (bb1), BB_END (bb2)))
+  if (old_insns_match_p (mode, BB_END (bb1), BB_END (bb2)) != dir_both)
     return false;
 
   /* Search the outgoing edges, ensure that the counts do match, find possible
@@ -1578,6 +1714,7 @@ 
   int nmatch;
   basic_block src1 = e1->src, src2 = e2->src;
   basic_block redirect_to, redirect_from, to_remove;
+  enum replace_direction dir;
   rtx newpos1, newpos2;
   edge s;
   edge_iterator ei;
@@ -1633,7 +1770,8 @@ 
     return false;
 
   /* ... and part the second.  */
-  nmatch = flow_find_cross_jump (src1, src2, &newpos1, &newpos2);
+  dir = dir_forward;
+  nmatch = flow_find_cross_jump (src1, src2, &newpos1, &newpos2, &dir);
 
   /* Don't proceed with the crossjump unless we found a sufficient number
      of matching instructions or the 'from' block was totally matched
--- gcc/ifcvt.c	(revision 170556)
+++ gcc/ifcvt.c	(working copy)
@@ -476,7 +476,8 @@  cond_exec_process_if_block (ce_if_block_
       /* Look for matching sequences at the head and tail of the two blocks,
 	 and limit the range of insns to be converted if possible.  */
       n_matching = flow_find_cross_jump (then_bb, else_bb,
-					 &then_first_tail, &else_first_tail);
+					 &then_first_tail, &else_first_tail,
+                                         NULL);
       if (then_first_tail == BB_HEAD (then_bb))
 	then_start = then_end = NULL_RTX;
       if (else_first_tail == BB_HEAD (else_bb))
--- gcc/basic-block.h	(revision 170556)
+++ gcc/basic-block.h	(working copy)
@@ -803,9 +803,12 @@  extern bool purge_dead_edges (basic_bloc
 extern void find_many_sub_basic_blocks (sbitmap);
 extern void rtl_make_eh_edge (sbitmap, basic_block, rtx);
 
+enum replace_direction { dir_none, dir_forward, dir_backward, dir_both };
+
 /* In cfgcleanup.c.  */
 extern bool cleanup_cfg (int);
-extern int flow_find_cross_jump (basic_block, basic_block, rtx *, rtx *);
+extern int flow_find_cross_jump (basic_block, basic_block, rtx *, rtx *,
+                                 enum replace_direction*);
 extern int flow_find_head_matching_sequence (basic_block, basic_block,
 					     rtx *, rtx *, int);