diff mbox

Improvements to -freorder-blocks-and-partition support

Message ID 20130514214201.5DA3980BE7@tjsboxrox.mtv.corp.google.com
State New
Headers show

Commit Message

Teresa Johnson May 14, 2013, 9:42 p.m. UTC
Patch 1 of 3 split out from the patch I sent last week that fixes problems with
-freorder-blocks-and-partition, with some additional verification improvements.

See http://gcc.gnu.org/ml/gcc-patches/2013-05/threads.html#00388 for context.

This patch adds a flag to the rtl_data structure to indicate whether any
partitioning was actually performed, so that optimizations which were
conservatively disabled whenever the flag_reorder_blocks_and_partition
is enabled (e.g. try_crossjump_to_edge, part of connect_traces) can be less
conservative for functions where no partitions were formed (e.g. they are
completely hot).

It also adds another flag to the rtl_data structure to indicate whether bb
reordering is complete, and if so enables sanity checking that there is
at most one transition in the layout order between hot and cold sections.
This was moved from verify_hot_cold_block_grouping, which was only called
once at the end of the bbro pass and is now removed.

Tested on x86_64-unknown-linux-gnu with bootstrap and profiledbootstrap
builds and regression testing. Additionally built/ran cpu2006 with profile
feedback and -freorder-blocks-and-partition enabled (which currently has
build failures from splitting until my follow-on patches are in, but confirmed
no new failures due to this patch), as well as gcc regression testing with
-freorder-blocks-and-partition enabled.

Ok for trunk?

Thanks,
Teresa

2013-05-14  Teresa Johnson  <tejohnson@google.com>

	* function.h (has_bb_partition): New rtl_data flag.
        (bb_reorder_complete): Ditto.
        * cfgrtl.c (rtl_verify_flow_info_1): After bbro, verify
        that text sections switch at most once in layout order.
	* bb-reorder.c (connect_traces): Check for has_bb_partition
        instead of flag_reorder_blocks_and_partition.
	(verify_hot_cold_block_grouping): Remove.
	(reorder_basic_blocks): Remove call to deleted
        verify_hot_cold_block_grouping, and set bb_reorder_complete.
	(partition_hot_cold_basic_blocks): Set has_bb_partition.
	* cfgcleanup.c (try_crossjump_to_edge): Check for has_bb_partition
        instead of flag_reorder_blocks_and_partition.

Comments

Steven Bosscher May 14, 2013, 9:49 p.m. UTC | #1
On Tue, May 14, 2013 at 11:42 PM, Teresa Johnson wrote:
>
>         * function.h (has_bb_partition): New rtl_data flag.
>         (bb_reorder_complete): Ditto.
>         * cfgrtl.c (rtl_verify_flow_info_1): After bbro, verify
>         that text sections switch at most once in layout order.
>         * bb-reorder.c (connect_traces): Check for has_bb_partition
>         instead of flag_reorder_blocks_and_partition.
>         (verify_hot_cold_block_grouping): Remove.
>         (reorder_basic_blocks): Remove call to deleted
>         verify_hot_cold_block_grouping, and set bb_reorder_complete.
>         (partition_hot_cold_basic_blocks): Set has_bb_partition.
>         * cfgcleanup.c (try_crossjump_to_edge): Check for has_bb_partition
>         instead of flag_reorder_blocks_and_partition.
>

I can't approve this (I'm probably the most experienced GCC
contributor without any approval authority :-) but it looks good to
me.

One nit: Can you keep the verify_hot_cold_block_grouping function
separate? rtl_verify_flow_info* is already too big and complex
(somewhere down on my TODO list is splitting it up and improving
cfglayout mode checking, e.g. to make sure there are no barriers/notes
between basic blocks...).

Ciao!
Steven
Teresa Johnson May 14, 2013, 10:50 p.m. UTC | #2
On Tue, May 14, 2013 at 2:49 PM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
> On Tue, May 14, 2013 at 11:42 PM, Teresa Johnson wrote:
>>
>>         * function.h (has_bb_partition): New rtl_data flag.
>>         (bb_reorder_complete): Ditto.
>>         * cfgrtl.c (rtl_verify_flow_info_1): After bbro, verify
>>         that text sections switch at most once in layout order.
>>         * bb-reorder.c (connect_traces): Check for has_bb_partition
>>         instead of flag_reorder_blocks_and_partition.
>>         (verify_hot_cold_block_grouping): Remove.
>>         (reorder_basic_blocks): Remove call to deleted
>>         verify_hot_cold_block_grouping, and set bb_reorder_complete.
>>         (partition_hot_cold_basic_blocks): Set has_bb_partition.
>>         * cfgcleanup.c (try_crossjump_to_edge): Check for has_bb_partition
>>         instead of flag_reorder_blocks_and_partition.
>>
>
> I can't approve this (I'm probably the most experienced GCC
> contributor without any approval authority :-) but it looks good to
> me.
>
> One nit: Can you keep the verify_hot_cold_block_grouping function
> separate? rtl_verify_flow_info* is already too big and complex
> (somewhere down on my TODO list is splitting it up and improving
> cfglayout mode checking, e.g. to make sure there are no barriers/notes
> between basic blocks...).

Initially that's what I did, but then it seemed less efficient because
it adds another iteration over the BBs, so I instead merged the check
with one of the existing BB iterations in that routine. Do you still
prefer that I outline it?

As an alternative, perhaps rtl_verify_flow_info_1 could be split in
half in a follow-on patch since there are a couple walks over the BBs
in there. WDYT?

Thanks,
Teresa

>
> Ciao!
> Steven



--
Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
Jeff Law May 15, 2013, 4:33 a.m. UTC | #3
On 05/14/2013 03:42 PM, Teresa Johnson wrote:
> Patch 1 of 3 split out from the patch I sent last week that fixes problems with
> -freorder-blocks-and-partition, with some additional verification improvements.
>
> See http://gcc.gnu.org/ml/gcc-patches/2013-05/threads.html#00388 for context.
>
> This patch adds a flag to the rtl_data structure to indicate whether any
> partitioning was actually performed, so that optimizations which were
> conservatively disabled whenever the flag_reorder_blocks_and_partition
> is enabled (e.g. try_crossjump_to_edge, part of connect_traces) can be less
> conservative for functions where no partitions were formed (e.g. they are
> completely hot).
>
> It also adds another flag to the rtl_data structure to indicate whether bb
> reordering is complete, and if so enables sanity checking that there is
> at most one transition in the layout order between hot and cold sections.
> This was moved from verify_hot_cold_block_grouping, which was only called
> once at the end of the bbro pass and is now removed.
>
> Tested on x86_64-unknown-linux-gnu with bootstrap and profiledbootstrap
> builds and regression testing. Additionally built/ran cpu2006 with profile
> feedback and -freorder-blocks-and-partition enabled (which currently has
> build failures from splitting until my follow-on patches are in, but confirmed
> no new failures due to this patch), as well as gcc regression testing with
> -freorder-blocks-and-partition enabled.
>
> Ok for trunk?
>
> Thanks,
> Teresa
>
> 2013-05-14  Teresa Johnson  <tejohnson@google.com>
>
> 	* function.h (has_bb_partition): New rtl_data flag.
>          (bb_reorder_complete): Ditto.
>          * cfgrtl.c (rtl_verify_flow_info_1): After bbro, verify
>          that text sections switch at most once in layout order.
> 	* bb-reorder.c (connect_traces): Check for has_bb_partition
>          instead of flag_reorder_blocks_and_partition.
> 	(verify_hot_cold_block_grouping): Remove.
> 	(reorder_basic_blocks): Remove call to deleted
>          verify_hot_cold_block_grouping, and set bb_reorder_complete.
> 	(partition_hot_cold_basic_blocks): Set has_bb_partition.
> 	* cfgcleanup.c (try_crossjump_to_edge): Check for has_bb_partition
>          instead of flag_reorder_blocks_and_partition.
This is good.  Please install on the trunk.

jeff
Steven Bosscher May 15, 2013, 6:02 a.m. UTC | #4
On Wed, May 15, 2013 at 12:50 AM, Teresa Johnson wrote:
> On Tue, May 14, 2013 at 2:49 PM, Steven Bosscher wrote:
>> One nit: Can you keep the verify_hot_cold_block_grouping function
>> separate? rtl_verify_flow_info* is already too big and complex
>> (somewhere down on my TODO list is splitting it up and improving
>> cfglayout mode checking, e.g. to make sure there are no barriers/notes
>> between basic blocks...).
>
> Initially that's what I did, but then it seemed less efficient because
> it adds another iteration over the BBs, so I instead merged the check
> with one of the existing BB iterations in that routine. Do you still
> prefer that I outline it?

I'm not concerned about efficiency of the checker routines. They only
run with checking enabled anyway. These checkers are almost the best
"documentation" of the rules of GCC's intermediate representations
that we have, so it's more important to me to make them easy to
understand, and to make sure these verifiers themselves are complete
and correct.

So another BB walk wouldn't be a problem IMHO. Some of the other
checkers do far worse things (some of them include non-linear
algorithms, for example).

Ciao!
Steven
Jeff Law May 15, 2013, 6:04 a.m. UTC | #5
On 05/15/2013 12:02 AM, Steven Bosscher wrote:
> On Wed, May 15, 2013 at 12:50 AM, Teresa Johnson wrote:
>> On Tue, May 14, 2013 at 2:49 PM, Steven Bosscher wrote:
>>> One nit: Can you keep the verify_hot_cold_block_grouping function
>>> separate? rtl_verify_flow_info* is already too big and complex
>>> (somewhere down on my TODO list is splitting it up and improving
>>> cfglayout mode checking, e.g. to make sure there are no barriers/notes
>>> between basic blocks...).
>>
>> Initially that's what I did, but then it seemed less efficient because
>> it adds another iteration over the BBs, so I instead merged the check
>> with one of the existing BB iterations in that routine. Do you still
>> prefer that I outline it?
>
> I'm not concerned about efficiency of the checker routines. They only
> run with checking enabled anyway. These checkers are almost the best
> "documentation" of the rules of GCC's intermediate representations
> that we have, so it's more important to me to make them easy to
> understand, and to make sure these verifiers themselves are complete
> and correct.
Right.

>
> So another BB walk wouldn't be a problem IMHO. Some of the other
> checkers do far worse things (some of them include non-linear
> algorithms, for example).
I've got no problem with another BB walk.  I'll pre-approve the existing 
patch with the checker moved back out.

jeff
Teresa Johnson May 15, 2013, 2:17 p.m. UTC | #6
On Tue, May 14, 2013 at 11:04 PM, Jeff Law <law@redhat.com> wrote:
> On 05/15/2013 12:02 AM, Steven Bosscher wrote:
>>
>> On Wed, May 15, 2013 at 12:50 AM, Teresa Johnson wrote:
>>>
>>> On Tue, May 14, 2013 at 2:49 PM, Steven Bosscher wrote:
>>>>
>>>> One nit: Can you keep the verify_hot_cold_block_grouping function
>>>> separate? rtl_verify_flow_info* is already too big and complex
>>>> (somewhere down on my TODO list is splitting it up and improving
>>>> cfglayout mode checking, e.g. to make sure there are no barriers/notes
>>>> between basic blocks...).
>>>
>>>
>>> Initially that's what I did, but then it seemed less efficient because
>>> it adds another iteration over the BBs, so I instead merged the check
>>> with one of the existing BB iterations in that routine. Do you still
>>> prefer that I outline it?
>>
>>
>> I'm not concerned about efficiency of the checker routines. They only
>> run with checking enabled anyway. These checkers are almost the best
>> "documentation" of the rules of GCC's intermediate representations
>> that we have, so it's more important to me to make them easy to
>> understand, and to make sure these verifiers themselves are complete
>> and correct.
>
> Right.
>
>
>>
>> So another BB walk wouldn't be a problem IMHO. Some of the other
>> checkers do far worse things (some of them include non-linear
>> algorithms, for example).
>
> I've got no problem with another BB walk.  I'll pre-approve the existing
> patch with the checker moved back out.

Done, retested with bootstrap/regression on x86_64-unknown-linux-gnu,
and committed as r198934.

Steven - I was looking at rtl_verify_flow_info_1 last night and have
some ideas about how to refactor it a bit. I will try to send a patch
for that in the next day since I will be adding to that verification
code with my other function splitting patches and it would be good to
get that refactored first.

Thanks,
Teresa

>
> jeff



--
Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
diff mbox

Patch

Index: bb-reorder.c
===================================================================
--- bb-reorder.c	(revision 198891)
+++ bb-reorder.c	(working copy)
@@ -1053,7 +1053,7 @@  connect_traces (int n_traces, struct trace *traces
   current_partition = BB_PARTITION (traces[0].first);
   two_passes = false;
 
-  if (flag_reorder_blocks_and_partition)
+  if (crtl->has_bb_partition)
     for (i = 0; i < n_traces && !two_passes; i++)
       if (BB_PARTITION (traces[0].first)
 	  != BB_PARTITION (traces[i].first))
@@ -1262,7 +1262,7 @@  connect_traces (int n_traces, struct trace *traces
 		      }
 		  }
 
-	      if (flag_reorder_blocks_and_partition)
+	      if (crtl->has_bb_partition)
 		try_copy = false;
 
 	      /* Copy tiny blocks always; copy larger blocks only when the
@@ -2068,43 +2068,6 @@  add_reg_crossing_jump_notes (void)
 	add_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX);
 }
 
-/* Verify, in the basic block chain, that there is at most one switch
-   between hot/cold partitions. This is modelled on
-   rtl_verify_flow_info_1, but it cannot go inside that function
-   because this condition will not be true until after
-   reorder_basic_blocks is called.  */
-
-static void
-verify_hot_cold_block_grouping (void)
-{
-  basic_block bb;
-  int err = 0;
-  bool switched_sections = false;
-  int current_partition = 0;
-
-  FOR_EACH_BB (bb)
-    {
-      if (!current_partition)
-	current_partition = BB_PARTITION (bb);
-      if (BB_PARTITION (bb) != current_partition)
-	{
-	  if (switched_sections)
-	    {
-	      error ("multiple hot/cold transitions found (bb %i)",
-		     bb->index);
-	      err = 1;
-	    }
-	  else
-	    {
-	      switched_sections = true;
-	      current_partition = BB_PARTITION (bb);
-	    }
-	}
-    }
-
-  gcc_assert(!err);
-}
-
 /* Reorder basic blocks.  The main entry point to this file.  FLAGS is
    the set of flags to pass to cfg_layout_initialize().  */
 
@@ -2157,8 +2120,9 @@  reorder_basic_blocks (void)
       dump_flow_info (dump_file, dump_flags);
     }
 
-  if (flag_reorder_blocks_and_partition)
-    verify_hot_cold_block_grouping ();
+  /* Signal that rtl_verify_flow_info_1 can now verify that there
+     is at most one switch between hot/cold sections.  */
+  crtl->bb_reorder_complete = true;
 }
 
 /* Determine which partition the first basic block in the function
@@ -2503,6 +2467,8 @@  partition_hot_cold_basic_blocks (void)
   if (!crossing_edges.exists ())
     return 0;
 
+  crtl->has_bb_partition = true;
+
   /* Make sure the source of any crossing edge ends in a jump and the
      destination of any crossing edge has a label.  */
   add_labels_and_missing_jumps (crossing_edges);
Index: cfgrtl.c
===================================================================
--- cfgrtl.c	(revision 198891)
+++ cfgrtl.c	(working copy)
@@ -2082,6 +2082,8 @@  rtl_verify_flow_info_1 (void)
   rtx x;
   int err = 0;
   basic_block bb;
+  bool switched_sections = false;
+  int current_partition = BB_UNPARTITIONED;
 
   /* Check the general integrity of the basic blocks.  */
   FOR_EACH_BB_REVERSE (bb)
@@ -2299,6 +2301,29 @@  rtl_verify_flow_info_1 (void)
 	  err = 1;
 	}
 
+      /* Verify, in the basic block chain, that there is at most one switch
+         between hot/cold partitions. This condition will not be true until
+         after reorder_basic_blocks is called.  */
+      if (crtl->bb_reorder_complete)
+        {
+          if (current_partition != BB_UNPARTITIONED
+              && BB_PARTITION (bb) != current_partition)
+            {
+              if (switched_sections)
+              {
+                error ("multiple hot/cold transitions found (bb %i)",
+                       bb->index);
+                err = 1;
+              }
+              else
+                switched_sections = true;
+
+              if (!crtl->has_bb_partition)
+                error ("partition found but function partition flag not set");
+            }
+          current_partition = BB_PARTITION (bb);
+        }
+
       if (BB_END (bb) == x)
 	/* Do checks for empty blocks here.  */
 	;
Index: function.h
===================================================================
--- function.h	(revision 198891)
+++ function.h	(working copy)
@@ -446,6 +446,15 @@  struct GTY(()) rtl_data {
      sched2) and is useful only if the port defines LEAF_REGISTERS.  */
   bool uses_only_leaf_regs;
 
+  /* Nonzero if the function being compiled has undergone hot/cold partitioning
+     (under flag_reorder_blocks_and_partition) and has at least one cold
+     block.  */
+  bool has_bb_partition;
+
+  /* Nonzero if the function being compiled has completed the bb reordering
+     pass.  */
+  bool bb_reorder_complete;
+
   /* Like regs_ever_live, but 1 if a reg is set or clobbered from an
      asm.  Unlike regs_ever_live, elements of this array corresponding
      to eliminable regs (like the frame pointer) are set if an asm
Index: cfgcleanup.c
===================================================================
--- cfgcleanup.c	(revision 198891)
+++ cfgcleanup.c	(working copy)
@@ -1864,7 +1864,7 @@  try_crossjump_to_edge (int mode, edge e1, edge e2,
      partition boundaries).  See the comments at the top of
      bb-reorder.c:partition_hot_cold_basic_blocks for complete details.  */
 
-  if (flag_reorder_blocks_and_partition && reload_completed)
+  if (crtl->has_bb_partition && reload_completed)
     return false;
 
   /* Search backward through forwarder blocks.  We don't need to worry