Patchwork Fix PR 53743 and other -freorder-blocks-and-partition failures

login
register
mail settings
Submitter Teresa Johnson
Date May 20, 2013, 2:55 a.m.
Message ID <20130520025501.E9F99805D7@tjsboxrox.mtv.corp.google.com>
Download mbox | patch
Permalink /patch/244834/
State New
Headers show

Comments

Teresa Johnson - May 20, 2013, 2:55 a.m.
Patch 2 of 3 split out from the patch I sent last week that fixes problems with
-freorder-blocks-and-partition, with changes/fixes discussed in that thread,
along with some additional verification improvements.

See http://gcc.gnu.org/ml/gcc-patches/2013-05/threads.html#00388 for context.

This portion of the original patch fixes failures encountered when enabling
-freorder-blocks-and-partition, including the failure reported in PR 53743.

I attempted to make the handling of partition updates through the optimization
passes much more consistent, removing a number of partial fixes in the code
stream in the process. The code to fixup partitions (including the BB_PARTITION
assignment, region crossing jump notes, and switch text section notes) is
now handled in a few centralized locations. For example, inside
rtl_redirect_edge_and_branch and force_nonfallthru_and_redirect, so that callers
don't need to attempt the fixup themselves.

The main changes from the earlier patch are: (1) The switch text section
notes are not inserted until the free_cfg pass, avoiding the need to
maintain these properly when going in and out of cfglayout mode;
(2) Unnecessary forward blocks between partitions are avoided by
suppressing unnecessary calls to force_nonfallthru during bbpart, avoiding
the need for additional cleanup later; and (3) fixing a few places
where region crossing jump notes were not being maintained properly,
exposed by additional verification checks I added.

Tested on x86_64-unknown-linux-gnu with bootstrap and profiledbootstrap
builds and regression testing. Additionally built/ran cpu2006int with profile
feedback and -freorder-blocks-and-partition enabled - all benchmarks now
pass (previously only 6 passed). This also passes a profiledbootstrap with
-freorder-blocks-and-partition enabled by default (although this required
forcing the dwarf version down to 2 to workaround issues with debug range
generation and partitioning that still need to be addressed for -g
compilations).

Ok for trunk?

Thanks,
Teresa

2013-05-19  Teresa Johnson  <tejohnson@google.com>

	* ifcvt.c (find_if_case_1): Replace BB_COPY_PARTITION with assert
        as this is now done by redirect_edge_and_branch_force.
	* function.c (thread_prologue_and_epilogue_insns): Insert new bb after
        barriers, and fix interaction with splitting.
	* emit-rtl.c (try_split): Copy REG_CROSSING_JUMP notes.
	* cfgcleanup.c (try_forward_edges): Fix early return value to properly
        reflect changes made in the routine.
	* bb-reorder.c (emit_barrier_after_bb): Handle insertion in
        non-cfglayout mode.
	(fix_up_fall_thru_edges): Remove incorrect check for bb layout order
        since this is called in cfglayout mode, and replace partition fixup
        with assert as that is now done by force_nonfallthru_and_redirect.
	(add_reg_crossing_jump_notes): Handle the fact that some jumps may
        already be marked with region crossing note.
	(insert_section_boundary_note): Make non-static, gate on flag
        has_bb_partition, rewrite to also check for multiple partitions.
	(rest_of_handle_reorder_blocks): Remove call to
        insert_section_boundary_note, now done later during free_cfg.
	* bb-reorder.h: Declare insert_section_boundary_note and
        emit_barrier_after_bb, which are no longer static.
	* Makefile.in (cfgrtl.o): Depend on bb-reorder.h
	* cfgrtl.c (rest_of_pass_free_cfg): If partitions exist
        invoke insert_section_boundary_note.
	(try_redirect_by_replacing_jump): Remove unnecessary
        check for region crossing note.
	(fixup_partition_crossing): New function.
	(rtl_redirect_edge_and_branch): Fixup partition boundaries.
	(force_nonfallthru_and_redirect): Fixup partition boundaries,
        remove old code that tried to do this. Emit barrier correctly
        when we are in cfglayout mode.
	(rtl_split_edge): Correctly fixup partition boundaries.
	(commit_one_edge_insertion): Remove old code that tried to
        fixup region crossing edge since this is now handled in
        split_block, and set up insertion point correctly since
        block may now end in a jump.
	(rtl_verify_edges): Add checks for incorrect/missing REG_CROSSING_JUMP
        notes.
	(fixup_reorder_chain): Remove old code that attempted to fixup region
        crossing note as this is now handled in force_nonfallthru_and_redirect.
	(duplicate_insn_chain): Don't duplicate switch section notes.
	(rtl_can_remove_branch_p): Remove unnecessary check for region crossing
        note.
Teresa Johnson - May 20, 2013, 2:59 a.m.
Please use this email to respond - I removed a bogus email address I
inadvertently included. Also, I have included the patch as an
attachment here as well, since it looks like some of the spacing got
messed up when it was included inline.

Thanks,
Teresa

On Sun, May 19, 2013 at 7:55 PM, Teresa Johnson <tejohnson@google.com> wrote:
> Patch 2 of 3 split out from the patch I sent last week that fixes problems with
> -freorder-blocks-and-partition, with changes/fixes discussed in that thread,
> along with some additional verification improvements.
>
> See http://gcc.gnu.org/ml/gcc-patches/2013-05/threads.html#00388 for context.
>
> This portion of the original patch fixes failures encountered when enabling
> -freorder-blocks-and-partition, including the failure reported in PR 53743.
>
> I attempted to make the handling of partition updates through the optimization
> passes much more consistent, removing a number of partial fixes in the code
> stream in the process. The code to fixup partitions (including the BB_PARTITION
> assignment, region crossing jump notes, and switch text section notes) is
> now handled in a few centralized locations. For example, inside
> rtl_redirect_edge_and_branch and force_nonfallthru_and_redirect, so that callers
> don't need to attempt the fixup themselves.
>
> The main changes from the earlier patch are: (1) The switch text section
> notes are not inserted until the free_cfg pass, avoiding the need to
> maintain these properly when going in and out of cfglayout mode;
> (2) Unnecessary forward blocks between partitions are avoided by
> suppressing unnecessary calls to force_nonfallthru during bbpart, avoiding
> the need for additional cleanup later; and (3) fixing a few places
> where region crossing jump notes were not being maintained properly,
> exposed by additional verification checks I added.
>
> Tested on x86_64-unknown-linux-gnu with bootstrap and profiledbootstrap
> builds and regression testing. Additionally built/ran cpu2006int with profile
> feedback and -freorder-blocks-and-partition enabled - all benchmarks now
> pass (previously only 6 passed). This also passes a profiledbootstrap with
> -freorder-blocks-and-partition enabled by default (although this required
> forcing the dwarf version down to 2 to workaround issues with debug range
> generation and partitioning that still need to be addressed for -g
> compilations).
>
> Ok for trunk?
>
> Thanks,
> Teresa
>
> 2013-05-19  Teresa Johnson  <tejohnson@google.com>
>
>         * ifcvt.c (find_if_case_1): Replace BB_COPY_PARTITION with assert
>         as this is now done by redirect_edge_and_branch_force.
>         * function.c (thread_prologue_and_epilogue_insns): Insert new bb after
>         barriers, and fix interaction with splitting.
>         * emit-rtl.c (try_split): Copy REG_CROSSING_JUMP notes.
>         * cfgcleanup.c (try_forward_edges): Fix early return value to properly
>         reflect changes made in the routine.
>         * bb-reorder.c (emit_barrier_after_bb): Handle insertion in
>         non-cfglayout mode.
>         (fix_up_fall_thru_edges): Remove incorrect check for bb layout order
>         since this is called in cfglayout mode, and replace partition fixup
>         with assert as that is now done by force_nonfallthru_and_redirect.
>         (add_reg_crossing_jump_notes): Handle the fact that some jumps may
>         already be marked with region crossing note.
>         (insert_section_boundary_note): Make non-static, gate on flag
>         has_bb_partition, rewrite to also check for multiple partitions.
>         (rest_of_handle_reorder_blocks): Remove call to
>         insert_section_boundary_note, now done later during free_cfg.
>         * bb-reorder.h: Declare insert_section_boundary_note and
>         emit_barrier_after_bb, which are no longer static.
>         * Makefile.in (cfgrtl.o): Depend on bb-reorder.h
>         * cfgrtl.c (rest_of_pass_free_cfg): If partitions exist
>         invoke insert_section_boundary_note.
>         (try_redirect_by_replacing_jump): Remove unnecessary
>         check for region crossing note.
>         (fixup_partition_crossing): New function.
>         (rtl_redirect_edge_and_branch): Fixup partition boundaries.
>         (force_nonfallthru_and_redirect): Fixup partition boundaries,
>         remove old code that tried to do this. Emit barrier correctly
>         when we are in cfglayout mode.
>         (rtl_split_edge): Correctly fixup partition boundaries.
>         (commit_one_edge_insertion): Remove old code that tried to
>         fixup region crossing edge since this is now handled in
>         split_block, and set up insertion point correctly since
>         block may now end in a jump.
>         (rtl_verify_edges): Add checks for incorrect/missing REG_CROSSING_JUMP
>         notes.
>         (fixup_reorder_chain): Remove old code that attempted to fixup region
>         crossing note as this is now handled in force_nonfallthru_and_redirect.
>         (duplicate_insn_chain): Don't duplicate switch section notes.
>         (rtl_can_remove_branch_p): Remove unnecessary check for region crossing
>         note.
>
> Index: ifcvt.c
> ===================================================================
> --- ifcvt.c     (revision 199014)
> +++ ifcvt.c     (working copy)
> @@ -3905,10 +3905,9 @@ find_if_case_1 (basic_block test_bb, edge then_edg
>    if (new_bb)
>      {
>        df_bb_replace (then_bb_index, new_bb);
> -      /* Since the fallthru edge was redirected from test_bb to new_bb,
> -         we need to ensure that new_bb is in the same partition as
> -         test bb (you can not fall through across section boundaries).  */
> -      BB_COPY_PARTITION (new_bb, test_bb);
> +      /* This should have been done above via force_nonfallthru_and_redirect
> +         (possibly called from redirect_edge_and_branch_force).  */
> +      gcc_assert (BB_PARTITION (new_bb) == BB_PARTITION (test_bb));
>      }
>
>    num_true_changes++;
> Index: function.c
> ===================================================================
> --- function.c  (revision 199014)
> +++ function.c  (working copy)
> @@ -6270,8 +6270,10 @@ thread_prologue_and_epilogue_insns (void)
>                     break;
>                 if (e)
>                   {
> -                   copy_bb = create_basic_block (NEXT_INSN (BB_END (e->src)),
> -                                                 NULL_RTX, e->src);
> +                    /* Make sure we insert after any barriers.  */
> +                    rtx end = get_last_bb_insn (e->src);
> +                    copy_bb = create_basic_block (NEXT_INSN (end),
> +                                                  NULL_RTX, e->src);
>                     BB_COPY_PARTITION (copy_bb, e->src);
>                   }
>                 else
> @@ -6538,7 +6540,7 @@ epilogue_done:
>        basic_block simple_return_block_cold = NULL;
>        edge pending_edge_hot = NULL;
>        edge pending_edge_cold = NULL;
> -      basic_block exit_pred = EXIT_BLOCK_PTR->prev_bb;
> +      basic_block exit_pred;
>        int i;
>
>        gcc_assert (entry_edge != orig_entry_edge);
> @@ -6566,6 +6568,12 @@ epilogue_done:
>             else
>               pending_edge_cold = e;
>           }
> +
> +      /* Save a pointer to the exit's predecessor BB for use in
> +         inserting new BBs at the end of the function. Do this
> +         after the call to split_block above which may split
> +         the original exit pred.  */
> +      exit_pred = EXIT_BLOCK_PTR->prev_bb;
>
>        FOR_EACH_VEC_ELT (unconverted_simple_returns, i, e)
>         {
> Index: emit-rtl.c
> ===================================================================
> --- emit-rtl.c  (revision 199014)
> +++ emit-rtl.c  (working copy)
> @@ -3574,6 +3574,7 @@ try_split (rtx pat, rtx trial, int last)
>           break;
>
>         case REG_NON_LOCAL_GOTO:
> +       case REG_CROSSING_JUMP:
>           for (insn = insn_last; insn != NULL_RTX; insn = PREV_INSN (insn))
>             {
>               if (JUMP_P (insn))
> Index: cfgcleanup.c
> ===================================================================
> --- cfgcleanup.c        (revision 199014)
> +++ cfgcleanup.c        (working copy)
> @@ -456,7 +456,7 @@ try_forward_edges (int mode, basic_block b)
>
>        if (first != EXIT_BLOCK_PTR
>           && find_reg_note (BB_END (first), REG_CROSSING_JUMP, NULL_RTX))
> -       return false;
> +       return changed;
>
>        while (counter < n_basic_blocks)
>         {
> Index: bb-reorder.c
> ===================================================================
> --- bb-reorder.c        (revision 199014)
> +++ bb-reorder.c        (working copy)
> @@ -1380,13 +1380,16 @@ get_uncond_jump_length (void)
>    return length;
>  }
>
> -/* Emit a barrier into the footer of BB.  */
> +/* Emit a barrier after BB, into the footer if we are in CFGLAYOUT mode.  */
>
> -static void
> +void
>  emit_barrier_after_bb (basic_block bb)
>  {
>    rtx barrier = emit_barrier_after (BB_END (bb));
> -  BB_FOOTER (bb) = unlink_insn_chain (barrier, barrier);
> +  gcc_assert (current_ir_type() == IR_RTL_CFGRTL
> +              || current_ir_type () == IR_RTL_CFGLAYOUT);
> +  if (current_ir_type () == IR_RTL_CFGLAYOUT)
> +    BB_FOOTER (bb) = unlink_insn_chain (barrier, barrier);
>  }
>
>  /* The landing pad OLD_LP, in block OLD_BB, has edges from both partitions.
> @@ -1720,8 +1723,7 @@ fix_up_fall_thru_edges (void)
>                      (i.e. fix it so the fall through does not cross and
>                      the cond jump does).  */
>
> -                 if (!cond_jump_crosses
> -                     && cur_bb->aux == cond_jump->dest)
> +                 if (!cond_jump_crosses)
>                     {
>                       /* Find label in fall_thru block. We've already added
>                          any missing labels, so there must be one.  */
> @@ -1765,10 +1767,10 @@ fix_up_fall_thru_edges (void)
>                       new_bb->aux = cur_bb->aux;
>                       cur_bb->aux = new_bb;
>
> -                     /* Make sure new fall-through bb is in same
> -                        partition as bb it's falling through from.  */
> +                      /* This is done by force_nonfallthru_and_redirect.  */
> +                     gcc_assert (BB_PARTITION (new_bb)
> +                                  == BB_PARTITION (cur_bb));
>
> -                     BB_COPY_PARTITION (new_bb, cur_bb);
>                       single_succ_edge (new_bb)->flags |= EDGE_CROSSING;
>                     }
>                   else
> @@ -2064,7 +2066,10 @@ add_reg_crossing_jump_notes (void)
>    FOR_EACH_BB (bb)
>      FOR_EACH_EDGE (e, ei, bb->succs)
>        if ((e->flags & EDGE_CROSSING)
> -         && JUMP_P (BB_END (e->src)))
> +         && JUMP_P (BB_END (e->src))
> +          /* Some notes were added during fix_up_fall_thru_edges, via
> +             force_nonfallthru_and_redirect.  */
> +          && !find_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX))
>         add_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX);
>  }
>
> @@ -2133,25 +2138,39 @@ reorder_basic_blocks (void)
>     encountering this note will make the compiler switch between the
>     hot and cold text sections.  */
>
> -static void
> +void
>  insert_section_boundary_note (void)
>  {
>    basic_block bb;
> -  int first_partition = 0;
> +  int err = 0;
> +  bool switched_sections = false;
> +  int current_partition = 0;
>
> -  if (!flag_reorder_blocks_and_partition)
> +  if (!crtl->has_bb_partition)
>      return;
>
>    FOR_EACH_BB (bb)
>      {
> -      if (!first_partition)
> -       first_partition = BB_PARTITION (bb);
> -      if (BB_PARTITION (bb) != first_partition)
> +      if (!current_partition)
> +       current_partition = BB_PARTITION (bb);
> +      if (BB_PARTITION (bb) != current_partition)
>         {
> -         emit_note_before (NOTE_INSN_SWITCH_TEXT_SECTIONS, BB_HEAD (bb));
> -         break;
> +         if (switched_sections)
> +           {
> +             error ("multiple hot/cold transitions found (bb %i)",
> +                    bb->index);
> +             err = 1;
> +           }
> +         else
> +           {
> +             switched_sections = true;
> +              emit_note_before (NOTE_INSN_SWITCH_TEXT_SECTIONS, BB_HEAD (bb));
> +             current_partition = BB_PARTITION (bb);
> +           }
>         }
>      }
> +
> +  gcc_assert(!err);
>  }
>
>  static bool
> @@ -2180,8 +2199,6 @@ rest_of_handle_reorder_blocks (void)
>        bb->aux = bb->next_bb;
>    cfg_layout_finalize ();
>
> -  /* Add NOTE_INSN_SWITCH_TEXT_SECTIONS notes.  */
> -  insert_section_boundary_note ();
>    return 0;
>  }
>
> Index: bb-reorder.h
> ===================================================================
> --- bb-reorder.h        (revision 199014)
> +++ bb-reorder.h        (working copy)
> @@ -35,4 +35,8 @@ extern struct target_bb_reorder *this_target_bb_re
>
>  extern int get_uncond_jump_length (void);
>
> +extern void insert_section_boundary_note (void);
> +
> +extern void emit_barrier_after_bb (basic_block bb);
> +
>  #endif
> Index: Makefile.in
> ===================================================================
> --- Makefile.in (revision 199014)
> +++ Makefile.in (working copy)
> @@ -3151,7 +3151,7 @@ cfgrtl.o : cfgrtl.c $(CONFIG_H) $(SYSTEM_H) corety
>     $(FUNCTION_H) $(EXCEPT_H) $(TM_P_H) $(INSN_ATTR_H) \
>     insn-config.h $(EXPR_H) \
>     $(CFGLOOP_H) $(OBSTACK_H) $(TARGET_H) $(TREE_H) \
> -   $(TREE_PASS_H) $(DF_H) $(GGC_H) $(COMMON_TARGET_H) gt-cfgrtl.h
> +   $(TREE_PASS_H) $(DF_H) $(GGC_H) $(COMMON_TARGET_H) gt-cfgrtl.h bb-reorder.h
>  cfganal.o : cfganal.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(BASIC_BLOCK_H) \
>     $(TIMEVAR_H) sbitmap.h $(BITMAP_H)
>  cfgbuild.o : cfgbuild.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
> Index: cfgrtl.c
> ===================================================================
> --- cfgrtl.c    (revision 199014)
> +++ cfgrtl.c    (working copy)
> @@ -44,6 +44,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "tree.h"
>  #include "hard-reg-set.h"
>  #include "basic-block.h"
> +#include "bb-reorder.h"
>  #include "regs.h"
>  #include "flags.h"
>  #include "function.h"
> @@ -451,6 +452,9 @@ rest_of_pass_free_cfg (void)
>      }
>  #endif
>
> +  if (crtl->has_bb_partition)
> +    insert_section_boundary_note ();
> +
>    free_bb_for_insn ();
>    return 0;
>  }
> @@ -981,8 +985,7 @@ try_redirect_by_replacing_jump (edge e, basic_bloc
>       partition boundaries).  See  the comments at the top of
>       bb-reorder.c:partition_hot_cold_basic_blocks for complete details.  */
>
> -  if (find_reg_note (insn, REG_CROSSING_JUMP, NULL_RTX)
> -      || BB_PARTITION (src) != BB_PARTITION (target))
> +  if (BB_PARTITION (src) != BB_PARTITION (target))
>      return NULL;
>
>    /* We can replace or remove a complex jump only when we have exactly
> @@ -1291,6 +1294,53 @@ redirect_branch_edge (edge e, basic_block target)
>    return e;
>  }
>
> +/* Called when edge E has been redirected to a new destination,
> +   in order to update the region crossing flag on the edge and
> +   jump.  */
> +
> +static void
> +fixup_partition_crossing (edge e)
> +{
> +  rtx note;
> +
> +  if (e->src == ENTRY_BLOCK_PTR || e->dest == EXIT_BLOCK_PTR)
> +    return;
> +  /* If we redirected an existing edge, it may already be marked
> +     crossing, even though the new src is missing a reg crossing note.
> +     But make sure reg crossing note doesn't already exist before
> +     inserting.  */
> +  if (BB_PARTITION (e->src) != BB_PARTITION (e->dest))
> +    {
> +      e->flags |= EDGE_CROSSING;
> +      note = find_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX);
> +      if (JUMP_P (BB_END (e->src))
> +          && !note)
> +        add_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX);
> +    }
> +  else if (BB_PARTITION (e->src) == BB_PARTITION (e->dest))
> +    {
> +      e->flags &= ~EDGE_CROSSING;
> +      /* Remove the region crossing note from jump at end of
> +         src if it exists, and if no other successors are
> +         still crossing.  */
> +      note = find_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX);
> +      if (note)
> +        {
> +          bool has_crossing_succ = false;
> +          edge e2;
> +          edge_iterator ei;
> +          FOR_EACH_EDGE (e2, ei, e->src->succs)
> +            {
> +              has_crossing_succ |= (e2->flags & EDGE_CROSSING);
> +              if (has_crossing_succ)
> +                break;
> +            }
> +          if (!has_crossing_succ)
> +            remove_note (BB_END (e->src), note);
> +        }
> +    }
> +}
> +
>  /* Attempt to change code to redirect edge E to TARGET.  Don't do that on
>     expense of adding new instructions or reordering basic blocks.
>
> @@ -1307,16 +1357,18 @@ rtl_redirect_edge_and_branch (edge e, basic_block
>  {
>    edge ret;
>    basic_block src = e->src;
> +  basic_block dest = e->dest;
>
>    if (e->flags & (EDGE_ABNORMAL_CALL | EDGE_EH))
>      return NULL;
>
> -  if (e->dest == target)
> +  if (dest == target)
>      return e;
>
>    if ((ret = try_redirect_by_replacing_jump (e, target, false)) != NULL)
>      {
>        df_set_bb_dirty (src);
> +      fixup_partition_crossing (ret);
>        return ret;
>      }
>
> @@ -1325,6 +1377,7 @@ rtl_redirect_edge_and_branch (edge e, basic_block
>      return NULL;
>
>    df_set_bb_dirty (src);
> +  fixup_partition_crossing (ret);
>    return ret;
>  }
>
> @@ -1492,12 +1545,6 @@ force_nonfallthru_and_redirect (edge e, basic_bloc
>        /* Make sure new block ends up in correct hot/cold section.  */
>
>        BB_COPY_PARTITION (jump_block, e->src);
> -      if (flag_reorder_blocks_and_partition
> -         && targetm_common.have_named_sections
> -         && JUMP_P (BB_END (jump_block))
> -         && !any_condjump_p (BB_END (jump_block))
> -         && (EDGE_SUCC (jump_block, 0)->flags & EDGE_CROSSING))
> -       add_reg_note (BB_END (jump_block), REG_CROSSING_JUMP, NULL_RTX);
>
>        /* Wire edge in.  */
>        new_edge = make_edge (e->src, jump_block, EDGE_FALLTHRU);
> @@ -1508,6 +1555,10 @@ force_nonfallthru_and_redirect (edge e, basic_bloc
>        redirect_edge_pred (e, jump_block);
>        e->probability = REG_BR_PROB_BASE;
>
> +      /* If e->src was previously region crossing, it no longer is
> +         and the reg crossing note should be removed.  */
> +      fixup_partition_crossing (new_edge);
> +
>        /* If asm goto has any label refs to target's label,
>          add also edge from asm goto bb to target.  */
>        if (asm_goto_edge)
> @@ -1559,13 +1610,16 @@ force_nonfallthru_and_redirect (edge e, basic_bloc
>        LABEL_NUSES (label)++;
>      }
>
> -  emit_barrier_after (BB_END (jump_block));
> +  /* We might be in cfg layout mode, and if so, the following routine will
> +     insert the barrier correctly.  */
> +  emit_barrier_after_bb (jump_block);
>    redirect_edge_succ_nodup (e, target);
>
>    if (abnormal_edge_flags)
>      make_edge (src, target, abnormal_edge_flags);
>
>    df_mark_solutions_dirty ();
> +  fixup_partition_crossing (e);
>    return new_bb;
>  }
>
> @@ -1664,7 +1718,7 @@ rtl_move_block_after (basic_block bb ATTRIBUTE_UNU
>  static basic_block
>  rtl_split_edge (edge edge_in)
>  {
> -  basic_block bb;
> +  basic_block bb, new_bb;
>    rtx before;
>
>    /* Abnormal edges cannot be split.  */
> @@ -1697,12 +1751,26 @@ rtl_split_edge (edge edge_in)
>    else
>      {
>        bb = create_basic_block (before, NULL, edge_in->dest->prev_bb);
> -      /* ??? Why not edge_in->dest->prev_bb here?  */
> -      BB_COPY_PARTITION (bb, edge_in->dest);
> +      if (edge_in->src == ENTRY_BLOCK_PTR)
> +        BB_COPY_PARTITION (bb, edge_in->dest);
> +      else
> +        /* Put the split bb into the src partition, to avoid creating
> +           a situation where a cold bb dominates a hot bb, in the case
> +           where src is cold and dest is hot. The src will dominate
> +           the new bb (whereas it might not have dominated dest).  */
> +        BB_COPY_PARTITION (bb, edge_in->src);
>      }
>
>    make_single_succ_edge (bb, edge_in->dest, EDGE_FALLTHRU);
>
> +  /* Can't allow a region crossing edge to be fallthrough.  */
> +  if (BB_PARTITION (bb) != BB_PARTITION (edge_in->dest)
> +      && edge_in->dest != EXIT_BLOCK_PTR)
> +    {
> +      new_bb = force_nonfallthru (single_succ_edge (bb));
> +      gcc_assert (!new_bb);
> +    }
> +
>    /* For non-fallthru edges, we must adjust the predecessor's
>       jump instruction to target our new block.  */
>    if ((edge_in->flags & EDGE_FALLTHRU) == 0)
> @@ -1815,17 +1883,13 @@ commit_one_edge_insertion (edge e)
>    else
>      {
>        bb = split_edge (e);
> -      after = BB_END (bb);
>
> -      if (flag_reorder_blocks_and_partition
> -         && targetm_common.have_named_sections
> -         && e->src != ENTRY_BLOCK_PTR
> -         && BB_PARTITION (e->src) == BB_COLD_PARTITION
> -         && !(e->flags & EDGE_CROSSING)
> -         && JUMP_P (after)
> -         && !any_condjump_p (after)
> -         && (single_succ_edge (bb)->flags & EDGE_CROSSING))
> -       add_reg_note (after, REG_CROSSING_JUMP, NULL_RTX);
> +      /* If E crossed a partition boundary, we needed to make bb end in
> +         a region-crossing jump, even though it was originally fallthru.  */
> +      if (JUMP_P (BB_END (bb)))
> +       before = BB_END (bb);
> +      else
> +        after = BB_END (bb);
>      }
>
>    /* Now that we've found the spot, do the insertion.  */
> @@ -2116,6 +2180,7 @@ rtl_verify_edges (void)
>        edge e, fallthru = NULL;
>        edge_iterator ei;
>        rtx note;
> +      bool has_crossing_edge = false;
>
>        if (JUMP_P (BB_END (bb))
>           && (note = find_reg_note (BB_END (bb), REG_BR_PROB, NULL_RTX))
> @@ -2141,6 +2206,7 @@ rtl_verify_edges (void)
>           is_crossing = (BB_PARTITION (e->src) != BB_PARTITION (e->dest)
>                          && e->src != ENTRY_BLOCK_PTR
>                          && e->dest != EXIT_BLOCK_PTR);
> +          has_crossing_edge |= is_crossing;
>           if (e->flags & EDGE_CROSSING)
>             {
>               if (!is_crossing)
> @@ -2160,6 +2226,13 @@ rtl_verify_edges (void)
>                          e->src->index);
>                   err = 1;
>                 }
> +              if (JUMP_P (BB_END (bb))
> +                  && !find_reg_note (BB_END (bb), REG_CROSSING_JUMP, NULL_RTX))
> +               {
> +                 error ("No region crossing jump at section boundary in bb %i",
> +                        bb->index);
> +                 err = 1;
> +               }
>             }
>           else if (is_crossing)
>             {
> @@ -2188,6 +2261,15 @@ rtl_verify_edges (void)
>             n_abnormal++;
>         }
>
> +        if (!has_crossing_edge
> +            && find_reg_note (BB_END (bb), REG_CROSSING_JUMP, NULL_RTX))
> +          {
> +            print_rtl_with_bb (stderr, get_insns (), TDF_RTL | TDF_BLOCKS | TDF_DETAILS);
> +            error ("Region crossing jump across same section in bb %i",
> +                   bb->index);
> +            err = 1;
> +          }
> +
>        if (n_eh && !find_reg_note (BB_END (bb), REG_EH_REGION, NULL_RTX))
>         {
>           error ("missing REG_EH_REGION note at the end of bb %i", bb->index);
> @@ -3343,7 +3425,7 @@ fixup_reorder_chain (void)
>        edge e_fall, e_taken, e;
>        rtx bb_end_insn;
>        rtx ret_label = NULL_RTX;
> -      basic_block nb, src_bb;
> +      basic_block nb;
>        edge_iterator ei;
>
>        if (EDGE_COUNT (bb->succs) == 0)
> @@ -3478,7 +3560,6 @@ fixup_reorder_chain (void)
>        /* We got here if we need to add a new jump insn.
>          Note force_nonfallthru can delete E_FALL and thus we have to
>          save E_FALL->src prior to the call to force_nonfallthru.  */
> -      src_bb = e_fall->src;
>        nb = force_nonfallthru_and_redirect (e_fall, e_fall->dest, ret_label);
>        if (nb)
>         {
> @@ -3486,17 +3567,6 @@ fixup_reorder_chain (void)
>           bb->aux = nb;
>           /* Don't process this new block.  */
>           bb = nb;
> -
> -         /* Make sure new bb is tagged for correct section (same as
> -            fall-thru source, since you cannot fall-thru across
> -            section boundaries).  */
> -         BB_COPY_PARTITION (src_bb, single_pred (bb));
> -         if (flag_reorder_blocks_and_partition
> -             && targetm_common.have_named_sections
> -             && JUMP_P (BB_END (bb))
> -             && !any_condjump_p (BB_END (bb))
> -             && (EDGE_SUCC (bb, 0)->flags & EDGE_CROSSING))
> -           add_reg_note (BB_END (bb), REG_CROSSING_JUMP, NULL_RTX);
>         }
>      }
>
> @@ -3796,10 +3866,11 @@ duplicate_insn_chain (rtx from, rtx to)
>             case NOTE_INSN_FUNCTION_BEG:
>               /* There is always just single entry to function.  */
>             case NOTE_INSN_BASIC_BLOCK:
> +              /* We should only switch text sections once.  */
> +           case NOTE_INSN_SWITCH_TEXT_SECTIONS:
>               break;
>
>             case NOTE_INSN_EPILOGUE_BEG:
> -           case NOTE_INSN_SWITCH_TEXT_SECTIONS:
>               emit_note_copy (insn);
>               break;
>
> @@ -4611,8 +4682,7 @@ rtl_can_remove_branch_p (const_edge e)
>    if (e->flags & (EDGE_ABNORMAL_CALL | EDGE_EH))
>      return false;
>
> -  if (find_reg_note (insn, REG_CROSSING_JUMP, NULL_RTX)
> -      || BB_PARTITION (src) != BB_PARTITION (target))
> +  if (BB_PARTITION (src) != BB_PARTITION (target))
>      return false;
>
>    if (!onlyjump_p (insn)
Steven Bosscher - May 20, 2013, 10:42 p.m.
On Mon, May 20, 2013 at 4:55 AM, Teresa Johnson wrote:

>         * ifcvt.c (find_if_case_1): Replace BB_COPY_PARTITION with assert
>         as this is now done by redirect_edge_and_branch_force.
>         * function.c (thread_prologue_and_epilogue_insns): Insert new bb after
>         barriers, and fix interaction with splitting.
>         * emit-rtl.c (try_split): Copy REG_CROSSING_JUMP notes.
>         * cfgcleanup.c (try_forward_edges): Fix early return value to properly
>         reflect changes made in the routine.
>         * bb-reorder.c (emit_barrier_after_bb): Handle insertion in
>         non-cfglayout mode.
>         (fix_up_fall_thru_edges): Remove incorrect check for bb layout order
>         since this is called in cfglayout mode, and replace partition fixup
>         with assert as that is now done by force_nonfallthru_and_redirect.
>         (add_reg_crossing_jump_notes): Handle the fact that some jumps may
>         already be marked with region crossing note.
>         (insert_section_boundary_note): Make non-static, gate on flag
>         has_bb_partition, rewrite to also check for multiple partitions.
>         (rest_of_handle_reorder_blocks): Remove call to
>         insert_section_boundary_note, now done later during free_cfg.
>         * bb-reorder.h: Declare insert_section_boundary_note and
>         emit_barrier_after_bb, which are no longer static.
>         * Makefile.in (cfgrtl.o): Depend on bb-reorder.h
>         * cfgrtl.c (rest_of_pass_free_cfg): If partitions exist
>         invoke insert_section_boundary_note.
>         (try_redirect_by_replacing_jump): Remove unnecessary
>         check for region crossing note.
>         (fixup_partition_crossing): New function.
>         (rtl_redirect_edge_and_branch): Fixup partition boundaries.
>         (force_nonfallthru_and_redirect): Fixup partition boundaries,
>         remove old code that tried to do this. Emit barrier correctly
>         when we are in cfglayout mode.
>         (rtl_split_edge): Correctly fixup partition boundaries.
>         (commit_one_edge_insertion): Remove old code that tried to
>         fixup region crossing edge since this is now handled in
>         split_block, and set up insertion point correctly since
>         block may now end in a jump.
>         (rtl_verify_edges): Add checks for incorrect/missing REG_CROSSING_JUMP
>         notes.
>         (fixup_reorder_chain): Remove old code that attempted to fixup region
>         crossing note as this is now handled in force_nonfallthru_and_redirect.
>         (duplicate_insn_chain): Don't duplicate switch section notes.
>         (rtl_can_remove_branch_p): Remove unnecessary check for region crossing
>         note.
>
> Index: ifcvt.c
> ===================================================================
> --- ifcvt.c     (revision 199014)
> +++ ifcvt.c     (working copy)
> @@ -3905,10 +3905,9 @@ find_if_case_1 (basic_block test_bb, edge then_edg
>    if (new_bb)
>      {
>        df_bb_replace (then_bb_index, new_bb);
> -      /* Since the fallthru edge was redirected from test_bb to new_bb,
> -         we need to ensure that new_bb is in the same partition as
> -         test bb (you can not fall through across section boundaries).  */
> -      BB_COPY_PARTITION (new_bb, test_bb);
> +      /* This should have been done above via force_nonfallthru_and_redirect
> +         (possibly called from redirect_edge_and_branch_force).  */
> +      gcc_assert (BB_PARTITION (new_bb) == BB_PARTITION (test_bb));
>      }

Nit: gcc_checking_assert() ?


> Index: function.c
> ===================================================================
> --- function.c  (revision 199014)
> +++ function.c  (working copy)
> @@ -6270,8 +6270,10 @@ thread_prologue_and_epilogue_insns (void)
>                     break;
>                 if (e)
>                   {
> -                   copy_bb = create_basic_block (NEXT_INSN (BB_END (e->src)),
> -                                                 NULL_RTX, e->src);
> +                    /* Make sure we insert after any barriers.  */
> +                    rtx end = get_last_bb_insn (e->src);
> +                    copy_bb = create_basic_block (NEXT_INSN (end),
> +                                                  NULL_RTX, e->src);
>                     BB_COPY_PARTITION (copy_bb, e->src);
>                   }
>                 else

This is a nice catch. The change at first glance looks unrelated to
the profiling fixes, but apparently you have a test case where e is
not an EDGE_FALLTHRU edge? Otherwise there can't be a barrier. The
change looks OK to me.


> Index: bb-reorder.c
> ===================================================================
> --- bb-reorder.c        (revision 199014)
> +++ bb-reorder.c        (working copy)
> @@ -1380,13 +1380,16 @@ get_uncond_jump_length (void)
>    return length;
>  }
>
> -/* Emit a barrier into the footer of BB.  */
> +/* Emit a barrier after BB, into the footer if we are in CFGLAYOUT mode.  */
>
> -static void
> +void
>  emit_barrier_after_bb (basic_block bb)
>  {

I still think this function should just be moved out of bb-reorder.c
then. Move it to cfgrtl,c please, and prototype it in basic-block.h
(one day we'll have to split basic-block.h into rtl/gimple/generic
parts, TODO++).

But to be honest, I still don't really understand why we emit a
barrier at all if we're in cfglayout mode. They're ignored, they're
going to be overlooked if someone looks for barriers after basic
blocks (nobody looks in the footer, and they should not), and
fixup_reorder_chain ought to put barriers where needed when coming out
of cfglayout mode. (TODO++ - that's how it goes for me every time I
look at gcc code ;-)


>         {
> -         emit_note_before (NOTE_INSN_SWITCH_TEXT_SECTIONS, BB_HEAD (bb));
> -         break;
> +         if (switched_sections)
> +           {
> +             error ("multiple hot/cold transitions found (bb %i)",
> +                    bb->index);
> +             err = 1;

Just gcc_assert (! switched_sections).


> @@ -2180,8 +2199,6 @@ rest_of_handle_reorder_blocks (void)
>        bb->aux = bb->next_bb;
>    cfg_layout_finalize ();
>
> -  /* Add NOTE_INSN_SWITCH_TEXT_SECTIONS notes.  */
> -  insert_section_boundary_note ();
>    return 0;
>  }

...and the titans were defeated.


> +      /* Remove the region crossing note from jump at end of

s/region/section/

Looks good to me, overall. The only thing missing is test cases, if
you have any.

Can you file a bug report for the debug info issues?

Ciao!
Steven
Teresa Johnson - May 22, 2013, 5:17 a.m.
On Mon, May 20, 2013 at 3:42 PM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
> On Mon, May 20, 2013 at 4:55 AM, Teresa Johnson wrote:
>
>>         * ifcvt.c (find_if_case_1): Replace BB_COPY_PARTITION with assert
>>         as this is now done by redirect_edge_and_branch_force.
>>         * function.c (thread_prologue_and_epilogue_insns): Insert new bb after
>>         barriers, and fix interaction with splitting.
>>         * emit-rtl.c (try_split): Copy REG_CROSSING_JUMP notes.
>>         * cfgcleanup.c (try_forward_edges): Fix early return value to properly
>>         reflect changes made in the routine.
>>         * bb-reorder.c (emit_barrier_after_bb): Handle insertion in
>>         non-cfglayout mode.
>>         (fix_up_fall_thru_edges): Remove incorrect check for bb layout order
>>         since this is called in cfglayout mode, and replace partition fixup
>>         with assert as that is now done by force_nonfallthru_and_redirect.
>>         (add_reg_crossing_jump_notes): Handle the fact that some jumps may
>>         already be marked with region crossing note.
>>         (insert_section_boundary_note): Make non-static, gate on flag
>>         has_bb_partition, rewrite to also check for multiple partitions.
>>         (rest_of_handle_reorder_blocks): Remove call to
>>         insert_section_boundary_note, now done later during free_cfg.
>>         * bb-reorder.h: Declare insert_section_boundary_note and
>>         emit_barrier_after_bb, which are no longer static.
>>         * Makefile.in (cfgrtl.o): Depend on bb-reorder.h
>>         * cfgrtl.c (rest_of_pass_free_cfg): If partitions exist
>>         invoke insert_section_boundary_note.
>>         (try_redirect_by_replacing_jump): Remove unnecessary
>>         check for region crossing note.
>>         (fixup_partition_crossing): New function.
>>         (rtl_redirect_edge_and_branch): Fixup partition boundaries.
>>         (force_nonfallthru_and_redirect): Fixup partition boundaries,
>>         remove old code that tried to do this. Emit barrier correctly
>>         when we are in cfglayout mode.
>>         (rtl_split_edge): Correctly fixup partition boundaries.
>>         (commit_one_edge_insertion): Remove old code that tried to
>>         fixup region crossing edge since this is now handled in
>>         split_block, and set up insertion point correctly since
>>         block may now end in a jump.
>>         (rtl_verify_edges): Add checks for incorrect/missing REG_CROSSING_JUMP
>>         notes.
>>         (fixup_reorder_chain): Remove old code that attempted to fixup region
>>         crossing note as this is now handled in force_nonfallthru_and_redirect.
>>         (duplicate_insn_chain): Don't duplicate switch section notes.
>>         (rtl_can_remove_branch_p): Remove unnecessary check for region crossing
>>         note.
>>
>> Index: ifcvt.c
>> ===================================================================
>> --- ifcvt.c     (revision 199014)
>> +++ ifcvt.c     (working copy)
>> @@ -3905,10 +3905,9 @@ find_if_case_1 (basic_block test_bb, edge then_edg
>>    if (new_bb)
>>      {
>>        df_bb_replace (then_bb_index, new_bb);
>> -      /* Since the fallthru edge was redirected from test_bb to new_bb,
>> -         we need to ensure that new_bb is in the same partition as
>> -         test bb (you can not fall through across section boundaries).  */
>> -      BB_COPY_PARTITION (new_bb, test_bb);
>> +      /* This should have been done above via force_nonfallthru_and_redirect
>> +         (possibly called from redirect_edge_and_branch_force).  */
>> +      gcc_assert (BB_PARTITION (new_bb) == BB_PARTITION (test_bb));
>>      }
>
> Nit: gcc_checking_assert() ?

ok.

>
>
>> Index: function.c
>> ===================================================================
>> --- function.c  (revision 199014)
>> +++ function.c  (working copy)
>> @@ -6270,8 +6270,10 @@ thread_prologue_and_epilogue_insns (void)
>>                     break;
>>                 if (e)
>>                   {
>> -                   copy_bb = create_basic_block (NEXT_INSN (BB_END (e->src)),
>> -                                                 NULL_RTX, e->src);
>> +                    /* Make sure we insert after any barriers.  */
>> +                    rtx end = get_last_bb_insn (e->src);
>> +                    copy_bb = create_basic_block (NEXT_INSN (end),
>> +                                                  NULL_RTX, e->src);
>>                     BB_COPY_PARTITION (copy_bb, e->src);
>>                   }
>>                 else
>
> This is a nice catch. The change at first glance looks unrelated to
> the profiling fixes, but apparently you have a test case where e is
> not an EDGE_FALLTHRU edge? Otherwise there can't be a barrier. The
> change looks OK to me.

Yes, according to my notes this fix was from a case in hmmer where the
edge was a crossing edge.

>
>
>> Index: bb-reorder.c
>> ===================================================================
>> --- bb-reorder.c        (revision 199014)
>> +++ bb-reorder.c        (working copy)
>> @@ -1380,13 +1380,16 @@ get_uncond_jump_length (void)
>>    return length;
>>  }
>>
>> -/* Emit a barrier into the footer of BB.  */
>> +/* Emit a barrier after BB, into the footer if we are in CFGLAYOUT mode.  */
>>
>> -static void
>> +void
>>  emit_barrier_after_bb (basic_block bb)
>>  {
>
> I still think this function should just be moved out of bb-reorder.c
> then. Move it to cfgrtl,c please, and prototype it in basic-block.h
> (one day we'll have to split basic-block.h into rtl/gimple/generic
> parts, TODO++).
>
> But to be honest, I still don't really understand why we emit a
> barrier at all if we're in cfglayout mode. They're ignored, they're
> going to be overlooked if someone looks for barriers after basic
> blocks (nobody looks in the footer, and they should not), and
> fixup_reorder_chain ought to put barriers where needed when coming out
> of cfglayout mode. (TODO++ - that's how it goes for me every time I
> look at gcc code ;-)

Ok, I've moved it. Interestingly, previous to my modification to
enable this to be called in non-cfglayout mode it was only called from
bbpart, which is in cfglayout mode. This may be another case where
bbpart is confused about what mode it is in? Probably worth some
investigation/cleanup at some point as you note.

>
>
>>         {
>> -         emit_note_before (NOTE_INSN_SWITCH_TEXT_SECTIONS, BB_HEAD (bb));
>> -         break;
>> +         if (switched_sections)
>> +           {
>> +             error ("multiple hot/cold transitions found (bb %i)",
>> +                    bb->index);
>> +             err = 1;
>
> Just gcc_assert (! switched_sections).

The rest of the rtl flow verification code used error () and returned
an err=1, so I was trying to be consistent with that. Do we want to be
different in this routine?

>
>
>> @@ -2180,8 +2199,6 @@ rest_of_handle_reorder_blocks (void)
>>        bb->aux = bb->next_bb;
>>    cfg_layout_finalize ();
>>
>> -  /* Add NOTE_INSN_SWITCH_TEXT_SECTIONS notes.  */
>> -  insert_section_boundary_note ();
>>    return 0;
>>  }
>
> ...and the titans were defeated.
>
>
>> +      /* Remove the region crossing note from jump at end of
>
> s/region/section/
>
> Looks good to me, overall. The only thing missing is test cases, if
> you have any.

I manually ran all of the gcc.c-torture/execute tests to collect and
feedback fdo, and enable -freorder-blocks-and-partitions, with and
without my fixes. This gave me a few test cases that I will add as
fdo/partitioning test cases. A couple were still problems that I
hadn't flushed out yet, relating to the new verification code that
checks that we only have a single section switch in the function,
because a couple post-bbro phases (e.g. compgoto) were not being
careful enough.

Revised patch under testing and coming tomorrow.

>
> Can you file a bug report for the debug info issues?

Will do.

Thanks,
Teresa

>
> Ciao!
> Steven



--
Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
Steven Bosscher - May 22, 2013, 10:07 p.m.
On Wed, May 22, 2013 at 7:17 AM, Teresa Johnson wrote:
> On Mon, May 20, 2013 at 3:42 PM, Steven Bosscher wrote:

>> But to be honest, I still don't really understand why we emit a
>> barrier at all if we're in cfglayout mode. They're ignored, they're
>> going to be overlooked if someone looks for barriers after basic
>> blocks (nobody looks in the footer, and they should not), and
>> fixup_reorder_chain ought to put barriers where needed when coming out
>> of cfglayout mode. (TODO++ - that's how it goes for me every time I
>> look at gcc code ;-)
>
> Ok, I've moved it. Interestingly, previous to my modification to
> enable this to be called in non-cfglayout mode it was only called from
> bbpart, which is in cfglayout mode. This may be another case where
> bbpart is confused about what mode it is in? Probably worth some
> investigation/cleanup at some point as you note.

The problem here is two things:

1. Many GCC developers still don't fully grasp the difference between
cfglayout mode and the older cfgrtl mode.

2. There isn't anything in rtl_verify_flow_info that verifies
cfglayout assumptions (such as "nothing lives between basic blocks")

Cleaning that up is just yet another thing on my TODO list. If you
have the opportunity to spend some time on it (maybe starting with
some verify_flow_info bits), that'd be really welcome.



>>>         {
>>> -         emit_note_before (NOTE_INSN_SWITCH_TEXT_SECTIONS, BB_HEAD (bb));
>>> -         break;
>>> +         if (switched_sections)
>>> +           {
>>> +             error ("multiple hot/cold transitions found (bb %i)",
>>> +                    bb->index);
>>> +             err = 1;
>>
>> Just gcc_assert (! switched_sections).
>
> The rest of the rtl flow verification code used error () and returned
> an err=1, so I was trying to be consistent with that. Do we want to be
> different in this routine?

But this is in insert_section_boundary_note, which isn't really
verification code. I haven't looked - does rtl_verify_flow_info
already check that there is only one section switch? If not, and this
code in insert_section_boundary_note is actually verification code,
then this code should probably be moved to cfgrtl.c to be called by
rtl_verify_flow_info. Can be done as a follow-up, though.



> I manually ran all of the gcc.c-torture/execute tests to collect and
> feedback fdo, and enable -freorder-blocks-and-partitions, with and
> without my fixes. This gave me a few test cases that I will add as
> fdo/partitioning test cases.

Impressive!

> A couple were still problems that I
> hadn't flushed out yet, relating to the new verification code that
> checks that we only have a single section switch in the function,
> because a couple post-bbro phases (e.g. compgoto) were not being
> careful enough.

Uh-oh, that compgoto pass is my doing. If you file a PR for it, please
assign it to me :-)

Ciao!
Steven
Teresa Johnson - May 22, 2013, 10:24 p.m.
On Wed, May 22, 2013 at 3:07 PM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
> On Wed, May 22, 2013 at 7:17 AM, Teresa Johnson wrote:
>> On Mon, May 20, 2013 at 3:42 PM, Steven Bosscher wrote:
>
>>> But to be honest, I still don't really understand why we emit a
>>> barrier at all if we're in cfglayout mode. They're ignored, they're
>>> going to be overlooked if someone looks for barriers after basic
>>> blocks (nobody looks in the footer, and they should not), and
>>> fixup_reorder_chain ought to put barriers where needed when coming out
>>> of cfglayout mode. (TODO++ - that's how it goes for me every time I
>>> look at gcc code ;-)
>>
>> Ok, I've moved it. Interestingly, previous to my modification to
>> enable this to be called in non-cfglayout mode it was only called from
>> bbpart, which is in cfglayout mode. This may be another case where
>> bbpart is confused about what mode it is in? Probably worth some
>> investigation/cleanup at some point as you note.
>
> The problem here is two things:
>
> 1. Many GCC developers still don't fully grasp the difference between
> cfglayout mode and the older cfgrtl mode.
>
> 2. There isn't anything in rtl_verify_flow_info that verifies
> cfglayout assumptions (such as "nothing lives between basic blocks")
>
> Cleaning that up is just yet another thing on my TODO list. If you
> have the opportunity to spend some time on it (maybe starting with
> some verify_flow_info bits), that'd be really welcome.

Yep, I'll try to do more cleanup. I'm still working on performance
tuning with splitting, so more cleanup may come out of that as well.

>
>
>
>>>>         {
>>>> -         emit_note_before (NOTE_INSN_SWITCH_TEXT_SECTIONS, BB_HEAD (bb));
>>>> -         break;
>>>> +         if (switched_sections)
>>>> +           {
>>>> +             error ("multiple hot/cold transitions found (bb %i)",
>>>> +                    bb->index);
>>>> +             err = 1;
>>>
>>> Just gcc_assert (! switched_sections).
>>
>> The rest of the rtl flow verification code used error () and returned
>> an err=1, so I was trying to be consistent with that. Do we want to be
>> different in this routine?
>
> But this is in insert_section_boundary_note, which isn't really
> verification code. I haven't looked - does rtl_verify_flow_info
> already check that there is only one section switch? If not, and this
> code in insert_section_boundary_note is actually verification code,
> then this code should probably be moved to cfgrtl.c to be called by
> rtl_verify_flow_info. Can be done as a follow-up, though.

Oh got it, there is nearly identical code in
verify_hot_cold_block_grouping, and I got confused. I can fix this.
It's not fixed in the newest version of the patch I just sent out an
hour ago.

>
>
>
>> I manually ran all of the gcc.c-torture/execute tests to collect and
>> feedback fdo, and enable -freorder-blocks-and-partitions, with and
>> without my fixes. This gave me a few test cases that I will add as
>> fdo/partitioning test cases.
>
> Impressive!
>
>> A couple were still problems that I
>> hadn't flushed out yet, relating to the new verification code that
>> checks that we only have a single section switch in the function,
>> because a couple post-bbro phases (e.g. compgoto) were not being
>> careful enough.
>
> Uh-oh, that compgoto pass is my doing. If you file a PR for it, please
> assign it to me :-)

It's fixed in the patch I just sent. See if things look ok there.

Thanks!
Teresa

>
> Ciao!
> Steven



--
Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
Jeff Law - May 23, 2013, 2:07 a.m.
On 05/22/2013 04:07 PM, Steven Bosscher wrote:
>
> The problem here is two things:
>
> 1. Many GCC developers still don't fully grasp the difference between
> cfglayout mode and the older cfgrtl mode.
Absolutely true.  I'd actually love it if someone (you?) could write up 
the basics of cfglayout mode.  Why does it exist, what things should a 
developer need to know about it, etc.

Jeff
Steven Bosscher - May 23, 2013, 5:20 a.m.
On Thu, May 23, 2013 at 4:07 AM, Jeff Law wrote:
> On 05/22/2013 04:07 PM, Steven Bosscher wrote:
>>
>>
>> The problem here is two things:
>>
>> 1. Many GCC developers still don't fully grasp the difference between
>> cfglayout mode and the older cfgrtl mode.
>
> Absolutely true.  I'd actually love it if someone (you?) could write up the
> basics of cfglayout mode.  Why does it exist, what things should a developer
> need to know about it, etc.

I did that already, way back in 2006, based on a presentation I gave
at the Moscow Gelato meeting (see attachment). I first posted it as a
patch for doc/cfg.texi (twice) but it never got reviewed, so I put it
on the GCC Wiki instead: http://gcc.gnu.org/wiki/cfglayout_mode

The change-over to MoinMoin mangled the page markup so it's a bit
messy right now. It's also 7 year old text that hasn't been updated to
reflect the current state of the compiler (with almost all pre
register allocator passes working in cfglayout mode).

I will clean it up again and create a new diff for cfg.texi.

Ciao!
Steven
Jeff Law - May 23, 2013, 3:35 p.m.
On 05/22/2013 11:20 PM, Steven Bosscher wrote:
> On Thu, May 23, 2013 at 4:07 AM, Jeff Law wrote:
>> On 05/22/2013 04:07 PM, Steven Bosscher wrote:
>>>
>>>
>>> The problem here is two things:
>>>
>>> 1. Many GCC developers still don't fully grasp the difference between
>>> cfglayout mode and the older cfgrtl mode.
>>
>> Absolutely true.  I'd actually love it if someone (you?) could write up the
>> basics of cfglayout mode.  Why does it exist, what things should a developer
>> need to know about it, etc.
>
> I did that already, way back in 2006, based on a presentation I gave
> at the Moscow Gelato meeting (see attachment). I first posted it as a
> patch for doc/cfg.texi (twice) but it never got reviewed, so I put it
> on the GCC Wiki instead: http://gcc.gnu.org/wiki/cfglayout_mode
>
> The change-over to MoinMoin mangled the page markup so it's a bit
> messy right now. It's also 7 year old text that hasn't been updated to
> reflect the current state of the compiler (with almost all pre
> register allocator passes working in cfglayout mode).
>
> I will clean it up again and create a new diff for cfg.texi.
Thanks.  I wasn't aware of that wiki page.  I'll be reading it today :-)

jeff
Steven Bosscher - May 24, 2013, 1:03 p.m.
On Thu, May 23, 2013 at 5:35 PM, Jeff Law wrote:
> Thanks.  I wasn't aware of that wiki page.  I'll be reading it today :-)

The .odp attachment is actually a bit more informative, you should
take a look at that too, if you have the time.

Comments welcome, so I can include that in the new .texi version I'll
put together this weekend :-)

Ciao!
Steven
Jan Hubicka - May 24, 2013, 1:25 p.m.
> On Thu, May 23, 2013 at 5:35 PM, Jeff Law wrote:
> > Thanks.  I wasn't aware of that wiki page.  I'll be reading it today :-)
> 
> The .odp attachment is actually a bit more informative, you should
> take a look at that too, if you have the time.
> 
> Comments welcome, so I can include that in the new .texi version I'll
> put together this weekend :-)

Thanks for writting this down ;)

note that the CFG infrastructure also existed in reg-stack.c and flow.c, not only
in Haifa.  The original idea behind CFG code was to unify those implementations
into something more generally useful and one developed by Rth for liveness was
taken a as the winner to rule them all.

http://www.ucw.cz/~hubicka/papers/proj.pdf
has some more historical notes, too ;)
Honza
> 
> Ciao!
> Steven
Miles Bader - May 27, 2013, 8:42 a.m.
A nitpick: the dragon book was first published 36 years ago... (!)

-miles

Patch

Index: ifcvt.c
===================================================================
--- ifcvt.c	(revision 199014)
+++ ifcvt.c	(working copy)
@@ -3905,10 +3905,9 @@  find_if_case_1 (basic_block test_bb, edge then_edg
   if (new_bb)
     {
       df_bb_replace (then_bb_index, new_bb);
-      /* Since the fallthru edge was redirected from test_bb to new_bb,
-         we need to ensure that new_bb is in the same partition as
-         test bb (you can not fall through across section boundaries).  */
-      BB_COPY_PARTITION (new_bb, test_bb);
+      /* This should have been done above via force_nonfallthru_and_redirect
+         (possibly called from redirect_edge_and_branch_force).  */
+      gcc_assert (BB_PARTITION (new_bb) == BB_PARTITION (test_bb));
     }
 
   num_true_changes++;
Index: function.c
===================================================================
--- function.c	(revision 199014)
+++ function.c	(working copy)
@@ -6270,8 +6270,10 @@  thread_prologue_and_epilogue_insns (void)
 		    break;
 		if (e)
 		  {
-		    copy_bb = create_basic_block (NEXT_INSN (BB_END (e->src)),
-						  NULL_RTX, e->src);
+                    /* Make sure we insert after any barriers.  */
+                    rtx end = get_last_bb_insn (e->src);
+                    copy_bb = create_basic_block (NEXT_INSN (end),
+                                                  NULL_RTX, e->src);
 		    BB_COPY_PARTITION (copy_bb, e->src);
 		  }
 		else
@@ -6538,7 +6540,7 @@  epilogue_done:
       basic_block simple_return_block_cold = NULL;
       edge pending_edge_hot = NULL;
       edge pending_edge_cold = NULL;
-      basic_block exit_pred = EXIT_BLOCK_PTR->prev_bb;
+      basic_block exit_pred;
       int i;
 
       gcc_assert (entry_edge != orig_entry_edge);
@@ -6566,6 +6568,12 @@  epilogue_done:
 	    else
 	      pending_edge_cold = e;
 	  }
+      
+      /* Save a pointer to the exit's predecessor BB for use in
+         inserting new BBs at the end of the function. Do this
+         after the call to split_block above which may split
+         the original exit pred.  */
+      exit_pred = EXIT_BLOCK_PTR->prev_bb;
 
       FOR_EACH_VEC_ELT (unconverted_simple_returns, i, e)
 	{
Index: emit-rtl.c
===================================================================
--- emit-rtl.c	(revision 199014)
+++ emit-rtl.c	(working copy)
@@ -3574,6 +3574,7 @@  try_split (rtx pat, rtx trial, int last)
 	  break;
 
 	case REG_NON_LOCAL_GOTO:
+	case REG_CROSSING_JUMP:
 	  for (insn = insn_last; insn != NULL_RTX; insn = PREV_INSN (insn))
 	    {
 	      if (JUMP_P (insn))
Index: cfgcleanup.c
===================================================================
--- cfgcleanup.c	(revision 199014)
+++ cfgcleanup.c	(working copy)
@@ -456,7 +456,7 @@  try_forward_edges (int mode, basic_block b)
 
       if (first != EXIT_BLOCK_PTR
 	  && find_reg_note (BB_END (first), REG_CROSSING_JUMP, NULL_RTX))
-	return false;
+	return changed;
 
       while (counter < n_basic_blocks)
 	{
Index: bb-reorder.c
===================================================================
--- bb-reorder.c	(revision 199014)
+++ bb-reorder.c	(working copy)
@@ -1380,13 +1380,16 @@  get_uncond_jump_length (void)
   return length;
 }
 
-/* Emit a barrier into the footer of BB.  */
+/* Emit a barrier after BB, into the footer if we are in CFGLAYOUT mode.  */
 
-static void
+void
 emit_barrier_after_bb (basic_block bb)
 {
   rtx barrier = emit_barrier_after (BB_END (bb));
-  BB_FOOTER (bb) = unlink_insn_chain (barrier, barrier);
+  gcc_assert (current_ir_type() == IR_RTL_CFGRTL
+              || current_ir_type () == IR_RTL_CFGLAYOUT);
+  if (current_ir_type () == IR_RTL_CFGLAYOUT)
+    BB_FOOTER (bb) = unlink_insn_chain (barrier, barrier);
 }
 
 /* The landing pad OLD_LP, in block OLD_BB, has edges from both partitions.
@@ -1720,8 +1723,7 @@  fix_up_fall_thru_edges (void)
 		     (i.e. fix it so the fall through does not cross and
 		     the cond jump does).  */
 
-		  if (!cond_jump_crosses
-		      && cur_bb->aux == cond_jump->dest)
+		  if (!cond_jump_crosses)
 		    {
 		      /* Find label in fall_thru block. We've already added
 			 any missing labels, so there must be one.  */
@@ -1765,10 +1767,10 @@  fix_up_fall_thru_edges (void)
 		      new_bb->aux = cur_bb->aux;
 		      cur_bb->aux = new_bb;
 
-		      /* Make sure new fall-through bb is in same
-			 partition as bb it's falling through from.  */
+                      /* This is done by force_nonfallthru_and_redirect.  */
+		      gcc_assert (BB_PARTITION (new_bb)
+                                  == BB_PARTITION (cur_bb));
 
-		      BB_COPY_PARTITION (new_bb, cur_bb);
 		      single_succ_edge (new_bb)->flags |= EDGE_CROSSING;
 		    }
 		  else
@@ -2064,7 +2066,10 @@  add_reg_crossing_jump_notes (void)
   FOR_EACH_BB (bb)
     FOR_EACH_EDGE (e, ei, bb->succs)
       if ((e->flags & EDGE_CROSSING)
-	  && JUMP_P (BB_END (e->src)))
+	  && JUMP_P (BB_END (e->src))
+          /* Some notes were added during fix_up_fall_thru_edges, via
+             force_nonfallthru_and_redirect.  */
+          && !find_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX))
 	add_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX);
 }
 
@@ -2133,25 +2138,39 @@  reorder_basic_blocks (void)
    encountering this note will make the compiler switch between the
    hot and cold text sections.  */
 
-static void
+void
 insert_section_boundary_note (void)
 {
   basic_block bb;
-  int first_partition = 0;
+  int err = 0;
+  bool switched_sections = false;
+  int current_partition = 0;
 
-  if (!flag_reorder_blocks_and_partition)
+  if (!crtl->has_bb_partition)
     return;
 
   FOR_EACH_BB (bb)
     {
-      if (!first_partition)
-	first_partition = BB_PARTITION (bb);
-      if (BB_PARTITION (bb) != first_partition)
+      if (!current_partition)
+	current_partition = BB_PARTITION (bb);
+      if (BB_PARTITION (bb) != current_partition)
 	{
-	  emit_note_before (NOTE_INSN_SWITCH_TEXT_SECTIONS, BB_HEAD (bb));
-	  break;
+	  if (switched_sections)
+	    {
+	      error ("multiple hot/cold transitions found (bb %i)",
+		     bb->index);
+	      err = 1;
+	    }
+	  else
+	    {
+	      switched_sections = true;
+              emit_note_before (NOTE_INSN_SWITCH_TEXT_SECTIONS, BB_HEAD (bb));
+	      current_partition = BB_PARTITION (bb);
+	    }
 	}
     }
+
+  gcc_assert(!err);
 }
 
 static bool
@@ -2180,8 +2199,6 @@  rest_of_handle_reorder_blocks (void)
       bb->aux = bb->next_bb;
   cfg_layout_finalize ();
 
-  /* Add NOTE_INSN_SWITCH_TEXT_SECTIONS notes.  */
-  insert_section_boundary_note ();
   return 0;
 }
 
Index: bb-reorder.h
===================================================================
--- bb-reorder.h	(revision 199014)
+++ bb-reorder.h	(working copy)
@@ -35,4 +35,8 @@  extern struct target_bb_reorder *this_target_bb_re
 
 extern int get_uncond_jump_length (void);
 
+extern void insert_section_boundary_note (void);
+
+extern void emit_barrier_after_bb (basic_block bb);
+
 #endif
Index: Makefile.in
===================================================================
--- Makefile.in	(revision 199014)
+++ Makefile.in	(working copy)
@@ -3151,7 +3151,7 @@  cfgrtl.o : cfgrtl.c $(CONFIG_H) $(SYSTEM_H) corety
    $(FUNCTION_H) $(EXCEPT_H) $(TM_P_H) $(INSN_ATTR_H) \
    insn-config.h $(EXPR_H) \
    $(CFGLOOP_H) $(OBSTACK_H) $(TARGET_H) $(TREE_H) \
-   $(TREE_PASS_H) $(DF_H) $(GGC_H) $(COMMON_TARGET_H) gt-cfgrtl.h
+   $(TREE_PASS_H) $(DF_H) $(GGC_H) $(COMMON_TARGET_H) gt-cfgrtl.h bb-reorder.h
 cfganal.o : cfganal.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(BASIC_BLOCK_H) \
    $(TIMEVAR_H) sbitmap.h $(BITMAP_H)
 cfgbuild.o : cfgbuild.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
Index: cfgrtl.c
===================================================================
--- cfgrtl.c	(revision 199014)
+++ cfgrtl.c	(working copy)
@@ -44,6 +44,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "tree.h"
 #include "hard-reg-set.h"
 #include "basic-block.h"
+#include "bb-reorder.h"
 #include "regs.h"
 #include "flags.h"
 #include "function.h"
@@ -451,6 +452,9 @@  rest_of_pass_free_cfg (void)
     }
 #endif
 
+  if (crtl->has_bb_partition)
+    insert_section_boundary_note ();
+
   free_bb_for_insn ();
   return 0;
 }
@@ -981,8 +985,7 @@  try_redirect_by_replacing_jump (edge e, basic_bloc
      partition boundaries).  See  the comments at the top of
      bb-reorder.c:partition_hot_cold_basic_blocks for complete details.  */
 
-  if (find_reg_note (insn, REG_CROSSING_JUMP, NULL_RTX)
-      || BB_PARTITION (src) != BB_PARTITION (target))
+  if (BB_PARTITION (src) != BB_PARTITION (target))
     return NULL;
 
   /* We can replace or remove a complex jump only when we have exactly
@@ -1291,6 +1294,53 @@  redirect_branch_edge (edge e, basic_block target)
   return e;
 }
 
+/* Called when edge E has been redirected to a new destination,
+   in order to update the region crossing flag on the edge and
+   jump.  */
+
+static void
+fixup_partition_crossing (edge e)
+{
+  rtx note;
+
+  if (e->src == ENTRY_BLOCK_PTR || e->dest == EXIT_BLOCK_PTR)
+    return;
+  /* If we redirected an existing edge, it may already be marked
+     crossing, even though the new src is missing a reg crossing note.
+     But make sure reg crossing note doesn't already exist before
+     inserting.  */
+  if (BB_PARTITION (e->src) != BB_PARTITION (e->dest))
+    {
+      e->flags |= EDGE_CROSSING;
+      note = find_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX);
+      if (JUMP_P (BB_END (e->src))
+          && !note)
+        add_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX);
+    }
+  else if (BB_PARTITION (e->src) == BB_PARTITION (e->dest))
+    {
+      e->flags &= ~EDGE_CROSSING;
+      /* Remove the region crossing note from jump at end of
+         src if it exists, and if no other successors are
+         still crossing.  */
+      note = find_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX);
+      if (note)
+        {
+          bool has_crossing_succ = false;
+          edge e2;
+          edge_iterator ei;
+          FOR_EACH_EDGE (e2, ei, e->src->succs)
+            {
+              has_crossing_succ |= (e2->flags & EDGE_CROSSING);
+              if (has_crossing_succ)
+                break;
+            }
+          if (!has_crossing_succ)
+            remove_note (BB_END (e->src), note);
+        }
+    }
+}
+
 /* Attempt to change code to redirect edge E to TARGET.  Don't do that on
    expense of adding new instructions or reordering basic blocks.
 
@@ -1307,16 +1357,18 @@  rtl_redirect_edge_and_branch (edge e, basic_block
 {
   edge ret;
   basic_block src = e->src;
+  basic_block dest = e->dest;
 
   if (e->flags & (EDGE_ABNORMAL_CALL | EDGE_EH))
     return NULL;
 
-  if (e->dest == target)
+  if (dest == target)
     return e;
 
   if ((ret = try_redirect_by_replacing_jump (e, target, false)) != NULL)
     {
       df_set_bb_dirty (src);
+      fixup_partition_crossing (ret);
       return ret;
     }
 
@@ -1325,6 +1377,7 @@  rtl_redirect_edge_and_branch (edge e, basic_block
     return NULL;
 
   df_set_bb_dirty (src);
+  fixup_partition_crossing (ret);
   return ret;
 }
 
@@ -1492,12 +1545,6 @@  force_nonfallthru_and_redirect (edge e, basic_bloc
       /* Make sure new block ends up in correct hot/cold section.  */
 
       BB_COPY_PARTITION (jump_block, e->src);
-      if (flag_reorder_blocks_and_partition
-	  && targetm_common.have_named_sections
-	  && JUMP_P (BB_END (jump_block))
-	  && !any_condjump_p (BB_END (jump_block))
-	  && (EDGE_SUCC (jump_block, 0)->flags & EDGE_CROSSING))
-	add_reg_note (BB_END (jump_block), REG_CROSSING_JUMP, NULL_RTX);
 
       /* Wire edge in.  */
       new_edge = make_edge (e->src, jump_block, EDGE_FALLTHRU);
@@ -1508,6 +1555,10 @@  force_nonfallthru_and_redirect (edge e, basic_bloc
       redirect_edge_pred (e, jump_block);
       e->probability = REG_BR_PROB_BASE;
 
+      /* If e->src was previously region crossing, it no longer is
+         and the reg crossing note should be removed.  */
+      fixup_partition_crossing (new_edge);
+
       /* If asm goto has any label refs to target's label,
 	 add also edge from asm goto bb to target.  */
       if (asm_goto_edge)
@@ -1559,13 +1610,16 @@  force_nonfallthru_and_redirect (edge e, basic_bloc
       LABEL_NUSES (label)++;
     }
 
-  emit_barrier_after (BB_END (jump_block));
+  /* We might be in cfg layout mode, and if so, the following routine will
+     insert the barrier correctly.  */
+  emit_barrier_after_bb (jump_block);
   redirect_edge_succ_nodup (e, target);
 
   if (abnormal_edge_flags)
     make_edge (src, target, abnormal_edge_flags);
 
   df_mark_solutions_dirty ();
+  fixup_partition_crossing (e);
   return new_bb;
 }
 
@@ -1664,7 +1718,7 @@  rtl_move_block_after (basic_block bb ATTRIBUTE_UNU
 static basic_block
 rtl_split_edge (edge edge_in)
 {
-  basic_block bb;
+  basic_block bb, new_bb;
   rtx before;
 
   /* Abnormal edges cannot be split.  */
@@ -1697,12 +1751,26 @@  rtl_split_edge (edge edge_in)
   else
     {
       bb = create_basic_block (before, NULL, edge_in->dest->prev_bb);
-      /* ??? Why not edge_in->dest->prev_bb here?  */
-      BB_COPY_PARTITION (bb, edge_in->dest);
+      if (edge_in->src == ENTRY_BLOCK_PTR)
+        BB_COPY_PARTITION (bb, edge_in->dest);
+      else
+        /* Put the split bb into the src partition, to avoid creating
+           a situation where a cold bb dominates a hot bb, in the case
+           where src is cold and dest is hot. The src will dominate
+           the new bb (whereas it might not have dominated dest).  */
+        BB_COPY_PARTITION (bb, edge_in->src);
     }
 
   make_single_succ_edge (bb, edge_in->dest, EDGE_FALLTHRU);
 
+  /* Can't allow a region crossing edge to be fallthrough.  */
+  if (BB_PARTITION (bb) != BB_PARTITION (edge_in->dest)
+      && edge_in->dest != EXIT_BLOCK_PTR)
+    {
+      new_bb = force_nonfallthru (single_succ_edge (bb));
+      gcc_assert (!new_bb);
+    }
+
   /* For non-fallthru edges, we must adjust the predecessor's
      jump instruction to target our new block.  */
   if ((edge_in->flags & EDGE_FALLTHRU) == 0)
@@ -1815,17 +1883,13 @@  commit_one_edge_insertion (edge e)
   else
     {
       bb = split_edge (e);
-      after = BB_END (bb);
 
-      if (flag_reorder_blocks_and_partition
-	  && targetm_common.have_named_sections
-	  && e->src != ENTRY_BLOCK_PTR
-	  && BB_PARTITION (e->src) == BB_COLD_PARTITION
-	  && !(e->flags & EDGE_CROSSING)
-	  && JUMP_P (after)
-	  && !any_condjump_p (after)
-	  && (single_succ_edge (bb)->flags & EDGE_CROSSING))
-	add_reg_note (after, REG_CROSSING_JUMP, NULL_RTX);
+      /* If E crossed a partition boundary, we needed to make bb end in
+         a region-crossing jump, even though it was originally fallthru.  */
+      if (JUMP_P (BB_END (bb)))
+	before = BB_END (bb);
+      else
+        after = BB_END (bb);
     }
 
   /* Now that we've found the spot, do the insertion.  */
@@ -2116,6 +2180,7 @@  rtl_verify_edges (void)
       edge e, fallthru = NULL;
       edge_iterator ei;
       rtx note;
+      bool has_crossing_edge = false;
 
       if (JUMP_P (BB_END (bb))
 	  && (note = find_reg_note (BB_END (bb), REG_BR_PROB, NULL_RTX))
@@ -2141,6 +2206,7 @@  rtl_verify_edges (void)
 	  is_crossing = (BB_PARTITION (e->src) != BB_PARTITION (e->dest)
 			 && e->src != ENTRY_BLOCK_PTR
 			 && e->dest != EXIT_BLOCK_PTR);
+          has_crossing_edge |= is_crossing;
 	  if (e->flags & EDGE_CROSSING)
 	    {
 	      if (!is_crossing)
@@ -2160,6 +2226,13 @@  rtl_verify_edges (void)
 			 e->src->index);
 		  err = 1;
 		}
+              if (JUMP_P (BB_END (bb))
+                  && !find_reg_note (BB_END (bb), REG_CROSSING_JUMP, NULL_RTX))
+		{
+		  error ("No region crossing jump at section boundary in bb %i",
+			 bb->index);
+		  err = 1;
+		}
 	    }
 	  else if (is_crossing)
 	    {
@@ -2188,6 +2261,15 @@  rtl_verify_edges (void)
 	    n_abnormal++;
 	}
 
+        if (!has_crossing_edge
+            && find_reg_note (BB_END (bb), REG_CROSSING_JUMP, NULL_RTX))
+          {
+            print_rtl_with_bb (stderr, get_insns (), TDF_RTL | TDF_BLOCKS | TDF_DETAILS);
+            error ("Region crossing jump across same section in bb %i",
+                   bb->index);
+            err = 1;
+          }
+
       if (n_eh && !find_reg_note (BB_END (bb), REG_EH_REGION, NULL_RTX))
 	{
 	  error ("missing REG_EH_REGION note at the end of bb %i", bb->index);
@@ -3343,7 +3425,7 @@  fixup_reorder_chain (void)
       edge e_fall, e_taken, e;
       rtx bb_end_insn;
       rtx ret_label = NULL_RTX;
-      basic_block nb, src_bb;
+      basic_block nb;
       edge_iterator ei;
 
       if (EDGE_COUNT (bb->succs) == 0)
@@ -3478,7 +3560,6 @@  fixup_reorder_chain (void)
       /* We got here if we need to add a new jump insn. 
 	 Note force_nonfallthru can delete E_FALL and thus we have to
 	 save E_FALL->src prior to the call to force_nonfallthru.  */
-      src_bb = e_fall->src;
       nb = force_nonfallthru_and_redirect (e_fall, e_fall->dest, ret_label);
       if (nb)
 	{
@@ -3486,17 +3567,6 @@  fixup_reorder_chain (void)
 	  bb->aux = nb;
 	  /* Don't process this new block.  */
 	  bb = nb;
-
-	  /* Make sure new bb is tagged for correct section (same as
-	     fall-thru source, since you cannot fall-thru across
-	     section boundaries).  */
-	  BB_COPY_PARTITION (src_bb, single_pred (bb));
-	  if (flag_reorder_blocks_and_partition
-	      && targetm_common.have_named_sections
-	      && JUMP_P (BB_END (bb))
-	      && !any_condjump_p (BB_END (bb))
-	      && (EDGE_SUCC (bb, 0)->flags & EDGE_CROSSING))
-	    add_reg_note (BB_END (bb), REG_CROSSING_JUMP, NULL_RTX);
 	}
     }
 
@@ -3796,10 +3866,11 @@  duplicate_insn_chain (rtx from, rtx to)
 	    case NOTE_INSN_FUNCTION_BEG:
 	      /* There is always just single entry to function.  */
 	    case NOTE_INSN_BASIC_BLOCK:
+              /* We should only switch text sections once.  */
+	    case NOTE_INSN_SWITCH_TEXT_SECTIONS:
 	      break;
 
 	    case NOTE_INSN_EPILOGUE_BEG:
-	    case NOTE_INSN_SWITCH_TEXT_SECTIONS:
 	      emit_note_copy (insn);
 	      break;
 
@@ -4611,8 +4682,7 @@  rtl_can_remove_branch_p (const_edge e)
   if (e->flags & (EDGE_ABNORMAL_CALL | EDGE_EH))
     return false;
 
-  if (find_reg_note (insn, REG_CROSSING_JUMP, NULL_RTX)
-      || BB_PARTITION (src) != BB_PARTITION (target))
+  if (BB_PARTITION (src) != BB_PARTITION (target))
     return false;
 
   if (!onlyjump_p (insn)