diff mbox

Fix PR 53743 and other -freorder-blocks-and-partition failures (issue6823047)

Message ID CAAe5K+UYQQAu4EGHXWa=uoLuN3AA9sGiAxC-wx3WmGxiwgk=Mw@mail.gmail.com
State New
Headers show

Commit Message

Teresa Johnson Nov. 1, 2012, 5:19 p.m. UTC
On Wed, Oct 31, 2012 at 4:02 PM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
> On Tue, Oct 30, 2012 at 10:43 PM, Teresa Johnson wrote:
>> Sure, I will give this a try after your verification patch tests
>> complete. Does this mean that the patch you posted above to
>> force_nonfallthru_and_redirect is no longer needed either? I'll see if
>> I can avoid the need for some of my fixes, although I believe at least
>> the function.c one will still be needed. I'll check.
>
> The force_nonfallthru change is still necessary, because
> force_nonfallthru should be almost a no-op in cfglayout mode. The
> whole concept of a fallthru edge doesn't exist in cfglayout mode: any
> single_succ edge is a fallthru edge until the order of the basic
> blocks has been determined and the insn chain is re-linked (cfglayout
> mode originally was developed for bb-reorder, to move blocks around
> more easily). So the correct patch would actually be:
>
> Index: cfgrtl.c
> ===================================================================
> --- cfgrtl.c    (revision 193046)
> +++ cfgrtl.c    (working copy)
> @@ -4547,7 +4547,7 @@ struct cfg_hooks cfg_layout_rtl_cfg_hooks = {
>    cfg_layout_split_edge,
>    rtl_make_forwarder_block,
>    NULL, /* tidy_fallthru_edge */
> -  rtl_force_nonfallthru,
> +  NULL, /* force_nonfallthru */
>    rtl_block_ends_with_call_p,
>    rtl_block_ends_with_condjump_p,
>    rtl_flow_call_edges_add,
>
> (Or better yet: Remove the force_nonfallthru and tidy_fallthru_edge
> hooks, they are cfgrtl-only.)
>
> But obviously that won't work because
> bb-reorder.c:fix_up_fall_thru_edges calls this hook while we're in
> cfglayout mode. That is a bug. The call to force_nonfallthru results
> in a "dangling" barrier:
>
> cfgrtl.c:1523  emit_barrier_after (BB_END (jump_block));
>
> In cfglayout mode, barriers don't exist in the insns chain, and they
> don't have to because every edge is a fallthru edge. If there are
> barriers before cfglayout mode, they are either removed or linked in
> the basic block footer, and fixup_reorder_chain restores or inserts
> barriers where necessary to drop out of cfglayout mode. This
> emit_barrier_after call hangs a barrier after BB_END but not in the
> footer, and I'm pretty sure the result will be that the barrier is
> lost in fixup_reorder_chain. See also emit_barrier_after_bb for how
> inserting a barrier should be done in cfglayout mode.
>
> So in short, bbpart doesn't know what it wants to be: a cfgrtl or a
> cfglayout pass. It doesn't work without cfglayout but it's doing
> things that are only correct in the cfgrtl world and Very Wrong Indeed
> in cfglayout-land.
>
>
>> Regarding your earlier question about why we needed to add the
>> barrier, I need to dig up the details again but essentially I found
>> that the barriers were being added by bbpart, but bbro was reordering
>> things and the block that ended up at the border between the hot and
>> cold section didn't necessarily have a barrier on it because it was
>> not previously at the region boundary.
>
> That doesn't sound right. bbpart doesn't actually re-order the basic
> blocks, it only marks the blocks with the partition they will be
> assigned to. Whatever ends up at the border between the two partitions
> is not relevant: the hot section cannot end in a fall-through edge to
> the cold section (verify_flow_info even checks for that, see "fallthru
> edge crosses section boundary (bb %i)") so it must end in some
> explicit jump. Such jumps are always followed by a barrier. The only
> reason I can think of why there might be a missing barrier, is because
> fixup_reorder_chain has a bug and forgets to insert the barrier in
> some cases (and I suspect this may be the case for return patterns, or
> the a.m. issue of a dropper barrier).
>
> I would like to work on debugging this, but it's hard without test cases...

I'm working on trying to reproduce some of these failures in a test
case I can share. So far, I have only been able to reproduce the
failure reported in PR 53743 in spec2006 (456.hmmer/sre_math.c). Still
working on getting a smaller/shareable test case for the other 2
issues.

The failure in PR 53743 (assert in cfg_layout_merge_blocks) is what I
had fixed with my original changes to cfgrtl.c. Need to understand why
there is a reg crossing note between 2 bbs in the same partition.

In the hmmer test case I also hit a failures in rtl_verify_flow_info
and rtl_verify_flow_info_1:

gcc  -c -o sre_math.o -DSPEC_CPU -D
NDEBUG    -fprofile-use -freorder-blocks-and-partition   -O2
     sre_math.c
sre_math.c: In function ‘Gammln’:
sre_math.c:161:1: error: EDGE_CROSSING incorrectly set across same section
 }
 ^
sre_math.c:161:1: error: missing barrier after block 6
sre_math.c:161:1: internal compiler error: verify_flow_info failed


This was due to some code in thread_prologue_and_epilogue_insns that
duplicated tail blocks:

                if (e)
                  {
                    copy_bb = create_basic_block (NEXT_INSN (BB_END (e->src)),
                                                  NULL_RTX, e->src);
                    BB_COPY_PARTITION (copy_bb, e->src);
                  }

In this case e->src (bb 6) was in the cold section and e->dest was in
the hot section, and e->src ended with a REG_CROSSING_JUMP followed by
a barrier. The new copy_bb got put into the cold section by the copy
partition above, leading to the first error. And because the
create_basic_block call inserted the new copy_bb before NEXT_INSN
(BB_END (e->src)), which in this case was the barrier, we ended up
without the barrier after the crossing edge.

I fixed this by making the following change:

                else
                  {

But I am not sure this is really correct in all cases - for example,
what if another hot bb that also didn't require a prologue branched
into the new cloned tail sequence, which is now cold? E.g.
dup_block_and_redirect will redirect all predecessors that don't need
a prologue to the new copy.

I'm going to see if I can get the other 2 failures I had found to
trigger on spec or a smaller test case.

Teresa

>
> Ciao!
> Steven



--
Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413

Comments

Teresa Johnson Nov. 1, 2012, 9:26 p.m. UTC | #1
On Thu, Nov 1, 2012 at 10:19 AM, Teresa Johnson <tejohnson@google.com> wrote:
> On Wed, Oct 31, 2012 at 4:02 PM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
>> On Tue, Oct 30, 2012 at 10:43 PM, Teresa Johnson wrote:
>>> Sure, I will give this a try after your verification patch tests
>>> complete. Does this mean that the patch you posted above to
>>> force_nonfallthru_and_redirect is no longer needed either? I'll see if
>>> I can avoid the need for some of my fixes, although I believe at least
>>> the function.c one will still be needed. I'll check.
>>
>> The force_nonfallthru change is still necessary, because
>> force_nonfallthru should be almost a no-op in cfglayout mode. The
>> whole concept of a fallthru edge doesn't exist in cfglayout mode: any
>> single_succ edge is a fallthru edge until the order of the basic
>> blocks has been determined and the insn chain is re-linked (cfglayout
>> mode originally was developed for bb-reorder, to move blocks around
>> more easily). So the correct patch would actually be:
>>
>> Index: cfgrtl.c
>> ===================================================================
>> --- cfgrtl.c    (revision 193046)
>> +++ cfgrtl.c    (working copy)
>> @@ -4547,7 +4547,7 @@ struct cfg_hooks cfg_layout_rtl_cfg_hooks = {
>>    cfg_layout_split_edge,
>>    rtl_make_forwarder_block,
>>    NULL, /* tidy_fallthru_edge */
>> -  rtl_force_nonfallthru,
>> +  NULL, /* force_nonfallthru */
>>    rtl_block_ends_with_call_p,
>>    rtl_block_ends_with_condjump_p,
>>    rtl_flow_call_edges_add,
>>
>> (Or better yet: Remove the force_nonfallthru and tidy_fallthru_edge
>> hooks, they are cfgrtl-only.)
>>
>> But obviously that won't work because
>> bb-reorder.c:fix_up_fall_thru_edges calls this hook while we're in
>> cfglayout mode. That is a bug. The call to force_nonfallthru results
>> in a "dangling" barrier:
>>
>> cfgrtl.c:1523  emit_barrier_after (BB_END (jump_block));
>>
>> In cfglayout mode, barriers don't exist in the insns chain, and they
>> don't have to because every edge is a fallthru edge. If there are
>> barriers before cfglayout mode, they are either removed or linked in
>> the basic block footer, and fixup_reorder_chain restores or inserts
>> barriers where necessary to drop out of cfglayout mode. This
>> emit_barrier_after call hangs a barrier after BB_END but not in the
>> footer, and I'm pretty sure the result will be that the barrier is
>> lost in fixup_reorder_chain. See also emit_barrier_after_bb for how
>> inserting a barrier should be done in cfglayout mode.
>>
>> So in short, bbpart doesn't know what it wants to be: a cfgrtl or a
>> cfglayout pass. It doesn't work without cfglayout but it's doing
>> things that are only correct in the cfgrtl world and Very Wrong Indeed
>> in cfglayout-land.
>>
>>
>>> Regarding your earlier question about why we needed to add the
>>> barrier, I need to dig up the details again but essentially I found
>>> that the barriers were being added by bbpart, but bbro was reordering
>>> things and the block that ended up at the border between the hot and
>>> cold section didn't necessarily have a barrier on it because it was
>>> not previously at the region boundary.
>>
>> That doesn't sound right. bbpart doesn't actually re-order the basic
>> blocks, it only marks the blocks with the partition they will be
>> assigned to. Whatever ends up at the border between the two partitions
>> is not relevant: the hot section cannot end in a fall-through edge to
>> the cold section (verify_flow_info even checks for that, see "fallthru
>> edge crosses section boundary (bb %i)") so it must end in some
>> explicit jump. Such jumps are always followed by a barrier. The only
>> reason I can think of why there might be a missing barrier, is because
>> fixup_reorder_chain has a bug and forgets to insert the barrier in
>> some cases (and I suspect this may be the case for return patterns, or
>> the a.m. issue of a dropper barrier).
>>
>> I would like to work on debugging this, but it's hard without test cases...
>
> I'm working on trying to reproduce some of these failures in a test
> case I can share. So far, I have only been able to reproduce the
> failure reported in PR 53743 in spec2006 (456.hmmer/sre_math.c). Still
> working on getting a smaller/shareable test case for the other 2
> issues.
>
> The failure in PR 53743 (assert in cfg_layout_merge_blocks) is what I
> had fixed with my original changes to cfgrtl.c. Need to understand why
> there is a reg crossing note between 2 bbs in the same partition.

Interestingly, this turned out to be the same root cause as the
verify_flow_info failures below. It is fixed by the same fix to
thread_prologue_and_epilogue_insns. When the code below created the
copy_bb and put it in e->src's partition, it made it insufficient for
the merge blocks routine to check if the two bbs were in the same
partition, because they were in the same partition but separated by
the region crossing jump.

I'll do some testing of the fix below, but do you have any comments on
the correctness or the potential issue I raised (see my note just
below the patch)?

Do you recommend pursuing the move of the bb partition phase until
later, after we leave cfglayout mode? I need to revisit to see if my
prologue/epilogue fix below also addresses the issue I saw when I
tried moving it.

Thanks,
Teresa

>
> In the hmmer test case I also hit a failures in rtl_verify_flow_info
> and rtl_verify_flow_info_1:
>
> gcc  -c -o sre_math.o -DSPEC_CPU -D
> NDEBUG    -fprofile-use -freorder-blocks-and-partition   -O2
>      sre_math.c
> sre_math.c: In function ‘Gammln’:
> sre_math.c:161:1: error: EDGE_CROSSING incorrectly set across same section
>  }
>  ^
> sre_math.c:161:1: error: missing barrier after block 6
> sre_math.c:161:1: internal compiler error: verify_flow_info failed
>
>
> This was due to some code in thread_prologue_and_epilogue_insns that
> duplicated tail blocks:
>
>                 if (e)
>                   {
>                     copy_bb = create_basic_block (NEXT_INSN (BB_END (e->src)),
>                                                   NULL_RTX, e->src);
>                     BB_COPY_PARTITION (copy_bb, e->src);
>                   }
>
> In this case e->src (bb 6) was in the cold section and e->dest was in
> the hot section, and e->src ended with a REG_CROSSING_JUMP followed by
> a barrier. The new copy_bb got put into the cold section by the copy
> partition above, leading to the first error. And because the
> create_basic_block call inserted the new copy_bb before NEXT_INSN
> (BB_END (e->src)), which in this case was the barrier, we ended up
> without the barrier after the crossing edge.
>
> I fixed this by making the following change:
>
> --- function.c  (revision 192692)
> +++ function.c  (working copy)
> @@ -6249,9 +6249,18 @@ thread_prologue_and_epilogue_insns (void)
>                     break;
>                 if (e)
>                   {
> +                    rtx note;
>                     copy_bb = create_basic_block (NEXT_INSN (BB_END (e->src)),
>                                                   NULL_RTX, e->src);
>                     BB_COPY_PARTITION (copy_bb, e->src);
> +                    /* Remove the region crossing note from jump at end of
> +                       e->src if it exists.  */
> +                    note = find_reg_note (BB_END (e->src),
> REG_CROSSING_JUMP, NULL_RTX);
> +                    if (note)
> +                      /* There would also have been a barrier after
> e->src, that
> +                         is now after copy_bb, but that shouldn't be a
> +                         problem?.  */
> +                      remove_note (BB_END (e->src), note);
>                   }
>                 else
>                   {
>
> But I am not sure this is really correct in all cases - for example,
> what if another hot bb that also didn't require a prologue branched
> into the new cloned tail sequence, which is now cold? E.g.
> dup_block_and_redirect will redirect all predecessors that don't need
> a prologue to the new copy.
>
> I'm going to see if I can get the other 2 failures I had found to
> trigger on spec or a smaller test case.
>
> Teresa
>
>>
>> Ciao!
>> Steven
>
>
>
> --
> Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
Steven Bosscher Nov. 1, 2012, 9:45 p.m. UTC | #2
On Thu, Nov 1, 2012 at 10:26 PM, Teresa Johnson wrote:
> I'll do some testing of the fix below, but do you have any comments on
> the correctness or the potential issue I raised (see my note just
> below the patch)?

Sorry, I don't know the pro- and epilogue threading code well enough
to be of any help to you...

> Do you recommend pursuing the move of the bb partition phase until
> later, after we leave cfglayout mode? I need to revisit to see if my
> prologue/epilogue fix below also addresses the issue I saw when I
> tried moving it.

I think it should not be moved, cfglayout mode is the natural mode for
this kind of CFG transformations. But someone will have to tackle the
force_nonfallthru issue.

Ciao!
Steven
Teresa Johnson Nov. 3, 2012, 9:44 p.m. UTC | #3
On Thu, Nov 1, 2012 at 2:26 PM, Teresa Johnson <tejohnson@google.com> wrote:
> On Thu, Nov 1, 2012 at 10:19 AM, Teresa Johnson <tejohnson@google.com> wrote:
>> On Wed, Oct 31, 2012 at 4:02 PM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
>>> On Tue, Oct 30, 2012 at 10:43 PM, Teresa Johnson wrote:
>>>> Sure, I will give this a try after your verification patch tests
>>>> complete. Does this mean that the patch you posted above to
>>>> force_nonfallthru_and_redirect is no longer needed either? I'll see if
>>>> I can avoid the need for some of my fixes, although I believe at least
>>>> the function.c one will still be needed. I'll check.
>>>
>>> The force_nonfallthru change is still necessary, because
>>> force_nonfallthru should be almost a no-op in cfglayout mode. The
>>> whole concept of a fallthru edge doesn't exist in cfglayout mode: any
>>> single_succ edge is a fallthru edge until the order of the basic
>>> blocks has been determined and the insn chain is re-linked (cfglayout
>>> mode originally was developed for bb-reorder, to move blocks around
>>> more easily). So the correct patch would actually be:
>>>
>>> Index: cfgrtl.c
>>> ===================================================================
>>> --- cfgrtl.c    (revision 193046)
>>> +++ cfgrtl.c    (working copy)
>>> @@ -4547,7 +4547,7 @@ struct cfg_hooks cfg_layout_rtl_cfg_hooks = {
>>>    cfg_layout_split_edge,
>>>    rtl_make_forwarder_block,
>>>    NULL, /* tidy_fallthru_edge */
>>> -  rtl_force_nonfallthru,
>>> +  NULL, /* force_nonfallthru */
>>>    rtl_block_ends_with_call_p,
>>>    rtl_block_ends_with_condjump_p,
>>>    rtl_flow_call_edges_add,
>>>
>>> (Or better yet: Remove the force_nonfallthru and tidy_fallthru_edge
>>> hooks, they are cfgrtl-only.)
>>>
>>> But obviously that won't work because
>>> bb-reorder.c:fix_up_fall_thru_edges calls this hook while we're in
>>> cfglayout mode. That is a bug. The call to force_nonfallthru results
>>> in a "dangling" barrier:
>>>
>>> cfgrtl.c:1523  emit_barrier_after (BB_END (jump_block));
>>>
>>> In cfglayout mode, barriers don't exist in the insns chain, and they
>>> don't have to because every edge is a fallthru edge. If there are
>>> barriers before cfglayout mode, they are either removed or linked in
>>> the basic block footer, and fixup_reorder_chain restores or inserts
>>> barriers where necessary to drop out of cfglayout mode. This
>>> emit_barrier_after call hangs a barrier after BB_END but not in the
>>> footer, and I'm pretty sure the result will be that the barrier is
>>> lost in fixup_reorder_chain. See also emit_barrier_after_bb for how
>>> inserting a barrier should be done in cfglayout mode.
>>>
>>> So in short, bbpart doesn't know what it wants to be: a cfgrtl or a
>>> cfglayout pass. It doesn't work without cfglayout but it's doing
>>> things that are only correct in the cfgrtl world and Very Wrong Indeed
>>> in cfglayout-land.
>>>
>>>
>>>> Regarding your earlier question about why we needed to add the
>>>> barrier, I need to dig up the details again but essentially I found
>>>> that the barriers were being added by bbpart, but bbro was reordering
>>>> things and the block that ended up at the border between the hot and
>>>> cold section didn't necessarily have a barrier on it because it was
>>>> not previously at the region boundary.
>>>
>>> That doesn't sound right. bbpart doesn't actually re-order the basic
>>> blocks, it only marks the blocks with the partition they will be
>>> assigned to. Whatever ends up at the border between the two partitions
>>> is not relevant: the hot section cannot end in a fall-through edge to
>>> the cold section (verify_flow_info even checks for that, see "fallthru
>>> edge crosses section boundary (bb %i)") so it must end in some
>>> explicit jump. Such jumps are always followed by a barrier. The only
>>> reason I can think of why there might be a missing barrier, is because
>>> fixup_reorder_chain has a bug and forgets to insert the barrier in
>>> some cases (and I suspect this may be the case for return patterns, or
>>> the a.m. issue of a dropper barrier).
>>>
>>> I would like to work on debugging this, but it's hard without test cases...
>>
>> I'm working on trying to reproduce some of these failures in a test
>> case I can share. So far, I have only been able to reproduce the
>> failure reported in PR 53743 in spec2006 (456.hmmer/sre_math.c). Still
>> working on getting a smaller/shareable test case for the other 2
>> issues.
>>
>> The failure in PR 53743 (assert in cfg_layout_merge_blocks) is what I
>> had fixed with my original changes to cfgrtl.c. Need to understand why
>> there is a reg crossing note between 2 bbs in the same partition.
>
> Interestingly, this turned out to be the same root cause as the
> verify_flow_info failures below. It is fixed by the same fix to
> thread_prologue_and_epilogue_insns. When the code below created the
> copy_bb and put it in e->src's partition, it made it insufficient for
> the merge blocks routine to check if the two bbs were in the same
> partition, because they were in the same partition but separated by
> the region crossing jump.
>
> I'll do some testing of the fix below, but do you have any comments on
> the correctness or the potential issue I raised (see my note just
> below the patch)?
>
> Do you recommend pursuing the move of the bb partition phase until
> later, after we leave cfglayout mode? I need to revisit to see if my
> prologue/epilogue fix below also addresses the issue I saw when I
> tried moving it.

I found that indeed it wasn't a complete fix as there can be cases
where we needed to redirect another pred branch in the hot section to
the copy_bb that was now marked cold. I also found another case
further down in the same thread_prologue_and_epilogue_insns routine
that required the exact same kind of fixup (where we create a new
block to hold a simple return and redirect edges to that block). Here
too a new block was created, put into the same partition as the pred
that we redirect to it, and that pred no longer needed to be marked
with a region crossing jump. In all of these cases the edge
redirection is handled by rtl_redirect_edge_and_branch_force, so I
simply moved the fixup there and had it fixup jumps/edges from all
preds as necessary based on the new target's partition.

Finally, I also fixed the below call to create_basic_block to
correctly insert the new basic block after any barriers following the
BB_END (e->src).

Another email coming shortly on the root cause I found for one of the
other issues I was attempting to fix in my original patch set, that
you also had questions about.

Thanks,
Teresa

>
> Thanks,
> Teresa
>
>>
>> In the hmmer test case I also hit a failures in rtl_verify_flow_info
>> and rtl_verify_flow_info_1:
>>
>> gcc  -c -o sre_math.o -DSPEC_CPU -D
>> NDEBUG    -fprofile-use -freorder-blocks-and-partition   -O2
>>      sre_math.c
>> sre_math.c: In function ‘Gammln’:
>> sre_math.c:161:1: error: EDGE_CROSSING incorrectly set across same section
>>  }
>>  ^
>> sre_math.c:161:1: error: missing barrier after block 6
>> sre_math.c:161:1: internal compiler error: verify_flow_info failed
>>
>>
>> This was due to some code in thread_prologue_and_epilogue_insns that
>> duplicated tail blocks:
>>
>>                 if (e)
>>                   {
>>                     copy_bb = create_basic_block (NEXT_INSN (BB_END (e->src)),
>>                                                   NULL_RTX, e->src);
>>                     BB_COPY_PARTITION (copy_bb, e->src);
>>                   }
>>
>> In this case e->src (bb 6) was in the cold section and e->dest was in
>> the hot section, and e->src ended with a REG_CROSSING_JUMP followed by
>> a barrier. The new copy_bb got put into the cold section by the copy
>> partition above, leading to the first error. And because the
>> create_basic_block call inserted the new copy_bb before NEXT_INSN
>> (BB_END (e->src)), which in this case was the barrier, we ended up
>> without the barrier after the crossing edge.
>>
>> I fixed this by making the following change:
>>
>> --- function.c  (revision 192692)
>> +++ function.c  (working copy)
>> @@ -6249,9 +6249,18 @@ thread_prologue_and_epilogue_insns (void)
>>                     break;
>>                 if (e)
>>                   {
>> +                    rtx note;
>>                     copy_bb = create_basic_block (NEXT_INSN (BB_END (e->src)),
>>                                                   NULL_RTX, e->src);
>>                     BB_COPY_PARTITION (copy_bb, e->src);
>> +                    /* Remove the region crossing note from jump at end of
>> +                       e->src if it exists.  */
>> +                    note = find_reg_note (BB_END (e->src),
>> REG_CROSSING_JUMP, NULL_RTX);
>> +                    if (note)
>> +                      /* There would also have been a barrier after
>> e->src, that
>> +                         is now after copy_bb, but that shouldn't be a
>> +                         problem?.  */
>> +                      remove_note (BB_END (e->src), note);
>>                   }
>>                 else
>>                   {
>>
>> But I am not sure this is really correct in all cases - for example,
>> what if another hot bb that also didn't require a prologue branched
>> into the new cloned tail sequence, which is now cold? E.g.
>> dup_block_and_redirect will redirect all predecessors that don't need
>> a prologue to the new copy.
>>
>> I'm going to see if I can get the other 2 failures I had found to
>> trigger on spec or a smaller test case.
>>
>> Teresa
>>
>>>
>>> Ciao!
>>> Steven
>>
>>
>>
>> --
>> Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
>
>
>
> --
> Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
diff mbox

Patch

--- function.c  (revision 192692)
+++ function.c  (working copy)
@@ -6249,9 +6249,18 @@  thread_prologue_and_epilogue_insns (void)
                    break;
                if (e)
                  {
+                    rtx note;
                    copy_bb = create_basic_block (NEXT_INSN (BB_END (e->src)),
                                                  NULL_RTX, e->src);
                    BB_COPY_PARTITION (copy_bb, e->src);
+                    /* Remove the region crossing note from jump at end of
+                       e->src if it exists.  */
+                    note = find_reg_note (BB_END (e->src),
REG_CROSSING_JUMP, NULL_RTX);
+                    if (note)
+                      /* There would also have been a barrier after
e->src, that
+                         is now after copy_bb, but that shouldn't be a
+                         problem?.  */
+                      remove_note (BB_END (e->src), note);
                  }