Patchwork [RFA:] Fix frame-pointer-clobbering in builtins.c:expand_builtin_setjmp_receiver

login
register
mail settings
Submitter Hans-Peter Nilsson
Date Oct. 16, 2012, 12:29 a.m.
Message ID <alpine.BSF.2.00.1210151945010.94107@dair.pair.com>
Download mbox | patch
Permalink /patch/191686/
State New
Headers show

Comments

Hans-Peter Nilsson - Oct. 16, 2012, 12:29 a.m.
On Fri, 12 Oct 2012, Eric Botcazou wrote:
> > (insn 168 49 51 3 (set (reg/f:DI 253 $253)
> >         (plus:DI (reg/f:DI 253 $253)
> >             (const_int 24 [0x18])))
> > /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21
> > -1 (nil))
> > (insn 51 168 52 3 (clobber (reg/f:DI 253 $253))
...

> > Note that insn 168 deleted, which seems a logical optimization.  The
> > bug is to emit the clobber, not that the restoring insn is removed.
>
> Had that worked in the past for MMIX?

Yes, for svn revision 106027 (20051030) 4.1.0-era (!)
<http://gcc.gnu.org/ml/gcc-testresults/2005-10/msg01340.html>
where the test must have passed, as
gcc.c-torture/execute/built-in-setjmp.c is at least four years
older than that.

>  If so, what changed recently?

By "these days" I didn't mean "recent", just not "eons ago". :)
I see in a gcc-test-results posting from Mike Stein (whom I'd
like to thank for test-results posting over the years), matching
FAILs for svn revision 126095 (20070628) 4.3.0-era
<http://gcc.gnu.org/ml/gcc-testresults/2007-06/msg01287.html>.

Sorry, I have nothing in between those reports, my bad.  Though
I see no point narrowing down the failing revision further here
IMO; as mentioned the bug is not that the restoring insn is
removed.

> Agreed.  However, I'd suggest rescuing the comment for the ELIMINABLE_REGS
> block from expand_nl_goto_receiver as it still sounds valid to me.

Oops, my bad; I see I removed all the good comments.  Fixed.

> > 	* stmt.c (expand_nl_goto_receiver): Remove almost-copy of
> > 	expand_builtin_setjmp_receiver.
> > 	(expand_label): Adjust, call expand_builtin_setjmp_receiver
> > 	with NULL for the label parameter.
> > 	* builtins.c (expand_builtin_setjmp_receiver): Don't clobber
> > 	the frame-pointer.  Adjust comments.
> > 	[HAVE_builtin_setjmp_receiver]: Emit builtin_setjmp_receiver
> > 	only if LABEL is non-NULL.
>
> I cannot formally approve, but this looks good to me modulo:

> > +   If RECEIVER_LABEL is NULL, instead the port-specific parts of a
> > +   nonlocal goto handler are emitted.  */
>
> The "port-specific parts" wording is a bit confusing I think.  I'd just write:
>
>   If RECEIVER_LABEL is NULL, instead contruct a nonlocal goto handler.

Sure.  Thanks for the review.  Updated patch below.  As nothing
was changed from the previous post but comments as per the
review (mostly moving / reviving, fixing one grammo), already
covered by the changelog quoted above, the previous testing is
still valid.

Ok for trunk, approvers?

brgds, H-P
Hans-Peter Nilsson - Oct. 21, 2012, 4:06 a.m.
CC:ing middle-end maintainers this time.  I was a bit surprised
when Eric Botcazou wrote in his review, quoted below, that he's
not one of you.  Maybe approve that too?

On Mon, 15 Oct 2012, Hans-Peter Nilsson wrote:

> On Fri, 12 Oct 2012, Eric Botcazou wrote:
> > > (insn 168 49 51 3 (set (reg/f:DI 253 $253)
> > >         (plus:DI (reg/f:DI 253 $253)
> > >             (const_int 24 [0x18])))
> > > /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21
> > > -1 (nil))
> > > (insn 51 168 52 3 (clobber (reg/f:DI 253 $253))
> ...
>
> > > Note that insn 168 deleted, which seems a logical optimization.  The
> > > bug is to emit the clobber, not that the restoring insn is removed.
> >
> > Had that worked in the past for MMIX?
>
> Yes, for svn revision 106027 (20051030) 4.1.0-era (!)
> <http://gcc.gnu.org/ml/gcc-testresults/2005-10/msg01340.html>
> where the test must have passed, as
> gcc.c-torture/execute/built-in-setjmp.c is at least four years
> older than that.
>
> >  If so, what changed recently?
>
> By "these days" I didn't mean "recent", just not "eons ago". :)
> I see in a gcc-test-results posting from Mike Stein (whom I'd
> like to thank for test-results posting over the years), matching
> FAILs for svn revision 126095 (20070628) 4.3.0-era
> <http://gcc.gnu.org/ml/gcc-testresults/2007-06/msg01287.html>.
>
> Sorry, I have nothing in between those reports, my bad.  Though
> I see no point narrowing down the failing revision further here
> IMO; as mentioned the bug is not that the restoring insn is
> removed.
>
> > Agreed.  However, I'd suggest rescuing the comment for the ELIMINABLE_REGS
> > block from expand_nl_goto_receiver as it still sounds valid to me.
>
> Oops, my bad; I see I removed all the good comments.  Fixed.
>
> > > 	* stmt.c (expand_nl_goto_receiver): Remove almost-copy of
> > > 	expand_builtin_setjmp_receiver.
> > > 	(expand_label): Adjust, call expand_builtin_setjmp_receiver
> > > 	with NULL for the label parameter.
> > > 	* builtins.c (expand_builtin_setjmp_receiver): Don't clobber
> > > 	the frame-pointer.  Adjust comments.
> > > 	[HAVE_builtin_setjmp_receiver]: Emit builtin_setjmp_receiver
> > > 	only if LABEL is non-NULL.
> >
> > I cannot formally approve, but this looks good to me modulo:
>
> > > +   If RECEIVER_LABEL is NULL, instead the port-specific parts of a
> > > +   nonlocal goto handler are emitted.  */
> >
> > The "port-specific parts" wording is a bit confusing I think.  I'd just write:
> >
> >   If RECEIVER_LABEL is NULL, instead contruct a nonlocal goto handler.
>
> Sure.  Thanks for the review.  Updated patch below.  As nothing
> was changed from the previous post but comments as per the
> review (mostly moving / reviving, fixing one grammo), already
> covered by the changelog quoted above, the previous testing is
> still valid.
>
> Ok for trunk, approvers?
>
> Index: gcc/builtins.c
> ===================================================================
> --- gcc/builtins.c	(revision 192353)
> +++ gcc/builtins.c	(working copy)
> @@ -885,14 +885,15 @@ expand_builtin_setjmp_setup (rtx buf_add
>  }
>
>  /* Construct the trailing part of a __builtin_setjmp call.  This is
> -   also called directly by the SJLJ exception handling code.  */
> +   also called directly by the SJLJ exception handling code.
> +   If RECEIVER_LABEL is NULL, instead contruct a nonlocal goto handler.  */
>
>  void
>  expand_builtin_setjmp_receiver (rtx receiver_label ATTRIBUTE_UNUSED)
>  {
>    rtx chain;
>
> -  /* Clobber the FP when we get here, so we have to make sure it's
> +  /* Mark the FP as used when we get here, so we have to make sure it's
>       marked as used by this function.  */
>    emit_use (hard_frame_pointer_rtx);
>
> @@ -907,17 +908,28 @@ expand_builtin_setjmp_receiver (rtx rece
>  #ifdef HAVE_nonlocal_goto
>    if (! HAVE_nonlocal_goto)
>  #endif
> -    {
> -      emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx);
> -      /* This might change the hard frame pointer in ways that aren't
> -	 apparent to early optimization passes, so force a clobber.  */
> -      emit_clobber (hard_frame_pointer_rtx);
> -    }
> +    /* First adjust our frame pointer to its actual value.  It was
> +       previously set to the start of the virtual area corresponding to
> +       the stacked variables when we branched here and now needs to be
> +       adjusted to the actual hardware fp value.
> +
> +       Assignments to virtual registers are converted by
> +       instantiate_virtual_regs into the corresponding assignment
> +       to the underlying register (fp in this case) that makes
> +       the original assignment true.
> +       So the following insn will actually be decrementing fp by
> +       STARTING_FRAME_OFFSET.  */
> +    emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx);
>
>  #if !HARD_FRAME_POINTER_IS_ARG_POINTER
>    if (fixed_regs[ARG_POINTER_REGNUM])
>      {
>  #ifdef ELIMINABLE_REGS
> +      /* If the argument pointer can be eliminated in favor of the
> +	 frame pointer, we don't need to restore it.  We assume here
> +	 that if such an elimination is present, it can always be used.
> +	 This is the case on all known machines; if we don't make this
> +	 assumption, we do unnecessary saving on many machines.  */
>        size_t i;
>        static const struct elims {const int from, to;} elim_regs[] = ELIMINABLE_REGS;
>
> @@ -938,7 +950,7 @@ expand_builtin_setjmp_receiver (rtx rece
>  #endif
>
>  #ifdef HAVE_builtin_setjmp_receiver
> -  if (HAVE_builtin_setjmp_receiver)
> +  if (receiver_label != NULL && HAVE_builtin_setjmp_receiver)
>      emit_insn (gen_builtin_setjmp_receiver (receiver_label));
>    else
>  #endif
> Index: gcc/stmt.c
> ===================================================================
> --- gcc/stmt.c	(revision 192353)
> +++ gcc/stmt.c	(working copy)
> @@ -102,7 +102,6 @@ typedef struct case_node *case_node_ptr;
>
>  static int n_occurrences (int, const char *);
>  static bool tree_conflicts_with_clobbers_p (tree, HARD_REG_SET *);
> -static void expand_nl_goto_receiver (void);
>  static bool check_operand_nalternatives (tree, tree);
>  static bool check_unique_operand_names (tree, tree, tree);
>  static char *resolve_operand_name_1 (char *, tree, tree, tree);
> @@ -196,7 +195,7 @@ expand_label (tree label)
>
>    if (DECL_NONLOCAL (label))
>      {
> -      expand_nl_goto_receiver ();
> +      expand_builtin_setjmp_receiver (NULL);
>        nonlocal_goto_handler_labels
>  	= gen_rtx_EXPR_LIST (VOIDmode, label_r,
>  			     nonlocal_goto_handler_labels);
> @@ -1552,77 +1551,6 @@ expand_return (tree retval)
>      }
>  }
>
> -/* Emit code to restore vital registers at the beginning of a nonlocal goto
> -   handler.  */
> -static void
> -expand_nl_goto_receiver (void)
> -{
> -  rtx chain;
> -
> -  /* Clobber the FP when we get here, so we have to make sure it's
> -     marked as used by this function.  */
> -  emit_use (hard_frame_pointer_rtx);
> -
> -  /* Mark the static chain as clobbered here so life information
> -     doesn't get messed up for it.  */
> -  chain = targetm.calls.static_chain (current_function_decl, true);
> -  if (chain && REG_P (chain))
> -    emit_clobber (chain);
> -
> -#ifdef HAVE_nonlocal_goto
> -  if (! HAVE_nonlocal_goto)
> -#endif
> -    /* First adjust our frame pointer to its actual value.  It was
> -       previously set to the start of the virtual area corresponding to
> -       the stacked variables when we branched here and now needs to be
> -       adjusted to the actual hardware fp value.
> -
> -       Assignments are to virtual registers are converted by
> -       instantiate_virtual_regs into the corresponding assignment
> -       to the underlying register (fp in this case) that makes
> -       the original assignment true.
> -       So the following insn will actually be
> -       decrementing fp by STARTING_FRAME_OFFSET.  */
> -    emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx);
> -
> -#if !HARD_FRAME_POINTER_IS_ARG_POINTER
> -  if (fixed_regs[ARG_POINTER_REGNUM])
> -    {
> -#ifdef ELIMINABLE_REGS
> -      /* If the argument pointer can be eliminated in favor of the
> -	 frame pointer, we don't need to restore it.  We assume here
> -	 that if such an elimination is present, it can always be used.
> -	 This is the case on all known machines; if we don't make this
> -	 assumption, we do unnecessary saving on many machines.  */
> -      static const struct elims {const int from, to;} elim_regs[] = ELIMINABLE_REGS;
> -      size_t i;
> -
> -      for (i = 0; i < ARRAY_SIZE (elim_regs); i++)
> -	if (elim_regs[i].from == ARG_POINTER_REGNUM
> -	    && elim_regs[i].to == HARD_FRAME_POINTER_REGNUM)
> -	  break;
> -
> -      if (i == ARRAY_SIZE (elim_regs))
> -#endif
> -	{
> -	  /* Now restore our arg pointer from the address at which it
> -	     was saved in our stack frame.  */
> -	  emit_move_insn (crtl->args.internal_arg_pointer,
> -			  copy_to_reg (get_arg_pointer_save_area ()));
> -	}
> -    }
> -#endif
> -
> -#ifdef HAVE_nonlocal_goto_receiver
> -  if (HAVE_nonlocal_goto_receiver)
> -    emit_insn (gen_nonlocal_goto_receiver ());
> -#endif
> -
> -  /* We must not allow the code we just generated to be reordered by
> -     scheduling.  Specifically, the update of the frame pointer must
> -     happen immediately, not later.  */
> -  emit_insn (gen_blockage ());
> -}
>
>  /* Emit code to save the current value of stack.  */
>  rtx
> brgds, H-P
>
Richard Guenther - Oct. 22, 2012, 7:32 a.m.
On Sun, 21 Oct 2012, Hans-Peter Nilsson wrote:

> CC:ing middle-end maintainers this time.  I was a bit surprised
> when Eric Botcazou wrote in his review, quoted below, that he's
> not one of you.  Maybe approve that too?

If Eric is fine with the patch it is ok.  Yes, he is not
middle-end maintainer but RTL optimizer reviewer.

Thanks,
Richard.

> On Mon, 15 Oct 2012, Hans-Peter Nilsson wrote:
> 
> > On Fri, 12 Oct 2012, Eric Botcazou wrote:
> > > > (insn 168 49 51 3 (set (reg/f:DI 253 $253)
> > > >         (plus:DI (reg/f:DI 253 $253)
> > > >             (const_int 24 [0x18])))
> > > > /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21
> > > > -1 (nil))
> > > > (insn 51 168 52 3 (clobber (reg/f:DI 253 $253))
> > ...
> >
> > > > Note that insn 168 deleted, which seems a logical optimization.  The
> > > > bug is to emit the clobber, not that the restoring insn is removed.
> > >
> > > Had that worked in the past for MMIX?
> >
> > Yes, for svn revision 106027 (20051030) 4.1.0-era (!)
> > <http://gcc.gnu.org/ml/gcc-testresults/2005-10/msg01340.html>
> > where the test must have passed, as
> > gcc.c-torture/execute/built-in-setjmp.c is at least four years
> > older than that.
> >
> > >  If so, what changed recently?
> >
> > By "these days" I didn't mean "recent", just not "eons ago". :)
> > I see in a gcc-test-results posting from Mike Stein (whom I'd
> > like to thank for test-results posting over the years), matching
> > FAILs for svn revision 126095 (20070628) 4.3.0-era
> > <http://gcc.gnu.org/ml/gcc-testresults/2007-06/msg01287.html>.
> >
> > Sorry, I have nothing in between those reports, my bad.  Though
> > I see no point narrowing down the failing revision further here
> > IMO; as mentioned the bug is not that the restoring insn is
> > removed.
> >
> > > Agreed.  However, I'd suggest rescuing the comment for the ELIMINABLE_REGS
> > > block from expand_nl_goto_receiver as it still sounds valid to me.
> >
> > Oops, my bad; I see I removed all the good comments.  Fixed.
> >
> > > > 	* stmt.c (expand_nl_goto_receiver): Remove almost-copy of
> > > > 	expand_builtin_setjmp_receiver.
> > > > 	(expand_label): Adjust, call expand_builtin_setjmp_receiver
> > > > 	with NULL for the label parameter.
> > > > 	* builtins.c (expand_builtin_setjmp_receiver): Don't clobber
> > > > 	the frame-pointer.  Adjust comments.
> > > > 	[HAVE_builtin_setjmp_receiver]: Emit builtin_setjmp_receiver
> > > > 	only if LABEL is non-NULL.
> > >
> > > I cannot formally approve, but this looks good to me modulo:
> >
> > > > +   If RECEIVER_LABEL is NULL, instead the port-specific parts of a
> > > > +   nonlocal goto handler are emitted.  */
> > >
> > > The "port-specific parts" wording is a bit confusing I think.  I'd just write:
> > >
> > >   If RECEIVER_LABEL is NULL, instead contruct a nonlocal goto handler.
> >
> > Sure.  Thanks for the review.  Updated patch below.  As nothing
> > was changed from the previous post but comments as per the
> > review (mostly moving / reviving, fixing one grammo), already
> > covered by the changelog quoted above, the previous testing is
> > still valid.
> >
> > Ok for trunk, approvers?
> >
> > Index: gcc/builtins.c
> > ===================================================================
> > --- gcc/builtins.c	(revision 192353)
> > +++ gcc/builtins.c	(working copy)
> > @@ -885,14 +885,15 @@ expand_builtin_setjmp_setup (rtx buf_add
> >  }
> >
> >  /* Construct the trailing part of a __builtin_setjmp call.  This is
> > -   also called directly by the SJLJ exception handling code.  */
> > +   also called directly by the SJLJ exception handling code.
> > +   If RECEIVER_LABEL is NULL, instead contruct a nonlocal goto handler.  */
> >
> >  void
> >  expand_builtin_setjmp_receiver (rtx receiver_label ATTRIBUTE_UNUSED)
> >  {
> >    rtx chain;
> >
> > -  /* Clobber the FP when we get here, so we have to make sure it's
> > +  /* Mark the FP as used when we get here, so we have to make sure it's
> >       marked as used by this function.  */
> >    emit_use (hard_frame_pointer_rtx);
> >
> > @@ -907,17 +908,28 @@ expand_builtin_setjmp_receiver (rtx rece
> >  #ifdef HAVE_nonlocal_goto
> >    if (! HAVE_nonlocal_goto)
> >  #endif
> > -    {
> > -      emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx);
> > -      /* This might change the hard frame pointer in ways that aren't
> > -	 apparent to early optimization passes, so force a clobber.  */
> > -      emit_clobber (hard_frame_pointer_rtx);
> > -    }
> > +    /* First adjust our frame pointer to its actual value.  It was
> > +       previously set to the start of the virtual area corresponding to
> > +       the stacked variables when we branched here and now needs to be
> > +       adjusted to the actual hardware fp value.
> > +
> > +       Assignments to virtual registers are converted by
> > +       instantiate_virtual_regs into the corresponding assignment
> > +       to the underlying register (fp in this case) that makes
> > +       the original assignment true.
> > +       So the following insn will actually be decrementing fp by
> > +       STARTING_FRAME_OFFSET.  */
> > +    emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx);
> >
> >  #if !HARD_FRAME_POINTER_IS_ARG_POINTER
> >    if (fixed_regs[ARG_POINTER_REGNUM])
> >      {
> >  #ifdef ELIMINABLE_REGS
> > +      /* If the argument pointer can be eliminated in favor of the
> > +	 frame pointer, we don't need to restore it.  We assume here
> > +	 that if such an elimination is present, it can always be used.
> > +	 This is the case on all known machines; if we don't make this
> > +	 assumption, we do unnecessary saving on many machines.  */
> >        size_t i;
> >        static const struct elims {const int from, to;} elim_regs[] = ELIMINABLE_REGS;
> >
> > @@ -938,7 +950,7 @@ expand_builtin_setjmp_receiver (rtx rece
> >  #endif
> >
> >  #ifdef HAVE_builtin_setjmp_receiver
> > -  if (HAVE_builtin_setjmp_receiver)
> > +  if (receiver_label != NULL && HAVE_builtin_setjmp_receiver)
> >      emit_insn (gen_builtin_setjmp_receiver (receiver_label));
> >    else
> >  #endif
> > Index: gcc/stmt.c
> > ===================================================================
> > --- gcc/stmt.c	(revision 192353)
> > +++ gcc/stmt.c	(working copy)
> > @@ -102,7 +102,6 @@ typedef struct case_node *case_node_ptr;
> >
> >  static int n_occurrences (int, const char *);
> >  static bool tree_conflicts_with_clobbers_p (tree, HARD_REG_SET *);
> > -static void expand_nl_goto_receiver (void);
> >  static bool check_operand_nalternatives (tree, tree);
> >  static bool check_unique_operand_names (tree, tree, tree);
> >  static char *resolve_operand_name_1 (char *, tree, tree, tree);
> > @@ -196,7 +195,7 @@ expand_label (tree label)
> >
> >    if (DECL_NONLOCAL (label))
> >      {
> > -      expand_nl_goto_receiver ();
> > +      expand_builtin_setjmp_receiver (NULL);
> >        nonlocal_goto_handler_labels
> >  	= gen_rtx_EXPR_LIST (VOIDmode, label_r,
> >  			     nonlocal_goto_handler_labels);
> > @@ -1552,77 +1551,6 @@ expand_return (tree retval)
> >      }
> >  }
> >
> > -/* Emit code to restore vital registers at the beginning of a nonlocal goto
> > -   handler.  */
> > -static void
> > -expand_nl_goto_receiver (void)
> > -{
> > -  rtx chain;
> > -
> > -  /* Clobber the FP when we get here, so we have to make sure it's
> > -     marked as used by this function.  */
> > -  emit_use (hard_frame_pointer_rtx);
> > -
> > -  /* Mark the static chain as clobbered here so life information
> > -     doesn't get messed up for it.  */
> > -  chain = targetm.calls.static_chain (current_function_decl, true);
> > -  if (chain && REG_P (chain))
> > -    emit_clobber (chain);
> > -
> > -#ifdef HAVE_nonlocal_goto
> > -  if (! HAVE_nonlocal_goto)
> > -#endif
> > -    /* First adjust our frame pointer to its actual value.  It was
> > -       previously set to the start of the virtual area corresponding to
> > -       the stacked variables when we branched here and now needs to be
> > -       adjusted to the actual hardware fp value.
> > -
> > -       Assignments are to virtual registers are converted by
> > -       instantiate_virtual_regs into the corresponding assignment
> > -       to the underlying register (fp in this case) that makes
> > -       the original assignment true.
> > -       So the following insn will actually be
> > -       decrementing fp by STARTING_FRAME_OFFSET.  */
> > -    emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx);
> > -
> > -#if !HARD_FRAME_POINTER_IS_ARG_POINTER
> > -  if (fixed_regs[ARG_POINTER_REGNUM])
> > -    {
> > -#ifdef ELIMINABLE_REGS
> > -      /* If the argument pointer can be eliminated in favor of the
> > -	 frame pointer, we don't need to restore it.  We assume here
> > -	 that if such an elimination is present, it can always be used.
> > -	 This is the case on all known machines; if we don't make this
> > -	 assumption, we do unnecessary saving on many machines.  */
> > -      static const struct elims {const int from, to;} elim_regs[] = ELIMINABLE_REGS;
> > -      size_t i;
> > -
> > -      for (i = 0; i < ARRAY_SIZE (elim_regs); i++)
> > -	if (elim_regs[i].from == ARG_POINTER_REGNUM
> > -	    && elim_regs[i].to == HARD_FRAME_POINTER_REGNUM)
> > -	  break;
> > -
> > -      if (i == ARRAY_SIZE (elim_regs))
> > -#endif
> > -	{
> > -	  /* Now restore our arg pointer from the address at which it
> > -	     was saved in our stack frame.  */
> > -	  emit_move_insn (crtl->args.internal_arg_pointer,
> > -			  copy_to_reg (get_arg_pointer_save_area ()));
> > -	}
> > -    }
> > -#endif
> > -
> > -#ifdef HAVE_nonlocal_goto_receiver
> > -  if (HAVE_nonlocal_goto_receiver)
> > -    emit_insn (gen_nonlocal_goto_receiver ());
> > -#endif
> > -
> > -  /* We must not allow the code we just generated to be reordered by
> > -     scheduling.  Specifically, the update of the frame pointer must
> > -     happen immediately, not later.  */
> > -  emit_insn (gen_blockage ());
> > -}
> >
> >  /* Emit code to save the current value of stack.  */
> >  rtx
> > brgds, H-P
> >
> 
>

Patch

Index: gcc/builtins.c
===================================================================
--- gcc/builtins.c	(revision 192353)
+++ gcc/builtins.c	(working copy)
@@ -885,14 +885,15 @@  expand_builtin_setjmp_setup (rtx buf_add
 }

 /* Construct the trailing part of a __builtin_setjmp call.  This is
-   also called directly by the SJLJ exception handling code.  */
+   also called directly by the SJLJ exception handling code.
+   If RECEIVER_LABEL is NULL, instead contruct a nonlocal goto handler.  */

 void
 expand_builtin_setjmp_receiver (rtx receiver_label ATTRIBUTE_UNUSED)
 {
   rtx chain;

-  /* Clobber the FP when we get here, so we have to make sure it's
+  /* Mark the FP as used when we get here, so we have to make sure it's
      marked as used by this function.  */
   emit_use (hard_frame_pointer_rtx);

@@ -907,17 +908,28 @@  expand_builtin_setjmp_receiver (rtx rece
 #ifdef HAVE_nonlocal_goto
   if (! HAVE_nonlocal_goto)
 #endif
-    {
-      emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx);
-      /* This might change the hard frame pointer in ways that aren't
-	 apparent to early optimization passes, so force a clobber.  */
-      emit_clobber (hard_frame_pointer_rtx);
-    }
+    /* First adjust our frame pointer to its actual value.  It was
+       previously set to the start of the virtual area corresponding to
+       the stacked variables when we branched here and now needs to be
+       adjusted to the actual hardware fp value.
+
+       Assignments to virtual registers are converted by
+       instantiate_virtual_regs into the corresponding assignment
+       to the underlying register (fp in this case) that makes
+       the original assignment true.
+       So the following insn will actually be decrementing fp by
+       STARTING_FRAME_OFFSET.  */
+    emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx);

 #if !HARD_FRAME_POINTER_IS_ARG_POINTER
   if (fixed_regs[ARG_POINTER_REGNUM])
     {
 #ifdef ELIMINABLE_REGS
+      /* If the argument pointer can be eliminated in favor of the
+	 frame pointer, we don't need to restore it.  We assume here
+	 that if such an elimination is present, it can always be used.
+	 This is the case on all known machines; if we don't make this
+	 assumption, we do unnecessary saving on many machines.  */
       size_t i;
       static const struct elims {const int from, to;} elim_regs[] = ELIMINABLE_REGS;

@@ -938,7 +950,7 @@  expand_builtin_setjmp_receiver (rtx rece
 #endif

 #ifdef HAVE_builtin_setjmp_receiver
-  if (HAVE_builtin_setjmp_receiver)
+  if (receiver_label != NULL && HAVE_builtin_setjmp_receiver)
     emit_insn (gen_builtin_setjmp_receiver (receiver_label));
   else
 #endif
Index: gcc/stmt.c
===================================================================
--- gcc/stmt.c	(revision 192353)
+++ gcc/stmt.c	(working copy)
@@ -102,7 +102,6 @@  typedef struct case_node *case_node_ptr;

 static int n_occurrences (int, const char *);
 static bool tree_conflicts_with_clobbers_p (tree, HARD_REG_SET *);
-static void expand_nl_goto_receiver (void);
 static bool check_operand_nalternatives (tree, tree);
 static bool check_unique_operand_names (tree, tree, tree);
 static char *resolve_operand_name_1 (char *, tree, tree, tree);
@@ -196,7 +195,7 @@  expand_label (tree label)

   if (DECL_NONLOCAL (label))
     {
-      expand_nl_goto_receiver ();
+      expand_builtin_setjmp_receiver (NULL);
       nonlocal_goto_handler_labels
 	= gen_rtx_EXPR_LIST (VOIDmode, label_r,
 			     nonlocal_goto_handler_labels);
@@ -1552,77 +1551,6 @@  expand_return (tree retval)
     }
 }

-/* Emit code to restore vital registers at the beginning of a nonlocal goto
-   handler.  */
-static void
-expand_nl_goto_receiver (void)
-{
-  rtx chain;
-
-  /* Clobber the FP when we get here, so we have to make sure it's
-     marked as used by this function.  */
-  emit_use (hard_frame_pointer_rtx);
-
-  /* Mark the static chain as clobbered here so life information
-     doesn't get messed up for it.  */
-  chain = targetm.calls.static_chain (current_function_decl, true);
-  if (chain && REG_P (chain))
-    emit_clobber (chain);
-
-#ifdef HAVE_nonlocal_goto
-  if (! HAVE_nonlocal_goto)
-#endif
-    /* First adjust our frame pointer to its actual value.  It was
-       previously set to the start of the virtual area corresponding to
-       the stacked variables when we branched here and now needs to be
-       adjusted to the actual hardware fp value.
-
-       Assignments are to virtual registers are converted by
-       instantiate_virtual_regs into the corresponding assignment
-       to the underlying register (fp in this case) that makes
-       the original assignment true.
-       So the following insn will actually be
-       decrementing fp by STARTING_FRAME_OFFSET.  */
-    emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx);
-
-#if !HARD_FRAME_POINTER_IS_ARG_POINTER
-  if (fixed_regs[ARG_POINTER_REGNUM])
-    {
-#ifdef ELIMINABLE_REGS
-      /* If the argument pointer can be eliminated in favor of the
-	 frame pointer, we don't need to restore it.  We assume here
-	 that if such an elimination is present, it can always be used.
-	 This is the case on all known machines; if we don't make this
-	 assumption, we do unnecessary saving on many machines.  */
-      static const struct elims {const int from, to;} elim_regs[] = ELIMINABLE_REGS;
-      size_t i;
-
-      for (i = 0; i < ARRAY_SIZE (elim_regs); i++)
-	if (elim_regs[i].from == ARG_POINTER_REGNUM
-	    && elim_regs[i].to == HARD_FRAME_POINTER_REGNUM)
-	  break;
-
-      if (i == ARRAY_SIZE (elim_regs))
-#endif
-	{
-	  /* Now restore our arg pointer from the address at which it
-	     was saved in our stack frame.  */
-	  emit_move_insn (crtl->args.internal_arg_pointer,
-			  copy_to_reg (get_arg_pointer_save_area ()));
-	}
-    }
-#endif
-
-#ifdef HAVE_nonlocal_goto_receiver
-  if (HAVE_nonlocal_goto_receiver)
-    emit_insn (gen_nonlocal_goto_receiver ());
-#endif
-
-  /* We must not allow the code we just generated to be reordered by
-     scheduling.  Specifically, the update of the frame pointer must
-     happen immediately, not later.  */
-  emit_insn (gen_blockage ());
-}

 /* Emit code to save the current value of stack.  */
 rtx