diff mbox

[RFA:] Fix frame-pointer-clobbering in builtins.c:expand_builtin_setjmp_receiver

Message ID alpine.BSF.2.00.1210111922120.76076@dair.pair.com
State New
Headers show

Commit Message

Hans-Peter Nilsson Oct. 11, 2012, 11:34 p.m. UTC
The md.texi entry for nonlocal_goto_receiver says "A typical reason
why you might need this pattern is if some value, such as a pointer to
a global table, must be restored when the frame pointer is restored.
Note that a nonlocal goto only occurs within a unit-of-translation, so
a global table pointer that is shared by all functions of a given
module need not be restored".  One use would be to restore a
hardware-register-value saved in the current frame; the frame where
__builtin_setjmp is called (i.e. not a global context).  This is what
the MMIX port does, saving the register-stack-pointer-register for use
when unwinding the register stack.  I can imagine other similar
register-restoring needs that require something saved in the current
frame, but current ports with that pattern (or setjmp_receiver) but
without a nonlocal_goto pattern (see the HAVE_nonlocal_goto condition
in the patch) don't.  (The AVR port performs what appears to be a
cargo-cult song and dance; the bug below copied into the port, but
the port will not otherwise be affected by the bug or this patch, as it
has a nonlocal_goto pattern.)

But, in the builtins.c:expand_builtin_setjmp_receiver, the
frame-pointer is *clobbered* for a mysterious and fuddy reason:

      /* This might change the hard frame pointer in ways that aren't
	 apparent to early optimization passes, so force a clobber.  */
      emit_clobber (hard_frame_pointer_rtx);

That comment might have been true eons ago, but these days clobbering
the frame-pointer means that its value is void and any restoring insns
emitted before the clobber are deleted *because* of the clobber.  For
example, in built-in-setjmp.c at -O2.  Before .191r.ud_dce (i.e. in
the .190r.init-regs dump):

(code_label/s 47 115 48 3 3 "" [2 uses])
(note 48 47 49 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
(insn 49 48 168 3 (use (reg/f:DI 253 $253)) /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21 -1
     (nil))
(insn 168 49 51 3 (set (reg/f:DI 253 $253)
        (plus:DI (reg/f:DI 253 $253)
            (const_int 24 [0x18]))) /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21 -1
     (nil))
(insn 51 168 52 3 (clobber (reg/f:DI 253 $253)) /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21 -1
     (nil))
(insn 52 51 54 3 (parallel [
            (unspec_volatile [
                    (reg/f:DI 253 $253)
                ] 1)
            (clobber (scratch:DI))
            (clobber (reg:DI 259 rJ))
        ]) /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21 63 {*nonlocal_goto_receiver_expanded}
     (expr_list:REG_UNUSED (reg:DI 259 rJ)
        (nil)))
(insn 54 52 136 3 (asm_input/v ("") (null):0) /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21 -1
     (nil))

After:

(code_label/s 47 115 48 3 3 "" [2 uses])
(note 48 47 49 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
(insn 49 48 51 3 (use (reg/f:DI 253 $253)) /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21 -1
     (nil))
(insn 51 49 52 3 (clobber (reg/f:DI 253 $253)) /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21 -1
     (nil))
(insn 52 51 54 3 (parallel [
            (unspec_volatile [
                    (reg/f:DI 253 $253)
                ] 1)
            (clobber (scratch:DI))
            (clobber (reg:DI 259 rJ))
        ]) /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21 63 {*nonlocal_goto_receiver_expanded}
     (expr_list:REG_UNUSED (reg:DI 259 rJ)
        (nil)))
(insn 54 52 136 3 (asm_input/v ("") (null):0) /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21 -1
     (nil))

Note that insn 168 deleted, which seems a logical optimization.  The
bug is to emit the clobber, not that the restoring insn is removed.
While grepping around for other emitters of nonlocal_goto_receiver I
noticed that builtins.c:expand_builtin_setjmp_receiver is identical to
stmt.c:expand_nl_goto_receiver save for two things: the frame-pointer
clobbering(!) and that expand_builtin_setjmp_receiver instead prefers to
emit setjmp_receiver.  I don't see how the frame-pointer-clobbering
would be needed as part of emitting setjmp_receiver.

I suggest eliminating the bug and one copy of the apparently bug-prone
code.  I kept the function in builtins.c for obvious reasons (if not
obvious, consider the name: expand *builtin* setjmp_receiver) with the
setjmp-ness expressed through the label parameter, which is non-NULL
for pre-existing calls.  Note also the fixed clobber-comment,
obviously incorrect in the stmt.c almost-copy, and at least on the
wrong line in expand_builtin_setjmp_receiver.

Tested mmix-knuth-mmixware, x86_64-unknown-linux-gnu (for good measure)
and rl78-elf (a SJLJ target; fixed-up with a patch from the maintainer
for the current breakage in PR54882, but without fortran), no regressions.

This fixes the following FAILs for mmix-knuth-mmixware:

Running /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/execute.exp ...
...
FAIL: gcc.c-torture/execute/built-in-setjmp.c execution,  -O2
FAIL: gcc.c-torture/execute/built-in-setjmp.c execution,  -O3 -fomit-frame-pointer
FAIL: gcc.c-torture/execute/built-in-setjmp.c execution,  -O3 -fomit-frame-pointer -funroll-loops
FAIL: gcc.c-torture/execute/built-in-setjmp.c execution,  -O3 -fomit-frame-pointer -funroll-all-loops -finline-function\
s
FAIL: gcc.c-torture/execute/built-in-setjmp.c execution,  -O3 -g
FAIL: gcc.c-torture/execute/built-in-setjmp.c execution,  -Os
FAIL: gcc.c-torture/execute/built-in-setjmp.c execution,  -O2 -flto -fno-use-linker-plugin -flto-partition=none
FAIL: gcc.c-torture/execute/built-in-setjmp.c execution,  -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects
...
Running /tmp/mmiximp2/gcc/gcc/testsuite/gcc.dg/torture/stackalign/stackalign.exp ...
...
FAIL: gcc.dg/torture/stackalign/setjmp-1.c  -O2  execution test
FAIL: gcc.dg/torture/stackalign/setjmp-1.c  -O3 -fomit-frame-pointer  execution test
FAIL: gcc.dg/torture/stackalign/setjmp-1.c  -O3 -fomit-frame-pointer -funroll-loops  execution test
FAIL: gcc.dg/torture/stackalign/setjmp-1.c  -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions  execution t\
est
FAIL: gcc.dg/torture/stackalign/setjmp-1.c  -O3 -g  execution test
FAIL: gcc.dg/torture/stackalign/setjmp-1.c  -Os  execution test
FAIL: gcc.dg/torture/stackalign/setjmp-1.c  -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test
FAIL: gcc.dg/torture/stackalign/setjmp-1.c  -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  execution test
FAIL: gcc.dg/torture/stackalign/setjmp-3.c  -O2  execution test
FAIL: gcc.dg/torture/stackalign/setjmp-3.c  -O3 -fomit-frame-pointer  execution test
FAIL: gcc.dg/torture/stackalign/setjmp-3.c  -O3 -fomit-frame-pointer -funroll-loops  execution test
FAIL: gcc.dg/torture/stackalign/setjmp-3.c  -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions  execution t\
est
FAIL: gcc.dg/torture/stackalign/setjmp-3.c  -O3 -g  execution test
FAIL: gcc.dg/torture/stackalign/setjmp-3.c  -Os  execution test
FAIL: gcc.dg/torture/stackalign/setjmp-3.c  -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test
FAIL: gcc.dg/torture/stackalign/setjmp-3.c  -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  execution test
FAIL: gcc.dg/torture/stackalign/setjmp-4.c  -O2  execution test
FAIL: gcc.dg/torture/stackalign/setjmp-4.c  -O3 -fomit-frame-pointer  execution test
FAIL: gcc.dg/torture/stackalign/setjmp-4.c  -O3 -fomit-frame-pointer -funroll-loops  execution test
FAIL: gcc.dg/torture/stackalign/setjmp-4.c  -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions  execution t\
est
FAIL: gcc.dg/torture/stackalign/setjmp-4.c  -O3 -g  execution test
FAIL: gcc.dg/torture/stackalign/setjmp-4.c  -Os  execution test
FAIL: gcc.dg/torture/stackalign/setjmp-4.c  -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test
FAIL: gcc.dg/torture/stackalign/setjmp-4.c  -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  execution test

Ok to commit?

gcc:
	* stmt.c (expand_nl_goto_receiver): Remove almost-copy of
	expand_builtin_setjmp_receiver.
	(expand_label): Adjust, call expand_builtin_setjmp_receiver
	with NULL for the label parameter.
	* builtins.c (expand_builtin_setjmp_receiver): Don't clobber
	the frame-pointer.  Adjust comments.
	[HAVE_builtin_setjmp_receiver]: Emit builtin_setjmp_receiver
	only if LABEL is non-NULL.


brgds, H-P

Comments

Eric Botcazou Oct. 12, 2012, 8:32 a.m. UTC | #1
> But, in the builtins.c:expand_builtin_setjmp_receiver, the
> frame-pointer is *clobbered* for a mysterious and fuddy reason:
> 
>       /* This might change the hard frame pointer in ways that aren't
> 	 apparent to early optimization passes, so force a clobber.  */
>       emit_clobber (hard_frame_pointer_rtx);
> 
> That comment might have been true eons ago, but these days clobbering
> the frame-pointer means that its value is void and any restoring insns
> emitted before the clobber are deleted *because* of the clobber.  For
> example, in built-in-setjmp.c at -O2.  Before .191r.ud_dce (i.e. in
> the .190r.init-regs dump):
> 
> (code_label/s 47 115 48 3 3 "" [2 uses])
> (note 48 47 49 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
> (insn 49 48 168 3 (use (reg/f:DI 253 $253))
> /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21
> -1 (nil))
> (insn 168 49 51 3 (set (reg/f:DI 253 $253)
>         (plus:DI (reg/f:DI 253 $253)
>             (const_int 24 [0x18])))
> /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21
> -1 (nil))
> (insn 51 168 52 3 (clobber (reg/f:DI 253 $253))
> /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21
> -1 (nil))
> (insn 52 51 54 3 (parallel [
>             (unspec_volatile [
>                     (reg/f:DI 253 $253)
>                 ] 1)
>             (clobber (scratch:DI))
>             (clobber (reg:DI 259 rJ))
>         ])
> /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21
> 63 {*nonlocal_goto_receiver_expanded} (expr_list:REG_UNUSED (reg:DI 259 rJ)
>         (nil)))
> (insn 54 52 136 3 (asm_input/v ("") (null):0)
> /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21
> -1 (nil))
> 
> After:
> 
> (code_label/s 47 115 48 3 3 "" [2 uses])
> (note 48 47 49 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
> (insn 49 48 51 3 (use (reg/f:DI 253 $253))
> /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21
> -1 (nil))
> (insn 51 49 52 3 (clobber (reg/f:DI 253 $253))
> /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21
> -1 (nil))
> (insn 52 51 54 3 (parallel [
>             (unspec_volatile [
>                     (reg/f:DI 253 $253)
>                 ] 1)
>             (clobber (scratch:DI))
>             (clobber (reg:DI 259 rJ))
>         ])
> /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21
> 63 {*nonlocal_goto_receiver_expanded} (expr_list:REG_UNUSED (reg:DI 259 rJ)
>         (nil)))
> (insn 54 52 136 3 (asm_input/v ("") (null):0)
> /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21
> -1 (nil))
> 
> Note that insn 168 deleted, which seems a logical optimization.  The
> bug is to emit the clobber, not that the restoring insn is removed.

Had that worked in the past for MMIX?  If so, what changed recently?

> While grepping around for other emitters of nonlocal_goto_receiver I
> noticed that builtins.c:expand_builtin_setjmp_receiver is identical to
> stmt.c:expand_nl_goto_receiver save for two things: the frame-pointer
> clobbering(!) and that expand_builtin_setjmp_receiver instead prefers to
> emit setjmp_receiver.  I don't see how the frame-pointer-clobbering
> would be needed as part of emitting setjmp_receiver.

Indeed, the discrepancy seems to have no firm basis.

> I suggest eliminating the bug and one copy of the apparently bug-prone
> code.  I kept the function in builtins.c for obvious reasons (if not
> obvious, consider the name: expand *builtin* setjmp_receiver) with the
> setjmp-ness expressed through the label parameter, which is non-NULL
> for pre-existing calls.  Note also the fixed clobber-comment,
> obviously incorrect in the stmt.c almost-copy, and at least on the
> wrong line in expand_builtin_setjmp_receiver.

Agreed.  However, I'd suggest rescuing the comment for the ELIMINABLE_REGS 
block from expand_nl_goto_receiver as it still sounds valid to me.

> 	* stmt.c (expand_nl_goto_receiver): Remove almost-copy of
> 	expand_builtin_setjmp_receiver.
> 	(expand_label): Adjust, call expand_builtin_setjmp_receiver
> 	with NULL for the label parameter.
> 	* builtins.c (expand_builtin_setjmp_receiver): Don't clobber
> 	the frame-pointer.  Adjust comments.
> 	[HAVE_builtin_setjmp_receiver]: Emit builtin_setjmp_receiver
> 	only if LABEL is non-NULL.

I cannot formally approve, but this looks good to me modulo:

> Index: gcc/builtins.c
> ===================================================================
> --- gcc/builtins.c	(revision 192353)
> +++ gcc/builtins.c	(working copy)
> @@ -885,14 +885,16 @@ expand_builtin_setjmp_setup (rtx buf_add
>  }
> 
>  /* Construct the trailing part of a __builtin_setjmp call.  This is
> -   also called directly by the SJLJ exception handling code.  */
> +   also called directly by the SJLJ exception handling code.
> +   If RECEIVER_LABEL is NULL, instead the port-specific parts of a
> +   nonlocal goto handler are emitted.  */

The "port-specific parts" wording is a bit confusing I think.  I'd just write:

  If RECEIVER_LABEL is NULL, instead contruct a nonlocal goto handler.
diff mbox

Patch

Index: gcc/stmt.c
===================================================================
--- gcc/stmt.c	(revision 192353)
+++ gcc/stmt.c	(working copy)
@@ -102,7 +102,6 @@  typedef struct case_node *case_node_ptr;

 static int n_occurrences (int, const char *);
 static bool tree_conflicts_with_clobbers_p (tree, HARD_REG_SET *);
-static void expand_nl_goto_receiver (void);
 static bool check_operand_nalternatives (tree, tree);
 static bool check_unique_operand_names (tree, tree, tree);
 static char *resolve_operand_name_1 (char *, tree, tree, tree);
@@ -196,7 +195,7 @@  expand_label (tree label)

   if (DECL_NONLOCAL (label))
     {
-      expand_nl_goto_receiver ();
+      expand_builtin_setjmp_receiver (NULL);
       nonlocal_goto_handler_labels
 	= gen_rtx_EXPR_LIST (VOIDmode, label_r,
 			     nonlocal_goto_handler_labels);
@@ -1552,77 +1551,6 @@  expand_return (tree retval)
     }
 }

-/* Emit code to restore vital registers at the beginning of a nonlocal goto
-   handler.  */
-static void
-expand_nl_goto_receiver (void)
-{
-  rtx chain;
-
-  /* Clobber the FP when we get here, so we have to make sure it's
-     marked as used by this function.  */
-  emit_use (hard_frame_pointer_rtx);
-
-  /* Mark the static chain as clobbered here so life information
-     doesn't get messed up for it.  */
-  chain = targetm.calls.static_chain (current_function_decl, true);
-  if (chain && REG_P (chain))
-    emit_clobber (chain);
-
-#ifdef HAVE_nonlocal_goto
-  if (! HAVE_nonlocal_goto)
-#endif
-    /* First adjust our frame pointer to its actual value.  It was
-       previously set to the start of the virtual area corresponding to
-       the stacked variables when we branched here and now needs to be
-       adjusted to the actual hardware fp value.
-
-       Assignments are to virtual registers are converted by
-       instantiate_virtual_regs into the corresponding assignment
-       to the underlying register (fp in this case) that makes
-       the original assignment true.
-       So the following insn will actually be
-       decrementing fp by STARTING_FRAME_OFFSET.  */
-    emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx);
-
-#if !HARD_FRAME_POINTER_IS_ARG_POINTER
-  if (fixed_regs[ARG_POINTER_REGNUM])
-    {
-#ifdef ELIMINABLE_REGS
-      /* If the argument pointer can be eliminated in favor of the
-	 frame pointer, we don't need to restore it.  We assume here
-	 that if such an elimination is present, it can always be used.
-	 This is the case on all known machines; if we don't make this
-	 assumption, we do unnecessary saving on many machines.  */
-      static const struct elims {const int from, to;} elim_regs[] = ELIMINABLE_REGS;
-      size_t i;
-
-      for (i = 0; i < ARRAY_SIZE (elim_regs); i++)
-	if (elim_regs[i].from == ARG_POINTER_REGNUM
-	    && elim_regs[i].to == HARD_FRAME_POINTER_REGNUM)
-	  break;
-
-      if (i == ARRAY_SIZE (elim_regs))
-#endif
-	{
-	  /* Now restore our arg pointer from the address at which it
-	     was saved in our stack frame.  */
-	  emit_move_insn (crtl->args.internal_arg_pointer,
-			  copy_to_reg (get_arg_pointer_save_area ()));
-	}
-    }
-#endif
-
-#ifdef HAVE_nonlocal_goto_receiver
-  if (HAVE_nonlocal_goto_receiver)
-    emit_insn (gen_nonlocal_goto_receiver ());
-#endif
-
-  /* We must not allow the code we just generated to be reordered by
-     scheduling.  Specifically, the update of the frame pointer must
-     happen immediately, not later.  */
-  emit_insn (gen_blockage ());
-}

 /* Emit code to save the current value of stack.  */
 rtx
Index: gcc/builtins.c
===================================================================
--- gcc/builtins.c	(revision 192353)
+++ gcc/builtins.c	(working copy)
@@ -885,14 +885,16 @@  expand_builtin_setjmp_setup (rtx buf_add
 }

 /* Construct the trailing part of a __builtin_setjmp call.  This is
-   also called directly by the SJLJ exception handling code.  */
+   also called directly by the SJLJ exception handling code.
+   If RECEIVER_LABEL is NULL, instead the port-specific parts of a
+   nonlocal goto handler are emitted.  */

 void
 expand_builtin_setjmp_receiver (rtx receiver_label ATTRIBUTE_UNUSED)
 {
   rtx chain;

-  /* Clobber the FP when we get here, so we have to make sure it's
+  /* Mark the FP as used when we get here, so we have to make sure it's
      marked as used by this function.  */
   emit_use (hard_frame_pointer_rtx);

@@ -907,12 +909,7 @@  expand_builtin_setjmp_receiver (rtx rece
 #ifdef HAVE_nonlocal_goto
   if (! HAVE_nonlocal_goto)
 #endif
-    {
-      emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx);
-      /* This might change the hard frame pointer in ways that aren't
-	 apparent to early optimization passes, so force a clobber.  */
-      emit_clobber (hard_frame_pointer_rtx);
-    }
+    emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx);

 #if !HARD_FRAME_POINTER_IS_ARG_POINTER
   if (fixed_regs[ARG_POINTER_REGNUM])
@@ -938,7 +935,7 @@  expand_builtin_setjmp_receiver (rtx rece
 #endif

 #ifdef HAVE_builtin_setjmp_receiver
-  if (HAVE_builtin_setjmp_receiver)
+  if (receiver_label != NULL && HAVE_builtin_setjmp_receiver)
     emit_insn (gen_builtin_setjmp_receiver (receiver_label));
   else
 #endif


Oh, I almost forgot, there's a port-bug as well; the MMIX
nonlocal_goto_receiver need, in the pattern, naturally has to *refer*
to the frame-pointer for that to stay around.  In built-in-setjmp.c
there's otherwise no other need: after __builtin_setjmp returns
non-zero (well, always 1), the function exits only through calls to
noreturn functions and there's no other frame-local data references
after the __builtin_setjmp call (the variable "p" is in a register).
I'll commit this together with the patch above if approved.  Without
that, it has no effect on test-results.  RTL dumps above are obviously
with this patch applied. :)

	* config/mmix/mmix.md ("nonlocal_goto_receiver"): Refer to the
	frame-pointer as an operand.
	("*nonlocal_goto_receiver_expanded"): Ditto.  Use
	mmix_output_register_setting instead of naked output_asm_insn for
	the offset from the frame-pointer to the saved rO.
	* config/mmix/mmix.c (mmix_output_register_setting): Emit NEGU for
	values -255..0.
	* config/mmix/predicates.md ("frame_pointer_operand"): New.
	* config/mmix/constraints.md ("Yf"): New.

--- gcc/config/mmix/mmix.md.orig	2012-09-10 01:41:59.000000000 +0200
+++ gcc/config/mmix/mmix.md	2012-10-09 01:54:43.000000000 +0200
@@ -1120,7 +1120,7 @@  DIVU %1,%1,%2\;GET %0,:rR\;NEGU %2,0,%0\
 ;; of "pop 0,0" until rO equals the saved value.  (If it goes lower, we
 ;; should die with a trap.)
 (define_expand "nonlocal_goto_receiver"
-  [(parallel [(unspec_volatile [(const_int 0)] 1)
+  [(parallel [(unspec_volatile [(match_dup 1)] 1)
 	      (clobber (scratch:DI))
 	      (clobber (reg:DI MMIX_rJ_REGNUM))])
    (set (reg:DI MMIX_rJ_REGNUM) (match_dup 0))]
@@ -1131,6 +1131,13 @@  DIVU %1,%1,%2\;GET %0,:rR\;NEGU %2,0,%0\
     = mmix_get_hard_reg_initial_val (Pmode,
 				     MMIX_INCOMING_RETURN_ADDRESS_REGNUM);

+  /* We need the frame-pointer to be live or the equivalent
+     expression, so refer to in in the pattern.  We can't use a MEM
+     (that may contain out-of-range offsets in the final expression)
+     for fear that middle-end will legitimize it or replace the address
+     using temporary registers (which are not revived at this point).  */
+  operands[1] = frame_pointer_rtx;
+
   /* Mark this function as containing a landing-pad.  */
   cfun->machine->has_landing_pad = 1;
 }")
@@ -1140,45 +1147,40 @@  DIVU %1,%1,%2\;GET %0,:rR\;NEGU %2,0,%0\
 ;; address and re-use them after the register stack unwind, so it's best
 ;; to form the address ourselves.
 (define_insn "*nonlocal_goto_receiver_expanded"
-  [(unspec_volatile [(const_int 0)] 1)
+  [(unspec_volatile [(match_operand:DI 1 "frame_pointer_operand" "Yf")] 1)
    (clobber (match_scratch:DI 0 "=&r"))
    (clobber (reg:DI MMIX_rJ_REGNUM))]
   ""
 {
-  rtx temp_reg = operands[0];
-  rtx my_operands[2];
-  HOST_WIDEST_INT offs;
+  rtx my_operands[3];
   const char *my_template
     = "GETA $255,0f\;PUT rJ,$255\;LDOU $255,%a0\n\
 0:\;GET %1,rO\;CMPU %1,%1,$255\;BNP %1,1f\;POP 0,0\n1:";

-  my_operands[1] = temp_reg;
+  my_operands[1] = operands[0];
+  my_operands[2] = GEN_INT (-MMIX_fp_rO_OFFSET);

-  /* If we have a frame-pointer (hence unknown stack-pointer offset),
-     just use the frame-pointer and the known offset.  */
-  if (frame_pointer_needed)
+  if (operands[1] == hard_frame_pointer_rtx)
     {
-      my_operands[0] = GEN_INT (-MMIX_fp_rO_OFFSET);
-
-      output_asm_insn ("NEGU %1,0,%0", my_operands);
-      my_operands[0] = gen_rtx_PLUS (Pmode, frame_pointer_rtx, temp_reg);
+      mmix_output_register_setting (asm_out_file, REGNO (operands[0]),
+				    MMIX_fp_rO_OFFSET, 1);
+      my_operands[0]
+	= gen_rtx_PLUS (Pmode, hard_frame_pointer_rtx, operands[0]);
     }
   else
     {
-      /* We know the fp-based offset, so "eliminate" it to be sp-based.  */
-      offs
-	= (mmix_initial_elimination_offset (MMIX_FRAME_POINTER_REGNUM,
-					    MMIX_STACK_POINTER_REGNUM)
-	   + MMIX_fp_rO_OFFSET);
+      HOST_WIDEST_INT offs = INTVAL (XEXP (operands[1], 1));
+      offs += MMIX_fp_rO_OFFSET;

-      if (offs >= 0 && offs <= 255)
+      if (insn_const_int_ok_for_constraint (offs, CONSTRAINT_I))
 	my_operands[0]
 	  = gen_rtx_PLUS (Pmode, stack_pointer_rtx, GEN_INT (offs));
       else
 	{
-	  mmix_output_register_setting (asm_out_file, REGNO (temp_reg),
+	  mmix_output_register_setting (asm_out_file, REGNO (operands[0]),
 					offs, 1);
-	  my_operands[0] = gen_rtx_PLUS (Pmode, stack_pointer_rtx, temp_reg);
+	  my_operands[0]
+	    = gen_rtx_PLUS (Pmode, stack_pointer_rtx, operands[0]);
 	}
     }

--- gcc/config/mmix/mmix.c.orig2	2012-10-07 05:11:52.000000000 +0200
+++ gcc/config/mmix/mmix.c	2012-10-09 01:56:17.000000000 +0200
@@ -2281,7 +2281,9 @@  mmix_output_register_setting (FILE *stre
   if (do_begin_end)
     fprintf (stream, "\t");

-  if (mmix_shiftable_wyde_value ((unsigned HOST_WIDEST_INT) value))
+  if (insn_const_int_ok_for_constraint (value, CONSTRAINT_K))
+    fprintf (stream, "NEGU %s,0," HOST_WIDEST_INT_PRINT_DEC, reg_names[regno], -value);
+  else if (mmix_shiftable_wyde_value ((unsigned HOST_WIDEST_INT) value))
     {
       /* First, the one-insn cases.  */
       mmix_output_shiftvalue_op_from_str (stream, "SET",
--- gcc/config/mmix/predicates.md.orig	2012-09-10 01:41:59.000000000 +0200
+++ gcc/config/mmix/predicates.md	2012-10-08 20:10:08.000000000 +0200
@@ -161,3 +161,14 @@ 
   (if_then_else (match_test "reload_in_progress || reload_completed")
     (match_test "strict_memory_address_p (Pmode, op)")
     (match_test "memory_address_p (Pmode, op)")))
+
+(define_predicate "frame_pointer_operand"
+  (ior
+   (and
+    (match_code "reg")
+    (match_test "op == hard_frame_pointer_rtx || op == frame_pointer_rtx"))
+   (and
+    (match_code "plus")
+    (match_code "reg" "0")
+    (match_code "const_int" "1")
+    (match_test "XEXP (op, 0) == stack_pointer_rtx"))))
--- gcc/config/mmix/constraints.md.orig	2012-09-10 01:41:59.000000000 +0200
+++ gcc/config/mmix/constraints.md	2012-10-08 09:07:48.000000000 +0200
@@ -110,3 +110,7 @@ 
 (define_address_constraint "U"
   "@internal"
   (match_operand 0 "mmix_address_operand"))
+
+(define_constraint "Yf"
+  "@internal"
+  (match_operand 0 "frame_pointer_operand"))