[3/4,ARC] Save mlo/mhi registers when ISR.

Message ID 20200122081452.4710-3-claziss@gmail.com
State New
Series
  • [1/4,ARC] Make libgcc compatible with ARC's reduced register set config.

Commit Message

Claudiu Zissulescu Ianculescu Jan. 22, 2020, 8:14 a.m. UTC
When configured with the mul64 instructions, ARC600 uses the mlo and mhi
registers to hold the 64-bit result of a multiplication. The ARC600
ISA documentation gives the following register assignment when ARC600
is configured with only the mul64 extension:

Register | Name | Use
---------+------+------------------------------------
r57      | mlo  | Multiply low 32 bits, read only
r58      | mmid | Multiply middle 32 bits, read only
r59      | mhi  | Multiply high 32 bits, read only
-----------------------------------------------------

In co-existence configurations, mul64 uses the following registers
instead:

Register | Name | Use
---------+------+------------------------------------
r58      | mlo  | Multiply low 32 bits, read only
r59      | mhi  | Multiply high 32 bits, read only
-----------------------------------------------------

Note that the mlo/mhi assignment is not swapped when a big-endian CPU
configuration is used.
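
For illustration only, the two tables can be read as a mapping from the
compiler-internal register number to the hardware number. The sketch
below is modelled on the MUL64 arm of the DBX_REGISTER_NUMBER macro in
this patch. It assumes that "co-existence" means the mul32x16 extension
is configured as well, which is how the macro reads. The function name
and its two boolean parameters are hypothetical stand-ins for
TARGET_MUL64_SET and TARGET_MULMAC_32BY16_SET; the macro's separate
handling of r56/r57 for mul32x16 is not modelled.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of GCC: translate the compiler-internal
   number of a MUL64 result register into the hardware number.  */
static int
mul64_hw_regno (int regno, bool mul64, bool mulmac_32by16)
{
  if (!mul64)
    return regno;

  /* Internally mlo is always r58; the hardware puts it at r57 unless
     the co-existence layout is in use.  */
  if (regno == 58)
    return mulmac_32by16 ? 58 : 57;

  /* In the mul64-only layout, internal r57 stands for mmid (r58).  */
  if (regno == 57 && !mulmac_32by16)
    return 58;

  /* mhi (r59) and all other registers keep their number.  */
  return regno;
}
```

Endianness does not appear as an input, in line with the note above.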

The compiler always uses r58 for mlo, regardless of the configuration
chosen, so that mlo/mhi are split correctly. Mapping mlo to the right
hardware register number is done at assembly time. The DWARF info is
adjusted to match via the DBX_... macro. Both mlo and mhi need to be
saved when an ISR runs, using a custom sequence.
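
As a rough model of the custom restore sequence in the epilogue hunk of
this patch: because mlo and mhi are read-only as core registers, the
saved mlo value is reloaded by multiplying it by one with mulu64, and
the saved mhi value is then written through the AUX_MULHI auxiliary
register. The C sketch below only simulates that effect on a pair of
variables; the struct and function names are hypothetical, and the
stall instruction that waits for the multiply to finish is not
modelled.

```c
#include <stdint.h>

/* Hypothetical model of the two MUL64 result registers.  */
struct mul64_state
{
  uint32_t mlo;
  uint32_t mhi;
};

/* Model of "mulu64 a,b": unsigned 32x32 -> 64-bit product, with the
   low half landing in mlo and the high half in mhi.  */
static void
model_mulu64 (struct mul64_state *s, uint32_t a, uint32_t b)
{
  uint64_t product = (uint64_t) a * b;
  s->mlo = (uint32_t) product;
  s->mhi = (uint32_t) (product >> 32);
}

/* Model of "sr value,[AUX_MULHI]": overwrite mhi via the aux register.  */
static void
model_sr_mulhi (struct mul64_state *s, uint32_t value)
{
  s->mhi = value;
}

/* Restore both registers from the values popped off the stack.  */
static void
model_restore (struct mul64_state *s, uint32_t saved_mlo, uint32_t saved_mhi)
{
  model_mulu64 (s, saved_mlo, 1);   /* mlo = saved_mlo, mhi = 0 */
  model_sr_mulhi (s, saved_mhi);    /* mhi = saved_mhi */
}
```

The order matters in this model: the multiply clobbers mhi, so the
auxiliary write has to come second.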

gcc/
xxxx-xx-xx  Claudiu Zissulescu  <claziss@synopsys.com>

	* config/arc/arc-protos.h (gen_mlo): Remove.
	(gen_mhi): Likewise.
	* config/arc/arc.c (AUX_MULHI): Define.
	(arc_must_save_register): Special handling for r58/59.
	(arc_compute_frame_size): Consider mlo/mhi registers.
	(arc_save_callee_saves): Emit fp/sp move only when emit_move
	parameter is true.
	(arc_conditional_register_usage): Remove TARGET_BIG_ENDIAN from
	mlo/mhi name selection.
	(arc_restore_callee_saves): Don't early restore blink when ISR.
	(arc_expand_prologue): Add mlo/mhi saving.
	(arc_expand_epilogue): Add mlo/mhi restoring.
	(gen_mlo): Remove.
	(gen_mhi): Remove.
	* config/arc/arc.h (DBX_REGISTER_NUMBER): Correct register
	numbering when MUL64 option is used.
	(DWARF2_FRAME_REG_OUT): Define.
	* config/arc/arc.md (arc600_stall): New pattern.
	(VUNSPEC_ARC_ARC600_STALL): Define.
	(mulsi64): Use correct mlo/mhi registers.
	(mulsi_600): Clean it up.
	* config/arc/predicates.md (mlo_operand): Remove any dependency on
	TARGET_BIG_ENDIAN.
	(mhi_operand): Likewise.

testsuite/
xxxx-xx-xx  Claudiu Zissulescu  <claziss@synopsys.com>
	* gcc.target/arc/code-density-flag.c: Update test.
	* gcc.target/arc/interrupt-6.c: Likewise.
---
 gcc/config/arc/arc-protos.h                   |   2 -
 gcc/config/arc/arc.c                          | 279 +++++++++++-------
 gcc/config/arc/arc.h                          |  27 +-
 gcc/config/arc/arc.md                         |  43 ++-
 gcc/config/arc/predicates.md                  |   4 +-
 .../gcc.target/arc/code-density-flag.c        |   1 +
 gcc/testsuite/gcc.target/arc/interrupt-6.c    |   2 +-
 7 files changed, 220 insertions(+), 138 deletions(-)

Comments

Jeff Law Jan. 22, 2020, 8:31 p.m. UTC | #1
Ugh.  But OK.

jeff
Claudiu Zissulescu Ianculescu Jan. 27, 2020, 1:18 p.m. UTC | #2
Yes, I know :(

Thank you for your help. All four patches pushed.
Claudiu


Patch

diff --git a/gcc/config/arc/arc-protos.h b/gcc/config/arc/arc-protos.h
index 2a13b0f19f0..c72d78e3b9e 100644
--- a/gcc/config/arc/arc-protos.h
+++ b/gcc/config/arc/arc-protos.h
@@ -75,8 +75,6 @@  extern int arc_hazard (rtx_insn *, rtx_insn *);
 extern int arc_write_ext_corereg (rtx);
 extern rtx gen_acc1 (void);
 extern rtx gen_acc2 (void);
-extern rtx gen_mlo (void);
-extern rtx gen_mhi (void);
 extern bool arc_branch_size_unknown_p (void);
 struct arc_ccfsm;
 extern void arc_ccfsm_record_condition (rtx, bool, rtx_insn *,
diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
index 25d66e9cac9..fc174361b02 100644
--- a/gcc/config/arc/arc.c
+++ b/gcc/config/arc/arc.c
@@ -213,6 +213,9 @@  static int rgf_banked_register_count;
 /* FPX AUX registers.  */
 #define AUX_DPFP_START 0x301
 
+/* ARC600 MULHI register.  */
+#define AUX_MULHI 0x12
+
 /* A nop is needed between a 4 byte insn that sets the condition codes and
    a branch that uses them (the same isn't true for an 8 byte insn that sets
    the condition codes).  Set by arc_ccfsm_advance.  Used by
@@ -1919,8 +1922,8 @@  arc_conditional_register_usage (void)
 	 this way, we don't have to carry clobbers of that reg around in every
 	 isntruction that modifies mlo and/or mhi.  */
       strcpy (rname57, "");
-      strcpy (rname58, TARGET_BIG_ENDIAN ? "mhi" : "mlo");
-      strcpy (rname59, TARGET_BIG_ENDIAN ? "mlo" : "mhi");
+      strcpy (rname58, "mlo");
+      strcpy (rname59, "mhi");
     }
 
   /* The nature of arc_tp_regno is actually something more like a global
@@ -2711,8 +2714,6 @@  arc_must_save_register (int regno, struct function *func, bool special_p)
     case R55_REG:
     case R56_REG:
     case R57_REG:
-    case R58_REG:
-    case R59_REG:
       /* The Extension Registers.  */
       if (ARC_INTERRUPT_P (fn_type)
 	  && (df_regs_ever_live_p (RETURN_ADDR_REGNUM)
@@ -2723,6 +2724,20 @@  arc_must_save_register (int regno, struct function *func, bool special_p)
 	return true;
       return false;
 
+    case R58_REG:
+    case R59_REG:
+      /* ARC600 specifies those ones as mlo/mhi registers, otherwise
+	 just handle them like any other extension register.  */
+      if (ARC_INTERRUPT_P (fn_type)
+	  && (df_regs_ever_live_p (RETURN_ADDR_REGNUM)
+	      || df_regs_ever_live_p (regno))
+	  /* Not all extension registers are available, choose the
+	     real ones.  */
+	  && ((!fixed_regs[regno] && !special_p)
+	      || (TARGET_MUL64_SET && special_p)))
+	return true;
+      return false;
+
     case 61:
     case 62:
     case 63:
@@ -2809,6 +2824,7 @@  arc_compute_frame_size (void)
   int size;
   unsigned int extra_plus_reg_size;
   unsigned int extra_plus_reg_size_aligned;
+  unsigned int fn_type = arc_compute_function_type (cfun);
 
   /* The answer might already be known.  */
   if (cfun->machine->frame_info.initialized)
@@ -2863,8 +2879,8 @@  arc_compute_frame_size (void)
 
   /* Saving blink reg for millicode thunk calls.  */
   if (TARGET_MILLICODE_THUNK_SET
-      && !crtl->calls_eh_return
-      && !ARC_INTERRUPT_P (arc_compute_function_type (cfun)))
+      && !ARC_INTERRUPT_P (fn_type)
+      && !crtl->calls_eh_return)
     {
       if (arc_compute_millicode_save_restore_regs (gmask, frame_info))
 	frame_info->save_return_addr = true;
@@ -2882,11 +2898,18 @@  arc_compute_frame_size (void)
 			      cfun, TARGET_DPFP))
     reg_size += UNITS_PER_WORD * 2;
 
+  /* Check for special MLO/MHI case used by ARC600' MUL64
+     extension.  */
+  if (arc_must_save_register (R58_REG, cfun, TARGET_MUL64_SET))
+    reg_size += UNITS_PER_WORD * 2;
+
   /* 4) Calculate extra size made up of the blink + fp size.  */
   extra_size = 0;
   if (arc_must_save_return_addr (cfun))
     extra_size = 4;
-  if (arc_frame_pointer_needed ())
+  /* Add FP size only when it is not autosaved.  */
+  if (arc_frame_pointer_needed ()
+      && !ARC_AUTOFP_IRQ_P (fn_type))
     extra_size += 4;
 
   /* 5) Space for variable arguments passed in registers */
@@ -3086,7 +3109,6 @@  pop_reg (rtx reg)
   return GET_MODE_SIZE (GET_MODE (reg));
 }
 
-
 /* Check if we have a continous range to be save/restored with the
    help of enter/leave instructions.  A vaild register range starts
    from $r13 and is up to (including) $r26.  */
@@ -3118,7 +3140,8 @@  static int
 arc_save_callee_saves (uint64_t gmask,
 		       bool save_blink,
 		       bool save_fp,
-		       HOST_WIDE_INT offset)
+		       HOST_WIDE_INT offset,
+		       bool emit_move)
 {
   rtx reg;
   int frame_allocated = 0;
@@ -3154,41 +3177,6 @@  arc_save_callee_saves (uint64_t gmask,
 	offset = 0;
       }
 
-  /* Check if we need to save the ZOL machinery.  */
-  if (arc_lpcwidth != 0 && arc_must_save_register (LP_COUNT, cfun, true))
-    {
-      rtx reg0 = gen_rtx_REG (SImode, R0_REG);
-      emit_insn (gen_rtx_SET (reg0,
-			      gen_rtx_UNSPEC_VOLATILE
-			      (Pmode, gen_rtvec (1, GEN_INT (AUX_LP_START)),
-			       VUNSPEC_ARC_LR)));
-      frame_allocated += push_reg (reg0);
-      emit_insn (gen_rtx_SET (reg0,
-			      gen_rtx_UNSPEC_VOLATILE
-			      (Pmode, gen_rtvec (1, GEN_INT (AUX_LP_END)),
-			       VUNSPEC_ARC_LR)));
-      frame_allocated += push_reg (reg0);
-      emit_move_insn (reg0, gen_rtx_REG (SImode, LP_COUNT));
-      frame_allocated += push_reg (reg0);
-    }
-
-  /* Save AUX regs used by FPX machinery.  */
-  if (arc_must_save_register (TARGET_BIG_ENDIAN ? R41_REG : R40_REG,
-			      cfun, TARGET_DPFP))
-    {
-      rtx reg0 = gen_rtx_REG (SImode, R0_REG);
-
-      for (i = 0; i < 4; i++)
-	{
-	  emit_insn (gen_rtx_SET (reg0,
-				  gen_rtx_UNSPEC_VOLATILE
-				  (Pmode, gen_rtvec (1, GEN_INT (AUX_DPFP_START
-								 + i)),
-				   VUNSPEC_ARC_LR)));
-	  frame_allocated += push_reg (reg0);
-	}
-    }
-
   /* Save frame pointer if needed.  First save the FP on stack, if not
      autosaved.  Unfortunately, I cannot add it to gmask and use the
      above loop to save fp because our ABI states fp goes aftert all
@@ -3200,7 +3188,7 @@  arc_save_callee_saves (uint64_t gmask,
     }
 
   /* Emit mov fp,sp.  */
-  if (arc_frame_pointer_needed ())
+  if (emit_move)
     frame_move (hard_frame_pointer_rtx, stack_pointer_rtx);
 
   return frame_allocated;
@@ -3219,6 +3207,7 @@  arc_restore_callee_saves (uint64_t gmask,
   rtx reg;
   int frame_deallocated = 0;
   HOST_WIDE_INT offs = cfun->machine->frame_info.reg_size;
+  unsigned int fn_type = arc_compute_function_type (cfun);
   bool early_blink_restore;
   int i;
 
@@ -3237,43 +3226,6 @@  arc_restore_callee_saves (uint64_t gmask,
       frame_deallocated += frame_restore_reg (hard_frame_pointer_rtx, 0);
     }
 
-  /* Restore AUX-regs used by FPX machinery.  */
-  if (arc_must_save_register (TARGET_BIG_ENDIAN ? R41_REG : R40_REG,
-			      cfun, TARGET_DPFP))
-    {
-      rtx reg0 = gen_rtx_REG (SImode, R0_REG);
-
-      gcc_assert (offset == 0);
-      for (i = 0; i < 4; i++)
-	{
-	  frame_deallocated += pop_reg (reg0);
-	  emit_insn (gen_rtx_UNSPEC_VOLATILE
-		     (VOIDmode, gen_rtvec (2, reg0, GEN_INT (AUX_DPFP_START
-							     + i)),
-		      VUNSPEC_ARC_SR));
-	}
-    }
-
-  /* Check if we need to restore the ZOL machinery.  */
-  if (arc_lpcwidth !=0 && arc_must_save_register (LP_COUNT, cfun, true))
-    {
-      rtx reg0 = gen_rtx_REG (SImode, R0_REG);
-
-      gcc_assert (offset == 0);
-      frame_deallocated += pop_reg (reg0);
-      emit_move_insn (gen_rtx_REG (SImode, LP_COUNT), reg0);
-
-      frame_deallocated += pop_reg (reg0);
-      emit_insn (gen_rtx_UNSPEC_VOLATILE
-		 (VOIDmode, gen_rtvec (2, reg0, GEN_INT (AUX_LP_END)),
-		  VUNSPEC_ARC_SR));
-
-      frame_deallocated += pop_reg (reg0);
-      emit_insn (gen_rtx_UNSPEC_VOLATILE
-		 (VOIDmode, gen_rtvec (2, reg0, GEN_INT (AUX_LP_START)),
-		  VUNSPEC_ARC_SR));
-    }
-
   if (offset)
     {
       /* No $fp involved, we need to do an add to set the $sp to the
@@ -3283,9 +3235,10 @@  arc_restore_callee_saves (uint64_t gmask,
       offset = 0;
     }
 
-  /* When we do not optimize for size, restore first blink.  */
+  /* When we do not optimize for size or we aren't in an interrupt,
+     restore first blink.  */
   early_blink_restore = restore_blink && !optimize_size && offs
-    && !ARC_INTERRUPT_P (arc_compute_function_type (cfun));
+    && !ARC_INTERRUPT_P (fn_type);
   if (early_blink_restore)
     {
       rtx addr = plus_constant (Pmode, stack_pointer_rtx, offs);
@@ -3829,6 +3782,7 @@  arc_expand_prologue (void)
   unsigned int fn_type = arc_compute_function_type (cfun);
   bool save_blink = false;
   bool save_fp = false;
+  bool emit_move = false;
 
   /* Naked functions don't have prologue.  */
   if (ARC_NAKED_P (fn_type))
@@ -3866,7 +3820,9 @@  arc_expand_prologue (void)
 
   save_blink = arc_must_save_return_addr (cfun)
     && !ARC_AUTOBLINK_IRQ_P (fn_type);
-  save_fp = arc_frame_pointer_needed () && !ARC_AUTOFP_IRQ_P (fn_type);
+  save_fp = arc_frame_pointer_needed () && !ARC_AUTOFP_IRQ_P (fn_type)
+    && !ARC_INTERRUPT_P (fn_type);
+  emit_move = arc_frame_pointer_needed () && !ARC_INTERRUPT_P (fn_type);
 
   /* Use enter/leave only for non-interrupt functions.  */
   if (TARGET_CODE_DENSITY
@@ -3885,7 +3841,55 @@  arc_expand_prologue (void)
 						     frame->reg_size);
   else
     frame_size_to_allocate -= arc_save_callee_saves (gmask, save_blink, save_fp,
-						     first_offset);
+						     first_offset, emit_move);
+
+  /* Check if we need to save the ZOL machinery.  */
+  if (arc_lpcwidth != 0 && arc_must_save_register (LP_COUNT, cfun, true))
+    {
+      rtx reg0 = gen_rtx_REG (SImode, R0_REG);
+      emit_insn (gen_rtx_SET (reg0,
+			      gen_rtx_UNSPEC_VOLATILE
+			      (Pmode, gen_rtvec (1, GEN_INT (AUX_LP_START)),
+			       VUNSPEC_ARC_LR)));
+      frame_size_to_allocate -= push_reg (reg0);
+      emit_insn (gen_rtx_SET (reg0,
+			      gen_rtx_UNSPEC_VOLATILE
+			      (Pmode, gen_rtvec (1, GEN_INT (AUX_LP_END)),
+			       VUNSPEC_ARC_LR)));
+      frame_size_to_allocate -= push_reg (reg0);
+      emit_move_insn (reg0, gen_rtx_REG (SImode, LP_COUNT));
+      frame_size_to_allocate -= push_reg (reg0);
+    }
+
+  /* Save AUX regs used by FPX machinery.  */
+  if (arc_must_save_register (TARGET_BIG_ENDIAN ? R41_REG : R40_REG,
+			      cfun, TARGET_DPFP))
+    {
+      rtx reg0 = gen_rtx_REG (SImode, R0_REG);
+      int i;
+
+      for (i = 0; i < 4; i++)
+	{
+	  emit_insn (gen_rtx_SET (reg0,
+				  gen_rtx_UNSPEC_VOLATILE
+				  (Pmode, gen_rtvec (1, GEN_INT (AUX_DPFP_START
+								 + i)),
+				   VUNSPEC_ARC_LR)));
+	  frame_size_to_allocate -= push_reg (reg0);
+	}
+    }
+
+  /* Save ARC600' MUL64 registers.  */
+  if (arc_must_save_register (R58_REG, cfun, true))
+    frame_size_to_allocate -= arc_save_callee_saves (3ULL << 58,
+						     false, false, 0, false);
+
+  if (arc_frame_pointer_needed () && ARC_INTERRUPT_P (fn_type))
+    {
+      /* Just save fp at the end of the saving context.  */
+      frame_size_to_allocate -=
+	arc_save_callee_saves (0, false, !ARC_AUTOFP_IRQ_P (fn_type), 0, true);
+    }
 
   /* Allocate the stack frame.  */
   if (frame_size_to_allocate > 0)
@@ -3956,6 +3960,74 @@  arc_expand_epilogue (int sibcall_p)
   if (size)
     emit_insn (gen_blockage ());
 
+  if (ARC_INTERRUPT_P (fn_type) && restore_fp)
+    {
+      /* We need to restore FP before any SP operation in an
+	 interrupt.  */
+      size_to_deallocate -= arc_restore_callee_saves (0, false,
+						      restore_fp,
+						      first_offset,
+						      size_to_deallocate);
+      restore_fp = false;
+      first_offset = 0;
+    }
+
+  /* Restore ARC600' MUL64 registers.  */
+  if (arc_must_save_register (R58_REG, cfun, true))
+    {
+      rtx insn;
+      rtx reg0 = gen_rtx_REG (SImode, R0_REG);
+      rtx reg1 = gen_rtx_REG (SImode, R1_REG);
+      size_to_deallocate -= pop_reg (reg0);
+      size_to_deallocate -= pop_reg (reg1);
+
+      insn = emit_insn (gen_mulu64 (reg0, const1_rtx));
+      add_reg_note (insn, REG_CFA_RESTORE, gen_rtx_REG (SImode, R58_REG));
+      RTX_FRAME_RELATED_P (insn) = 1;
+      emit_insn (gen_arc600_stall ());
+      insn = emit_insn (gen_rtx_UNSPEC_VOLATILE
+			(VOIDmode, gen_rtvec (2, reg1, GEN_INT (AUX_MULHI)),
+			 VUNSPEC_ARC_SR));
+      add_reg_note (insn, REG_CFA_RESTORE, gen_rtx_REG (SImode, R59_REG));
+      RTX_FRAME_RELATED_P (insn) = 1;
+    }
+
+  /* Restore AUX-regs used by FPX machinery.  */
+  if (arc_must_save_register (TARGET_BIG_ENDIAN ? R41_REG : R40_REG,
+			      cfun, TARGET_DPFP))
+    {
+      rtx reg0 = gen_rtx_REG (SImode, R0_REG);
+      int i;
+
+      for (i = 0; i < 4; i++)
+	{
+	  size_to_deallocate -= pop_reg (reg0);
+	  emit_insn (gen_rtx_UNSPEC_VOLATILE
+		     (VOIDmode, gen_rtvec (2, reg0, GEN_INT (AUX_DPFP_START
+							     + i)),
+		      VUNSPEC_ARC_SR));
+	}
+    }
+
+  /* Check if we need to restore the ZOL machinery.  */
+  if (arc_lpcwidth !=0 && arc_must_save_register (LP_COUNT, cfun, true))
+    {
+      rtx reg0 = gen_rtx_REG (SImode, R0_REG);
+
+      size_to_deallocate -= pop_reg (reg0);
+      emit_move_insn (gen_rtx_REG (SImode, LP_COUNT), reg0);
+
+      size_to_deallocate -= pop_reg (reg0);
+      emit_insn (gen_rtx_UNSPEC_VOLATILE
+		 (VOIDmode, gen_rtvec (2, reg0, GEN_INT (AUX_LP_END)),
+		  VUNSPEC_ARC_SR));
+
+      size_to_deallocate -= pop_reg (reg0);
+      emit_insn (gen_rtx_UNSPEC_VOLATILE
+		 (VOIDmode, gen_rtvec (2, reg0, GEN_INT (AUX_LP_START)),
+		  VUNSPEC_ARC_SR));
+    }
+
   if (TARGET_CODE_DENSITY
       && TARGET_CODE_DENSITY_FRAME
       && !ARC_AUTOFP_IRQ_P (fn_type)
@@ -9825,25 +9897,28 @@  gen_acc2 (void)
   return gen_rtx_REG (SImode, TARGET_BIG_ENDIAN ? 57: 56);
 }
 
-/* Return a REG rtx for mlo.  N.B. the gcc-internal representation may
-   differ from the hardware register number in order to allow the generic
-   code to correctly split the concatenation of mhi and mlo.  */
-
-rtx
-gen_mlo (void)
+/* FIXME: a parameter should be added, and code added to final.c,
+   to reproduce this functionality in shorten_branches.  */
+#if 0
+/* Return nonzero iff BRANCH should be unaligned if possible by upsizing
+   a previous instruction.  */
+int
+arc_unalign_branch_p (rtx branch)
 {
-  return gen_rtx_REG (SImode, TARGET_BIG_ENDIAN ? 59: 58);
-}
-
-/* Return a REG rtx for mhi.  N.B. the gcc-internal representation may
-   differ from the hardware register number in order to allow the generic
-   code to correctly split the concatenation of mhi and mlo.  */
+  rtx note;
 
-rtx
-gen_mhi (void)
-{
-  return gen_rtx_REG (SImode, TARGET_BIG_ENDIAN ? 58: 59);
+  if (!TARGET_UNALIGN_BRANCH)
+    return 0;
+  /* Do not do this if we have a filled delay slot.  */
+  if (get_attr_delay_slot_filled (branch) == DELAY_SLOT_FILLED_YES
+      && !NEXT_INSN (branch)->deleted ())
+    return 0;
+  note = find_reg_note (branch, REG_BR_PROB, 0);
+  return (!note
+	  || (arc_unalign_prob_threshold && !br_prob_note_reliable_p (note))
+	  || INTVAL (XEXP (note, 0)) < arc_unalign_prob_threshold);
 }
+#endif
 
 /* When estimating sizes during arc_reorg, when optimizing for speed, there
    are three reasons why we need to consider branches to be length 6:
diff --git a/gcc/config/arc/arc.h b/gcc/config/arc/arc.h
index 6fea9228662..1f4a8aafbef 100644
--- a/gcc/config/arc/arc.h
+++ b/gcc/config/arc/arc.h
@@ -1338,19 +1338,30 @@  do { \
 #define PREFERRED_DEBUGGING_TYPE DWARF2_DEBUG
 
 /* How to renumber registers for dbx and gdb.  */
-#define DBX_REGISTER_NUMBER(REGNO) \
+#define DBX_REGISTER_NUMBER(REGNO)				\
   ((TARGET_MULMAC_32BY16_SET && (REGNO) >= 56 && (REGNO) <= 57) \
-   ? ((REGNO) ^ !TARGET_BIG_ENDIAN) \
-   : (TARGET_MUL64_SET && (REGNO) >= 57 && (REGNO) <= 59) \
-   ? ((REGNO) == 57 \
-      ? 58 /* MMED */ \
-      : ((REGNO) & 1) ^ TARGET_BIG_ENDIAN \
-      ? 59 /* MHI */ \
-      : 57 + !!TARGET_MULMAC_32BY16_SET) /* MLO */ \
+   ? ((REGNO) ^ !TARGET_BIG_ENDIAN)				\
+   : (TARGET_MUL64_SET && (REGNO) >= 57 && (REGNO) <= 58)	\
+   ? (((REGNO) == 57)						\
+      ? 58 /* MMED */						\
+      : 57 + !!TARGET_MULMAC_32BY16_SET) /* MLO */		\
    : (REGNO))
 
+/* Use gcc hard register numbering for eh_frame.  */
 #define DWARF_FRAME_REGNUM(REG) (REG)
 
+/* Map register numbers held in the call frame info that gcc has
+   collected using DWARF_FRAME_REGNUM to those that should be output
+   in .debug_frame and .eh_frame.  */
+#define DWARF2_FRAME_REG_OUT(REGNO, FOR_EH)			\
+  ((TARGET_MULMAC_32BY16_SET && (REGNO) >= 56 && (REGNO) <= 57) \
+   ? ((REGNO) ^ !TARGET_BIG_ENDIAN)				\
+   : (TARGET_MUL64_SET && (REGNO) >= 57 && (REGNO) <= 58)	\
+   ? (((REGNO) == 57)						\
+      ? 58 /* MMED */						\
+      : 57 + !!TARGET_MULMAC_32BY16_SET) /* MLO */		\
+   : (REGNO))
+
 #define DWARF_FRAME_RETURN_COLUMN 	DWARF_FRAME_REGNUM (31)
 
 #define INCOMING_RETURN_ADDR_RTX  gen_rtx_REG (Pmode, 31)
diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
index 32707bc257b..9a96440025f 100644
--- a/gcc/config/arc/arc.md
+++ b/gcc/config/arc/arc.md
@@ -164,6 +164,7 @@ 
   VUNSPEC_ARC_BLOCKAGE
   VUNSPEC_ARC_EH_RETURN
   VUNSPEC_ARC_ARC600_RTIE
+  VUNSPEC_ARC_ARC600_STALL
   VUNSPEC_ARC_LDDI
   VUNSPEC_ARC_STDI
   ])
@@ -2251,6 +2252,9 @@  archs4x, archs4xd"
    (set_attr "predicable" "no, no, yes")
    (set_attr "cond" "nocond, canuse_limm, canuse")])
 
+; The gcc-internal representation may differ from the hardware
+; register number in order to allow the generic code to correctly
+; split the concatenation of mhi and mlo.
 (define_insn_and_split "mulsi64"
  [(set (match_operand:SI 0 "register_operand"            "=w")
 	(mult:SI (match_operand:SI 1 "register_operand"  "%c")
@@ -2260,12 +2264,13 @@  archs4x, archs4xd"
  "#"
  "TARGET_MUL64_SET && reload_completed"
   [(const_int 0)]
-{
-  emit_insn (gen_mulsi_600 (operands[1], operands[2],
-			gen_mlo (), gen_mhi ()));
-  emit_move_insn (operands[0], gen_mlo ());
-  DONE;
-}
+  {
+   rtx mhi = gen_rtx_REG (SImode, R59_REG);
+   rtx mlo = gen_rtx_REG (SImode, R58_REG);
+   emit_insn (gen_mulsi_600 (operands[1], operands[2], mlo, mhi));
+   emit_move_insn (operands[0], mlo);
+   DONE;
+  }
   [(set_attr "type" "multi")
    (set_attr "length" "8")])
 
@@ -2275,23 +2280,7 @@  archs4x, archs4xd"
 		 (match_operand:SI 1 "nonmemory_operand" "Rcq#q,cL,I,Cal")))
    (clobber (match_operand:SI 3 "mhi_operand" ""))]
   "TARGET_MUL64_SET"
-; The assembler mis-assembles mul64 / mulu64 with "I" constraint constants,
-; using a machine code pattern that only allows "L" constraint constants.
-;  "mul64%? \t0, %0, %1%&"
-{
-  if (satisfies_constraint_I (operands[1])
-      && !satisfies_constraint_L (operands[1]))
-    {
-      /* MUL64 <0,>b,s12 00101bbb10000100 0BBBssssssSSSSSS  */
-      int n = true_regnum (operands[0]);
-      int i = INTVAL (operands[1]);
-      asm_fprintf (asm_out_file, "\t.short %d`", 0x2884 + ((n & 7) << 8));
-      asm_fprintf (asm_out_file, "\t.short %d`",
-		   ((i & 0x3f) << 6) + ((i >> 6) & 0x3f) + ((n & 070) << 9));
-      return "; mul64%? \t0, %0, %1%&";
-    }
-  return "mul64%? \t0, %0, %1%&";
-}
+  "mul64%?\\t0,%0,%1"
   [(set_attr "length" "*,4,4,8")
    (set_attr "iscompact" "maybe,false,false,false")
    (set_attr "type" "multi,multi,multi,multi")
@@ -4311,6 +4300,14 @@  archs4x, archs4xd"
    (set_attr "type" "block")]
 )
 
+(define_insn "arc600_stall"
+  [(unspec_volatile [(const_int 0)] VUNSPEC_ARC_ARC600_STALL)]
+  "TARGET_MUL64_SET"
+  "mov\\t0,mlo\t;wait until multiply complete."
+  [(set_attr "length" "4")
+   (set_attr "type" "move")]
+)
+
 ;; Split up troublesome insns for better scheduling.
 
 ;; Peepholes go at the end.
diff --git a/gcc/config/arc/predicates.md b/gcc/config/arc/predicates.md
index 286a5e69b89..3c03436c901 100644
--- a/gcc/config/arc/predicates.md
+++ b/gcc/config/arc/predicates.md
@@ -688,11 +688,11 @@ 
 
 (define_predicate "mlo_operand"
   (and (match_code "reg")
-       (match_test "REGNO (op) == (TARGET_BIG_ENDIAN ? 59 : 58)")))
+       (match_test "REGNO (op) == R58_REG")))
 
 (define_predicate "mhi_operand"
   (and (match_code "reg")
-       (match_test "REGNO (op) == (TARGET_BIG_ENDIAN ? 58 : 59)")))
+       (match_test "REGNO (op) == R59_REG")))
 
 (define_predicate "accl_operand"
   (and (match_code "reg")
diff --git a/gcc/testsuite/gcc.target/arc/code-density-flag.c b/gcc/testsuite/gcc.target/arc/code-density-flag.c
index 1ecf1a2ca29..7ff8042f14c 100644
--- a/gcc/testsuite/gcc.target/arc/code-density-flag.c
+++ b/gcc/testsuite/gcc.target/arc/code-density-flag.c
@@ -11,6 +11,7 @@ 
  * as well, else it is going to choke on such encodings.    */
 
 /* { dg-do assemble }                                       */
+/* { dg-do compile }                                        */
 /* { dg-skip-if "" { ! { clmcpu } } }                       */
 /* { dg-options "-mcpu=em_mini -mcode-density" }            */
 
diff --git a/gcc/testsuite/gcc.target/arc/interrupt-6.c b/gcc/testsuite/gcc.target/arc/interrupt-6.c
index 9cb0565f55c..b3d2bed4537 100644
--- a/gcc/testsuite/gcc.target/arc/interrupt-6.c
+++ b/gcc/testsuite/gcc.target/arc/interrupt-6.c
@@ -17,5 +17,5 @@  foo(void)
   bar (p);
 }
 /* { dg-final { scan-assembler-not ".*fp,\\\[sp" } } */
-/* { dg-final { scan-assembler "ld.*blink,\\\[sp" } } */
+/* { dg-final { scan-assembler "pop_s.*blink" } } */
 /* { dg-final { scan-assembler "push_s.*blink" } } */