
Patches to fix GCC’s C++ exception handling on NetBSD/VAX

Message ID DB0276A6-0A26-44F7-8FDC-D086EA40C709@me.com
State New

Commit Message

Jake Hamby March 26, 2016, 11:56 a.m. UTC
> On Mar 23, 2016, at 05:56, Christos Zoulas <christos@astron.com> wrote:
> 
> In article <F48D0C6B-A6DB-410B-BC97-C30D4E8B4612@me.com>,
> Jake Hamby  <jehamby420@me.com> wrote:
> 
> Hi,
> 
> Thanks a lot for your patch. I applied it to our gcc-5 in the tree.
> Unfortunately gcc-5 seems that it was never tested to even compile.
> I fixed the simple compilation issue, but it fails to compile some
> files with an internal error trying to construct a dwarf frame info
> element with the wrong register. If you have some time, can you
> take a look? I will too.
> 
> Thanks,
> 
> christos

Hi Christos,

I just rebased my patches on top of GCC 5.3, which I see you have recently switched to. Here’s a brief explanation of how I patched the dwarf frame error.

The problem is that FIRST_PSEUDO_REGISTER should be increased from 16 to 17, because PSW is considered a real register for DWARF debug purposes. This necessitated changing a few other macros, but nothing major. Since it’s marked as one of the FIXED_REGISTERS, it never actually gets used. Currently I’m doing a full build with GCC 5.3 and CONFIGURE_ARGS += --enable-checking=all, which is very much slower, of course.
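
To make the shape of that change concrete, here is a minimal, self-contained sketch of how the per-register tables have to grow together once PSW is counted. It is not the real vax.h: every sketch_/SKETCH_ name is hypothetical, and the table contents simply mirror the values in the patch below.

```c
#include <stdbool.h>

/* Hypothetical mirror of the patched vax.h tables.  All sketch_/SKETCH_
   names are illustrative and are not GCC identifiers.  */
enum
{
  SKETCH_PC_REGNUM = 15,
  SKETCH_PSW_REGNUM = 16,
  SKETCH_FIRST_PSEUDO_REGISTER = 17	/* was 16 before PSW was counted */
};

/* Order: r0-r11, then AP, FP, SP, PC, PSW.  */
static const char sketch_fixed_regs[] =
  {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1};
static const char sketch_call_used_regs[] =
  {1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1};
static const char *const sketch_reg_names[] =
  {"r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
   "r8", "r9", "r10", "r11", "ap", "fp", "sp", "pc", "psw"};

/* Every per-register table must have exactly one entry per hard register.  */
_Static_assert (sizeof sketch_fixed_regs == SKETCH_FIRST_PSEUDO_REGISTER,
		"fixed-register table length");
_Static_assert (sizeof sketch_call_used_regs == SKETCH_FIRST_PSEUDO_REGISTER,
		"call-used table length");
_Static_assert (sizeof sketch_reg_names / sizeof sketch_reg_names[0]
		== SKETCH_FIRST_PSEUDO_REGISTER,
		"register-name table length");

/* Number of hard registers the allocator may hand out (non-fixed ones).  */
static int
sketch_allocatable_regs (void)
{
  int count = 0;
  for (int regno = 0; regno < SKETCH_FIRST_PSEUDO_REGISTER; regno++)
    if (!sketch_fixed_regs[regno])
      count++;
  return count;
}

/* Hard-register half of a base/index check: r0 through PC qualify, while
   PSW must never appear in an address.  */
static bool
sketch_hard_regno_ok_for_base (int regno)
{
  return regno >= 0 && regno <= SKETCH_PC_REGNUM;
}
```

In this model the allocatable count stays at 12 (r0 to r11) after PSW is added, because the new entry is marked fixed, and the address check has to compare against the PC register number rather than the register count.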

One bug I discovered with --enable-checking=all on GCC 4.8.5 is a call to XEXP() that may not be valid, but which can be prevented by checking the other conditions first, and then calling XEXP() if the other conditions are true.
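
The ordering issue can be shown with a simplified model. The sketch below is not GCC's RTL: struct sketch_rtx, sketch_xexp0 and the other sketch_ names are hypothetical stand-ins, with an assert playing the role of the RTL checking that --enable-checking turns on.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for an RTL node.  */
enum sketch_code { SK_REG, SK_MEM, SK_PRE_DEC, SK_POST_INC, SK_CONST_INT };

struct sketch_rtx
{
  enum sketch_code code;
  union
  {
    struct sketch_rtx *operand;	/* valid for MEM, PRE_DEC, POST_INC */
    long value;			/* valid for CONST_INT */
    int regno;			/* valid for REG */
  } u;
};

static bool
sketch_has_rtx_operand (const struct sketch_rtx *x)
{
  return x->code == SK_MEM || x->code == SK_PRE_DEC || x->code == SK_POST_INC;
}

/* Checked accessor: aborts if operand 0 of X is not itself a node, which
   is roughly what a checking build does for an invalid access.  */
static struct sketch_rtx *
sketch_xexp0 (const struct sketch_rtx *x)
{
  assert (sketch_has_rtx_operand (x));
  return x->u.operand;
}

static bool
sketch_base_register_p (const struct sketch_rtx *x)
{
  return x->code == SK_REG && x->u.regno >= 0 && x->u.regno <= 15;
}

/* Test the code first and only then fetch the operand.  Fetching it up
   front would trip the assertion for a CONST_INT or a REG.  */
static bool
sketch_autoinc_base_p (const struct sketch_rtx *x)
{
  if (x->code == SK_PRE_DEC || x->code == SK_POST_INC)
    return sketch_base_register_p (sketch_xexp0 (x));
  return false;
}
```

Calling sketch_autoinc_base_p on a constant node simply returns false here, whereas a version that fetched operand 0 before looking at the code would abort in the checked accessor.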

There seems to be a code generation bug with C++ that only affects a few things. Unfortunately, GCC itself (the native version, not the cross compiler) is one of the programs affected. The symptom when compiling almost anything complex in GCC 4.8.5 is a stack overflow as it recursively loops around trying to expand bits and pieces of the insns. It seems to be branching the wrong way.

In looking at this, I discovered one really broken issue in the current vax.md, namely the three peephole optimizations at the bottom. The comment on the bottom one that says “Leave this commented out until we can determine whether the second move precedes a jump which relies on the CC flags being set correctly.” is absolutely correct and I believe all three should be removed. I’ve done so in the diff below, and added a comment explaining why.
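
To illustrate why merging two 32-bit moves into one 64-bit move can change what a later branch sees, here is a small model of the N and Z flags. It assumes that a move sets N and Z from the value it moved, at the width of that move; the sketch_ names are hypothetical and the V and C bits are ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Only the N and Z condition bits are modeled.  */
struct sketch_flags
{
  bool n;	/* moved value is negative */
  bool z;	/* moved value is zero */
};

/* Flags after a 32-bit move of VALUE.  */
static struct sketch_flags
sketch_flags_after_movl (uint32_t value)
{
  struct sketch_flags f = { (value >> 31) != 0, value == 0 };
  return f;
}

/* Flags after a 64-bit move of VALUE.  */
static struct sketch_flags
sketch_flags_after_movq (uint64_t value)
{
  struct sketch_flags f = { (value >> 63) != 0, value == 0 };
  return f;
}

/* Would a branch after the moves see the same N and Z whether the pair
   HI:LO is moved as two 32-bit halves or as one 64-bit value?  With two
   moves the flags describe only the half that was moved last.  */
static bool
sketch_merge_keeps_flags (uint32_t lo, uint32_t hi, bool high_moved_last)
{
  struct sketch_flags split
    = sketch_flags_after_movl (high_moved_last ? hi : lo);
  struct sketch_flags merged
    = sketch_flags_after_movq (((uint64_t) hi << 32) | lo);
  return split.n == merged.n && split.z == merged.z;
}
```

For example, with lo = 0 and hi = 1 moved low half last, the two-move sequence leaves Z set while the merged move leaves it clear, so a following branch on equality would go the other way.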

I have a theory that the source of the code generation bugs in GCC’s output (which I fear GCC 5.3 won’t necessarily fix, even if the system itself is completely stable) is that the CC0 notification handler isn’t doing the right thing. I’ll send another email if I make any progress on this issue, but what I’ve discovered so far is that it makes no sense for VAX to switch away from the CC0 style of condition handling, because almost every single instruction updates the PSW flags, and in a similar way. So the older style is really well suited to VAX, but it took me a very long time to understand exactly what vax_notice_update_cc() is doing and why its correct behavior is so important.

The idea is that some part of the optimizer is able to remove explicit tst / cmp insns when a previous one has already set the flags in a way that’s useful. So my theory is that the current version mostly works, but it’s very difficult to understand as it’s written, and it may be very occasionally giving the wrong results due to either bugs in its logic or the instructions emitted not setting the flags in the usual way. Unfortunately, the m68k version of NOTICE_UPDATE_CC (the only other arch supported by NetBSD that uses the CC0 style) is even more convoluted.

So what I’d like to try next is rewriting the VAX version of notice_update_cc to use the style that the AVR backend and others use which is to add a “cc” attribute to the instructions themselves in the .md file describing what each one does in terms of flags. Then the notify function itself becomes much simpler, and, hopefully, more likely to be correct. I did spend a few hours yesterday investigating whether it would make sense to convert VAX to the newer condition codes style that all of the modern CPUs use, but because nearly every instruction, including all of the move instructions, clobbers / updates PSW, it would be much uglier than adding a cc attribute to every opcode, and, what’s worse, it caused all sorts of breakage in the compiler when I tried it because the (clobber (reg:CC VAX_PSW_REGNUM)) lines I had added prevented it from matching instructions.
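
A rough sketch of what an attribute-driven handler could look like follows. It is not the AVR or VAX code: the attribute values, struct sketch_cc_status and sketch_notice_update_cc are all hypothetical, and the real handler would read the attribute from the insn pattern rather than from a struct field.

```c
#include <stdbool.h>

/* Hypothetical per-insn "cc" attribute values.  */
enum sketch_cc_attr
{
  SKETCH_CC_NONE,	/* insn leaves the flags alone */
  SKETCH_CC_SET_ZN,	/* N and Z describe the destination value */
  SKETCH_CC_COMPARE,	/* flags describe a two-operand comparison */
  SKETCH_CC_CLOBBER	/* flags are left in an unknown state */
};

/* What the compiler currently knows about the flags.  */
struct sketch_cc_status
{
  bool valid;		/* do the flags describe a known register? */
  int value_reg;	/* that register, or -1 */
};

/* Hypothetical insn summary: its cc attribute and destination register.  */
struct sketch_insn
{
  enum sketch_cc_attr cc;
  int dest_reg;
};

static void
sketch_cc_status_init (struct sketch_cc_status *st)
{
  st->valid = false;
  st->value_reg = -1;
}

/* With the attribute attached to each insn, the handler is one switch.  */
static void
sketch_notice_update_cc (struct sketch_cc_status *st,
			 const struct sketch_insn *insn)
{
  switch (insn->cc)
    {
    case SKETCH_CC_NONE:
      /* Flags untouched, but they go stale if the tracked register
	 itself is overwritten.  */
      if (st->valid && insn->dest_reg == st->value_reg)
	sketch_cc_status_init (st);
      break;
    case SKETCH_CC_SET_ZN:
      st->valid = true;
      st->value_reg = insn->dest_reg;
      break;
    case SKETCH_CC_COMPARE:
    case SKETCH_CC_CLOBBER:
      /* Nothing here can be reused as a "register versus zero" test.  */
      sketch_cc_status_init (st);
      break;
    }
}
```

The point of the design is that the per-instruction knowledge lives next to each pattern in the .md file, so the handler no longer has to reverse-engineer what an instruction did to the flags from its RTL.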

I did find a patch on the GCC Bugzilla for a different architecture, where someone had gone down the lines of fixing the emit stuff where I found breakage. But VAX is so well tuned to the CC0 style that it makes more sense to refactor it in place in the style of Atmel AVR, NEC V850, Renesas H8/300, all of which are built from a similar template. The m68k backend could benefit from being refactored in a similar way, but I’ll focus on VAX for now. :-)

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50582

Even with the patches attached to that bug, and the FIRST_PSEUDO_REGISTER that I’ve included below, it was still crashing because it can’t find any move insns that don’t clobber PSW. So it seems futile to switch away from CC0, because it’s so much better tuned to VAX, where almost every instruction compares the dest to 0, or to the source, which is usually where a branch would want to go.

The crash I’m seeing in GCC itself is in C++ code, so I had hoped that by commenting out the peephole optimizations at the bottom of vax.md I might fix it, but sadly, no. I’ll try it with the GCC 5.3 build as soon as it finishes. I’m very excited that the new binutils work (with this patch), and that “-O2” is as stable as the “-O1…” special CFLAGS that are currently defined for VAX (which can be removed). Here is my current patch set. The GCC patches have only been tested with GCC 4.8.5, but I rebased them and they compile and run, so they should be reasonably safe.


@@ -742,7 +742,7 @@
 ;; Rotate right on the VAX works by negating the shift count.
 (define_expand "rotrsi3"
   [(set (match_operand:SI 0 "general_operand" "=g")
-	(rotatert:SI (match_operand:SI 1 "general_operand" "g")
+	(rotatert:SI (match_operand:SI 1 "general_operand" "nrmT")
 		     (match_operand:QI 2 "general_operand" "g")))]
   ""
   "
@@ -789,6 +789,9 @@
 ;; netbsd changed this to REG_P (operands[0]) || (MEM_P (operands[0]) && ...
 ;; but gcc made it just !MEM_P (operands[0]) || ...
 
+;; netbsd changed this to REG_P (operands[0]) || (MEM_P (operands[0]) && ...
+;; but gcc made it just !MEM_P (operands[0]) || ...
+
 (define_insn ""
   [(set (zero_extract:SI (match_operand:SI 0 "register_operand" "+ro")
 			 (match_operand:QI 1 "const_int_operand" "n")
@@ -1216,7 +1219,7 @@
 	 (gt (plus:SI (match_operand:SI 0 "nonimmediate_operand" "+g")
 		      (const_int -1))
 	     (const_int 0))
-	 (label_ref (match_operand 1 "" ""))
+	 (label_ref (match_operand 1))
 	 (pc)))
    (set (match_dup 0)
 	(plus:SI (match_dup 0)
@@ -1230,7 +1233,7 @@
 	 (ge (plus:SI (match_operand:SI 0 "nonimmediate_operand" "+g")
 		      (const_int -1))
 	     (const_int 0))
-	 (label_ref (match_operand 1 "" ""))
+	 (label_ref (match_operand 1))
 	 (pc)))
    (set (match_dup 0)
 	(plus:SI (match_dup 0)
@@ -1245,7 +1248,7 @@
 	 (lt (plus:SI (match_operand:SI 0 "nonimmediate_operand" "+g")
 		      (const_int 1))
 	     (match_operand:SI 1 "general_operand" "nrmT"))
-	 (label_ref (match_operand 2 "" ""))
+	 (label_ref (match_operand 2))
 	 (pc)))
    (set (match_dup 0)
 	(plus:SI (match_dup 0)
@@ -1258,7 +1261,7 @@
 	(if_then_else
 	 (lt (match_operand:SI 0 "nonimmediate_operand" "+g")
 	     (match_operand:SI 1 "general_operand" "nrmT"))
-	 (label_ref (match_operand 2 "" ""))
+	 (label_ref (match_operand 2))
 	 (pc)))
    (set (match_dup 0)
 	(plus:SI (match_dup 0)
@@ -1272,7 +1275,7 @@
 	 (le (plus:SI (match_operand:SI 0 "nonimmediate_operand" "+g")
 		      (const_int 1))
 	     (match_operand:SI 1 "general_operand" "nrmT"))
-	 (label_ref (match_operand 2 "" ""))
+	 (label_ref (match_operand 2))
 	 (pc)))
    (set (match_dup 0)
 	(plus:SI (match_dup 0)
@@ -1285,7 +1288,7 @@
 	(if_then_else
 	 (le (match_operand:SI 0 "nonimmediate_operand" "+g")
 	     (match_operand:SI 1 "general_operand" "nrmT"))
-	 (label_ref (match_operand 2 "" ""))
+	 (label_ref (match_operand 2))
 	 (pc)))
    (set (match_dup 0)
 	(plus:SI (match_dup 0)
@@ -1309,6 +1312,11 @@
   ""
   "decl %0\;jgequ %l1")
 

+;; Note that operand 1 is total size of args, in bytes,
+;; and what the call insn wants is the number of words.
+;; It is used in the call instruction as a byte, but in the addl2 as
+;; a word.  Since the only time we actually use it in the call instruction
+;; is when it is a constant, SImode (for addl2) is the proper mode.
 (define_expand "call_pop"
   [(parallel [(call (match_operand:QI 0 "memory_operand" "")
 		    (match_operand:SI 1 "const_int_operand" ""))
@@ -1317,12 +1325,7 @@
 			    (match_operand:SI 3 "immediate_operand" "")))])]
   ""
 {
-  gcc_assert (INTVAL (operands[3]) <= 255 * 4 && INTVAL (operands[3]) % 4 == 0);
-
-  /* Operand 1 is the number of bytes to be popped by DW_CFA_GNU_args_size
-     during EH unwinding.  We must include the argument count pushed by
-     the calls instruction.  */
-  operands[1] = GEN_INT (INTVAL (operands[3]) + 4);
+  gcc_assert (INTVAL (operands[1]) <= 255 * 4);
 })
 
 (define_insn "*call_pop"
@@ -1332,10 +1335,11 @@
 					(match_operand:SI 2 "immediate_operand" "i")))]
   ""
 {
-  operands[1] = GEN_INT ((INTVAL (operands[1]) - 4) / 4);
+  operands[1] = GEN_INT ((INTVAL (operands[1]) + 3) / 4);
   return "calls %1,%0";
 })
 
+
 (define_expand "call_value_pop"
   [(parallel [(set (match_operand 0 "" "")
 		   (call (match_operand:QI 1 "memory_operand" "")
@@ -1345,12 +1349,7 @@
 			    (match_operand:SI 4 "immediate_operand" "")))])]
   ""
 {
-  gcc_assert (INTVAL (operands[4]) <= 255 * 4 && INTVAL (operands[4]) % 4 == 0);
-
-  /* Operand 2 is the number of bytes to be popped by DW_CFA_GNU_args_size
-     during EH unwinding.  We must include the argument count pushed by
-     the calls instruction.  */
-  operands[2] = GEN_INT (INTVAL (operands[4]) + 4);
+  gcc_assert (INTVAL (operands[2]) <= 255 * 4);
 })
 
 (define_insn "*call_value_pop"
@@ -1360,47 +1359,24 @@
    (set (reg:SI VAX_SP_REGNUM) (plus:SI (reg:SI VAX_SP_REGNUM)
 					(match_operand:SI 3 "immediate_operand" "i")))]
   ""
-  "*
 {
-  operands[2] = GEN_INT ((INTVAL (operands[2]) - 4) / 4);
-  return \"calls %2,%1\";
-}")
+  operands[2] = GEN_INT ((INTVAL (operands[2]) + 3) / 4);
+  return "calls %2,%1";
+})
 
-(define_expand "call"
-  [(call (match_operand:QI 0 "memory_operand" "")
-      (match_operand:SI 1 "const_int_operand" ""))]
-  ""
-  "
-{
-  /* Operand 1 is the number of bytes to be popped by DW_CFA_GNU_args_size
-     during EH unwinding.  We must include the argument count pushed by
-     the calls instruction.  */
-  operands[1] = GEN_INT (INTVAL (operands[1]) + 4);
-}")
 
-(define_insn "*call"
-   [(call (match_operand:QI 0 "memory_operand" "m")
-	  (match_operand:SI 1 "const_int_operand" ""))]
+;; Define another set of these for the case of functions with no operands.
+;; These will allow the optimizers to do a slightly better job.
+(define_insn "call"
+  [(call (match_operand:QI 0 "memory_operand" "m")
+	 (const_int 0))]
   ""
   "calls $0,%0")
 
-(define_expand "call_value"
-  [(set (match_operand 0 "" "")
-      (call (match_operand:QI 1 "memory_operand" "")
-	    (match_operand:SI 2 "const_int_operand" "")))]
-  ""
-  "
-{
-  /* Operand 2 is the number of bytes to be popped by DW_CFA_GNU_args_size
-     during EH unwinding.  We must include the argument count pushed by
-     the calls instruction.  */
-  operands[2] = GEN_INT (INTVAL (operands[2]) + 4);
-}")
-
-(define_insn "*call_value"
+(define_insn "call_value"
   [(set (match_operand 0 "" "")
 	(call (match_operand:QI 1 "memory_operand" "m")
-	      (match_operand:SI 2 "const_int_operand" "")))]
+	      (const_int 0)))]
   ""
   "calls $0,%1")
 
@@ -1682,47 +1658,15 @@
 
 (include "builtins.md")
 
-(define_peephole2
-  [(set (match_operand:SI 0 "push_operand" "")
-        (const_int 0))
-   (set (match_dup 0)
-        (match_operand:SI 1 "const_int_operand" ""))]
-  "INTVAL (operands[1]) >= 0"
-  [(set (match_dup 0)
-        (match_dup 1))]
-  "operands[0] = gen_rtx_MEM(DImode, XEXP (operands[0], 0));")
-
-(define_peephole2
-  [(set (match_operand:SI 0 "push_operand" "")
-        (match_operand:SI 1 "general_operand" ""))
-   (set (match_dup 0)
-        (match_operand:SI 2 "general_operand" ""))]
-  "vax_decomposed_dimode_operand_p (operands[2], operands[1])"
-  [(set (match_dup 0)
-        (match_dup 2))]
-  "{
-    operands[0] = gen_rtx_MEM(DImode, XEXP (operands[0], 0));
-    operands[2] = REG_P (operands[2])
-      ? gen_rtx_REG(DImode, REGNO (operands[2]))
-      : gen_rtx_MEM(DImode, XEXP (operands[2], 0));
-}")
-
-; Leave this commented out until we can determine whether the second move
-; precedes a jump which relies on the CC flags being set correctly.
-(define_peephole2
-  [(set (match_operand:SI 0 "nonimmediate_operand" "")
-        (match_operand:SI 1 "general_operand" ""))
-   (set (match_operand:SI 2 "nonimmediate_operand" "")
-        (match_operand:SI 3 "general_operand" ""))]
-  "0 && vax_decomposed_dimode_operand_p (operands[1], operands[3])
-   && vax_decomposed_dimode_operand_p (operands[0], operands[2])"
-  [(set (match_dup 0)
-        (match_dup 1))]
-  "{
-    operands[0] = REG_P (operands[0])
-      ? gen_rtx_REG(DImode, REGNO (operands[0]))
-      : gen_rtx_MEM(DImode, XEXP (operands[0], 0));
-    operands[1] = REG_P (operands[1])
-      ? gen_rtx_REG(DImode, REGNO (operands[1]))
-      : gen_rtx_MEM(DImode, XEXP (operands[1], 0));
-}")
+;; The earlier peephole definitions have been removed because they weren't
+;; setting the flags equivalently to the 32-bit instructions they replaced.
+;; It's also not possible to readily determine whether the second 32-bit move
+;; precedes a jump which relies on the CC flags being set correctly, as the
+;; previous comment noted (for one of the definitions that had been disabled).
+;;
+;; It's probably not worthwhile to make the CC0 handler any more complicated
+;; than it already is, to combine two adjacent 32-bit values into a 64-bit one,
+;; but only if there is no later branch with a dependency on the flag settings.
+;; This seems to happen mostly with C++ code that branches after object copying.
+;; It's also questionable whether any real VAX hardware would benefit from this.
+;; Note that 64-bit arithmetic is already optimized via longlong.h macros.

Comments

Jeff Law April 26, 2016, 8:22 p.m. UTC | #1
On 03/26/2016 05:56 AM, Jake Hamby wrote:
>> On Mar 23, 2016, at 05:56, Christos Zoulas <christos@astron.com>
>> wrote:
>>
>> In article <F48D0C6B-A6DB-410B-BC97-C30D4E8B4612@me.com>, Jake
>> Hamby  <jehamby420@me.com> wrote:
>>
>> Hi,
>>
>> Thanks a lot for your patch. I applied it to our gcc-5 in the
>> tree. Unfortunately gcc-5 seems that it was never tested to even
>> compile. I fixed the simple compilation issue, but it fails to
>> compile some files with an internal error trying to construct a
>> dwarf frame info element with the wrong register. If you have some
>> time, can you take a look? I will too.
>>
>> Thanks,
>>
>> christos
>
> Hi Christos,
>
> I just rebased my patches on top of GCC 5.3, which I see you have
> recently switched to. Here’s a brief explanation of how I patched the
> dwarf frame error.
>
> The problem is that FIRST_PSEUDO_REGISTER should be increased from 16
> to 17, because PSW is considered a real register for DWARF debug
> purposes. This necessitated changing a few other macros, but nothing
> major. Since it’s marked as one of the FIXED_REGISTERS, it never
> actually gets used. Currently I’m doing a full build with GCC 5.3 and
> CONFIGURE_ARGS += --enable-checking=all, which is very much slower, of
> course.
This patch could probably be pulled out and installed independent of the 
rest.  It's simple, well isolated and should be reviewable even by folks 
without intimate knowledge of the vax port.

>
> One bug I discovered with --enable-checking=all on GCC 4.8.5 is a call
> to XEXP() that may not be valid, but which can be prevented by
> checking the other conditions first, and then calling XEXP() if the
> other conditions are true.
Where specifically?  And what's the hunk of RTL that's being examined 
incorrectly (hint, debug_rtx will dump a hunk of RTL symbolically).

>
> There seems to be a code generation bug with C++ that only affects a
> few things. Unfortunately, GCC itself (the native version, not the
> cross compiler) is one of the programs affected. The symptom when
> compiling almost anything complex in GCC 4.8.5 is a stack overflow as
> it recursively loops around trying to expand bits and pieces of the
> insns. It seems to be branching the wrong way.
Can't do much with this.  Given the known problems with set-cc0 
elimination on this port, perhaps we address those, then come back to 
this problem.

>
> In looking at this, I discovered one really broken issue in the
> current vax.md, namely the three peephole optimizations at the
> bottom. The comment on the bottom one that says “Leave this commented
> out until we can determine whether the second move precedes a jump
> which relies on the CC flags being set correctly.” is absolutely
> correct and I believe all three should be removed. I’ve done so in
> the diff below, and added a comment explaining why.
The comment is essentially useless because it refers back to patterns 
that don't exist.  But it's also not clear what patterns you're 
referring to.  I don't see any define_peepholes in vax.md going back to 
gcc-4.9.  What sources are you using and who's hacked on them?

Note the 0 && in the condition of the last peephole you removed 
essentially disabled that peephole.

It would really help if you were submitting patches against the current 
GCC trunk.

>
> The idea is that some part of the optimizer is able to remove
> explicit tst / cmp insns when a previous one has already set the
> flags in a way that’s useful. So my theory is that the current
> version mostly works, but it’s very difficult to understand as it’s
> written, and it may be very occasionally giving the wrong results due
> to either bugs in its logic or the instructions emitted not setting
> the flags in the usual way. Unfortunately, the m68k version of
> NOTICE_UPDATE_CC (the only other arch supported by NetBSD that uses
> the CC0 style) is even more convoluted.
This optimization happens in final.c

Essentially it tracks the state of the cc0 bits and a few other things. 
When it finds a comparison (cc0-setter), it refers back to the state of 
the bits and if the bits are in a usable state, it'll "delete" the 
cc0-setting insn.
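
As a toy illustration of that tracking (not the code in final.c), the sketch below walks a list of hypothetical insns, remembers which register the flags currently describe, and marks a test against zero as deletable when the flags already describe that register. All demo_ names are invented for the example, and it assumes a move sets the flags from its destination.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical insn kinds for the illustration.  */
enum demo_kind
{
  DEMO_MOVE,	/* sets REG and leaves the flags describing REG */
  DEMO_TST,	/* compares REG with zero (a cc0 setter) */
  DEMO_CALL,	/* leaves the flags in an unknown state */
  DEMO_BRANCH	/* consumes the flags, changes nothing */
};

struct demo_insn
{
  enum demo_kind kind;
  int reg;		/* register operand, ignored for CALL and BRANCH */
  bool deleted;		/* set when the insn turns out to be redundant */
};

/* Mark redundant tests in INSNS[0..N-1] and return how many were found.  */
static size_t
demo_elide_redundant_tests (struct demo_insn *insns, size_t n)
{
  int flags_reg = -1;	/* register the flags describe, or -1 if unknown */
  size_t removed = 0;

  for (size_t i = 0; i < n; i++)
    {
      struct demo_insn *insn = &insns[i];
      insn->deleted = false;
      switch (insn->kind)
	{
	case DEMO_MOVE:
	  flags_reg = insn->reg;
	  break;
	case DEMO_TST:
	  if (flags_reg == insn->reg)
	    {
	      /* The flags already hold this comparison's result.  */
	      insn->deleted = true;
	      removed++;
	    }
	  else
	    flags_reg = insn->reg;
	  break;
	case DEMO_CALL:
	  flags_reg = -1;
	  break;
	case DEMO_BRANCH:
	  break;
	}
    }
  return removed;
}
```

If the tracked state is ever wrong, for instance because an instruction did not set the flags the way the tracker assumed, a test gets deleted that was actually needed and the following branch goes the wrong way.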

>
> So what I’d like to try next is rewriting the VAX version of
> notice_update_cc to use the style that the AVR backend and others use
> which is to add a “cc” attribute to the instructions themselves in
> the .md file describing what each one does in terms of flags.
This would be a step forward.  I would do this work independently of 
fixing the dwarf/EH stuff.  Ideally patch submissions are independent 
changes rather than fixing several unrelated problems in a single patch.

Patch

Index: external/gpl3/binutils/dist/gas/config/tc-vax.c
===================================================================
RCS file: /cvsroot/src/external/gpl3/binutils/dist/gas/config/tc-vax.c,v
retrieving revision 1.10
diff -u -u -r1.10 tc-vax.c
--- external/gpl3/binutils/dist/gas/config/tc-vax.c	14 Feb 2016 19:00:04 -0000	1.10
+++ external/gpl3/binutils/dist/gas/config/tc-vax.c	26 Mar 2016 10:42:56 -0000
@@ -3430,11 +3430,12 @@ 
     }
 }
 
+static char *vax_cons_special_reloc;
+
 bfd_reloc_code_real_type
 vax_cons (expressionS *exp, int size)
 {
   char *save;
-  char *vax_cons_special_reloc;
 
   SKIP_WHITESPACE ();
   vax_cons_special_reloc = NULL;
@@ -3560,7 +3561,22 @@ 
 	 : nbytes == 2 ? BFD_RELOC_16
 	 : BFD_RELOC_32);
 
+  if (vax_cons_special_reloc)
+    {
+      if (*vax_cons_special_reloc == 'p')
+	{
+	  switch (nbytes)
+	    {
+	    case 1: r = BFD_RELOC_8_PCREL; break;
+	    case 2: r = BFD_RELOC_16_PCREL; break;
+	    case 4: r = BFD_RELOC_32_PCREL; break;
+	    default: abort ();
+	    }
+	}
+    }
+
   fix_new_exp (frag, where, (int) nbytes, exp, 0, r);
+  vax_cons_special_reloc = NULL;
 }
 
 char *
@@ -3598,6 +3614,8 @@ 
 void
 vax_cfi_emit_pcrel_expr (expressionS *expP, unsigned int nbytes)
 {
+  vax_cons_special_reloc = "pcrel";
   expP->X_add_number += nbytes;
   emit_expr (expP, nbytes);
+  vax_cons_special_reloc = NULL;
 }
Index: external/gpl3/gcc/dist/gcc/except.c
===================================================================
RCS file: /cvsroot/src/external/gpl3/gcc/dist/gcc/except.c,v
retrieving revision 1.3
diff -u -u -r1.3 except.c
--- external/gpl3/gcc/dist/gcc/except.c	23 Mar 2016 15:51:36 -0000	1.3
+++ external/gpl3/gcc/dist/gcc/except.c	26 Mar 2016 10:42:41 -0000
@@ -2288,7 +2288,8 @@ 
 #endif
     {
 #ifdef EH_RETURN_HANDLER_RTX
-      emit_move_insn (EH_RETURN_HANDLER_RTX, crtl->eh.ehr_handler);
+      rtx insn = emit_move_insn (EH_RETURN_HANDLER_RTX, crtl->eh.ehr_handler);
+      RTX_FRAME_RELATED_P (insn) = 1;
 #else
       error ("__builtin_eh_return not supported on this target");
 #endif
Index: external/gpl3/gcc/dist/gcc/config/m68k/m68k.md
===================================================================
RCS file: /cvsroot/src/external/gpl3/gcc/dist/gcc/config/m68k/m68k.md,v
retrieving revision 1.4
diff -u -u -r1.4 m68k.md
--- external/gpl3/gcc/dist/gcc/config/m68k/m68k.md	24 Jan 2016 09:43:33 -0000	1.4
+++ external/gpl3/gcc/dist/gcc/config/m68k/m68k.md	26 Mar 2016 10:42:41 -0000
@@ -2132,9 +2132,9 @@ 
 ;; into the kernel to emulate fintrz.  They should also be faster
 ;; than calling the subroutines fixsfsi or fixdfsi.
 
-(define_insn "fix_truncdfsi2"
+(define_insn "fix_trunc<mode>si2"
   [(set (match_operand:SI 0 "nonimmediate_operand" "=dm")
-	(fix:SI (fix:DF (match_operand:DF 1 "register_operand" "f"))))
+	(fix:SI (match_operand:FP 1 "register_operand" "f")))
    (clobber (match_scratch:SI 2 "=d"))
    (clobber (match_scratch:SI 3 "=d"))]
   "TARGET_68881 && TUNE_68040"
@@ -2143,9 +2143,9 @@ 
   return "fmovem%.l %!,%2\;moveq #16,%3\;or%.l %2,%3\;and%.w #-33,%3\;fmovem%.l %3,%!\;fmove%.l %1,%0\;fmovem%.l %2,%!";
 })
 
-(define_insn "fix_truncdfhi2"
+(define_insn "fix_trunc<mode>hi2"
   [(set (match_operand:HI 0 "nonimmediate_operand" "=dm")
-	(fix:HI (fix:DF (match_operand:DF 1 "register_operand" "f"))))
+	(fix:HI (match_operand:FP 1 "register_operand" "f")))
    (clobber (match_scratch:SI 2 "=d"))
    (clobber (match_scratch:SI 3 "=d"))]
   "TARGET_68881 && TUNE_68040"
@@ -2154,9 +2154,9 @@ 
   return "fmovem%.l %!,%2\;moveq #16,%3\;or%.l %2,%3\;and%.w #-33,%3\;fmovem%.l %3,%!\;fmove%.w %1,%0\;fmovem%.l %2,%!";
 })
 
-(define_insn "fix_truncdfqi2"
+(define_insn "fix_trunc<mode>qi2"
   [(set (match_operand:QI 0 "nonimmediate_operand" "=dm")
-	(fix:QI (fix:DF (match_operand:DF 1 "register_operand" "f"))))
+	(fix:QI (match_operand:FP 1 "register_operand" "f")))
    (clobber (match_scratch:SI 2 "=d"))
    (clobber (match_scratch:SI 3 "=d"))]
   "TARGET_68881 && TUNE_68040"
Index: external/gpl3/gcc/dist/gcc/config/vax/elf.h
===================================================================
RCS file: /cvsroot/src/external/gpl3/gcc/dist/gcc/config/vax/elf.h,v
retrieving revision 1.6
diff -u -u -r1.6 elf.h
--- external/gpl3/gcc/dist/gcc/config/vax/elf.h	23 Mar 2016 15:51:37 -0000	1.6
+++ external/gpl3/gcc/dist/gcc/config/vax/elf.h	26 Mar 2016 10:42:41 -0000
@@ -26,7 +26,7 @@ 
 #define REGISTER_PREFIX "%"
 #define REGISTER_NAMES \
   { "%r0", "%r1",  "%r2",  "%r3", "%r4", "%r5", "%r6", "%r7", \
-    "%r8", "%r9", "%r10", "%r11", "%ap", "%fp", "%sp", "%pc", }
+    "%r8", "%r9", "%r10", "%r11", "%ap", "%fp", "%sp", "%pc", "%psw", }
 
 #undef SIZE_TYPE
 #define SIZE_TYPE "long unsigned int"
@@ -45,18 +45,8 @@ 
    count pushed by the CALLS and before the start of the saved registers.  */
 #define INCOMING_FRAME_SP_OFFSET 0
 
-/* Offset from the frame pointer register value to the top of the stack.  */
-#define FRAME_POINTER_CFA_OFFSET(FNDECL) 0
-
-/* We use R2-R5 (call-clobbered) registers for exceptions.  */
-#define EH_RETURN_DATA_REGNO(N) ((N) < 4 ? (N) + 2 : INVALID_REGNUM)
-
-/* Place the top of the stack for the DWARF2 EH stackadj value.  */
-#define EH_RETURN_STACKADJ_RTX						\
-  gen_rtx_MEM (SImode,							\
-	       plus_constant (Pmode,					\
-			      gen_rtx_REG (Pmode, FRAME_POINTER_REGNUM),\
-			      -4))
+/* We use R2-R3 (call-clobbered) registers for exceptions.  */
+#define EH_RETURN_DATA_REGNO(N) ((N) < 2 ? (N) + 2 : INVALID_REGNUM)
 
 /* Simple store the return handler into the call frame.  */
 #define EH_RETURN_HANDLER_RTX						\
@@ -66,10 +56,6 @@ 
 			      16))
 
 
-/* Reserve the top of the stack for exception handler stackadj value.  */
-#undef STARTING_FRAME_OFFSET
-#define STARTING_FRAME_OFFSET -4
-
 /* The VAX wants no space between the case instruction and the jump table.  */
 #undef  ASM_OUTPUT_BEFORE_CASE_LABEL
 #define ASM_OUTPUT_BEFORE_CASE_LABEL(FILE, PREFIX, NUM, TABLE)
Index: external/gpl3/gcc/dist/gcc/config/vax/vax-protos.h
===================================================================
RCS file: /cvsroot/src/external/gpl3/gcc/dist/gcc/config/vax/vax-protos.h,v
retrieving revision 1.5
diff -u -u -r1.5 vax-protos.h
--- external/gpl3/gcc/dist/gcc/config/vax/vax-protos.h	23 Mar 2016 21:09:04 -0000	1.5
+++ external/gpl3/gcc/dist/gcc/config/vax/vax-protos.h	26 Mar 2016 10:42:41 -0000
@@ -30,7 +30,7 @@ 
 extern void print_operand (FILE *, rtx, int);
 extern void vax_notice_update_cc (rtx, rtx);
 extern void vax_expand_addsub_di_operands (rtx *, enum rtx_code);
-extern bool vax_decomposed_dimode_operand_p (rtx, rtx);
+/* extern bool vax_decomposed_dimode_operand_p (rtx, rtx); */
 extern const char * vax_output_int_move (rtx, rtx *, machine_mode);
 extern const char * vax_output_int_add (rtx, rtx *, machine_mode);
 extern const char * vax_output_int_subtract (rtx, rtx *, machine_mode);
Index: external/gpl3/gcc/dist/gcc/config/vax/vax.c
===================================================================
RCS file: /cvsroot/src/external/gpl3/gcc/dist/gcc/config/vax/vax.c,v
retrieving revision 1.15
diff -u -u -r1.15 vax.c
--- external/gpl3/gcc/dist/gcc/config/vax/vax.c	24 Mar 2016 04:27:29 -0000	1.15
+++ external/gpl3/gcc/dist/gcc/config/vax/vax.c	26 Mar 2016 10:42:41 -0000
@@ -1,4 +1,4 @@ 
-/* Subroutines for insn-output.c for VAX.
+/* Subroutines used for code generation on VAX.
    Copyright (C) 1987-2015 Free Software Foundation, Inc.
 
 This file is part of GCC.
@@ -191,13 +191,17 @@ 
 vax_expand_prologue (void)
 {
   int regno, offset;
-  int mask = 0;
+  unsigned int mask = 0;
   HOST_WIDE_INT size;
   rtx insn;
 
   offset = 20;
-  for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
-    if (df_regs_ever_live_p (regno) && !call_used_regs[regno])
+  /* We only care about r0 to r11 here. AP, FP, and SP are saved by CALLS.
+     Always save r2 and r3 when eh_return is called, to reserve space for
+     the stack unwinder to update them in the stack frame on exceptions.  */
+  for (regno = 0; regno < VAX_AP_REGNUM; regno++)
+    if ((df_regs_ever_live_p (regno) && !call_used_regs[regno])
+	|| (crtl->calls_eh_return && regno >= 2 && regno < 4))
       {
         mask |= 1 << regno;
         offset += 4;
@@ -240,8 +244,10 @@ 
   vax_add_reg_cfa_offset (insn, 16, pc_rtx);
 
   offset = 20;
-  for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
-    if (mask & (1 << regno))
+
+  unsigned int testbit = 1;	/* Used to avoid calculating (1 << regno). */
+  for (regno = 0; regno < VAX_AP_REGNUM; regno++, testbit <<= 1)
+    if (mask & testbit)
       {
 	vax_add_reg_cfa_offset (insn, offset, gen_rtx_REG (SImode, regno));
 	offset += 4;
@@ -1909,12 +1915,20 @@ 
     return true;
   if (indirectable_address_p (x, strict, false))
     return true;
-  xfoo0 = XEXP (x, 0);
-  if (MEM_P (x) && indirectable_address_p (xfoo0, strict, true))
-    return true;
-  if ((GET_CODE (x) == PRE_DEC || GET_CODE (x) == POST_INC)
-      && BASE_REGISTER_P (xfoo0, strict))
-    return true;
+  /* Note: avoid calling XEXP until needed.  It may not be a valid type.
+     This fixes an assertion failure when RTX checking is enabled.  */
+  if (MEM_P (x))
+    {
+      xfoo0 = XEXP (x, 0);
+      if (indirectable_address_p (xfoo0, strict, true))
+	return true;
+    }
+  if (GET_CODE (x) == PRE_DEC || GET_CODE (x) == POST_INC)
+    {
+      xfoo0 = XEXP (x, 0);
+      if (BASE_REGISTER_P (xfoo0, strict))
+	return true;
+    }
   return false;
 }
 
@@ -2366,6 +2380,9 @@ 
 	   : (int_size_in_bytes (type) + 3) & ~3);
 }
 
+#if 0
+/* This is commented out because the only usage of it was the buggy
+   32-to-64-bit peephole optimizations that have been commented out.  */
 bool
 vax_decomposed_dimode_operand_p (rtx lo, rtx hi)
 {
@@ -2416,3 +2433,4 @@ 
 
   return rtx_equal_p(lo, hi) && lo_offset + 4 == hi_offset;
 }
+#endif
Index: external/gpl3/gcc/dist/gcc/config/vax/vax.h
===================================================================
RCS file: /cvsroot/src/external/gpl3/gcc/dist/gcc/config/vax/vax.h,v
retrieving revision 1.7
diff -u -u -r1.7 vax.h
--- external/gpl3/gcc/dist/gcc/config/vax/vax.h	23 Mar 2016 15:51:37 -0000	1.7
+++ external/gpl3/gcc/dist/gcc/config/vax/vax.h	26 Mar 2016 10:42:41 -0000
@@ -120,13 +120,14 @@ 
    The hardware registers are assigned numbers for the compiler
    from 0 to just below FIRST_PSEUDO_REGISTER.
    All registers that the compiler knows about must be given numbers,
-   even those that are not normally considered general registers.  */
-#define FIRST_PSEUDO_REGISTER 16
+   even those that are not normally considered general registers.
+   This includes PSW, which the VAX backend did not originally include.  */
+#define FIRST_PSEUDO_REGISTER 17
 
 /* 1 for registers that have pervasive standard uses
    and are not available for the register allocator.
-   On the VAX, these are the AP, FP, SP and PC.  */
-#define FIXED_REGISTERS {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1}
+   On the VAX, these are the AP, FP, SP, PC, and PSW.  */
+#define FIXED_REGISTERS {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1}
 
 /* 1 for registers not available across function calls.
    These must include the FIXED_REGISTERS and also any
@@ -134,7 +135,7 @@ 
    The latter must include the registers where values are returned
    and the register where structure-value addresses are passed.
    Aside from that, you can include as many other registers as you like.  */
-#define CALL_USED_REGISTERS {1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1}
+#define CALL_USED_REGISTERS {1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1}
 
 /* Return number of consecutive hard regs needed starting at reg REGNO
    to hold something of mode MODE.
@@ -169,12 +170,12 @@ 
 /* Base register for access to local variables of the function.  */
 #define FRAME_POINTER_REGNUM VAX_FP_REGNUM
 
-/* Offset from the frame pointer register value to the top of stack.  */
-#define FRAME_POINTER_CFA_OFFSET(FNDECL) 0
-
 /* Base register for access to arguments of the function.  */
 #define ARG_POINTER_REGNUM VAX_AP_REGNUM
 
+/* Offset from the argument pointer register value to the CFA.  */
+#define ARG_POINTER_CFA_OFFSET(FNDECL) 0
+
 /* Register in which static-chain is passed to a function.  */
 #define STATIC_CHAIN_REGNUM 0
 
@@ -395,9 +396,9 @@ 
    allocation.  */
 
 #define REGNO_OK_FOR_INDEX_P(regno)	\
-  ((regno) < FIRST_PSEUDO_REGISTER || reg_renumber[regno] >= 0)
+  ((regno) <= VAX_PC_REGNUM || reg_renumber[regno] >= 0)
 #define REGNO_OK_FOR_BASE_P(regno)	\
-  ((regno) < FIRST_PSEUDO_REGISTER || reg_renumber[regno] >= 0)
+  ((regno) <= VAX_PC_REGNUM || reg_renumber[regno] >= 0)
 

 /* Maximum number of registers that can appear in a valid memory address.  */
 
@@ -424,11 +425,11 @@ 
 
 /* Nonzero if X is a hard reg that can be used as an index
    or if it is a pseudo reg.  */
-#define REG_OK_FOR_INDEX_P(X) 1
+#define REG_OK_FOR_INDEX_P(X) ((regno) != VAX_PSW_REGNUM)
 
 /* Nonzero if X is a hard reg that can be used as a base reg
    or if it is a pseudo reg.  */
-#define REG_OK_FOR_BASE_P(X) 1
+#define REG_OK_FOR_BASE_P(X) ((regno) != VAX_PSW_REGNUM)
 
 #else
 
@@ -548,7 +551,7 @@ 
 #define REGISTER_PREFIX ""
 #define REGISTER_NAMES					\
   { "r0", "r1",  "r2",  "r3", "r4", "r5", "r6", "r7",	\
-    "r8", "r9", "r10", "r11", "ap", "fp", "sp", "pc", }
+    "r8", "r9", "r10", "r11", "ap", "fp", "sp", "pc", "psw", }
 
 /* This is BSD, so it wants DBX format.  */
 
Index: external/gpl3/gcc/dist/gcc/config/vax/vax.md
===================================================================
RCS file: /cvsroot/src/external/gpl3/gcc/dist/gcc/config/vax/vax.md,v
retrieving revision 1.11
diff -u -u -r1.11 vax.md
--- external/gpl3/gcc/dist/gcc/config/vax/vax.md	23 Mar 2016 15:51:37 -0000	1.11
+++ external/gpl3/gcc/dist/gcc/config/vax/vax.md	26 Mar 2016 10:42:41 -0000
@@ -532,13 +532,13 @@ 
 
 ;This is left out because it is very slow;
 ;we are better off programming around the "lack" of this insn.
+;; It's also unclear whether the condition flags would be correct.
 ;(define_insn "divmoddisi4"
-;  [(set (match_operand:SI 0 "general_operand" "=g")
-;	(div:SI (match_operand:DI 1 "general_operand" "g")
-;		(match_operand:SI 2 "general_operand" "g")))
-;   (set (match_operand:SI 3 "general_operand" "=g")
-;	(mod:SI (match_operand:DI 1 "general_operand" "g")
-;		(match_operand:SI 2 "general_operand" "g")))]
+;  [(parallel [(set (match_operand:SI 0 "general_operand" "=g")
+;		   (div:SI (match_operand:DI 1 "general_operand" "nrmT")
+;			   (match_operand:SI 2 "general_operand" "nrmT")))
+;	      (set (match_operand:SI 3 "general_operand" "=g")
+;		   (mod:SI (match_dup 1) (match_dup 2)))])]
 ;  ""
 ;  "ediv %2,%1,%0,%3")