diff mbox

MIPS: MIPS32r2 FP MADD instruction set support

Message ID alpine.DEB.1.10.1302200035560.6762@tp.orcam.me.uk
State Superseded
Headers show

Commit Message

Maciej W. Rozycki Feb. 20, 2013, 11:50 p.m. UTC
Hi,

 This issue was originally raised here:

http://gcc.gnu.org/ml/gcc-patches/2012-12/msg00863.html

 We have a shortcoming in GCC in that we only allow the use half of the FP 
MADD instruction subset (MADD.fmt and MSUB.fmt) in the 64-bit/32-register 
mode (CP0.Status.FR == 1) on MIPS32r2 processors.  Furthermore we never 
enable the other half (NMADD.fmt and NMSUB.fmt) on those processors.  
However this whole instruction subset is always available on MIPS32r2 FPUs 
regardless of the mode selected, just as it always has been on FPUs of the 
64-bit ISA line from MIPS IV up.

 The paired-single format however is indeed only available in the 
64-bit/32-register mode as from the MIPS V ISA up.  We do explicitly allow 
it for some (or no) reason for NMADD.PS and NMSUB.PS on MIPS32r2 
processors in the 32-bit/16-register FPU mode (this is probably globally 
overridden elsewhere).

 I'm not sure where this GCC limitation came from, but there were typos in 
the formats listed for the MSUB.S, MSUB.D, NMADD.S, NMADD.D, NMSUB.S and 
NMSUB.D instructions up to and including rev. 2.50 of vol. II of the 
MIPS32r2 architecture documentation set (MIPS doc #MD00086).  This may or 
may not have contributed to this problem as these instructions were listed 
as available from "MIPS64" up rather than from "MIPS64, MIPS32 Release 2" 
up, so no mention of the FPU mode there.

 The change below lifts the relevant restrictions removing a lot of 
clutter that's not needed anymore now that the data mode does not have to 
be checked.

 Also, according to MIPS IV ISA documentation these operations are only 
fused (i.e. don't match original IEEE 754-1985 accuracy requirements) on 
the original MIPS IV R8000 CPU, and MIPS architecture specs don't mention 
any limitations of these instructions either, so I have updated the GCC 
manual to document that on non-R8000 CPUs (which are ones we really care 
about) they are numerically equivalent to computations made with 
corresponding individual operations.

 Finally, while at it, I found it interesting that we have separate 
conditions to cover MADD.fmt/MSUB.fmt (ISA_HAS_FP_MADD4_MSUB4) and 
NMADD.fmt/NMADD.fmt (ISA_HAS_NMADD4_NMSUB4) while all the four 
instructions need to be implemented as a whole group per data format 
supported and cannot be separated (the MIPS architecture specification 
explicitly forbids subsetting).  The difference between the two conditions 
is the former expands to ISA_HAS_FP4, that is enables the subsubset for 
any MIPS IV and up FPU while the latter has an extra "&& (!TARGET_MIPS5400 
|| TARGET_MAD)" qualifier.

 I went ahead and checked available NEC VR54xx documentation and here's 
what I came up with:

1. "VR5400 MIPS RISC Microprocessor Family" datasheet (NEC doc #13362) 
   says:

   "The VR5400 processor family complies with the MIPS IV instruction set 
   and IEEE-754 floating-point and IEEE-1149.1/1149.1a JTAG specification, 
   [...]"

2. "VR5432 MIPS RISC Microprocessor User's Manual, Volume 2" (NEC doc 
   #13751) lists all the individual MADD.fmt, MSUB.fmt, NMADD.fmt and
   NMSUB.fmt instructions in Chapter 18 "Floating-Point Unit Instruction 
   Set" with no restrictions as to their availability (the only other 
   member of the VR54xx family I know of is the VR5464 that is a 
   high-performance version of the VR5432 and is fully software 
   compatible).

 Further to that TARGET_MAD controls whether to "Use PMC-style 'mad' 
instructions" that are all CPU rather than FPU instructions.  The VR5432 
indeed supports extra integer multiply-accumulate instructions, as 
documented in #2 above; these are the MACC/MACCHI/MACCHIU/MACCU and 
MSAC/MSACHI/MSACHIU/MSACU instructions as roughly covered by our 
ISA_HAS_MACC, ISA_HAS_MSAC and ISA_HAS_MACCHI knobs (the latter is not 
implied for TARGET_MIPS5400, perhaps because the family does not support 
the doubleword variants).

 All in all it looks to me like a misplaced hunk.  It was introduced in 
rev. 56471 (you were named as one of the contributors on that commit, so 
you may be able to remember and/or correct me if I am wrong here anywhere) 
and it looks to me it should have been applied to the ISA_HAS_MADD_MSUB 
macro instead that's still just a few lines above ISA_HAS_NMADD4_NMSUB4 
(and was even closer to ISA_HAS_NMADD_NMSUB as the latter was then called; 
the bodies were close enough back then for a hunk to apply cleanly to 
either).

 These days we handle ISA_HAS_MADD_MSUB indirectly through 
GENERATE_MADD_MSUB and in many more places than back at rev. 56471.  We 
also handle TARGET_MAD and ISA_HAS_MACC/ISA_HAS_MSAC/ISA_HAS_MACCHI 
explicitly throughout mips.md, so I think we should simply discard this 
incorrect condition, and then, as ISA_HAS_FP_MADD4_MSUB4 and 
ISA_HAS_NMADD4_NMSUB4 will have become identical, fold the two macros into 
one, perhaps ISA_HAS_FP_MADD4.  And likewise ISA_HAS_FP_MADD3.  Thoughts?

 Back to the change considered here, it was successfully regression-tested 
with the gcc, g++ and libstdc++ testsuites, with the mips-linux-gnu target 
and the o32/mips32r2 and n64/mips64r2 multilibs, both endiannesses.  I 
examined some test cases executed to verify the instructions concerned 
have been emitted as appropriate where previously they were not.  I hope 
this change is OK to apply as soon as 4.9 has opened.

 BTW, do you happen to know a way to reliable force all our testsuites NOT 
to delete executables after run?  Personally I think it's missing the 
point to have them deleted -- how can one debug any regressions then?  I 
have a rather gross hack patching up most of the TCL scripts throughout to 
remove the various instances of file deletion commands and stick the 
-keep-output option onto dg-test calls, but there always seems to be 
something that escapes.  I find it very frustrating.

 What do other people do?  I can't believe all the GCC developers are 
happy to accept this pain.

2013-02-20  Maciej W. Rozycki  <macro@codesourcery.com>

	gcc/
	* config/mips/mips.h (ISA_HAS_FP4): Don't restrict ISA_MIPS32R2
	to TARGET_FLOAT64.
	(ISA_HAS_NMADD4_NMSUB4): Remove the MODE argument; don't restrict 
	ISA_MIPS32R2.
	(ISA_HAS_NMADD3_NMSUB3): Remove the MODE argument.
	* config/mips/mips.c (mips_rtx_costs): Update according to
	changes to ISA_HAS_NMADD4_NMSUB4 and ISA_HAS_NMADD3_NMSUB3.
	* config/mips/mips.md (nmadd4<mode>, nmadd3<mode>): Likewise.
	(nmadd4<mode>_fastmath, nmadd3<mode>_fastmath): Likewise.
	(nmsub4<mode>, nmsub3<mode>): Likewise.
	(nmsub4<mode>_fastmath, nmsub3<mode>_fastmath): Likewise.
	* doc/invoke.texi (MIPS Options): Update documentation of the
	floating-point multiply-accumulate instruction restrictions.

  Maciej

gcc-mips32r2-madd.patch

Comments

Richard Sandiford Feb. 21, 2013, 9 a.m. UTC | #1
"Maciej W. Rozycki" <macro@codesourcery.com> writes:
>  This issue was originally raised here:
>
> http://gcc.gnu.org/ml/gcc-patches/2012-12/msg00863.html
>
>  We have a shortcoming in GCC in that we only allow the use half of the FP 
> MADD instruction subset (MADD.fmt and MSUB.fmt) in the 64-bit/32-register 
> mode (CP0.Status.FR == 1) on MIPS32r2 processors.  Furthermore we never 
> enable the other half (NMADD.fmt and NMSUB.fmt) on those processors.  
> However this whole instruction subset is always available on MIPS32r2 FPUs 
> regardless of the mode selected, just as it always has been on FPUs of the 
> 64-bit ISA line from MIPS IV up.

Hmm, this was discussed here:

    http://gcc.gnu.org/ml/gcc-patches/2006-11/msg00488.html
    http://gcc.gnu.org/ml/gcc-patches/2006-11/msg00492.html

The footnote for COP1X instructions on page 12 of volume 1 of the MIPS32 ISA
(v2.50) says:

1. In Release 1 of the Architecture, these instructions are legal only
   with a MIPS64 processor with 64-bit operations enabled (they are, in
   effect, actually MIPS64 instructions). In Release 2 of the Architecture,
   these instructions are legal with either a MIPS32 or MIPS64 processor
   _which includes a 64-bit floating point unit_.

(my emphasis).  "which" rather than "that" makes this a bit ambiguous,
but various comments in the rest of the manual imply that MIPS32r2 allows
an implementation choice between 32-bit and 64-bit FPUs.  E.g. page 8 says:

  Support for 64-bit coprocessors with 32-bit CPUs: These changes allow
  a 64-bit coprocessor (including an FPU) to be attached to a 32-bit
  CPU. This enhancement is optional in a Release 2 implementation.

and page 45 says:

  In addition to an Instruction Set Architecture, the MIPS architecture
  definition includes processing resources such as the set of
  coprocessor general registers. In Release 1 of the Architecture, the
  32-bit registers in MIPS32 were enlarged to 64-bits in MIPS64;
  however, these 64-bit FPU registers are not backwards
  compatible. Instead, processors implementing the MIPS64 Architecture
  provide a mode bit to select either the 32-bit or 64-bit register
  model. In Release 2 of the Architecture, a 32-bit CPU _may_ include a
  full 64-bit coprocessor, including a floating point unit which
  implements the same mode bit to select 32-bit or 64-bit FPU register
  model.

On page 322 of volume 2, the footnote for "Table A-20 MIPS64 COP1X
Encoding of Function Field" uses slightly different wording:

  COP1X instructions are legal only if 64-bit floating point operations
  are enabled.

So was this all a big misunderstanding on my part?  The TARGET_FLOAT64
condition came from MIPS themselves, and when challenged they seemed
pretty adamant that it was correct.  If I was wrong to be convinced
by the explanation, I hope you can at least see why I was convinced. :-)

If it wasn't a misunderstanding, then the point is that we can't tell
from ISA_MIPS32R2 alone whether the target has a 32-bit or 64-bit FPU,
but we know that it must have a 64-bit FPU if using TARGET_FLOAT64.

>  Also, according to MIPS IV ISA documentation these operations are only 
> fused (i.e. don't match original IEEE 754-1985 accuracy requirements) on 
> the original MIPS IV R8000 CPU, and MIPS architecture specs don't mention 
> any limitations of these instructions either, so I have updated the GCC 
> manual to document that on non-R8000 CPUs (which are ones we really care 
> about) they are numerically equivalent to computations made with 
> corresponding individual operations.

This part is OK, thanks, and is probably the only bit that's suitable for
4.8 at this stage.  Would you mind applying it separately?

>  Finally, while at it, I found it interesting that we have separate 
> conditions to cover MADD.fmt/MSUB.fmt (ISA_HAS_FP_MADD4_MSUB4) and 
> NMADD.fmt/NMADD.fmt (ISA_HAS_NMADD4_NMSUB4) while all the four 
> instructions need to be implemented as a whole group per data format 
> supported and cannot be separated (the MIPS architecture specification 
> explicitly forbids subsetting).  The difference between the two conditions 
> is the former expands to ISA_HAS_FP4, that is enables the subsubset for 
> any MIPS IV and up FPU while the latter has an extra "&& (!TARGET_MIPS5400 
> || TARGET_MAD)" qualifier.
>
>  I went ahead and checked available NEC VR54xx documentation and here's 
> what I came up with:
>
> 1. "VR5400 MIPS RISC Microprocessor Family" datasheet (NEC doc #13362) 
>    says:
>
>    "The VR5400 processor family complies with the MIPS IV instruction set 
>    and IEEE-754 floating-point and IEEE-1149.1/1149.1a JTAG specification, 
>    [...]"
>
> 2. "VR5432 MIPS RISC Microprocessor User's Manual, Volume 2" (NEC doc 
>    #13751) lists all the individual MADD.fmt, MSUB.fmt, NMADD.fmt and
>    NMSUB.fmt instructions in Chapter 18 "Floating-Point Unit Instruction 
>    Set" with no restrictions as to their availability (the only other 
>    member of the VR54xx family I know of is the VR5464 that is a 
>    high-performance version of the VR5432 and is fully software 
>    compatible).
>
>  Further to that TARGET_MAD controls whether to "Use PMC-style 'mad' 
> instructions" that are all CPU rather than FPU instructions.  The VR5432 
> indeed supports extra integer multiply-accumulate instructions, as 
> documented in #2 above; these are the MACC/MACCHI/MACCHIU/MACCU and 
> MSAC/MSACHI/MSACHIU/MSACU instructions as roughly covered by our 
> ISA_HAS_MACC, ISA_HAS_MSAC and ISA_HAS_MACCHI knobs (the latter is not 
> implied for TARGET_MIPS5400, perhaps because the family does not support 
> the doubleword variants).
>
>  All in all it looks to me like a misplaced hunk.  It was introduced in 
> rev. 56471 (you were named as one of the contributors on that commit, so 
> you may be able to remember and/or correct me if I am wrong here anywhere) 
> and it looks to me it should have been applied to the ISA_HAS_MADD_MSUB 
> macro instead that's still just a few lines above ISA_HAS_NMADD4_NMSUB4 
> (and was even closer to ISA_HAS_NMADD_NMSUB as the latter was then called; 
> the bodies were close enough back then for a hunk to apply cleanly to 
> either).

I was named in that commit but the VR54xx stuff wasn't mine.  I do remember
that Mike put a lot of effort into tuning the VR54xx madd stuff though,
because of the difficulty of having multiply-accumulate instructions
that force the use of HI/LO on an architecture that also has efficient
three-operand multiplies.  So I'm pretty sure that this worked correctly
in the Cygnus devo tree, and your explanation of a misplaced hunk seems
very convincing.

Richard
Maciej W. Rozycki Feb. 21, 2013, 4:31 p.m. UTC | #2
On Thu, 21 Feb 2013, Richard Sandiford wrote:

> >  This issue was originally raised here:
> >
> > http://gcc.gnu.org/ml/gcc-patches/2012-12/msg00863.html
> >
> >  We have a shortcoming in GCC in that we only allow the use half of the FP 
> > MADD instruction subset (MADD.fmt and MSUB.fmt) in the 64-bit/32-register 
> > mode (CP0.Status.FR == 1) on MIPS32r2 processors.  Furthermore we never 
> > enable the other half (NMADD.fmt and NMSUB.fmt) on those processors.  
> > However this whole instruction subset is always available on MIPS32r2 FPUs 
> > regardless of the mode selected, just as it always has been on FPUs of the 
> > 64-bit ISA line from MIPS IV up.
> 
> Hmm, this was discussed here:
> 
>     http://gcc.gnu.org/ml/gcc-patches/2006-11/msg00488.html
>     http://gcc.gnu.org/ml/gcc-patches/2006-11/msg00492.html
> 
> The footnote for COP1X instructions on page 12 of volume 1 of the MIPS32 ISA
> (v2.50) says:

 Please note that v2.50 is obsolete, several typos were corrected in later 
revisions.  Sadly I don't even have a copy of that exact revision of 
volume I.

> 1. In Release 1 of the Architecture, these instructions are legal only
>    with a MIPS64 processor with 64-bit operations enabled (they are, in
>    effect, actually MIPS64 instructions). In Release 2 of the Architecture,
>    these instructions are legal with either a MIPS32 or MIPS64 processor
>    _which includes a 64-bit floating point unit_.
> 
> (my emphasis).  "which" rather than "that" makes this a bit ambiguous,
> but various comments in the rest of the manual imply that MIPS32r2 allows
> an implementation choice between 32-bit and 64-bit FPUs.  E.g. page 8 says:
> 
>   Support for 64-bit coprocessors with 32-bit CPUs: These changes allow
>   a 64-bit coprocessor (including an FPU) to be attached to a 32-bit
>   CPU. This enhancement is optional in a Release 2 implementation.
> 
> and page 45 says:
> 
>   In addition to an Instruction Set Architecture, the MIPS architecture
>   definition includes processing resources such as the set of
>   coprocessor general registers. In Release 1 of the Architecture, the
>   32-bit registers in MIPS32 were enlarged to 64-bits in MIPS64;
>   however, these 64-bit FPU registers are not backwards
>   compatible. Instead, processors implementing the MIPS64 Architecture
>   provide a mode bit to select either the 32-bit or 64-bit register
>   model. In Release 2 of the Architecture, a 32-bit CPU _may_ include a
>   full 64-bit coprocessor, including a floating point unit which
>   implements the same mode bit to select 32-bit or 64-bit FPU register
>   model.
> 
> On page 322 of volume 2, the footnote for "Table A-20 MIPS64 COP1X
> Encoding of Function Field" uses slightly different wording:
> 
>   COP1X instructions are legal only if 64-bit floating point operations
>   are enabled.
> 
> So was this all a big misunderstanding on my part?  The TARGET_FLOAT64
> condition came from MIPS themselves, and when challenged they seemed
> pretty adamant that it was correct.  If I was wrong to be convinced
> by the explanation, I hope you can at least see why I was convinced. :-)

 As good as my English comprehension might be I believe your understanding 
of documentation is right.  However several inconsistencies have already 
been spotted and it may well be that for example the restriction on COP1X 
instructions is simply a leftover that leaked from rev.1 of the MIPS64 
architecture and was not spotted on the update (please note that obviously 
all the architecture documents are produced from templates shared between 
the MIPS32 and MIPS64 architectures with some parts being output 
conditionally as appropriate).

> If it wasn't a misunderstanding, then the point is that we can't tell
> from ISA_MIPS32R2 alone whether the target has a 32-bit or 64-bit FPU,
> but we know that it must have a 64-bit FPU if using TARGET_FLOAT64.

 Well, the important point is as I understand all the MIPS32r2 FPU 
instructions (not all the FP formats of course though, L and PS are 
optional) are mandatory no matter if the FPU is 32-bit or 64-bit.  This 
should rule out any question as to what kind of FPU has been implemented 
or whether the 64-bit mode has been enabled.  I could be wrong however.

 However the good news is it's not the architecture documentation set that 
is normative for architecture implementers, it's the AVP (Architecture 
Verification Program) they have to run successfully on their processors to 
claim compliance.

 Steve, would you therefore please do us a favour and check what the AVP 
for MIPS32r2 requires for support of the floating-point MADD instruction 
subset, or preferably, the whole COP1X instruction set?  Are these 
instructions only mandatory for a 64-bit FPU (CP1.FIR.F64 == 1), or do 
they have to be included in all FPU implementations?

> This part is OK, thanks, and is probably the only bit that's suitable for
> 4.8 at this stage.  Would you mind applying it separately?

 Applied now, thanks.

> I was named in that commit but the VR54xx stuff wasn't mine.  I do remember
> that Mike put a lot of effort into tuning the VR54xx madd stuff though,
> because of the difficulty of having multiply-accumulate instructions
> that force the use of HI/LO on an architecture that also has efficient
> three-operand multiplies.  So I'm pretty sure that this worked correctly
> in the Cygnus devo tree, and your explanation of a misplaced hunk seems
> very convincing.

 I'll make appropriate adjustments then.  Also, regardless of the outcome 
of the instruction presence on 32-bit FPUs, do you agree it is a good idea 
to fold ISA_HAS_FP_MADD4_MSUB4 and ISA_HAS_NMADD4_NMSUB4 into one macro?

  Maciej
Mike Stump Feb. 21, 2013, 9:47 p.m. UTC | #3
On Feb 20, 2013, at 3:50 PM, Maciej W. Rozycki <macro@codesourcery.com> wrote:
> BTW, do you happen to know a way to reliable force all our testsuites NOT 
> to delete executables after run?  Personally I think it's missing the 
> point to have them deleted -- how can one debug any regressions then?

So, I go into the source and comment out the cleanup lines…  Would be nice to have them conditional on a variable so that one can set that variable in their .dejagnurc file.

The limitations of this are, multiple test cases can use the same intermediate files and they would not give you a nice 1 to 1 mapping back.  However, this I usually side step by only running a subset of the test suite that contains the part I'm interested in.

> What do other people do?

We usually can cut and paste the one line and run the one case by hand.  Test cases that don't fall into this category, well, suck, I mean, are more annoying.
Ramana Radhakrishnan Feb. 21, 2013, 10:59 p.m. UTC | #4
>> What do other people do?
>
> We usually can cut and paste the one line and run the one case by hand.  Test cases that don't fall into this category, well, suck, I mean, are more annoying.
>

These days I find contrib/repro_fail <regexpforthefailingtest>
file.log.useful to rebuild the failing test(s).

regards
Ramana
Steve Ellcey Feb. 27, 2013, 5:12 p.m. UTC | #5
From: Maciej W. Rozycki [macro@codesourcery.com]

>  Steve, would you therefore please do us a favour and check what the AVP
> for MIPS32r2 requires for support of the floating-point MADD instruction
> subset, or preferably, the whole COP1X instruction set?  Are these
> instructions only mandatory for a 64-bit FPU (CP1.FIR.F64 == 1), or do
> they have to be included in all FPU implementations

Maciej,

I checked with one of the AVP engineers and he said:

"madd/msub/nmadd/nmsub.fmt instructions are required for all release 2 implementations,
whether or not that implementation has a 64 bit fpu. They are also required for mips64
release 1 implementations.  The AVPs check for this."

Steve Ellcey
Richard Sandiford Feb. 27, 2013, 7:38 p.m. UTC | #6
Steve Ellcey <Steve.Ellcey@imgtec.com> writes:
> From: Maciej W. Rozycki [macro@codesourcery.com]
>
>>  Steve, would you therefore please do us a favour and check what the AVP
>> for MIPS32r2 requires for support of the floating-point MADD instruction
>> subset, or preferably, the whole COP1X instruction set?  Are these
>> instructions only mandatory for a 64-bit FPU (CP1.FIR.F64 == 1), or do
>> they have to be included in all FPU implementations
>
> Maciej,
>
> I checked with one of the AVP engineers and he said:
>
> "madd/msub/nmadd/nmsub.fmt instructions are required for all release 2
> implementations,
> whether or not that implementation has a 64 bit fpu. They are also
> required for mips64
> release 1 implementations.  The AVPs check for this."

Steve, thanks for checking.

Maciej, in that case, the rest of the patch is OK for 4.9, thanks.

Richard
Maciej W. Rozycki Feb. 27, 2013, 7:59 p.m. UTC | #7
On Wed, 27 Feb 2013, Richard Sandiford wrote:

> > I checked with one of the AVP engineers and he said:
> >
> > "madd/msub/nmadd/nmsub.fmt instructions are required for all release 2
> > implementations,
> > whether or not that implementation has a 64 bit fpu. They are also
> > required for mips64
> > release 1 implementations.  The AVPs check for this."
> 
> Steve, thanks for checking.

 I concur, thank you, Steve.

> Maciej, in that case, the rest of the patch is OK for 4.9, thanks.

 I will apply in due course then, thanks for your review.

  Maciej
Maciej W. Rozycki March 9, 2013, 3:32 a.m. UTC | #8
On Thu, 21 Feb 2013, Mike Stump wrote:

> > BTW, do you happen to know a way to reliable force all our testsuites NOT 
> > to delete executables after run?  Personally I think it's missing the 
> > point to have them deleted -- how can one debug any regressions then?
> 
> So, I go into the source and comment out the cleanup lines… Would be 
> nice to have them conditional on a variable so that one can set that 
> variable in their .dejagnurc file.

 Hmm, good idea, I'll see if the hack I use could be adapted easily.

> The limitations of this are, multiple test cases can use the same 
> intermediate files and they would not give you a nice 1 to 1 mapping 
> back.  However, this I usually side step by only running a subset of the 
> test suite that contains the part I'm interested in.

 Yeah, I think final executables that don't have unique names are even 
worse.  It would be good to avoid at least making such new cases.

> > What do other people do?
> 
> We usually can cut and paste the one line and run the one case by hand.  
> Test cases that don't fall into this category, well, suck, I mean, are 
> more annoying.

 For this to work you need to have at least all the auxiliary stuff needed 
for linking, etc. that's built once per test suite run still available.  
Somehow I found it troublesome to rebuild it manually.

 Thanks for all the hints -- it looks to me like there's no easy answer.  
In which case I'll see if I can do anything about the hack I have in case 
someone else finds it useful.

  Maciej
diff mbox

Patch

Index: gcc-fsf-trunk-quilt/gcc/config/mips/mips.c
===================================================================
--- gcc-fsf-trunk-quilt.orig/gcc/config/mips/mips.c	2013-02-07 02:59:05.465114046 +0000
+++ gcc-fsf-trunk-quilt/gcc/config/mips/mips.c	2013-02-07 02:59:48.575511623 +0000
@@ -3798,7 +3798,7 @@  mips_rtx_costs (rtx x, int code, int out
 
     case MINUS:
       if (float_mode_p
-	  && (ISA_HAS_NMADD4_NMSUB4 (mode) || ISA_HAS_NMADD3_NMSUB3 (mode))
+	  && (ISA_HAS_NMADD4_NMSUB4 || ISA_HAS_NMADD3_NMSUB3)
 	  && TARGET_FUSED_MADD
 	  && !HONOR_NANS (mode)
 	  && !HONOR_SIGNED_ZEROS (mode))
@@ -3850,7 +3850,7 @@  mips_rtx_costs (rtx x, int code, int out
 
     case NEG:
       if (float_mode_p
-	  && (ISA_HAS_NMADD4_NMSUB4 (mode) || ISA_HAS_NMADD3_NMSUB3 (mode))
+	  && (ISA_HAS_NMADD4_NMSUB4 || ISA_HAS_NMADD3_NMSUB3)
 	  && TARGET_FUSED_MADD
 	  && !HONOR_NANS (mode)
 	  && HONOR_SIGNED_ZEROS (mode))
Index: gcc-fsf-trunk-quilt/gcc/config/mips/mips.h
===================================================================
--- gcc-fsf-trunk-quilt.orig/gcc/config/mips/mips.h	2013-02-07 02:35:34.024073830 +0000
+++ gcc-fsf-trunk-quilt/gcc/config/mips/mips.h	2013-02-07 02:59:48.575511623 +0000
@@ -855,7 +855,7 @@  struct mips_cpu_info {
    FP madd and msub instructions, and the FP recip and recip sqrt
    instructions.  */
 #define ISA_HAS_FP4		((ISA_MIPS4				\
-				  || (ISA_MIPS32R2 && TARGET_FLOAT64)   \
+				  || ISA_MIPS32R2			\
 				  || ISA_MIPS64				\
 				  || ISA_MIPS64R2)			\
 				 && !TARGET_MIPS16)
@@ -885,18 +885,12 @@  struct mips_cpu_info {
 
 /* ISA has floating-point nmadd and nmsub instructions
    'd = -((a * b) [+-] c)'.  */
-#define ISA_HAS_NMADD4_NMSUB4(MODE)					\
-				((ISA_MIPS4				\
-				  || (ISA_MIPS32R2 && (MODE) == V2SFmode) \
-				  || ISA_MIPS64				\
-				  || ISA_MIPS64R2)			\
-				 && (!TARGET_MIPS5400 || TARGET_MAD)	\
-				 && !TARGET_MIPS16)
+#define ISA_HAS_NMADD4_NMSUB4	(ISA_HAS_FP4				\
+				 && (!TARGET_MIPS5400 || TARGET_MAD))
 
 /* ISA has floating-point nmadd and nmsub instructions
    'c = -((a * b) [+-] c)'.  */
-#define ISA_HAS_NMADD3_NMSUB3(MODE)					\
-                                TARGET_LOONGSON_2EF
+#define ISA_HAS_NMADD3_NMSUB3	TARGET_LOONGSON_2EF
 
 /* ISA has count leading zeroes/ones instruction (not implemented).  */
 #define ISA_HAS_CLZ_CLO		((ISA_MIPS32				\
Index: gcc-fsf-trunk-quilt/gcc/config/mips/mips.md
===================================================================
--- gcc-fsf-trunk-quilt.orig/gcc/config/mips/mips.md	2013-02-07 02:35:34.004034605 +0000
+++ gcc-fsf-trunk-quilt/gcc/config/mips/mips.md	2013-02-07 02:59:48.585476315 +0000
@@ -2344,7 +2344,7 @@ 
 		   (mult:ANYF (match_operand:ANYF 1 "register_operand" "f")
 			      (match_operand:ANYF 2 "register_operand" "f"))
 		   (match_operand:ANYF 3 "register_operand" "f"))))]
-  "ISA_HAS_NMADD4_NMSUB4 (<MODE>mode)
+  "ISA_HAS_NMADD4_NMSUB4
    && TARGET_FUSED_MADD
    && HONOR_SIGNED_ZEROS (<MODE>mode)
    && !HONOR_NANS (<MODE>mode)"
@@ -2359,7 +2359,7 @@ 
 		   (mult:ANYF (match_operand:ANYF 1 "register_operand" "f")
 			      (match_operand:ANYF 2 "register_operand" "f"))
 		   (match_operand:ANYF 3 "register_operand" "0"))))]
-  "ISA_HAS_NMADD3_NMSUB3 (<MODE>mode)
+  "ISA_HAS_NMADD3_NMSUB3
    && TARGET_FUSED_MADD
    && HONOR_SIGNED_ZEROS (<MODE>mode)
    && !HONOR_NANS (<MODE>mode)"
@@ -2374,7 +2374,7 @@ 
 	 (mult:ANYF (neg:ANYF (match_operand:ANYF 1 "register_operand" "f"))
 		    (match_operand:ANYF 2 "register_operand" "f"))
 	 (match_operand:ANYF 3 "register_operand" "f")))]
-  "ISA_HAS_NMADD4_NMSUB4 (<MODE>mode)
+  "ISA_HAS_NMADD4_NMSUB4
    && TARGET_FUSED_MADD
    && !HONOR_SIGNED_ZEROS (<MODE>mode)
    && !HONOR_NANS (<MODE>mode)"
@@ -2389,7 +2389,7 @@ 
 	 (mult:ANYF (neg:ANYF (match_operand:ANYF 1 "register_operand" "f"))
 		    (match_operand:ANYF 2 "register_operand" "f"))
 	 (match_operand:ANYF 3 "register_operand" "0")))]
-  "ISA_HAS_NMADD3_NMSUB3 (<MODE>mode)
+  "ISA_HAS_NMADD3_NMSUB3
    && TARGET_FUSED_MADD
    && !HONOR_SIGNED_ZEROS (<MODE>mode)
    && !HONOR_NANS (<MODE>mode)"
@@ -2404,7 +2404,7 @@ 
 		   (mult:ANYF (match_operand:ANYF 2 "register_operand" "f")
 			      (match_operand:ANYF 3 "register_operand" "f"))
 		   (match_operand:ANYF 1 "register_operand" "f"))))]
-  "ISA_HAS_NMADD4_NMSUB4 (<MODE>mode)
+  "ISA_HAS_NMADD4_NMSUB4
    && TARGET_FUSED_MADD
    && HONOR_SIGNED_ZEROS (<MODE>mode)
    && !HONOR_NANS (<MODE>mode)"
@@ -2419,7 +2419,7 @@ 
 		   (mult:ANYF (match_operand:ANYF 2 "register_operand" "f")
 			      (match_operand:ANYF 3 "register_operand" "f"))
 		   (match_operand:ANYF 1 "register_operand" "0"))))]
-  "ISA_HAS_NMADD3_NMSUB3 (<MODE>mode)
+  "ISA_HAS_NMADD3_NMSUB3
    && TARGET_FUSED_MADD
    && HONOR_SIGNED_ZEROS (<MODE>mode)
    && !HONOR_NANS (<MODE>mode)"
@@ -2434,7 +2434,7 @@ 
 	 (match_operand:ANYF 1 "register_operand" "f")
 	 (mult:ANYF (match_operand:ANYF 2 "register_operand" "f")
 		    (match_operand:ANYF 3 "register_operand" "f"))))]
-  "ISA_HAS_NMADD4_NMSUB4 (<MODE>mode)
+  "ISA_HAS_NMADD4_NMSUB4
    && TARGET_FUSED_MADD
    && !HONOR_SIGNED_ZEROS (<MODE>mode)
    && !HONOR_NANS (<MODE>mode)"
@@ -2449,7 +2449,7 @@ 
 	 (match_operand:ANYF 1 "register_operand" "f")
 	 (mult:ANYF (match_operand:ANYF 2 "register_operand" "f")
 		    (match_operand:ANYF 3 "register_operand" "0"))))]
-  "ISA_HAS_NMADD3_NMSUB3 (<MODE>mode)
+  "ISA_HAS_NMADD3_NMSUB3
    && TARGET_FUSED_MADD
    && !HONOR_SIGNED_ZEROS (<MODE>mode)
    && !HONOR_NANS (<MODE>mode)"
Index: gcc-fsf-trunk-quilt/gcc/doc/invoke.texi
===================================================================
--- gcc-fsf-trunk-quilt.orig/gcc/doc/invoke.texi	2013-02-07 02:21:07.574131467 +0000
+++ gcc-fsf-trunk-quilt/gcc/doc/invoke.texi	2013-02-07 02:59:48.585476315 +0000
@@ -16440,10 +16440,12 @@  Enable (disable) use of the floating-poi
 instructions, when they are available.  The default is
 @option{-mfused-madd}.
 
-When multiply-accumulate instructions are used, the intermediate
-product is calculated to infinite precision and is not subject to
-the FCSR Flush to Zero bit.  This may be undesirable in some
-circumstances.
+On the R8000 CPU when multiply-accumulate instructions are used,
+the intermediate product is calculated to infinite precision
+and is not subject to the FCSR Flush to Zero bit.  This may be
+undesirable in some circumstances.  On other processors the result
+is numerically identical to the equivalent computation using
+separate multiply, add, subtract and negate instructions.
 
 @item -nocpp
 @opindex nocpp