Patchwork Support for MIPS r5900

Submitter Jürgen Urban
Date Jan. 6, 2013, 10:56 p.m.
Message ID <20130106225645.190700@gmx.net>
Permalink /patch/209842/
State New

Comments

Jürgen Urban - Jan. 6, 2013, 10:56 p.m.
Hello,

I created a patch from scratch to support MIPS r5900 used in the Playstation 2, but I have some problems with it.
The attached patch only works with the latest binutils from CVS. Binutils forces the compiler to use r5900-compatible instructions, which is good for finding errors in GCC. Later I will try to submit a patch here, but first I need some help.
The MIPS r5900 supports 32-bit, 64-bit and 128-bit data accesses on a 32-bit address bus. It supports instructions from MIPS ISA I, II, III and IV and has additional instructions, but none of these ISA levels is complete: at each level some instructions are missing.
It can run MIPS ABI o32, n32 and o64 code, as long as unsupported instructions are either not used or emulated by the operating system and addresses stay within the first 32 bits.
My patch adds support for r5900 and tries to disable the following unsupported instructions:
ll, sc, dmult, ddiv, cvt.w.s, 64 bit FPU instructions.
ll and sc are disabled with "-mno-llsc", and this works.
cvt.w.s is replaced by trunc.w.s. This seems to work.
I disabled 64-bit FPU instructions with "-msoft-float". This works, but "-msingle-float", which would be the better configuration, fails: 64-bit FPU instructions are still used (e.g. "dmfc1   $2,$f0" for "long double" multiplication). So "-msingle-float" doesn't seem to work on generic mips64-linux-gnu.
I tried to disable dmult and ddiv (see mips.md). Disabling worked, but now muldi3 calls itself in libgcc2. I thought this should work, because I got it working with GCC 4.3, but it fails with the latest GCC version. multi3 calls muldi3, so muldi3 should be able to use mulsi3, because it is the same C code in libgcc2. Can someone give me some hints or comments? How can this be debugged?

Does someone know how to enable TImode in MIPS ABI o32 (this doesn't need to use the 128-bit instructions at the moment)? There is some old code for the Playstation 2 which needs this. I know that TImode is supported in ABI n32, but some code also uses the 32-bit FPU, and FPU registers are not available in inline assembler with "-msoft-float".

What is the best way to change the alignment to 128 bit for all structures and stack in any MIPS ABI? Much old code for the Playstation 2 expects this.

Best regards
Jürgen Urban
Jeff Law - Jan. 7, 2013, 5:15 p.m.
On 01/06/2013 03:56 PM, "Jürgen Urban" wrote:
> Hello,
>
> I created a patch from scratch to support MIPS r5900 used in the
> Playstation 2, but I have some problems with it. The attached patch
> only works with the latest binutils from CVS. The binutils forces the
> compiler to use r5900 compatible instructions which is good to find
> errors in the GCC. Later I will try to submit a patch here, but first
> I need some help. The MIPS r5900 supports 32 bit, 64 bit and 128 bit
> data accesses on a 32 Bit address bus. It supports instructions from
> MIPS ISA I, II, III, IV and has additional instructions, but none of
> them are complete. On each ISA level there are instructions missing.
> It can run MIPS ABI o32, n32 and o64 code, as long as unsupported
> instructions are not used or emulated by the operating system and the
> addresses keep in the first 32 bit. My patch adds support for r5900
> and tries to disable the following unsupported instructions: ll, sc,
> dmult, ddiv, cvt.w.s, 64 bit FPU instructions. ll and sc is disabled
> with "-mno-llsc" and works. cvt.w.s is replaced by trunc.w.s. This
> seems to work. I disabled 64 bit FPU instructions by "-msoft-float".
> This works, but using "-msingle-float" fails. This would be the
> better configuration. There are still 64 bit FPU instructions used
> (e.g. "dmfc1   $2,$f0" when using "long double" multiplication). So
> "-msingle-float" doesn't seem to work on generic mips64-linux-gnu. I
> tried to disable dmult and ddiv (see mips.md). Disabling worked, but
> now muldi3 calls itself in libgcc2. I thought this should work,
> because I got this working with GCC 4.3, but the latest GCC version
> is a problem. multi3 is calling muldi3, so that muldi3 should be able
> to use mulsi3, because it is the same C code in libgcc2. Can someone
> get me some hints or comments? How can this be debugged?
>
> Does someone know how to enable TImode in MIPS ABI o32 (this doesn't
> need to use the 128 bit instructions at the moment)? There is some
> old code for the Playstation 2 which needs this. I know that TImode
> is supported in ABI n32, but some code uses also the 32 bit FPU and
> FPU registers are not available with "-msoft-float" in inline
> assembler.
>
> What is the best way to change the alignment to 128 bit for all
> structures and stack in any MIPS ABI? Much old code for the
> Playstation 2 expects this.
Hmm, I did an R5900 port back in the late 90s...  Did that port never get
contributed? (Yes, my memory is that bad these days.)

As far as getting TI mode working, IIRC I did a configury hack of some 
sort to force using a 64bit host wide integer, that in turn made it 
possible to support TImode as a pair of 64bit HWIs.

As far as aligning structures and the stack, GCC has a standard set of 
macros to define structure & stack alignment.

Jeff
Richard Sandiford - Jan. 7, 2013, 8:44 p.m.
Jeff Law <law@redhat.com> writes:
> On 01/06/2013 03:56 PM, "Jürgen Urban" wrote:
>> Hello,
>>
>> I created a patch from scratch to support MIPS r5900 used in the
>> Playstation 2, but I have some problems with it. The attached patch
>> only works with the latest binutils from CVS. The binutils forces the
>> compiler to use r5900 compatible instructions which is good to find
>> errors in the GCC. Later I will try to submit a patch here, but first
>> I need some help. The MIPS r5900 supports 32 bit, 64 bit and 128 bit
>> data accesses on a 32 Bit address bus. It supports instructions from
>> MIPS ISA I, II, III, IV and has additional instructions, but none of
>> them are complete. On each ISA level there are instructions missing.
>> It can run MIPS ABI o32, n32 and o64 code, as long as unsupported
>> instructions are not used or emulated by the operating system and the
>> addresses keep in the first 32 bit. My patch adds support for r5900
>> and tries to disable the following unsupported instructions: ll, sc,
>> dmult, ddiv, cvt.w.s, 64 bit FPU instructions. ll and sc is disabled
>> with "-mno-llsc" and works. cvt.w.s is replaced by trunc.w.s. This
>> seems to work. I disabled 64 bit FPU instructions by "-msoft-float".
>> This works, but using "-msingle-float" fails. This would be the
>> better configuration. There are still 64 bit FPU instructions used
>> (e.g. "dmfc1   $2,$f0" when using "long double" multiplication). So
>> "-msingle-float" doesn't seem to work on generic mips64-linux-gnu. I
>> tried to disable dmult and ddiv (see mips.md). Disabling worked, but
>> now muldi3 calls itself in libgcc2. I thought this should work,
>> because I got this working with GCC 4.3, but the latest GCC version
>> is a problem. multi3 is calling muldi3, so that muldi3 should be able
>> to use mulsi3, because it is the same C code in libgcc2. Can someone
>> get me some hints or comments? How can this be debugged?
>>
>> Does someone know how to enable TImode in MIPS ABI o32 (this doesn't
>> need to use the 128 bit instructions at the moment)? There is some
>> old code for the Playstation 2 which needs this. I know that TImode
>> is supported in ABI n32, but some code uses also the 32 bit FPU and
>> FPU registers are not available with "-msoft-float" in inline
>> assembler.
>>
>> What is the best way to change the alignment to 128 bit for all
>> structures and stack in any MIPS ABI? Much old code for the
>> Playstation 2 expects this.
> Hmm, I did a R5900 port back in the late 90s...  Did that port never get 
> contributed (yes, my memory is that bad these days)

I remember there was talk in the early 2000s of contributing it, but we
never had time.  I think the MIPS copro support was from the r5900 port --
so it's effectively dead at the moment -- and the MODE_HAS_* stuff was
from a refresh of it.  Both of those made their way in, but I think that
was about it.

Richard
Richard Sandiford - Jan. 7, 2013, 9:52 p.m.
"Jürgen Urban" <JuergenUrban@gmx.de> writes:
> ll, sc, dmult, ddiv, cvt.w.s, 64 bit FPU instructions.
> ll and sc is disabled with "-mno-llsc" and works.
> cvt.w.s is replaced by trunc.w.s. This seems to work.

Probably showing my ignorance, but I couldn't see this in the patch.

> I disabled 64 bit FPU instructions by "-msoft-float". This works, but
> using "-msingle-float" fails. This would be the better
> configuration. There are still 64 bit FPU instructions used (e.g. "dmfc1
> $2,$f0" when using "long double" multiplication). So "-msingle-float"
> doesn't seem to work on generic mips64-linux-gnu.

Right.  That combination hasn't really been defined.  What happens
for plain doubles?  Do you pass those in FPRs or GPRs?

> I tried to disable dmult and ddiv (see mips.md). Disabling worked, but
> now muldi3 calls itself in libgcc2. I thought this should work, because
> I got this working with GCC 4.3, but the latest GCC version is a
> problem. multi3 is calling muldi3, so that muldi3 should be able to use
> mulsi3, because it is the same C code in libgcc2. Can someone get me
> some hints or comments? How can this be debugged?

Not sure, sorry.

> Does someone know how to enable TImode in MIPS ABI o32 (this doesn't
> need to use the 128 bit instructions at the moment)? There is some old
> code for the Playstation 2 which needs this. I know that TImode is
> supported in ABI n32, but some code uses also the 32 bit FPU and FPU
> registers are not available with "-msoft-float" in inline assembler.

The n32 TImode support you mention uses pairs of GPRs, whereas I imagine
you'd eventually want to use a single 128-bit GPR.  Is that right?

TImode in the current n32 sense doesn't really make practical sense
for 32-bit targets.  We'd be dealing with quad registers in that case.
I think it would only make sense with TImode registers.

ISTR the TImode registers being a can of worms, especially with the
optimisation to only store the lower 64 bits if the upper 64 weren't used.
(Am I remembering that right?)

When you submit the TImode register support, please make the support
itself a separate patch from the ABI changes.  I.e. one patch to
add TImode registers, one to add TImode to o32, one to add single-GPR
TImode to n32, etc.  For the record, I think all those patches would be
too invasive this late into the 4.8 cycle so would have to wait for 4.9.

Also, any ABI changes should be conditional on a new flag rather than
keyed off the architecture.  That flag would then be the default for
your new configuration.

> What is the best way to change the alignment to 128 bit for all
> structures and stack in any MIPS ABI? Much old code for the Playstation
> 2 expects this.

The stack is STACK_BOUNDARY (already 128 for n32).  Do you mean the
padding of all structure types, or just global structure variables?
If the former, it sounds like ROUND_TYPE_ALIGN, but also sounds scary :-)
If the latter, it's DATA_ALIGNMENT.
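
[Editorial sketch, not part of the original message.] STACK_BOUNDARY, ROUND_TYPE_ALIGN and DATA_ALIGNMENT are target macros inside GCC, so they change alignment for everything the compiler emits. For comparison, the following shows the per-type alternative that user code can already use today: requesting 128-bit alignment with the GCC aligned attribute. The type names and the helper are hypothetical and only serve the illustration; this does not implement the ABI-wide change asked about above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical PS2-style vector type.  The 16-byte (128-bit) alignment
   is requested per type here; it is not imposed by the ABI.  */
struct quad_vec
{
  float x, y, z, w;
} __attribute__ ((aligned (16)));

/* A one-byte payload: the attribute also pads the size up to 16 bytes,
   which is the "padding of all structure types" case.  */
struct tagged_byte
{
  unsigned char tag;
} __attribute__ ((aligned (16)));

/* Return nonzero if P lies on a 128-bit boundary.  */
static int
is_128bit_aligned (const void *p)
{
  return ((uintptr_t) p & 15) == 0;
}

/* A global structure variable, the case DATA_ALIGNMENT would cover.  */
static struct quad_vec global_vec;
```

With the attribute in place, both the global above and any local variable of type struct quad_vec should pass is_128bit_aligned; without it, that depends on the ABI defaults discussed in this thread.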

> @@ -15801,6 +15816,11 @@ mips_reorg_process_insns (void)
>    if (TARGET_FIX_VR4120 || TARGET_FIX_24K)
>      cfun->machine->all_noreorder_p = false;
>  
> +  /* Code compiled for R5900 can't be all noreorder because
> +     we rely on the assembler to work around some errata.  */
> +  if (TARGET_MIPS5900)
> +    cfun->machine->all_noreorder_p = false;
> +
>    /* The same is true for -mfix-vr4130 if we might generate MFLO or
>       MFHI instructions.  Note that we avoid using MFLO and MFHI if
>       the VR4130 MACC and DMACC instructions are available instead;

Please fold this into the clause above it.

> +/* Target supports instructions dmult and dmultu (integer). */
> +#define TARGET_HAS_DMULT	(TARGET_64BIT				\
> +				 && !TARGET_MIPS5900)

Please use ISA_HAS_* for consistency with other macros.  I think it'd
be better to drop the '(integer)'.

> +/* Target supports instructions mult and multu in 32 bit mode (integer). */
> +#define TARGET_HAS_MULT		(mips_isa >= 1)

...and here drop 'in 32 bit mode (integer)'.  32-bit-mode isn't really
relevant here.

> +/* Target supports instructions dmult and dmultu (integer). */
> +#define TARGET_HAS_DDIV		(TARGET_64BIT				\
> +				 && !TARGET_MIPS5900)

Same as above.

> +/* Target supports instructions mult and multu in 32 bit mode (integer). */
> +#define TARGET_HAS_DIV		(mips_isa >= 1)

Here too, plus "div" rather than "mult".
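
[Editorial sketch, not part of the original message.] Taken together, the four comments above suggest macros along the following lines. The ISA_HAS_* names are an assumption based on the review (the review itself uses both ISA_HAS_DMULT and ISA_HAS_DMULTU), and the three variables are stand-ins for GCC's real target flags so that the fragment compiles outside the GCC tree.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for GCC's target flags; in mips.h these come from the
   option machinery, not from plain variables.  */
static bool TARGET_64BIT = true;
static bool TARGET_MIPS5900 = true;
static int mips_isa = 3;

/* ISA supports instructions DMULT and DMULTU.  */
#define ISA_HAS_DMULT	(TARGET_64BIT && !TARGET_MIPS5900)

/* ISA supports instructions MULT and MULTU.  */
#define ISA_HAS_MULT	(mips_isa >= 1)

/* ISA supports instructions DDIV and DDIVU.  */
#define ISA_HAS_DDIV	(TARGET_64BIT && !TARGET_MIPS5900)

/* ISA supports instructions DIV and DIVU.  */
#define ISA_HAS_DIV	(mips_isa >= 1)
```

With these names, the "TARGET_HAS_<D>DIV" condition in the mips.md hunk further down would read "ISA_HAS_<D>DIV", which expands to ISA_HAS_DIV or ISA_HAS_DDIV depending on the mode.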

> @@ -841,10 +859,10 @@ struct mips_cpu_info {
>  
>  /* ISA has the integer conditional move instructions introduced in mips4 and
>     ST Loongson 2E/2F.  */
> -#define ISA_HAS_CONDMOVE        (ISA_HAS_FP_CONDMOVE || TARGET_LOONGSON_2EF)
> +#define ISA_HAS_CONDMOVE        (ISA_HAS_FP_CONDMOVE || TARGET_LOONGSON_2EF || TARGET_MIPS5900)

GCC has a strict 80-column limit, so please make this:

#define ISA_HAS_CONDMOVE	(ISA_HAS_FP_CONDMOVE \
				 || TARGET_LOONGSON_2EF \
				 || TARGET_MIPS5900)

>  /* ISA has LDC1 and SDC1.  */
> -#define ISA_HAS_LDC1_SDC1	(!ISA_MIPS1 && !TARGET_MIPS16)
> +#define ISA_HAS_LDC1_SDC1	(!ISA_MIPS1 && !TARGET_MIPS16 && !TARGET_MIPS5900)

Same 3-line expansion here.

> @@ -974,7 +993,11 @@ struct mips_cpu_info {
>  /* True if trunc.w.s and trunc.w.d are real (not synthetic)
>     instructions.  Both require TARGET_HARD_FLOAT, and trunc.w.d
>     also requires TARGET_DOUBLE_FLOAT.  */
> -#define ISA_HAS_TRUNC_W		(!ISA_MIPS1)
> +#define ISA_HAS_TRUNC_W_D	(!ISA_MIPS1)
> +
> +/* True if trunc.w.s is real (not synthetic) instructions.
> +   Requires TARGET_HARD_FLOAT.  */
> +#define ISA_HAS_TRUNC_W_S	(ISA_HAS_TRUNC_W_D || TARGET_MIPS5900)

First comment still describes both cases, so I think the second one
is redundant.  Just:

/* True if trunc.w.s and trunc.w.d are real (not synthetic)
   instructions.  Both require TARGET_HARD_FLOAT, and trunc.w.d
   also requires TARGET_DOUBLE_FLOAT.  */
#define ISA_HAS_TRUNC_W_D	(!ISA_MIPS1)
#define ISA_HAS_TRUNC_W_S	(ISA_HAS_TRUNC_W_D || TARGET_MIPS5900)

> @@ -726,7 +727,7 @@
>  ;; This mode iterator allows :MOVECC to be used anywhere that a
>  ;; conditional-move-type condition is needed.
>  (define_mode_iterator MOVECC [SI (DI "TARGET_64BIT")
> -                              (CC "TARGET_HARD_FLOAT && !TARGET_LOONGSON_2EF")])
> +                              (CC "TARGET_HARD_FLOAT && !TARGET_LOONGSON_2EF && !TARGET_MIPS5900")])

Same three-line expansion here:

(define_mode_iterator MOVECC [SI (DI "TARGET_64BIT")
			      (CC "TARGET_HARD_FLOAT
				   && !TARGET_LOONGSON_2EF
				   && !TARGET_MIPS5900")])

> @@ -1900,7 +1901,7 @@
>    [(set (match_operand:DI 0 "muldiv_target_operand" "=ka")
>  	(mult:DI (any_extend:DI (match_operand:SI 1 "register_operand" "d"))
>  		 (any_extend:DI (match_operand:SI 2 "register_operand" "d"))))]
> -  "!TARGET_64BIT && (!TARGET_FIX_R4000 || ISA_HAS_DSP)"
> +  "(!TARGET_64BIT || (TARGET_64BIT && !TARGET_HAS_DMULT)) && (!TARGET_FIX_R4000 || ISA_HAS_DSP)"
>  {
>    if (ISA_HAS_DSP_MULT)
>      return "mult<u>\t%q0,%1,%2";

Just:

  "!ISA_HAS_DMULTU && (!TARGET_FIX_R4000 || ISA_HAS_DSP)"

Please update <u>mulsidi3_32bit_mips16 and <u>mulsidi3_32bit_r4000
in the same way.

> @@ -1927,7 +1928,7 @@
>  		 (any_extend:DI (match_operand:SI 2 "register_operand" "d"))))
>     (clobber (match_scratch:TI 3 "=x"))
>     (clobber (match_scratch:DI 4 "=d"))]
> -  "TARGET_64BIT && !TARGET_FIX_R4000 && !ISA_HAS_DMUL3 && !TARGET_MIPS16"
> +  "TARGET_64BIT && !TARGET_FIX_R4000 && !ISA_HAS_DMUL3 && !TARGET_MIPS16 && TARGET_HAS_DMULT"
>    "#"
>    "&& reload_completed"
>    [(const_int 0)]

Just:

  "ISA_HAS_DMULTU && !TARGET_FIX_R4000 && !ISA_HAS_DMUL3 && !TARGET_MIPS16"

Please update <u>mulsidi3_64bit_mips16 in the same way.

> @@ -2105,7 +2106,7 @@
>  {
>    rtx hilo;
>  
> -  if (TARGET_64BIT)
> +  if (TARGET_64BIT && TARGET_HAS_DMULT)
>      {
>        hilo = gen_rtx_REG (TImode, MD_REG_FIRST);
>        emit_insn (gen_<u>mulsidi3_64bit_hilo (hilo, operands[1], operands[2]));

Here too just ISA_HAS_DMULT.  Several other cases later on, I won't
bore you with them all :-)

> @@ -2537,7 +2541,7 @@
>     (set (match_operand:GPR 3 "register_operand")
>  	(mod:GPR (match_dup 1)
>  		 (match_dup 2)))]
> -  "!TARGET_FIX_VR4120"
> +  "!TARGET_FIX_VR4120 && TARGET_HAS_<D>DIV"
>  {
>    if (TARGET_MIPS16)
>      {

Would prefer the ISA_HAS_<D>DIV first.  Please update the MIPS16 patterns
in the same way.

> @@ -1881,11 +1881,17 @@ mipsisa64sb1-*-elf* | mipsisa64sb1el-*-e
>  	target_cpu_default="MASK_64BIT|MASK_FLOAT64"
>  	tm_defines="${tm_defines} MIPS_ISA_DEFAULT=64 MIPS_CPU_STRING_DEFAULT=\\\"sb1\\\" MIPS_ABI_DEFAULT=ABI_O64"
>  	;;
> -mips-*-elf* | mipsel-*-elf*)
> +mips-*-elf* | mipsel-*-elf* | mipsr5900-*-elf* | mipsr5900el-*-elf*)
>  	tm_file="elfos.h newlib-stdint.h ${tm_file} mips/elf.h"
>  	tmake_file="mips/t-elf"
>  	;;
> -mips64-*-elf* | mips64el-*-elf*)
> +mips64r5900-*-elf* | mips64r5900el-*-elf*)
> +	tm_file="elfos.h newlib-stdint.h ${tm_file} mips/elf.h"
> +	tmake_file="mips/t-elf"
> +	target_cpu_default="MASK_64BIT"
> +	tm_defines="${tm_defines} MIPS_ISA_DEFAULT=3 MIPS_ABI_DEFAULT=ABI_N32"
> +	;;
> +mips64-*-elf* | mips64el-*-elf* | mips64r5900-*-elf* | mips64r5900el-*-elf*)
>  	tm_file="elfos.h newlib-stdint.h ${tm_file} mips/elf.h"
>  	tmake_file="mips/t-elf"
>  	target_cpu_default="MASK_64BIT|MASK_FLOAT64"

The change to the "mips64-*-elf* | mips64el-*-elf*)" line looks unnecessary.

> @@ -3374,7 +3400,7 @@ case "${target}" in
>  		supported_defaults="abi arch arch_32 arch_64 float tune tune_32 tune_64 divide llsc mips-plt synci"
>  
>  		case ${with_float} in
> -		"" | soft | hard)
> +		"" | soft | hard | single | double)
>  			# OK
>  			;;
>  		*)

Please leave this out for now and add it with the ABI changes mentioned
above.

I can't approve the libgcc bits, but I'm afraid they probably tip the
balance against this being acceptable for 4.8.

Richard
Maciej W. Rozycki - Jan. 8, 2013, 12:23 a.m.
On Mon, 7 Jan 2013, Richard Sandiford wrote:

> > ll, sc, dmult, ddiv, cvt.w.s, 64 bit FPU instructions.
> > ll and sc is disabled with "-mno-llsc" and works.
> > cvt.w.s is replaced by trunc.w.s. This seems to work.
> 
> Probably showing my ignorance, but I couldn't see this in the patch.

 This has raised my attention -- AFAICS the binutils change recently 
approved correctly disables DMULT, DDIV, CVT.W.S, etc. for -march=r5900, 
but does not do that for LL or SC.  I think that should be fixed.  And I 
gather LLD and SCD should then be disabled as well.

> > I disabled 64 bit FPU instructions by "-msoft-float". This works, but
> > using "-msingle-float" fails. This would be the better
> > configuration. There are still 64 bit FPU instructions used (e.g. "dmfc1
> > $2,$f0" when using "long double" multiplication). So "-msingle-float"
> > doesn't seem to work on generic mips64-linux-gnu.
> 
> Right.  That combination hasn't really been defined.  What happens
> for plain doubles?  Do you pass those in FPRs or GPRs?

 IIUC the R5900 has an FPU that is functionally the same as that of the 
R4640/R4650.  If that is the case, then there is no way to pass doubles in 
FPRs -- there is no room to store the upper halves.  The single-precision 
FPU of the R4640/R4650 processors can be configured with CP0.Status.FR to 
present a register file of either 16 or 32 32-bit registers.  The upper 
halves are not implemented.

 Frankly I don't think we have an ABI to express doubles on such platforms 
-- we could "approximate" one by passing doubles in GPRs and singles in 
FPRs (where mandated by o32), but that would really be an entirely new 
ABI.  The compiler could presumably be taught to call soft-float routines 
for double arithmetic and emit FP machine code for single arithmetic.  
I'm not sure how feasible the use of single float could be in the 
soft-float library.

 Things would get more complicated if one wanted to run a real OS such as 
Linux on the R5900 and let the kernel FP emulator handle the missing 
double FP automagically -- this is a little bit out of scope here as 
regular -mdouble-float would then just do, but makes me wonder whether 
-mfp32 should really be enforced (as opposed to just defaulted) for the 
R5900, hmm...

  Maciej
Jeff Law - Jan. 8, 2013, 4:22 a.m.
On 01/07/2013 02:52 PM, Richard Sandiford wrote:

>> I disabled 64 bit FPU instructions by "-msoft-float". This works, but
>> using "-msingle-float" fails. This would be the better
>> configuration. There are still 64 bit FPU instructions used (e.g. "dmfc1
>> $2,$f0" when using "long double" multiplication). So "-msingle-float"
>> doesn't seem to work on generic mips64-linux-gnu.
>
> Right.  That combination hasn't really been defined.  What happens
> for plain doubles?  Do you pass those in FPRs or GPRs?
IIRC we defined doubles as 32bits wide in our old port.  We simply 
didn't support 64bit wide doubles.  I don't remember what mechanism we 
used to make this happen.

>
>> I tried to disable dmult and ddiv (see mips.md). Disabling worked, but
>> now muldi3 calls itself in libgcc2. I thought this should work, because
>> I got this working with GCC 4.3, but the latest GCC version is a
>> problem. multi3 is calling muldi3, so that muldi3 should be able to use
>> mulsi3, because it is the same C code in libgcc2. Can someone get me
>> some hints or comments? How can this be debugged?
>
> Not sure, sorry.
IIRC I simply disabled muldi3_internal2 and I think we defined away 
everything related to timode except register-register moves.

Jeff
Richard Sandiford - Jan. 8, 2013, 7:16 a.m.
Richard Sandiford <rdsandiford@googlemail.com> writes:
> "Jürgen Urban" <JuergenUrban@gmx.de> writes:
>> ll, sc, dmult, ddiv, cvt.w.s, 64 bit FPU instructions.
>> ll and sc is disabled with "-mno-llsc" and works.
>> cvt.w.s is replaced by trunc.w.s. This seems to work.
>
> Probably showing my ignorance, but I couldn't see this in the patch.

Maciej's reply made me realise that this sounded like I was responding
to all three lines.  The LL and SC stuff is fine.  It was the CVT.W.S
bit that I couldn't see.

Sorry for the confusion.

Richard
Richard Sandiford - Jan. 8, 2013, 7:22 a.m.
Jeff Law <law@redhat.com> writes:
> On 01/07/2013 02:52 PM, Richard Sandiford wrote:
>
>>> I disabled 64 bit FPU instructions by "-msoft-float". This works, but
>>> using "-msingle-float" fails. This would be the better
>>> configuration. There are still 64 bit FPU instructions used (e.g. "dmfc1
>>> $2,$f0" when using "long double" multiplication). So "-msingle-float"
>>> doesn't seem to work on generic mips64-linux-gnu.
>>
>> Right.  That combination hasn't really been defined.  What happens
>> for plain doubles?  Do you pass those in FPRs or GPRs?
> IIRC we defined doubles as 32bits wide in our old port.  We simply 
> didn't support 64bit wide doubles.  I don't remember what mechanism we 
> used to make this happen.

Ah, yeah.

>>> I tried to disable dmult and ddiv (see mips.md). Disabling worked, but
>>> now muldi3 calls itself in libgcc2. I thought this should work, because
>>> I got this working with GCC 4.3, but the latest GCC version is a
>>> problem. multi3 is calling muldi3, so that muldi3 should be able to use
>>> mulsi3, because it is the same C code in libgcc2. Can someone get me
>>> some hints or comments? How can this be debugged?
>>
>> Not sure, sorry.
> IIRC I simply disabled muldi3_internal2 and I think we defined away 
> everything related to timode except register-register moves.

AIUI the problem that Jürgen's hitting is that _muldi3.o
in libgcc actually contains __multi3 on 64-bit targets,
because LIBGCC2_UNITS_PER_WORD == 8.  Presumably _mulsi3.o would
then contain __muldi3 where needed, but that file doesn't exist.
So he was trying to add it.

If this worked in 4.3 then I assume something has changed in the
last few years.
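
[Editorial sketch, not part of the original message.] The idea behind the double-word multiply in libgcc2 is that a product of two 2-word values can be assembled from word-sized multiplications. The following is a minimal illustration with 32-bit words, not the libgcc2 source; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply two 64-bit values (modulo 2^64) using only 32-bit pieces.  */
static uint64_t
mul64_from_32 (uint64_t a, uint64_t b)
{
  uint32_t a_lo = (uint32_t) a, a_hi = (uint32_t) (a >> 32);
  uint32_t b_lo = (uint32_t) b, b_hi = (uint32_t) (b >> 32);

  /* Widening 32x32->64 product of the low words.  */
  uint64_t low = (uint64_t) a_lo * b_lo;

  /* The cross terms are shifted left by 32, so only their low 32 bits
     can reach the result; a_hi * b_hi is shifted out entirely.  */
  uint32_t cross = a_lo * b_hi + a_hi * b_lo;

  return low + ((uint64_t) cross << 32);
}
```

The widening multiply on the "low" line is where recursion can creep in: if the compiler has no instruction pattern for it and expands it as a call to the very library function being compiled, that function ends up calling itself. Whether this is what happens in Jürgen's build is an assumption; it is only meant to show the mechanism.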

Richard
Richard Sandiford - Jan. 8, 2013, 7:28 a.m.
"Maciej W. Rozycki" <macro@codesourcery.com> writes:
>> > I disabled 64 bit FPU instructions by "-msoft-float". This works, but
>> > using "-msingle-float" fails. This would be the better
>> > configuration. There are still 64 bit FPU instructions used (e.g. "dmfc1
>> > $2,$f0" when using "long double" multiplication). So "-msingle-float"
>> > doesn't seem to work on generic mips64-linux-gnu.
>> 
>> Right.  That combination hasn't really been defined.  What happens
>> for plain doubles?  Do you pass those in FPRs or GPRs?
>
>  IIUC the R5900 has an FPU that is functionally the same as that of the 
> R4640/R4650.  If that is the case, then there is no way to pass doubles in 
> FPRs -- there is no room to store the upper halves.

My point was that you could pass them in consecutive FPRs, like n32 does
for long double.  There's no architectural support for long double either,
but the decision was still to pass them in FPRs rather than GPRs.

I'm not saying that that's a sensible precedent to copy.  I was just
using it as one example of why an ABI has to be defined.

Richard
Maciej W. Rozycki - Jan. 8, 2013, 5:24 p.m.
On Tue, 8 Jan 2013, Richard Sandiford wrote:

> >> > I disabled 64 bit FPU instructions by "-msoft-float". This works, but
> >> > using "-msingle-float" fails. This would be the better
> >> > configuration. There are still 64 bit FPU instructions used (e.g. "dmfc1
> >> > $2,$f0" when using "long double" multiplication). So "-msingle-float"
> >> > doesn't seem to work on generic mips64-linux-gnu.
> >> 
> >> Right.  That combination hasn't really been defined.  What happens
> >> for plain doubles?  Do you pass those in FPRs or GPRs?
> >
> >  IIUC the R5900 has an FPU that is functionally the same as that of the 
> > R4640/R4650.  If that is the case, then there is no way to pass doubles in 
> > FPRs -- there is no room to store the upper halves.
> 
> My point was that you could pass them in consecutive FPRs, like n32 does
> for long double.  There's no architectural support for long double either,
> but the decision was still to pass them in FPRs rather than GPRs.

 You mean using a pair of FPRs (e.g. $f0/$f2) as a sum of two values of 
different exponents for extra precision?  That would make sense, but would 
not match the way the double type has been defined in the ISO C standard 
for IEEE-754 targets -- please note that contrariwise the standard 
provides more freedom as to how the long double type can be implemented on 
IEEE-754 targets.

 Otherwise it would make no sense IMO, the contents would have to be moved 
back to GPRs for any use anyway.

> I'm not saying that that's a sensible precedent to copy.  I was just
> using it as one example of why an ABI has to be defined.

 Not necessarily, the double type may simply be banned or alias to the 
single type.  Especially the latter -- such an arrangement is allowed by 
ISO C as long as the target does not claim IEEE-754 compliance (we'd have 
a problem with the Java frontend though) and I think such a compilation 
mode might be permitted as long as it is useful to someone.

  Maciej
Richard Sandiford - Jan. 8, 2013, 6:25 p.m.
"Maciej W. Rozycki" <macro@codesourcery.com> writes:
> On Tue, 8 Jan 2013, Richard Sandiford wrote:
>> >> > I disabled 64 bit FPU instructions by "-msoft-float". This works, but
>> >> > using "-msingle-float" fails. This would be the better
>> >> > configuration. There are still 64 bit FPU instructions used (e.g. "dmfc1
>> >> > $2,$f0" when using "long double" multiplication). So "-msingle-float"
>> >> > doesn't seem to work on generic mips64-linux-gnu.
>> >> 
>> >> Right.  That combination hasn't really been defined.  What happens
>> >> for plain doubles?  Do you pass those in FPRs or GPRs?
>> >
>> >  IIUC the R5900 has an FPU that is functionally the same as that of the 
>> > R4640/R4650.  If that is the case, then there is no way to pass doubles in 
>> > FPRs -- there is no room to store the upper halves.
>> 
>> My point was that you could pass them in consecutive FPRs, like n32 does
>> for long double.  There's no architectural support for long double either,
>> but the decision was still to pass them in FPRs rather than GPRs.
>
>  You mean using a pair of FPRs (e.g. $f0/$f2) as a sum of two values of 
> different exponents for extra precision?  That would make sense, but would 
> not match the way the double type has been defined in the ISO C standard 
> for IEEE-754 targets -- please note that contrariwise the standard 
> provides more freedom as to how the long double type can be implemented on 
> IEEE-754 targets.

No, I mean passing the two 32-bit halves in two FPRs, like we pass the
two 64-bit halves of long doubles in two FPRs.  Like I say...

>> I'm not saying that that's a sensible precedent to copy.  I was just
>> using it as one example of why an ABI has to be defined.
>
>  Not necessarily, the double type may simply be banned or alias to the 
> single type.  Especially the latter -- such an arrangement is allowed by 
> ISO C as long as the target does not claim IEEE-754 compliance (we'd have 
> a problem with the Java frontend though) and I think such a compilation 
> mode might be permitted as long as it is useful to someone.

But that's the point: we have to define what the rules are.  The definition
includes what isn't allowed.
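
[Editorial sketch, not part of the original message.] The convention mentioned above, passing the two 32-bit halves of a double in two FPRs, can be modelled in plain C as follows. The struct and function names are hypothetical, the code assumes a 64-bit double, and which half goes into which register is exactly the kind of detail an ABI definition would have to fix.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

_Static_assert (sizeof (double) == 8, "sketch assumes a 64-bit double");

/* Hypothetical model of two consecutive 32-bit FPRs.  */
struct fpr_pair
{
  uint32_t lo;	/* Low 32 bits of the double's representation.  */
  uint32_t hi;	/* High 32 bits (sign, exponent, top of mantissa).  */
};

/* Split the bit pattern of D into two 32-bit halves.  */
static struct fpr_pair
split_double (double d)
{
  uint64_t bits;
  struct fpr_pair p;

  memcpy (&bits, &d, sizeof bits);
  p.lo = (uint32_t) bits;
  p.hi = (uint32_t) (bits >> 32);
  return p;
}

/* Reassemble a double from its two halves.  */
static double
join_double (struct fpr_pair p)
{
  uint64_t bits = ((uint64_t) p.hi << 32) | p.lo;
  double d;

  memcpy (&d, &bits, sizeof d);
  return d;
}
```

As Maciej notes, a single-precision FPU cannot compute on such a pair, so the halves would still have to be moved elsewhere before any double arithmetic.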

Richard
Jürgen Urban - Jan. 8, 2013, 9:30 p.m.
Hello Richard,

> > cvt.w.s is replaced by trunc.w.s. This seems to work.
> 
> Probably showing my ignorance, but I couldn't see this in the patch.

trunc.w.s is enabled by ISA_HAS_TRUNC_W_S. This automatically disables cvt.w.s, because trunc.w.s is preferred.
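
[Editorial sketch, not part of the original message.] A likely reason trunc.w.s is sufficient here is that C's float-to-integer conversion always discards the fractional part, which matches the fixed round-toward-zero behaviour of trunc.w.s, whereas cvt.w.s follows the FPU's current rounding mode. The function below only demonstrates the C semantics; its name is hypothetical.

```c
#include <assert.h>

/* Convert F to int the way a C cast does: the fractional part is
   discarded, i.e. the value is rounded toward zero.  */
static int
float_to_int (float f)
{
  return (int) f;
}
```

For example, float_to_int (2.7f) yields 2 and float_to_int (-2.7f) yields -2, regardless of the rounding mode in effect.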

> > I disabled 64 bit FPU instructions by "-msoft-float". This works, but
> > using "-msingle-float" fails. This would be the better
> > configuration. There are still 64 bit FPU instructions used (e.g. "dmfc1
> > $2,$f0" when using "long double" multiplication). So "-msingle-float"
> > doesn't seem to work on generic mips64-linux-gnu.
> 
> Right.  That combination hasn't really been defined.  What happens
> for plain doubles?

This seems to work. There are no unsupported instructions generated.

> Do you pass those in FPRs or GPRs?

I used -mhard-float together with -msingle-float, so it is using FPRs.
 
> The n32 TImode support you mention uses pairs of GPRs, whereas I imagine
> you'd eventually want to use a single 128-bit GPR.  Is that right?

Most old PS2 code will work when supporting this:

typedef unsigned int __u128 __attribute__((mode(TI)));

This currently works with the n32 ABI without any change, but not with the o32 ABI. Support for a 128-bit GPR would be better, so that we have full compatibility with old PS2 code.
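
[Editorial sketch, not part of the original message.] For readers unfamiliar with the mode attribute, this is how such a typedef can be used on a target where GCC provides TImode (for example n32 or any common 64-bit host). The type and helper names are hypothetical; on o32 the typedef itself is what currently fails to compile.

```c
#include <assert.h>
#include <stdint.h>

/* 128-bit unsigned integer via the TImode mode attribute, as in the
   typedef quoted above.  Requires a target with TImode support.  */
typedef unsigned int ps2_u128 __attribute__ ((mode (TI)));

/* Build a 128-bit value from two 64-bit halves.  */
static ps2_u128
make_u128 (uint64_t hi, uint64_t lo)
{
  return ((ps2_u128) hi << 64) | lo;
}

/* Extract the upper 64 bits.  */
static uint64_t
u128_hi (ps2_u128 v)
{
  return (uint64_t) (v >> 64);
}

/* Extract the lower 64 bits.  */
static uint64_t
u128_lo (ps2_u128 v)
{
  return (uint64_t) v;
}
```

Arithmetic on the type carries across the 64-bit boundary, so adding 1 to a value whose low half is all ones increments the high half.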

> For the record, I think all those patches would be
> too invasive this late into the 4.8 cycle so would have to wait for 4.9.

OK. I also want to go step by step here.

> Please use ISA_HAS_* for consistency with other macros.  I think it'd
> be better to drop the '(integer)'.

OK. I thought ISA_HAS_* was related to MIPS ISA. My stuff is only related to one CPU here.

> Several other cases later on, I won't
> bore you with them all :-)

I will rework it and try to get it stable and tested.
  
Best regards
Jürgen
Jürgen Urban - Jan. 8, 2013, 10:34 p.m.
Hello Maciej,

> > > ll, sc, dmult, ddiv, cvt.w.s, 64 bit FPU instructions.
> > > ll and sc is disabled with "-mno-llsc" and works.
> > > cvt.w.s is replaced by trunc.w.s. This seems to work.
> > 
> > Probably showing my ignorance, but I couldn't see this in the patch.
> 
>  This has raised my attention -- AFAICS the binutils change recently 
> approved correctly disables DMULT, DDIV, CVT.W.S, etc. for -march=r5900, 
> but does not do that for LL or SC.  I think that should be fixed.  And I 
> gather LLD and SCD should then be disabled as well.

The glibc can only be compiled with support for ll and sc. The Linux kernel successfully emulates these instructions. When compiling GCC for mips*r5900*-elf (i.e. not Linux), the instructions ll/sc and lld/scd are disabled by my patch.

>  Things would get more complicated if one wanted to run a real OS such as 
> Linux on the R5900 and let the kernel FP emulator handle the missing 
> double FP automagically -- this is a little bit out of scope here as 
> regular -mdouble-float would then just do, but makes me wonder whether 
> -mfp32 should really be enforced (as opposed to just defaulted) for the 
> R5900, hmm...

I tried to emulate the 64-bit FPU while the real 32-bit FPU was enabled in Linux. There are two problems with this:
1. When the program starts, I don't know whether it needs 64-bit or 32-bit FPRs, so the registers are initialized for 32 bit. When dmfc0 or dmtc0 appears, I need to emulate it using the 32-bit FPU, because some 32-bit programs use these instructions with a 32-bit FPU (e.g. Linux 2.6.35 kernel and Debian 5.0). When a 64-bit calculation instruction appears, I need to switch from 32-bit FPRs to 64-bit FPRs. If the program has already used 32-bit instructions with the odd FPRs, there is no way to reconstruct the overwritten part of the 64-bit FPRs.
2. Some undefined instructions (e.g. c.eq.d) don't lead to an exception on an r5900, but have undefined behavior. So no emulation is possible; the CPU just calculates random stuff.
So the FPU needs to be disabled and completely emulated by the kernel, because then all FPU instructions lead to an exception. This is working with Linux 2.6 on PS2.
There are even more problems when running unchanged code from official Fedora 12 on PS2, because of some different opcode encodings. The users of my PS2 Linux 2.6 complain about low speed, because many instructions are emulated. I need a fast implementation, even if the size of the floating point data types is smaller. So the 32-bit FPU must be the default for r5900.

Best regards
Jürgen
Jürgen Urban - Jan. 8, 2013, 10:49 p.m.
> > IIRC we defined doubles as 32bits wide in our old port.  We simply 
> > didn't support 64bit wide doubles.  I don't remember what mechanism we 
> > used to make this happen.
> 
> Ah, yeah.

I think limiting wide doubles would be good.

> >>> I tried to disable dmult and ddiv (see mips.md). Disabling worked, but
> >>> now muldi3 calls itself in libgcc2. I thought this should work, because
> >>> I got this working with GCC 4.3, but the latest GCC version is a
> >>> problem. multi3 is calling muldi3, so that muldi3 should be able to use
> >>> mulsi3, because it is the same C code in libgcc2. Can someone get me
> >>> some hints or comments? How can this be debugged?
> >>
> >> Not sure, sorry.
> > IIRC I simply disabled muldi3_internal2 and I think we defined away 
> > everything related to timode except register-register moves.

@Jeff: I think you know the stringent copyright rules for GCC. I want to use the code from the original patch, but I don't know how many people were involved, so I can't use it without copyright problems. Can you please tell me which code I can use without running into copyright problems? I plan to submit the code for fixing the r5900 short loop bug in GCC.

> AIUI the problem that Jürgen's hitting is that _muldi3.o
> in libgcc actually contains __multi3 on 64-bit targets,
> because LIBGCC2_UNITS_PER_WORD == 8.  Presumably _mulsi3.o would
> then contain __muldi3 where needed, but that file doesn't exist.
> So he was trying to add it.

Yes, this is exactly what happened.
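
For readers following the libgcc discussion: the double-word multiply helpers are built from single-word multiplies, and the same source is compiled at whatever word size LIBGCC2_UNITS_PER_WORD selects. The following is only a minimal sketch of that decomposition at a 32-bit word size, assuming nothing wider than a 32x32->64-bit multiply is available (as when dmult is disabled). The name mul64_from_32 is hypothetical and this is not libgcc's actual source.

```c
#include <stdint.h>

/* Hypothetical helper (not libgcc's code): multiply two 64-bit values
   using only 32x32->64-bit multiplications, i.e. without dmult. */
uint64_t mul64_from_32(uint64_t a, uint64_t b)
{
    uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
    uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);

    /* low x low contributes to all 64 bits of the result. */
    uint64_t low = (uint64_t)a_lo * b_lo;

    /* The cross products only reach the upper 32 bits; whatever would
       overflow past bit 63 is dropped, as in any truncating multiply.
       The 32-bit arithmetic here wraps modulo 2^32 on purpose. */
    uint32_t cross = a_lo * b_hi + a_hi * b_lo;

    /* high x high would land entirely above bit 63, so it is omitted. */
    return low + ((uint64_t)cross << 32);
}
```

If a helper like this were itself compiled into a call to the 64-bit multiply routine, it would recurse, which is the kind of self-call described above.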

Best regards
Jürgen
Jeff Law - Jan. 9, 2013, 5:24 a.m.
On 01/08/2013 03:49 PM, "Jürgen Urban" wrote:
>
> @Jeff: I think you know the stringent copyright rules for GCC. I want
> to use the code from the original patch, but I don't know how many
> people were involved. So I can't use it without copyright problems.
> Can you please tell me which code can I use without encountering
> copyright problems? I plan to submit the code for fixing the r5900
> short loop bug in GCC.
If you're using something from the Cygnus port, then it would be covered 
by the blanket copyright assignment Cygnus had in place with the FSF. 
If there are any doubts about the origin of a hunk of GCC code I could 
probably pull out the old sources to determine if it came from Cygnus's 
code base or not.

Jeff
Jürgen Urban - Jan. 10, 2013, 10:58 p.m.
Hello Jeff,

> If you're using something from the Cygnus port, then it would be covered 
> by the blanket copyright assignment Cygnus had in place with the FSF. 
> If there are any doubts about the origin of a hunk of GCC code I could 
> probably pull out the old sources to determine if it came from Cygnus's 
> code base or not.

Can you please tell me whether the following lines of the patches are covered by the blanket copyright assignment?

Lines 335-533 of GCC patch (mips_r5900_lengthen_loops()):
http://pastie.org/5664783

Lines 410-565 and 581-589 of binutils patch (check_short_loop()):
http://pastie.org/5664824

The patches are from the second DVD of Sony's Linux Toolkit for the PS2.

Best regards
Jürgen
Maciej W. Rozycki - Jan. 10, 2013, 11:24 p.m.
Jürgen,

 Adding the binutils list as more appropriate for some concerns discussed 
here.

> > > > ll, sc, dmult, ddiv, cvt.w.s, 64 bit FPU instructions.
> > > > ll and sc is disabled with "-mno-llsc" and works.
> > > > cvt.w.s is replaced by trunc.w.s. This seems to work.
> > > 
> > > Probably showing my ignorance, but I couldn't see this in the patch.
> > 
> >  This has raised my attention -- AFAICS the binutils change recently 
> > approved correctly disables DMULT, DDIV, CVT.W.S, etc. for -march=r5900, 
> > but does not do that for LL or SC.  I think that should be fixed.  And I 
> > gather LLD and SCD should then be disabled as well.
> 
> The glibc can only be compiled with support for ll and sc. The Linux 
> kernel successfully emulates these instructions. When compiling GCC for 
> mips*r5900*-elf (i.e. not Linux), the instructions ll/sc and lld/scd are 
> disabled by my patch.

 That a particular OS emulates some instructions in software does not 
necessarily make them a part of the architecture.  GAS needs to support 
any target environment, including bare iron, and as such should closely 
match the hardware implementation.  I think the right place to address it 
is glibc.

 The library can be built for the base MIPS I ISA that did not have LL or 
SC instructions either and therefore already has some provisions in place 
to override the processor/ISA selection for the code fragments in question 
so that the instructions otherwise missing from the target hardware 
selected are nevertheless assembled successfully.  This is currently 
enabled for the o32 ABI, where .set mips2 is used to enable the assembly 
of LL and SC.

 Now if that failed for you, then it's a plain bug in GAS that should be 
fixed.  Can you therefore check whether a piece like:

	.set	mips2
	ll	$2, ($3)

assembles correctly with -march=r5900?

 Please note that the issue of LLD and SCD remains open -- these 
instructions are a part of the base MIPS III 64-bit ISA and therefore they 
are assumed by glibc and elsewhere to be present, and they are not 
emulated by Linux.  So not only you'll have to fix up glibc to surround 
their use with .set mips3 for the n64 and n32 ABIs (please note that .set 
mips3 is needed for LL and SC for these ABIs as well to avoid a 
miscalculation of addresses where applicable), but you'll have to add 
emulation code to Linux as well.

 And in any case I insist that the instructions are correctly marked in 
the opcode table.

> >  Things would get more complicated if one wanted to run a real OS such as 
> > Linux on the R5900 and let the kernel FP emulator handle the missing 
> > double FP automagically -- this is a little bit out of scope here as 
> > regular -mdouble-float would then just do, but makes me wonder whether 
> > -mfp32 should really be enforced (as opposed to just defaulted) for the 
> > R5900, hmm...
> 
> I tried to emulate the 64 Bit FPU when the real 32 Bit FPU was enabled 
> in Linux. There are 2 problems with this:
> 
> 1. When the program starts, I don't know if it needs a 64 Bit or 32 Bit 
> FPRs. So registers are initialized for 32 bit. When dmfc0 or dmtc0 
> appears, I need to emulate them using 32 Bit FPU, because some 32 bit 
> programs use these instructions with a 32 Bit FPU (e.g. Linux 2.6.35 
> kernel and Debian 5.0). When a 64 bit calculation instructions appears, 
> I need to switch from 32 bit FPRs to 64 bit FPRs. When the program used 
> 32 bit instructions with the odd FPRs, there is no way to reconstruct 
> the overwritten part of the 64 bit FPRs.

 The mode of the FPU is determined by the ABI -- o32 programs use the 
32-bit configuration (CP0.Status.FR set to 0) and n64/n32 programs use the 
64-bit arrangement (CP0.Status.FR set to 1).  That's already handled 
correctly by the kernel, by configuring the FPU on a process-by-process 
basis according to data obtained from the ELF file header of the 
executable run.

 Of course all double arithmetic would have to be handled by the emulator, 
by trapping the Unimplemented Operation exception.  This would clearly be 
a new mode of operation and not supported out of the box with current 
code as that would have to be tweaked to handle the case where only half 
the register state is stored in hardware.

> 2. Some undefined instructions (e.g. c.eq.d) doesn't lead to an 
> exception on an r5900, but have undefined behavior. So there is no 
> emulation possible. It just calculates random stuff.

 Oh well, that rules out any practical use of the FPU under Linux then.

> So the FPU needs to be disabled and completely emulated by the kernel, 
> because then all FPU instructions lead to an exception. This is working 
> with Linux 2.6 on PS2.

 Naturally, as long as they got the Coprocessor Unusable exception right.

> There are even more problems when running unchanged code from official 
> Fedora 12 on PS2, because of some different opcode encoding. The users 
> of my PS2 Linux 2.6 complain about low speed, because many instructions 
> are emulated. I need some fast implementation, even if the size of the 
> floating point data types is smaller. So 32 bit FPU must be default for 
> r5900.

 That sounds weird -- why would anyone want to use a non-standard encoding 
for any instructions?  The base MIPS III 64-bit ISA was set as far back as 
in 1991.  Is R5900 documentation publicly available BTW?

  Maciej
Jeff Law - Jan. 11, 2013, 4:41 a.m.
On 01/10/2013 03:58 PM, "Jürgen Urban" wrote:
> Hello Jeff,
>
>> If you're using something from the Cygnus port, then it would be covered
>> by the blanket copyright assignment Cygnus had in place with the FSF.
>> If there are any doubts about the origin of a hunk of GCC code I could
>> probably pull out the old sources to determine if it came from Cygnus's
>> code base or not.
>
> Can you please tell me whether the following lines of the patches are covered by the blanket copyright assignment?
>
> Lines 335-533 of GCC patch (mips_r5900_lengthen_loops()):
> http://pastie.org/5664783
>
> Lines 410-565 and 581-589 of binutils patch (check_short_loop()):
> http://pastie.org/5664824
>
> The patches are from the second DVD of Sony's Linux Toolkit for the PS2.
Neither of those would be covered by the blanket assignment as to the 
best of my knowledge they were not written by a Cygnus/Red Hat engineer 
while working for Cygnus/Red Hat.

Clearly they're working around a chip bug, which seems to be documented 
reasonably well in a comment.  Given that documentation you could write 
your own check for that processor bug.

jeff
Richard Sandiford - Jan. 11, 2013, 9:49 a.m.
"Maciej W. Rozycki" <macro@codesourcery.com> writes:
>  And in any case I insist that the instructions are correctly marked in 
> the opcode table.

I agree that it would be better to exclude the unimplemented instructions.
Jürgen: if you're happy to submit a patch along those lines, I promise
to review it.

BTW Maciej, sorry to be prickly about this, but: where I live, "I insist"
has a very domineering ring to it, at least in this kind of context.
The implication tends to be that "having insisted, I really expect it to
happen, simply because it is _I_ who insisted".  Maybe it's not the same
everywhere though.

Richard
Maciej W. Rozycki - Jan. 11, 2013, 4:54 p.m.
On Fri, 11 Jan 2013, Richard Sandiford wrote:

> BTW Maciej, sorry to be prickly about this, but: where I live, "I insist"
> has a very domineering ring to it, at least in this kind of context.
> The implication tends to be that "having insisted, I really expect it to
> happen, simply because it is _I_ who insisted".  Maybe it's not the same
> everywhere though.

 That's probably a shortcoming of my English skills -- sorry about that -- 
I didn't want to sound impolite or to insult anyone, especially you, 
Jürgen.  Your contribution is very welcome even if there are minor issues 
there or some design decisions are not immediately obvious to everyone.  
Please feel free to disagree or argue if you think any opinion expressed 
does not convince you.

  Maciej
Jürgen Urban - Jan. 13, 2013, 2:15 p.m.
Hello Maciej,

>  Now if that failed for you, then it's a plain bug in GAS that should be 
> fixed.  Can you therefore check whether a piece like:
> 
> 	.set	mips2
> 	ll	$2, ($3)
> 
> assembles correctly with -march=r5900?

This seems to work. I didn't know that this would work. I thought it would never be possible to generate ll and sc.

>  Please note that the issue of LLD and SCD remains open -- these 
> instructions are a part of the base MIPS III 64-bit ISA and therefore they
> are assumed by glibc and elsewhere to be present, and they are not 
> emulated by Linux.  So not only you'll have to fix up glibc to surround 
> their use with .set mips3 for the n64 and n32 ABIs (please note that .set 
> mips3 is needed for LL and SC for these ABIs as well to avoid a 
> miscalculation of addresses where applicable), but you'll have to add 
> emulation code to Linux as well.

I haven't seen any code yet that uses lld/scd, so it doesn't seem to be a problem.
I will create a patch which includes tests to ensure that .set mips3 works.

> > So the FPU needs to be disabled and completely emulated by the kernel, 
> > because then all FPU instructions lead to an exception. This is working 
> > with Linux 2.6 on PS2.
> 
>  Naturally, as long as they got the Coprocessor Unusable exception right.

Yes, this exception is also working for instructions with undefined behavior.

> > There are even more problems when running unchanged code from official 
> > Fedora 12 on PS2, because of some different opcode encoding. The users 
> > of my PS2 Linux 2.6 complain about low speed, because many instructions 
> > are emulated. I need some fast implementation, even if the size of the 
> > floating point data types is smaller. So 32 bit FPU must be default for 
> > r5900.
> 
>  That sounds weird -- why would anyone want to use a non-standard encoding
> for any instructions?  The base MIPS III 64-bit ISA was set as far back as
> in 1991.  Is R5900 documentation publicly available BTW?

The documentation for r5900 is available on the first DVD of Sony's Linux Toolkit and in the SDK for the PS2, which is only available to people whom I would call "verified Sony customers".
The TX79 core is similar to the r5900:
http://www.lukasz.dk/files/tx79architecture.pdf
But the TX79 has a 64-bit FPU, so there are no real problems with opcode encoding. This document also says that MIPS ISA III is supported, but not ll, sc, lld, scd, dmult and ddiv.
In binutils/opcodes/mips-opc.c you can see the different opcode encoding for c.lt.s and trunc.w.s, and the missing c.olt.s and cvt.w.s instructions. These are caused by the FPU. This is no problem on the TX79.
For Fedora 12 I need to disable the FPU and emulate everything.
One of the biggest problems is that most Linux programs use the rdhwr instruction (0x7c03e83b). I don't know any MIPS CPU which supports this instruction. This has the same encoding as the "sq v1,-6085(zero)" instruction on the r5900. Luckily this always leads to an alignment exception which is handled correctly by my Linux kernel to emulate rdhwr.
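
The overlap between the two decodings can be checked by splitting the word 0x7c03e83b into the standard MIPS instruction fields. This is only an illustrative sketch; the struct and function names are made up for the example. Read as an R-type word, the fields give opcode 0x1f, function 0x3b, rt = 3 and rd = 29. Read as an I-type word, they give base register 0, rt = 3 ($v1) and a sign-extended offset of -6085, which matches the "sq v1,-6085(zero)" form quoted above.

```c
#include <stdint.h>

/* Fields of a 32-bit MIPS instruction word (standard R/I-type layout).
   Names are hypothetical and only used for this illustration. */
struct mips_fields {
    unsigned opcode;  /* bits 31..26 */
    unsigned rs;      /* bits 25..21, base register for loads/stores */
    unsigned rt;      /* bits 20..16 */
    unsigned rd;      /* bits 15..11 (R-type only) */
    unsigned funct;   /* bits 5..0   (R-type only) */
    int      imm16;   /* bits 15..0, sign-extended (I-type only) */
};

struct mips_fields decode_mips(uint32_t insn)
{
    struct mips_fields f;
    int imm = (int)(insn & 0xffffu);

    f.opcode = (insn >> 26) & 0x3f;
    f.rs     = (insn >> 21) & 0x1f;
    f.rt     = (insn >> 16) & 0x1f;
    f.rd     = (insn >> 11) & 0x1f;
    f.funct  = insn & 0x3f;
    /* Sign-extend the 16-bit immediate by hand to stay portable. */
    f.imm16  = (imm & 0x8000) ? imm - 0x10000 : imm;
    return f;
}
```

For 0x7c03e83b the low 16 bits are 0xe83b, i.e. 59451 - 65536 = -6085.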

Here is some information from the EE core user's manual regarding FPU:
This unit is not IEEE 754 compatible.
Supports single-precision format as defined in the IEEE 754 specification.
Plus/Minus "0" in line with IEEE 754 specification are supported.
NaNs and plus/minus infinities are not supported.
No hardware exception mechanism to affect instruction execution.

The FPU only supports "Rounding towards 0".
... the results may differ from the IEEE 754 Rounding to 0. This difference is usually restricted to the least significant bit only.

NaN, +inf, -inf and denormalized numbers are not supported
The FPU does not use the Guard, Round and Sticky bits during computations.
Invalid Operation exceptions due to NaN, +/-inf and Inexact exceptions are not supported.

Operations with different results:
- 0/0
- Sqrt (negative number)
- Division by zero
- Exponent overflow
- Exponent underflow
- Conversion of Floating-point to Integer Overflow

Best regards
Jürgen
Maciej W. Rozycki - Jan. 14, 2013, 6:42 p.m.
Hi Jürgen,

> >  Now if that failed for you, then it's a plain bug in GAS that should be 
> > fixed.  Can you therefore check whether a piece like:
> > 
> > 	.set	mips2
> > 	ll	$2, ($3)
> > 
> > assembles correctly with -march=r5900?
> 
> This seems to work. I didn't know that this would work. I thought it 
> would never be possible to generate ll and sc.

 Excellent, I hoped that it would work as we've been using these overrides 
for years, except that usually they are used to tweak the ISA selected 
rather than a specific CPU (with -march= you can request either an ISA or 
a CPU).  Your case is the first I've personally encountered where the CPU 
selected needs an override, so I'm glad that it just works.

> >  Please note that the issue of LLD and SCD remains open -- these 
> > instructions are a part of the base MIPS III 64-bit ISA and therefore they
> > are assumed by glibc and elsewhere to be present, and they are not 
> > emulated by Linux.  So not only you'll have to fix up glibc to surround 
> > their use with .set mips3 for the n64 and n32 ABIs (please note that .set 
> > mips3 is needed for LL and SC for these ABIs as well to avoid a 
> > miscalculation of addresses where applicable), but you'll have to add 
> > emulation code to Linux as well.
> 
> I didn't see any code yet that uses lld/scd, so it doesn't seem to be a 
> problem. I will create a patch which includes tests that will ensure 
> that .set mips3 will work.

 Glibc uses them exactly where it uses 32-bit LL/SC, except where a 64-bit 
data type is involved.  Of course that also requires a 64-bit ABI, either 
n64 or n32, as these are 64-bit instructions -- from what you wrote thus 
far I've gathered, perhaps incorrectly, that you've been using either or 
both too, in addition to o32 -- is my understanding correct?

> > > There are even more problems when running unchanged code from official 
> > > Fedora 12 on PS2, because of some different opcode encoding. The users 
> > > of my PS2 Linux 2.6 complain about low speed, because many instructions 
> > > are emulated. I need some fast implementation, even if the size of the 
> > > floating point data types is smaller. So 32 bit FPU must be default for 
> > > r5900.
> > 
> >  That sounds weird -- why would anyone want to use a non-standard encoding
> > for any instructions?  The base MIPS III 64-bit ISA was set as far back as
> > in 1991.  Is R5900 documentation publicly available BTW?
> 
> The documentation for r5900 is available on the first DVD of Sony's 
> Linux Toolkit and in the SDK for the PS2 which is only available for 
> people which I would call "verified Sony customers".

 OK, I see, so not really public, sigh...

> The TX79 core is similar to the r5900:
> http://www.lukasz.dk/files/tx79architecture.pdf
> But the TX79 has a 64 Bit FPU, so there are no real problems with opcode 
> encoding. This document also says that mips isa III is supported, but 
> not ll,sc,lld,scd,dmult and ddiv. In binutils/opcodes/mips-opc.c you can 
> see the different opcode encoding for c.lt.s and trunc.w.s, the missing 
> c.olt.s and cvt.w.s instructions. These are caused by the FPU. This is 
> no problem on the TX79.

 Oh well, missing instructions are not that much of a problem, they can 
always be emulated.  Instruction words that implement operation different 
to one expected are a show-stopper though.

 I see that the encodings supposed to implement C.OLT.S and C.OLE.S are 
interpreted as C.LT.S and C.LE.S, respectively.  However the former
instructions differ from the latter only in how quiet NaNs are treated.  
Given that, as you say, the processor does not support NaNs anyway, this 
may well be considered correct operation.  You may still need to emulate 
the other encoding though.

 How are unsupported floating-point data treated, BTW -- what results does 
the processor produce for floating-point encodings that would normally be 
interpreted as not-a-number, an infinity or a denormalised number?  Are 
they treated numerically, beyond the range IEEE-754 single provides?  You 
say that the Invalid Operation exception is not raised, so they cannot be 
trapped and emulated.

> For Fedora 12 I need to disable the FPU and emulate everything.

 Well, given the lack of full IEEE-754 support you'll always have to do 
that for generic MIPS code.  The kernel could interpret E_MIPS_MACH_5900 
set in the ELF file header flags though and enable the FPU selectively for 
compatible binaries.  Such binaries might produce computational results 
different to expected of course.  You'd have to enforce object-file 
compatibility though and make sure R5900 binaries do not run with the FPU 
enabled on other hardware too.
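
A rough sketch of the selection policy described above, not actual kernel code: the machine field of the ELF header's e_flags is compared against E_MIPS_MACH_5900. The two constants are written out here as I believe they appear in binutils' include/elf/mips.h, which should be treated as an assumption to verify. The function names and the policy function itself are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, believed to match binutils' include/elf/mips.h. */
#define EF_MIPS_MACH      0x00FF0000u  /* mask for the machine field */
#define E_MIPS_MACH_5900  0x00920000u  /* binary built for the R5900 */

/* True if the ELF header flags mark the binary as R5900-specific. */
bool elf_is_r5900(uint32_t e_flags)
{
    return (e_flags & EF_MIPS_MACH) == E_MIPS_MACH_5900;
}

/* Hypothetical policy: may this binary run with the hardware FPU on?
   - On an R5900 only R5900 binaries may, since generic MIPS code
     expects IEEE-754 behaviour and needs full emulation.
   - On other hardware R5900 binaries must not, since they were built
     against the non-IEEE FPU. */
bool may_use_hw_fpu(uint32_t e_flags, bool cpu_is_r5900)
{
    bool binary_is_r5900 = elf_is_r5900(e_flags);
    return cpu_is_r5900 ? binary_is_r5900 : !binary_is_r5900;
}
```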

 That might be an interesting project if you'd like to dive into it.

> One of the biggest problem is that most Linux programs use the rdhwr 
> instruction (0x7c03e83b). I don't know any MIPS CPU which supports this 
> instruction.

 Oh, pretty much all modern MIPS processors do -- this instruction has 
been mandatory since the introduction of the MIPS32r2 and MIPS64r2 ISA 
level.  The UserLocal CP0 register accessible with RDHWR <rt>, $29 is 
however optional, so the instruction may still trap on some processors 
that otherwise support it, but there are such that do not, e.g. the 74K 
family processors.

> This has the same encoding as the "sq v1,-6085(zero)" 
> instruction on the r5900. Luckily this always leads to an alignment 
> exception which is handled correctly by my Linux kernel to emulate 
> rdhwr.

 Good.

> Here is some information from the EE core user's manual regarding FPU:
> This unit is not IEEE 754 compatible.
> Supports single-precision format as defined in the IEEE 754 specification.
> Plus/Minus "0" in line with IEEE 754 specification are supported.
> NaNs and plus/minus infinities are not supported.
> No hardware exception mechanism to affect instruction execution.
> 
> The FPU only supports "Rounding towards 0".
> ... the results may differ from the IEEE 754 Rounding to 0. This 
> difference is usually restricted to the least significant bit only.
> 
> NaN, +inf, -inf and denormalized numbers are not supported
> The FPU does not use the Guard, Round and Sticky bits during computations.
> Invalid Operation exceptions due to NaN, +/-inf and Inexact exceptions 
> are not supported.
> 
> Operations with different results:
> - 0/0
> - Sqrt (negative number)
> - Division by zero
> - Exponent overflow
> - Exponent underflow
> - Conversion of Floating-point to Integer Overflow

 OK, I guess you could still make it a supported processing unit with GCC, 
however I can't speak for GCC maintainers as to whether they would be 
willing to accept such support for inclusion.  Both ISO C and GCC do 
permit non-IEEE-754 floating point arithmetic (cf. VAX, that does not 
support qNaNs, infinities or denormals; sNaNs in a sense are supported).  
You'd probably have to bail out on sources referring to unsupported 
features, e.g. __builtin_inf; I reckon the VAX port does that.

  Maciej
Jürgen Urban - Jan. 17, 2013, 10:20 p.m.
Hello Maciej,

> > >  Please note that the issue of LLD and SCD remains open -- these 
> > > instructions are a part of the base MIPS III 64-bit ISA and therefore they
> > > are assumed by glibc and elsewhere to be present, and they are not 
> > > emulated by Linux.  So not only you'll have to fix up glibc to surround 
> > > their use with .set mips3 for the n64 and n32 ABIs (please note that .set 
> > > mips3 is needed for LL and SC for these ABIs as well to avoid a 
> > > miscalculation of addresses where applicable), but you'll have to add 
> > > emulation code to Linux as well.
> > 
> > I didn't see any code yet that uses lld/scd, so it doesn't seem to be a 
> > problem. I will create a patch which includes tests that will ensure 
> > that .set mips3 will work.
> 
>  Glibc uses them exactly where it uses 32-bit LL/SC, except where a 64-bit
> data type is involved.  Of course that also requires a 64-bit ABI, either 
> n64 or n32, as these are 64-bit instructions -- from what you wrote thus 
> far I've gathered, perhaps incorrectly, that you've been using either or 
> both too, in addition to o32 -- is my understanding correct?

I used o32 and n32 for Linux programs and with the OS of the PS2. I tried to use o64 for the Linux kernel, but I ran into problems with the 64-bit TLBs and with the type "long" being used for pointers, so I decided to use the o32 kernel, which was patched to support n32 user space. I never tried n64. I was not able to find an option to enable n64 in GCC 4.3 (I mean more than -mabi=n64; i.e. multilib).

>  How are unsupported floating-point data treated, BTW -- what results does
> the processor produce for floating-point encodings that would normally be 
> interpreted as not-a-number, an infinity or a denormalised number?  Are 
> they treated numerically, beyond the range IEEE-754 single provides?  You 
> say that the Invalid Operation exception is not raised, so they cannot be 
> trapped and emulated.

The manual says that the traps can be emulated by conditional trap instructions. I have seen such code before, but I can't remember whether it was for x86, ARM, mipsel or r5900.
I tested the calculation with the type "float".
ABI o32 with -mhard-float and -msingle-float produces the following results:
1.000000 (0x3f800000) / 0.000000 (0x00000000) = nan (0x7fffffff)
0.000000 (0x00000000) / 0.000000 (0x00000000) = nan (0x7fffffff)
0.000000 (0x00000000) / nan (0x7fc00000) = 0.000000 (0x00000000)
1.000000 (0x3f800000) + 1.000000 (0x3f800000) = 2.000000 (0x40000000)
1.000000 (0x3f800000) + inf (0x7f800000) = inf (0x7f800000)
inf (0x7f800000) + inf (0x7f800000) = nan (0x7fffffff)
inf (0x7f800000) + -inf (0xff800000) = 0.000000 (0x00000000)
nan (0x7fc00000) + nan (0x7fc00000) = nan (0x7fffffff)
nan (0x7fc00000) + nan (0xffc00000) = 0.000000 (0x00000000)

The r5900 manual calls the result of 0/0 Fmax. So 0x7fffffff seems to be Fmax.
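
To see why 0x7fffffff is plausible as a largest value, it helps to split the pattern into sign, exponent and fraction. Under IEEE 754 an exponent field of 255 with a non-zero fraction is a NaN. The sketch below additionally computes the value the same bits would have if exponent 255 were treated like any other exponent. That treatment is an assumption drawn from the results above, not something taken from the manual, and the function names are hypothetical.

```c
#include <stdint.h>

/* IEEE 754 single: NaN means exponent field 255 and non-zero fraction. */
int ieee_is_nan(uint32_t bits)
{
    return ((bits >> 23) & 0xffu) == 0xffu && (bits & 0x7fffffu) != 0;
}

/* Value of the bit pattern if NO exponent is special (assumption about
   the R5900 FPU).  Exponent field 0 is returned as zero, since
   denormalized numbers are not supported. */
double value_without_specials(uint32_t bits)
{
    unsigned sign = bits >> 31;
    int exp = (int)((bits >> 23) & 0xffu);
    uint32_t frac = bits & 0x7fffffu;
    double v;

    if (exp == 0)
        return sign ? -0.0 : 0.0;

    /* 1.fraction, with the 23-bit fraction scaled by 2^-23. */
    v = 1.0 + (double)frac / 8388608.0;

    /* Scale by 2^(exp - 127); exact in double for this range. */
    for (exp -= 127; exp > 0; exp--)
        v *= 2.0;
    for (; exp < 0; exp++)
        v /= 2.0;

    return sign ? -v : v;
}
```

With this reading 0x7fffffff is (2 - 2^-23) * 2^128, roughly 6.8e38, which is twice the IEEE 754 FLT_MAX and the largest magnitude a 32-bit pattern can encode this way.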

ABI n32 with -msoft-float and -mdouble-float produces the following results (this should be correct):
1.000000 (0x3f800000) / 0.000000 (0x00000000) = inf (0x7f800000)
0.000000 (0x00000000) / 0.000000 (0x00000000) = nan (0x7f8fffff)
0.000000 (0x00000000) / nan (0x7fc00000) = nan (0x7fcfffff)
1.000000 (0x3f800000) + 1.000000 (0x3f800000) = 2.000000 (0x40000000)
1.000000 (0x3f800000) + inf (0x7f800000) = inf (0x7f800000)
inf (0x7f800000) + inf (0x7f800000) = inf (0x7f800000)
inf (0x7f800000) + -inf (0xff800000) = nan (0x7f8fffff)
nan (0x7fc00000) + nan (0x7fc00000) = nan (0x7fcfffff)
nan (0x7fc00000) + nan (0xffc00000) = nan (0x7fcfffff)

Just for comparison: x86_64, Intel(R) Core(TM) i7-2600 CPU
1.000000 (0x3f800000) / 0.000000 (0x00000000) = inf (0x7f800000)
0.000000 (0x00000000) / 0.000000 (0x00000000) = -nan (0xffc00000)
0.000000 (0x00000000) / nan (0x7fc00000) = nan (0x7fc00000)
1.000000 (0x3f800000) + 1.000000 (0x3f800000) = 2.000000 (0x40000000)
1.000000 (0x3f800000) + inf (0x7f800000) = inf (0x7f800000)
inf (0x7f800000) + inf (0x7f800000) = inf (0x7f800000)
inf (0x7f800000) + -inf (0xff800000) = -nan (0xffc00000)
nan (0x7fc00000) + nan (0x7fc00000) = nan (0x7fc00000)
nan (0x7fc00000) + -nan (0xffc00000) = -nan (0xffc00000)
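
The test program behind these tables is not part of this thread. The following is only a guess at how such output could be produced in portable C, assuming a 32-bit float; all function names are hypothetical. The operands are built from their bit patterns so that NaN and infinity inputs are exactly the ones listed.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Reinterpret a float as its 32-bit pattern (memcpy avoids aliasing issues). */
uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Build a float from a 32-bit pattern. */
float float_from_bits(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Apply '/' or '+' at run time; volatile keeps the compiler from
   folding the operation at compile time with host semantics. */
float apply_op(char op, float a, float b)
{
    volatile float x = a, y = b;
    return op == '/' ? x / y : x + y;
}

/* Print one line in the same layout as the tables above. */
void print_case(char op, uint32_t a_bits, uint32_t b_bits)
{
    float a = float_from_bits(a_bits);
    float b = float_from_bits(b_bits);
    float r = apply_op(op, a, b);

    printf("%f (0x%08x) %c %f (0x%08x) = %f (0x%08x)\n",
           a, (unsigned)float_bits(a), op,
           b, (unsigned)float_bits(b),
           r, (unsigned)float_bits(r));
}
```

Calling print_case('/', 0x3f800000, 0x00000000) would correspond to the first line of each table; the printed result depends on the FPU or soft-float library the program is built for.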

> > Here is some information from the EE core user's manual regarding FPU:
> > This unit is not IEEE 754 compatible.
> > Supports single-precision format as defined in the IEEE 754 specification.
> > Plus/Minus "0" in line with IEEE 754 specification are supported.
> > NaNs and plus/minus infinities are not supported.
> > No hardware exception mechanism to affect instruction execution.
> > 
> > The FPU only supports "Rounding towards 0".
> > ... the results may differ from the IEEE 754 Rounding to 0. This 
> > difference is usually restricted to the least significant bit only.
> > 
> > NaN, +inf, -inf and denormalized numbers are not supported
> > The FPU does not use the Guard, Round and Sticky bits during computations.
> > Invalid Operation exceptions due to NaN, +/-inf and Inexact exceptions 
> > are not supported.
> > 
> > Operations with different results:
> > - 0/0
> > - Sqrt (negative number)
> > - Division by zero
> > - Exponent overflow
> > - Exponent underflow
> > - Conversion of Floating-point to Integer Overflow
> 
>  OK, I guess you could still make it a supported processing unit with GCC,
> however I can't speak for GCC maintainers as to whether they would be 
> willing to accept such support for inclusion.  Both ISO C and GCC do 
> permit non-IEEE-754 floating point arithmetic (cf. VAX, that does not 
> support qNaNs, infinities or denormals; sNaNs in a sense are supported).  
> You'd probably have to bail out on sources referring to unsupported 
> features, e.g. __builtin_inf; I reckon the VAX port does that.

I am thinking of using the MIPS soft-float ABI. This means everything is passed in GPRs. Then I plan to implement the libgcc soft-float functions in an optimized way, using the FPU where possible.

Best regards
Jürgen
Maciej W. Rozycki - Jan. 17, 2013, 11:22 p.m.
Hi Jürgen,

> >  Glibc uses them exactly where it uses 32-bit LL/SC, except where a 64-bit
> > data type is involved.  Of course that also requires a 64-bit ABI, either 
> > n64 or n32, as these are 64-bit instructions -- from what you wrote thus 
> > far I've gathered, perhaps incorrectly, that you've been using either or 
> > both too, in addition to o32 -- is my understanding correct?
> 
> I used o32 and n32 for Linux programs and with the OS of the PS2. I 
> tried to use o64 for the Linux kernel, but I've got problems with the 64 
> bit TLBs and that the type "long" is used for pointers, so I decided to 
> use the o32 kernel which was patched to support n32 user space. I never 
> tried n64. I was not able to find an option to enable n64 in the gcc 4.3 
> (I mean more than -mabi=n64; i.e. multilib).

 Well, -mabi= is exactly the option that switches between the three ABIs 
supported under Linux.  The o64 ABI is not supported with Linux, neither 
for userland programs nor for building the kernel.

 The kernel can be built either for 32-bit or for 64-bit support.  In the 
former case the resulting binary is o32 and can only run o32 user 
programs.  In the latter case the kernel binary is n64 and can run n64 
user programs, and can optionally be configured to run either or both o32 
and n32 user programs as well.

 Of course to be able to build and run user programs for the respective 
ABIs you need to have the right development environment and shared 
libraries installed.

 For 32-bit systems it's easy as you only have one ABI to choose from.  
The mips-unknown-linux-gnu and mipsel-unknown-linux-gnu targets are the 
canonical configuration triplets to configure all the pieces for, starting 
from binutils and GCC, for the big and the little endianness respectively.  
That'll build an o32 development environment.

 For 64-bit systems all the three ABIs are supported so it gets a tad more 
complicated.  The mips64-unknown-linux-gnu and mips64el-unknown-linux-gnu 
targets are the canonical configuration triplets here and that'll build 
binutils and GCC that support all the three ABIs.  Then the compiler 
chooses among them by using different library paths -- there are multiple 
of them for each of the ABIs supported, but the rule of thumb is that the 
directory where the libraries are actually located is called "lib32" for 
the n32 ABI, "lib64" for the n64 ABI and "lib" for the o32 ABI.  You 
need to take that into account and set the correct library path -- e.g. 
with --libdir=\${exec_prefix}/lib64 for GNU autoconf scripts and the n64 
ABI -- when building further libraries as they are not normally 
automatically built for all the three ABIs.  Of course you then need to 
include -mabi=n64 among CFLAGS somewhere too.
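
A quick way to confirm which ABI a given set of CFLAGS actually selects is to look at the macros the compiler predefines.  The following is a minimal sketch, assuming GCC's usual MIPS predefines (_MIPS_SIM compared against _ABIO32, _ABIN32, _ABI64 and _ABIO64); the function name mips_abi_name is made up for illustration, and on a non-MIPS compiler it simply reports that the macros are absent.

```c
/* Report the MIPS ABI selected at compile time (e.g. via -mabi=).
   Assumes GCC's _MIPS_SIM predefine; mips_abi_name is a hypothetical
   helper name, not part of any library.  */
const char *mips_abi_name(void)
{
#if defined(_MIPS_SIM) && defined(_ABIO32) && (_MIPS_SIM == _ABIO32)
    return "o32";
#elif defined(_MIPS_SIM) && defined(_ABIN32) && (_MIPS_SIM == _ABIN32)
    return "n32";
#elif defined(_MIPS_SIM) && defined(_ABI64) && (_MIPS_SIM == _ABI64)
    return "n64";
#elif defined(_MIPS_SIM) && defined(_ABIO64) && (_MIPS_SIM == _ABIO64)
    return "o64";
#else
    /* Not a MIPS target, or a compiler without these predefines.  */
    return "unknown";
#endif
}
```

Printing the returned string from a small test program built with the same CFLAGS as the library in question shows whether -mabi= took effect.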

> >  How are unsupported floating-point data treated, BTW -- what results does
> > the processor produce for floating-point encodings that would normally be 
> > interpreted as not-a-number, an infinity or a denormalised number?  Are 
> > they treated numerically, beyond the range IEEE-754 single provides?  You 
> > say that the Invalid Operation exception is not raised, so they cannot be 
> > trapped and emulated.
> 
> The manual says that the traps can be emulated by conditional trap instructions. I saw such code before, but I can't remember whether this was on x86, ARM, mipsel or r5900.

 Yeah, but then you'd have to put these explicit trap instructions 
throughout code somehow -- it's not like the affected floating-point 
instructions are going to trap themselves as expected.
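
To illustrate what "explicit traps throughout code" would amount to, here is a rough sketch in C rather than assembly.  Every floating-point operation gets wrapped in checks on the operand and result encodings.  The names needs_trap, checked_addf and fp_soft_traps are hypothetical, and the counter merely stands in for whatever a conditional trap instruction would invoke on real hardware.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical counter standing in for a trap handler.  */
static unsigned long fp_soft_traps;

/* True for encodings IEEE-754 single treats specially: exponent field
   255 (infinity, NaN) or a denormal (exponent 0, non-zero mantissa).  */
static int needs_trap(float x)
{
    uint32_t u;
    memcpy(&u, &x, sizeof u);
    uint32_t exp = (u >> 23) & 0xffu;
    uint32_t mant = u & 0x7fffffu;
    return exp == 0xffu || (exp == 0 && mant != 0);
}

/* An addition with the checks spelled out before and after the
   operation; a real port would emit a conditional trap instead of
   bumping a counter.  */
float checked_addf(float a, float b)
{
    if (needs_trap(a) || needs_trap(b))
        fp_soft_traps++;
    float r = a + b;
    if (needs_trap(r))
        fp_soft_traps++;
    return r;
}
```

The overhead of two checks per operation is the cost being pointed out above: the hardware does none of this on its own.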

> I tested the calculation with the type "float".
> ABI o32 with -mhard-float and -msingle-float produces the following results:
> 1.000000 (0x3f800000) / 0.000000 (0x00000000) = nan (0x7fffffff)
> 0.000000 (0x00000000) / 0.000000 (0x00000000) = nan (0x7fffffff)
> 0.000000 (0x00000000) / nan (0x7fc00000) = 0.000000 (0x00000000)
> 1.000000 (0x3f800000) + 1.000000 (0x3f800000) = 2.000000 (0x40000000)
> 1.000000 (0x3f800000) + inf (0x7f800000) = inf (0x7f800000)
> inf (0x7f800000) + inf (0x7f800000) = nan (0x7fffffff)
> inf (0x7f800000) + -inf (0xff800000) = 0.000000 (0x00000000)
> nan (0x7fc00000) + nan (0x7fc00000) = nan (0x7fffffff)
> nan (0x7fc00000) + nan (0xffc00000) = 0.000000 (0x00000000)
> 
> The r5900 manual calls the result of 0/0 Fmax. So 0x7fffffff seems to be Fmax.

 So presumably you can get 0x7fffffff as an arithmetic result of a 
calculation involving regular numbers as well, right?  Say 0x7f7ffffe + 
0x74000000 (using the binary-encoded notation)?  That would be beyond the
IEEE-754 single range.

> >  OK, I guess you could still make it a supported processing unit with GCC,
> > however I can't speak for GCC maintainers as to whether they would be 
> > willing to accept such support for inclusion.  Both ISO C and GCC do 
> > permit non-IEEE-754 floating point arithmetic (cf. VAX, that does not 
> > support qNaNs, infinities or denormals; sNaNs in a sense are supported).  
> > You'd probably have to bail out on sources referring to unsupported 
> > features, e.g. __builtin_inf; I reckon the VAX port does that.
> 
> I am thinking on using the MIPS soft float ABI. This means everything is 
> passed in GPRs. Then I plan to implement the libgcc softfloat functions 
> in an optimized way using the FPU when possible.

 That sounds like a good idea to me, although you'll probably still have 
to sort out the issue of using the FPU for R5900 binaries, but not use it 
by accident for regular MIPS binaries somehow.  That could be handled by 
the kernel, by enabling the FPU selectively, for example using the way I 
previously outlined.
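
The soft-float-ABI idea can be sketched as a dispatch: operands arrive as bit patterns in GPRs, and the FPU is used only when neither the operands nor the result can touch an encoding the hardware handles differently from IEEE-754.  This is an illustration under stated assumptions, not libgcc code.  The names sketch_addsf3, slow_addsf3 and fpu_safe_operand are hypothetical, the slow path is a host-arithmetic stand-in for a real integer-only routine, and rounding-mode differences between the FPU and IEEE-754 are ignored here.

```c
#include <stdint.h>
#include <string.h>

static float bits_to_float(uint32_t u) { float f; memcpy(&f, &u, sizeof f); return f; }
static uint32_t float_to_bits(float f) { uint32_t u; memcpy(&u, &f, sizeof u); return u; }

/* Records which path the last call took, for demonstration only.  */
static int fast_path_taken;

/* Zero, or a normal number whose exponent field leaves headroom: with
   both fields in 25..253 a sum cannot reach field 255, and a difference
   is either exactly zero or still a normal number.  */
static int fpu_safe_operand(uint32_t u)
{
    uint32_t exp = (u >> 23) & 0xffu;
    if ((u & 0x7fffffffu) == 0)
        return 1;
    return exp >= 25 && exp <= 253;
}

/* Stand-in for the integer-only soft-float routine (assumption: a real
   implementation would not use floating-point instructions here).  */
static uint32_t slow_addsf3(uint32_t a, uint32_t b)
{
    return float_to_bits(bits_to_float(a) + bits_to_float(b));
}

/* Operands and result travel as 32-bit patterns, as in the soft-float ABI.  */
uint32_t sketch_addsf3(uint32_t a, uint32_t b)
{
    if (fpu_safe_operand(a) && fpu_safe_operand(b)) {
        fast_path_taken = 1;
        return float_to_bits(bits_to_float(a) + bits_to_float(b)); /* FPU add */
    }
    fast_path_taken = 0;
    return slow_addsf3(a, b);
}
```

For example, adding 0x3f800000 to itself takes the fast path, while any operand with exponent field 255 (such as 0x7f800000) is routed to the slow path.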

  Maciej
Richard Sandiford - Jan. 19, 2013, 10:53 a.m.
"Maciej W. Rozycki" <macro@codesourcery.com> writes:
>> I tested the calculation with the type "float".
>> ABI o32 with -mhard-float and -msingle-float produces the following results:
>> 1.000000 (0x3f800000) / 0.000000 (0x00000000) = nan (0x7fffffff)
>> 0.000000 (0x00000000) / 0.000000 (0x00000000) = nan (0x7fffffff)
>> 0.000000 (0x00000000) / nan (0x7fc00000) = 0.000000 (0x00000000)
>> 1.000000 (0x3f800000) + 1.000000 (0x3f800000) = 2.000000 (0x40000000)
>> 1.000000 (0x3f800000) + inf (0x7f800000) = inf (0x7f800000)
>> inf (0x7f800000) + inf (0x7f800000) = nan (0x7fffffff)
>> inf (0x7f800000) + -inf (0xff800000) = 0.000000 (0x00000000)
>> nan (0x7fc00000) + nan (0x7fc00000) = nan (0x7fffffff)
>> nan (0x7fc00000) + nan (0xffc00000) = 0.000000 (0x00000000)
>> 
>> The r5900 manual calls the result of 0/0 Fmax. So 0x7fffffff seems to be Fmax.
>
>  So presumably you can get 0x7fffffff as an arithmetic result of a 
> calculation involving regular numbers as well, right?  Say 0x7f7ffffe + 
> 0x74000000 (using the binary-encoded notation)?  That would be beyond the
> IEEE-754 single range.

Yeah, if I recall correctly.  We already support what I think is the
same format for SPU (spu_single_format), which I suppose makes sense
given its heritage.  Hopefully the format itself won't need much
work in GCC.

Richard
Jürgen Urban - Jan. 20, 2013, 9:42 p.m.
Hello Maciej,

> > I tested the calculation with the type "float".
> > ABI o32 with -mhard-float and -msingle-float produces the following results:
> > 1.000000 (0x3f800000) / 0.000000 (0x00000000) = nan (0x7fffffff)
> > 0.000000 (0x00000000) / 0.000000 (0x00000000) = nan (0x7fffffff)
> > 0.000000 (0x00000000) / nan (0x7fc00000) = 0.000000 (0x00000000)
> > 1.000000 (0x3f800000) + 1.000000 (0x3f800000) = 2.000000 (0x40000000)
> > 1.000000 (0x3f800000) + inf (0x7f800000) = inf (0x7f800000)
> > inf (0x7f800000) + inf (0x7f800000) = nan (0x7fffffff)
> > inf (0x7f800000) + -inf (0xff800000) = 0.000000 (0x00000000)
> > nan (0x7fc00000) + nan (0x7fc00000) = nan (0x7fffffff)
> > nan (0x7fc00000) + nan (0xffc00000) = 0.000000 (0x00000000)
> > 
> > The r5900 manual calls the result of 0/0 Fmax. So 0x7fffffff seems to be Fmax.
> 
>  So presumably you can get 0x7fffffff as an arithmetic result of a 
> calculation involving regular numbers as well, right?  Say 0x7f7ffffe + 
> 0x74000000 (using the binary-encoded notation)?  That would be beyond the
> IEEE-754 single range.

The FPU of the r5900 calculates the following:
340282306073709652508363335590014353408.000000 (0x7f7ffffd) + 40564819207303340847894502572032.000000 (0x74000000) = 340282346638528859811704183484516925440.000000 (0x7f7fffff)
340282326356119256160033759537265639424.000000 (0x7f7ffffe) + 40564819207303340847894502572032.000000 (0x74000000) = inf (0x7f800000)
340282346638528859811704183484516925440.000000 (0x7f7fffff) + 40564819207303340847894502572032.000000 (0x74000000) = inf (0x7f800000)
inf (0x7f800000) + 40564819207303340847894502572032.000000 (0x74000000) = nan (0x7f800001)
nan (0x7f800001) + 40564819207303340847894502572032.000000 (0x74000000) = nan (0x7f800002)
nan (0x7f900000) + 40564819207303340847894502572032.000000 (0x74000000) = nan (0x7f900001)
nan (0x7f900001) + 40564819207303340847894502572032.000000 (0x74000000) = nan (0x7f900002)
nan (0x7ffffff1) + 40564819207303340847894502572032.000000 (0x74000000) = nan (0x7ffffff2)
nan (0x7ffffffc) + 40564819207303340847894502572032.000000 (0x74000000) = nan (0x7ffffffd)
nan (0x7ffffffd) + 40564819207303340847894502572032.000000 (0x74000000) = nan (0x7ffffffe)
nan (0x7ffffffe) + 40564819207303340847894502572032.000000 (0x74000000) = nan (0x7fffffff)
nan (0x7fffffff) + 40564819207303340847894502572032.000000 (0x74000000) = nan (0x7fffffff)

So it seems that it treats the NaN and infinity encodings as ordinary numbers with exponent field 255, but saturates at 0x7fffffff. So 0x7fffffff should be interpreted as overflow.
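
The observed sums are consistent with a simple model, sketched below in C.  It is inferred only from the outputs listed above, not taken from the r5900 manual.  The model treats exponent field 255 as an ordinary exponent, reads denormals as zero (an assumption), truncates the result toward zero (an assumption) and clamps overflow to 0x7fffffff.  All r5900_model_* names are hypothetical, and the double intermediate may round differently from real hardware when a sum is not exactly representable.

```c
#include <math.h>
#include <stdint.h>

#define R5900_FMAX 0x7fffffffu

/* Decode a bit pattern, treating exponent field 255 as a normal exponent.  */
static double r5900_model_decode(uint32_t bits)
{
    uint32_t exp = (bits >> 23) & 0xffu;
    uint32_t mant = bits & 0x7fffffu;
    double v;

    if (exp == 0)
        return 0.0;                      /* assumption: denormals read as zero */
    v = ldexp((double)(mant | 0x800000u), (int)exp - 150);
    return (bits >> 31) ? -v : v;
}

/* Encode with truncation toward zero and saturation at Fmax.  */
static uint32_t r5900_model_encode(double v)
{
    uint32_t sign = v < 0 ? 0x80000000u : 0u;
    double mag = v < 0 ? -v : v;
    double frac;
    int e, exp;

    if (mag == 0.0)
        return sign;
    frac = frexp(mag, &e);               /* mag = frac * 2^e, 0.5 <= frac < 1 */
    exp = e + 126;                       /* biased exponent of the result */
    if (exp > 255)
        return sign | R5900_FMAX;        /* overflow saturates */
    if (exp < 1)
        return sign;                     /* assumption: underflow gives zero */
    /* The cast truncates, so the 24-bit significand never rounds up.  */
    return sign | ((uint32_t)exp << 23) | ((uint32_t)ldexp(frac, 24) & 0x7fffffu);
}

uint32_t r5900_model_add(uint32_t a, uint32_t b)
{
    return r5900_model_encode(r5900_model_decode(a) + r5900_model_decode(b));
}
```

Under this model, r5900_model_add(0x7f7ffffe, 0x74000000) gives 0x7f800000 and r5900_model_add(0x7fffffff, 0x74000000) stays at 0x7fffffff, matching the hardware results listed above.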

Best regards
Jürgen

Patch

diff -Nurp '--exclude=build01' ../gcc-svn-20130105.orig/config.sub gcc-svn-20130105-mips64r5900el-linux-patched/config.sub
--- ../gcc-svn-20130105.orig/config.sub	2013-01-05 20:06:32.859960482 +0100
+++ gcc-svn-20130105-mips64r5900el-linux-patched/config.sub	2013-01-06 19:11:56.332755480 +0100
@@ -284,6 +284,8 @@  case $basic_machine in
 	| mips64vr4300 | mips64vr4300el \
 	| mips64vr5000 | mips64vr5000el \
 	| mips64vr5900 | mips64vr5900el \
+	| mips64r5900 | mips64r5900el \
+	| mipsr5900 | mipsr5900el \
 	| mipsisa32 | mipsisa32el \
 	| mipsisa32r2 | mipsisa32r2el \
 	| mipsisa64 | mipsisa64el \
@@ -401,6 +403,8 @@  case $basic_machine in
 	| mips64vr4300-* | mips64vr4300el-* \
 	| mips64vr5000-* | mips64vr5000el-* \
 	| mips64vr5900-* | mips64vr5900el-* \
+	| mips64r5900-* | mips64r5900el-* \
+	| mipsr5900-* | mipsr5900el-* \
 	| mipsisa32-* | mipsisa32el-* \
 	| mipsisa32r2-* | mipsisa32r2el-* \
 	| mipsisa64-* | mipsisa64el-* \
diff -Nurp '--exclude=build01' ../gcc-svn-20130105.orig/gcc/config/mips/mips.c gcc-svn-20130105-mips64r5900el-linux-patched/gcc/config/mips/mips.c
--- ../gcc-svn-20130105.orig/gcc/config/mips/mips.c	2013-01-05 20:03:24.231962472 +0100
+++ gcc-svn-20130105-mips64r5900el-linux-patched/gcc/config/mips/mips.c	2013-01-06 19:11:56.336755480 +0100
@@ -1025,6 +1025,19 @@  static const struct mips_rtx_cost_data
 		     1,           /* branch_cost */
 		     4            /* memory_latency */
   },
+  { /* R5900 */
+    COSTS_N_INSNS (4),            /* fp_add */
+    COSTS_N_INSNS (4),            /* fp_mult_sf */
+    COSTS_N_INSNS (256),          /* fp_mult_df */
+    COSTS_N_INSNS (8),            /* fp_div_sf */
+    COSTS_N_INSNS (256),          /* fp_div_df */
+    COSTS_N_INSNS (4),            /* int_mult_si */
+    COSTS_N_INSNS (256),          /* int_mult_di */
+    COSTS_N_INSNS (37),           /* int_div_si */
+    COSTS_N_INSNS (256),          /* int_div_di */
+		     1,           /* branch_cost */
+		     4            /* memory_latency */
+  },
   { /* R7000 */
     /* The only costs that are changed here are
        integer multiplication.  */
@@ -12793,6 +12806,7 @@  mips_issue_rate (void)
     case PROCESSOR_R4130:
     case PROCESSOR_R5400:
     case PROCESSOR_R5500:
+    case PROCESSOR_R5900:
     case PROCESSOR_R7000:
     case PROCESSOR_R9000:
     case PROCESSOR_OCTEON:
@@ -15573,6 +15587,7 @@  vr4130_align_insns (void)
     }
   dfa_finish ();
 }
+
 
 /* This structure records that the current function has a LO_SUM
    involving SYMBOL_REF or LABEL_REF BASE and that MAX_OFFSET is
@@ -15801,6 +15816,11 @@  mips_reorg_process_insns (void)
   if (TARGET_FIX_VR4120 || TARGET_FIX_24K)
     cfun->machine->all_noreorder_p = false;
 
+  /* Code compiled for R5900 can't be all noreorder because
+     we rely on the assembler to work around some errata.  */
+  if (TARGET_MIPS5900)
+    cfun->machine->all_noreorder_p = false;
+
   /* The same is true for -mfix-vr4130 if we might generate MFLO or
      MFHI instructions.  Note that we avoid using MFLO and MFHI if
      the VR4130 MACC and DMACC instructions are available instead;
diff -Nurp '--exclude=build01' ../gcc-svn-20130105.orig/gcc/config/mips/mips-cpus.def gcc-svn-20130105-mips64r5900el-linux-patched/gcc/config/mips/mips-cpus.def
--- ../gcc-svn-20130105.orig/gcc/config/mips/mips-cpus.def	2013-01-05 20:03:24.227962471 +0100
+++ gcc-svn-20130105-mips64r5900el-linux-patched/gcc/config/mips/mips-cpus.def	2013-01-06 19:11:56.336755480 +0100
@@ -71,6 +71,7 @@  MIPS_CPU ("r4600", PROCESSOR_R4600, 3, 0
 MIPS_CPU ("orion", PROCESSOR_R4600, 3, 0)
 MIPS_CPU ("r4650", PROCESSOR_R4650, 3, 0)
 MIPS_CPU ("r4700", PROCESSOR_R4700, 3, 0)
+MIPS_CPU ("r5900", PROCESSOR_R5900, 3, 0)
 /* ST Loongson 2E/2F processors.  */
 MIPS_CPU ("loongson2e", PROCESSOR_LOONGSON_2E, 3, PTF_AVOID_BRANCHLIKELY)
 MIPS_CPU ("loongson2f", PROCESSOR_LOONGSON_2F, 3, PTF_AVOID_BRANCHLIKELY)
diff -Nurp '--exclude=build01' ../gcc-svn-20130105.orig/gcc/config/mips/mips.h gcc-svn-20130105-mips64r5900el-linux-patched/gcc/config/mips/mips.h
--- ../gcc-svn-20130105.orig/gcc/config/mips/mips.h	2013-01-05 20:03:24.231962472 +0100
+++ gcc-svn-20130105-mips64r5900el-linux-patched/gcc/config/mips/mips.h	2013-01-06 19:11:56.336755480 +0100
@@ -215,6 +215,7 @@  struct mips_cpu_info {
 #define TARGET_MIPS4130             (mips_arch == PROCESSOR_R4130)
 #define TARGET_MIPS5400             (mips_arch == PROCESSOR_R5400)
 #define TARGET_MIPS5500             (mips_arch == PROCESSOR_R5500)
+#define TARGET_MIPS5900             (mips_arch == PROCESSOR_R5900)
 #define TARGET_MIPS7000             (mips_arch == PROCESSOR_R7000)
 #define TARGET_MIPS9000             (mips_arch == PROCESSOR_R9000)
 #define TARGET_OCTEON		    (mips_arch == PROCESSOR_OCTEON	\
@@ -245,6 +246,7 @@  struct mips_cpu_info {
 #define TUNE_MIPS5000               (mips_tune == PROCESSOR_R5000)
 #define TUNE_MIPS5400               (mips_tune == PROCESSOR_R5400)
 #define TUNE_MIPS5500               (mips_tune == PROCESSOR_R5500)
+#define TUNE_MIPS5900               (mips_tune == PROCESSOR_R5900)
 #define TUNE_MIPS6000               (mips_tune == PROCESSOR_R6000)
 #define TUNE_MIPS7000               (mips_tune == PROCESSOR_R7000)
 #define TUNE_MIPS9000               (mips_tune == PROCESSOR_R9000)
@@ -815,6 +817,7 @@  struct mips_cpu_info {
 #define ISA_HAS_MUL3		((TARGET_MIPS3900                       \
 				  || TARGET_MIPS5400			\
 				  || TARGET_MIPS5500			\
+				  || TARGET_MIPS5900			\
 				  || TARGET_MIPS7000			\
 				  || TARGET_MIPS9000			\
 				  || TARGET_MAD				\
@@ -829,6 +832,21 @@  struct mips_cpu_info {
 				 && TARGET_OCTEON			\
 				 && !TARGET_MIPS16)
 
+/* Target supports instructions dmult and dmultu (integer). */
+#define TARGET_HAS_DMULT	(TARGET_64BIT				\
+				 && !TARGET_MIPS5900)
+
+/* Target supports instructions mult and multu in 32 bit mode (integer). */
+#define TARGET_HAS_MULT		(mips_isa >= 1)
+
+/* Target supports instructions ddiv and ddivu (integer). */
+#define TARGET_HAS_DDIV		(TARGET_64BIT				\
+				 && !TARGET_MIPS5900)
+
+/* Target supports instructions div and divu in 32 bit mode (integer). */
+#define TARGET_HAS_DIV		(mips_isa >= 1)
+
+
 /* ISA has the floating-point conditional move instructions introduced
    in mips4.  */
 #define ISA_HAS_FP_CONDMOVE	((ISA_MIPS4				\
@@ -841,10 +859,10 @@  struct mips_cpu_info {
 
 /* ISA has the integer conditional move instructions introduced in mips4 and
    ST Loongson 2E/2F.  */
-#define ISA_HAS_CONDMOVE        (ISA_HAS_FP_CONDMOVE || TARGET_LOONGSON_2EF)
+#define ISA_HAS_CONDMOVE        (ISA_HAS_FP_CONDMOVE || TARGET_LOONGSON_2EF || TARGET_MIPS5900)
 
 /* ISA has LDC1 and SDC1.  */
-#define ISA_HAS_LDC1_SDC1	(!ISA_MIPS1 && !TARGET_MIPS16)
+#define ISA_HAS_LDC1_SDC1	(!ISA_MIPS1 && !TARGET_MIPS16 && !TARGET_MIPS5900)
 
 /* ISA has the mips4 FP condition code instructions: FP-compare to CC,
    branch on CC, and move (both FP and non-FP) on CC.  */
@@ -884,7 +902,7 @@  struct mips_cpu_info {
 #define ISA_HAS_FP_MADD4_MSUB4  ISA_HAS_FP4
 
 /* ISA has floating-point madd and msub instructions 'c = a * b [+-] c'.  */
-#define ISA_HAS_FP_MADD3_MSUB3  TARGET_LOONGSON_2EF
+#define ISA_HAS_FP_MADD3_MSUB3  (TARGET_LOONGSON_2EF || TARGET_MIPS5900)
 
 /* ISA has floating-point nmadd and nmsub instructions
    'd = -((a * b) [+-] c)'.  */
@@ -955,6 +973,7 @@  struct mips_cpu_info {
 /* ISA has data prefetch instructions.  This controls use of 'pref'.  */
 #define ISA_HAS_PREFETCH	((ISA_MIPS4				\
 				  || TARGET_LOONGSON_2EF		\
+				  || TARGET_MIPS5900			\
 				  || ISA_MIPS32				\
 				  || ISA_MIPS32R2			\
 				  || ISA_MIPS64				\
@@ -974,7 +993,11 @@  struct mips_cpu_info {
 /* True if trunc.w.s and trunc.w.d are real (not synthetic)
    instructions.  Both require TARGET_HARD_FLOAT, and trunc.w.d
    also requires TARGET_DOUBLE_FLOAT.  */
-#define ISA_HAS_TRUNC_W		(!ISA_MIPS1)
+#define ISA_HAS_TRUNC_W_D	(!ISA_MIPS1)
+
+/* True if trunc.w.s is a real (not synthetic) instruction.
+   Requires TARGET_HARD_FLOAT.  */
+#define ISA_HAS_TRUNC_W_S	(ISA_HAS_TRUNC_W_D || TARGET_MIPS5900)
 
 /* ISA includes the MIPS32r2 seb and seh instructions.  */
 #define ISA_HAS_SEB_SEH		((ISA_MIPS32R2		\
@@ -1015,15 +1038,18 @@  struct mips_cpu_info {
    and "addiu $4,$4,1".  */
 #define ISA_HAS_LOAD_DELAY	(ISA_MIPS1				\
 				 && !TARGET_MIPS3900			\
-				 && !TARGET_MIPS16)
+				 && !TARGET_MIPS16			\
+				 && !TARGET_MIPS5900)
 
 /* Likewise mtc1 and mfc1.  */
 #define ISA_HAS_XFER_DELAY	(mips_isa <= 3			\
-				 && !TARGET_LOONGSON_2EF)
+				 && !TARGET_LOONGSON_2EF	\
+				 && !TARGET_MIPS5900)
 
 /* Likewise floating-point comparisons.  */
 #define ISA_HAS_FCMP_DELAY	(mips_isa <= 3			\
-				 && !TARGET_LOONGSON_2EF)
+				 && !TARGET_LOONGSON_2EF	\
+				 && !TARGET_MIPS5900)
 
 /* True if mflo and mfhi can be immediately followed by instructions
    which write to the HI and LO registers.
@@ -1042,6 +1068,7 @@  struct mips_cpu_info {
 				 || ISA_MIPS64				\
 				 || ISA_MIPS64R2			\
 				 || TARGET_MIPS5500			\
+				 || TARGET_MIPS5900			\
 				 || TARGET_LOONGSON_2EF)
 
 /* ISA includes synci, jr.hb and jalr.hb.  */
diff -Nurp '--exclude=build01' ../gcc-svn-20130105.orig/gcc/config/mips/mips.md gcc-svn-20130105-mips64r5900el-linux-patched/gcc/config/mips/mips.md
--- ../gcc-svn-20130105.orig/gcc/config/mips/mips.md	2013-01-05 20:03:24.227962471 +0100
+++ gcc-svn-20130105-mips64r5900el-linux-patched/gcc/config/mips/mips.md	2013-01-06 20:08:35.056750990 +0100
@@ -58,6 +58,7 @@ 
   r5000
   r5400
   r5500
+  r5900
   r7000
   r8000
   r9000
@@ -726,7 +727,7 @@ 
 ;; This mode iterator allows :MOVECC to be used anywhere that a
 ;; conditional-move-type condition is needed.
 (define_mode_iterator MOVECC [SI (DI "TARGET_64BIT")
-                              (CC "TARGET_HARD_FLOAT && !TARGET_LOONGSON_2EF")])
+                              (CC "TARGET_HARD_FLOAT && !TARGET_LOONGSON_2EF && !TARGET_MIPS5900")])
 
 ;; 32-bit integer moves for which we provide move patterns.
 (define_mode_iterator IMOVE32
@@ -1448,7 +1449,7 @@ 
   [(set (match_operand:GPR 0 "register_operand")
 	(mult:GPR (match_operand:GPR 1 "register_operand")
 		  (match_operand:GPR 2 "register_operand")))]
-  ""
+  "TARGET_HAS_<D>MULT"
 {
   rtx lo;
 
@@ -1490,11 +1491,11 @@ 
 	(mult:GPR (match_operand:GPR 1 "register_operand" "d,d")
 		  (match_operand:GPR 2 "register_operand" "d,d")))
    (clobber (match_scratch:GPR 3 "=l,X"))]
-  "ISA_HAS_<D>MUL3"
+  "ISA_HAS_<D>MUL3 && TARGET_HAS_<D>MULT"
 {
   if (which_alternative == 1)
     return "<d>mult\t%1,%2";
-  if (<MODE>mode == SImode && TARGET_MIPS3900)
+  if (<MODE>mode == SImode && (TARGET_MIPS3900 || TARGET_MIPS5900))
     return "mult\t%0,%1,%2";
   return "<d>mul\t%0,%1,%2";
 }
@@ -1528,7 +1529,7 @@ 
   [(set (match_operand:GPR 0 "muldiv_target_operand" "=l")
 	(mult:GPR (match_operand:GPR 1 "register_operand" "d")
 		  (match_operand:GPR 2 "register_operand" "d")))]
-  "!TARGET_FIX_R4000"
+  "!TARGET_FIX_R4000 && TARGET_HAS_<D>MULT"
   "<d>mult\t%1,%2"
   [(set_attr "type" "imul")
    (set_attr "mode" "<MODE>")])
@@ -1538,7 +1539,7 @@ 
 	(mult:GPR (match_operand:GPR 1 "register_operand" "d")
 		  (match_operand:GPR 2 "register_operand" "d")))
    (clobber (match_scratch:GPR 3 "=l"))]
-  "TARGET_FIX_R4000"
+  "TARGET_FIX_R4000 && TARGET_HAS_<D>MULT"
   "<d>mult\t%1,%2\;mflo\t%0"
   [(set_attr "type" "imul")
    (set_attr "mode" "<MODE>")
@@ -1872,7 +1873,7 @@ 
   [(set (match_operand:DI 0 "register_operand")
 	(mult:DI (any_extend:DI (match_operand:SI 1 "register_operand"))
 		 (any_extend:DI (match_operand:SI 2 "register_operand"))))]
-  "mips_mulsidi3_gen_fn (<CODE>) != NULL"
+  "mips_mulsidi3_gen_fn (<CODE>) != NULL && TARGET_HAS_DMULT"
 {
   mulsidi3_gen_fn fn = mips_mulsidi3_gen_fn (<CODE>);
   emit_insn (fn (operands[0], operands[1], operands[2]));
@@ -1900,7 +1901,7 @@ 
   [(set (match_operand:DI 0 "muldiv_target_operand" "=ka")
 	(mult:DI (any_extend:DI (match_operand:SI 1 "register_operand" "d"))
 		 (any_extend:DI (match_operand:SI 2 "register_operand" "d"))))]
-  "!TARGET_64BIT && (!TARGET_FIX_R4000 || ISA_HAS_DSP)"
+  "(!TARGET_64BIT || (TARGET_64BIT && !TARGET_HAS_DMULT)) && (!TARGET_FIX_R4000 || ISA_HAS_DSP)"
 {
   if (ISA_HAS_DSP_MULT)
     return "mult<u>\t%q0,%1,%2";
@@ -1927,7 +1928,7 @@ 
 		 (any_extend:DI (match_operand:SI 2 "register_operand" "d"))))
    (clobber (match_scratch:TI 3 "=x"))
    (clobber (match_scratch:DI 4 "=d"))]
-  "TARGET_64BIT && !TARGET_FIX_R4000 && !ISA_HAS_DMUL3 && !TARGET_MIPS16"
+  "TARGET_64BIT && !TARGET_FIX_R4000 && !ISA_HAS_DMUL3 && !TARGET_MIPS16 && TARGET_HAS_DMULT"
   "#"
   "&& reload_completed"
   [(const_int 0)]
@@ -2105,7 +2106,7 @@ 
 {
   rtx hilo;
 
-  if (TARGET_64BIT)
+  if (TARGET_64BIT && TARGET_HAS_DMULT)
     {
       hilo = gen_rtx_REG (TImode, MD_REG_FIRST);
       emit_insn (gen_<u>mulsidi3_64bit_hilo (hilo, operands[1], operands[2]));
@@ -2159,7 +2160,7 @@ 
 	  (mult:TI (any_extend:TI (match_operand:DI 1 "register_operand"))
 		   (any_extend:TI (match_operand:DI 2 "register_operand")))
 	  (const_int 64))))]
-  "TARGET_64BIT && !(<CODE> == ZERO_EXTEND && TARGET_FIX_VR4120)"
+  "TARGET_64BIT && TARGET_HAS_DMULT && !(<CODE> == ZERO_EXTEND && TARGET_FIX_VR4120)"
 {
   if (TARGET_MIPS16)
     emit_insn (gen_<su>muldi3_highpart_split (operands[0], operands[1],
@@ -2180,6 +2181,7 @@ 
    (clobber (match_scratch:DI 3 "=l"))]
   "TARGET_64BIT
    && !TARGET_MIPS16
+   && TARGET_HAS_DMULT
    && !(<CODE> == ZERO_EXTEND && TARGET_FIX_VR4120)"
   { return TARGET_FIX_R4000 ? "dmult<u>\t%1,%2\n\tmfhi\t%0" : "#"; }
   "&& reload_completed && !TARGET_FIX_R4000"
@@ -2200,7 +2202,7 @@ 
 	  (mult:TI (any_extend:TI (match_operand:DI 1 "register_operand"))
 		   (any_extend:TI (match_operand:DI 2 "register_operand")))
 	  (const_int 64))))]
-  ""
+  "TARGET_HAS_DMULT"
 {
   rtx hilo;
 
@@ -2214,7 +2216,7 @@ 
   [(set (match_operand:TI 0 "register_operand")
 	(mult:TI (any_extend:TI (match_operand:DI 1 "register_operand"))
 		 (any_extend:TI (match_operand:DI 2 "register_operand"))))]
-  "TARGET_64BIT && !(<CODE> == ZERO_EXTEND && TARGET_FIX_VR4120)"
+  "TARGET_64BIT && !(<CODE> == ZERO_EXTEND && TARGET_FIX_VR4120) && TARGET_HAS_DMULT"
 {
   rtx hilo;
 
@@ -2237,6 +2239,7 @@ 
 	(mult:TI (any_extend:TI (match_operand:DI 1 "register_operand" "d"))
 		 (any_extend:TI (match_operand:DI 2 "register_operand" "d"))))]
   "TARGET_64BIT
+   && TARGET_HAS_DMULT
    && !TARGET_FIX_R4000
    && !(<CODE> == ZERO_EXTEND && TARGET_FIX_VR4120)"
   "dmult<u>\t%1,%2"
@@ -2250,6 +2253,7 @@ 
    (clobber (match_scratch:TI 3 "=x"))]
   "TARGET_64BIT
    && TARGET_FIX_R4000
+   && TARGET_HAS_DMULT
    && !(<CODE> == ZERO_EXTEND && TARGET_FIX_VR4120)"
   "dmult<u>\t%1,%2\;mflo\t%L0\;mfhi\t%M0"
   [(set_attr "type" "imul")
@@ -2537,7 +2541,7 @@ 
    (set (match_operand:GPR 3 "register_operand")
 	(mod:GPR (match_dup 1)
 		 (match_dup 2)))]
-  "!TARGET_FIX_VR4120"
+  "!TARGET_FIX_VR4120 && TARGET_HAS_<D>DIV"
 {
   if (TARGET_MIPS16)
     {
@@ -2558,7 +2562,7 @@ 
    (set (match_operand:GPR 3 "register_operand" "=d")
 	(mod:GPR (match_dup 1)
 		 (match_dup 2)))]
-  "!TARGET_FIX_VR4120 && !TARGET_MIPS16"
+  "!TARGET_FIX_VR4120 && !TARGET_MIPS16 && TARGET_HAS_<D>DIV"
   "#"
   "&& reload_completed"
   [(const_int 0)]
@@ -2577,7 +2581,7 @@ 
    (set (match_operand:GPR 3 "register_operand")
 	(umod:GPR (match_dup 1)
 		  (match_dup 2)))]
-  ""
+  "TARGET_HAS_<D>DIV"
 {
   if (TARGET_MIPS16)
     {
@@ -2598,7 +2602,7 @@ 
    (set (match_operand:GPR 3 "register_operand" "=d")
 	(umod:GPR (match_dup 1)
 		  (match_dup 2)))]
-  "!TARGET_MIPS16"
+  "!TARGET_MIPS16 && TARGET_HAS_<D>DIV"
   "#"
   "reload_completed"
   [(const_int 0)]
@@ -2614,7 +2618,7 @@ 
   [(set (match_operand:GPR 0 "register_operand")
 	(any_mod:GPR (match_operand:GPR 1 "register_operand")
 		     (match_operand:GPR 2 "register_operand")))]
-  ""
+  "TARGET_HAS_<D>DIV"
 {
   rtx hilo;
 
@@ -2641,7 +2645,7 @@ 
 	  [(any_div:GPR (match_operand:GPR 1 "register_operand" "d")
 			(match_operand:GPR 2 "register_operand" "d"))]
 	  UNSPEC_SET_HILO))]
-  ""
+  "TARGET_HAS_<D>DIV"
   { return mips_output_division ("<GPR:d>div<u>\t%.,%1,%2", operands); }
   [(set_attr "type" "idiv")
    (set_attr "mode" "<GPR:MODE>")])
@@ -3464,7 +3468,7 @@ 
 	(fix:SI (match_operand:DF 1 "register_operand")))]
   "TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT"
 {
-  if (!ISA_HAS_TRUNC_W)
+  if (!ISA_HAS_TRUNC_W_D)
     {
       emit_insn (gen_fix_truncdfsi2_macro (operands[0], operands[1]));
       DONE;
@@ -3474,7 +3478,7 @@ 
 (define_insn "fix_truncdfsi2_insn"
   [(set (match_operand:SI 0 "register_operand" "=f")
 	(fix:SI (match_operand:DF 1 "register_operand" "f")))]
-  "TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT && ISA_HAS_TRUNC_W"
+  "TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT && ISA_HAS_TRUNC_W_D"
   "trunc.w.d %0,%1"
   [(set_attr "type"	"fcvt")
    (set_attr "mode"	"DF")
@@ -3484,7 +3488,7 @@ 
   [(set (match_operand:SI 0 "register_operand" "=f")
 	(fix:SI (match_operand:DF 1 "register_operand" "f")))
    (clobber (match_scratch:DF 2 "=d"))]
-  "TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT && !ISA_HAS_TRUNC_W"
+  "TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT && !ISA_HAS_TRUNC_W_D"
 {
   if (mips_nomacro.nesting_level > 0)
     return ".set\tmacro\;trunc.w.d %0,%1,%2\;.set\tnomacro";
@@ -3501,7 +3505,7 @@ 
 	(fix:SI (match_operand:SF 1 "register_operand")))]
   "TARGET_HARD_FLOAT"
 {
-  if (!ISA_HAS_TRUNC_W)
+  if (!ISA_HAS_TRUNC_W_S)
     {
       emit_insn (gen_fix_truncsfsi2_macro (operands[0], operands[1]));
       DONE;
@@ -3511,7 +3515,7 @@ 
 (define_insn "fix_truncsfsi2_insn"
   [(set (match_operand:SI 0 "register_operand" "=f")
 	(fix:SI (match_operand:SF 1 "register_operand" "f")))]
-  "TARGET_HARD_FLOAT && ISA_HAS_TRUNC_W"
+  "TARGET_HARD_FLOAT && ISA_HAS_TRUNC_W_S"
   "trunc.w.s %0,%1"
   [(set_attr "type"	"fcvt")
    (set_attr "mode"	"SF")
@@ -3521,7 +3525,7 @@ 
   [(set (match_operand:SI 0 "register_operand" "=f")
 	(fix:SI (match_operand:SF 1 "register_operand" "f")))
    (clobber (match_scratch:SF 2 "=d"))]
-  "TARGET_HARD_FLOAT && !ISA_HAS_TRUNC_W"
+  "TARGET_HARD_FLOAT && !ISA_HAS_TRUNC_W_S"
 {
   if (mips_nomacro.nesting_level > 0)
     return ".set\tmacro\;trunc.w.s %0,%1,%2\;.set\tnomacro";
diff -Nurp '--exclude=build01' ../gcc-svn-20130105.orig/gcc/config.gcc gcc-svn-20130105-mips64r5900el-linux-patched/gcc/config.gcc
--- ../gcc-svn-20130105.orig/gcc/config.gcc	2013-01-05 20:03:33.659962367 +0100
+++ gcc-svn-20130105-mips64r5900el-linux-patched/gcc/config.gcc	2013-01-06 19:11:56.340755480 +0100
@@ -1881,11 +1881,17 @@  mipsisa64sb1-*-elf* | mipsisa64sb1el-*-e
 	target_cpu_default="MASK_64BIT|MASK_FLOAT64"
 	tm_defines="${tm_defines} MIPS_ISA_DEFAULT=64 MIPS_CPU_STRING_DEFAULT=\\\"sb1\\\" MIPS_ABI_DEFAULT=ABI_O64"
 	;;
-mips-*-elf* | mipsel-*-elf*)
+mips-*-elf* | mipsel-*-elf* | mipsr5900-*-elf* | mipsr5900el-*-elf*)
 	tm_file="elfos.h newlib-stdint.h ${tm_file} mips/elf.h"
 	tmake_file="mips/t-elf"
 	;;
-mips64-*-elf* | mips64el-*-elf*)
+mips64r5900-*-elf* | mips64r5900el-*-elf*)
+	tm_file="elfos.h newlib-stdint.h ${tm_file} mips/elf.h"
+	tmake_file="mips/t-elf"
+	target_cpu_default="MASK_64BIT"
+	tm_defines="${tm_defines} MIPS_ISA_DEFAULT=3 MIPS_ABI_DEFAULT=ABI_N32"
+	;;
+mips64-*-elf* | mips64el-*-elf* | mips64r5900-*-elf* | mips64r5900el-*-elf*)
 	tm_file="elfos.h newlib-stdint.h ${tm_file} mips/elf.h"
 	tmake_file="mips/t-elf"
 	target_cpu_default="MASK_64BIT|MASK_FLOAT64"
@@ -2910,6 +2916,26 @@  if test x$with_cpu = x ; then
 	  ;;
       esac
       ;;
+    mips64r5900-*-*|mips64r5900el-*-*)
+      with_arch=r5900
+      with_tune=r5900
+      if test x$with_llsc = x; then
+	# R5900 doesn't support ll, sc, lld and scd instructions:
+	with_llsc=no
+      fi
+      if test x$with_float = x; then
+	# R5900 doesn't support 64 bit float:
+	with_float=soft
+      fi
+      ;;
+    mipsr5900-*-*|mipsr5900el-*-*)
+      with_arch=r5900
+      with_tune=r5900
+      if test x$with_llsc = x; then
+	# R5900 doesn't support ll, sc, lld and scd instructions:
+	with_llsc=no
+      fi
+      ;;
     mips*-*-vxworks)
       with_arch=mips2
       ;;
@@ -3374,7 +3400,7 @@  case "${target}" in
 		supported_defaults="abi arch arch_32 arch_64 float tune tune_32 tune_64 divide llsc mips-plt synci"
 
 		case ${with_float} in
-		"" | soft | hard)
+		"" | soft | hard | single | double)
 			# OK
 			;;
 		*)
diff -Nurp '--exclude=build01' ../gcc-svn-20130105.orig/libgcc/config.host gcc-svn-20130105-mips64r5900el-linux-patched/libgcc/config.host
--- ../gcc-svn-20130105.orig/libgcc/config.host	2013-01-05 19:28:43.695984006 +0100
+++ gcc-svn-20130105-mips64r5900el-linux-patched/libgcc/config.host	2013-01-06 19:11:56.340755480 +0100
@@ -761,10 +761,18 @@  mips-*-elf* | mipsel-*-elf*)
 	tmake_file="$tmake_file mips/t-elf mips/t-crtstuff mips/t-mips16"
 	extra_parts="$extra_parts crti.o crtn.o"
 	;;
+mipsr5900-*-elf* | mipsr5900el-*-elf*)
+	tmake_file="$tmake_file mips/t-elf mips/t-crtstuff"
+	extra_parts="$extra_parts crti.o crtn.o"
+	;;
 mips64-*-elf* | mips64el-*-elf*)
 	tmake_file="$tmake_file mips/t-elf mips/t-crtstuff mips/t-mips16"
 	extra_parts="$extra_parts crti.o crtn.o"
 	;;
+mips64r5900-*-elf* | mips64r5900el-*-elf*)
+	tmake_file="$tmake_file mips/t-elf mips/t-crtstuff"
+	extra_parts="$extra_parts crti.o crtn.o"
+	;;
 mips64vr-*-elf* | mips64vrel-*-elf*)
 	tmake_file="$tmake_file mips/t-elf mips/t-vr mips/t-crtstuff"
 	extra_parts="$extra_parts crti.o crtn.o"
diff -Nurp '--exclude=build01' ../gcc-svn-20130105.orig/libgcc/Makefile.in gcc-svn-20130105-mips64r5900el-linux-patched/libgcc/Makefile.in
--- ../gcc-svn-20130105.orig/libgcc/Makefile.in	2013-01-05 19:28:43.695984006 +0100
+++ gcc-svn-20130105-mips64r5900el-linux-patched/libgcc/Makefile.in	2013-01-06 19:51:13.488752493 +0100
@@ -293,6 +293,9 @@  MULTIOSSUBDIR := $(shell if test $(MULTI
 inst_libdir = $(libsubdir)$(MULTISUBDIR)
 inst_slibdir = $(slibdir)$(MULTIOSSUBDIR)
 
+# Get mips type: __mips or __mips64 is defined as GCC macro:
+MIPSTYPE := $(shell $(CC) $(CFLAGS) -dM -E - < /dev/null | grep -e "\<__mips\>" -e "\<__mips64\>" | (read define type value; echo $$type))
+
 gcc_compile_bare = $(CC) $(INTERNAL_CFLAGS)
 compile_deps = -MT $@ -MD -MP -MF $(basename $@).dep
 gcc_compile = $(gcc_compile_bare) -o $@ $(compile_deps)
@@ -401,7 +404,8 @@  LIB2ADDEHSTATIC += $(srcdir)/emutls.c
 LIB2ADDEHSHARED += $(srcdir)/emutls.c
 
 # Library members defined in libgcc2.c.
-lib2funcs = _muldi3 _negdi2 _lshrdi3 _ashldi3 _ashrdi3 _cmpdi2 _ucmpdi2	   \
+lib2difuncs = _muldi3
+lib2funcs = $(lib2difuncs) _negdi2 _lshrdi3 _ashldi3 _ashrdi3 _cmpdi2 _ucmpdi2 \
 	    _clear_cache _trampoline __main _absvsi2 \
 	    _absvdi2 _addvsi3 _addvdi3 _subvsi3 _subvdi3 _mulvsi3 _mulvdi3 \
 	    _negvsi2 _negvdi2 _ctors _ffssi2 _ffsdi2 _clz _clzsi2 _clzdi2  \
@@ -427,7 +431,8 @@  endif
 
 # These might cause a divide overflow trap and so are compiled with
 # unwinder info.
-LIB2_DIVMOD_FUNCS = _divdi3 _moddi3 _udivdi3 _umoddi3 _udiv_w_sdiv _udivmoddi4
+LIB2_DIVMODDI_FUNCS = _divdi3 _moddi3 _udivdi3 _umoddi3 _udivmoddi4
+LIB2_DIVMOD_FUNCS = $(LIB2_DIVMODDI_FUNCS) _udiv_w_sdiv
 
 # Remove any objects from lib2funcs and LIB2_DIVMOD_FUNCS that are
 # defined as optimized assembly code in LIB1ASMFUNCS or as C code
@@ -459,12 +464,26 @@  lib2funcs-o = $(patsubst %,%$(objext),$(
 $(lib2funcs-o): %$(objext): $(srcdir)/libgcc2.c
 	$(gcc_compile) -DL$* -c $< $(vis_hide)
 libgcc-objects += $(lib2funcs-o)
+ifeq ($(MIPSTYPE),__mips64)
+# Build functions needed by MIPS r5900.
+lib2difuncs-o = $(patsubst %,%$(objext),$(addsuffix _32bit,$(lib2difuncs)))
+$(lib2difuncs-o): %$(objext): $(srcdir)/libgcc2.c
+	$(gcc_compile) -DL$(subst _32bit,,$*) -DLIBGCC2_UNITS_PER_WORD=4 -c $< $(vis_hide)
+libgcc-objects += $(lib2difuncs-o)
+endif
 
 ifeq ($(enable_shared),yes)
 lib2funcs-s-o = $(patsubst %,%_s$(objext),$(lib2funcs))
 $(lib2funcs-s-o): %_s$(objext): $(srcdir)/libgcc2.c
 	$(gcc_s_compile) -DL$* -c $<
 libgcc-s-objects += $(lib2funcs-s-o)
+ifeq ($(MIPSTYPE),__mips64)
+# Build functions needed by MIPS r5900.
+lib2difuncs-s-o = $(patsubst %,%_s$(objext),$(addsuffix _32bit,$(lib2difuncs)))
+$(lib2difuncs-s-o): %_s$(objext): $(srcdir)/libgcc2.c
+	$(gcc_s_compile) -DL$(subst _32bit,,$*) -DLIBGCC2_UNITS_PER_WORD=4 -c $<
+libgcc-s-objects += $(lib2difuncs-s-o)
+endif
 endif
 
 ifneq ($(LIB2_SIDITI_CONV_FUNCS),)
@@ -501,6 +520,14 @@  $(lib2-divmod-o): %$(objext): $(srcdir)/
 	$(gcc_compile) -DL$* -c $< \
 	  $(LIB2_DIVMOD_EXCEPTION_FLAGS) $(vis_hide)
 libgcc-objects += $(lib2-divmod-o)
+ifeq ($(MIPSTYPE),__mips64)
+# Build functions needed by MIPS r5900.
+lib2-divmoddi-o = $(patsubst %,%$(objext),$(addsuffix _32bit,$(LIB2_DIVMODDI_FUNCS)))
+$(lib2-divmoddi-o): %$(objext): $(srcdir)/libgcc2.c
+	$(gcc_compile) -DL$(subst _32bit,,$*) -DLIBGCC2_UNITS_PER_WORD=4 -c $< \
+	  $(LIB2_DIVMOD_EXCEPTION_FLAGS) $(vis_hide)
+libgcc-objects += $(lib2-divmoddi-o)
+endif
 
 ifeq ($(enable_shared),yes)
 lib2-divmod-s-o = $(patsubst %,%_s$(objext),$(LIB2_DIVMOD_FUNCS))
@@ -508,6 +535,14 @@  $(lib2-divmod-s-o): %_s$(objext): $(srcd
 	$(gcc_s_compile) -DL$* -c $< \
 	  $(LIB2_DIVMOD_EXCEPTION_FLAGS)
 libgcc-s-objects += $(lib2-divmod-s-o)
+ifeq ($(MIPSTYPE),__mips64)
+# Build functions needed by MIPS r5900.
+lib2-divmoddi-s-o = $(patsubst %,%_s$(objext),$(addsuffix _32bit,$(LIB2_DIVMODDI_FUNCS)))
+$(lib2-divmoddi-s-o): %_s$(objext): $(srcdir)/libgcc2.c
+	$(gcc_s_compile) -DL$(subst _32bit,,$*) -DLIBGCC2_UNITS_PER_WORD=4 -c $< \
+	  $(LIB2_DIVMOD_EXCEPTION_FLAGS)
+libgcc-s-objects += $(lib2-divmoddi-s-o)
+endif
 endif
 
 ifeq ($(TPBIT),)
diff -Nurp '--exclude=build01' ../gcc-svn-20130105.orig/libstdc++-v3/configure.host gcc-svn-20130105-mips64r5900el-linux-patched/libstdc++-v3/configure.host
--- ../gcc-svn-20130105.orig/libstdc++-v3/configure.host	2013-01-05 19:09:50.603996241 +0100
+++ gcc-svn-20130105-mips64r5900el-linux-patched/libstdc++-v3/configure.host	2013-01-06 19:11:56.340755480 +0100
@@ -322,6 +322,11 @@  esac
 # Set any OS-dependent and CPU-dependent bits.
 # THIS TABLE IS SORTED.  KEEP IT THAT WAY.
 case "${host}" in
+	mips*)
+        atomicity_dir="cpu/generic"
+	;;
+esac
+case "${host}" in
   *-*-linux*)
     case "${host_cpu}" in
       i[567]86)