Patchwork RFC: [MIPS] Add an option to disable ldc1/sdc1

login
register
mail settings
Submitter Chao-Ying Fu
Date Feb. 13, 2013, 10:14 p.m.
Message ID <81D57523CB07B24881D63DE650C6ED82EAA63B@bamail02.ba.imgtec.org>
Download mbox | patch
Permalink /patch/220283/
State New
Headers show

Comments

Chao-Ying Fu - Feb. 13, 2013, 10:14 p.m.
Hello All,

  Once in a while we got reports about programs (ex: WebKit, FireFox)
crash due to ldc1/sdc1 unaligned accesses on MIPS targets.  The root cause is that programmers
neglect the alignment issue and cast arbitrary pointers to point to double variables.

  Although the correct solution is to fix application source code to fulfill alignment requirements,
we want to add a GCC option to disable ldc1 and sdc1 (for the testing purpose or for workaround).
On 32-bit MIPS targets, GCC generates lwc1 and swc1 when -mno-ldc1-sdc1 is used,
so that the memory address can be just 4-byte aligned to avoid ldc1/sdc1 address exceptions.

Ex:  (The proposed patch)

  Any feedback?  Thanks a lot!

Regards,
Chao-ying
Andrew Pinski - Feb. 13, 2013, 10:35 p.m.
On Wed, Feb 13, 2013 at 2:14 PM, Chao-Ying Fu <Chao-Ying.Fu@imgtec.com> wrote:
> Hello All,
>
>   Once in a while we got reports about programs (ex: WebKit, FireFox)
> crash due to ldc1/sdc1 unaligned accesses on MIPS targets.  The root cause is that programmers
> neglect the alignment issue and cast arbitrary pointers to point to double variables.


And the code is undefined C/C++ anyways and should really be fixed
rather this workaround.  If the upstream vendor is not willing to fix
undefined code then it is their fault the code does not work rather
than the compiler's.

Thanks,
Andrew Pinski

>
>   Although the correct solution is to fix application source code to fulfill alignment requirements,
> we want to add a GCC option to disable ldc1 and sdc1 (for the testing purpose or for workaround).
> On 32-bit MIPS targets, GCC generates lwc1 and swc1 when -mno-ldc1-sdc1 is used,
> so that the memory address can be just 4-byte aligned to avoid ldc1/sdc1 address exceptions.
>
> Ex:  (The proposed patch)
> Index: mips.opt
> ===================================================================
> --- mips.opt    (revision 196026)
> +++ mips.opt    (working copy)
> @@ -233,6 +233,10 @@
>  Target Report RejectNegative Mask(MIPS3D)
>  Use MIPS-3D instructions
>
> +mldc1-sdc1
> +Target Report Var(TARGET_LDC1_SDC1) Init(1)
> +Use ldc1 and sdc1 instruction
> +
>  mllsc
>  Target Report Mask(LLSC)
>  Use ll, sc and sync instructions
> Index: mips.h
> ===================================================================
> --- mips.h      (revision 196026)
> +++ mips.h      (working copy)
> @@ -840,8 +840,9 @@
>     ST Loongson 2E/2F.  */
>  #define ISA_HAS_CONDMOVE        (ISA_HAS_FP_CONDMOVE || TARGET_LOONGSON_2EF)
>
> -/* ISA has LDC1 and SDC1.  */
> -#define ISA_HAS_LDC1_SDC1      (!ISA_MIPS1 && !TARGET_MIPS16)
> +/* ISA has LDC1 and SDC1 and they are enabled.  */
> +#define ISA_HAS_LDC1_SDC1 \
> +  (!ISA_MIPS1 && !TARGET_MIPS16 && TARGET_LDC1_SDC1)
>
>  /* ISA has the mips4 FP condition code instructions: FP-compare to CC,
>     branch on CC, and move (both FP and non-FP) on CC.  */
>
>   Any feedback?  Thanks a lot!
>
> Regards,
> Chao-ying
Richard Sandiford - Feb. 14, 2013, 9:52 a.m.
Chao-Ying Fu <Chao-Ying.Fu@imgtec.com> writes:
> Hello All,
>
>   Once in a while we got reports about programs (ex: WebKit, FireFox)
> crash due to ldc1/sdc1 unaligned accesses on MIPS targets.  The root
> cause is that programmers neglect the alignment issue and cast
> arbitrary pointers to point to double variables.
>
>   Although the correct solution is to fix application source code to
> fulfill alignment requirements, we want to add a GCC option to disable
> ldc1 and sdc1 (for the testing purpose or for workaround).  On 32-bit
> MIPS targets, GCC generates lwc1 and swc1 when -mno-ldc1-sdc1 is used,
> so that the memory address can be just 4-byte aligned to avoid
> ldc1/sdc1 address exceptions.

Sounds OK to me, given that the impact of the option is so low.

Bikeshed time, but I'd prefer the option to be named after the thing
that we're guaranteeing, rather than an instruction.  E.g. if the
problem is that doubles might only be 32-bit aligned, we could have
-mmisaligned-double (better suggestions welcome).

What about 64-bit targets?  We can sometimes access doubles using GPRs,
so on 64-bit targets we could end up using LD and SD to access a double
even when this option disables LDC1 and SDC1.  I think we'd need to
patch the move patterns as well.

If you only see the problem on 32-bit targets, then maybe it would be
better to have an option that prevents instructions that require greater
than word alignment.  Say (for sake of argument) -mno-superword-align.
Then it would be:

#define ISA_HAS_LDC1_SDC1 \
  (!ISA_MIPS1 && !TARGET_MIPS16 && (TARGET_64BIT || TARGET_SUPERWORD_ALIGN))

Thanks,
Richard
Chao-Ying Fu - Feb. 15, 2013, 12:19 a.m.
Richard Sandiford wrote:
> Chao-Ying Fu <Chao-Ying.Fu@imgtec.com> writes:
> > Hello All,
> >
> >   Once in a while we got reports about programs (ex: 
> WebKit, FireFox)
> > crash due to ldc1/sdc1 unaligned accesses on MIPS targets.  The root
> > cause is that programmers neglect the alignment issue and cast
> > arbitrary pointers to point to double variables.
> >
> >   Although the correct solution is to fix application source code to
> > fulfill alignment requirements, we want to add a GCC option 
> to disable
> > ldc1 and sdc1 (for the testing purpose or for workaround).  
> On 32-bit
> > MIPS targets, GCC generates lwc1 and swc1 when 
> -mno-ldc1-sdc1 is used,
> > so that the memory address can be just 4-byte aligned to avoid
> > ldc1/sdc1 address exceptions.
> 
> Sounds OK to me, given that the impact of the option is so low.

  Thanks a lot for OK!
 
> Bikeshed time, but I'd prefer the option to be named after the thing
> that we're guaranteeing, rather than an instruction.  E.g. if the
> problem is that doubles might only be 32-bit aligned, we could have
> -mmisaligned-double (better suggestions welcome).
> What about 64-bit targets?  We can sometimes access doubles 
> using GPRs,
> so on 64-bit targets we could end up using LD and SD to 
> access a double
> even when this option disables LDC1 and SDC1.  I think we'd need to
> patch the move patterns as well.
> 
> If you only see the problem on 32-bit targets, then maybe it would be
> better to have an option that prevents instructions that 
> require greater
> than word alignment.  Say (for sake of argument) -mno-superword-align.
> Then it would be:
> 
> #define ISA_HAS_LDC1_SDC1 \
>   (!ISA_MIPS1 && !TARGET_MIPS16 && (TARGET_64BIT || 
> TARGET_SUPERWORD_ALIGN))

  We only got reports about 32-bit MIPS targets.  But similar issues may happen on 64-bit MIPS targets.
Note that these problematic programs can run fine on other architectures that allow 8-byte
load/store instructions from 4-byte aligned addresses.
For a short-term goal, we can use -mon-superword-align (or other option names) to disable
ldc1 and sdc1 on 32-bit MIPS targets where only ldc1 and sdc1 have this 8-byte alignment rule.
Later, we can extend this option to cover 64-bit MIPS targets.

  Thanks!

Regards,
Chao-ying
Maciej W. Rozycki - Feb. 15, 2013, 1:23 a.m.
On Thu, 14 Feb 2013, Richard Sandiford wrote:

> What about 64-bit targets?  We can sometimes access doubles using GPRs,
> so on 64-bit targets we could end up using LD and SD to access a double
> even when this option disables LDC1 and SDC1.  I think we'd need to
> patch the move patterns as well.

 As far as Linux is concerned -- which from the context of this 
consideration I infer is the case -- there is no issue, because unless an 
app has explicitly requested this feature to be disabled, with the use of 
the sysmips(2) FIXADE API (which of course hardly any does), all address 
error exceptions triggered by an unaligned memory access made with a CPU 
instruction are trapped by the kernel and the access emulated.

 However no such accesses are emulated when made with a coprocessor 
instruction, whether the FPU or CP2 or the legacy CP1 and CP3 units as 
supported by earlier processors.  There was once a proposal to emulate FPU 
instructions as well, posted on the linux-mips mailing list here:

http://www.linux-mips.org/cgi-bin/mesg.cgi?a=linux-mips&i=20120615234641.6938B58FE7C%40mail.viric.name

-- but the MIPS/Linux maintainer rejected the idea.  I have cc-ed Ralf on 
this reply in case he wanted to add something.

 Of course an option to GCC to disable any such instructions may have some 
value to some people -- for bare-iron targets if nothing else -- but I 
fear this is going to end up with a lot of hassle with 64-bit ABIs or even 
just 64-bit FPU (-mabi=32 -mfp64) as individual halves of 64-bit registers 
are not addressable in the MIPS architecture (e.g. there's no LWHC1 
instruction), so you'll need to use scratch registers.

 Which is why I think any resources put into this effort would better be 
used to clean up such broken code and, perhaps more importantly, to 
educate people to write their programs properly in the first place.  
Writing portable code is really not a big deal, you just need to remember 
all the world is not x86 and apply a few simple rules.

  Maciej
Andrew Pinski - Feb. 15, 2013, 1:28 a.m.
On Thu, Feb 14, 2013 at 5:23 PM, Maciej W. Rozycki
<macro@codesourcery.com> wrote:
> On Thu, 14 Feb 2013, Richard Sandiford wrote:
>
>> What about 64-bit targets?  We can sometimes access doubles using GPRs,
>> so on 64-bit targets we could end up using LD and SD to access a double
>> even when this option disables LDC1 and SDC1.  I think we'd need to
>> patch the move patterns as well.
>
>  As far as Linux is concerned -- which from the context of this
> consideration I infer is the case -- there is no issue, because unless an
> app has explicitly requested this feature to be disabled, with the use of
> the sysmips(2) FIXADE API (which of course hardly any does), all address
> error exceptions triggered by an unaligned memory access made with a CPU
> instruction are trapped by the kernel and the access emulated.
>
>  However no such accesses are emulated when made with a coprocessor
> instruction, whether the FPU or CP2 or the legacy CP1 and CP3 units as
> supported by earlier processors.  There was once a proposal to emulate FPU
> instructions as well, posted on the linux-mips mailing list here:
>
> http://www.linux-mips.org/cgi-bin/mesg.cgi?a=linux-mips&i=20120615234641.6938B58FE7C%40mail.viric.name
>
> -- but the MIPS/Linux maintainer rejected the idea.  I have cc-ed Ralf on
> this reply in case he wanted to add something.

I would reject it also.  I say just fix the webkit code and forget
about these patches to both the GCC and the kernel.
Webkit is breaking C/C++ alignment rules anyways so the code is
undefined.  Just go and fix webkit.  There is no reason why webkit
can't just be fixed and we can forget about these broken patches to
work around broken code.

Thanks,
Andrew Pinski

>
>  Of course an option to GCC to disable any such instructions may have some
> value to some people -- for bare-iron targets if nothing else -- but I
> fear this is going to end up with a lot of hassle with 64-bit ABIs or even
> just 64-bit FPU (-mabi=32 -mfp64) as individual halves of 64-bit registers
> are not addressable in the MIPS architecture (e.g. there's no LWHC1
> instruction), so you'll need to use scratch registers.
>
>  Which is why I think any resources put into this effort would better be
> used to clean up such broken code and, perhaps more importantly, to
> educate people to write their programs properly in the first place.
> Writing portable code is really not a big deal, you just need to remember
> all the world is not x86 and apply a few simple rules.
>
>   Maciej
Ralf Baechle - Feb. 15, 2013, 3:37 p.m.
On Fri, Feb 15, 2013 at 01:23:14AM +0000, Maciej W. Rozycki wrote:

> > What about 64-bit targets?  We can sometimes access doubles using GPRs,
> > so on 64-bit targets we could end up using LD and SD to access a double
> > even when this option disables LDC1 and SDC1.  I think we'd need to
> > patch the move patterns as well.
> 
>  As far as Linux is concerned -- which from the context of this 
> consideration I infer is the case -- there is no issue, because unless an 
> app has explicitly requested this feature to be disabled, with the use of 
> the sysmips(2) FIXADE API (which of course hardly any does), all address 
> error exceptions triggered by an unaligned memory access made with a CPU 
> instruction are trapped by the kernel and the access emulated.

The sysmips(MIPS_FIXADE, ...) API is crafted after other, older MIPS
UNIX variants which as far as I was able to find out all only support
fixing up of integer accesses.

>  However no such accesses are emulated when made with a coprocessor 
> instruction, whether the FPU or CP2 or the legacy CP1 and CP3 units as 
> supported by earlier processors.  There was once a proposal to emulate FPU 
> instructions as well, posted on the linux-mips mailing list here:
> 
> http://www.linux-mips.org/cgi-bin/mesg.cgi?a=linux-mips&i=20120615234641.6938B58FE7C%40mail.viric.name
> 
> -- but the MIPS/Linux maintainer rejected the idea.  I have cc-ed Ralf on 
> this reply in case he wanted to add something.
> 
>  Of course an option to GCC to disable any such instructions may have some 
> value to some people -- for bare-iron targets if nothing else -- but I 
> fear this is going to end up with a lot of hassle with 64-bit ABIs or even 
> just 64-bit FPU (-mabi=32 -mfp64) as individual halves of 64-bit registers 
> are not addressable in the MIPS architecture (e.g. there's no LWHC1 
> instruction), so you'll need to use scratch registers.
> 
>  Which is why I think any resources put into this effort would better be 
> used to clean up such broken code and, perhaps more importantly, to 
> educate people to write their programs properly in the first place.  
> Writing portable code is really not a big deal, you just need to remember 
> all the world is not x86 and apply a few simple rules.

And I'd like to remind that emulation of missaligned accesses in software
is on the order of 1000 times slower than a properly aligned access.  As I
understand right now the issue only affects webkit and code based on it so
fixing it in webkit should be the way to go rather than introducing a
compatibility hack.

Even on x86 (or MIPS variants such as the Octeon core) which fixup
misaligned accesses in hardware there is still a performance penalty for
misalignment so proper alignment always matters.

Unless maybe when packing structures for space reasons.

  Ralf
Chao-Ying Fu - Feb. 15, 2013, 7:44 p.m.
Maciej W. Rozycki wrote:
> 
> On Thu, 14 Feb 2013, Richard Sandiford wrote:
> 
> > What about 64-bit targets?  We can sometimes access doubles 
> using GPRs,
> > so on 64-bit targets we could end up using LD and SD to 
> access a double
> > even when this option disables LDC1 and SDC1.  I think we'd need to
> > patch the move patterns as well.
> 
>  As far as Linux is concerned -- which from the context of this 
> consideration I infer is the case -- there is no issue, 
> because unless an 
> app has explicitly requested this feature to be disabled, 
> with the use of 
> the sysmips(2) FIXADE API (which of course hardly any does), 
> all address 
> error exceptions triggered by an unaligned memory access made 
> with a CPU 
> instruction are trapped by the kernel and the access emulated.
> 
>  However no such accesses are emulated when made with a coprocessor 
> instruction, whether the FPU or CP2 or the legacy CP1 and CP3 
> units as 
> supported by earlier processors.  There was once a proposal 
> to emulate FPU 
> instructions as well, posted on the linux-mips mailing list here:
> 
> http://www.linux-mips.org/cgi-bin/mesg.cgi?a=linux-mips&i=2012
0615234641.6938B58FE7C%40mail.viric.name
> 
> -- but the MIPS/Linux maintainer rejected the idea.  I have 
> cc-ed Ralf on 
> this reply in case he wanted to add something.

  The kernel emulation can help, but the performance is not great (as Ralf pointed out).
So, this compiler workaround may have its advantages.

> 
>  Of course an option to GCC to disable any such instructions 
> may have some 
> value to some people -- for bare-iron targets if nothing else 

  It is still good for 32-bit MIPS Linux targets.  Right?

> -- but I 
> fear this is going to end up with a lot of hassle with 64-bit 
> ABIs or even 
> just 64-bit FPU (-mabi=32 -mfp64) as individual halves of 
> 64-bit registers 
> are not addressable in the MIPS architecture (e.g. there's no LWHC1 
> instruction), so you'll need to use scratch registers.

  Thanks for this point!  I tested the patch and it cannot work with -mabi=32 -mfp64.
Probably we can limit the scope of the patch to 32-bit MIPS targets with 32-bit FPU.

> 
>  Which is why I think any resources put into this effort 
> would better be 
> used to clean up such broken code and, perhaps more importantly, to 
> educate people to write their programs properly in the first place.  
> Writing portable code is really not a big deal, you just need 
> to remember 
> all the world is not x86 and apply a few simple rules.
> 

  I know this, but it's very hard to educate people and let people admit it's not CPU's fault,
when same software can run on other architectures.

  Thanks a lot for your inputs!

Regards,
Chao-ying
Mike Stump - Feb. 15, 2013, 11:40 p.m.
On Feb 15, 2013, at 11:44 AM, Chao-Ying Fu <Chao-Ying.Fu@imgtec.com> wrote:
> I know this, but it's very hard to educate people and let people admit it's not CPU's fault,
> when same software can run on other architectures.

Show us the link were you tried.  Some of might be happy to help beat on them.  If you've not tried, well, no sympathy from us.  Also, if you can show a performance gain on CPUs that otherwise allow the access, but none the less run slower for it, you can push on performance   grounds alone.

Patch

Index: mips.opt
===================================================================
--- mips.opt    (revision 196026)
+++ mips.opt    (working copy)
@@ -233,6 +233,10 @@ 
 Target Report RejectNegative Mask(MIPS3D)
 Use MIPS-3D instructions

+mldc1-sdc1
+Target Report Var(TARGET_LDC1_SDC1) Init(1)
+Use ldc1 and sdc1 instruction
+
 mllsc
 Target Report Mask(LLSC)
 Use ll, sc and sync instructions
Index: mips.h
===================================================================
--- mips.h      (revision 196026)
+++ mips.h      (working copy)
@@ -840,8 +840,9 @@ 
    ST Loongson 2E/2F.  */
 #define ISA_HAS_CONDMOVE        (ISA_HAS_FP_CONDMOVE || TARGET_LOONGSON_2EF)

-/* ISA has LDC1 and SDC1.  */
-#define ISA_HAS_LDC1_SDC1      (!ISA_MIPS1 && !TARGET_MIPS16)
+/* ISA has LDC1 and SDC1 and they are enabled.  */
+#define ISA_HAS_LDC1_SDC1 \
+  (!ISA_MIPS1 && !TARGET_MIPS16 && TARGET_LDC1_SDC1)

 /* ISA has the mips4 FP condition code instructions: FP-compare to CC,
    branch on CC, and move (both FP and non-FP) on CC.  */