Patchwork Proposal: [PATCH] Workaround for MPC5121 DTLB errata

Submitter David Jander
Date March 12, 2009, 1:30 p.m.
Message ID <200903121430.49077.david.jander@protonic.nl>
Permalink /patch/24342/
State Superseded

Comments

David Jander - March 12, 2009, 1:30 p.m.
Partial workaround for DTLB errata in MPC5121e processors of die M36P and 
older (all currently existing versions).

Due to the bug, the hardware-implemented LRU algorithm always goes to way 1 of 
the TLB. This fix forces writes to go to way 0, which speeds up 
memory-copy operations where bits 15...19 of the source and destination 
addresses are the same.
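
The condition on bits 15...19 can be illustrated with a small C sketch. It assumes 4 KiB pages and a 2-way DTLB with 32 sets, so that those five address bits (PowerPC numbering, bit 0 is the most significant) select the set. The helper names and constants below are hypothetical and are not part of the patch.

```c
#include <stdint.h>
#include <stdbool.h>

/* Assumptions (not taken from the patch): 4 KiB pages and a 32-set,
 * 2-way DTLB, so EA bits 15..19 (bit 0 = MSB) select the set. */
#define DTLB_SETS      32u
#define EA_PAGE_SHIFT  12u

/* Hypothetical helper: DTLB set that an effective address maps to. */
static inline uint32_t dtlb_set_index(uint32_t ea)
{
    return (ea >> EA_PAGE_SHIFT) & (DTLB_SETS - 1u);
}

/* True when source and destination pages compete for the same set,
 * which is the case the commit message describes. */
static inline bool dtlb_same_set(uint32_t src, uint32_t dst)
{
    return dtlb_set_index(src) == dtlb_set_index(dst);
}
```

When both addresses of a copy map to the same set and every reload lands in the same way, each load evicts the translation the store needs and vice versa. Sending store misses to the other way lets both translations stay resident.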

Signed-off-by: David Jander <david@protonic.nl>

---
 arch/powerpc/kernel/head_32.S |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

David Jander - March 12, 2009, 2 p.m.
Please note: the proposed patch is actually incomplete; someone with better 
knowledge of PowerPC assembly than I have should complete it.
According to the errata from Freescale, the proposed workaround should be a 
complete LRW (Least-Recently Written) implementation. AFAIK that would 
require holding an extra table in RAM with LRW information for each entry 
in the TLB.
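
As a rough idea of what such a table could look like, here is a minimal C sketch. It is only one interpretation, since the errata text is not quoted in this thread: it assumes a 2-way DTLB with 32 sets, in which case one bit per set is enough to remember which way was written last. All names are hypothetical, and a real implementation would have to live in the assembly miss handler.

```c
#include <stdint.h>

#define DTLB_SETS 32u   /* assumed: 64 entries, 2 ways */

/* One bit per set: the way (0 or 1) that was written most recently. */
static uint32_t lrw_last_written;

/* Hypothetical helper: return the least-recently-written way for a set
 * and record it as the most recent one. Intended to be called once per
 * TLB reload, before the entry is loaded. */
static unsigned lrw_pick_way(uint32_t set)
{
    uint32_t bit = 1u << (set % DTLB_SETS);
    unsigned way = (lrw_last_written & bit) ? 0u : 1u;

    lrw_last_written ^= bit;    /* the chosen way is now the newest */
    return way;
}
```

With only two ways, the table degenerates to a single 32-bit word per TLB, so the cost in the miss handler would be one load, a few bit operations and one store.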

Anyway, with this patch I am seeing a large overall speed-up. Some 
example tests I have done so far:

- 'mplayer -nosound -benchmark' shows a speedup of roughly 22 %

- 'prboom -timedemo test' (where 'test.lmb' is a prerecorded demo) shows an 
increase from 14.1 to 16.7 fps.

Synthetic memcpy() benchmarks may show a more drastic improvement (if they 
are hit by this bug):

Using 'minibench' from Gunnar Von Boehn, memcpy() speed goes up from 27 Mbyte/s 
to 173 Mbyte/s for memory-to-memory cases.

Greetings,

Patch

--- a/arch/powerpc/kernel/head_32.S
+++ b/arch/powerpc/kernel/head_32.S
@@ -614,6 +614,14 @@  DataStoreTLBMiss:
  */
        mfctr   r0
        /* Get PTE (linux-style) and check access */
+#ifdef CONFIG_PPC_MPC512x
+/* MPC512x: (partial) workaround for errata in die M36P and earlier:
+ * Force writes to Way 0 (reads are always way 1)
+ */
+       mfspr   r3,SPRN_SRR1
+       rlwinm  r3,r3,0,15,13  /* Mask out SRR1[WAY] */
+       mtspr   SPRN_SRR1,r3
+#endif
        mfspr   r3,SPRN_DMISS
        lis     r1,PAGE_OFFSET@h                /* check if kernel address */
        cmplw   0,r1,r3
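
For readers less familiar with rlwinm, the following C sketch models the mask used in the hunk above. With a rotate count of 0 and MB=15, ME=13 the mask wraps around and keeps every bit except bit 14, which is the bit the patch comment identifies as SRR1[WAY]. The function names are hypothetical; this is a model for illustration, not kernel code.

```c
#include <stdint.h>

/* Mask that rlwinm builds for MB..ME in PowerPC bit numbering
 * (bit 0 = most significant). The mask wraps around when MB > ME.
 * Both arguments must be in the range 0..31. */
static uint32_t ppc_mask(unsigned mb, unsigned me)
{
    uint32_t from_mb = 0xFFFFFFFFu >> mb;           /* bits mb..31 */
    uint32_t to_me   = 0xFFFFFFFFu << (31u - me);   /* bits 0..me  */

    return (mb <= me) ? (from_mb & to_me) : (from_mb | to_me);
}

/* Model of "rlwinm r3,r3,0,15,13": clear bit 14, keep the rest. */
static uint32_t clear_srr1_way(uint32_t srr1)
{
    return srr1 & ppc_mask(15u, 13u);
}
```

In conventional (LSB-0) terms the cleared bit is 1 << 17, so the instruction is equivalent to ANDing with 0xFFFDFFFF.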