Message ID | 200903121430.49077.david.jander@protonic.nl (mailing list archive) |
---|---|
State | Superseded, archived |
Please note: the proposed patch is actually incomplete; someone with better knowledge of PowerPC assembly than me should complete it. According to the errata from Freescale, the proposed workaround should be a complete LRW (Least-Recently Written) implementation. AFAIK that would require holding an extra table in RAM with LRW information for each entry in the TLB.

Anyway, with this patch I am experiencing an enormous overall speed-up. Some example tests I have done so far:

- 'mplayer -nosound -benchmark' shows a speedup of roughly 22%
- 'prboom -timedemo test' (where 'test.lmb' is a prerecorded demo) shows an increase from 14.1 to 16.7 fps.

Synthetic memcpy() benchmarks may show a more drastic improvement (if they are hit by this bug): using 'minibench' from Gunnar Von Boehn, memcpy() speed goes up from 27 Mbyte/s to 173 Mbyte/s for memory-to-memory cases.

Greetings,
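For readers wondering what the "extra table in RAM" might look like, here is a minimal C sketch of per-set LRW bookkeeping. It is not taken from the Freescale errata or from this patch: the table layout, the names `last_written_way` and `lrw_pick_way`, and the figure of 32 sets with 2 ways (inferred from the five address bits mentioned in the commit message) are all assumptions made for illustration. A real implementation would live in the TLB miss handlers in assembly.

```c
#include <stdint.h>

/* Assumption: 2-way DTLB with 32 sets (five index bits). */
#define DTLB_SETS 32u

/* Hypothetical LRW table: for each set, the way written most recently. */
static uint8_t last_written_way[DTLB_SETS];

/*
 * Pick the way that was NOT written most recently in this set,
 * then record it as the new most recently written way.
 */
static unsigned lrw_pick_way(unsigned set)
{
    unsigned way;

    set %= DTLB_SETS;                  /* keep the index inside the table */
    way = last_written_way[set] ^ 1u;  /* the other of the two ways */
    last_written_way[set] = (uint8_t)way;
    return way;
}
```

With this bookkeeping, successive reloads into the same set alternate between the two ways instead of always landing in the same one.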
--- a/arch/powerpc/kernel/head_32.S
+++ b/arch/powerpc/kernel/head_32.S
@@ -614,6 +614,14 @@ DataStoreTLBMiss:
  */
 	mfctr	r0
 	/* Get PTE (linux-style) and check access */
+#ifdef CONFIG_PPC_MPC512x
+/* MPC512x: (partial) workaround for errata in die M36P and earlier:
+ * Force writes to Way 0 (reads are always way 1)
+ */
+	mfspr	r3,SPRN_SRR1
+	rlwinm	r3,r3,0,15,13	/* Mask out SRR1[WAY] */
+	mtspr	SPRN_SRR1,r3
+#endif
 	mfspr	r3,SPRN_DMISS
 	lis	r1,PAGE_OFFSET@h	/* check if kernel address */
 	cmplw	0,r1,r3
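The `rlwinm r3,r3,0,15,13` line can look odd because its mask begins (15) after it ends (13). The following C model is a sketch of how that wrap-around mask is formed, assuming the standard PowerPC definition of `rlwinm` and IBM bit numbering (bit 0 is the most significant bit). The helper names `ibm_bit`, `ppc_mask`, `rotl32` and `rlwinm` are illustrative and do not exist in the kernel.

```c
#include <stdint.h>

/* Bit n in IBM numbering: bit 0 is the most significant bit of the word. */
static uint32_t ibm_bit(unsigned n)
{
    return UINT32_C(1) << (31u - n);
}

/* Mask built by rlwinm from MB and ME, including the wrap-around case. */
static uint32_t ppc_mask(unsigned mb, unsigned me)
{
    uint32_t from_mb = UINT32_MAX >> mb;                      /* bits mb..31 */
    uint32_t to_me   = (uint32_t)(UINT32_MAX << (31u - me));  /* bits 0..me  */

    /* MB <= ME: one contiguous run. MB > ME: the run wraps around bit 31. */
    return (mb <= me) ? (from_mb & to_me) : (from_mb | to_me);
}

/* Rotate left by sh, taken modulo 32 so that sh == 0 is well defined. */
static uint32_t rotl32(uint32_t x, unsigned sh)
{
    sh &= 31u;
    return sh ? ((x << sh) | (x >> (32u - sh))) : x;
}

/* Model of: rlwinm rA,rS,SH,MB,ME */
static uint32_t rlwinm(uint32_t rs, unsigned sh, unsigned mb, unsigned me)
{
    return rotl32(rs, sh) & ppc_mask(mb, me);
}
```

Under these assumptions `ppc_mask(15, 13)` is `0xFFFDFFFF`, that is, every bit except bit 14, so the instruction clears exactly the bit that the patch comment calls SRR1[WAY] and leaves the rest of SRR1 untouched.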
Partial workaround for DTLB errata in MPC5121e processors of die M36P and older (all currently existing versions). Due to the bug, the hardware-implemented LRU algorithm always goes to way 1 of the TLB. This fix forces writes to go to way 0, which would speed up memory-copy operations where bits 15...19 of the source and destination addresses are the same.

Signed-off-by: David Jander <david@protonic.nl>
---
 arch/powerpc/kernel/head_32.S |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)
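To make the "bits 15...19" condition concrete, the sketch below checks whether two addresses fall into the same DTLB set. It assumes IBM bit numbering (bit 0 is the most significant bit), so bits 15..19 are the five bits directly above a 4 KiB page offset. The helper names `dtlb_set_index` and `same_dtlb_set` are hypothetical and only serve the illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: bits 15..19 in IBM numbering select the TLB set.
 * In conventional LSB-0 terms these are bits 16..12 of the address.
 */
static unsigned dtlb_set_index(uint32_t ea)
{
    return (ea >> 12) & 0x1Fu;
}

/* True when both addresses compete for the same two-way set. */
static bool same_dtlb_set(uint32_t src, uint32_t dst)
{
    return dtlb_set_index(src) == dtlb_set_index(dst);
}
```

For example, `0x10005000` and `0x20005000` both map to set 5. If every reload goes to way 1, a copy between those two pages would evict the other page's translation on each alternation; sending store misses to way 0 lets both translations stay resident, which is consistent with the memcpy() gains reported above.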