diff mbox

arch/powerpc/lib/copy_32.S: Use alternate memcpy for MPC512x and MPC52xx

Message ID 181804936ABC2349BE503168465576460F272CA4@exchserver.basler.com (mailing list archive)
State Rejected
Delegated to: Benjamin Herrenschmidt
Headers show

Commit Message

Steve Deiters June 29, 2010, 4:04 p.m. UTC
These processors will corrupt data if accessing the local bus with
unaligned
addresses. This version fixes the typical case of copying from Flash on
the
local bus by keeping the source address always aligned.

Signed-off-by: Steve Deiters <SteveDeiters@basler.com>
---
 arch/powerpc/lib/copy_32.S |   56
++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 56 insertions(+), 0 deletions(-)

Comments

Segher Boessenkool June 29, 2010, 4:58 p.m. UTC | #1
> These processors will corrupt data if accessing the local bus with
> unaligned
> addresses. This version fixes the typical case of copying from  
> Flash on
> the
> local bus by keeping the source address always aligned.

On many platforms accessing ROM as RAM simply doesn't work(*).  You  
shouldn't
map ROM as if it is RAM, and shouldn't use the same access functions  
on it.


Segher


(*) Example: any existing 970 system will checkstop as soon as you  
try to
do any cacheable access to some ROM.  Another example of course is  
unaligned
accesses on pretty much any system, no matter whether it's called a  
bug or
a feature on that system :-P
Benjamin Herrenschmidt July 8, 2010, 5:10 a.m. UTC | #2
On Tue, 2010-06-29 at 11:04 -0500, Steve Deiters wrote:
> These processors will corrupt data if accessing the local bus with
> unaligned
> addresses. This version fixes the typical case of copying from Flash on
> the
> local bus by keeping the source address always aligned.

Shouldn't this be solved by using memcpy_to/fromio ?

Cheers,
Ben.

> Signed-off-by: Steve Deiters <SteveDeiters@basler.com>
> ---
>  arch/powerpc/lib/copy_32.S |   56
> ++++++++++++++++++++++++++++++++++++++++++++
>  1 files changed, 56 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/powerpc/lib/copy_32.S b/arch/powerpc/lib/copy_32.S
> index 74a7f41..42e7df5 100644
> --- a/arch/powerpc/lib/copy_32.S
> +++ b/arch/powerpc/lib/copy_32.S
> @@ -226,6 +226,60 @@ _GLOBAL(memmove)
>  	bgt	backwards_memcpy
>  	/* fall through */
>  
> +#if defined(CONFIG_PPC_MPC512x) || defined(CONFIG_PPC_MPC52xx)
> +
> +/*
> + * Alternate memcpy for MPC512x and MPC52xx to guarantee source
> + * address is always aligned to prevent corruption issues when
> + * copying unaligned from the local bus. This only fixes the usage
> + * when copying from the local bus (e.g. Flash) and will not fix
> + * issues copying to the local bus
> + */
> +_GLOBAL(memcpy)
> +	srwi.	r7,r5,3
> +	addi	r6,r3,-4
> +	addi	r4,r4,-4
> +	beq	2f			/* if less than 8 bytes to do */
> +	andi.	r0,r4,3			/* get src word aligned */
> +	mtctr	r7
> +	bne	5f
> +1:	lwz	r7,4(r4)
> +	lwzu	r8,8(r4)
> +	stw	r7,4(r6)
> +	stwu	r8,8(r6)
> +	bdnz	1b
> +	andi.	r5,r5,7
> +2:	cmplwi	0,r5,4
> +	blt	3f
> +	andi.	r0,r4,3
> +	bne	3f
> +	lwzu	r0,4(r4)
> +	addi	r5,r5,-4
> +	stwu	r0,4(r6)
> +3:	cmpwi	0,r5,0
> +	beqlr
> +	mtctr	r5
> +	addi	r4,r4,3
> +	addi	r6,r6,3
> +4:	lbzu	r0,1(r4)
> +	stbu	r0,1(r6)
> +	bdnz	4b
> +	blr
> +5:	subfic	r0,r0,4
> +	mtctr	r0
> +6:	lbz	r7,4(r4)
> +	addi	r4,r4,1
> +	stb	r7,4(r6)
> +	addi	r6,r6,1
> +	bdnz	6b
> +	subf	r5,r0,r5
> +	rlwinm.	r7,r5,32-3,3,31
> +	beq	2b
> +	mtctr	r7
> +	b	1b
> +
> +#else
> +
>  _GLOBAL(memcpy)
>  	srwi.	r7,r5,3
>  	addi	r6,r3,-4
> @@ -267,6 +321,8 @@ _GLOBAL(memcpy)
>  	mtctr	r7
>  	b	1b
>  
> +#endif
> +
>  _GLOBAL(backwards_memcpy)
>  	rlwinm.	r7,r5,32-3,3,31		/* r0 = r5 >> 3 */
>  	add	r6,r3,r5
Grant Likely July 8, 2010, 5:38 a.m. UTC | #3
On Wed, Jul 7, 2010 at 11:10 PM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> On Tue, 2010-06-29 at 11:04 -0500, Steve Deiters wrote:
>> These processors will corrupt data if accessing the local bus with
>> unaligned
>> addresses. This version fixes the typical case of copying from Flash on
>> the
>> local bus by keeping the source address always aligned.
>
> Shouldn't this be solved by using memcpy_to/fromio ?

Maybe.  plain memcpy() access to anything localbus-attached on the
mpc5xxx is the wrong thing to do.  memcpy_to/fromio might do the right
thing; but the caller must understand the limitations of the localplus
bus.  The byte-wise alignment that memcpy_to/fromio() does may not
work (depending on configuration).

Steve, what code is doing a memcpy from flash?

g.
Steve Deiters July 8, 2010, 2:38 p.m. UTC | #4
> -----Original Message-----
> From: glikely@secretlab.ca [mailto:glikely@secretlab.ca] On 
> Behalf Of Grant Likely
> Sent: Thursday, July 08, 2010 12:38 AM
> To: Benjamin Herrenschmidt
> Cc: Steve Deiters; linuxppc-dev@lists.ozlabs.org
> Subject: Re: [PATCH] arch/powerpc/lib/copy_32.S: Use 
> alternate memcpy for MPC512x and MPC52xx
> 
> On Wed, Jul 7, 2010 at 11:10 PM, Benjamin Herrenschmidt 
> <benh@kernel.crashing.org> wrote:
> > On Tue, 2010-06-29 at 11:04 -0500, Steve Deiters wrote:
> >> These processors will corrupt data if accessing the local bus with 
> >> unaligned addresses. This version fixes the typical case 
> of copying 
> >> from Flash on the local bus by keeping the source address always 
> >> aligned.
> >
> > Shouldn't this be solved by using memcpy_to/fromio ?
> 
> Maybe.  plain memcpy() access to anything localbus-attached 
> on the mpc5xxx is the wrong thing to do.  memcpy_to/fromio 
> might do the right thing; but the caller must understand the 
> limitations of the localplus bus.  The byte-wise alignment 
> that memcpy_to/fromio() does may not work (depending on 
> configuration).
> 
> Steve, what code is doing a memcpy from flash?
> 
> g.

This was done in the JFFS2 code, in fs/jffs2/scan.c.  At least in one
function, in jffs2_scan_dirent_node it was using memcpy on a localbus
mapped structure.  I was getting JFFS2 filesystem corruption because of
this.  In fact I first tried changing this to memcpy_fromio and it fixed
that particular problem.  I was concerned though that other places I was
not aware of might be using memcpy from the localbus in a similar manner
so I decided to just tweak the memcpy routine.

Just out of curiousity, what configuration might cause a byte-wise
alignment not to work?
Grant Likely July 8, 2010, 3:22 p.m. UTC | #5
On Thu, Jul 8, 2010 at 8:38 AM, Steve Deiters <SteveDeiters@basler.com> wrote:
>> -----Original Message-----
>> From: glikely@secretlab.ca [mailto:glikely@secretlab.ca] On
>> Behalf Of Grant Likely
>> Sent: Thursday, July 08, 2010 12:38 AM
>> To: Benjamin Herrenschmidt
>> Cc: Steve Deiters; linuxppc-dev@lists.ozlabs.org
>> Subject: Re: [PATCH] arch/powerpc/lib/copy_32.S: Use
>> alternate memcpy for MPC512x and MPC52xx
>>
>> On Wed, Jul 7, 2010 at 11:10 PM, Benjamin Herrenschmidt
>> <benh@kernel.crashing.org> wrote:
>> > On Tue, 2010-06-29 at 11:04 -0500, Steve Deiters wrote:
>> >> These processors will corrupt data if accessing the local bus with
>> >> unaligned addresses. This version fixes the typical case
>> of copying
>> >> from Flash on the local bus by keeping the source address always
>> >> aligned.
>> >
>> > Shouldn't this be solved by using memcpy_to/fromio ?
>>
>> Maybe.  plain memcpy() access to anything localbus-attached
>> on the mpc5xxx is the wrong thing to do.  memcpy_to/fromio
>> might do the right thing; but the caller must understand the
>> limitations of the localplus bus.  The byte-wise alignment
>> that memcpy_to/fromio() does may not work (depending on
>> configuration).
>>
>> Steve, what code is doing a memcpy from flash?
>>
>> g.
>
> This was done in the JFFS2 code, in fs/jffs2/scan.c.  At least in one
> function, in jffs2_scan_dirent_node it was using memcpy on a localbus
> mapped structure.  I was getting JFFS2 filesystem corruption because of
> this.  In fact I first tried changing this to memcpy_fromio and it fixed
> that particular problem.  I was concerned though that other places I was
> not aware of might be using memcpy from the localbus in a similar manner
> so I decided to just tweak the memcpy routine.

[cc'ing David Woodhouse]

Sounds to me like the right thing to do is to fix the jffs2 code.

> Just out of curiousity, what configuration might cause a byte-wise
> alignment not to work?

Can't remember the register configuration, but I worked on one project
where this was the case.  In hindsight, it was probably a
mis-configuration of the localbus CS for the particular device.

g.
diff mbox

Patch

diff --git a/arch/powerpc/lib/copy_32.S b/arch/powerpc/lib/copy_32.S
index 74a7f41..42e7df5 100644
--- a/arch/powerpc/lib/copy_32.S
+++ b/arch/powerpc/lib/copy_32.S
@@ -226,6 +226,60 @@  _GLOBAL(memmove)
 	bgt	backwards_memcpy
 	/* fall through */
 
+#if defined(CONFIG_PPC_MPC512x) || defined(CONFIG_PPC_MPC52xx)
+
+/*
+ * Alternate memcpy for MPC512x and MPC52xx to guarantee source
+ * address is always aligned to prevent corruption issues when
+ * copying unaligned from the local bus. This only fixes the usage
+ * when copying from the local bus (e.g. Flash) and will not fix
+ * issues copying to the local bus
+ */
+_GLOBAL(memcpy)
+	srwi.	r7,r5,3
+	addi	r6,r3,-4
+	addi	r4,r4,-4
+	beq	2f			/* if less than 8 bytes to do */
+	andi.	r0,r4,3			/* get src word aligned */
+	mtctr	r7
+	bne	5f
+1:	lwz	r7,4(r4)
+	lwzu	r8,8(r4)
+	stw	r7,4(r6)
+	stwu	r8,8(r6)
+	bdnz	1b
+	andi.	r5,r5,7
+2:	cmplwi	0,r5,4
+	blt	3f
+	andi.	r0,r4,3
+	bne	3f
+	lwzu	r0,4(r4)
+	addi	r5,r5,-4
+	stwu	r0,4(r6)
+3:	cmpwi	0,r5,0
+	beqlr
+	mtctr	r5
+	addi	r4,r4,3
+	addi	r6,r6,3
+4:	lbzu	r0,1(r4)
+	stbu	r0,1(r6)
+	bdnz	4b
+	blr
+5:	subfic	r0,r0,4
+	mtctr	r0
+6:	lbz	r7,4(r4)
+	addi	r4,r4,1
+	stb	r7,4(r6)
+	addi	r6,r6,1
+	bdnz	6b
+	subf	r5,r0,r5
+	rlwinm.	r7,r5,32-3,3,31
+	beq	2b
+	mtctr	r7
+	b	1b
+
+#else
+
 _GLOBAL(memcpy)
 	srwi.	r7,r5,3
 	addi	r6,r3,-4
@@ -267,6 +321,8 @@  _GLOBAL(memcpy)
 	mtctr	r7
 	b	1b
 
+#endif
+
 _GLOBAL(backwards_memcpy)
 	rlwinm.	r7,r5,32-3,3,31		/* r0 = r5 >> 3 */
 	add	r6,r3,r5