Patchwork [U-Boot] arm: fix memory coherency problem after relocation

login
register
mail settings
Submitter Mike Dunn
Date June 17, 2013, 6:07 p.m.
Message ID <1371492435-14330-1-git-send-email-mikedunn@newsguy.com>
Download mbox | patch
Permalink /patch/252031/
State Superseded
Delegated to: Albert ARIBAUD
Headers show

Comments

Mike Dunn - June 17, 2013, 6:07 p.m.
On the xscale, the icache must be invalidated and the write buffers drained
after writing code over the data bus, even if the caches are disabled.  After
rebasing with the main git repository, u-boot began crashing in odd places on my
pxa270 board (palmtreo680) after the code relocation routine ran.  This patch
fixes it.  Cache coherency problems are often hit-and-miss (ha ha), and this
latent problem didn't rear its ugly head until now.  Tested on the pxa270.

Signed-off-by: Mike Dunn <mikedunn@newsguy.com>
---

I realize that __ARM_ARCH_5TE__ does not necessarily mean xscale.  But how else
to test for pxa2xx/ixp in an assembly file?  I wouldn't expect any ill effects
on a non-xscale v5te because these CP15 operations are defined the same way in
the ARM Architecture Reference Manual for v5 cores, but unfortunately I only
have a pxa270 xscale to test on.

I experienced this same problem a couple years ago when I was getting OpenOCD
working on the xscale.  Often software breakpoints (replacing an instruction
with the 'bkpt' instruction) would fail unless this operation was performed
after replacing the instruction, even with caches off.  The following two
paragraphs are cut from section 4.2.7 of the xscale core developer's manual...

If the instruction cache is not enabled, or code is being written to a
non-cacheable region, software must still invalidate the instruction cache
before using the newly-written code. This precaution ensures that state
associated with the new code is not buffered elsewhere in the processor, such as
the fetch buffers or the BTB.

Naturally, when writing code as data, care must be taken to force it completely
out of the processor into external memory before attempting to execute it. If
writing into a non-cacheable region, flushing the write buffers is sufficient
precaution (see Section 7.2.8 for a description of this

 arch/arm/lib/relocate.S |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)
Marek Vasut - June 20, 2013, 1:56 p.m.
Dear Mike Dunn,

> On the xscale, the icache must be invalidated and the write buffers drained
> after writing code over the data bus, even if the caches are disabled. 
> After rebasing with the main git repository, u-boot began crashing in odd
> places on my pxa270 board (palmtreo680) after the code relocation routine
> ran.  This patch fixes it.  Cache coherency problems are often
> hit-and-miss (ha ha), and this latent problem didn't rear its ugly head
> until now.  Tested on the pxa270.
> 
> Signed-off-by: Mike Dunn <mikedunn@newsguy.com>
> ---
> 
> I realize that __ARM_ARCH_5TE__ does not necessarily mean xscale.

We should introduce some CONFIG_PXA I guess.

Best regards,
Marek Vasut
Mike Dunn - June 20, 2013, 5:04 p.m.
On 06/20/2013 06:56 AM, Marek Vasut wrote:
> Dear Mike Dunn,
> 
>> On the xscale, the icache must be invalidated and the write buffers drained
>> after writing code over the data bus, even if the caches are disabled. 
>> After rebasing with the main git repository, u-boot began crashing in odd
>> places on my pxa270 board (palmtreo680) after the code relocation routine
>> ran.  This patch fixes it.  Cache coherency problems are often
>> hit-and-miss (ha ha), and this latent problem didn't rear its ugly head
>> until now.  Tested on the pxa270.
>>
>> Signed-off-by: Mike Dunn <mikedunn@newsguy.com>
>> ---
>>
>> I realize that __ARM_ARCH_5TE__ does not necessarily mean xscale.
> 
> We should introduce some CONFIG_PXA I guess.


I think CONFIG_XSCALE is more correct, because if I'm not mistaken more recent
Marvell processors continue to use the 'pxa' nomenclature in their name but do
not contain xscale cores.

I'll take a stab at a patch if you like.  Maybe something like this macro in
arch/arm/include/asm/arch-pxa/hardware.h

/*
 * Define CONFIG_CPU_MONAHANS in case some CPU of the PXA3xx family is selected.
 * PXA300/310/320 all have distinct register mappings in some cases, that's why
 * the exact CPU has to be selected. CONFIG_CPU_MONAHANS is a helper for common
 * drivers and compatibility glue with old source then.
 */
#ifndef	CONFIG_CPU_MONAHANS
#if	defined(CONFIG_CPU_PXA300) || \
	defined(CONFIG_CPU_PXA310) || \
	defined(CONFIG_CPU_PXA320)
#define	CONFIG_CPU_MONAHANS
#endif
#endif

Thanks,
Mike
Marek Vasut - June 20, 2013, 8:19 p.m.
Dear Mike Dunn,

> On 06/20/2013 06:56 AM, Marek Vasut wrote:
> > Dear Mike Dunn,
> > 
> >> On the xscale, the icache must be invalidated and the write buffers
> >> drained after writing code over the data bus, even if the caches are
> >> disabled. After rebasing with the main git repository, u-boot began
> >> crashing in odd places on my pxa270 board (palmtreo680) after the code
> >> relocation routine ran.  This patch fixes it.  Cache coherency problems
> >> are often
> >> hit-and-miss (ha ha), and this latent problem didn't rear its ugly head
> >> until now.  Tested on the pxa270.
> >> 
> >> Signed-off-by: Mike Dunn <mikedunn@newsguy.com>
> >> ---
> >> 
> >> I realize that __ARM_ARCH_5TE__ does not necessarily mean xscale.
> > 
> > We should introduce some CONFIG_PXA I guess.
> 
> I think CONFIG_XSCALE is more correct, because if I'm not mistaken more
> recent Marvell processors continue to use the 'pxa' nomenclature in their
> name but do not contain xscale cores.

I suspect we might end up checking all the datasheets (PXA2xx, PXA3xx and 
PXA9xx) to see which CPUs still need this fixup.

Best regards,
Marek Vasut

Patch

diff --git a/arch/arm/lib/relocate.S b/arch/arm/lib/relocate.S
index 4446da9..e2febed 100644
--- a/arch/arm/lib/relocate.S
+++ b/arch/arm/lib/relocate.S
@@ -92,6 +92,16 @@  fixnext:
 
 relocate_done:
 
+#ifdef __ARM_ARCH_5TE__
+	/*
+	 * On xscale, icache must be invalidated and write buffers drained,
+	 * even with cache disabled - 4.2.7 of xscale core developer's manual
+	 */
+	mov	r0, #0                  /* arm reference manual: "data SBZ" */
+	mcr	p15, 0, r0, c7, c7, 0	/* invalidate icache */
+	mcr	p15, 0, r0, c7, c10, 4	/* drain write buffer */
+#endif
+
 	/* ARMv4- don't know bx lr but the assembler fails to see that */
 
 #ifdef __ARM_ARCH_4__