
[U-Boot,RFC] armv8: cache: Switch MMU table without turning off MMU

Message ID 1486160568-1372-1-git-send-email-york.sun@nxp.com
State Not Applicable
Delegated to: Tom Rini

Commit Message

York Sun Feb. 3, 2017, 10:22 p.m. UTC
We don't have to completely turn off the MMU and caches to switch to
another MMU table, as long as the data is coherent before and after
the switch. This patch relaxes the procedure.

Signed-off-by: York Sun <york.sun@nxp.com>
CC: Alexander Graf <agraf@suse.de>
---
I found this issue while trying to change the MMU table for an SPL boot.
The RAM version of U-Boot was copied to memory while the d-cache was enabled.
The SPL code never flushed the cache (I will send another patch to fix that).
With the existing code, U-Boot stops running as soon as the MMU is disabled.
With the proposed change below, U-Boot continues to run. I have been in
contact with ARM support and got very useful information. However, this
TTBR-switching method is "behaviour that should work" but has not been well
verified by ARM. During my debugging, I found other minor issues which
convinced me this code had not been exercised. I don't intend to use this
method in the long term, but figured that fixing it may help others.
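
[Illustrative sketch, not part of this patch: the SPL-side flush alluded
to above could be as small as cleaning the freshly copied image back to
memory before jumping to it. CONFIG_SYS_TEXT_BASE is U-Boot's usual image
base; __spl_image_end is a hypothetical symbol standing in for the end of
the copied image.]

	/* Hypothetical SPL fix sketch: clean+invalidate the copied U-Boot
	 * image by VA so the d-cache contents reach memory before the jump. */
	ldr	x0, =CONFIG_SYS_TEXT_BASE	/* start of the copied image */
	ldr	x1, =__spl_image_end		/* hypothetical end-of-image symbol */
	bl	__asm_flush_dcache_range	/* existing helper in cache.S */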

 arch/arm/cpu/armv8/cache.S | 44 +++++++++-----------------------------------
 1 file changed, 9 insertions(+), 35 deletions(-)

Comments

Stephen Warren Feb. 10, 2017, 5:04 p.m. UTC | #1
On 02/03/2017 03:22 PM, York Sun wrote:
> We don't have to completely turn off the MMU and caches to switch to
> another MMU table, as long as the data is coherent before and after
> the switch. This patch relaxes the procedure.

This seems plausible, but I'm not immersed well enough in the details to 
really review it. I CC'd a couple of Linux people from ARM who should be 
able to review.

> Signed-off-by: York Sun <york.sun@nxp.com>
> CC: Alexander Graf <agraf@suse.de>
> ---
> I found this issue while trying to change the MMU table for an SPL boot.
> The RAM version of U-Boot was copied to memory while the d-cache was enabled.
> The SPL code never flushed the cache (I will send another patch to fix that).
> With the existing code, U-Boot stops running as soon as the MMU is disabled.
> With the proposed change below, U-Boot continues to run. I have been in
> contact with ARM support and got very useful information. However, this
> TTBR-switching method is "behaviour that should work" but has not been well
> verified by ARM. During my debugging, I found other minor issues which
> convinced me this code had not been exercised. I don't intend to use this
> method in the long term, but figured that fixing it may help others.
>
>  arch/arm/cpu/armv8/cache.S | 44 +++++++++-----------------------------------
>  1 file changed, 9 insertions(+), 35 deletions(-)
>
> diff --git a/arch/arm/cpu/armv8/cache.S b/arch/arm/cpu/armv8/cache.S
> index f1deaa7..63fb112 100644
> --- a/arch/arm/cpu/armv8/cache.S
> +++ b/arch/arm/cpu/armv8/cache.S
> @@ -171,35 +171,10 @@ ENDPROC(__asm_invalidate_l3_icache)
>  /*
>   * void __asm_switch_ttbr(ulong new_ttbr)
>   *
> - * Safely switches to a new page table.
> + * Switches to a new page table. Cache coherency must be maintained
> + * before calling this function.
>   */
>  ENTRY(__asm_switch_ttbr)
> -	/* x2 = SCTLR (alive throghout the function) */
> -	switch_el x4, 3f, 2f, 1f
> -3:	mrs	x2, sctlr_el3
> -	b	0f
> -2:	mrs	x2, sctlr_el2
> -	b	0f
> -1:	mrs	x2, sctlr_el1
> -0:
> -
> -	/* Unset CR_M | CR_C | CR_I from SCTLR to disable all caches */
> -	movn	x1, #(CR_M | CR_C | CR_I)
> -	and	x1, x2, x1
> -	switch_el x4, 3f, 2f, 1f
> -3:	msr	sctlr_el3, x1
> -	b	0f
> -2:	msr	sctlr_el2, x1
> -	b	0f
> -1:	msr	sctlr_el1, x1
> -0:	isb
> -
> -	/* This call only clobbers x30 (lr) and x9 (unused) */
> -	mov	x3, x30
> -	bl	__asm_invalidate_tlb_all
> -
> -	/* From here on we're running safely with caches disabled */
> -
>  	/* Set TTBR to our first argument */
>  	switch_el x4, 3f, 2f, 1f
>  3:	msr	ttbr0_el3, x0
> @@ -209,14 +184,13 @@ ENTRY(__asm_switch_ttbr)
>  1:	msr	ttbr0_el1, x0
>  0:	isb
>
> -	/* Restore original SCTLR and thus enable caches again */
> -	switch_el x4, 3f, 2f, 1f
> -3:	msr	sctlr_el3, x2
> -	b	0f
> -2:	msr	sctlr_el2, x2
> -	b	0f
> -1:	msr	sctlr_el1, x2
> -0:	isb
> +	/* invalidate i-cache */
> +	ic      ialluis
> +	isb     sy
> +
> +	/* This call only clobbers x30 (lr) and x9 (unused) */
> +	mov	x3, x30
> +	bl	__asm_invalidate_tlb_all
>
>  	ret	x3
>  ENDPROC(__asm_switch_ttbr)
>
Mark Rutland Feb. 10, 2017, 5:31 p.m. UTC | #2
Hi,

On Fri, Feb 03, 2017 at 02:22:48PM -0800, York Sun wrote:
> We don't have to completely turn off the MMU and caches to switch to
> another MMU table, as long as the data is coherent before and after
> the switch. This patch relaxes the procedure.
>
> Signed-off-by: York Sun <york.sun@nxp.com>
> CC: Alexander Graf <agraf@suse.de>
> ---
> I found this issue while trying to change the MMU table for an SPL boot.
> The RAM version of U-Boot was copied to memory while the d-cache was enabled.
> The SPL code never flushed the cache (I will send another patch to fix that).
> With the existing code, U-Boot stops running as soon as the MMU is disabled.
> With the proposed change below, U-Boot continues to run. I have been in
> contact with ARM support and got very useful information. However, this
> TTBR-switching method is "behaviour that should work" but has not been well
> verified by ARM. During my debugging, I found other minor issues which
> convinced me this code had not been exercised. I don't intend to use this
> method in the long term, but figured that fixing it may help others.

I believe the approach taken in this patch is unsafe.

Even if the resulting translations are identical across the old and new
tables, it is not valid to transition from one set of tables to another
unless:

(a) the mappings in both cases are non-global, and the ASID is switched,
    or:

(b) A Break-Before-Make strategy is followed to ensure that the TLBs
    contain no stale entries for the old tables.

Even if the resulting translations are identical, you can be subject to
TLB conflict aborts and/or behaviours resulting from the amalgamation of
TLB entries. In Linux, we had to write special code [1] to switch tables
by using some trampoline code in the other TTBR.
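
[Editorial illustration, not from this thread: if every mapping in both
tables were non-global, route (a) above can be expressed as a single
TTBR0_EL1 write that installs the new base together with a fresh ASID, so
any stale entries stay tagged with the old ASID. The ASID value and the
EL1/TTBR0 choice are assumptions.]

	/* Route (a) sketch, EL1 only: assumes TCR_EL1.A1 = 0, all mappings
	 * non-global, and that the new ASID has no stale TLB entries. */
	mov	x1, #1			/* hypothetical new ASID */
	bfi	x0, x1, #48, #16	/* TTBR0_EL1.ASID = new ASID */
	msr	ttbr0_el1, x0		/* new table base + new ASID in one write */
	isb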

I believe that in this case, you only need to clean the MMU-off code to
the PoC (using DC CVAC), before temporarily turning the MMU off.
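
[For reference, a minimal sketch of that clean-to-PoC step; the
mmu_off_start/mmu_off_end symbols and the fixed 64-byte stride are
assumptions (real code would read the cache line size from CTR_EL0).]

	adr	x0, mmu_off_start	/* hypothetical start of the MMU-off code */
	adr	x1, mmu_off_end		/* hypothetical end */
	bic	x0, x0, #63		/* align down to the assumed 64-byte line */
1:	dc	cvac, x0		/* clean this line by VA to the PoC */
	add	x0, x0, #64
	cmp	x0, x1
	b.lo	1b
	dsb	sy			/* complete the cleans before SCTLR changes */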

Thanks,
Mark.

[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2016-January/401434.html

>  arch/arm/cpu/armv8/cache.S | 44 +++++++++-----------------------------------
>  1 file changed, 9 insertions(+), 35 deletions(-)
> 
> diff --git a/arch/arm/cpu/armv8/cache.S b/arch/arm/cpu/armv8/cache.S
> index f1deaa7..63fb112 100644
> --- a/arch/arm/cpu/armv8/cache.S
> +++ b/arch/arm/cpu/armv8/cache.S
> @@ -171,35 +171,10 @@ ENDPROC(__asm_invalidate_l3_icache)
>  /*
>   * void __asm_switch_ttbr(ulong new_ttbr)
>   *
> - * Safely switches to a new page table.
> + * Switches to a new page table. Cache coherency must be maintained
> + * before calling this function.
>   */
>  ENTRY(__asm_switch_ttbr)
> -	/* x2 = SCTLR (alive throghout the function) */
> -	switch_el x4, 3f, 2f, 1f
> -3:	mrs	x2, sctlr_el3
> -	b	0f
> -2:	mrs	x2, sctlr_el2
> -	b	0f
> -1:	mrs	x2, sctlr_el1
> -0:
> -
> -	/* Unset CR_M | CR_C | CR_I from SCTLR to disable all caches */
> -	movn	x1, #(CR_M | CR_C | CR_I)
> -	and	x1, x2, x1
> -	switch_el x4, 3f, 2f, 1f
> -3:	msr	sctlr_el3, x1
> -	b	0f
> -2:	msr	sctlr_el2, x1
> -	b	0f
> -1:	msr	sctlr_el1, x1
> -0:	isb
> -
> -	/* This call only clobbers x30 (lr) and x9 (unused) */
> -	mov	x3, x30
> -	bl	__asm_invalidate_tlb_all
> -
> -	/* From here on we're running safely with caches disabled */
> -
>  	/* Set TTBR to our first argument */
>  	switch_el x4, 3f, 2f, 1f
>  3:	msr	ttbr0_el3, x0
> @@ -209,14 +184,13 @@ ENTRY(__asm_switch_ttbr)
>  1:	msr	ttbr0_el1, x0
>  0:	isb
>  
> -	/* Restore original SCTLR and thus enable caches again */
> -	switch_el x4, 3f, 2f, 1f
> -3:	msr	sctlr_el3, x2
> -	b	0f
> -2:	msr	sctlr_el2, x2
> -	b	0f
> -1:	msr	sctlr_el1, x2
> -0:	isb
> +	/* invalidate i-cache */
> +	ic      ialluis
> +	isb     sy
> +
> +	/* This call only clobbers x30 (lr) and x9 (unused) */
> +	mov	x3, x30
> +	bl	__asm_invalidate_tlb_all
>  
>  	ret	x3
>  ENDPROC(__asm_switch_ttbr)
> -- 
> 2.7.4
> 
> _______________________________________________
> U-Boot mailing list
> U-Boot@lists.denx.de
> http://lists.denx.de/mailman/listinfo/u-boot
York Sun Feb. 10, 2017, 5:39 p.m. UTC | #3
On 02/10/2017 09:32 AM, Mark Rutland wrote:
> Hi,
>
> On Fri, Feb 03, 2017 at 02:22:48PM -0800, York Sun wrote:
>> We don't have to completely turn off the MMU and caches to switch to
>> another MMU table, as long as the data is coherent before and after
>> the switch. This patch relaxes the procedure.
>>
>> Signed-off-by: York Sun <york.sun@nxp.com>
>> CC: Alexander Graf <agraf@suse.de>
>> ---
>> I found this issue while trying to change the MMU table for an SPL boot.
>> The RAM version of U-Boot was copied to memory while the d-cache was enabled.
>> The SPL code never flushed the cache (I will send another patch to fix that).
>> With the existing code, U-Boot stops running as soon as the MMU is disabled.
>> With the proposed change below, U-Boot continues to run. I have been in
>> contact with ARM support and got very useful information. However, this
>> TTBR-switching method is "behaviour that should work" but has not been well
>> verified by ARM. During my debugging, I found other minor issues which
>> convinced me this code had not been exercised. I don't intend to use this
>> method in the long term, but figured that fixing it may help others.
>
> I believe the approach taken in this patch is unsafe.
>
> Even if the resulting translations are identical across the old and new
> tables, it is not valid to transition from one set of tables to another
> unless:
>
> (a) the mappings in both cases are non-global, and the ASID is switched,
>     or:
>
> (b) A Break-Before-Make strategy is followed to ensure that the TLBs
>     contain no stale entries for the old tables.
>
> Even if the resulting translations are identical, you can be subject to
> TLB conflict aborts and/or behaviours resulting from the amalgamation of
> TLB entries. In Linux, we had to write special code [1] to switch tables
> by using some trampoline code in the other TTBR.
>
> I believe that in this case, you only need to clean the MMU-off code to
> the PoC (using DC CVAC), before temporarily turning the MMU off.
>
> Thanks,
> Mark.
>
> [1] http://lists.infradead.org/pipermail/linux-arm-kernel/2016-January/401434.html
>

Thanks, Mark. Let's keep the current code. I will explore other ways to 
fix my issue.

York

Patch

diff --git a/arch/arm/cpu/armv8/cache.S b/arch/arm/cpu/armv8/cache.S
index f1deaa7..63fb112 100644
--- a/arch/arm/cpu/armv8/cache.S
+++ b/arch/arm/cpu/armv8/cache.S
@@ -171,35 +171,10 @@  ENDPROC(__asm_invalidate_l3_icache)
 /*
  * void __asm_switch_ttbr(ulong new_ttbr)
  *
- * Safely switches to a new page table.
+ * Switches to a new page table. Cache coherency must be maintained
+ * before calling this function.
  */
 ENTRY(__asm_switch_ttbr)
-	/* x2 = SCTLR (alive throghout the function) */
-	switch_el x4, 3f, 2f, 1f
-3:	mrs	x2, sctlr_el3
-	b	0f
-2:	mrs	x2, sctlr_el2
-	b	0f
-1:	mrs	x2, sctlr_el1
-0:
-
-	/* Unset CR_M | CR_C | CR_I from SCTLR to disable all caches */
-	movn	x1, #(CR_M | CR_C | CR_I)
-	and	x1, x2, x1
-	switch_el x4, 3f, 2f, 1f
-3:	msr	sctlr_el3, x1
-	b	0f
-2:	msr	sctlr_el2, x1
-	b	0f
-1:	msr	sctlr_el1, x1
-0:	isb
-
-	/* This call only clobbers x30 (lr) and x9 (unused) */
-	mov	x3, x30
-	bl	__asm_invalidate_tlb_all
-
-	/* From here on we're running safely with caches disabled */
-
 	/* Set TTBR to our first argument */
 	switch_el x4, 3f, 2f, 1f
 3:	msr	ttbr0_el3, x0
@@ -209,14 +184,13 @@  ENTRY(__asm_switch_ttbr)
 1:	msr	ttbr0_el1, x0
 0:	isb
 
-	/* Restore original SCTLR and thus enable caches again */
-	switch_el x4, 3f, 2f, 1f
-3:	msr	sctlr_el3, x2
-	b	0f
-2:	msr	sctlr_el2, x2
-	b	0f
-1:	msr	sctlr_el1, x2
-0:	isb
+	/* invalidate i-cache */
+	ic      ialluis
+	isb     sy
+
+	/* This call only clobbers x30 (lr) and x9 (unused) */
+	mov	x3, x30
+	bl	__asm_invalidate_tlb_all
 
 	ret	x3
 ENDPROC(__asm_switch_ttbr)