diff mbox series

[U-Boot,v2,1/2] arm: make arm926ejs startup code thumb compatible

Message ID 20180420195144.32900-2-klaus.goger@theobroma-systems.com
State Superseded
Delegated to: Tom Rini
Headers show
Series While trying to compile u-boot in thumb due space constraints on a mxs | expand

Commit Message

Klaus Goger April 20, 2018, 7:51 p.m. UTC
When building the mxs platform in thumb mode gcc generates code using
the intra procedure call scratch register (ip/r12) for the calling the
lowlevel_init function. This modifies the lr in flush_dcache which
causes u-boot proper to end in an endless loop.

40002334:       e1a0c00e        mov     ip, lr
40002338:       eb00df4c        bl      4003a070
<__lowlevel_init_from_arm>
4000233c:       e1a0e00c        mov     lr, ip
40002340:       e1a0f00e        mov     pc, lr
[...]
4003a070 <__lowlevel_init_from_arm>:
4003a070:       e59fc004        ldr     ip, [pc, #4]    ; 4003a07c
<__lowlevel_init_from_arm+0xc>
4003a074:       e08fc00c        add     ip, pc, ip
4003a078:       e12fff1c        bx      ip
4003a07c:       fffc86cd        .word   0xfffc86cd

Instead of using the the ip/r12 register we use sl/r10 to preserve the
link register.

According to "Procedure Call Standard for the ARM Architecture" by ARM
subroutines have to preserve the contents of register r4-r8, r10, r11
and SP. So using r10 instead of r12 should be save.

Signed-off-by: Klaus Goger <klaus.goger@theobroma-systems.com>
Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com>

---

Changes in v2:
- use bl instead of blx to call lowlevel_init
- remove mxs tag as it apply to all arm926ejs platforms

 arch/arm/cpu/arm926ejs/start.S | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Comments

Marek Vasut April 20, 2018, 9:55 p.m. UTC | #1
On 04/20/2018 09:51 PM, Klaus Goger wrote:
> When building the mxs platform in thumb mode gcc generates code using
> the intra procedure call scratch register (ip/r12) for the calling the
> lowlevel_init function. This modifies the lr in flush_dcache which
> causes u-boot proper to end in an endless loop.
> 
> 40002334:       e1a0c00e        mov     ip, lr
> 40002338:       eb00df4c        bl      4003a070
> <__lowlevel_init_from_arm>
> 4000233c:       e1a0e00c        mov     lr, ip
> 40002340:       e1a0f00e        mov     pc, lr
> [...]
> 4003a070 <__lowlevel_init_from_arm>:
> 4003a070:       e59fc004        ldr     ip, [pc, #4]    ; 4003a07c
> <__lowlevel_init_from_arm+0xc>
> 4003a074:       e08fc00c        add     ip, pc, ip
> 4003a078:       e12fff1c        bx      ip
> 4003a07c:       fffc86cd        .word   0xfffc86cd
> 
> Instead of using the the ip/r12 register we use sl/r10 to preserve the
> link register.
> 
> According to "Procedure Call Standard for the ARM Architecture" by ARM
> subroutines have to preserve the contents of register r4-r8, r10, r11
> and SP. So using r10 instead of r12 should be save.
> 
> Signed-off-by: Klaus Goger <klaus.goger@theobroma-systems.com>
> Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com>
> 
> ---
> 
> Changes in v2:
> - use bl instead of blx to call lowlevel_init
> - remove mxs tag as it apply to all arm926ejs platforms

What is lowlevel_init() is implemented as a C code and therefore thumb ?
:-) doesn't ie. armv7a handle this somehow already ?

>  arch/arm/cpu/arm926ejs/start.S | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm/cpu/arm926ejs/start.S b/arch/arm/cpu/arm926ejs/start.S
> index 959d1ed86d..317df5c401 100644
> --- a/arch/arm/cpu/arm926ejs/start.S
> +++ b/arch/arm/cpu/arm926ejs/start.S
> @@ -105,9 +105,9 @@ flush_dcache:
>  	/*
>  	 * Go setup Memory and board specific bits prior to relocation.
>  	 */
> -	mov	ip, lr		/* perserve link reg across call */
> +	mov	sl, lr		/* perserve link reg across call */
>  	bl	lowlevel_init	/* go setup pll,mux,memory */
> -	mov	lr, ip		/* restore link */
> +	mov	lr, sl		/* restore link */
>  #endif
> -	mov	pc, lr		/* back to my caller */
> +	bx	lr		/* back to my caller */
>  #endif /* CONFIG_SKIP_LOWLEVEL_INIT */
>
Klaus Goger April 20, 2018, 10:11 p.m. UTC | #2
> On 20.04.2018, at 23:55, Marek Vasut <marex@denx.de> wrote:
> 
> On 04/20/2018 09:51 PM, Klaus Goger wrote:
>> When building the mxs platform in thumb mode gcc generates code using
>> the intra procedure call scratch register (ip/r12) for the calling the
>> lowlevel_init function. This modifies the lr in flush_dcache which
>> causes u-boot proper to end in an endless loop.
>> 
>> 40002334:       e1a0c00e        mov     ip, lr
>> 40002338:       eb00df4c        bl      4003a070
>> <__lowlevel_init_from_arm>
>> 4000233c:       e1a0e00c        mov     lr, ip
>> 40002340:       e1a0f00e        mov     pc, lr
>> [...]
>> 4003a070 <__lowlevel_init_from_arm>:
>> 4003a070:       e59fc004        ldr     ip, [pc, #4]    ; 4003a07c
>> <__lowlevel_init_from_arm+0xc>
>> 4003a074:       e08fc00c        add     ip, pc, ip
>> 4003a078:       e12fff1c        bx      ip
>> 4003a07c:       fffc86cd        .word   0xfffc86cd
>> 
>> Instead of using the the ip/r12 register we use sl/r10 to preserve the
>> link register.
>> 
>> According to "Procedure Call Standard for the ARM Architecture" by ARM
>> subroutines have to preserve the contents of register r4-r8, r10, r11
>> and SP. So using r10 instead of r12 should be save.
>> 
>> Signed-off-by: Klaus Goger <klaus.goger@theobroma-systems.com>
>> Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com>
>> 
>> ---
>> 
>> Changes in v2:
>> - use bl instead of blx to call lowlevel_init
>> - remove mxs tag as it apply to all arm926ejs platforms
> 
> What is lowlevel_init() is implemented as a C code and therefore thumb ?
> :-) doesn't ie. armv7a handle this somehow already ?
 
lowlevel_init() is implemented in C (you wrote it :)) and therefore thumb if
compiled as thumb.
armv7 lowlevel_init is implemented in assembly.

As the compiler will generate the necessary interwork code there was no need
to change the branch instruction to blx. So I changed it back to bl as the original
code as this will work with both generated codes (arm and thumb) and also
if someone decides to reimplement lowlevel_init as arm assembly.

> 
>> arch/arm/cpu/arm926ejs/start.S | 6 +++---
>> 1 file changed, 3 insertions(+), 3 deletions(-)
>> 
>> diff --git a/arch/arm/cpu/arm926ejs/start.S b/arch/arm/cpu/arm926ejs/start.S
>> index 959d1ed86d..317df5c401 100644
>> --- a/arch/arm/cpu/arm926ejs/start.S
>> +++ b/arch/arm/cpu/arm926ejs/start.S
>> @@ -105,9 +105,9 @@ flush_dcache:
>> 	/*
>> 	 * Go setup Memory and board specific bits prior to relocation.
>> 	 */
>> -	mov	ip, lr		/* perserve link reg across call */
>> +	mov	sl, lr		/* perserve link reg across call */
>> 	bl	lowlevel_init	/* go setup pll,mux,memory */
>> -	mov	lr, ip		/* restore link */
>> +	mov	lr, sl		/* restore link */
>> #endif
>> -	mov	pc, lr		/* back to my caller */
>> +	bx	lr		/* back to my caller */
>> #endif /* CONFIG_SKIP_LOWLEVEL_INIT */
>> 
> 
> 
> -- 
> Best regards,
> Marek Vasut
Måns Rullgård April 21, 2018, 1:10 p.m. UTC | #3
Klaus Goger <klaus.goger@theobroma-systems.com> writes:

> When building the mxs platform in thumb mode gcc generates code using
> the intra procedure call scratch register (ip/r12) for the calling the
> lowlevel_init function. This modifies the lr in flush_dcache which
> causes u-boot proper to end in an endless loop.
>
> 40002334:       e1a0c00e        mov     ip, lr
> 40002338:       eb00df4c        bl      4003a070
> <__lowlevel_init_from_arm>
> 4000233c:       e1a0e00c        mov     lr, ip
> 40002340:       e1a0f00e        mov     pc, lr
> [...]
> 4003a070 <__lowlevel_init_from_arm>:
> 4003a070:       e59fc004        ldr     ip, [pc, #4]    ; 4003a07c
> <__lowlevel_init_from_arm+0xc>
> 4003a074:       e08fc00c        add     ip, pc, ip
> 4003a078:       e12fff1c        bx      ip
> 4003a07c:       fffc86cd        .word   0xfffc86cd
>
> Instead of using the the ip/r12 register we use sl/r10 to preserve the
> link register.
>
> According to "Procedure Call Standard for the ARM Architecture" by ARM
> subroutines have to preserve the contents of register r4-r8, r10, r11
> and SP. So using r10 instead of r12 should be save.
>
> Signed-off-by: Klaus Goger <klaus.goger@theobroma-systems.com>
> Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com>

This problem isn't specific to Thumb mode.  An ARM build would also
break if the lowlevel_init function happened to clobber r12, which it is
permitted to do.  It's just dumb luck that this hasn't happened yet.

> ---
>
> Changes in v2:
> - use bl instead of blx to call lowlevel_init
> - remove mxs tag as it apply to all arm926ejs platforms
>
>  arch/arm/cpu/arm926ejs/start.S | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm/cpu/arm926ejs/start.S b/arch/arm/cpu/arm926ejs/start.S
> index 959d1ed86d..317df5c401 100644
> --- a/arch/arm/cpu/arm926ejs/start.S
> +++ b/arch/arm/cpu/arm926ejs/start.S
> @@ -105,9 +105,9 @@ flush_dcache:
>  	/*
>  	 * Go setup Memory and board specific bits prior to relocation.
>  	 */
> -	mov	ip, lr		/* perserve link reg across call */
> +	mov	sl, lr		/* perserve link reg across call */
>  	bl	lowlevel_init	/* go setup pll,mux,memory */
> -	mov	lr, ip		/* restore link */
> +	mov	lr, sl		/* restore link */

I prefer to use plain register names (r10) rather than the aliases (sl)
when not using them for the special functions indicated by the latter.

>  #endif
> -	mov	pc, lr		/* back to my caller */
> +	bx	lr		/* back to my caller */

This change seems unrelated.  Yes, bx is the preferred instruction, but
using mov here isn't breaking anything.  If it bothers you, feel free to
make a separate patch fixing all the instances of mov to the pc
register, not just this one.
Christoph Muellner April 21, 2018, 4:45 p.m. UTC | #4
> On 21 Apr 2018, at 15:10, Måns Rullgård <mans@mansr.com> wrote:
> 
> Klaus Goger <klaus.goger@theobroma-systems.com <mailto:klaus.goger@theobroma-systems.com>> writes:
> 
>> When building the mxs platform in thumb mode gcc generates code using
>> the intra procedure call scratch register (ip/r12) for the calling the
>> lowlevel_init function. This modifies the lr in flush_dcache which
>> causes u-boot proper to end in an endless loop.
>> 
>> 40002334:       e1a0c00e        mov     ip, lr
>> 40002338:       eb00df4c        bl      4003a070
>> <__lowlevel_init_from_arm>
>> 4000233c:       e1a0e00c        mov     lr, ip
>> 40002340:       e1a0f00e        mov     pc, lr
>> [...]
>> 4003a070 <__lowlevel_init_from_arm>:
>> 4003a070:       e59fc004        ldr     ip, [pc, #4]    ; 4003a07c
>> <__lowlevel_init_from_arm+0xc>
>> 4003a074:       e08fc00c        add     ip, pc, ip
>> 4003a078:       e12fff1c        bx      ip
>> 4003a07c:       fffc86cd        .word   0xfffc86cd
>> 
>> Instead of using the the ip/r12 register we use sl/r10 to preserve the
>> link register.
>> 
>> According to "Procedure Call Standard for the ARM Architecture" by ARM
>> subroutines have to preserve the contents of register r4-r8, r10, r11
>> and SP. So using r10 instead of r12 should be save.
>> 
>> Signed-off-by: Klaus Goger <klaus.goger@theobroma-systems.com>
>> Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com>
> 
> This problem isn't specific to Thumb mode.  An ARM build would also
> break if the lowlevel_init function happened to clobber r12, which it is
> permitted to do.  It's just dumb luck that this hasn't happened yet.

True. Since we are in low-level code and are not called by any C-functions
(we are startup code), using sl/r10 is ok.

> 
>> ---
>> 
>> Changes in v2:
>> - use bl instead of blx to call lowlevel_init
>> - remove mxs tag as it apply to all arm926ejs platforms
>> 
>> arch/arm/cpu/arm926ejs/start.S | 6 +++---
>> 1 file changed, 3 insertions(+), 3 deletions(-)
>> 
>> diff --git a/arch/arm/cpu/arm926ejs/start.S b/arch/arm/cpu/arm926ejs/start.S
>> index 959d1ed86d..317df5c401 100644
>> --- a/arch/arm/cpu/arm926ejs/start.S
>> +++ b/arch/arm/cpu/arm926ejs/start.S
>> @@ -105,9 +105,9 @@ flush_dcache:
>> 	/*
>> 	 * Go setup Memory and board specific bits prior to relocation.
>> 	 */
>> -	mov	ip, lr		/* perserve link reg across call */
>> +	mov	sl, lr		/* perserve link reg across call */
>> 	bl	lowlevel_init	/* go setup pll,mux,memory */
>> -	mov	lr, ip		/* restore link */
>> +	mov	lr, sl		/* restore link */
> 
> I prefer to use plain register names (r10) rather than the aliases (sl)
> when not using them for the special functions indicated by the latter.

Ok, will change that.

> 
>> #endif
>> -	mov	pc, lr		/* back to my caller */
>> +	bx	lr		/* back to my caller */
> 
> This change seems unrelated.  Yes, bx is the preferred instruction, but
> using mov here isn't breaking anything.  If it bothers you, feel free to
> make a separate patch fixing all the instances of mov to the pc
> register, not just this one.

We’ll remove it from the patch.

Thanks,
Christoph
diff mbox series

Patch

diff --git a/arch/arm/cpu/arm926ejs/start.S b/arch/arm/cpu/arm926ejs/start.S
index 959d1ed86d..317df5c401 100644
--- a/arch/arm/cpu/arm926ejs/start.S
+++ b/arch/arm/cpu/arm926ejs/start.S
@@ -105,9 +105,9 @@  flush_dcache:
 	/*
 	 * Go setup Memory and board specific bits prior to relocation.
 	 */
-	mov	ip, lr		/* perserve link reg across call */
+	mov	sl, lr		/* perserve link reg across call */
 	bl	lowlevel_init	/* go setup pll,mux,memory */
-	mov	lr, ip		/* restore link */
+	mov	lr, sl		/* restore link */
 #endif
-	mov	pc, lr		/* back to my caller */
+	bx	lr		/* back to my caller */
 #endif /* CONFIG_SKIP_LOWLEVEL_INIT */