Patchwork [U-Boot,01/10] arm: Compile cache_disable() with -O2 to avoid failure

login
register
mail settings
Submitter Simon Glass
Date Nov. 1, 2012, 11:42 p.m.
Message ID <1351813330-23741-1-git-send-email-sjg@chromium.org>
Download mbox | patch
Permalink /patch/196414/
State Rejected, archived
Delegated to: Albert ARIBAUD
Headers show

Comments

Simon Glass - Nov. 1, 2012, 11:42 p.m.
It is good to have these functions written in C instead of assembler,
but with -O0 the cache_disable() function doesn't return. Rather than
revert to assembler, this fix just forces this to be built with -O2.

Signed-off-by: Simon Glass <sjg@chromium.org>
---
 arch/arm/lib/cache-cp15.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)
Marek Vasut - Nov. 3, 2012, 8:32 a.m.
Dear Simon Glass,

> It is good to have these functions written in C instead of assembler,
> but with -O0 the cache_disable() function doesn't return. Rather than
> revert to assembler, this fix just forces this to be built with -O2.
> 
> Signed-off-by: Simon Glass <sjg@chromium.org>
> ---
>  arch/arm/lib/cache-cp15.c |   10 ++++++++++
>  1 files changed, 10 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/arm/lib/cache-cp15.c b/arch/arm/lib/cache-cp15.c
> index 939de10..8f8385d 100644
> --- a/arch/arm/lib/cache-cp15.c
> +++ b/arch/arm/lib/cache-cp15.c
> @@ -110,6 +110,16 @@ static void cache_enable(uint32_t cache_bit)
>  	set_cr(reg | cache_bit);
>  }
> 
> +/*
> + * Big hack warning!
> + *
> + * Devs like to compile with -O0 to get a nice debugging illusion. But
> this + * function does not survive that since -O0 causes the compiler to
> read the + * PC back from the stack after the dcache flush. Might it be
> possible to fix + * this by flushing the write buffer?
> + */
> +static void cache_disable(uint32_t cache_bit) __attribute__
> ((optimize(2))); +
>  /* cache_bit must be either CR_I or CR_C */
>  static void cache_disable(uint32_t cache_bit)
>  {

Uh, are you sure this is right ?

Best regards,
Marek Vasut
Wolfgang Denk - Nov. 3, 2012, 12:29 p.m.
Dear Simon Glass,

In message <1351813330-23741-1-git-send-email-sjg@chromium.org> you wrote:
> It is good to have these functions written in C instead of assembler,
> but with -O0 the cache_disable() function doesn't return. Rather than
> revert to assembler, this fix just forces this to be built with -O2.

NAK.

This is vodoo programming to fix a problem which is obviously not
correctly understood (and fixed), so the real cause remains unfixed.

> +/*
> + * Big hack warning!
> + *
> + * Devs like to compile with -O0 to get a nice debugging illusion. But this

We don't use -O0 normally, and actually there are more places in the
code that are likely to cause problems or to actually break when
using -O0.

> + * function does not survive that since -O0 causes the compiler to read the
> + * PC back from the stack after the dcache flush. Might it be possible to fix
> + * this by flushing the write buffer?
> + */

"compiler to read the PC back from the stack after the dcache flush" -
can you please explain what exactly this means, and which exact
problem it causes?

> +static void cache_disable(uint32_t cache_bit) __attribute__ ((optimize(2)));

Sorry, I will not accept this.

Best regards,

Wolfgang Denk
Simon Glass - Nov. 7, 2012, 12:45 a.m.
Hi Wolfgang,

On Sat, Nov 3, 2012 at 5:29 AM, Wolfgang Denk <wd@denx.de> wrote:
> Dear Simon Glass,
>
> In message <1351813330-23741-1-git-send-email-sjg@chromium.org> you wrote:
>> It is good to have these functions written in C instead of assembler,
>> but with -O0 the cache_disable() function doesn't return. Rather than
>> revert to assembler, this fix just forces this to be built with -O2.
>
> NAK.
>
> This is vodoo programming to fix a problem which is obviously not
> correctly understood (and fixed), so the real cause remains unfixed.
>
>> +/*
>> + * Big hack warning!
>> + *
>> + * Devs like to compile with -O0 to get a nice debugging illusion. But this
>
> We don't use -O0 normally, and actually there are more places in the
> code that are likely to cause problems or to actually break when
> using -O0.
>
>> + * function does not survive that since -O0 causes the compiler to read the
>> + * PC back from the stack after the dcache flush. Might it be possible to fix
>> + * this by flushing the write buffer?
>> + */
>
> "compiler to read the PC back from the stack after the dcache flush" -
> can you please explain what exactly this means, and which exact
> problem it causes?

This is the code without the patch (armv7) using -O0:

00000248 <cache_disable>:
 248:	e92d4810 	push	{r4, fp, lr}
 24c:	e28db008 	add	fp, sp, #8
 250:	e24dd01c 	sub	sp, sp, #28
 254:	e50b0020 	str	r0, [fp, #-32]
 258:	ee114f10 	mrc	15, 0, r4, cr1, cr0, {0}
 25c:	e50b4014 	str	r4, [fp, #-20]
 260:	e51b3014 	ldr	r3, [fp, #-20]
 264:	e50b3010 	str	r3, [fp, #-16]
 268:	ebffff69 	bl	14 <cp_delay>
 26c:	e51b3020 	ldr	r3, [fp, #-32]
 270:	e3530004 	cmp	r3, #4
 274:	1a000007 	bne	298 <cache_disable+0x50>
 278:	e51b3010 	ldr	r3, [fp, #-16]
 27c:	e2033004 	and	r3, r3, #4
 280:	e3530000 	cmp	r3, #0
 284:	0a00000b 	beq	2b8 <cache_disable+0x70>
 288:	e51b3020 	ldr	r3, [fp, #-32]
 28c:	e3833001 	orr	r3, r3, #1
 290:	e50b3020 	str	r3, [fp, #-32]
 294:	ebfffffe 	bl	0 <flush_dcache_all>
 298:	e51b3020 	ldr	r3, [fp, #-32]
                 ^^^^^^^^^^^^^

 29c:	e1e02003 	mvn	r2, r3
 2a0:	e51b3010 	ldr	r3, [fp, #-16]
 2a4:	e0023003 	and	r3, r2, r3
 2a8:	e50b3018 	str	r3, [fp, #-24]
 2ac:	e51b3018 	ldr	r3, [fp, #-24]
 2b0:	ee013f10 	mcr	15, 0, r3, cr1, cr0, {0}
 2b4:	ea000000 	b	2bc <cache_disable+0x74>
 2b8:	e1a00000 	nop			; (mov r0, r0)
 2bc:	e24bd008 	sub	sp, fp, #8
 2c0:	e8bd8810 	pop	{r4, fp, pc}

this is the code with the patch:

00000124 <dcache_disable>:
 124:	e92d4010 	push	{r4, lr}
 128:	ebffffb4 	bl	0 <cp_delay>
 12c:	ee114f10 	mrc	15, 0, r4, cr1, cr0, {0}
 130:	e3140004 	tst	r4, #4
 134:	08bd8010 	popeq	{r4, pc}
 138:	ebfffffe 	bl	0 <flush_dcache_all>
 13c:	e3c44005 	bic	r4, r4, #5
 140:	ee014f10 	mcr	15, 0, r4, cr1, cr0, {0}
 144:	e8bd8010 	pop	{r4, pc}

I have marked with ^^^^^^^^^^ the line that I think it dies. I did not
spent a lot of time looking at it - just found the problem with an ICE
and then tried to fix it. I suspect it can be fixed with some sort of
dsb() in flush_dcache_all(), but I am not sure.

Unfortunately things have moved on and I can't easily test this (now
using Thumb2 where -O0 isn't quite so extreme).

>
>> +static void cache_disable(uint32_t cache_bit) __attribute__ ((optimize(2)));
>
> Sorry, I will not accept this.

OK. If I hit it again next time I will try a bit harder to root cause it.

Regards,
Simon

>
> Best regards,
>
> Wolfgang Denk
>
> --
> DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
> HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
> Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de
> backups: always in season, never out of style.
Wolfgang Denk - Nov. 7, 2012, 12:55 p.m.
Dear Simon,

In message <CAPnjgZ38TZr2ztdq5D8fNfgCz9a+-vGJFv0saezJWiQBHh-FwA@mail.gmail.com> you wrote:
> 
> > "compiler to read the PC back from the stack after the dcache flush" -
> > can you please explain what exactly this means, and which exact
> > problem it causes?
> 
> This is the code without the patch (armv7) using -O0:
> 
> 00000248 <cache_disable>:
>  248:	e92d4810 	push	{r4, fp, lr}
>  24c:	e28db008 	add	fp, sp, #8
>  250:	e24dd01c 	sub	sp, sp, #28
>  254:	e50b0020 	str	r0, [fp, #-32]
>  258:	ee114f10 	mrc	15, 0, r4, cr1, cr0, {0}
>  25c:	e50b4014 	str	r4, [fp, #-20]
>  260:	e51b3014 	ldr	r3, [fp, #-20]
>  264:	e50b3010 	str	r3, [fp, #-16]
>  268:	ebffff69 	bl	14 <cp_delay>
>  26c:	e51b3020 	ldr	r3, [fp, #-32]
>  270:	e3530004 	cmp	r3, #4
>  274:	1a000007 	bne	298 <cache_disable+0x50>
>  278:	e51b3010 	ldr	r3, [fp, #-16]
>  27c:	e2033004 	and	r3, r3, #4
>  280:	e3530000 	cmp	r3, #0
>  284:	0a00000b 	beq	2b8 <cache_disable+0x70>
>  288:	e51b3020 	ldr	r3, [fp, #-32]
>  28c:	e3833001 	orr	r3, r3, #1
>  290:	e50b3020 	str	r3, [fp, #-32]
>  294:	ebfffffe 	bl	0 <flush_dcache_all>
>  298:	e51b3020 	ldr	r3, [fp, #-32]
>                  ^^^^^^^^^^^^^
> 
>  29c:	e1e02003 	mvn	r2, r3
>  2a0:	e51b3010 	ldr	r3, [fp, #-16]
>  2a4:	e0023003 	and	r3, r2, r3
>  2a8:	e50b3018 	str	r3, [fp, #-24]
>  2ac:	e51b3018 	ldr	r3, [fp, #-24]
>  2b0:	ee013f10 	mcr	15, 0, r3, cr1, cr0, {0}
>  2b4:	ea000000 	b	2bc <cache_disable+0x74>
>  2b8:	e1a00000 	nop			; (mov r0, r0)
>  2bc:	e24bd008 	sub	sp, fp, #8
>  2c0:	e8bd8810 	pop	{r4, fp, pc}
> 
> this is the code with the patch:
> 
> 00000124 <dcache_disable>:
>  124:	e92d4010 	push	{r4, lr}
>  128:	ebffffb4 	bl	0 <cp_delay>
>  12c:	ee114f10 	mrc	15, 0, r4, cr1, cr0, {0}
>  130:	e3140004 	tst	r4, #4
>  134:	08bd8010 	popeq	{r4, pc}
>  138:	ebfffffe 	bl	0 <flush_dcache_all>
>  13c:	e3c44005 	bic	r4, r4, #5
>  140:	ee014f10 	mcr	15, 0, r4, cr1, cr0, {0}
>  144:	e8bd8010 	pop	{r4, pc}
> 
> I have marked with ^^^^^^^^^^ the line that I think it dies. I did not
> spent a lot of time looking at it - just found the problem with an ICE
> and then tried to fix it. I suspect it can be fixed with some sort of
> dsb() in flush_dcache_all(), but I am not sure.

The code is actually pretty much different - note that the version
above contains calls to cache_disable() which are not visible in the
code below.

But then - the insn you mark is restoring r3 from the stack, after iut
was stored there right before calling flush_dcache_all().  Why should
this cause problems?  And in which way is this "read[ing] the PC back
from the stack" ?  r3 is not the PC, right?


> OK. If I hit it again next time I will try a bit harder to root cause it.

Thanks, and sorry for insisting on a real explanation for obscure
problems.

Best regards,

Wolfgang Denk
Simon Glass - Nov. 7, 2012, 3:42 p.m.
Hi Wolfgang,

On Wed, Nov 7, 2012 at 4:55 AM, Wolfgang Denk <wd@denx.de> wrote:
> Dear Simon,
>
> In message <CAPnjgZ38TZr2ztdq5D8fNfgCz9a+-vGJFv0saezJWiQBHh-FwA@mail.gmail.com> you wrote:
>>
>> > "compiler to read the PC back from the stack after the dcache flush" -
>> > can you please explain what exactly this means, and which exact
>> > problem it causes?
>>
>> This is the code without the patch (armv7) using -O0:
>>
>> 00000248 <cache_disable>:
>>  248: e92d4810        push    {r4, fp, lr}
>>  24c: e28db008        add     fp, sp, #8
>>  250: e24dd01c        sub     sp, sp, #28
>>  254: e50b0020        str     r0, [fp, #-32]
>>  258: ee114f10        mrc     15, 0, r4, cr1, cr0, {0}
>>  25c: e50b4014        str     r4, [fp, #-20]
>>  260: e51b3014        ldr     r3, [fp, #-20]
>>  264: e50b3010        str     r3, [fp, #-16]
>>  268: ebffff69        bl      14 <cp_delay>
>>  26c: e51b3020        ldr     r3, [fp, #-32]
>>  270: e3530004        cmp     r3, #4
>>  274: 1a000007        bne     298 <cache_disable+0x50>
>>  278: e51b3010        ldr     r3, [fp, #-16]
>>  27c: e2033004        and     r3, r3, #4
>>  280: e3530000        cmp     r3, #0
>>  284: 0a00000b        beq     2b8 <cache_disable+0x70>
>>  288: e51b3020        ldr     r3, [fp, #-32]
>>  28c: e3833001        orr     r3, r3, #1
>>  290: e50b3020        str     r3, [fp, #-32]
>>  294: ebfffffe        bl      0 <flush_dcache_all>
>>  298: e51b3020        ldr     r3, [fp, #-32]
>>                  ^^^^^^^^^^^^^
>>
>>  29c: e1e02003        mvn     r2, r3
>>  2a0: e51b3010        ldr     r3, [fp, #-16]
>>  2a4: e0023003        and     r3, r2, r3
>>  2a8: e50b3018        str     r3, [fp, #-24]
>>  2ac: e51b3018        ldr     r3, [fp, #-24]
>>  2b0: ee013f10        mcr     15, 0, r3, cr1, cr0, {0}
>>  2b4: ea000000        b       2bc <cache_disable+0x74>
>>  2b8: e1a00000        nop                     ; (mov r0, r0)
>>  2bc: e24bd008        sub     sp, fp, #8
>>  2c0: e8bd8810        pop     {r4, fp, pc}
>>
>> this is the code with the patch:
>>
>> 00000124 <dcache_disable>:
>>  124: e92d4010        push    {r4, lr}
>>  128: ebffffb4        bl      0 <cp_delay>
>>  12c: ee114f10        mrc     15, 0, r4, cr1, cr0, {0}
>>  130: e3140004        tst     r4, #4
>>  134: 08bd8010        popeq   {r4, pc}
>>  138: ebfffffe        bl      0 <flush_dcache_all>
>>  13c: e3c44005        bic     r4, r4, #5
>>  140: ee014f10        mcr     15, 0, r4, cr1, cr0, {0}
>>  144: e8bd8010        pop     {r4, pc}
>>
>> I have marked with ^^^^^^^^^^ the line that I think it dies. I did not
>> spent a lot of time looking at it - just found the problem with an ICE
>> and then tried to fix it. I suspect it can be fixed with some sort of
>> dsb() in flush_dcache_all(), but I am not sure.
>
> The code is actually pretty much different - note that the version
> above contains calls to cache_disable() which are not visible in the
> code below.
>
> But then - the insn you mark is restoring r3 from the stack, after iut
> was stored there right before calling flush_dcache_all().  Why should
> this cause problems?  And in which way is this "read[ing] the PC back
> from the stack" ?  r3 is not the PC, right?

I wish I could easily repeat this, but I am not set up for it now, and
the code bases have moved on. I recall that the PC value read from the
stack (in one of the pop operations0 was wrong.

>
>
>> OK. If I hit it again next time I will try a bit harder to root cause it.
>
> Thanks, and sorry for insisting on a real explanation for obscure
> problems.

Well fair enough I think. Maybe someone else will hit it and this
thread will be useful for them to root cause. I find the need for -O0
is much reduced now that the tools are better.

Regards,
SImon

>
> Best regards,
>
> Wolfgang Denk
>
> --
> DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
> HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
> Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de
> Teenagers are people who express a burning desire to be different by
> dressing exactly alike.
> There are some strings. They're just not attached.

Patch

diff --git a/arch/arm/lib/cache-cp15.c b/arch/arm/lib/cache-cp15.c
index 939de10..8f8385d 100644
--- a/arch/arm/lib/cache-cp15.c
+++ b/arch/arm/lib/cache-cp15.c
@@ -110,6 +110,16 @@  static void cache_enable(uint32_t cache_bit)
 	set_cr(reg | cache_bit);
 }
 
+/*
+ * Big hack warning!
+ *
+ * Devs like to compile with -O0 to get a nice debugging illusion. But this
+ * function does not survive that since -O0 causes the compiler to read the
+ * PC back from the stack after the dcache flush. Might it be possible to fix
+ * this by flushing the write buffer?
+ */
+static void cache_disable(uint32_t cache_bit) __attribute__ ((optimize(2)));
+
 /* cache_bit must be either CR_I or CR_C */
 static void cache_disable(uint32_t cache_bit)
 {