Message ID | 1351813330-23741-1-git-send-email-sjg@chromium.org |
---|---|
State | Rejected, archived |
Delegated to: | Albert ARIBAUD |
Headers | show |
Dear Simon Glass, > It is good to have these functions written in C instead of assembler, > but with -O0 the cache_disable() function doesn't return. Rather than > revert to assembler, this fix just forces this to be built with -O2. > > Signed-off-by: Simon Glass <sjg@chromium.org> > --- > arch/arm/lib/cache-cp15.c | 10 ++++++++++ > 1 files changed, 10 insertions(+), 0 deletions(-) > > diff --git a/arch/arm/lib/cache-cp15.c b/arch/arm/lib/cache-cp15.c > index 939de10..8f8385d 100644 > --- a/arch/arm/lib/cache-cp15.c > +++ b/arch/arm/lib/cache-cp15.c > @@ -110,6 +110,16 @@ static void cache_enable(uint32_t cache_bit) > set_cr(reg | cache_bit); > } > > +/* > + * Big hack warning! > + * > + * Devs like to compile with -O0 to get a nice debugging illusion. But > this + * function does not survive that since -O0 causes the compiler to > read the + * PC back from the stack after the dcache flush. Might it be > possible to fix + * this by flushing the write buffer? > + */ > +static void cache_disable(uint32_t cache_bit) __attribute__ > ((optimize(2))); + > /* cache_bit must be either CR_I or CR_C */ > static void cache_disable(uint32_t cache_bit) > { Uh, are you sure this is right ? Best regards, Marek Vasut
Dear Simon Glass, In message <1351813330-23741-1-git-send-email-sjg@chromium.org> you wrote: > It is good to have these functions written in C instead of assembler, > but with -O0 the cache_disable() function doesn't return. Rather than > revert to assembler, this fix just forces this to be built with -O2. NAK. This is vodoo programming to fix a problem which is obviously not correctly understood (and fixed), so the real cause remains unfixed. > +/* > + * Big hack warning! > + * > + * Devs like to compile with -O0 to get a nice debugging illusion. But this We don't use -O0 normally, and actually there are more places in the code that are likely to cause problems or to actually break when using -O0. > + * function does not survive that since -O0 causes the compiler to read the > + * PC back from the stack after the dcache flush. Might it be possible to fix > + * this by flushing the write buffer? > + */ "compiler to read the PC back from the stack after the dcache flush" - can you please explain what exactly this means, and which exact problem it causes? > +static void cache_disable(uint32_t cache_bit) __attribute__ ((optimize(2))); Sorry, I will not accept this. Best regards, Wolfgang Denk
Hi Wolfgang, On Sat, Nov 3, 2012 at 5:29 AM, Wolfgang Denk <wd@denx.de> wrote: > Dear Simon Glass, > > In message <1351813330-23741-1-git-send-email-sjg@chromium.org> you wrote: >> It is good to have these functions written in C instead of assembler, >> but with -O0 the cache_disable() function doesn't return. Rather than >> revert to assembler, this fix just forces this to be built with -O2. > > NAK. > > This is vodoo programming to fix a problem which is obviously not > correctly understood (and fixed), so the real cause remains unfixed. > >> +/* >> + * Big hack warning! >> + * >> + * Devs like to compile with -O0 to get a nice debugging illusion. But this > > We don't use -O0 normally, and actually there are more places in the > code that are likely to cause problems or to actually break when > using -O0. > >> + * function does not survive that since -O0 causes the compiler to read the >> + * PC back from the stack after the dcache flush. Might it be possible to fix >> + * this by flushing the write buffer? >> + */ > > "compiler to read the PC back from the stack after the dcache flush" - > can you please explain what exactly this means, and which exact > problem it causes? This is the code without the patch (armv7) using -O0: 00000248 <cache_disable>: 248: e92d4810 push {r4, fp, lr} 24c: e28db008 add fp, sp, #8 250: e24dd01c sub sp, sp, #28 254: e50b0020 str r0, [fp, #-32] 258: ee114f10 mrc 15, 0, r4, cr1, cr0, {0} 25c: e50b4014 str r4, [fp, #-20] 260: e51b3014 ldr r3, [fp, #-20] 264: e50b3010 str r3, [fp, #-16] 268: ebffff69 bl 14 <cp_delay> 26c: e51b3020 ldr r3, [fp, #-32] 270: e3530004 cmp r3, #4 274: 1a000007 bne 298 <cache_disable+0x50> 278: e51b3010 ldr r3, [fp, #-16] 27c: e2033004 and r3, r3, #4 280: e3530000 cmp r3, #0 284: 0a00000b beq 2b8 <cache_disable+0x70> 288: e51b3020 ldr r3, [fp, #-32] 28c: e3833001 orr r3, r3, #1 290: e50b3020 str r3, [fp, #-32] 294: ebfffffe bl 0 <flush_dcache_all> 298: e51b3020 ldr r3, [fp, #-32] ^^^^^^^^^^^^^ 29c: e1e02003 mvn r2, r3 2a0: e51b3010 ldr r3, [fp, #-16] 2a4: e0023003 and r3, r2, r3 2a8: e50b3018 str r3, [fp, #-24] 2ac: e51b3018 ldr r3, [fp, #-24] 2b0: ee013f10 mcr 15, 0, r3, cr1, cr0, {0} 2b4: ea000000 b 2bc <cache_disable+0x74> 2b8: e1a00000 nop ; (mov r0, r0) 2bc: e24bd008 sub sp, fp, #8 2c0: e8bd8810 pop {r4, fp, pc} this is the code with the patch: 00000124 <dcache_disable>: 124: e92d4010 push {r4, lr} 128: ebffffb4 bl 0 <cp_delay> 12c: ee114f10 mrc 15, 0, r4, cr1, cr0, {0} 130: e3140004 tst r4, #4 134: 08bd8010 popeq {r4, pc} 138: ebfffffe bl 0 <flush_dcache_all> 13c: e3c44005 bic r4, r4, #5 140: ee014f10 mcr 15, 0, r4, cr1, cr0, {0} 144: e8bd8010 pop {r4, pc} I have marked with ^^^^^^^^^^ the line that I think it dies. I did not spent a lot of time looking at it - just found the problem with an ICE and then tried to fix it. I suspect it can be fixed with some sort of dsb() in flush_dcache_all(), but I am not sure. Unfortunately things have moved on and I can't easily test this (now using Thumb2 where -O0 isn't quite so extreme). > >> +static void cache_disable(uint32_t cache_bit) __attribute__ ((optimize(2))); > > Sorry, I will not accept this. OK. If I hit it again next time I will try a bit harder to root cause it. Regards, Simon > > Best regards, > > Wolfgang Denk > > -- > DENX Software Engineering GmbH, MD: Wolfgang Denk & Detlev Zundel > HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany > Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de > backups: always in season, never out of style.
Dear Simon, In message <CAPnjgZ38TZr2ztdq5D8fNfgCz9a+-vGJFv0saezJWiQBHh-FwA@mail.gmail.com> you wrote: > > > "compiler to read the PC back from the stack after the dcache flush" - > > can you please explain what exactly this means, and which exact > > problem it causes? > > This is the code without the patch (armv7) using -O0: > > 00000248 <cache_disable>: > 248: e92d4810 push {r4, fp, lr} > 24c: e28db008 add fp, sp, #8 > 250: e24dd01c sub sp, sp, #28 > 254: e50b0020 str r0, [fp, #-32] > 258: ee114f10 mrc 15, 0, r4, cr1, cr0, {0} > 25c: e50b4014 str r4, [fp, #-20] > 260: e51b3014 ldr r3, [fp, #-20] > 264: e50b3010 str r3, [fp, #-16] > 268: ebffff69 bl 14 <cp_delay> > 26c: e51b3020 ldr r3, [fp, #-32] > 270: e3530004 cmp r3, #4 > 274: 1a000007 bne 298 <cache_disable+0x50> > 278: e51b3010 ldr r3, [fp, #-16] > 27c: e2033004 and r3, r3, #4 > 280: e3530000 cmp r3, #0 > 284: 0a00000b beq 2b8 <cache_disable+0x70> > 288: e51b3020 ldr r3, [fp, #-32] > 28c: e3833001 orr r3, r3, #1 > 290: e50b3020 str r3, [fp, #-32] > 294: ebfffffe bl 0 <flush_dcache_all> > 298: e51b3020 ldr r3, [fp, #-32] > ^^^^^^^^^^^^^ > > 29c: e1e02003 mvn r2, r3 > 2a0: e51b3010 ldr r3, [fp, #-16] > 2a4: e0023003 and r3, r2, r3 > 2a8: e50b3018 str r3, [fp, #-24] > 2ac: e51b3018 ldr r3, [fp, #-24] > 2b0: ee013f10 mcr 15, 0, r3, cr1, cr0, {0} > 2b4: ea000000 b 2bc <cache_disable+0x74> > 2b8: e1a00000 nop ; (mov r0, r0) > 2bc: e24bd008 sub sp, fp, #8 > 2c0: e8bd8810 pop {r4, fp, pc} > > this is the code with the patch: > > 00000124 <dcache_disable>: > 124: e92d4010 push {r4, lr} > 128: ebffffb4 bl 0 <cp_delay> > 12c: ee114f10 mrc 15, 0, r4, cr1, cr0, {0} > 130: e3140004 tst r4, #4 > 134: 08bd8010 popeq {r4, pc} > 138: ebfffffe bl 0 <flush_dcache_all> > 13c: e3c44005 bic r4, r4, #5 > 140: ee014f10 mcr 15, 0, r4, cr1, cr0, {0} > 144: e8bd8010 pop {r4, pc} > > I have marked with ^^^^^^^^^^ the line that I think it dies. I did not > spent a lot of time looking at it - just found the problem with an ICE > and then tried to fix it. I suspect it can be fixed with some sort of > dsb() in flush_dcache_all(), but I am not sure. The code is actually pretty much different - note that the version above contains calls to cache_disable() which are not visible in the code below. But then - the insn you mark is restoring r3 from the stack, after iut was stored there right before calling flush_dcache_all(). Why should this cause problems? And in which way is this "read[ing] the PC back from the stack" ? r3 is not the PC, right? > OK. If I hit it again next time I will try a bit harder to root cause it. Thanks, and sorry for insisting on a real explanation for obscure problems. Best regards, Wolfgang Denk
Hi Wolfgang, On Wed, Nov 7, 2012 at 4:55 AM, Wolfgang Denk <wd@denx.de> wrote: > Dear Simon, > > In message <CAPnjgZ38TZr2ztdq5D8fNfgCz9a+-vGJFv0saezJWiQBHh-FwA@mail.gmail.com> you wrote: >> >> > "compiler to read the PC back from the stack after the dcache flush" - >> > can you please explain what exactly this means, and which exact >> > problem it causes? >> >> This is the code without the patch (armv7) using -O0: >> >> 00000248 <cache_disable>: >> 248: e92d4810 push {r4, fp, lr} >> 24c: e28db008 add fp, sp, #8 >> 250: e24dd01c sub sp, sp, #28 >> 254: e50b0020 str r0, [fp, #-32] >> 258: ee114f10 mrc 15, 0, r4, cr1, cr0, {0} >> 25c: e50b4014 str r4, [fp, #-20] >> 260: e51b3014 ldr r3, [fp, #-20] >> 264: e50b3010 str r3, [fp, #-16] >> 268: ebffff69 bl 14 <cp_delay> >> 26c: e51b3020 ldr r3, [fp, #-32] >> 270: e3530004 cmp r3, #4 >> 274: 1a000007 bne 298 <cache_disable+0x50> >> 278: e51b3010 ldr r3, [fp, #-16] >> 27c: e2033004 and r3, r3, #4 >> 280: e3530000 cmp r3, #0 >> 284: 0a00000b beq 2b8 <cache_disable+0x70> >> 288: e51b3020 ldr r3, [fp, #-32] >> 28c: e3833001 orr r3, r3, #1 >> 290: e50b3020 str r3, [fp, #-32] >> 294: ebfffffe bl 0 <flush_dcache_all> >> 298: e51b3020 ldr r3, [fp, #-32] >> ^^^^^^^^^^^^^ >> >> 29c: e1e02003 mvn r2, r3 >> 2a0: e51b3010 ldr r3, [fp, #-16] >> 2a4: e0023003 and r3, r2, r3 >> 2a8: e50b3018 str r3, [fp, #-24] >> 2ac: e51b3018 ldr r3, [fp, #-24] >> 2b0: ee013f10 mcr 15, 0, r3, cr1, cr0, {0} >> 2b4: ea000000 b 2bc <cache_disable+0x74> >> 2b8: e1a00000 nop ; (mov r0, r0) >> 2bc: e24bd008 sub sp, fp, #8 >> 2c0: e8bd8810 pop {r4, fp, pc} >> >> this is the code with the patch: >> >> 00000124 <dcache_disable>: >> 124: e92d4010 push {r4, lr} >> 128: ebffffb4 bl 0 <cp_delay> >> 12c: ee114f10 mrc 15, 0, r4, cr1, cr0, {0} >> 130: e3140004 tst r4, #4 >> 134: 08bd8010 popeq {r4, pc} >> 138: ebfffffe bl 0 <flush_dcache_all> >> 13c: e3c44005 bic r4, r4, #5 >> 140: ee014f10 mcr 15, 0, r4, cr1, cr0, {0} >> 144: e8bd8010 pop {r4, pc} >> >> I have marked with ^^^^^^^^^^ the line that I think it dies. I did not >> spent a lot of time looking at it - just found the problem with an ICE >> and then tried to fix it. I suspect it can be fixed with some sort of >> dsb() in flush_dcache_all(), but I am not sure. > > The code is actually pretty much different - note that the version > above contains calls to cache_disable() which are not visible in the > code below. > > But then - the insn you mark is restoring r3 from the stack, after iut > was stored there right before calling flush_dcache_all(). Why should > this cause problems? And in which way is this "read[ing] the PC back > from the stack" ? r3 is not the PC, right? I wish I could easily repeat this, but I am not set up for it now, and the code bases have moved on. I recall that the PC value read from the stack (in one of the pop operations0 was wrong. > > >> OK. If I hit it again next time I will try a bit harder to root cause it. > > Thanks, and sorry for insisting on a real explanation for obscure > problems. Well fair enough I think. Maybe someone else will hit it and this thread will be useful for them to root cause. I find the need for -O0 is much reduced now that the tools are better. Regards, SImon > > Best regards, > > Wolfgang Denk > > -- > DENX Software Engineering GmbH, MD: Wolfgang Denk & Detlev Zundel > HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany > Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de > Teenagers are people who express a burning desire to be different by > dressing exactly alike. > There are some strings. They're just not attached.
diff --git a/arch/arm/lib/cache-cp15.c b/arch/arm/lib/cache-cp15.c index 939de10..8f8385d 100644 --- a/arch/arm/lib/cache-cp15.c +++ b/arch/arm/lib/cache-cp15.c @@ -110,6 +110,16 @@ static void cache_enable(uint32_t cache_bit) set_cr(reg | cache_bit); } +/* + * Big hack warning! + * + * Devs like to compile with -O0 to get a nice debugging illusion. But this + * function does not survive that since -O0 causes the compiler to read the + * PC back from the stack after the dcache flush. Might it be possible to fix + * this by flushing the write buffer? + */ +static void cache_disable(uint32_t cache_bit) __attribute__ ((optimize(2))); + /* cache_bit must be either CR_I or CR_C */ static void cache_disable(uint32_t cache_bit) {
It is good to have these functions written in C instead of assembler, but with -O0 the cache_disable() function doesn't return. Rather than revert to assembler, this fix just forces this to be built with -O2. Signed-off-by: Simon Glass <sjg@chromium.org> --- arch/arm/lib/cache-cp15.c | 10 ++++++++++ 1 files changed, 10 insertions(+), 0 deletions(-)