Patchwork [U-Boot,RFC,v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.

login
register
mail settings
Submitter John Rigby
Date Dec. 21, 2010, 12:46 a.m.
Message ID <AANLkTinMK=gzYgyUTjkCg2y1g9YuRAu_P-DS0GARrfPR@mail.gmail.com>
Download mbox | patch
Permalink /patch/76248/
State Not Applicable
Headers show

Comments

John Rigby - Dec. 21, 2010, 12:46 a.m.
On Mon, Dec 20, 2010 at 5:25 PM, John Rigby <john.rigby@linaro.org> wrote:
> On Mon, Dec 20, 2010 at 10:12 AM, Alexander Holler <holler@ahsoftware.de> wrote:
>
>> There must be more problems. Using gcc 4.5.1, the read*/write*-patch and
>> your hack, my kernel doesn't boot. Using gcc 4.3.5 and the same source to
>> compile u-boot the kernel comes up. Here is the output for the non-working
>> u-boot:
>>
>> ----------------
>> U-Boot 2010.12-rc3-00015-g3ae9687-dirty (Dec 20 2010 - 18:01:41, gcc 4.5.1)
>>
>> OMAP3530-GP ES3.1, CPU-OPP2, L3-165MHz, Max CPU Clock 720 mHz
>> OMAP3 Beagle board + LPDDR/NAND
>> I2C:   ready
>> DRAM:  256 MiB
>> NAND:  256 MiB
>> MMC:   OMAP SD/MMC: 0
>> In:    serial
>> Out:   serial
>> Err:   serial
>> Beagle Rev C4
>> timed out in wait_for_pin: I2C_STAT=0
>> No EEPROM on expansion board
>> Die ID #062a000400000000040365fa16019019
>> Hit any key to stop autoboot:  0
>> reading boot.scr
>>
>> 422 bytes read
>> Running bootscript from mmc ...
>> ## Executing script at 82000000
>> reading uImage
>>
>> 2419940 bytes read
>> Booting from mmc ...
>> ## Booting kernel from Legacy Image at 82000000 ...
>>   Image Name:   Linux-2.6.37-rc5-beagleboard-000
>>   Image Type:   ARM Linux Kernel Image (uncompressed)
>>   Data Size:    2419876 Bytes = 2.3 MiB
>>   Load Address: 80008000
>>   Entry Point:  80008000
>>   Verifying Checksum ... OK
>>   Loading Kernel Image ... OK
>> OK
>> ----------------
>>
>> Nothing else.
>>
>> Regards,
>>
>> Alexander
>>
>
> Yes, you are correct, I see the same here with 4.5.2.  I noticed that
> bd did not have correct values of machine type and boot params:
>
> bd address  = 0x8FF24FE0
> arch_number = 0xFF0084FF
> boot_params = 0xBB2000FE
> DRAM bank   = 0x00000000
> -> start    = 0x80000000
> -> size     = 0x08000000
> DRAM bank   = 0x00000001
> -> start    = 0x88000000
> -> size     = 0x08000000
> baudrate    = 115200 bps
> TLB addr    = 0x8FFF0000
> relocaddr   = 0x8FF85000
> reloc off   = 0x0FF7D000
> irq_sp      = 0x8FF24F68
> sp start    = 0x8FF24F60
> FB base     = 0x00000000
>
> If we then look at board_init in beagle.c the problem is obvious:
>
> 800331ac <board_init>:
> 800331ac:       e92d4008        push    {r3, lr}
> 800331b0:       ebff5a74        bl      80009b88 <gpmc_init>
> 800331b4:       e3a00000        mov     r0, #0
> 800331b8:       e5983000        ldr     r3, [r8]
> 800331bc:       e5983000        ldr     r3, [r8]
> 800331c0:       e8bd8008        pop     {r3, pc}
>

Apparently this is a known issue mentioned in README:

NOTE: DECLARE_GLOBAL_DATA_PTR must be used with file-global scope,
or current versions of GCC may "optimize" the code too much.


With this fix I can boot again:


Please let me know if you find any other problems.

br,

John
Dirk Behme - Dec. 21, 2010, 7:11 a.m.
On 21.12.2010 01:46, John Rigby wrote:
> On Mon, Dec 20, 2010 at 5:25 PM, John Rigby<john.rigby@linaro.org>  wrote:
>> On Mon, Dec 20, 2010 at 10:12 AM, Alexander Holler<holler@ahsoftware.de>  wrote:
>>
>>> There must be more problems. Using gcc 4.5.1, the read*/write*-patch and
>>> your hack, my kernel doesn't boot. Using gcc 4.3.5 and the same source to
>>> compile u-boot the kernel comes up. Here is the output for the non-working
>>> u-boot:
>>>
>>> ----------------
>>> U-Boot 2010.12-rc3-00015-g3ae9687-dirty (Dec 20 2010 - 18:01:41, gcc 4.5.1)
>>>
>>> OMAP3530-GP ES3.1, CPU-OPP2, L3-165MHz, Max CPU Clock 720 mHz
>>> OMAP3 Beagle board + LPDDR/NAND
>>> I2C:   ready
>>> DRAM:  256 MiB
>>> NAND:  256 MiB
>>> MMC:   OMAP SD/MMC: 0
>>> In:    serial
>>> Out:   serial
>>> Err:   serial
>>> Beagle Rev C4
>>> timed out in wait_for_pin: I2C_STAT=0
>>> No EEPROM on expansion board
>>> Die ID #062a000400000000040365fa16019019
>>> Hit any key to stop autoboot:  0
>>> reading boot.scr
>>>
>>> 422 bytes read
>>> Running bootscript from mmc ...
>>> ## Executing script at 82000000
>>> reading uImage
>>>
>>> 2419940 bytes read
>>> Booting from mmc ...
>>> ## Booting kernel from Legacy Image at 82000000 ...
>>>    Image Name:   Linux-2.6.37-rc5-beagleboard-000
>>>    Image Type:   ARM Linux Kernel Image (uncompressed)
>>>    Data Size:    2419876 Bytes = 2.3 MiB
>>>    Load Address: 80008000
>>>    Entry Point:  80008000
>>>    Verifying Checksum ... OK
>>>    Loading Kernel Image ... OK
>>> OK
>>> ----------------
>>>
>>> Nothing else.
>>>
>>> Regards,
>>>
>>> Alexander
>>>
>>
>> Yes, you are correct, I see the same here with 4.5.2.  I noticed that
>> bd did not have correct values of machine type and boot params:
>>
>> bd address  = 0x8FF24FE0
>> arch_number = 0xFF0084FF
>> boot_params = 0xBB2000FE
>> DRAM bank   = 0x00000000
>> ->  start    = 0x80000000
>> ->  size     = 0x08000000
>> DRAM bank   = 0x00000001
>> ->  start    = 0x88000000
>> ->  size     = 0x08000000
>> baudrate    = 115200 bps
>> TLB addr    = 0x8FFF0000
>> relocaddr   = 0x8FF85000
>> reloc off   = 0x0FF7D000
>> irq_sp      = 0x8FF24F68
>> sp start    = 0x8FF24F60
>> FB base     = 0x00000000
>>
>> If we then look at board_init in beagle.c the problem is obvious:
>>
>> 800331ac<board_init>:
>> 800331ac:       e92d4008        push    {r3, lr}
>> 800331b0:       ebff5a74        bl      80009b88<gpmc_init>
>> 800331b4:       e3a00000        mov     r0, #0
>> 800331b8:       e5983000        ldr     r3, [r8]
>> 800331bc:       e5983000        ldr     r3, [r8]
>> 800331c0:       e8bd8008        pop     {r3, pc}
>>
>
> Apparently this is a known issue mentioned in README:
>
> NOTE: DECLARE_GLOBAL_DATA_PTR must be used with file-global scope,
> or current versions of GCC may "optimize" the code too much.
>
>
> With this fix I can boot again:
>
> diff --git a/board/ti/beagle/beagle.c b/board/ti/beagle/beagle.c
> index d9b6f01..c066d6e 100644
> --- a/board/ti/beagle/beagle.c
> +++ b/board/ti/beagle/beagle.c
> @@ -51,6 +51,8 @@
>
>   #define BEAGLE_NO_EEPROM               0xffffffff
>
> +DECLARE_GLOBAL_DATA_PTR;
> +
>   static struct {
>          unsigned int device_vendor;
>          unsigned char revision;
> @@ -66,8 +68,6 @@ static struct {
>    */
>   int board_init(void)
>   {
> -       DECLARE_GLOBAL_DATA_PTR;
> -
>          gpmc_init(); /* in SRAM or SDRAM, finish GPMC */
>          /* board id for Linux */
>          gd->bd->bi_arch_number = MACH_TYPE_OMAP3_BEAGLE;
>
> Please let me know if you find any other problems.

Just to not loose the overview:

This is fixed by your patch

http://patchwork.ozlabs.org/patch/76250/

?

But the issue with drivers/mtd/nand/omap_gpmc.c (i.e. the additional 
ldrb    r3, [r3]) is still open? Has anybody tried to replace it with 
a nop in the binary to be sure this is the root cause?

Thanks

Dirk
Albert ARIBAUD - Dec. 21, 2010, 7:21 a.m.
Hi Dirk,

Le 21/12/2010 08:11, Dirk Behme a écrit :

> But the issue with drivers/mtd/nand/omap_gpmc.c (i.e. the additional
> ldrb    r3, [r3]) is still open? Has anybody tried to replace it with
> a nop in the binary to be sure this is the root cause?

Can you try and preprocess the C file for both the broken and working 
cases, then post the preprocessed C extract? Differences at the C level 
may help understanding differences at the asm level.

> Thanks
>
> Dirk

Amicalement,
Dirk Behme - Dec. 21, 2010, 8:05 a.m.
On 21.12.2010 08:21, Albert ARIBAUD wrote:
> Hi Dirk,
>
> Le 21/12/2010 08:11, Dirk Behme a écrit :
>
>> But the issue with drivers/mtd/nand/omap_gpmc.c (i.e. the additional
>> ldrb    r3, [r3]) is still open? Has anybody tried to replace it with
>> a nop in the binary to be sure this is the root cause?
>
> Can you try and preprocess the C file for both the broken and working
> cases, then post the preprocessed C extract? Differences at the C level
> may help understanding differences at the asm level.

gcc version 4.5.1 (Sourcery G++ Lite 2010.09-50)

Work:
====

static void omap_nand_hwcontrol(struct mtd_info *mtd, int32_t cmd,
     uint32_t ctrl)
{
  register struct nand_chip *this = mtd->priv;
  ...
  if (cmd != -1)

   (*(volatile unsigned char *)(this->IO_ADDR_W) = (cmd));
}

	if (cmd != NAND_CMD_NONE)
   84:	e3710001 	cmn	r1, #1
		origwriteb(cmd, this->IO_ADDR_W);
   88:	15933004 	ldrne	r3, [r3, #4]
   8c:	120110ff 	andne	r1, r1, #255	; 0xff
   90:	15c31000 	strbne	r1, [r3]
   94:	e12fff1e 	bx	lr
	...


Broken:
======

static void omap_nand_hwcontrol(struct mtd_info *mtd, int32_t cmd,
     uint32_t ctrl)
{
  register struct nand_chip *this = mtd->priv;
...
  if (cmd != -1)
   ({ do { } while (0); (*(volatile unsigned char *)(this->IO_ADDR_W) 
= (cmd)); });
}

	if (cmd != NAND_CMD_NONE)
   84:	e3710001 	cmn	r1, #1
		writeb(cmd, this->IO_ADDR_W);
   88:	15933004 	ldrne	r3, [r3, #4]
   8c:	120110ff 	andne	r1, r1, #255	; 0xff
   90:	15c31000 	strbne	r1, [r3]
   94:	15d33000 	ldrbne	r3, [r3]
   98:	e12fff1e 	bx	lr
	...


The issue seems to be the additional 'ldrbne	r3, [r3]' added by the 
compiler in the broken version.

Best regards

Dirk
Wolfgang Denk - Dec. 21, 2010, 8:17 a.m.
Dear Dirk Behme,

In message <4D105FB3.3090801@googlemail.com> you wrote:
>
> Broken:
> ======
...
> static void omap_nand_hwcontrol(struct mtd_info *mtd, int32_t cmd,
>      uint32_t ctrl)
> {
>   register struct nand_chip *this = mtd->priv;
> ...
>   if (cmd != -1)
>    ({ do { } while (0); (*(volatile unsigned char *)(this->IO_ADDR_W) >
> = (cmd)); });

But this is the old, discarded version of the patch.

I though we had agreed that we have to use the

	__asm__ __volatile__ ("" : : : "memory")

version instead?

Best regards,

Wolfgang Denk
Dirk Behme - Dec. 21, 2010, 8:35 a.m.
(Resend with corrected broken example)

On 21.12.2010 08:21, Albert ARIBAUD wrote:
> Hi Dirk,
>
> Le 21/12/2010 08:11, Dirk Behme a écrit :
>
>> But the issue with drivers/mtd/nand/omap_gpmc.c (i.e. the additional
>> ldrb    r3, [r3]) is still open? Has anybody tried to replace it with
>> a nop in the binary to be sure this is the root cause?
>
> Can you try and preprocess the C file for both the broken and working
> cases, then post the preprocessed C extract? Differences at the C level
> may help understanding differences at the asm level.

gcc version 4.5.1 (Sourcery G++ Lite 2010.09-50)

Work:
====

static void omap_nand_hwcontrol(struct mtd_info *mtd, int32_t cmd,
     uint32_t ctrl)
{
  register struct nand_chip *this = mtd->priv;
  ...
  if (cmd != -1)

   (*(volatile unsigned char *)(this->IO_ADDR_W) = (cmd));
}

	if (cmd != NAND_CMD_NONE)
   84:	e3710001 	cmn	r1, #1
		origwriteb(cmd, this->IO_ADDR_W);
   88:	15933004 	ldrne	r3, [r3, #4]
   8c:	120110ff 	andne	r1, r1, #255	; 0xff
   90:	15c31000 	strbne	r1, [r3]
   94:	e12fff1e 	bx	lr
	...


Broken:
======

static void omap_nand_hwcontrol(struct mtd_info *mtd, int32_t cmd,
     uint32_t ctrl)
{
  register struct nand_chip *this = mtd->priv;

...

  if (cmd != -1)
   ({ __asm__ __volatile__ ("" : : : "memory"); (*(volatile unsigned 
char *)(this->IO_ADDR_W) = (cmd)); });
}

	if (cmd != NAND_CMD_NONE)
   84:	e3710001 	cmn	r1, #1
   88:	012fff1e 	bxeq	lr
		writeb(cmd, this->IO_ADDR_W);
   8c:	e5933004 	ldr	r3, [r3, #4]
   90:	e20110ff 	and	r1, r1, #255	; 0xff
   94:	e5c31000 	strb	r1, [r3]
   98:	e5d33000 	ldrb	r3, [r3]
   9c:	e12fff1e 	bx	lr


The issue seems to be the additional 'ldrb	r3, [r3]' added by the 
compiler in the broken version.

Best regards

Dirk
Dirk Behme - Dec. 21, 2010, 8:37 a.m.
On 21.12.2010 09:17, Wolfgang Denk wrote:
> Dear Dirk Behme,
>
> In message<4D105FB3.3090801@googlemail.com>  you wrote:
>>
>> Broken:
>> ======
> ...
>> static void omap_nand_hwcontrol(struct mtd_info *mtd, int32_t cmd,
>>       uint32_t ctrl)
>> {
>>    register struct nand_chip *this = mtd->priv;
>> ...
>>    if (cmd != -1)
>>     ({ do { } while (0); (*(volatile unsigned char *)(this->IO_ADDR_W)>
>> = (cmd)); });
>
> But this is the old, discarded version of the patch.
>
> I though we had agreed that we have to use the
>
> 	__asm__ __volatile__ ("" : : : "memory")
>
> version instead?

Yes.

I mixed the patches :(

Sorry for the noise. I just sent the hopefully correct version some 
minutes ago.

Thanks

Dirk
John Rigby - Dec. 21, 2010, 8:46 a.m.
On Tue, Dec 21, 2010 at 1:35 AM, Dirk Behme <dirk.behme@googlemail.com> wrote:
>
> (Resend with corrected broken example)
>
> On 21.12.2010 08:21, Albert ARIBAUD wrote:
>> Hi Dirk,
>>
>> Le 21/12/2010 08:11, Dirk Behme a écrit :
>>
>>> But the issue with drivers/mtd/nand/omap_gpmc.c (i.e. the additional
>>> ldrb    r3, [r3]) is still open? Has anybody tried to replace it with
>>> a nop in the binary to be sure this is the root cause?
>>
>> Can you try and preprocess the C file for both the broken and working
>> cases, then post the preprocessed C extract? Differences at the C level
>> may help understanding differences at the asm level.
>
> gcc version 4.5.1 (Sourcery G++ Lite 2010.09-50)
>
> Work:
> ====
>
> static void omap_nand_hwcontrol(struct mtd_info *mtd, int32_t cmd,
>     uint32_t ctrl)
> {
>  register struct nand_chip *this = mtd->priv;
>  ...
>  if (cmd != -1)
>
>   (*(volatile unsigned char *)(this->IO_ADDR_W) = (cmd));
> }
>
>        if (cmd != NAND_CMD_NONE)
>   84:  e3710001        cmn     r1, #1
>                origwriteb(cmd, this->IO_ADDR_W);
>   88:  15933004        ldrne   r3, [r3, #4]
>   8c:  120110ff        andne   r1, r1, #255    ; 0xff
>   90:  15c31000        strbne  r1, [r3]
>   94:  e12fff1e        bx      lr
>        ...
>
>
> Broken:
> ======
>
> static void omap_nand_hwcontrol(struct mtd_info *mtd, int32_t cmd,
>     uint32_t ctrl)
> {
>  register struct nand_chip *this = mtd->priv;
>
> ...
>
>  if (cmd != -1)
>   ({ __asm__ __volatile__ ("" : : : "memory"); (*(volatile unsigned
> char *)(this->IO_ADDR_W) = (cmd)); });
> }
>
>        if (cmd != NAND_CMD_NONE)
>   84:  e3710001        cmn     r1, #1
>   88:  012fff1e        bxeq    lr
>                writeb(cmd, this->IO_ADDR_W);
>   8c:  e5933004        ldr     r3, [r3, #4]
>   90:  e20110ff        and     r1, r1, #255    ; 0xff
>   94:  e5c31000        strb    r1, [r3]
>   98:  e5d33000        ldrb    r3, [r3]
>   9c:  e12fff1e        bx      lr
>
>
> The issue seems to be the additional 'ldrb      r3, [r3]' added by the
> compiler in the broken version.
>

And I at your suggestion tried modifying the binary changing the extra
ldrb to a nop and it works.
Albert ARIBAUD - Dec. 21, 2010, 10:38 a.m.
Le 21/12/2010 09:46, John Rigby a écrit :
> On Tue, Dec 21, 2010 at 1:35 AM, Dirk Behme<dirk.behme@googlemail.com>  wrote:
>>
>> (Resend with corrected broken example)
>>
>> On 21.12.2010 08:21, Albert ARIBAUD wrote:
>>> Hi Dirk,
>>>
>>> Le 21/12/2010 08:11, Dirk Behme a écrit :
>>>
>>>> But the issue with drivers/mtd/nand/omap_gpmc.c (i.e. the additional
>>>> ldrb    r3, [r3]) is still open? Has anybody tried to replace it with
>>>> a nop in the binary to be sure this is the root cause?
>>>
>>> Can you try and preprocess the C file for both the broken and working
>>> cases, then post the preprocessed C extract? Differences at the C level
>>> may help understanding differences at the asm level.
>>
>> gcc version 4.5.1 (Sourcery G++ Lite 2010.09-50)
>>
>> Work:
>> ====
>>
>> static void omap_nand_hwcontrol(struct mtd_info *mtd, int32_t cmd,
>>      uint32_t ctrl)
>> {
>>   register struct nand_chip *this = mtd->priv;
>>   ...
>>   if (cmd != -1)
>>
>>    (*(volatile unsigned char *)(this->IO_ADDR_W) = (cmd));
>> }
>>
>>         if (cmd != NAND_CMD_NONE)
>>    84:  e3710001        cmn     r1, #1
>>                 origwriteb(cmd, this->IO_ADDR_W);
>>    88:  15933004        ldrne   r3, [r3, #4]
>>    8c:  120110ff        andne   r1, r1, #255    ; 0xff
>>    90:  15c31000        strbne  r1, [r3]
>>    94:  e12fff1e        bx      lr
>>         ...
>>
>>
>> Broken:
>> ======
>>
>> static void omap_nand_hwcontrol(struct mtd_info *mtd, int32_t cmd,
>>      uint32_t ctrl)
>> {
>>   register struct nand_chip *this = mtd->priv;
>>
>> ...
>>
>>   if (cmd != -1)
>>    ({ __asm__ __volatile__ ("" : : : "memory"); (*(volatile unsigned
>> char *)(this->IO_ADDR_W) = (cmd)); });
>> }
>>
>>         if (cmd != NAND_CMD_NONE)
>>    84:  e3710001        cmn     r1, #1
>>    88:  012fff1e        bxeq    lr
>>                 writeb(cmd, this->IO_ADDR_W);
>>    8c:  e5933004        ldr     r3, [r3, #4]
>>    90:  e20110ff        and     r1, r1, #255    ; 0xff
>>    94:  e5c31000        strb    r1, [r3]
>>    98:  e5d33000        ldrb    r3, [r3]
>>    9c:  e12fff1e        bx      lr
>>
>>
>> The issue seems to be the additional 'ldrb      r3, [r3]' added by the
>> compiler in the broken version.
>>
>
> And I at your suggestion tried modifying the binary changing the extra
> ldrb to a nop and it works.

Seems like a compiler issue to me, as the preprocessed C source is the 
same for the register access and does not call for a re-read (that is 
what I wanted to see in the preprocessed version), yet the ASM sequence 
does the re-read.

Amicalement,
Wolfgang Denk - Dec. 21, 2010, 10:53 a.m.
Dear Albert ARIBAUD,

In message <4D1083B4.2060704@free.fr> you wrote:
>
> > And I at your suggestion tried modifying the binary changing the extra
> > ldrb to a nop and it works.
>
> Seems like a compiler issue to me, as the preprocessed C source is the
> same for the register access and does not call for a re-read (that is
> what I wanted to see in the preprocessed version), yet the ASM sequence
> does the re-read.

I also tend to think this is a compiler problem.  Searching the gcc
bugzilla entries for "ldrb" turns up quite a number of hits.  I'm not
sure which of these we are running into here, but there are enough of
them so you can chose freely :-(

Best regards,

Wolfgang Denk
Alexander Holler - Dec. 21, 2010, 12:35 p.m.
Am 21.12.2010 11:53, schrieb Wolfgang Denk:
> Dear Albert ARIBAUD,
>
> In message<4D1083B4.2060704@free.fr>  you wrote:
>>
>>> And I at your suggestion tried modifying the binary changing the extra
>>> ldrb to a nop and it works.
>>
>> Seems like a compiler issue to me, as the preprocessed C source is the
>> same for the register access and does not call for a re-read (that is
>> what I wanted to see in the preprocessed version), yet the ASM sequence
>> does the re-read.
>
> I also tend to think this is a compiler problem.  Searching the gcc
> bugzilla entries for "ldrb" turns up quite a number of hits.  I'm not
> sure which of these we are running into here, but there are enough of
> them so you can chose freely :-(

Hmm, is there actual somethinbg which should forbid the compiler to 
generate such code which rereads something? It might not be nice, but I 
don't think that it is forbidden for a compiler to do so. So the proper 
way to handle such, might be to use asm to avoid that the compiler 
touches that register.

Regards,

Alexander
Albert ARIBAUD - Dec. 21, 2010, 12:51 p.m.
Le 21/12/2010 13:35, Alexander Holler a écrit :

> Hmm, is there actual somethinbg which should forbid the compiler to
> generate such code which rereads something? It might not be nice, but I
> don't think that it is forbidden for a compiler to do so. So the proper
> way to handle such, might be to use asm to avoid that the compiler
> touches that register.

Yes there is something that should prevent a compiler from inserting 
reads: these accesses are to hardware, not memory, and may cause side 
effects even on read (these could be acknowledges, for instance; I've 
seen instances of that myself on some HW).

Another way to look at it is that the semantics of " *ptr = value " is a 
pure write and should not result in a write-then-read.

> Regards,
>
> Alexander

Amicalement,
Alexander Holler - Dec. 21, 2010, 1:30 p.m.
Am 21.12.2010 13:51, schrieb Albert ARIBAUD:
> Le 21/12/2010 13:35, Alexander Holler a écrit :
>
>> Hmm, is there actual somethinbg which should forbid the compiler to
>> generate such code which rereads something? It might not be nice, but I
>> don't think that it is forbidden for a compiler to do so. So the proper
>> way to handle such, might be to use asm to avoid that the compiler
>> touches that register.
>
> Yes there is something that should prevent a compiler from inserting
> reads: these accesses are to hardware, not memory, and may cause side
> effects even on read (these could be acknowledges, for instance; I've
> seen instances of that myself on some HW).
>
> Another way to look at it is that the semantics of " *ptr = value " is a
> pure write and should not result in a write-then-read.

I think it's something like atomic_read. E.g. when reading an 32bit int 
(uint32_t i = *bla;), nothing forbids that the compiler generates code 
which reads those 4 bytes byte by byte (and so becoming a non-atomic 
operation). It's unusual to do so on 32bit architectures but valid.

Regards,

Alexander
Albert ARIBAUD - Dec. 21, 2010, 2:33 p.m.
Le 21/12/2010 14:30, Alexander Holler a écrit :
> Am 21.12.2010 13:51, schrieb Albert ARIBAUD:
>> Le 21/12/2010 13:35, Alexander Holler a écrit :
>>
>>> Hmm, is there actual somethinbg which should forbid the compiler to
>>> generate such code which rereads something? It might not be nice, but I
>>> don't think that it is forbidden for a compiler to do so. So the proper
>>> way to handle such, might be to use asm to avoid that the compiler
>>> touches that register.
>>
>> Yes there is something that should prevent a compiler from inserting
>> reads: these accesses are to hardware, not memory, and may cause side
>> effects even on read (these could be acknowledges, for instance; I've
>> seen instances of that myself on some HW).
>>
>> Another way to look at it is that the semantics of " *ptr = value " is a
>> pure write and should not result in a write-then-read.
>
> I think it's something like atomic_read. E.g. when reading an 32bit int
> (uint32_t i = *bla;), nothing forbids that the compiler generates code
> which reads those 4 bytes byte by byte (and so becoming a non-atomic
> operation). It's unusual to do so on 32bit architectures but valid.

OTOH, this still respects the semantics (the compiler is allowed to do a 
non-atomic read) while the bug we're seeing does not repect the 
semantics (the compiler is not asked to do any read but does one).

> Regards,
>
> Alexander

Amicalement,
Wolfgang Denk - Dec. 21, 2010, 7:52 p.m.
Dear Albert & friends,

what is your opinion?  Should we include the memory barrier patch into
the upcoming release (and eventually delay it for further testing), or
release as is and solve this issue in the next release?

I tend to leave it as is, as I expect that most people will disappear
in the next few days for holidays, so no much testing will be done
anyway, and we then can solve this with less pressure in the next
release - but I'm not really sure if this is a good idea?

Best regards,

Wolfgang Denk
Dirk Behme - Dec. 21, 2010, 8:04 p.m.
On 21.12.2010 20:52, Wolfgang Denk wrote:
> Dear Albert&  friends,
>
> what is your opinion?  Should we include the memory barrier patch into
> the upcoming release (and eventually delay it for further testing), or
> release as is and solve this issue in the next release?
>
> I tend to leave it as is, as I expect that most people will disappear
> in the next few days for holidays, so no much testing will be done
> anyway, and we then can solve this with less pressure in the next
> release - but I'm not really sure if this is a good idea?

I somehow tend to leave it as is, too.

We have issues with some recent compilers. For these we found a fix 
using the io.h somehow the same way the Linux kernel does. But this 
introduces new issues for us, we haven't found a proper fix yet 
(except changing the code to the 'old' io.h style). But we don't know 
where we might have this issue additionally, yet.

I would like to talk with some tool chain guys about this, too.

Could we add a readme or a note somewhere about the issues with the 
recent tool chains in this release?

Best regards

Dirk
Albert ARIBAUD - Dec. 21, 2010, 9:49 p.m.
Le 21/12/2010 21:04, Dirk Behme a écrit :

> I somehow tend to leave it as is, too.

Agree.

Amicalement,
Alexander Holler - Dec. 22, 2010, 12:11 a.m.
Am 21.12.2010 21:04, schrieb Dirk Behme:
> On 21.12.2010 20:52, Wolfgang Denk wrote:
>> Dear Albert&   friends,
>>
>> what is your opinion?  Should we include the memory barrier patch into
>> the upcoming release (and eventually delay it for further testing), or
>> release as is and solve this issue in the next release?
>>
>> I tend to leave it as is, as I expect that most people will disappear
>> in the next few days for holidays, so no much testing will be done
>> anyway, and we then can solve this with less pressure in the next
>> release - but I'm not really sure if this is a good idea?
>
> I somehow tend to leave it as is, too.
>
> We have issues with some recent compilers. For these we found a fix
> using the io.h somehow the same way the Linux kernel does. But this
> introduces new issues for us, we haven't found a proper fix yet
> (except changing the code to the 'old' io.h style). But we don't know
> where we might have this issue additionally, yet.

The only real problem found with that patch was one with a register 
which doesn't like an (unmotivated) read after write.

On the other side, without that patch, using gcc >= 4.5.x (at least on 
arm) proved to fail. In contrast to that problem of gcc 4.5.x ignoring 
that volatile, 4.5.x still fixes many bugs for arm and gcc >= 4.5.x is 
necessary for hardfloat. So it's likely that more people will start 
using 4.5.x (4.5.2 was just released).

Regards,

Alexander
Albert ARIBAUD - Dec. 22, 2010, 7:02 a.m.
Le 22/12/2010 01:11, Alexander Holler a écrit :
> Am 21.12.2010 21:04, schrieb Dirk Behme:
>> On 21.12.2010 20:52, Wolfgang Denk wrote:
>>> Dear Albert&    friends,
>>>
>>> what is your opinion?  Should we include the memory barrier patch into
>>> the upcoming release (and eventually delay it for further testing), or
>>> release as is and solve this issue in the next release?
>>>
>>> I tend to leave it as is, as I expect that most people will disappear
>>> in the next few days for holidays, so no much testing will be done
>>> anyway, and we then can solve this with less pressure in the next
>>> release - but I'm not really sure if this is a good idea?
>>
>> I somehow tend to leave it as is, too.
>>
>> We have issues with some recent compilers. For these we found a fix
>> using the io.h somehow the same way the Linux kernel does. But this
>> introduces new issues for us, we haven't found a proper fix yet
>> (except changing the code to the 'old' io.h style). But we don't know
>> where we might have this issue additionally, yet.
>
> The only real problem found with that patch was one with a register
> which doesn't like an (unmotivated) read after write.

Yes, and this is enough for me to not want it right away: we caught this 
one, but how many others, so far unseen, will creep up?

> On the other side, without that patch, using gcc>= 4.5.x (at least on
> arm) proved to fail. In contrast to that problem of gcc 4.5.x ignoring
> that volatile, 4.5.x still fixes many bugs for arm and gcc>= 4.5.x is
> necessary for hardfloat. So it's likely that more people will start
> using 4.5.x (4.5.2 was just released).

Do we need hard floating point in u-boot? IIRC, and unless this changed, 
the kernel does not want floating point, so I wonder why u-boot would.

As for getting 4.5 to work, for the next cycle people can still use pre 
4.5 gccs / toolchains, so this is important but not urgent to the point 
of rushing decisions.

> Regards,
>
> Alexander

Amicalement,
Dirk Behme - Dec. 22, 2010, 7:18 a.m.
On 22.12.2010 08:02, Albert ARIBAUD wrote:
> Le 22/12/2010 01:11, Alexander Holler a écrit :
>> Am 21.12.2010 21:04, schrieb Dirk Behme:
>>> On 21.12.2010 20:52, Wolfgang Denk wrote:
>>>> Dear Albert&     friends,
>>>>
>>>> what is your opinion?  Should we include the memory barrier patch into
>>>> the upcoming release (and eventually delay it for further testing), or
>>>> release as is and solve this issue in the next release?
>>>>
>>>> I tend to leave it as is, as I expect that most people will disappear
>>>> in the next few days for holidays, so no much testing will be done
>>>> anyway, and we then can solve this with less pressure in the next
>>>> release - but I'm not really sure if this is a good idea?
>>>
>>> I somehow tend to leave it as is, too.
>>>
>>> We have issues with some recent compilers. For these we found a fix
>>> using the io.h somehow the same way the Linux kernel does. But this
>>> introduces new issues for us, we haven't found a proper fix yet
>>> (except changing the code to the 'old' io.h style). But we don't know
>>> where we might have this issue additionally, yet.
>>
>> The only real problem found with that patch was one with a register
>> which doesn't like an (unmotivated) read after write.
>
> Yes, and this is enough for me to not want it right away: we caught this
> one, but how many others, so far unseen, will creep up?
>
>> On the other side, without that patch, using gcc>= 4.5.x (at least on
>> arm) proved to fail. In contrast to that problem of gcc 4.5.x ignoring
>> that volatile, 4.5.x still fixes many bugs for arm and gcc>= 4.5.x is
>> necessary for hardfloat. So it's likely that more people will start
>> using 4.5.x (4.5.2 was just released).
>
> Do we need hard floating point in u-boot? IIRC, and unless this changed,
> the kernel does not want floating point, so I wonder why u-boot would.
>
> As for getting 4.5 to work, for the next cycle people can still use pre
> 4.5 gccs / toolchains, so this is important but not urgent to the point
> of rushing decisions.

Agree.

Btw, I tried to send a summary of our issues to the Codesourcery 
mailing list:

http://www.codesourcery.com/archives/arm-gnu/msg03989.html

Let's see if we get an answer.

Best regards

Dirk
Wolfgang Denk - Dec. 22, 2010, 7:52 a.m.
Dear Dirk Behme,

In message <4D11A65E.8040200@googlemail.com> you wrote:
>
> Btw, I tried to send a summary of our issues to the Codesourcery =
> 
> mailing list:
> 
> http://www.codesourcery.com/archives/arm-gnu/msg03989.html
> 
> Let's see if we get an answer.

You got one:
http://www.codesourcery.com/archives/arm-gnu/msg03990.html

And it sounds like a reasonable explanation to me.

Best regards,

Wolfgang Denk
Alexander Holler - Dec. 22, 2010, 9:56 a.m.
Am 22.12.2010 08:02, schrieb Albert ARIBAUD:
> Le 22/12/2010 01:11, Alexander Holler a écrit :
>> Am 21.12.2010 21:04, schrieb Dirk Behme:
>>> On 21.12.2010 20:52, Wolfgang Denk wrote:
>>>> Dear Albert&     friends,
>>>>
>>>> what is your opinion?  Should we include the memory barrier patch into
>>>> the upcoming release (and eventually delay it for further testing), or
>>>> release as is and solve this issue in the next release?
>>>>
>>>> I tend to leave it as is, as I expect that most people will disappear
>>>> in the next few days for holidays, so no much testing will be done
>>>> anyway, and we then can solve this with less pressure in the next
>>>> release - but I'm not really sure if this is a good idea?
>>>
>>> I somehow tend to leave it as is, too.
>>>
>>> We have issues with some recent compilers. For these we found a fix
>>> using the io.h somehow the same way the Linux kernel does. But this
>>> introduces new issues for us, we haven't found a proper fix yet
>>> (except changing the code to the 'old' io.h style). But we don't know
>>> where we might have this issue additionally, yet.
>>
>> The only real problem found with that patch was one with a register
>> which doesn't like an (unmotivated) read after write.
>
> Yes, and this is enough for me to not want it right away: we caught this
> one, but how many others, so far unseen, will creep up?
>
>> On the other side, without that patch, using gcc>= 4.5.x (at least on
>> arm) proved to fail. In contrast to that problem of gcc 4.5.x ignoring
>> that volatile, 4.5.x still fixes many bugs for arm and gcc>= 4.5.x is
>> necessary for hardfloat. So it's likely that more people will start
>> using 4.5.x (4.5.2 was just released).
>
> Do we need hard floating point in u-boot? IIRC, and unless this changed,
> the kernel does not want floating point, so I wonder why u-boot would.

Most people won't use U-Boot as base for the decision which compiler 
version to use.

> As for getting 4.5 to work, for the next cycle people can still use pre
> 4.5 gccs / toolchains, so this is important but not urgent to the point
> of rushing decisions.

I've just written down my opinion.

Regards,

Alexander
Dirk Behme - Dec. 23, 2010, 4:40 p.m.
On 22.12.2010 08:52, Wolfgang Denk wrote:
> Dear Dirk Behme,
>
> In message<4D11A65E.8040200@googlemail.com>  you wrote:
>>
>> Btw, I tried to send a summary of our issues to the Codesourcery =
>>
>> mailing list:
>>
>> http://www.codesourcery.com/archives/arm-gnu/msg03989.html
>>
>> Let's see if we get an answer.
>
> You got one:
> http://www.codesourcery.com/archives/arm-gnu/msg03990.html
>
> And it sounds like a reasonable explanation to me.

An additional answer, mainly covering the question why the volatile is 
optimized away:

http://www.codesourcery.com/archives/arm-gnu/msg03999.html

Best regards

Dirk

Patch

diff --git a/board/ti/beagle/beagle.c b/board/ti/beagle/beagle.c
index d9b6f01..c066d6e 100644
--- a/board/ti/beagle/beagle.c
+++ b/board/ti/beagle/beagle.c
@@ -51,6 +51,8 @@ 

 #define BEAGLE_NO_EEPROM               0xffffffff

+DECLARE_GLOBAL_DATA_PTR;
+
 static struct {
        unsigned int device_vendor;
        unsigned char revision;
@@ -66,8 +68,6 @@  static struct {
  */
 int board_init(void)
 {
-       DECLARE_GLOBAL_DATA_PTR;
-
        gpmc_init(); /* in SRAM or SDRAM, finish GPMC */
        /* board id for Linux */
        gd->bd->bi_arch_number = MACH_TYPE_OMAP3_BEAGLE;