diff mbox

[U-Boot,RFC,v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.

Message ID AANLkTiksYKRt7DMsVwqA--WAachCfJuC65v-XgotLLyS@mail.gmail.com
State Not Applicable
Headers show

Commit Message

John Rigby Dec. 20, 2010, 6:07 a.m. UTC
On Sun, Dec 19, 2010 at 9:18 PM, John Rigby <john.rigby@linaro.org> wrote:
> On Sun, Dec 19, 2010 at 5:56 PM, Alexander Holler <holler@ahsoftware.de> wrote:
>> Am 20.12.2010 01:39, schrieb John Rigby:
>>>
>>> On Sun, Dec 19, 2010 at 12:59 PM, Alexander Holler<holler@ahsoftware.de>
>>>  wrote:
>>> ...
>>>>
>>>> No EEPROM on expansion board
>>>> Die ID #062a000400000000040365fa16019019
>>>> Hit any key to stop autoboot:  0
>>>> OMAP3 beagleboard.org # nand info
>>>>
>>>> Device 0: nand0, sector size 16 KiB
>>>> --------------------------------
>>>>
>>> I get the same output  without your change.  My gcc is linaro 4.4.5.
>>> I'll do some bisecting and try to find out what is going on.
>>
>> Bisecting won't help you here. Not if the problem was always there (which is
>> what I assume
> Sorry, I was confused about my results.
>
> If I replace include <asm/io.h> in drivers/mtd/nand/omap_gpmc.c with a
> copy of the original called orig_io.h:
> #include "orig_io.h"
>
> Nand starts working again.  So the problem seems to be isolated to this file.
>>
>> Regards,
>>
>> Alexander
>>
>

With your patch and the following hack nand works:


 /*

The working assembly looks like this:

	if (cmd != NAND_CMD_NONE)
80024d28:	e3710001 	cmn	r1, #1
		origwriteb(cmd, this->IO_ADDR_W);
80024d2c:	15933004 	ldrne	r3, [r3, #4]
80024d30:	120110ff 	andne	r1, r1, #255	; 0xff
80024d34:	15c31000 	strbne	r1, [r3]
80024d38:	e8bd8010 	pop	{r4, pc}

The broken assembly looks like this:

	if (cmd != NAND_CMD_NONE)
80024d28:	e3710001 	cmn	r1, #1
80024d2c:	08bd8010 	popeq	{r4, pc}
		writeb(cmd, this->IO_ADDR_W);
80024d30:	e5933004 	ldr	r3, [r3, #4]
80024d34:	e20110ff 	and	r1, r1, #255	; 0xff
80024d38:	e5c31000 	strb	r1, [r3]
80024d3c:	e5d33000 	ldrb	r3, [r3]
80024d40:	e8bd8010 	pop	{r4, pc}

This is with gcc version "(Ubuntu/Linaro 4.4.4-14ubuntu4) 4.4.5".
I'll try a 4.5.2 version next and see what happens.

br,

John

Comments

Dirk Behme Dec. 20, 2010, 6:49 a.m. UTC | #1
On 20.12.2010 07:07, John Rigby wrote:
> On Sun, Dec 19, 2010 at 9:18 PM, John Rigby<john.rigby@linaro.org>  wrote:
>> On Sun, Dec 19, 2010 at 5:56 PM, Alexander Holler<holler@ahsoftware.de>  wrote:
>>> Am 20.12.2010 01:39, schrieb John Rigby:
>>>>
>>>> On Sun, Dec 19, 2010 at 12:59 PM, Alexander Holler<holler@ahsoftware.de>
>>>>   wrote:
>>>> ...
>>>>>
>>>>> No EEPROM on expansion board
>>>>> Die ID #062a000400000000040365fa16019019
>>>>> Hit any key to stop autoboot:  0
>>>>> OMAP3 beagleboard.org # nand info
>>>>>
>>>>> Device 0: nand0, sector size 16 KiB
>>>>> --------------------------------
>>>>>
>>>> I get the same output  without your change.  My gcc is linaro 4.4.5.
>>>> I'll do some bisecting and try to find out what is going on.
>>>
>>> Bisecting won't help you here. Not if the problem was always there (which is
>>> what I assume
>> Sorry, I was confused about my results.
>>
>> If I replace include<asm/io.h>  in drivers/mtd/nand/omap_gpmc.c with a
>> copy of the original called orig_io.h:
>> #include "orig_io.h"
>>
>> Nand starts working again.  So the problem seems to be isolated to this file.
>>>
>>> Regards,
>>>
>>> Alexander
>>>
>>
>
> With your patch and the following hack nand works:
>
>
> diff --git a/drivers/mtd/nand/omap_gpmc.c b/drivers/mtd/nand/omap_gpmc.c
> index 99b9cef..5e94155 100644
> --- a/drivers/mtd/nand/omap_gpmc.c
> +++ b/drivers/mtd/nand/omap_gpmc.c
> @@ -29,6 +29,8 @@
>   #include<linux/mtd/nand_ecc.h>
>   #include<nand.h>
>
> +#define origwriteb(v,a)                        __arch_putb(v,a)
> +
>   static uint8_t cs;
>   static struct nand_ecclayout hw_nand_oob = GPMC_NAND_HW_ECC_LAYOUT;
>
> @@ -58,7 +60,7 @@ static void omap_nand_hwcontrol(struct mtd_info
> *mtd, int32_t cmd,
>          }
>
>          if (cmd != NAND_CMD_NONE)
> -               writeb(cmd, this->IO_ADDR_W);
> +               origwriteb(cmd, this->IO_ADDR_W);
>   }
>
>   /*
>
> The working assembly looks like this:
>
> 	if (cmd != NAND_CMD_NONE)
> 80024d28:	e3710001 	cmn	r1, #1
> 		origwriteb(cmd, this->IO_ADDR_W);
> 80024d2c:	15933004 	ldrne	r3, [r3, #4]
> 80024d30:	120110ff 	andne	r1, r1, #255	; 0xff
> 80024d34:	15c31000 	strbne	r1, [r3]
> 80024d38:	e8bd8010 	pop	{r4, pc}
>
> The broken assembly looks like this:
>
> 	if (cmd != NAND_CMD_NONE)
> 80024d28:	e3710001 	cmn	r1, #1
> 80024d2c:	08bd8010 	popeq	{r4, pc}
> 		writeb(cmd, this->IO_ADDR_W);
> 80024d30:	e5933004 	ldr	r3, [r3, #4]
> 80024d34:	e20110ff 	and	r1, r1, #255	; 0xff
> 80024d38:	e5c31000 	strb	r1, [r3]
> 80024d3c:	e5d33000 	ldrb	r3, [r3]
> 80024d40:	e8bd8010 	pop	{r4, pc}

Hmm. From functionality point of view, the 'broken' assembly below 
should to the same as the working assembly, above. The main difference 
is the 'popeq	{r4, pc}' and the additional 'ldrb	r3, [r3]'. The write 
to the HW 'strb	r1, [r3]' is there, so it should work. Is this 
understanding correct?

If it's correct, the question is, what breaks the below assembly? The 
popeq or the additional ldrb? The popeq looks correct, but why is the 
additional ldrb there?

For some further debugging, I had two ideas: Modifying the resulting 
binary with a hex editor and replacing the ldrb with a nop (if I 
remember correctly the hex code for a nop is ffffffff?). Or modifying 
the the C code and adding a barrier behind the writeb(). E.g.

if (cmd != NAND_CMD_NONE);
         writeb(cmd, this->IO_ADDR_W);
         __asm__ __volatile__ ("dsb" : : : "memory");
}

Best regards

Dirk
John Rigby Dec. 20, 2010, 7:37 a.m. UTC | #2
On Sun, Dec 19, 2010 at 11:49 PM, Dirk Behme <dirk.behme@googlemail.com> wrote:
> On 20.12.2010 07:07, John Rigby wrote:
>> The working assembly looks like this:
>>
>>        if (cmd != NAND_CMD_NONE)
>> 80024d28:       e3710001        cmn     r1, #1
>>                origwriteb(cmd, this->IO_ADDR_W);
>> 80024d2c:       15933004        ldrne   r3, [r3, #4]
>> 80024d30:       120110ff        andne   r1, r1, #255    ; 0xff
>> 80024d34:       15c31000        strbne  r1, [r3]
>> 80024d38:       e8bd8010        pop     {r4, pc}
>>
>> The broken assembly looks like this:
>>
>>        if (cmd != NAND_CMD_NONE)
>> 80024d28:       e3710001        cmn     r1, #1
>> 80024d2c:       08bd8010        popeq   {r4, pc}
>>                writeb(cmd, this->IO_ADDR_W);
>> 80024d30:       e5933004        ldr     r3, [r3, #4]
>> 80024d34:       e20110ff        and     r1, r1, #255    ; 0xff
>> 80024d38:       e5c31000        strb    r1, [r3]
>> 80024d3c:       e5d33000        ldrb    r3, [r3]
>> 80024d40:       e8bd8010        pop     {r4, pc}
>
> Hmm. From functionality point of view, the 'broken' assembly below should to
> the same as the working assembly, above. The main difference is the 'popeq
> {r4, pc}' and the additional 'ldrb      r3, [r3]'. The write to the HW 'strb
>    r1, [r3]' is there, so it should work. Is this understanding correct?
>
> If it's correct, the question is, what breaks the below assembly? The popeq
> or the additional ldrb? The popeq looks correct, but why is the additional
> ldrb there?
>

I can't answer why the ldrb is there but I'm pretty sure it is the
problem.  From the TRM
http://focus.ti.com/lit/ug/spruf98m/spruf98m.pdf:

Only write accesses must be issued to these locations, but the GPMC
does not discard any read access. Accessing a NAND device with nOE and
CLE or ALE asserted (read access) can produce undefined results.

br,
John
John Rigby Dec. 20, 2010, 4:08 p.m. UTC | #3
Earlier in this thread Alexander said:
> I haven't add the definitions which are using a memory barrier because I haven't found
> a place in the kernel where they were actually enabled
> (CONFIG_ARM_DMA_MEM_BUFFERABLE).

I think this is the problem because it is indeed defined for all v6
and v7 arm platforms.  Here is the config snippet from
arch/arm/mm/Kconfig:

config ARM_DMA_MEM_BUFFERABLE
	bool "Use non-cacheable memory for DMA" if CPU_V6 && !CPU_V7
	depends on !(MACH_REALVIEW_PB1176 || REALVIEW_EB_ARM11MP || \
		     MACH_REALVIEW_PB11MP)
	default y if CPU_V6 || CPU_V7
	help
	  Historically, the kernel has used strongly ordered mappings to
	  provide DMA coherent memory.  With the advent of ARMv7, mapping
	  memory with differing types results in unpredictable behaviour,
	  so on these CPUs, this option is forced on.

	  Multiple mappings with differing attributes is also unpredictable
	  on ARMv6 CPUs, but since they do not have aggressive speculative
	  prefetch, no harm appears to occur.

	  However, drivers may be missing the necessary barriers for ARMv6,
	  and therefore turning this on may result in unpredictable driver
	  behaviour.  Therefore, we offer this as an option.

	  You are recommended say 'Y' here and debug any affected drivers.
John Rigby Dec. 20, 2010, 4:12 p.m. UTC | #4
On Mon, Dec 20, 2010 at 9:08 AM, John Rigby <john.rigby@linaro.org> wrote:
> Earlier in this thread Alexander said:
>> I haven't add the definitions which are using a memory barrier because I haven't found
>> a place in the kernel where they were actually enabled
>> (CONFIG_ARM_DMA_MEM_BUFFERABLE).
>
> I think this is the problem because it is indeed defined for all v6
> and v7 arm platforms.  Here is the config snippet from
> arch/arm/mm/Kconfig:
>
> config ARM_DMA_MEM_BUFFERABLE
>        bool "Use non-cacheable memory for DMA" if CPU_V6 && !CPU_V7
>        depends on !(MACH_REALVIEW_PB1176 || REALVIEW_EB_ARM11MP || \
>                     MACH_REALVIEW_PB11MP)
>        default y if CPU_V6 || CPU_V7
>        help
>          Historically, the kernel has used strongly ordered mappings to
>          provide DMA coherent memory.  With the advent of ARMv7, mapping
>          memory with differing types results in unpredictable behaviour,
>          so on these CPUs, this option is forced on.

On second thought maybe this is noise for us in u-boot without
cacheable mappings?  Sorry for the noise.

br,

John
Alexander Holler Dec. 20, 2010, 5:12 p.m. UTC | #5
Hello,

Am 20.12.2010 07:07, schrieb John Rigby:
> With your patch and the following hack nand works:
>
>
> diff --git a/drivers/mtd/nand/omap_gpmc.c b/drivers/mtd/nand/omap_gpmc.c
> index 99b9cef..5e94155 100644
> --- a/drivers/mtd/nand/omap_gpmc.c
> +++ b/drivers/mtd/nand/omap_gpmc.c
> @@ -29,6 +29,8 @@
>   #include<linux/mtd/nand_ecc.h>
>   #include<nand.h>
>
> +#define origwriteb(v,a)                        __arch_putb(v,a)
> +
>   static uint8_t cs;
>   static struct nand_ecclayout hw_nand_oob = GPMC_NAND_HW_ECC_LAYOUT;
>
> @@ -58,7 +60,7 @@ static void omap_nand_hwcontrol(struct mtd_info
> *mtd, int32_t cmd,
>          }
>
>          if (cmd != NAND_CMD_NONE)
> -               writeb(cmd, this->IO_ADDR_W);
> +               origwriteb(cmd, this->IO_ADDR_W);
>   }
>
>   /*
>
> The working assembly looks like this:
>
> 	if (cmd != NAND_CMD_NONE)
> 80024d28:	e3710001 	cmn	r1, #1
> 		origwriteb(cmd, this->IO_ADDR_W);
> 80024d2c:	15933004 	ldrne	r3, [r3, #4]
> 80024d30:	120110ff 	andne	r1, r1, #255	; 0xff
> 80024d34:	15c31000 	strbne	r1, [r3]
> 80024d38:	e8bd8010 	pop	{r4, pc}
>
> The broken assembly looks like this:
>
> 	if (cmd != NAND_CMD_NONE)
> 80024d28:	e3710001 	cmn	r1, #1
> 80024d2c:	08bd8010 	popeq	{r4, pc}
> 		writeb(cmd, this->IO_ADDR_W);
> 80024d30:	e5933004 	ldr	r3, [r3, #4]
> 80024d34:	e20110ff 	and	r1, r1, #255	; 0xff
> 80024d38:	e5c31000 	strb	r1, [r3]
> 80024d3c:	e5d33000 	ldrb	r3, [r3]
> 80024d40:	e8bd8010 	pop	{r4, pc}
>
> This is with gcc version "(Ubuntu/Linaro 4.4.4-14ubuntu4) 4.4.5".
> I'll try a 4.5.2 version next and see what happens.

There must be more problems. Using gcc 4.5.1, the read*/write*-patch and 
your hack, my kernel doesn't boot. Using gcc 4.3.5 and the same source 
to compile u-boot the kernel comes up. Here is the output for the 
non-working u-boot:

----------------
U-Boot 2010.12-rc3-00015-g3ae9687-dirty (Dec 20 2010 - 18:01:41, gcc 4.5.1)

OMAP3530-GP ES3.1, CPU-OPP2, L3-165MHz, Max CPU Clock 720 mHz
OMAP3 Beagle board + LPDDR/NAND
I2C:   ready
DRAM:  256 MiB
NAND:  256 MiB
MMC:   OMAP SD/MMC: 0
In:    serial
Out:   serial
Err:   serial
Beagle Rev C4
timed out in wait_for_pin: I2C_STAT=0
No EEPROM on expansion board
Die ID #062a000400000000040365fa16019019
Hit any key to stop autoboot:  0
reading boot.scr

422 bytes read
Running bootscript from mmc ...
## Executing script at 82000000
reading uImage

2419940 bytes read
Booting from mmc ...
## Booting kernel from Legacy Image at 82000000 ...
    Image Name:   Linux-2.6.37-rc5-beagleboard-000
    Image Type:   ARM Linux Kernel Image (uncompressed)
    Data Size:    2419876 Bytes = 2.3 MiB
    Load Address: 80008000
    Entry Point:  80008000
    Verifying Checksum ... OK
    Loading Kernel Image ... OK
OK
----------------

Nothing else.

Regards,

Alexander
John Rigby Dec. 21, 2010, 12:25 a.m. UTC | #6
On Mon, Dec 20, 2010 at 10:12 AM, Alexander Holler <holler@ahsoftware.de> wrote:

> There must be more problems. Using gcc 4.5.1, the read*/write*-patch and
> your hack, my kernel doesn't boot. Using gcc 4.3.5 and the same source to
> compile u-boot the kernel comes up. Here is the output for the non-working
> u-boot:
>
> ----------------
> U-Boot 2010.12-rc3-00015-g3ae9687-dirty (Dec 20 2010 - 18:01:41, gcc 4.5.1)
>
> OMAP3530-GP ES3.1, CPU-OPP2, L3-165MHz, Max CPU Clock 720 mHz
> OMAP3 Beagle board + LPDDR/NAND
> I2C:   ready
> DRAM:  256 MiB
> NAND:  256 MiB
> MMC:   OMAP SD/MMC: 0
> In:    serial
> Out:   serial
> Err:   serial
> Beagle Rev C4
> timed out in wait_for_pin: I2C_STAT=0
> No EEPROM on expansion board
> Die ID #062a000400000000040365fa16019019
> Hit any key to stop autoboot:  0
> reading boot.scr
>
> 422 bytes read
> Running bootscript from mmc ...
> ## Executing script at 82000000
> reading uImage
>
> 2419940 bytes read
> Booting from mmc ...
> ## Booting kernel from Legacy Image at 82000000 ...
>   Image Name:   Linux-2.6.37-rc5-beagleboard-000
>   Image Type:   ARM Linux Kernel Image (uncompressed)
>   Data Size:    2419876 Bytes = 2.3 MiB
>   Load Address: 80008000
>   Entry Point:  80008000
>   Verifying Checksum ... OK
>   Loading Kernel Image ... OK
> OK
> ----------------
>
> Nothing else.
>
> Regards,
>
> Alexander
>

Yes, you are correct, I see the same here with 4.5.2.  I noticed that
bd did not have correct values of machine type and boot params:

bd address  = 0x8FF24FE0
arch_number = 0xFF0084FF
boot_params = 0xBB2000FE
DRAM bank   = 0x00000000
-> start    = 0x80000000
-> size     = 0x08000000
DRAM bank   = 0x00000001
-> start    = 0x88000000
-> size     = 0x08000000
baudrate    = 115200 bps
TLB addr    = 0x8FFF0000
relocaddr   = 0x8FF85000
reloc off   = 0x0FF7D000
irq_sp      = 0x8FF24F68
sp start    = 0x8FF24F60
FB base     = 0x00000000

If we then look at board_init in beagle.c the problem is obvious:

800331ac <board_init>:
800331ac:       e92d4008        push    {r3, lr}
800331b0:       ebff5a74        bl      80009b88 <gpmc_init>
800331b4:       e3a00000        mov     r0, #0
800331b8:       e5983000        ldr     r3, [r8]
800331bc:       e5983000        ldr     r3, [r8]
800331c0:       e8bd8008        pop     {r3, pc}

Here is with source mingled in:
800331ac <board_init>:
/*
 * Routine: board_init
 * Description: Early hardware init.
 */
int board_init(void)
{
800331ac:       e92d4008        push    {r3, lr}
        DECLARE_GLOBAL_DATA_PTR;

        gpmc_init(); /* in SRAM or SDRAM, finish GPMC */
800331b0:       ebff5a74        bl      80009b88 <gpmc_init>
        gd->bd->bi_arch_number = MACH_TYPE_OMAP3_BEAGLE;
        /* boot param addr */
        gd->bd->bi_boot_params = (OMAP34XX_SDRC_CS0 + 0x100);

        return 0;
}
800331b4:       e3a00000        mov     r0, #0
{
        DECLARE_GLOBAL_DATA_PTR;

        gpmc_init(); /* in SRAM or SDRAM, finish GPMC */
        /* board id for Linux */
        gd->bd->bi_arch_number = MACH_TYPE_OMAP3_BEAGLE;
800331b8:       e5983000        ldr     r3, [r8]
        /* boot param addr */
        gd->bd->bi_boot_params = (OMAP34XX_SDRC_CS0 + 0x100);
800331bc:       e5983000        ldr     r3, [r8]

        return 0;
}
800331c0:       e8bd8008        pop     {r3, pc}

br,

John
Alexander Holler Dec. 21, 2010, 1:38 p.m. UTC | #7
Hello,

Am 21.12.2010 01:25, schrieb John Rigby:
> On Mon, Dec 20, 2010 at 10:12 AM, Alexander Holler<holler@ahsoftware.de>  wrote:
>
>> There must be more problems. Using gcc 4.5.1, the read*/write*-patch and
...
> Yes, you are correct, I see the same here with 4.5.2.  I noticed that
> bd did not have correct values of machine type and boot params:
>
> bd address  = 0x8FF24FE0
...

Great!

Thanks a lot for searching and finding that.

I've just tested an u-boot for the BeagleBoard with your "Move 
DECLARE..."-patch compiled with gcc 4.5.1 and it now boots. ;)

I remember having seen those two DECLARE_GLOBAL_DATA_PTR in beagle.c 
too, but I was to lazy at that time to check if that has (bad) 
consequences. ;)

Regards,

Alexander
Alexander Holler Jan. 17, 2011, 4:35 a.m. UTC | #8
Hello,

Am 20.12.2010 17:08, schrieb John Rigby:
> Earlier in this thread Alexander said:
>> I haven't add the definitions which are using a memory barrier because I haven't found
>> a place in the kernel where they were actually enabled
>> (CONFIG_ARM_DMA_MEM_BUFFERABLE).

Because I've just run again into such a "search problem":

Don't use "git grep" on a kernel configured for x86, when you are 
searching an option for another architecture. ;)

Regards,

Alexander
diff mbox

Patch

diff --git a/drivers/mtd/nand/omap_gpmc.c b/drivers/mtd/nand/omap_gpmc.c
index 99b9cef..5e94155 100644
--- a/drivers/mtd/nand/omap_gpmc.c
+++ b/drivers/mtd/nand/omap_gpmc.c
@@ -29,6 +29,8 @@ 
 #include <linux/mtd/nand_ecc.h>
 #include <nand.h>

+#define origwriteb(v,a)                        __arch_putb(v,a)
+
 static uint8_t cs;
 static struct nand_ecclayout hw_nand_oob = GPMC_NAND_HW_ECC_LAYOUT;

@@ -58,7 +60,7 @@  static void omap_nand_hwcontrol(struct mtd_info
*mtd, int32_t cmd,
        }

        if (cmd != NAND_CMD_NONE)
-               writeb(cmd, this->IO_ADDR_W);
+               origwriteb(cmd, this->IO_ADDR_W);
 }