diff mbox

[U-Boot,v3] ARM: Avoid compiler optimization for usages of readb, writeb and friends.

Message ID 4D1F1841.5060508@googlemail.com
State Changes Requested

Commit Message

Dirk Behme Jan. 1, 2011, 12:04 p.m. UTC
On 22.12.2010 12:04, Alexander Holler wrote:
> gcc 4.5.1 seems to ignore (at least some) volatile definitions,
> avoid that as done in the kernel.
>
> Reading C99 6.7.3 8 and the comment 114) there, I think it is a bug of that
> gcc version to ignore the volatile type qualifier used e.g. in __arch_getl().
> Anyway, using a definition as in the kernel headers avoids such optimizations when
> gcc 4.5.1 is used.
>
> Maybe the headers as used in the current linux-kernel should be used,
> but to avoid large changes, I've just added a small change to the current headers.
>
> I haven't add the definitions which are using a memory barrier because I haven't found
> a place in the kernel where they were actually enabled (CONFIG_ARM_DMA_MEM_BUFFERABLE).
>
> Signed-off-by: Alexander Holler<holler@ahsoftware.de>
> ---
>   arch/arm/include/asm/io.h |   20 ++++++++++++++------
>   1 files changed, 14 insertions(+), 6 deletions(-)
>
> diff --git a/arch/arm/include/asm/io.h b/arch/arm/include/asm/io.h
> index ff1518e..068ed17 100644
> --- a/arch/arm/include/asm/io.h
> +++ b/arch/arm/include/asm/io.h
> @@ -125,13 +125,21 @@ extern inline void __raw_readsl(unsigned int addr, void *data, int longlen)
>   #define __raw_readw(a)			__arch_getw(a)
>   #define __raw_readl(a)			__arch_getl(a)
>
> -#define writeb(v,a)			__arch_putb(v,a)
> -#define writew(v,a)			__arch_putw(v,a)
> -#define writel(v,a)			__arch_putl(v,a)
> +/*
> + * TODO: The kernel offers some more advanced versions of barriers, it might
> + * have some advantages to use them instead of the simple one here.
> + */
> +#define dmb()				__asm__ __volatile__ ("" : : : "memory")
> +#define __iormb()			dmb()
> +#define __iowmb()			dmb()
> +
> +#define writeb(v,c)			do { __iowmb(); __arch_putb(v,c); } while (0)
> +#define writew(v,c)			do { __iowmb(); __arch_putw(v,c); } while (0)
> +#define writel(v,c)			do { __iowmb(); __arch_putl(v,c); } while (0)
>
> -#define readb(a)			__arch_getb(a)
> -#define readw(a)			__arch_getw(a)
> -#define readl(a)			__arch_getl(a)
> +#define readb(c)			({ u8  __v = __arch_getb(c); __iormb(); __v; })
> +#define readw(c)			({ u16 __v = __arch_getw(c); __iormb(); __v; })
> +#define readl(c)			({ u32 __v = __arch_getl(c); __iormb(); __v; })

Would you like to test the attached patch? I named it 'v4'.

After some thinking and testing, it seems to me that the volatile
optimization issue this patch is meant to fix affects only the readx()
macros. So the idea is to drop all writex() changes made in the v3
version of this patch. Dropping the writex() changes would also
drop all the issues we discussed, e.g. with the GCC
statement-expression and the do-while workaround.

Thanks

Dirk
Subject: [PATCH v4] ARM: Avoid compiler optimization for usages of readb and friends.

gcc 4.5.1 seems to ignore (at least some) volatile definitions,
avoid that as done in the kernel.

Reading C99 6.7.3 8 and the comment 114) there, I think it is a bug of that
gcc version to ignore the volatile type qualifier used e.g. in __arch_getl().
Anyway, using a definition as in the kernel headers avoids such optimizations when
gcc 4.5.1 is used.

Maybe the headers as used in the current linux-kernel should be used,
but to avoid large changes, I've just added a small change to the current headers.

I haven't added the definitions that use a memory barrier because I haven't found
a place in the kernel where they were actually enabled (CONFIG_ARM_DMA_MEM_BUFFERABLE).

Signed-off-by: Alexander Holler <holler@ahsoftware.de>
Signed-off-by: Wolfgang Denk <wd@denx.de>
Signed-off-by: Dirk Behme <dirk.behme@googlemail.com>
---

Changes since v3: Drop all changes to writex(). It seems that
the compiler issue affects only readx(), so we don't have to
touch the writex() macros. By not touching the writex()
macros, we also avoid the issues that were introduced by
changing them.

Note: Tested by compilation only, not tested on real HW.

 arch/arm/include/asm/io.h |   12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

Comments

Alexander Holler Jan. 1, 2011, 5:52 p.m. UTC | #1
Hello,

On 01.01.2011 13:04, Dirk Behme wrote:
> On 22.12.2010 12:04, Alexander Holler wrote:
>> gcc 4.5.1 seems to ignore (at least some) volatile definitions,
>> avoid that as done in the kernel.
>>
>> Reading C99 6.7.3 8 and the comment 114) there, I think it is a bug of
>> that
>> gcc version to ignore the volatile type qualifier used e.g. in
>> __arch_getl().
>> Anyway, using a definition as in the kernel headers avoids such
>> optimizations when
>> gcc 4.5.1 is used.
>>
>> Maybe the headers as used in the current linux-kernel should be used,
>> but to avoid large changes, I've just added a small change to the
>> current headers.

> Do you like to test the patch in the attachment? I named it 'v4'.
>
> After some thinking and testing, it seems to me that the volatile
> optimization issue this patch shall fix is only with the readx() macros.
> So the idea is to drop all writex() changes done in the v3 version of
> this patch. With dropping the writex() changes, we would drop all issues
> we discussed with e.g. the GCC statement-expression and the do while
> workaround, too.

I've come across a bug report which reads as if the problem might be
fixed in gcc 4.5.2:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45052

I will test gcc 4.5.2 in the next few days.

Besides that, I still think the correct solution would be to use the
ARM headers as found in the current Linux kernel. The problem is that I
don't know (haven't looked up) the reasons for the changes made to the
ARM Linux headers as currently found in U-Boot.

And because updating those headers might require further changes in
various other places in U-Boot, I think it would be good if one of the
U-Boot ARM maintainers did that. I'm not that involved in U-Boot
development, don't follow the mailing list closely, and might therefore
miss necessary changes when taking the current ARM headers from the
kernel and dropping them into U-Boot.

Regards,

Alexander
Dirk Behme Jan. 1, 2011, 6:25 p.m. UTC | #2
On 01.01.2011 18:52, Alexander Holler wrote:
> Hello,
>
> Am 01.01.2011 13:04, schrieb Dirk Behme:
>> On 22.12.2010 12:04, Alexander Holler wrote:
>>> gcc 4.5.1 seems to ignore (at least some) volatile definitions,
>>> avoid that as done in the kernel.
>>>
>>> Reading C99 6.7.3 8 and the comment 114) there, I think it is a bug of
>>> that
>>> gcc version to ignore the volatile type qualifier used e.g. in
>>> __arch_getl().
>>> Anyway, using a definition as in the kernel headers avoids such
>>> optimizations when
>>> gcc 4.5.1 is used.
>>>
>>> Maybe the headers as used in the current linux-kernel should be used,
>>> but to avoid large changes, I've just added a small change to the
>>> current headers.
>
>> Do you like to test the patch in the attachment? I named it 'v4'.
>>
>> After some thinking and testing, it seems to me that the volatile
>> optimization issue this patch shall fix is only with the readx()
>> macros.
>> So the idea is to drop all writex() changes done in the v3 version of
>> this patch. With dropping the writex() changes, we would drop all
>> issues
>> we discussed with e.g. the GCC statement-expression and the do while
>> workaround, too.
>
> I've come across a bug which reads as the problem might be fixed in
> gcc 4.5.2:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45052
>
> I will test gcc 4.5.2 in the next days.

Have you been able to test v4 of the patch I sent with gcc 4.5.1?

Thanks

Dirk
Alexander Holler Jan. 1, 2011, 6:47 p.m. UTC | #3
On 01.01.2011 19:25, Dirk Behme wrote:
> On 01.01.2011 18:52, Alexander Holler wrote:
>> Hello,
>>
>> Am 01.01.2011 13:04, schrieb Dirk Behme:
>>> On 22.12.2010 12:04, Alexander Holler wrote:
>>>> gcc 4.5.1 seems to ignore (at least some) volatile definitions,
>>>> avoid that as done in the kernel.
>>>>
>>>> Reading C99 6.7.3 8 and the comment 114) there, I think it is a bug of
>>>> that
>>>> gcc version to ignore the volatile type qualifier used e.g. in
>>>> __arch_getl().
>>>> Anyway, using a definition as in the kernel headers avoids such
>>>> optimizations when
>>>> gcc 4.5.1 is used.
>>>>
>>>> Maybe the headers as used in the current linux-kernel should be used,
>>>> but to avoid large changes, I've just added a small change to the
>>>> current headers.
>>
>>> Do you like to test the patch in the attachment? I named it 'v4'.
>>>
>>> After some thinking and testing, it seems to me that the volatile
>>> optimization issue this patch shall fix is only with the readx()
>>> macros.
>>> So the idea is to drop all writex() changes done in the v3 version of
>>> this patch. With dropping the writex() changes, we would drop all
>>> issues
>>> we discussed with e.g. the GCC statement-expression and the do while
>>> workaround, too.
>>
>> I've come across a bug which reads as the problem might be fixed in
>> gcc 4.5.2:
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45052
>>
>> I will test gcc 4.5.2 in the next days.
>
> Have you been able to test v4 of the patch I sent with gcc 4.5.1?

No, sorry, I don't have a test case for consecutive write* calls and I
will have to write one. I will do so when testing gcc 4.5.2 (sometime in
the next few days).

Regards,

Alexander
Dirk Behme Jan. 1, 2011, 7:21 p.m. UTC | #4
On 01.01.2011 19:47, Alexander Holler wrote:
> Am 01.01.2011 19:25, schrieb Dirk Behme:
>> On 01.01.2011 18:52, Alexander Holler wrote:
>>> Hello,
>>>
>>> Am 01.01.2011 13:04, schrieb Dirk Behme:
>>>> On 22.12.2010 12:04, Alexander Holler wrote:
>>>>> gcc 4.5.1 seems to ignore (at least some) volatile definitions,
>>>>> avoid that as done in the kernel.
>>>>>
>>>>> Reading C99 6.7.3 8 and the comment 114) there, I think it is a
>>>>> bug of
>>>>> that
>>>>> gcc version to ignore the volatile type qualifier used e.g. in
>>>>> __arch_getl().
>>>>> Anyway, using a definition as in the kernel headers avoids such
>>>>> optimizations when
>>>>> gcc 4.5.1 is used.
>>>>>
>>>>> Maybe the headers as used in the current linux-kernel should be
>>>>> used,
>>>>> but to avoid large changes, I've just added a small change to the
>>>>> current headers.
>>>
>>>> Do you like to test the patch in the attachment? I named it 'v4'.
>>>>
>>>> After some thinking and testing, it seems to me that the volatile
>>>> optimization issue this patch shall fix is only with the readx()
>>>> macros.
>>>> So the idea is to drop all writex() changes done in the v3 version of
>>>> this patch. With dropping the writex() changes, we would drop all
>>>> issues
>>>> we discussed with e.g. the GCC statement-expression and the do while
>>>> workaround, too.
>>>
>>> I've come across a bug which reads as the problem might be fixed in
>>> gcc 4.5.2:
>>>
>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45052
>>>
>>> I will test gcc 4.5.2 in the next days.
>>
>> Have you been able to test v4 of the patch I sent with gcc 4.5.1?
>
> No, sorry, I don't have a test case for consequent write* and I will
> have to write one.

?

If I remember correctly, the test case for this patch was compiling
U-Boot with 4.5.1 and then checking

a) whether it boots on the Beagle (correct clock.c)
b) whether NAND works OK (correct omap_gpmc.c)

?

Thanks

Dirk
Alexander Holler Jan. 2, 2011, 12:43 p.m. UTC | #5
On 01.01.2011 20:21, Dirk Behme wrote:
> On 01.01.2011 19:47, Alexander Holler wrote:
>> Am 01.01.2011 19:25, schrieb Dirk Behme:
>>> On 01.01.2011 18:52, Alexander Holler wrote:
>>>> Hello,
>>>>
>>>> Am 01.01.2011 13:04, schrieb Dirk Behme:
>>>>> On 22.12.2010 12:04, Alexander Holler wrote:
>>>>>> gcc 4.5.1 seems to ignore (at least some) volatile definitions,
>>>>>> avoid that as done in the kernel.
>>>>>>
>>>>>> Reading C99 6.7.3 8 and the comment 114) there, I think it is a
>>>>>> bug of
>>>>>> that
>>>>>> gcc version to ignore the volatile type qualifier used e.g. in
>>>>>> __arch_getl().
>>>>>> Anyway, using a definition as in the kernel headers avoids such
>>>>>> optimizations when
>>>>>> gcc 4.5.1 is used.
>>>>>>
>>>>>> Maybe the headers as used in the current linux-kernel should be
>>>>>> used,
>>>>>> but to avoid large changes, I've just added a small change to the
>>>>>> current headers.
>>>>
>>>>> Do you like to test the patch in the attachment? I named it 'v4'.
>>>>>
>>>>> After some thinking and testing, it seems to me that the volatile
>>>>> optimization issue this patch shall fix is only with the readx()
>>>>> macros.
>>>>> So the idea is to drop all writex() changes done in the v3 version of
>>>>> this patch. With dropping the writex() changes, we would drop all
>>>>> issues
>>>>> we discussed with e.g. the GCC statement-expression and the do while
>>>>> workaround, too.
>>>>
>>>> I've come across a bug which reads as the problem might be fixed in
>>>> gcc 4.5.2:
>>>>
>>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45052
>>>>
>>>> I will test gcc 4.5.2 in the next days.
>>>
>>> Have you been able to test v4 of the patch I sent with gcc 4.5.1?
>>
>> No, sorry, I don't have a test case for consequent write* and I will
>> have to write one.
>
> ?
>
> If I remember correctly, the test case for this patch was compiling
> U-Boot with 4.5.1 and then check
>
> a) if it boots at Beagle (correct clock.c)
> b) if NAND works ok (correct omap_gpmc.c)
>
> ?

No. Neither of those necessarily fails when the compiler merges
consecutive write* calls into one write* because it ignores the volatile
keyword. I've only found the problem with consecutive read* calls (in
clock.c), but there might be problems with consecutive write* calls
somewhere else too. So if you remove the change for write*, other
problems might arise, and they might not be found just by booting a
kernel. So I think it would be dangerous to remove the change for write*
when using gcc 4.5.x.

And because the patch fixes only write* and read*, other code in U-Boot
which uses volatile in another context might still fail. Therefore I
vote for using the current kernel headers, where other things besides
read* and write* use those barriers too.
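
The failure mode described above (two consecutive reads of the same
register collapsed into one) and the barrier-based workaround can be
sketched as follows. This is only a minimal illustration, not the U-Boot
header: barrier(), raw_getl(), my_readl() and read_twice_sum() are
hypothetical names chosen for this sketch, and the statement expression
assumes GCC or a compatible compiler.

```c
#include <stdint.h>

/*
 * Compiler-only barrier: emits no instruction, but the "memory" clobber
 * tells the compiler that memory may have changed, so it must not keep
 * cached values or move memory accesses across this point.
 */
#define barrier()	__asm__ __volatile__ ("" : : : "memory")

/* Hypothetical stand-in for __arch_getl(): a plain volatile load. */
#define raw_getl(a)	(*(volatile uint32_t *)(a))

/* Hypothetical stand-in for readl(): volatile load followed by a barrier. */
#define my_readl(a)	({ uint32_t __v = raw_getl(a); barrier(); __v; })

/*
 * Reads the same "register" twice. On real hardware the two loads may
 * return different values, so both must appear in the generated code.
 */
static inline uint32_t read_twice_sum(const void *reg)
{
	uint32_t first = my_readl(reg);
	uint32_t second = my_readl(reg);

	return first + second;
}
```

On ordinary memory both reads return the same value, so running this
cannot show whether a load was dropped; that still requires looking at
the generated assembly (for example with gcc -O2 -S).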

Regards,

Alexander
Dirk Behme Jan. 2, 2011, 1:29 p.m. UTC | #6
On 02.01.2011 13:43, Alexander Holler wrote:
> Am 01.01.2011 20:21, schrieb Dirk Behme:
>> On 01.01.2011 19:47, Alexander Holler wrote:
>>> Am 01.01.2011 19:25, schrieb Dirk Behme:
>>>> On 01.01.2011 18:52, Alexander Holler wrote:
>>>>> Hello,
>>>>>
>>>>> Am 01.01.2011 13:04, schrieb Dirk Behme:
>>>>>> On 22.12.2010 12:04, Alexander Holler wrote:
>>>>>>> gcc 4.5.1 seems to ignore (at least some) volatile definitions,
>>>>>>> avoid that as done in the kernel.
>>>>>>>
>>>>>>> Reading C99 6.7.3 8 and the comment 114) there, I think it is a
>>>>>>> bug of
>>>>>>> that
>>>>>>> gcc version to ignore the volatile type qualifier used e.g. in
>>>>>>> __arch_getl().
>>>>>>> Anyway, using a definition as in the kernel headers avoids such
>>>>>>> optimizations when
>>>>>>> gcc 4.5.1 is used.
>>>>>>>
>>>>>>> Maybe the headers as used in the current linux-kernel should be
>>>>>>> used,
>>>>>>> but to avoid large changes, I've just added a small change to the
>>>>>>> current headers.
>>>>>
>>>>>> Do you like to test the patch in the attachment? I named it 'v4'.
>>>>>>
>>>>>> After some thinking and testing, it seems to me that the volatile
>>>>>> optimization issue this patch shall fix is only with the readx()
>>>>>> macros.
>>>>>> So the idea is to drop all writex() changes done in the v3
>>>>>> version of
>>>>>> this patch. With dropping the writex() changes, we would drop all
>>>>>> issues
>>>>>> we discussed with e.g. the GCC statement-expression and the do
>>>>>> while
>>>>>> workaround, too.
>>>>>
>>>>> I've come across a bug which reads as the problem might be fixed in
>>>>> gcc 4.5.2:
>>>>>
>>>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45052
>>>>>
>>>>> I will test gcc 4.5.2 in the next days.
>>>>
>>>> Have you been able to test v4 of the patch I sent with gcc 4.5.1?
>>>
>>> No, sorry, I don't have a test case for consequent write* and I will
>>> have to write one.
>>
>> ?
>>
>> If I remember correctly, the test case for this patch was compiling
>> U-Boot with 4.5.1 and then check
>>
>> a) if it boots at Beagle (correct clock.c)
>> b) if NAND works ok (correct omap_gpmc.c)
>>
>> ?
>
> No. None of those must fail when the compiler optimizes consequent
> write* to one write* because the compiler ignores the volatile keyword.
> I've only found the problem with consequent read* (in clock.c), but
> there might be problems with consequent write* somewhere else too. So
> if you remove the change for those write* some other problems might
> arise and just through booting a kernel those might not be found. So I
> think it would be dangerous to remove the change for write* when using
> gcc 4.5.x
>
> And because the patch fixes only write* and read* some stuff in u-boot
> which uses volatile in another context might still fail, therefore I
> vote to use the current kernel headers where other things besides
> read* and write* are using those barriers too.

Just so I understand correctly: are you saying that we should ignore
your v3 patch

http://lists.denx.de/pipermail/u-boot/2010-December/084132.html

?

And that you didn't test the v4 patch

http://lists.denx.de/pipermail/u-boot/2011-January/084481.html

with the test you did in

http://lists.denx.de/pipermail/u-boot/2010-December/084134.html

("tested with both gcc 4.3.5 and gcc 4.5.1 using binutils 2.20.1") 
because you now think this test isn't sufficient?

Thanks

Dirk
Alexander Holler Jan. 2, 2011, 9 p.m. UTC | #7
On 02.01.2011 14:29, Dirk Behme wrote:
> On 02.01.2011 13:43, Alexander Holler wrote:
>> Am 01.01.2011 20:21, schrieb Dirk Behme:
>>> On 01.01.2011 19:47, Alexander Holler wrote:
>>>> Am 01.01.2011 19:25, schrieb Dirk Behme:
>>>>> On 01.01.2011 18:52, Alexander Holler wrote:
>>>>>> Hello,
>>>>>>
>>>>>> Am 01.01.2011 13:04, schrieb Dirk Behme:
>>>>>>> On 22.12.2010 12:04, Alexander Holler wrote:
>>>>>>>> gcc 4.5.1 seems to ignore (at least some) volatile definitions,
>>>>>>>> avoid that as done in the kernel.
>>>>>>>>
>>>>>>>> Reading C99 6.7.3 8 and the comment 114) there, I think it is a
>>>>>>>> bug of
>>>>>>>> that
>>>>>>>> gcc version to ignore the volatile type qualifier used e.g. in
>>>>>>>> __arch_getl().
>>>>>>>> Anyway, using a definition as in the kernel headers avoids such
>>>>>>>> optimizations when
>>>>>>>> gcc 4.5.1 is used.
>>>>>>>>
>>>>>>>> Maybe the headers as used in the current linux-kernel should be
>>>>>>>> used,
>>>>>>>> but to avoid large changes, I've just added a small change to the
>>>>>>>> current headers.
>>>>>>
>>>>>>> Do you like to test the patch in the attachment? I named it 'v4'.
>>>>>>>
>>>>>>> After some thinking and testing, it seems to me that the volatile
>>>>>>> optimization issue this patch shall fix is only with the readx()
>>>>>>> macros.
>>>>>>> So the idea is to drop all writex() changes done in the v3
>>>>>>> version of
>>>>>>> this patch. With dropping the writex() changes, we would drop all
>>>>>>> issues
>>>>>>> we discussed with e.g. the GCC statement-expression and the do
>>>>>>> while
>>>>>>> workaround, too.
>>>>>>
>>>>>> I've come across a bug which reads as the problem might be fixed in
>>>>>> gcc 4.5.2:
>>>>>>
>>>>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45052
>>>>>>
>>>>>> I will test gcc 4.5.2 in the next days.
>>>>>
>>>>> Have you been able to test v4 of the patch I sent with gcc 4.5.1?
>>>>
>>>> No, sorry, I don't have a test case for consequent write* and I will
>>>> have to write one.
>>>
>>> ?
>>>
>>> If I remember correctly, the test case for this patch was compiling
>>> U-Boot with 4.5.1 and then check
>>>
>>> a) if it boots at Beagle (correct clock.c)
>>> b) if NAND works ok (correct omap_gpmc.c)
>>>
>>> ?
>>
>> No. None of those must fail when the compiler optimizes consequent
>> write* to one write* because the compiler ignores the volatile keyword.
>> I've only found the problem with consequent read* (in clock.c), but
>> there might be problems with consequent write* somewhere else too. So
>> if you remove the change for those write* some other problems might
>> arise and just through booting a kernel those might not be found. So I
>> think it would be dangerous to remove the change for write* when using
>> gcc 4.5.x
>>
>> And because the patch fixes only write* and read* some stuff in u-boot
>> which uses volatile in another context might still fail, therefore I
>> vote to use the current kernel headers where other things besides
>> read* and write* are using those barriers too.
>
> Just to understand correctly: Do you want to say that we should ignore
> your v3 patch
>
> http://lists.denx.de/pipermail/u-boot/2010-December/084132.html
>
> ?
>
> And that you didn't test the v4 patch
>
> http://lists.denx.de/pipermail/u-boot/2011-January/084481.html
>
> with the test you did in
>
> http://lists.denx.de/pipermail/u-boot/2010-December/084134.html
>
> ("tested with both gcc 4.3.5 and gcc 4.5.1 using binutils 2.20.1")
> because you now think this test isn't sufficient?

Sorry, but I don't understand why you are assuming that the compiler
will apply those (wrong) optimizations only to reads and not to writes.

If the compiler does the same wrong optimizations for writes (why not,
if it ignores volatile), your v4 wouldn't fix that.

Regards,

Alexander
Wolfgang Denk Jan. 9, 2011, 10:25 p.m. UTC | #8
Dear Dirk Behme,

In message <4D1F1841.5060508@googlemail.com> you wrote:
>
> Do you like to test the patch in the attachment? I named it 'v4'.

Please send patches inline.

> After some thinking and testing, it seems to me that the volatile 
> optimization issue this patch shall fix is only with the readx() 
> macros. So the idea is to drop all writex() changes done in the v3 
> version of this patch. With dropping the writex() changes, we would 
> drop all issues we discussed with e.g. the GCC statement-expression 
> and the do while workaround, too.

This makes no sense. Even if we experience problems only with read*()
at the moment, we should do the Right Thing (TM) and fix both the
read*() and write*() functions.

Please have a look at the patch I just posted,
http://patchwork.ozlabs.org/patch/78056/

Best regards,

Wolfgang Denk
Alexander Holler Jan. 10, 2011, 2:53 p.m. UTC | #9
On 02.01.2011 22:00, Alexander Holler wrote:
> On 02.01.2011 14:29, Dirk Behme wrote:
>> On 02.01.2011 13:43, Alexander Holler wrote:
>>> Am 01.01.2011 20:21, schrieb Dirk Behme:
>>>> On 01.01.2011 19:47, Alexander Holler wrote:
>>>>> Am 01.01.2011 19:25, schrieb Dirk Behme:
>>>>>> On 01.01.2011 18:52, Alexander Holler wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>> Am 01.01.2011 13:04, schrieb Dirk Behme:
>>>>>>>> On 22.12.2010 12:04, Alexander Holler wrote:
>>>>>>>>> gcc 4.5.1 seems to ignore (at least some) volatile definitions,
>>>>>>>>> avoid that as done in the kernel.
>>>>>>>>>
>>>>>>>>> Reading C99 6.7.3 8 and the comment 114) there, I think it is a
>>>>>>>>> bug of
>>>>>>>>> that
>>>>>>>>> gcc version to ignore the volatile type qualifier used e.g. in
>>>>>>>>> __arch_getl().
>>>>>>>>> Anyway, using a definition as in the kernel headers avoids such
>>>>>>>>> optimizations when
>>>>>>>>> gcc 4.5.1 is used.
>>>>>>>>>
>>>>>>>>> Maybe the headers as used in the current linux-kernel should be
>>>>>>>>> used,
>>>>>>>>> but to avoid large changes, I've just added a small change to the
>>>>>>>>> current headers.
>>>>>>>
>>>>>>>> Do you like to test the patch in the attachment? I named it 'v4'.
>>>>>>>>
>>>>>>>> After some thinking and testing, it seems to me that the volatile
>>>>>>>> optimization issue this patch shall fix is only with the readx()
>>>>>>>> macros.
>>>>>>>> So the idea is to drop all writex() changes done in the v3
>>>>>>>> version of
>>>>>>>> this patch. With dropping the writex() changes, we would drop all
>>>>>>>> issues
>>>>>>>> we discussed with e.g. the GCC statement-expression and the do
>>>>>>>> while
>>>>>>>> workaround, too.
>>>>>>>
>>>>>>> I've come across a bug which reads as the problem might be fixed in
>>>>>>> gcc 4.5.2:
>>>>>>>
>>>>>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45052
>>>>>>>
>>>>>>> I will test gcc 4.5.2 in the next days.
>>>>>>
>>>>>> Have you been able to test v4 of the patch I sent with gcc 4.5.1?
>>>>>
>>>>> No, sorry, I don't have a test case for consequent write* and I will
>>>>> have to write one.
>>>>
>>>> ?
>>>>
>>>> If I remember correctly, the test case for this patch was compiling
>>>> U-Boot with 4.5.1 and then check
>>>>
>>>> a) if it boots at Beagle (correct clock.c)
>>>> b) if NAND works ok (correct omap_gpmc.c)
>>>>
>>>> ?
>>>
>>> No. None of those must fail when the compiler optimizes consequent
>>> write* to one write* because the compiler ignores the volatile keyword.
>>> I've only found the problem with consequent read* (in clock.c), but
>>> there might be problems with consequent write* somewhere else too. So
>>> if you remove the change for those write* some other problems might
>>> arise and just through booting a kernel those might not be found. So I
>>> think it would be dangerous to remove the change for write* when using
>>> gcc 4.5.x
>>>
>>> And because the patch fixes only write* and read* some stuff in u-boot
>>> which uses volatile in another context might still fail, therefore I
>>> vote to use the current kernel headers where other things besides
>>> read* and write* are using those barriers too.
>>
>> Just to understand correctly: Do you want to say that we should ignore
>> your v3 patch
>>
>> http://lists.denx.de/pipermail/u-boot/2010-December/084132.html
>>
>> ?
>>
>> And that you didn't test the v4 patch
>>
>> http://lists.denx.de/pipermail/u-boot/2011-January/084481.html
>>
>> with the test you did in
>>
>> http://lists.denx.de/pipermail/u-boot/2010-December/084134.html
>>
>> ("tested with both gcc 4.3.5 and gcc 4.5.1 using binutils 2.20.1")
>> because you now think this test isn't sufficient?
>
> Sorry, but I don't understand why you are assuming that the compiler
> will only use those (wrong) optimizations on reads and not writes.
>
> If the compiler does the same wrong optimizations for writes (why not,
> if it ignores volatile), your v4 would'nt fix that.

I've now done some more tests.

First, the bug is fixed in gcc 4.5.2.

And indeed, gcc 4.5.0 and gcc 4.5.1 seem to ignore volatile only for
reads. At least, two writel() calls are not merged into one, whether
only the volatile (as before) or the
"__asm__ __volatile__ ("" : : : "memory")" is used.

Being kind of a defensive programmer, I would still prefer to have
that __asm__ for write* too. That would at least protect us from a
possible bug there as well.

What makes me a bit nervous is that I don't have a clue how to write a
test that checks whether volatile works (without looking at the
generated output). Maybe others have that problem too, and therefore
such a test doesn't exist in the gcc testsuite.

Regards,

Alexander
Wolfgang Denk Jan. 10, 2011, 3:05 p.m. UTC | #10
Dear Alexander Holler,

In message <4D2B1D75.70809@ahsoftware.de> you wrote:
>
> Beeing kind of a defensive programmer, I still would prefer to use have 
> that __asm__ for write* too. That would at least prevent us from a 
> possible bug there too.

So why don't you simply test and, assuming it's working, ACK the patch
I submitted yesterday?  We should be on the safe side, then, and don't
have to care about which mood the current compiler's optimizer might
be in or what the POM is.

Best regards,

Wolfgang Denk
Dirk Behme Jan. 10, 2011, 4:13 p.m. UTC | #11
Dear Wolfgang,

On 09.01.2011 23:25, Wolfgang Denk wrote:
> Dear Dirk Behme,
>
> In message<4D1F1841.5060508@googlemail.com>  you wrote:
>>
>> Do you like to test the patch in the attachment? I named it 'v4'.
>
> Please send patches inline.
>
>> After some thinking and testing, it seems to me that the volatile
>> optimization issue this patch shall fix is only with the readx()
>> macros. So the idea is to drop all writex() changes done in the v3
>> version of this patch. With dropping the writex() changes, we would
>> drop all issues we discussed with e.g. the GCC statement-expression
>> and the do while workaround, too.
>
> This makes no sense. Even if we experience problems only with read*()
> at the moment, we should to the Rigth Thing (TM) and fix both the
> read*() and write*() functions.

The question I was thinking about with my patch was "what's the Right
Thing?" ;)

It's my understanding that we don't fix read*() and write*() because 
they are broken. We touch them to work around a broken tool chain.

We saw that this specific tool chain has issues with read*(). While 
working around this, we touched write*(), too. This was done in the 
wrong way. So while read*() was fine, write*() was accidentally broken 
(with all tool chains), then. So we could

(a) do write*() correctly, too (as you do in your patch below)

or

(b) just don't touch write*() as it isn't needed to work around the 
read*() tool chain issue (as I proposed in my patch v4)
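
One difference between a do-while write macro and a statement-expression
write macro can be sketched as follows. This is a hedged illustration
only: it does not reproduce either patch, the names barrier(),
raw_putl(), writel_stmt(), writel_expr() and write_both() are
hypothetical, and the statement expression assumes GCC or a compatible
compiler.

```c
#include <stdint.h>

/* Compiler-only barrier, as in the macros discussed in this thread. */
#define barrier()	__asm__ __volatile__ ("" : : : "memory")

/* Hypothetical stand-in for __arch_putl(): a plain volatile store. */
#define raw_putl(v, a)	(*(volatile uint32_t *)(a) = (v))

/* Statement form: valid as a statement, but cannot be used as an expression. */
#define writel_stmt(v, a)	do { barrier(); raw_putl(v, a); } while (0)

/* Expression form (GNU C statement expression): yields the value written. */
#define writel_expr(v, a) \
	({ uint32_t __v = (v); barrier(); raw_putl(__v, a); __v; })

static inline uint32_t write_both(uint32_t *reg_a, uint32_t *reg_b)
{
	writel_stmt(1, reg_a);

	/*
	 * The comma operator requires expressions on both sides, so
	 * writel_stmt() would not compile here, while writel_expr() does.
	 */
	return (writel_expr(2, reg_a), writel_expr(3, reg_b));
}
```

Existing callers that use a write macro inside an expression would
therefore only keep compiling with the expression form.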

Anyway:

> Please have a look a the patch I just posted,
> http://patchwork.ozlabs.org/patch/78056/

I'm fine with that patch.

Thanks

Dirk
Alexander Holler Jan. 11, 2011, 3:53 a.m. UTC | #12
On 10.01.2011 16:05, Wolfgang Denk wrote:
> Dear Alexander Holler,
>
> In message<4D2B1D75.70809@ahsoftware.de>  you wrote:
>>
>> Beeing kind of a defensive programmer, I still would prefer to use have
>> that __asm__ for write* too. That would at least prevent us from a
>> possible bug there too.
>
> So why don't you simply test and, assuming it's working, ACK the patch
> I submitted yesterday?  We should be on the safe side, then, and don't
> have to care about which mood the current compiler's optimizer might
> be in or what the POM is.

Sorry, I hadn't received your last patch (mail) before I wrote the mail
you are referencing.

I have updated my mail system at home (armv5 with 128 MB RAM) and the
incoming queue, mainly filled by lkml, is still not completely
processed. ~2000 messages (3 days) need some time to go through
spamassassin on such low-end hardware. ;)

I've seen you've switched from do {} while() to "something else", but I
can't comment on that "something else", because I've already switched to
4.5.2. I'll have to dig out a system where I still have 4.5.1 to test
the problem that occurred with the write.
If anybody else has already tested it, I'm fine with it.

Regards,

Alexander
Wolfgang Denk Jan. 17, 2011, 9:59 p.m. UTC | #13
Dear Dirk Behme,

In message <4D2B3036.4010506@googlemail.com> you wrote:
> 
> The question I was thinking about with my patch was "what's Right 
> Thing?" ;)

The Right Thing is not to make specific assumptions about how the
compiler might handle volatile pointers.

> It's my understanding that we don't fix read*() and write*() because 
> they are broken. We touch them to work around a broken tool chain.

No. Please re-read volatile-considered-harmful.txt in the
linux/Documentation directory: "accessing I/O memory directly through
pointers is frowned upon and does not work on all architectures.
Those accessors are written to prevent unwanted optimization".
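
As a rough sketch of the accessor style recommended here, the following
polls a status register through an accessor instead of dereferencing a
pointer directly. All names (barrier(), reg_read32(), wait_ready()) are
hypothetical and do not come from U-Boot or the kernel; the accessor is
modelled on the readl() definition in the patch and assumes GCC or a
compatible compiler.

```c
#include <stdint.h>

/* Compiler-only barrier: no instruction, but no caching across it. */
#define barrier()	__asm__ __volatile__ ("" : : : "memory")

/* Hypothetical accessor: one volatile load per call, then a barrier. */
static inline uint32_t reg_read32(const void *addr)
{
	uint32_t value = *(const volatile uint32_t *)addr;

	barrier();
	return value;
}

/*
 * Polls a status register until any bit in ready_mask is set.
 * Returns 0 on success, or -1 if max_polls reads elapsed first.
 * Every iteration goes through the accessor, so the loop does not
 * depend on how the compiler treats a plain pointer dereference.
 */
static inline int wait_ready(const void *status_reg, uint32_t ready_mask,
			     unsigned int max_polls)
{
	while (max_polls--) {
		if (reg_read32(status_reg) & ready_mask)
			return 0;
	}

	return -1;
}
```

A caller would pass the mapped register address, for example
wait_ready(status_reg, 0x4, 1000), and treat -1 as a timeout.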

Best regards,

Wolfgang Denk

Patch

Index: u-boot.git/arch/arm/include/asm/io.h
===================================================================
--- u-boot.git.orig/arch/arm/include/asm/io.h
+++ u-boot.git/arch/arm/include/asm/io.h
@@ -128,10 +128,16 @@  extern inline void __raw_readsl(unsigned
 #define writeb(v,a)			__arch_putb(v,a)
 #define writew(v,a)			__arch_putw(v,a)
 #define writel(v,a)			__arch_putl(v,a)
+/*
+ * TODO: The kernel offers some more advanced versions of barriers, it might
+ * have some advantages to use them instead of the simple one here.
+ */
+#define dmb()				__asm__ __volatile__ ("" : : : "memory")
+#define __iormb()			dmb()
 
-#define readb(a)			__arch_getb(a)
-#define readw(a)			__arch_getw(a)
-#define readl(a)			__arch_getl(a)
+#define readb(c)			({ u8  __v = __arch_getb(c); __iormb(); __v; })
+#define readw(c)			({ u16 __v = __arch_getw(c); __iormb(); __v; })
+#define readl(c)			({ u32 __v = __arch_getl(c); __iormb(); __v; })
 
 /*
  * The compiler seems to be incapable of optimising constants