diff mbox

[U-Boot] ARM: UniPhier: remove __packed that causes a problem on GCC 4.9

Message ID 1420627298-15634-1-git-send-email-yamada.m@jp.panasonic.com
State Accepted
Delegated to: Masahiro Yamada
Headers show

Commit Message

Masahiro Yamada Jan. 7, 2015, 10:41 a.m. UTC
The DDR PHY training function, ddrphy_prepare_training() would not
work if compiled with GCC 4.9.

The struct ddrphy (arch/arm/include/asm/arch-uniphier/ddrphy-regs.h)
is specified with __packed because it represents a hardware register
mapping, but it turned out to cause a problem on GCC 4.9.

If -mno-unaligned-access is specified (actually it is in
arch/arm/cpu/armv7/config.mk), GCC 4.9 is aware of the
__attribute__((packed)) and generates extra instructions to perform
the memory access in a way that does not cause unaligned access.
(Actually it is bogus here because the register base, the first
argument of the ddrphy_prepare_training(), is always given with a
4-byte aligned address.)

Anyway, as a result, readl() / writel() is divided into byte-wise
accesses.  The problem is that this hardware only accepts 4-byte
register access.  Byte-wise accesses lead to unexpected behavior.

There are some options to avoid this problem.

[1] Remove -mno-unaligned-access
[2] Add __aligned(4) along with __packed to struct ddrphy
[3] Remove __packed from struct ddrphy

[1] solves the problem for ARMv7, but it does not for pre-ARMv6 and
ARMv6-M architectures where -mno-unaligned-access is default.
So, [1] does not seem reasonable in terms of code portability.

Both [2] and [3] work well, but [2] seems too much.  All the members
of struct ddrphy have the u32 type.  No padding would be inserted
even if __packed is dropped.

Signed-off-by: Masahiro Yamada <yamada.m@jp.panasonic.com>
---

 arch/arm/include/asm/arch-uniphier/ddrphy-regs.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Tom Rini Jan. 7, 2015, 7 p.m. UTC | #1
On Wed, Jan 07, 2015 at 07:41:38PM +0900, Masahiro Yamada wrote:
> The DDR PHY training function, ddrphy_prepare_training() would not
> work if compiled with GCC 4.9.
> 
> The struct ddrphy (arch/arm/include/asm/arch-uniphier/ddrphy-regs.h)
> is specified with __packed because it represents a hardware register
> mapping, but it turned out to cause a problem on GCC 4.9.
> 
> If -mno-unaligned-access is specified (actually it is in
> arch/arm/cpu/armv7/config.mk), GCC 4.9 is aware of the
> __attribute__((packed)) and generates extra instructions to perform
> the memory access in a way that does not cause unaligned access.
> (Actually it is bogus here because the register base, the first
> argument of the ddrphy_prepare_training(), is always given with a
> 4-byte aligned address.)
> 
> Anyway, as a result, readl() / writel() is divided into byte-wise
> accesses.  The problem is that this hardware only accepts 4-byte
> register access.  Byte-wise accesses lead to unexpected behavior.
> 
> There are some options to avoid this problem.
> 
> [1] Remove -mno-unaligned-access
> [2] Add __aligned(4) along with __packed to struct ddrphy
> [3] Remove __packed from struct ddrphy
> 
> [1] solves the problem for ARMv7, but it does not for pre-ARMv6 and
> ARMv6-M architectures where -mno-unaligned-access is default.
> So, [1] does not seem reasonable in terms of code portability.
> 
> Both [2] and [3] work well, but [2] seems too much.  All the members
> of struct ddrphy have the u32 type.  No padding would be inserted
> even if __packed is dropped.
> 
> Signed-off-by: Masahiro Yamada <yamada.m@jp.panasonic.com>

I wanted to think about this for a minute.  I argued with myself a bit
about [2] being the best choice, and lost.  I'm fairly sure that
__packed on an struct of all u32 (on a 32bit platform) is in fact the
wrong thing to do so yes, [3] is right.

Reviewed-by: Tom Rini <trini@ti.com>
Masahiro Yamada Jan. 8, 2015, 9:03 a.m. UTC | #2
On Wed, 7 Jan 2015 14:00:05 -0500
Tom Rini <trini@ti.com> wrote:

> On Wed, Jan 07, 2015 at 07:41:38PM +0900, Masahiro Yamada wrote:
> > The DDR PHY training function, ddrphy_prepare_training() would not
> > work if compiled with GCC 4.9.
> > 
> > The struct ddrphy (arch/arm/include/asm/arch-uniphier/ddrphy-regs.h)
> > is specified with __packed because it represents a hardware register
> > mapping, but it turned out to cause a problem on GCC 4.9.
> > 
> > If -mno-unaligned-access is specified (actually it is in
> > arch/arm/cpu/armv7/config.mk), GCC 4.9 is aware of the
> > __attribute__((packed)) and generates extra instructions to perform
> > the memory access in a way that does not cause unaligned access.
> > (Actually it is bogus here because the register base, the first
> > argument of the ddrphy_prepare_training(), is always given with a
> > 4-byte aligned address.)
> > 
> > Anyway, as a result, readl() / writel() is divided into byte-wise
> > accesses.  The problem is that this hardware only accepts 4-byte
> > register access.  Byte-wise accesses lead to unexpected behavior.
> > 
> > There are some options to avoid this problem.
> > 
> > [1] Remove -mno-unaligned-access
> > [2] Add __aligned(4) along with __packed to struct ddrphy
> > [3] Remove __packed from struct ddrphy
> > 
> > [1] solves the problem for ARMv7, but it does not for pre-ARMv6 and
> > ARMv6-M architectures where -mno-unaligned-access is default.
> > So, [1] does not seem reasonable in terms of code portability.
> > 
> > Both [2] and [3] work well, but [2] seems too much.  All the members
> > of struct ddrphy have the u32 type.  No padding would be inserted
> > even if __packed is dropped.
> > 
> > Signed-off-by: Masahiro Yamada <yamada.m@jp.panasonic.com>
> 
> I wanted to think about this for a minute.  I argued with myself a bit
> about [2] being the best choice, and lost.  I'm fairly sure that
> __packed on an struct of all u32 (on a 32bit platform) is in fact the
> wrong thing to do so yes, [3] is right.
> 
> Reviewed-by: Tom Rini <trini@ti.com>
> 


There is another (and I think the best) option:


[4] Sync writeb(), writew(), writel(), readlb(), readw(), readl() with Linux



This pitfall does not occur on Linux
because ARM Linux hard-codes writel()/readl() etc.




Linux:

#define __raw_writel __raw_writel
static inline void __raw_writel(u32 val, volatile void __iomem *addr)
{
        asm volatile("str %1, %0"
                     : "+Qo" (*(volatile u32 __force *)addr)
                     : "r" (val));
}




U-Boot:

#define __arch_putl(v,a)                (*(volatile unsigned int *)(a) = (v))

#define __raw_writel(v,a)       __arch_putl(v,a)






On Linux, writel() is always converted to "str" instruction.

On U-Boot, it depends on compiler's decesion.




Best Regards
Masahiro Yamada
Masahiro Yamada Jan. 22, 2015, 4:07 p.m. UTC | #3
2015-01-07 19:41 GMT+09:00 Masahiro Yamada <yamada.m@jp.panasonic.com>:
> The DDR PHY training function, ddrphy_prepare_training() would not
> work if compiled with GCC 4.9.
>
> The struct ddrphy (arch/arm/include/asm/arch-uniphier/ddrphy-regs.h)
> is specified with __packed because it represents a hardware register
> mapping, but it turned out to cause a problem on GCC 4.9.
>
> If -mno-unaligned-access is specified (actually it is in
> arch/arm/cpu/armv7/config.mk), GCC 4.9 is aware of the
> __attribute__((packed)) and generates extra instructions to perform
> the memory access in a way that does not cause unaligned access.
> (Actually it is bogus here because the register base, the first
> argument of the ddrphy_prepare_training(), is always given with a
> 4-byte aligned address.)
>
> Anyway, as a result, readl() / writel() is divided into byte-wise
> accesses.  The problem is that this hardware only accepts 4-byte
> register access.  Byte-wise accesses lead to unexpected behavior.
>
> There are some options to avoid this problem.
>
> [1] Remove -mno-unaligned-access
> [2] Add __aligned(4) along with __packed to struct ddrphy
> [3] Remove __packed from struct ddrphy
>
> [1] solves the problem for ARMv7, but it does not for pre-ARMv6 and
> ARMv6-M architectures where -mno-unaligned-access is default.
> So, [1] does not seem reasonable in terms of code portability.
>
> Both [2] and [3] work well, but [2] seems too much.  All the members
> of struct ddrphy have the u32 type.  No padding would be inserted
> even if __packed is dropped.
>
> Signed-off-by: Masahiro Yamada <yamada.m@jp.panasonic.com>


Applied to u-boot-uniphier/master
diff mbox

Patch

diff --git a/arch/arm/include/asm/arch-uniphier/ddrphy-regs.h b/arch/arm/include/asm/arch-uniphier/ddrphy-regs.h
index 484559c..6b7d600 100644
--- a/arch/arm/include/asm/arch-uniphier/ddrphy-regs.h
+++ b/arch/arm/include/asm/arch-uniphier/ddrphy-regs.h
@@ -72,7 +72,7 @@  struct ddrphy {
 		u32 gtr;	/* General Timing Register */
 		u32 rsv[3];	/* Reserved */
 	} dx[9];
-} __packed;
+};
 
 #endif /* __ASSEMBLY__ */