[U-Boot] sunxi: H6: DRAM: avoid memcpy() on MMIO registers

Message ID 20181206000220.2302-1-andre.przywara@arm.com
State New
Delegated to: Jagannadha Sutradharudu Teki
Headers show
Series
  • [U-Boot] sunxi: H6: DRAM: avoid memcpy() on MMIO registers
Related show

Commit Message

Andre Przywara Dec. 6, 2018, 12:02 a.m.
Using memcpy() for MMIO operations is, however tempting, not a good idea:
It depends on the specific implementation of memcpy, also lacks barriers.
In this particular case the first registers were written using 64-bit
writes, and the last register using four separate single-byte writes.
Neither is what we actually want.
We get away with it in this case because of the particular details of
the bus implementation, the DRAM controller IP and the values that we
actually write, but we should not leave a bad example around.

Replace the memcpy with a proper loop using the writel() accessor.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 arch/arm/mach-sunxi/dram_sun50i_h6.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Comments

Maxime Ripard Dec. 6, 2018, 7:42 a.m. | #1
On Thu, Dec 06, 2018 at 12:02:20AM +0000, Andre Przywara wrote:
> Using memcpy() for MMIO operations is, however tempting, not a good idea:
> It depends on the specific implementation of memcpy, also lacks barriers.
> In this particular case the first registers were written using 64-bit
> writes, and the last register using four separate single-byte writes.
> Neither is what we actually want.
> We get away with it in this case because of the particular details of
> the bus implementation, the DRAM controller IP and the values that we
> actually write, but we should not leave a bad example around.
> 
> Replace the memcpy with a proper loop using the writel() accessor.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  arch/arm/mach-sunxi/dram_sun50i_h6.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm/mach-sunxi/dram_sun50i_h6.c b/arch/arm/mach-sunxi/dram_sun50i_h6.c
> index 5da90a2835..e2f141eb9b 100644
> --- a/arch/arm/mach-sunxi/dram_sun50i_h6.c
> +++ b/arch/arm/mach-sunxi/dram_sun50i_h6.c
> @@ -182,6 +182,7 @@ static void mctl_set_timing_lpddr3(struct dram_para *para)
>  			(struct sunxi_mctl_ctl_reg *)SUNXI_DRAM_CTL0_BASE;
>  	struct sunxi_mctl_phy_reg * const mctl_phy =
>  			(struct sunxi_mctl_phy_reg *)SUNXI_DRAM_PHY0_BASE;
> +	int i;
>  
>  	u8 tccd		= 2;
>  	u8 tfaw		= max(ns_to_t(50), 4);
> @@ -237,8 +238,9 @@ static void mctl_set_timing_lpddr3(struct dram_para *para)
>  	u8 twr2rd	= tcwl + 4 + 1 + twtr;
>  	u8 trd2wr	= tcl + 4 + (tcksrea >> 1) - tcwl + 1;
>  
> -	/* set mode register */
> -	memcpy(mctl_phy->mr, mr_lpddr3, sizeof(mr_lpddr3));
> +	/* set mode registers */
> +	for (i = 0; i < ARRAY_SIZE(mr_lpddr3); i++)
> +		writel(mr_lpddr3[i], &mctl_phy->mr[i]);

memcpy_toio is meant to do just that.

Maxime
Andre Przywara Dec. 7, 2018, 2:17 p.m. | #2
On 06/12/2018 07:42, Maxime Ripard wrote:

Hi,

> On Thu, Dec 06, 2018 at 12:02:20AM +0000, Andre Przywara wrote:
>> Using memcpy() for MMIO operations is, however tempting, not a good
>> idea: It depends on the specific implementation of memcpy, also
>> lacks barriers. In this particular case the first registers were
>> written using 64-bit writes, and the last register using four
>> separate single-byte writes. Neither is what we actually want.
>> We get away with it in this case because of the particular details of
>> the bus implementation, the DRAM controller IP and the values that we
>> actually write, but we should not leave a bad example around.
>>
>> Replace the memcpy with a proper loop using the writel() accessor.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>>  arch/arm/mach-sunxi/dram_sun50i_h6.c | 6 ++++--
>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/arm/mach-sunxi/dram_sun50i_h6.c
>> b/arch/arm/mach-sunxi/dram_sun50i_h6.c index 5da90a2835..e2f141eb9b
>> 100644 --- a/arch/arm/mach-sunxi/dram_sun50i_h6.c
>> +++ b/arch/arm/mach-sunxi/dram_sun50i_h6.c
>> @@ -182,6 +182,7 @@ static void mctl_set_timing_lpddr3(struct
>> dram_para *para) (struct sunxi_mctl_ctl_reg *)SUNXI_DRAM_CTL0_BASE;
>>  	struct sunxi_mctl_phy_reg * const mctl_phy =
>>  			(struct sunxi_mctl_phy_reg
>> *)SUNXI_DRAM_PHY0_BASE;
>> +	int i;
>>  
>>  	u8 tccd		= 2;
>>  	u8 tfaw		= max(ns_to_t(50), 4);
>> @@ -237,8 +238,9 @@ static void mctl_set_timing_lpddr3(struct
>> dram_para *para) u8 twr2rd	= tcwl + 4 + 1 + twtr;
>>  	u8 trd2wr	= tcl + 4 + (tcksrea >> 1) - tcwl + 1;
>>  
>> -	/* set mode register */
>> -	memcpy(mctl_phy->mr, mr_lpddr3, sizeof(mr_lpddr3));
>> +	/* set mode registers */
>> +	for (i = 0; i < ARRAY_SIZE(mr_lpddr3); i++)
>> +		writel(mr_lpddr3[i], &mctl_phy->mr[i]);
> 
> memcpy_toio is meant to do just that.

"meant to" is the right wording ;-), since it doesn't seem to do it: If
I understand asm/io.h correctly (which apparently was just copied 20
years ago or so from Russell's Linux), it simply defines to
memcpy - if we don't have __mempci defined - whatever that means in
2018's U-Boot ;-) Funny enough even the Linux implementation is somewhat
dodgy, as it uses the raw_read<x> accessors (which don't do much in
U-Boot, I believe), but still uses 8-bit and 64-bit wide accesses, in
the hope that this is handled by the device.

So I'd rather stick with the explicit loop, as I saw funny effects with
byte writes to DRAM registers.

Cheers,
Andre.

Patch

diff --git a/arch/arm/mach-sunxi/dram_sun50i_h6.c b/arch/arm/mach-sunxi/dram_sun50i_h6.c
index 5da90a2835..e2f141eb9b 100644
--- a/arch/arm/mach-sunxi/dram_sun50i_h6.c
+++ b/arch/arm/mach-sunxi/dram_sun50i_h6.c
@@ -182,6 +182,7 @@  static void mctl_set_timing_lpddr3(struct dram_para *para)
 			(struct sunxi_mctl_ctl_reg *)SUNXI_DRAM_CTL0_BASE;
 	struct sunxi_mctl_phy_reg * const mctl_phy =
 			(struct sunxi_mctl_phy_reg *)SUNXI_DRAM_PHY0_BASE;
+	int i;
 
 	u8 tccd		= 2;
 	u8 tfaw		= max(ns_to_t(50), 4);
@@ -237,8 +238,9 @@  static void mctl_set_timing_lpddr3(struct dram_para *para)
 	u8 twr2rd	= tcwl + 4 + 1 + twtr;
 	u8 trd2wr	= tcl + 4 + (tcksrea >> 1) - tcwl + 1;
 
-	/* set mode register */
-	memcpy(mctl_phy->mr, mr_lpddr3, sizeof(mr_lpddr3));
+	/* set mode registers */
+	for (i = 0; i < ARRAY_SIZE(mr_lpddr3); i++)
+		writel(mr_lpddr3[i], &mctl_phy->mr[i]);
 
 	/* set DRAM timing */
 	writel((twtp << 24) | (tfaw << 16) | (trasmax << 8) | tras,