[U-Boot] configs: Lower Lamobo R1 DRAM clock rate to 384 MHz

Message ID 20180615205239.11868-1-contact@paulk.fr
State Changes Requested
Delegated to: Jagannadha Sutradharudu Teki

Commit Message

Paul Kocialkowski June 15, 2018, 8:52 p.m. UTC
When running at 432 MHz, the Lamobo R1 DRAM tends to get corrupted under
stressing workloads. Reducing the clock rate to 384 MHz results in
significantly-improved stability.

One reliable way to trigger a corruption at 432 MHz is to run
I/O-intensive operations on an attached SATA disk. The same operations
when operating the DRAM at 384 MHz typically go fine.

For some unexplained reason, running at 408 MHz worsens the situation.

Signed-off-by: Paul Kocialkowski <contact@paulk.fr>
---
 configs/Lamobo_R1_defconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Maxime Ripard June 18, 2018, 7:59 a.m. UTC | #1
On Fri, Jun 15, 2018 at 10:52:39PM +0200, Paul Kocialkowski wrote:
> When running at 432 MHz, the Lamobo R1 DRAM tends to get corrupted under
> stressing workloads. Reducing the clock rate to 384 MHz results in
> significantly-improved stability.
> 
> One reliable way to trigger a corruption at 432 MHz is to run
> I/O-intensive operations on an attached SATA disk. The same operations
> when operating the DRAM at 384 MHz typically go fine.
> 
> For some unexplained reason, running at 408 MHz worsens the situation.
> 
> Signed-off-by: Paul Kocialkowski <contact@paulk.fr>

What RAM settings are used by the Allwinner BSP, and can you reproduce
the issue there if they are the same?

Maxime
Paul Kocialkowski June 18, 2018, 9:26 a.m. UTC | #2
Hi,

On Mon, 2018-06-18 at 09:59 +0200, Maxime Ripard wrote:
> On Fri, Jun 15, 2018 at 10:52:39PM +0200, Paul Kocialkowski wrote:
> > When running at 432 MHz, the Lamobo R1 DRAM tends to get corrupted under
> > stressing workloads. Reducing the clock rate to 384 MHz results in
> > significantly-improved stability.
> > 
> > One reliable way to trigger a corruption at 432 MHz is to run
> > I/O-intensive operations on an attached SATA disk. The same operations
> > when operating the DRAM at 384 MHz typically go fine.
> > 
> > For some unexplained reason, running at 408 MHz worsens the situation.
> > 
> > Signed-off-by: Paul Kocialkowski <contact@paulk.fr>
> 
> What RAM settings are used by the Allwinner BSP, and can you reproduce
> the issue there if they are the same?

I forgot to mention it, but the fex uses 432 MHz (just like the u-boot
defconfig we have currently). I doubt that building the Allwinner boot
software (boot0 and so on) for comparison is really an option at this
point, due to the trainwreck of build issues that may occur.

Would the linux-sunxi downstream u-boot be sufficient for this?

For the sake of completeness, I also looked into whether enabling ODT at
432 MHz could be a solution, but since the fex does not make use of it
(and has the default Zq value of 0x7f), this is not an option.

Cheers,

Paul
Siarhei Siamashka June 18, 2018, 12:37 p.m. UTC | #3
On Fri, 15 Jun 2018 22:52:39 +0200
Paul Kocialkowski <contact@paulk.fr> wrote:

> When running at 432 MHz, the Lamobo R1 DRAM tends to get corrupted under
> stressing workloads.

Yes, it is well known that Allwinner devboards tend to have overclocked
settings out of the box and a poor reliability track record, because
each board vendor tries to clock this low-end hardware as high as
possible in order to look more "competitive". You can find more
information about this problem here:

   https://linux-sunxi.org/Hardware_Reliability_Tests

> Reducing the clock rate to 384 MHz results in significantly-improved stability.

It would be great if we could get the reliability problems completely
resolved on this board rather than just improved.

> One reliable way to trigger a corruption at 432 MHz is to run
> I/O-intensive operations on an attached SATA disk. The same operations
> when operating the DRAM at 384 MHz typically go fine.

Yes, concurrent access to the DRAM controller from more than one
peripheral exposes reliability problems. That's why we have the
lima-memtester tool at least for A10/A20 hardware, which does a
stress test for DRAM reliability by using CPU+Mali simultaneously:

    https://github.com/ssvb/lima-memtester/

I also did some experiments with CPU+Mali+G2D (simultaneous access from
3 sources) and CPU+G2D (use G2D instead of Mali) and the highest
reliable DRAM clock speeds under these workloads were pretty much the
same. So I suspect that CPU+SATA is about as stressful as any other
combination. And you can probably just run a regular memtester
tool together with some SATA activity in the background (I'm
assuming that you did just that when debugging this problem).
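
As a rough illustration of that kind of combined test (not a substitute
for memtester or lima-memtester), here is a minimal Python sketch. It
writes known patterns to a buffer and checks them while a background
thread keeps reading a file. The function names, the pattern set and the
buffer size are all assumptions made for this example. On a real board,
io_path would need to point at a large file or block device on the SATA
disk, ideally larger than RAM so that reads are not served from the page
cache.

```python
import threading


def background_reads(path, stop_event, chunk_size=1 << 20):
    """Re-read `path` in a loop until stop_event is set, to keep I/O busy."""
    while not stop_event.is_set():
        with open(path, "rb", buffering=0) as f:
            while f.read(chunk_size) and not stop_event.is_set():
                pass


def pattern_test(size_bytes, passes=2):
    """Fill a buffer with fixed patterns and count bytes that read back wrong."""
    errors = 0
    buf = bytearray(size_bytes)
    for _ in range(passes):
        # Hypothetical pattern set: all zeros, all ones, alternating bits.
        for pattern in (0x00, 0xFF, 0x55, 0xAA):
            expected = bytes([pattern]) * size_bytes
            buf[:] = expected
            if buf != expected:
                errors += sum(a != b for a, b in zip(buf, expected))
    return errors


def stress(io_path, size_bytes=8 << 20, passes=2):
    """Run pattern_test while a background thread reads io_path."""
    stop = threading.Event()
    reader = threading.Thread(
        target=background_reads, args=(io_path, stop), daemon=True
    )
    reader.start()
    try:
        return pattern_test(size_bytes, passes)
    finally:
        stop.set()
        reader.join()
```

A non-zero return value from stress() would indicate corrupted bytes. A
zero result only means that this particular run found nothing, which is
much weaker evidence than a long memtester or lima-memtester run.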

Still, I would suggest that you try the lima-memtester tool too. It
requires a legacy 3.4 kernel. If you are really lazy, then you can
even try this kernel branch from my github repository:

   https://github.com/ssvb/linux-sunxi/tree/20151206-embedded-lima-memtester

The embedded initramfs automatically starts lima-memtester, so you only
need to boot this kernel image and watch the serial console log. The
only other thing is a proper script.bin file for your board (created
from a fex file).

If you are interested in more advanced stuff (finding better DRAM
settings rather than just downclocking DRAM until it stops failing),
then you may want to check this wiki page:

   https://linux-sunxi.org/A10_DRAM_Controller_Calibration

For example, I had to downclock DRAM from 408MHz to 360MHz in the
Linksprite_pcDuino_defconfig in the past; you can find a detailed
analysis in the commit log:

   https://lists.denx.de/pipermail/u-boot/2015-October/229567.html

> For some unexplained reason, running at 408 MHz worsens the situation.
>
> Signed-off-by: Paul Kocialkowski <contact@paulk.fr>
> ---
>  configs/Lamobo_R1_defconfig | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/configs/Lamobo_R1_defconfig b/configs/Lamobo_R1_defconfig
> index 92e682128c..cf60fdfaf4 100644
> --- a/configs/Lamobo_R1_defconfig
> +++ b/configs/Lamobo_R1_defconfig
> @@ -2,7 +2,7 @@ CONFIG_ARM=y
>  CONFIG_ARCH_SUNXI=y
>  CONFIG_SYS_TEXT_BASE=0x4a000000
>  CONFIG_MACH_SUN7I=y
> -CONFIG_DRAM_CLK=432
> +CONFIG_DRAM_CLK=384
>  CONFIG_MACPWR="PH23"
>  CONFIG_MMC0_CD_PIN="PH10"
>  CONFIG_SATAPWR="PB3"
Patch

diff --git a/configs/Lamobo_R1_defconfig b/configs/Lamobo_R1_defconfig
index 92e682128c..cf60fdfaf4 100644
--- a/configs/Lamobo_R1_defconfig
+++ b/configs/Lamobo_R1_defconfig
@@ -2,7 +2,7 @@  CONFIG_ARM=y
 CONFIG_ARCH_SUNXI=y
 CONFIG_SYS_TEXT_BASE=0x4a000000
 CONFIG_MACH_SUN7I=y
-CONFIG_DRAM_CLK=432
+CONFIG_DRAM_CLK=384
 CONFIG_MACPWR="PH23"
 CONFIG_MMC0_CD_PIN="PH10"
 CONFIG_SATAPWR="PB3"
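
When comparing several clock rates (the thread mentions 432, 408 and
384 MHz), the one-line change above has to be repeated for each
candidate value. The following is a minimal sketch of a helper that
rewrites the CONFIG_DRAM_CLK line in defconfig text. The function name
set_dram_clk is hypothetical and not part of U-Boot, and the helper
assumes that the option appears exactly once, as it does in the hunk
above.

```python
import re


def set_dram_clk(defconfig_text, mhz):
    """Return defconfig_text with its CONFIG_DRAM_CLK line set to mhz."""
    pattern = re.compile(r"^CONFIG_DRAM_CLK=\d+$", flags=re.MULTILINE)
    new_text, count = pattern.subn(f"CONFIG_DRAM_CLK={mhz}", defconfig_text)
    if count != 1:
        # Refuse to guess if the option is missing or duplicated.
        raise ValueError(
            f"expected exactly one CONFIG_DRAM_CLK line, found {count}"
        )
    return new_text
```

For example, passing the text of configs/Lamobo_R1_defconfig and 384
would be expected to produce the same change as the patch, provided the
file still contains a single CONFIG_DRAM_CLK=432 line.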