[SRU,Artful] UBUNTU: SAUCE: ahci: thunderx2: stop engine fix update

Message ID 20171024220905.wbzsyix4pougo3e3@xps13.dannf
State New
Headers show
Series
  • [SRU,Artful] UBUNTU: SAUCE: ahci: thunderx2: stop engine fix update
Related show

Commit Message

dann frazier Oct. 24, 2017, 10:09 p.m.
BugLink: https://bugs.launchpad.net/bugs/1724117

The current reset fix fails during continuous reboot test. The failure
happens when both the on-board SATA slots are used and when one of the
controllers are reset.

The latest ThunderX2 firmware (3.1) enables hardware error interrupts and
when the reset fix fails, we get a hang with the print:
[   14.839308] sd 1:0:0:0: [sdb] 468862128 512-byte logical blocks: (240 GB/224 GiB)
[   14.846796] sd 1:0:0:0: [sdb] 4096-byte physical blocks
[   14.852036] sd 1:0:0:0: [sdb] Write Protect is off
[   14.856843] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   14.866022] ata2.00: Enabling discard_zeroes_data

        *** NBU BAR Error 0x1e25c ***
         AddrLo 0x1d80180 AddrHi 0x0

To fix this issue, update the SATA reset fix to increase the delays between register writes.

Signed-off-by: Jayachandran C <jnair@caviumnetworks.com>
[ dannf: *** There is no need to carry this forward beyond artful *** ]
Signed-off-by: dann frazier <dann.frazier@canonical.com>
---
 drivers/ata/libahci.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Comments

Marcelo Henrique Cerri Oct. 26, 2017, 11:06 a.m. | #1
Minor change and restricted to arm64.

Acked-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com>
Kleber Souza Oct. 26, 2017, 1:12 p.m. | #2
On 10/25/17 00:09, dann frazier wrote:
> BugLink: https://bugs.launchpad.net/bugs/1724117
> 
> The current reset fix fails during continuous reboot test. The failure
> happens when both the on-board SATA slots are used and when one of the
> controllers are reset.
> 
> The latest ThunderX2 firmware (3.1) enables hardware error interrupts and
> when the reset fix fails, we get a hang with the print:
> [   14.839308] sd 1:0:0:0: [sdb] 468862128 512-byte logical blocks: (240 GB/224 GiB)
> [   14.846796] sd 1:0:0:0: [sdb] 4096-byte physical blocks
> [   14.852036] sd 1:0:0:0: [sdb] Write Protect is off
> [   14.856843] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> [   14.866022] ata2.00: Enabling discard_zeroes_data
> 
>         *** NBU BAR Error 0x1e25c ***
>          AddrLo 0x1d80180 AddrHi 0x0
> 
> To fix this issue, update the SATA reset fix to increase the delays between register writes.
> 
> Signed-off-by: Jayachandran C <jnair@caviumnetworks.com>
> [ dannf: *** There is no need to carry this forward beyond artful *** ]
> Signed-off-by: dann frazier <dann.frazier@canonical.com>

Hi Dann,

Is "Jayachandran C <jnair@caviumnetworks.com>" the original author of
the patch?


Kleber
Stefan Bader Oct. 27, 2017, 12:23 p.m. | #3
On 25.10.2017 00:09, dann frazier wrote:
> BugLink: https://bugs.launchpad.net/bugs/1724117
> 
> The current reset fix fails during continuous reboot test. The failure
> happens when both the on-board SATA slots are used and when one of the
> controllers are reset.
> 
> The latest ThunderX2 firmware (3.1) enables hardware error interrupts and
> when the reset fix fails, we get a hang with the print:
> [   14.839308] sd 1:0:0:0: [sdb] 468862128 512-byte logical blocks: (240 GB/224 GiB)
> [   14.846796] sd 1:0:0:0: [sdb] 4096-byte physical blocks
> [   14.852036] sd 1:0:0:0: [sdb] Write Protect is off
> [   14.856843] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> [   14.866022] ata2.00: Enabling discard_zeroes_data
> 
>         *** NBU BAR Error 0x1e25c ***
>          AddrLo 0x1d80180 AddrHi 0x0
> 
> To fix this issue, update the SATA reset fix to increase the delays between register writes.
> 
> Signed-off-by: Jayachandran C <jnair@caviumnetworks.com>
> [ dannf: *** There is no need to carry this forward beyond artful *** ]
> Signed-off-by: dann frazier <dann.frazier@canonical.com>
> ---
>  drivers/ata/libahci.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/ata/libahci.c b/drivers/ata/libahci.c
> index 9116bba1b07d..1d3e614bad2b 100644
> --- a/drivers/ata/libahci.c
> +++ b/drivers/ata/libahci.c
> @@ -679,10 +679,11 @@ int ahci_stop_engine(struct ata_port *ap)
>  			MIDR_CPU_VAR_REV(0, 0),
>  			MIDR_CPU_VAR_REV(0, MIDR_REVISION_MASK))) {
>  		tmp = readl(hpriv->mmio + 0x8000);
> +		udelay(100);
>  		writel(tmp | (1 << 26), hpriv->mmio + 0x8000);
> -		udelay(1);
> +		udelay(100);
>  		writel(tmp & ~(1 << 26), hpriv->mmio + 0x8000);
> -		dev_warn(ap->host->dev, "CN99XX stop engine fix applied!\n");
> +		dev_warn(ap->host->dev, "CN99XX SATA reset workaround applied\n");
>  	}
>  #endif
>  
> 
The bug report claims that this should fix a certain chipset and only on arm.
But I cannot see how this would be limited to anything specific...
Marcelo Henrique Cerri Oct. 27, 2017, 12:33 p.m. | #4
Hi, Stefan.

The change is inside an #ifdef CONFIG_ARM64 section.
Stefan Bader Oct. 27, 2017, 12:55 p.m. | #5
On 27.10.2017 14:33, Marcelo Henrique Cerri wrote:
> Hi, Stefan.
> 
> The change is inside an #ifdef CONFIG_ARM64 section.
> 
Ah ok, so it looks acceptable if Dann could clarify the author and maybe as well
why no need to carry beyond artful

-Stefan
dann frazier Oct. 27, 2017, 2:59 p.m. | #6
On Fri, Oct 27, 2017 at 6:55 AM, Stefan Bader
<stefan.bader@canonical.com> wrote:
> On 27.10.2017 14:33, Marcelo Henrique Cerri wrote:
>> Hi, Stefan.
>>
>> The change is inside an #ifdef CONFIG_ARM64 section.
>>
> Ah ok, so it looks acceptable if Dann could clarify the author and maybe as well
> why no need to carry beyond artful

Stefan, Kleber, Marcelo,

Yeah - Jayachandran is the original author.

This is a workaround for a bug in early silicon that is not for
production use. That bug will be fixed in a silicon rev, and existing
boards replaced, so they do not plan to upstream the fix. However,
this early silicon is being evaluated with the artful kernel, so it'll
help users if we can carry it for a cycle.

(This is a fix for a fix we're already carrying, with the above caveat).

  -dann
Kleber Souza Oct. 27, 2017, 3:10 p.m. | #7
On 10/25/17 00:09, dann frazier wrote:
> BugLink: https://bugs.launchpad.net/bugs/1724117
> 
> The current reset fix fails during continuous reboot test. The failure
> happens when both the on-board SATA slots are used and when one of the
> controllers are reset.
> 
> The latest ThunderX2 firmware (3.1) enables hardware error interrupts and
> when the reset fix fails, we get a hang with the print:
> [   14.839308] sd 1:0:0:0: [sdb] 468862128 512-byte logical blocks: (240 GB/224 GiB)
> [   14.846796] sd 1:0:0:0: [sdb] 4096-byte physical blocks
> [   14.852036] sd 1:0:0:0: [sdb] Write Protect is off
> [   14.856843] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> [   14.866022] ata2.00: Enabling discard_zeroes_data
> 
>         *** NBU BAR Error 0x1e25c ***
>          AddrLo 0x1d80180 AddrHi 0x0
> 
> To fix this issue, update the SATA reset fix to increase the delays between register writes.
> 
> Signed-off-by: Jayachandran C <jnair@caviumnetworks.com>
> [ dannf: *** There is no need to carry this forward beyond artful *** ]
> Signed-off-by: dann frazier <dann.frazier@canonical.com>

Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

Thanks Dann for the additional explanation.

Note: the authorship of the patch needs to be set properly when applying it.

> ---
>  drivers/ata/libahci.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/ata/libahci.c b/drivers/ata/libahci.c
> index 9116bba1b07d..1d3e614bad2b 100644
> --- a/drivers/ata/libahci.c
> +++ b/drivers/ata/libahci.c
> @@ -679,10 +679,11 @@ int ahci_stop_engine(struct ata_port *ap)
>  			MIDR_CPU_VAR_REV(0, 0),
>  			MIDR_CPU_VAR_REV(0, MIDR_REVISION_MASK))) {
>  		tmp = readl(hpriv->mmio + 0x8000);
> +		udelay(100);
>  		writel(tmp | (1 << 26), hpriv->mmio + 0x8000);
> -		udelay(1);
> +		udelay(100);
>  		writel(tmp & ~(1 << 26), hpriv->mmio + 0x8000);
> -		dev_warn(ap->host->dev, "CN99XX stop engine fix applied!\n");
> +		dev_warn(ap->host->dev, "CN99XX SATA reset workaround applied\n");
>  	}
>  #endif
>  
>
Kleber Souza Oct. 30, 2017, 5:28 p.m. | #8
Applied to artful/master-next branch, fixing commit authorship. Thanks.

Patch

diff --git a/drivers/ata/libahci.c b/drivers/ata/libahci.c
index 9116bba1b07d..1d3e614bad2b 100644
--- a/drivers/ata/libahci.c
+++ b/drivers/ata/libahci.c
@@ -679,10 +679,11 @@  int ahci_stop_engine(struct ata_port *ap)
 			MIDR_CPU_VAR_REV(0, 0),
 			MIDR_CPU_VAR_REV(0, MIDR_REVISION_MASK))) {
 		tmp = readl(hpriv->mmio + 0x8000);
+		udelay(100);
 		writel(tmp | (1 << 26), hpriv->mmio + 0x8000);
-		udelay(1);
+		udelay(100);
 		writel(tmp & ~(1 << 26), hpriv->mmio + 0x8000);
-		dev_warn(ap->host->dev, "CN99XX stop engine fix applied!\n");
+		dev_warn(ap->host->dev, "CN99XX SATA reset workaround applied\n");
 	}
 #endif