diff mbox

[REGRESSION] 3.14-rc2 boot failure on Kirkwood (qnap ts-119p+)

Message ID 20140216143735.GA8680@localhost
State New
Headers show

Commit Message

Ezequiel Garcia Feb. 16, 2014, 2:45 p.m. UTC
Hi Mikael,

On Sun, Feb 16, 2014 at 12:00:37PM +0100, Mikael Pettersson wrote:
> My Kirkwood box worked fine with the 3.13 kernel, but with 3.14-rc2
> boot always fails due to a kernel NULL dereference in __clk_put.
> 
> This is a non-DT kernel, with:
> 
> CONFIG_ARCH_KIRKWOOD=y
> CONFIG_KIRKWOOD_LEGACY=y
> CONFIG_MACH_TS219=y
> # CONFIG_ARCH_KIRKWOOD_DT is not set
> 

Thanks for the report. I thought this issue was already fixed, but I
cannot find it on either the mailing lists or linux-next.

So, in case it hasn't been fixed here's an untested fix for you to test.
Please try this patch and let us know.

Your SATA won't work but if the patch is OK the kernel wont't blow away.

Andrew? Do we support the new phy requirement in non-DT platforms?

From 5d4010ba3c485e7e22887408663fb260f628b9b2 Mon Sep 17 00:00:00 2001
From: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Date: Sun, 16 Feb 2014 11:21:50 -0300
Subject: [PATCH] ata: sata_mv: Cleanup only the initialized ports

When an error occurs in the port initialization loop, currently the
driver tries to cleanup all the ports. This results in a NULL pointer
dereference if the ports were only partially initialized.

Fix this by updating only the number of initialized ports (either
with failure or successfully), before jumping to the error path
and looping over that number in the cleanup loop.

Cc: Andrew Lunn <andrew@lunn.ch>
Reported-by: Mikael Pettersson <mikpelinux@gmail.com>
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
---
 drivers/ata/sata_mv.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

Comments

Mikael Pettersson Feb. 16, 2014, 3:10 p.m. UTC | #1
Ezequiel Garcia writes:
 > Hi Mikael,
 > 
 > On Sun, Feb 16, 2014 at 12:00:37PM +0100, Mikael Pettersson wrote:
 > > My Kirkwood box worked fine with the 3.13 kernel, but with 3.14-rc2
 > > boot always fails due to a kernel NULL dereference in __clk_put.
 > > 
 > > This is a non-DT kernel, with:
 > > 
 > > CONFIG_ARCH_KIRKWOOD=y
 > > CONFIG_KIRKWOOD_LEGACY=y
 > > CONFIG_MACH_TS219=y
 > > # CONFIG_ARCH_KIRKWOOD_DT is not set
 > > 
 > 
 > Thanks for the report. I thought this issue was already fixed, but I
 > cannot find it on either the mailing lists or linux-next.
 > 
 > So, in case it hasn't been fixed here's an untested fix for you to test.
 > Please try this patch and let us know.
 > 
 > Your SATA won't work but if the patch is OK the kernel wont't blow away.

Thanks, this fixes the oops but does leave sata_mv non-functional,
which is still a major regression from 3.13.

sata_mv sata_mv.0: cannot get optional clkdev
sata_mv sata_mv.0: error getting phy
sata_mv: probe of sata_mv.0 failed with error -38
...
VFS: Cannot open root device "sda1" or unknown-block(0,0): error -6
Please append a correct "root=" boot option; here are the available partitions:
1f00             512 mtdblock0  (driver?)
1f01            2048 mtdblock1  (driver?)
1f02            9216 mtdblock2  (driver?)
1f03            3072 mtdblock3  (driver?)
1f04             256 mtdblock4  (driver?)
1f05            1280 mtdblock5  (driver?)
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)

/Mikael

 > Andrew? Do we support the new phy requirement in non-DT platforms?
 > 
 > >From 5d4010ba3c485e7e22887408663fb260f628b9b2 Mon Sep 17 00:00:00 2001
 > From: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
 > Date: Sun, 16 Feb 2014 11:21:50 -0300
 > Subject: [PATCH] ata: sata_mv: Cleanup only the initialized ports
 > 
 > When an error occurs in the port initialization loop, currently the
 > driver tries to cleanup all the ports. This results in a NULL pointer
 > dereference if the ports were only partially initialized.
 > 
 > Fix this by updating only the number of initialized ports (either
 > with failure or successfully), before jumping to the error path
 > and looping over that number in the cleanup loop.
 > 
 > Cc: Andrew Lunn <andrew@lunn.ch>
 > Reported-by: Mikael Pettersson <mikpelinux@gmail.com>
 > Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
 > ---
 >  drivers/ata/sata_mv.c | 9 +++++++--
 >  1 file changed, 7 insertions(+), 2 deletions(-)
 > 
 > diff --git a/drivers/ata/sata_mv.c b/drivers/ata/sata_mv.c
 > index 20a7517..9c1a11d 100644
 > --- a/drivers/ata/sata_mv.c
 > +++ b/drivers/ata/sata_mv.c
 > @@ -4104,7 +4104,6 @@ static int mv_platform_probe(struct platform_device *pdev)
 >  	if (!hpriv->port_phys)
 >  		return -ENOMEM;
 >  	host->private_data = hpriv;
 > -	hpriv->n_ports = n_ports;
 >  	hpriv->board_idx = chip_soc;
 >  
 >  	host->iomap = NULL;
 > @@ -4132,11 +4131,17 @@ static int mv_platform_probe(struct platform_device *pdev)
 >  			hpriv->port_phys[port] = NULL;
 >  			if ((rc != -EPROBE_DEFER) && (rc != -ENODEV))
 >  				dev_warn(&pdev->dev, "error getting phy");
 > +
 > +			/* Cleanup only the initialized ports */
 > +			hpriv->n_ports = port;
 >  			goto err;
 >  		} else
 >  			phy_power_on(hpriv->port_phys[port]);
 >  	}
 >  
 > +	/* All the ports have been initialized */
 > +	hpriv->n_ports = n_ports;
 > +
 >  	/*
 >  	 * (Re-)program MBUS remapping windows if we are asked to.
 >  	 */
 > @@ -4174,7 +4179,7 @@ err:
 >  		clk_disable_unprepare(hpriv->clk);
 >  		clk_put(hpriv->clk);
 >  	}
 > -	for (port = 0; port < n_ports; port++) {
 > +	for (port = 0; port < hpriv->n_ports; port++) {
 >  		if (!IS_ERR(hpriv->port_clks[port])) {
 >  			clk_disable_unprepare(hpriv->port_clks[port]);
 >  			clk_put(hpriv->port_clks[port]);
 > -- 
 > 1.8.1.5
 > 
 > 
 > -- 
 > Ezequiel GarcĂ­a, Free Electrons
 > Embedded Linux, Kernel and Android Engineering
 > http://free-electrons.com
Ezequiel Garcia Feb. 16, 2014, 3:28 p.m. UTC | #2
On Sun, Feb 16, 2014 at 04:10:02PM +0100, Mikael Pettersson wrote:
> Ezequiel Garcia writes:
>  > Hi Mikael,
>  > 
>  > On Sun, Feb 16, 2014 at 12:00:37PM +0100, Mikael Pettersson wrote:
>  > > My Kirkwood box worked fine with the 3.13 kernel, but with 3.14-rc2
>  > > boot always fails due to a kernel NULL dereference in __clk_put.
>  > > 
>  > > This is a non-DT kernel, with:
>  > > 
>  > > CONFIG_ARCH_KIRKWOOD=y
>  > > CONFIG_KIRKWOOD_LEGACY=y
>  > > CONFIG_MACH_TS219=y
>  > > # CONFIG_ARCH_KIRKWOOD_DT is not set
>  > > 
>  > 
>  > Thanks for the report. I thought this issue was already fixed, but I
>  > cannot find it on either the mailing lists or linux-next.
>  > 
>  > So, in case it hasn't been fixed here's an untested fix for you to test.
>  > Please try this patch and let us know.
>  > 
>  > Your SATA won't work but if the patch is OK the kernel wont't blow away.
> 
> Thanks, this fixes the oops but does leave sata_mv non-functional,
> which is still a major regression from 3.13.
> 

Please, try linux-next to get the most recent fixes and make sure you have
CONFIG_PHY_MVEBU_SATA enabled.

Thanks,
Mikael Pettersson Feb. 16, 2014, 4:48 p.m. UTC | #3
Ezequiel Garcia writes:
 > On Sun, Feb 16, 2014 at 04:10:02PM +0100, Mikael Pettersson wrote:
 > > Ezequiel Garcia writes:
 > >  > Hi Mikael,
 > >  > 
 > >  > On Sun, Feb 16, 2014 at 12:00:37PM +0100, Mikael Pettersson wrote:
 > >  > > My Kirkwood box worked fine with the 3.13 kernel, but with 3.14-rc2
 > >  > > boot always fails due to a kernel NULL dereference in __clk_put.
 > >  > > 
 > >  > > This is a non-DT kernel, with:
 > >  > > 
 > >  > > CONFIG_ARCH_KIRKWOOD=y
 > >  > > CONFIG_KIRKWOOD_LEGACY=y
 > >  > > CONFIG_MACH_TS219=y
 > >  > > # CONFIG_ARCH_KIRKWOOD_DT is not set
 > >  > > 
 > >  > 
 > >  > Thanks for the report. I thought this issue was already fixed, but I
 > >  > cannot find it on either the mailing lists or linux-next.
 > >  > 
 > >  > So, in case it hasn't been fixed here's an untested fix for you to test.
 > >  > Please try this patch and let us know.
 > >  > 
 > >  > Your SATA won't work but if the patch is OK the kernel wont't blow away.
 > > 
 > > Thanks, this fixes the oops but does leave sata_mv non-functional,
 > > which is still a major regression from 3.13.
 > > 
 > 
 > Please, try linux-next to get the most recent fixes and make sure you have
 > CONFIG_PHY_MVEBU_SATA enabled.

If I backport

"drivers: phy: Make NULL a valid phy reference" (04c2facad8fee66c981a51852806d8923336f362)
"drivers: phy: Add support for optional phys" (788a4d56ff378bff0b8e685d03a962b36903a149)
"ata: sata_mv: Fix probe failures with optional phys" (90aa2997029fa623fe9e3ec3a469a00a34130237)

from linux-next, and fix up your sata_mv cleanup fixes for a context reject in
hunk #2, and enable CONFIG_OF and CONFIG_GENERIC_PHY, then I get a 3.14-rc2-ish
kernel that boots and mounts its root fileystem.

Still,

@@ -123,6 +124,8 @@
 loop: module loaded
 sata_mv sata_mv.0: version 1.28
 sata_mv sata_mv.0: cannot get optional clkdev
+sata_mv sata_mv.0: unable to find phy
+sata_mv sata_mv.0: unable to find phy
 sata_mv sata_mv.0: slots 32 ports 2
 scsi0 : sata_mv
 scsi1 : sata_mv

tells me that the PHY dependency is artificial.

I hope this all can be resolved in a clean way before 3.14 final.

/Mikael
Andrew Lunn Feb. 16, 2014, 8:29 p.m. UTC | #4
On Sun, Feb 16, 2014 at 11:45:20AM -0300, Ezequiel Garcia wrote:
> Hi Mikael,
> 
> On Sun, Feb 16, 2014 at 12:00:37PM +0100, Mikael Pettersson wrote:
> > My Kirkwood box worked fine with the 3.13 kernel, but with 3.14-rc2
> > boot always fails due to a kernel NULL dereference in __clk_put.
> > 
> > This is a non-DT kernel, with:
> > 
> > CONFIG_ARCH_KIRKWOOD=y
> > CONFIG_KIRKWOOD_LEGACY=y
> > CONFIG_MACH_TS219=y
> > # CONFIG_ARCH_KIRKWOOD_DT is not set
> > 
> 
> Thanks for the report. I thought this issue was already fixed, but I
> cannot find it on either the mailing lists or linux-next.
> 
> So, in case it hasn't been fixed here's an untested fix for you to test.
> Please try this patch and let us know.
> 
> Your SATA won't work but if the patch is OK the kernel wont't blow away.
> 
> Andrew? Do we support the new phy requirement in non-DT platforms?

Hi Ezequiel

I consider Non-DT kirkwood now pretty much near death. Probably with
3.16 it will of gone altogether. So i did not plan for Non-DT to
support SATA phy. I did however want it to at least boot, so i broke
something here :-(

I will take a look at your fix and what we can expect in the next -rc.

  Andrew
Ezequiel Garcia Feb. 16, 2014, 9:17 p.m. UTC | #5
On Sun, Feb 16, 2014 at 09:29:57PM +0100, Andrew Lunn wrote:
> On Sun, Feb 16, 2014 at 11:45:20AM -0300, Ezequiel Garcia wrote:
> > Hi Mikael,
> > 
> > On Sun, Feb 16, 2014 at 12:00:37PM +0100, Mikael Pettersson wrote:
> > > My Kirkwood box worked fine with the 3.13 kernel, but with 3.14-rc2
> > > boot always fails due to a kernel NULL dereference in __clk_put.
> > > 
> > > This is a non-DT kernel, with:
> > > 
> > > CONFIG_ARCH_KIRKWOOD=y
> > > CONFIG_KIRKWOOD_LEGACY=y
> > > CONFIG_MACH_TS219=y
> > > # CONFIG_ARCH_KIRKWOOD_DT is not set
> > > 
> > 
> > Thanks for the report. I thought this issue was already fixed, but I
> > cannot find it on either the mailing lists or linux-next.
> > 
> > So, in case it hasn't been fixed here's an untested fix for you to test.
> > Please try this patch and let us know.
> > 
> > Your SATA won't work but if the patch is OK the kernel wont't blow away.
> > 
> > Andrew? Do we support the new phy requirement in non-DT platforms?
[..]
> 
> I consider Non-DT kirkwood now pretty much near death. Probably with
> 3.16 it will of gone altogether. So i did not plan for Non-DT to
> support SATA phy. I did however want it to at least boot, so i broke
> something here :-(
> 

I'm pretty sure v3.14-rc{3,4} will be fine for DT and non-DT plaforms,
as Mikael already reported linux-next works. Non-DT Kirkwood would
behave just as Armada 370/XP platforms, where a SATA phy is not
registered.

> I will take a look at your fix and what we can expect in the next -rc.
> 

AFAICS, the patch is a fix on its own, and should be merged (after we're
happy with it) for v3.14.
diff mbox

Patch

diff --git a/drivers/ata/sata_mv.c b/drivers/ata/sata_mv.c
index 20a7517..9c1a11d 100644
--- a/drivers/ata/sata_mv.c
+++ b/drivers/ata/sata_mv.c
@@ -4104,7 +4104,6 @@  static int mv_platform_probe(struct platform_device *pdev)
 	if (!hpriv->port_phys)
 		return -ENOMEM;
 	host->private_data = hpriv;
-	hpriv->n_ports = n_ports;
 	hpriv->board_idx = chip_soc;
 
 	host->iomap = NULL;
@@ -4132,11 +4131,17 @@  static int mv_platform_probe(struct platform_device *pdev)
 			hpriv->port_phys[port] = NULL;
 			if ((rc != -EPROBE_DEFER) && (rc != -ENODEV))
 				dev_warn(&pdev->dev, "error getting phy");
+
+			/* Cleanup only the initialized ports */
+			hpriv->n_ports = port;
 			goto err;
 		} else
 			phy_power_on(hpriv->port_phys[port]);
 	}
 
+	/* All the ports have been initialized */
+	hpriv->n_ports = n_ports;
+
 	/*
 	 * (Re-)program MBUS remapping windows if we are asked to.
 	 */
@@ -4174,7 +4179,7 @@  err:
 		clk_disable_unprepare(hpriv->clk);
 		clk_put(hpriv->clk);
 	}
-	for (port = 0; port < n_ports; port++) {
+	for (port = 0; port < hpriv->n_ports; port++) {
 		if (!IS_ERR(hpriv->port_clks[port])) {
 			clk_disable_unprepare(hpriv->port_clks[port]);
 			clk_put(hpriv->port_clks[port]);