diff mbox

[Oneiric] ASPM: Fix pcie devices with non-pcie children

Message ID 1333381284-120302-1-git-send-email-tim.gardner@canonical.com
State New
Headers show

Commit Message

Tim Gardner April 2, 2012, 3:41 p.m. UTC
From: Matthew Garrett <mjg@redhat.com>

BugLink: http://bugs.launchpad.net/bugs/961482

Since 3.2.12 and 3.3, some systems are failing to boot with a BUG_ON.
Some other systems using the pata_jmicron driver fail to boot because no
disks are detected.  Passing pcie_aspm=force on the kernel command line
works around it.

The cause: commit 4949be16822e ("PCI: ignore pre-1.1 ASPM quirking when
ASPM is disabled") changed the behaviour of pcie_aspm_sanity_check() to
always return 0 if aspm is disabled, in order to avoid cases where we
changed ASPM state on pre-PCIe 1.1 devices.

This skipped the secondary function of pcie_aspm_sanity_check which was
to avoid us enabling ASPM on devices that had non-PCIe children, causing
trouble later on.  Move the aspm_disabled check so we continue to honour
that scenario.

Addresses https://bugzilla.kernel.org/show_bug.cgi?id=42979 and
          http://bugs.debian.org/665420

Reported-by: Romain Francoise <romain@orebokech.com> # kernel panic
Reported-by: Chris Holland <bandidoirlandes@gmail.com> # disk detection trouble
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Cc: stable@vger.kernel.org
Tested-by: Hatem Masmoudi <hatem.masmoudi@gmail.com> # Dell Latitude E5520
Tested-by: janek <jan0x6c@gmail.com> # pata_jmicron with JMB362/JMB363
[jn: with more symptoms in log message]
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit c9651e70ad0aa499814817cbf3cc1d0b806ed3a1)

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
---
 drivers/pci/pcie/aspm.c |   13 ++++++++++---
 1 files changed, 10 insertions(+), 3 deletions(-)

Comments

Andy Whitcroft April 2, 2012, 3:48 p.m. UTC | #1
On Mon, Apr 02, 2012 at 09:41:24AM -0600, Tim Gardner wrote:
> From: Matthew Garrett <mjg@redhat.com>
> 
> BugLink: http://bugs.launchpad.net/bugs/961482
> 
> Since 3.2.12 and 3.3, some systems are failing to boot with a BUG_ON.
> Some other systems using the pata_jmicron driver fail to boot because no
> disks are detected.  Passing pcie_aspm=force on the kernel command line
> works around it.
> 
> The cause: commit 4949be16822e ("PCI: ignore pre-1.1 ASPM quirking when
> ASPM is disabled") changed the behaviour of pcie_aspm_sanity_check() to
> always return 0 if aspm is disabled, in order to avoid cases where we
> changed ASPM state on pre-PCIe 1.1 devices.
> 
> This skipped the secondary function of pcie_aspm_sanity_check which was
> to avoid us enabling ASPM on devices that had non-PCIe children, causing
> trouble later on.  Move the aspm_disabled check so we continue to honour
> that scenario.
> 
> Addresses https://bugzilla.kernel.org/show_bug.cgi?id=42979 and
>           http://bugs.debian.org/665420
> 
> Reported-by: Romain Francoise <romain@orebokech.com> # kernel panic
> Reported-by: Chris Holland <bandidoirlandes@gmail.com> # disk detection trouble
> Signed-off-by: Matthew Garrett <mjg@redhat.com>
> Cc: stable@vger.kernel.org
> Tested-by: Hatem Masmoudi <hatem.masmoudi@gmail.com> # Dell Latitude E5520
> Tested-by: janek <jan0x6c@gmail.com> # pata_jmicron with JMB362/JMB363
> [jn: with more symptoms in log message]
> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> (cherry picked from commit c9651e70ad0aa499814817cbf3cc1d0b806ed3a1)
> 
> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
> ---
>  drivers/pci/pcie/aspm.c |   13 ++++++++++---
>  1 files changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> index 56a6966..0ff0182 100644
> --- a/drivers/pci/pcie/aspm.c
> +++ b/drivers/pci/pcie/aspm.c
> @@ -500,9 +500,6 @@ static int pcie_aspm_sanity_check(struct pci_dev *pdev)
>  	int pos;
>  	u32 reg32;
>  
> -	if (aspm_disabled)
> -		return 0;
> -
>  	/*
>  	 * Some functions in a slot might not all be PCIe functions,
>  	 * very strange. Disable ASPM for the whole slot
> @@ -511,6 +508,16 @@ static int pcie_aspm_sanity_check(struct pci_dev *pdev)
>  		pos = pci_pcie_cap(child);
>  		if (!pos)
>  			return -EINVAL;
> +
> +		/*
> +		 * If ASPM is disabled then we're not going to change
> +		 * the BIOS state. It's safe to continue even if it's a
> +		 * pre-1.1 device
> +		 */
> +
> +		if (aspm_disabled)
> +			continue;
> +
>  		/*
>  		 * Disable ASPM for pre-1.1 PCIe device, we follow MS to use
>  		 * RBER bit to determine if a function is 1.1 version device

Do we not need the second patch Colin proposed as well to avoid
regressions here?

Colin?

-apw
Tim Gardner April 2, 2012, 4:01 p.m. UTC | #2
On 04/02/2012 09:48 AM, Andy Whitcroft wrote:
> On Mon, Apr 02, 2012 at 09:41:24AM -0600, Tim Gardner wrote:
>> From: Matthew Garrett <mjg@redhat.com>
>>
>> BugLink: http://bugs.launchpad.net/bugs/961482
>>
>> Since 3.2.12 and 3.3, some systems are failing to boot with a BUG_ON.
>> Some other systems using the pata_jmicron driver fail to boot because no
>> disks are detected.  Passing pcie_aspm=force on the kernel command line
>> works around it.
>>
>> The cause: commit 4949be16822e ("PCI: ignore pre-1.1 ASPM quirking when
>> ASPM is disabled") changed the behaviour of pcie_aspm_sanity_check() to
>> always return 0 if aspm is disabled, in order to avoid cases where we
>> changed ASPM state on pre-PCIe 1.1 devices.
>>
>> This skipped the secondary function of pcie_aspm_sanity_check which was
>> to avoid us enabling ASPM on devices that had non-PCIe children, causing
>> trouble later on.  Move the aspm_disabled check so we continue to honour
>> that scenario.
>>
>> Addresses https://bugzilla.kernel.org/show_bug.cgi?id=42979 and
>>           http://bugs.debian.org/665420
>>
>> Reported-by: Romain Francoise <romain@orebokech.com> # kernel panic
>> Reported-by: Chris Holland <bandidoirlandes@gmail.com> # disk detection trouble
>> Signed-off-by: Matthew Garrett <mjg@redhat.com>
>> Cc: stable@vger.kernel.org
>> Tested-by: Hatem Masmoudi <hatem.masmoudi@gmail.com> # Dell Latitude E5520
>> Tested-by: janek <jan0x6c@gmail.com> # pata_jmicron with JMB362/JMB363
>> [jn: with more symptoms in log message]
>> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
>> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
>> (cherry picked from commit c9651e70ad0aa499814817cbf3cc1d0b806ed3a1)
>>
>> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
>> ---
>>  drivers/pci/pcie/aspm.c |   13 ++++++++++---
>>  1 files changed, 10 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
>> index 56a6966..0ff0182 100644
>> --- a/drivers/pci/pcie/aspm.c
>> +++ b/drivers/pci/pcie/aspm.c
>> @@ -500,9 +500,6 @@ static int pcie_aspm_sanity_check(struct pci_dev *pdev)
>>  	int pos;
>>  	u32 reg32;
>>  
>> -	if (aspm_disabled)
>> -		return 0;
>> -
>>  	/*
>>  	 * Some functions in a slot might not all be PCIe functions,
>>  	 * very strange. Disable ASPM for the whole slot
>> @@ -511,6 +508,16 @@ static int pcie_aspm_sanity_check(struct pci_dev *pdev)
>>  		pos = pci_pcie_cap(child);
>>  		if (!pos)
>>  			return -EINVAL;
>> +
>> +		/*
>> +		 * If ASPM is disabled then we're not going to change
>> +		 * the BIOS state. It's safe to continue even if it's a
>> +		 * pre-1.1 device
>> +		 */
>> +
>> +		if (aspm_disabled)
>> +			continue;
>> +
>>  		/*
>>  		 * Disable ASPM for pre-1.1 PCIe device, we follow MS to use
>>  		 * RBER bit to determine if a function is 1.1 version device
> 
> Do we not need the second patch Colin proposed as well to avoid
> regressions here?
> 
> Colin?
> 
> -apw

The stable updates patch that caused the Oneiric/Precise ASPM boot
regression is "PCI: ignore pre-1.1 ASPM quirking when ASPM is disabled".
The patch to correct that regression is "ASPM: Fix pcie devices with
non-pcie children"; contained in Precise but not yet in Oneiric.

rtg
Andy Whitcroft April 2, 2012, 4:15 p.m. UTC | #3
On Mon, Apr 02, 2012 at 09:41:24AM -0600, Tim Gardner wrote:
> From: Matthew Garrett <mjg@redhat.com>
> 
> BugLink: http://bugs.launchpad.net/bugs/961482
> 
> Since 3.2.12 and 3.3, some systems are failing to boot with a BUG_ON.
> Some other systems using the pata_jmicron driver fail to boot because no
> disks are detected.  Passing pcie_aspm=force on the kernel command line
> works around it.
> 
> The cause: commit 4949be16822e ("PCI: ignore pre-1.1 ASPM quirking when
> ASPM is disabled") changed the behaviour of pcie_aspm_sanity_check() to
> always return 0 if aspm is disabled, in order to avoid cases where we
> changed ASPM state on pre-PCIe 1.1 devices.
> 
> This skipped the secondary function of pcie_aspm_sanity_check which was
> to avoid us enabling ASPM on devices that had non-PCIe children, causing
> trouble later on.  Move the aspm_disabled check so we continue to honour
> that scenario.
> 
> Addresses https://bugzilla.kernel.org/show_bug.cgi?id=42979 and
>           http://bugs.debian.org/665420
> 
> Reported-by: Romain Francoise <romain@orebokech.com> # kernel panic
> Reported-by: Chris Holland <bandidoirlandes@gmail.com> # disk detection trouble
> Signed-off-by: Matthew Garrett <mjg@redhat.com>
> Cc: stable@vger.kernel.org
> Tested-by: Hatem Masmoudi <hatem.masmoudi@gmail.com> # Dell Latitude E5520
> Tested-by: janek <jan0x6c@gmail.com> # pata_jmicron with JMB362/JMB363
> [jn: with more symptoms in log message]
> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> (cherry picked from commit c9651e70ad0aa499814817cbf3cc1d0b806ed3a1)
> 
> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
> ---
>  drivers/pci/pcie/aspm.c |   13 ++++++++++---
>  1 files changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> index 56a6966..0ff0182 100644
> --- a/drivers/pci/pcie/aspm.c
> +++ b/drivers/pci/pcie/aspm.c
> @@ -500,9 +500,6 @@ static int pcie_aspm_sanity_check(struct pci_dev *pdev)
>  	int pos;
>  	u32 reg32;
>  
> -	if (aspm_disabled)
> -		return 0;
> -
>  	/*
>  	 * Some functions in a slot might not all be PCIe functions,
>  	 * very strange. Disable ASPM for the whole slot
> @@ -511,6 +508,16 @@ static int pcie_aspm_sanity_check(struct pci_dev *pdev)
>  		pos = pci_pcie_cap(child);
>  		if (!pos)
>  			return -EINVAL;
> +
> +		/*
> +		 * If ASPM is disabled then we're not going to change
> +		 * the BIOS state. It's safe to continue even if it's a
> +		 * pre-1.1 device
> +		 */
> +
> +		if (aspm_disabled)
> +			continue;
> +
>  		/*
>  		 * Disable ASPM for pre-1.1 PCIe device, we follow MS to use
>  		 * RBER bit to determine if a function is 1.1 version device

Based on the thread discussion this is the right commit.  Looks correct
as per upstream.

Acked-by: Andy Whitcroft <apw@canonical.com>

-apw
Tim Gardner April 2, 2012, 4:19 p.m. UTC | #4

diff mbox

Patch

diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index 56a6966..0ff0182 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -500,9 +500,6 @@  static int pcie_aspm_sanity_check(struct pci_dev *pdev)
 	int pos;
 	u32 reg32;
 
-	if (aspm_disabled)
-		return 0;
-
 	/*
 	 * Some functions in a slot might not all be PCIe functions,
 	 * very strange. Disable ASPM for the whole slot
@@ -511,6 +508,16 @@  static int pcie_aspm_sanity_check(struct pci_dev *pdev)
 		pos = pci_pcie_cap(child);
 		if (!pos)
 			return -EINVAL;
+
+		/*
+		 * If ASPM is disabled then we're not going to change
+		 * the BIOS state. It's safe to continue even if it's a
+		 * pre-1.1 device
+		 */
+
+		if (aspm_disabled)
+			continue;
+
 		/*
 		 * Disable ASPM for pre-1.1 PCIe device, we follow MS to use
 		 * RBER bit to determine if a function is 1.1 version device