[net-next,5/9] e1000e: Disable ASPM L1 on 82574

Message ID	1336038992-3144-6-git-send-email-jeffrey.t.kirsher@intel.com
State	Accepted, archived
Delegated to:	David Miller
Headers	show Return-Path: <netdev-owner@vger.kernel.org> From: Jeff Kirsher <jeffrey.t.kirsher@intel.com> To: davem@davemloft.net Cc: Chris Boot <bootc@bootc.net>, netdev@vger.kernel.org, gospo@redhat.com, sassmann@redhat.com, "Wyborny, Carolyn" <carolyn.wyborny@intel.com>, Nix <nix@esperi.org.uk>, Jeff Kirsher <jeffrey.t.kirsher@intel.com> Subject: [net-next 5/9] e1000e: Disable ASPM L1 on 82574 Date: Thu, 3 May 2012 02:56:28 -0700 Message-Id: <1336038992-3144-6-git-send-email-jeffrey.t.kirsher@intel.com> In-Reply-To: <1336038992-3144-1-git-send-email-jeffrey.t.kirsher@intel.com> References: <1336038992-3144-1-git-send-email-jeffrey.t.kirsher@intel.com> Sender: netdev-owner@vger.kernel.org Precedence: bulk

Kirsher, Jeffrey T May 3, 2012, 9:56 a.m. UTC

From: Chris Boot <bootc@bootc.net>

ASPM on the 82574 causes trouble. Currently the driver disables L0s for
this NIC but only disables L1 if the MTU is >1500. This patch simply
causes L1 to be disabled regardless of the MTU setting.

Signed-off-by: Chris Boot <bootc@bootc.net>
Cc: "Wyborny, Carolyn" <carolyn.wyborny@intel.com>
Cc: Nix <nix@esperi.org.uk>
Link: https://lkml.org/lkml/2012/3/19/362
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/e1000e/82571.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

Nix May 3, 2012, 10:08 a.m. UTC | #1

On 3 May 2012, Jeff Kirsher spake thusly:

> From: Chris Boot <bootc@bootc.net>
>
> ASPM on the 82574 causes trouble. Currently the driver disables L0s for
> this NIC but only disables L1 if the MTU is >1500. This patch simply
> causes L1 to be disabled regardless of the MTU setting.
>
> Signed-off-by: Chris Boot <bootc@bootc.net>
> Cc: "Wyborny, Carolyn" <carolyn.wyborny@intel.com>
> Cc: Nix <nix@esperi.org.uk>
> Link: https://lkml.org/lkml/2012/3/19/362
> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

(reminder: this is known not to fix the instance of this problem I am
experiencing, where ASPM is being re-enabled by something even if turned
off via setpci during boot, though it does fix those instances seen by
others where that doesn't happen. I'd have done more printf()-scattering
debugging to see where it's turned back on if it wasn't that this is
happening on an always-on server for which rebooting outside the dead of
night is a long-winded chore...)

FWIW I have also seen -- very rare -- lockups of the same nature on
82574L links in 100MbE mode using non-jumbo frames. However they are far
more common on GbE jumbo-framed links, normally taking less than an hour
to take the link down with a wildly corrupted register set (as shown by
ethtool).

(It's annoying this firmware isn't flashable so we could just *fix* this
bug rather than working around it. :( )


I think I might cheat a bit next and printk_once() the state of ASPM L1
on the errant PCI device from inside the scheduler when it flips from L1
off to L1 on again. At 100 tests per second that should indicate at what
time the thing is turned back on fairly tightly: even if not providing a
direct clue as to which bit of the kernel is doing it, if I combine it
with a set -x in userspace it should at least indicate what bit of the
boot process is happening at the same time. It'll be the weekend before
I can try that though.

Wyborny, Carolyn May 3, 2012, 8:12 p.m. UTC | #2

>-----Original Message-----
>From: Nix [mailto:nix@esperi.org.uk]
>Sent: Thursday, May 03, 2012 3:09 AM
>To: Kirsher, Jeffrey T
>Cc: davem@davemloft.net; Chris Boot; netdev@vger.kernel.org;
>gospo@redhat.com; sassmann@redhat.com; Wyborny, Carolyn
>Subject: Re: [net-next 5/9] e1000e: Disable ASPM L1 on 82574
>
 [..]
>(reminder: this is known not to fix the instance of this problem I am
>experiencing, where ASPM is being re-enabled by something even if turned
>off via setpci during boot, though it does fix those instances seen by
>others where that doesn't happen. I'd have done more printf()-scattering
>debugging to see where it's turned back on if it wasn't that this is
>happening on an always-on server for which rebooting outside the dead of
>night is a long-winded chore...)
>
>FWIW I have also seen -- very rare -- lockups of the same nature on
>82574L links in 100MbE mode using non-jumbo frames. However they are far
>more common on GbE jumbo-framed links, normally taking less than an hour
>to take the link down with a wildly corrupted register set (as shown by
>ethtool).
>
>(It's annoying this firmware isn't flashable so we could just *fix* this
>bug rather than working around it. :( )
>
>
>I think I might cheat a bit next and printk_once() the state of ASPM L1
>on the errant PCI device from inside the scheduler when it flips from L1
>off to L1 on again. At 100 tests per second that should indicate at what
>time the thing is turned back on fairly tightly: even if not providing a
>direct clue as to which bit of the kernel is doing it, if I combine it
>with a set -x in userspace it should at least indicate what bit of the
>boot process is happening at the same time. It'll be the weekend before
>I can try that though.
>
>--
>NULL && (void)

Hello,

It would be good to know why/how your system is re-enabling the setting.  The problem is not solvable in firmware unfortunately and is somewhat platform dependent. MMIO-tracer might be used to try and see when the re-enabling config space is written, but it might be too heavyweight for a live production system.

I am also working on a patch to the driver to detect the condition and then do a slot reset to avoid a whole system reboot.  Would you be willing to test it in your problem system?

Thanks,

Carolyn

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Nix May 3, 2012, 8:17 p.m. UTC | #3

On 3 May 2012, Carolyn Wyborny told this:

> It would be good to know why/how your system is re-enabling the
> setting. The problem is not solvable in firmware unfortunately and is
> somewhat platform dependent. MMIO-tracer might be used to try and see

I entirely forgot about that tool! *Definitely* worth trying.

I'll give it a try this weekend.

> when the re-enabling config space is written, but it might be too
> heavyweight for a live production system.

Given that the re-enabling happens at around the same time as the boot
scripts finish running (it's done by the time I can log in), that's not
going to be a problem. Hence my speculation that it's being re-enabled
when the interface stabilizes (which is, of course, asynchronous) or
something like that.

> I am also working on a patch to the driver to detect the condition and
> then do a slot reset to avoid a whole system reboot. Would you be
> willing to test it in your problem system?

Yes, definitely. The whole-system reboot is irritating because the
system is headless, and with its NICs dead that means a big red switch
to reboot when this problem strikes, which gives me the heebie-jeebies
:)

(Turning off ASPM definitely completely fixes it, so it *is* this bug.
It's just getting the disabling to stick that's proving tricky.)

Nix May 5, 2012, 4:33 p.m. UTC | #4

On 3 May 2012, nix@esperi.org.uk outgrape:

> On 3 May 2012, Carolyn Wyborny told this:
>
>> It would be good to know why/how your system is re-enabling the
>> setting. The problem is not solvable in firmware unfortunately and is
>> somewhat platform dependent. MMIO-tracer might be used to try and see
>
> I entirely forgot about that tool! *Definitely* worth trying.
>
> I'll give it a try this weekend.

Well, mmiotrace was a total flop: massive numbers of unexpected
secondary interrupts and a hard lockup. Still, I've now diagnosed this
bug and it's right up Matthew Garrett's street!

Matthew: the problem here is a server with an 82574L (controlled by the
e1000e driver). This NIC has a hardware bug causing it to lock up in a
way that only a reboot can solve in an hour or two if PCIe ASPM is not
disabled during boot (leaving me with my home directory stuck behind a
dead NIC on a headless machine, most annoying). The driver is attempting
to disable it, but failing.

>> when the re-enabling config space is written, but it might be too
>> heavyweight for a live production system.
>
> Given that the re-enabling happens at around the same time as the boot
> scripts finish running (it's done by the time I can log in), that's not
> going to be a problem. Hence my speculation that it's being re-enabled
> when the interface stabilizes (which is, of course, asynchronous) or
> something like that.

This is wrong. The disable never happens. The BIOS has been told to
enable PCIe ASPM. However, the kernel log says:

May  5 17:06:53 spindle info: [    0.629699]  pci0000:00: Requesting ACPI _OSC control (0x1d)
May  5 17:06:53 spindle info: [    0.629941]  pci0000:00: ACPI _OSC request failed (AE_NOT_FOUND), returned control mask: 0x1d
May  5 17:06:53 spindle info: [    0.630373] ACPI _OSC control for PCIe not granted, disabling ASPM

Unless pcie_aspm=force has been specified on the kernel command line,
this flips aspm_disabled to 1.

The e1000e driver then says (with a bit of extra debugging info I
added):

May  5 17:06:53 spindle info: [    1.248153] e1000e 0000:03:00.0: Disabling ASPM L0s L1
May  5 17:06:53 spindle info: [    1.248393] e1000e 0000:03:00.0: Disabling ASPM via pci_disable_link_state_locked()
May  5 17:06:53 spindle info: [    1.248823] e1000e 0000:03:00.0: aspm disabled, not forcing

i.e. because aspm_disabled is set, pci/pcie/aspm.c refuses to make any
changes at all to ASPM link state, not even to turn *off* ASPM on a
device on which the BIOS turned it on at boot. So ASPM remains enabled
and the NIC eventually locks up.

The question here is how to fix it. It appears that the motherboard or
BIOS on this machine does not grant _OSC control even (especially?) if
you have turned on PCIe ASPM in the BIOS. But perhaps even if _OSC is
not granted you should permit PCIe to be *disabled* by drivers, just not
enabled? (The BIOS appears to be buggy in this area: if you turn off
ASPM, save, and go back into setup, ASPM has turned itself back on
again!)

I'm not sure what the right thing to do is here: I don't know enough
about this area. But it does seem very strange that the only way I have
to turn off PCIe ASPM reliably on this device is to tell the kernel to
forcibly turn it *on*!
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Nix May 9, 2012, 2:02 p.m. UTC | #5

On 5 May 2012, nix@esperi.org.uk outgrape:

> The question here is how to fix it. It appears that the motherboard or
> BIOS on this machine does not grant _OSC control even (especially?) if
> you have turned on PCIe ASPM in the BIOS. But perhaps even if _OSC is
> not granted you should permit PCIe to be *disabled* by drivers, just not
> enabled? (The BIOS appears to be buggy in this area: if you turn off
> ASPM, save, and go back into setup, ASPM has turned itself back on
> again!)

This turned out to be me not knowing how to drive the BIOS's deeply
unintuitive configuration program. If I turn PCIe ASPM off in the BIOS,
the kernel does exactly the same as it does in my previous message (i.e.
decides that ASPM is disabled due to the failure of an _OSC request,
then refuses to change the ASPM link state of the e1000e), but since the
BIOS has already disabled ASPM, the card is not in crash-happy mode and
I don't need to force anything off by hand.

But if I turn ASPM on, as reported in my previous message the kernel
promptly bans itself from changing any PCIe ASPM link states whatsoever,
and the e1000e locks up about an hour later.

I presume that

May  5 17:06:53 spindle info: [    0.629699]  pci0000:00: Requesting ACPI _OSC control (0x1d)
May  5 17:06:53 spindle info: [    0.629941]  pci0000:00: ACPI _OSC request failed (AE_NOT_FOUND), returned control mask: 0x1d
May  5 17:06:53 spindle info: [    0.630373] ACPI _OSC control for PCIe not granted, disabling ASPM

is reporting some sort of BIOS bug, but it is at best confusing to have
the boot messages reporting that ASPM is disabled: better perhaps to
describe this as 'leaving ASPM on all devices how the BIOS set it' and
have the e1000e driver emit a giant flaming warning if it spots this
happening on an 82574L with ASPM turned on. (Or, alternatively, permit
ASPM to be turned off when the system is in this state, but not on,
whereupon the existing code in the e1000e driver will do the right
thing. But I don't know if that will break any laptops. This machine is
very much not a laptop.)

> I'm not sure what the right thing to do is here: I don't know enough
> about this area. But it does seem very strange that the only way I have
> to turn off PCIe ASPM reliably on this device is to tell the kernel to
> forcibly turn it *on*!

This is still strange, though it seems that turning ASPM completely off
in the BIOS will also serve.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[net-next,5/9] e1000e: Disable ASPM L1 on 82574

Commit Message

Comments

Patch