PCI/AER: Prevent runtime power management during recovery

Message ID 20180611222918.1708-1-keith.busch@intel.com
State New
Delegated to: Bjorn Helgaas

Commit Message

Keith Busch June 11, 2018, 10:29 p.m.
A bridge that supports D3 but not hotplug will be subject to runtime
power management placing it in a non-operation power state if it doesn't
have any devices attached. This patch will prevent this power management
during error recovery so that the rescan at the end may be successful.

Cc: Oza Pawandeep <poza@codeaurora.org>
Signed-off-by: Keith Busch <keith.busch@intel.com>
---
 drivers/pci/pcie/err.c | 3 +++
 1 file changed, 3 insertions(+)
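
The effect described in the commit message can be modelled in user space. What follows is a minimal sketch, not kernel code: `struct toy_dev` and the `toy_*` helpers are hypothetical stand-ins for the runtime-PM usage counter, and the model assumes, as reported later in this thread, that a rescan does not wake a suspended bridge by itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for one device's runtime-PM bookkeeping. */
struct toy_dev {
	int  usage_count;   /* > 0 blocks runtime suspend */
	int  child_count;   /* attached children also block suspend */
	bool runtime_auto;  /* true: runtime PM allowed, false: forbidden */
	bool suspended;     /* true models the bridge sitting in D3 */
};

/* Suspend the device if nothing holds it active. */
static void toy_idle(struct toy_dev *d)
{
	if (d->usage_count == 0 && d->child_count == 0)
		d->suspended = true;
}

/* Models the idea behind pm_runtime_forbid(): pin the device active. */
static void toy_forbid(struct toy_dev *d)
{
	if (!d->runtime_auto)
		return;             /* already forbidden */
	d->runtime_auto = false;
	d->usage_count++;
	d->suspended = false;       /* forbidding also resumes the device */
}

/* Models the idea behind pm_runtime_allow(): drop the pin, re-check idle. */
static void toy_allow(struct toy_dev *d)
{
	if (d->runtime_auto)
		return;             /* already allowed */
	assert(d->usage_count > 0);
	d->runtime_auto = true;
	d->usage_count--;
	toy_idle(d);
}

/*
 * Models fatal-error recovery below a bridge: remove every child, then
 * rescan. Returns true if the rescan found the device again.
 */
static bool toy_recovery(struct toy_dev *bridge, bool hold_active)
{
	bool rescan_ok;

	if (hold_active)
		toy_forbid(bridge);

	bridge->child_count = 0;    /* all devices below the bridge removed */
	toy_idle(bridge);           /* runtime PM may now suspend the bridge */

	rescan_ok = !bridge->suspended; /* assumption: rescan needs an active bridge */
	if (rescan_ok)
		bridge->child_count = 1;

	if (hold_active)
		toy_allow(bridge);

	return rescan_ok;
}
```

Called with `hold_active` set to false, the model's rescan fails because the bridge has already idled into its suspended state. With `hold_active` set to true, the rescan succeeds and the usage count is back at zero afterwards.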

Comments

Sinan Kaya June 12, 2018, 4:40 a.m. | #1
On 6/11/2018 6:29 PM, Keith Busch wrote:
> A bridge that supports D3 but not hotplug will be subject to runtime
> power management placing it in a non-operation power state if it doesn't
> have any devices attached. This patch will prevent this power management
> during error recovery so that the rescan at the end may be successful.

If there is no card connected, why would the bridge observe a fatal error?

Oza Pawandeep June 12, 2018, 7:51 a.m. | #2
On 2018-06-12 10:10, Sinan Kaya wrote:
> On 6/11/2018 6:29 PM, Keith Busch wrote:
>> A bridge that supports D3 but not hotplug will be subject to runtime
>> power management placing it in a non-operation power state if it doesn't
>> have any devices attached. This patch will prevent this power management
>> during error recovery so that the rescan at the end may be successful.
>
> If there is no card connected, why would the bridge observe a fatal error?

fatal error could be coming from anywhere let us say below RootPort..
and RP observes it and decides to take the tree down....
Is that the case Keith is talking about ?

Why will re-enumeration be a problem even if runtime PM is active ?
I assume that enumeration will get bridge out of D3.

Keith Busch June 12, 2018, 2:44 p.m. | #3
On Tue, Jun 12, 2018 at 01:21:53PM +0530, poza@codeaurora.org wrote:
> On 2018-06-12 10:10, Sinan Kaya wrote:
> > On 6/11/2018 6:29 PM, Keith Busch wrote:
> > > A bridge that supports D3 but not hotplug will be subject to runtime
> > > power management placing it in a non-operation power state if it doesn't
> > > have any devices attached. This patch will prevent this power management
> > > during error recovery so that the rescan at the end may be successful.
> >
> > If there is no card connected, why would the bridge observe a fatal error?
> 
> fatal error could be coming from anywhere let us say below RootPort..
> and RP observes it and decides to take the tree down....
> Is that the case Keith is talking about ?

Right, the err fatal handling removes all the devices below a bridge,
making that bridge allowed for run time d3.
 
> Why will re-enumeration be a problem even if runtime PM is active ?
> I assume that enumeration will get bridge out of D3.

That doesn't seem to be the case.

Oza Pawandeep June 12, 2018, 3:16 p.m. | #4
On 2018-06-12 20:14, Keith Busch wrote:
> On Tue, Jun 12, 2018 at 01:21:53PM +0530, poza@codeaurora.org wrote:
>> On 2018-06-12 10:10, Sinan Kaya wrote:
>> > On 6/11/2018 6:29 PM, Keith Busch wrote:
>> > > A bridge that supports D3 but not hotplug will be subject to runtime
>> > > power management placing it in a non-operation power state if it doesn't
>> > > have any devices attached. This patch will prevent this power management
>> > > during error recovery so that the rescan at the end may be successful.
>> >
>> > If there is no card connected, why would the bridge observe a fatal error?
>> 
>> fatal error could be coming from anywhere let us say below RootPort..
>> and RP observes it and decides to take the tree down....
>> Is that the case Keith is talking about ?
> 
> Right, the err fatal handling removes all the devices below a bridge,
> making that bridge allowed for run time d3.
> 
>> Why will re-enumeration be a problem even if runtime PM is active ?
>> I assume that enumeration will get bridge out of D3.
> 
> That doesn't seem to be the case.

reset_link(udev, service) will initiate an SBR (secondary bus reset), which
should hot reset the bridge and bring it out of D3. But it seems even the
SBR is not doing that for you.

Although I am not sure whether the EP has to be designed to reset its config
space upon SBR. Even if EPs are, some just might not do it all correctly.

The code looks okay to me anyway.
Reviewed-by: Oza Pawandeep <poza@codeaurora.org>
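
For readers unfamiliar with the SBR mentioned above, here is a rough user-space sketch of the register sequence: the Secondary Bus Reset bit (0x40) in the bridge's Bridge Control register is set and then cleared again. `struct toy_bridge` and the `toy_*` helpers are hypothetical and only model the bit toggling. The hold and settle delays that real hardware needs are left out, and the sketch says nothing about whether the bridge leaves D3.

```c
#include <stdint.h>

/* Secondary Bus Reset bit in the PCI Bridge Control register. */
#define TOY_BRIDGE_CTL_BUS_RESET 0x40u

/* Hypothetical in-memory stand-in for a bridge's Bridge Control register. */
struct toy_bridge {
	uint16_t bridge_control;
	unsigned int resets_seen;   /* rising edges of the reset bit */
};

/* Models a config write to Bridge Control and counts reset assertions. */
static void toy_write_bridge_control(struct toy_bridge *b, uint16_t val)
{
	if (!(b->bridge_control & TOY_BRIDGE_CTL_BUS_RESET) &&
	    (val & TOY_BRIDGE_CTL_BUS_RESET))
		b->resets_seen++;
	b->bridge_control = val;
}

/* Assert, then deassert, the Secondary Bus Reset bit; other bits are kept. */
static void toy_secondary_bus_reset(struct toy_bridge *b)
{
	uint16_t ctl = b->bridge_control;

	toy_write_bridge_control(b, (uint16_t)(ctl | TOY_BRIDGE_CTL_BUS_RESET));
	/* Real code must hold the reset for a minimum time here. */
	toy_write_bridge_control(b, (uint16_t)(ctl & ~TOY_BRIDGE_CTL_BUS_RESET));
	/* Real code must also wait for downstream devices to become ready. */
}
```

After one call to `toy_secondary_bus_reset()`, the model records exactly one reset and leaves every other Bridge Control bit as it was.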

Patch

diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index f7ce0cb0b0b7..247b6ce14f0d 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -16,6 +16,7 @@ 
 #include <linux/kernel.h>
 #include <linux/errno.h>
 #include <linux/aer.h>
+#include <linux/pm_runtime.h>
 #include "portdrv.h"
 #include "../pci.h"
 
@@ -294,6 +295,7 @@  void pcie_do_fatal_recovery(struct pci_dev *dev, u32 service)
 		udev = dev->bus->self;
 
 	parent = udev->subordinate;
+	pm_runtime_forbid(&udev->dev);
 	pci_lock_rescan_remove();
 	list_for_each_entry_safe_reverse(pdev, temp, &parent->devices,
 					 bus_list) {
@@ -329,6 +331,7 @@  void pcie_do_fatal_recovery(struct pci_dev *dev, u32 service)
 	}
 
 	pci_unlock_rescan_remove();
+	pm_runtime_allow(&udev->dev);
 }
 
 /**