[BUGFIX,3/4] PCI/PM: Fix config reg access for D3cold and bridge suspending

Message ID 1343975435-25469-4-git-send-email-ying.huang@intel.com
State Superseded

Commit Message

Huang, Ying Aug. 3, 2012, 6:30 a.m. UTC
This patch fixes the following bug:

http://marc.info/?l=linux-pci&m=134338059022620&w=2

In that report, lspci does not work properly when a device and its
parent bridge (such as a PCIe port) are suspended.  This is because
the device's configuration space registers are not accessible when the
parent bridge is suspended or the device is in the D3cold state.

To solve the issue, the bridge/PCIe port connected to the device is
put into the active state before configuration space registers are
read or written.  If the device is in the D3cold state, it is put
into the active state too.

To avoid resuming and suspending the PCIe port on every configuration
register read/write, a small delay is added before the PCIe port
suspends.

Reported-by: Bjorn Mork <bjorn@mork.no>
Signed-off-by: Huang Ying <ying.huang@intel.com>
---
 drivers/pci/pci-sysfs.c        |   37 +++++++++++++++++++++++++++++++++++++
 drivers/pci/pcie/portdrv_pci.c |    9 +++++++++
 2 files changed, 46 insertions(+)
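
The ordering described in the commit message above can be sketched as a small userspace model. This is not kernel code: the model_* types and helpers are hypothetical stand-ins, and the accessibility rule is the one the commit message states (active parent, device not in D3cold).

```c
#include <stddef.h>	/* NULL, used for a device without a parent */

/* Userspace model only: these types and helpers are illustrative, not kernel APIs. */
enum model_state { MODEL_ACTIVE, MODEL_SUSPENDED, MODEL_D3COLD };

struct model_dev {
	struct model_dev *parent;	/* NULL if there is no parent bridge */
	enum model_state state;
	int usage;			/* stand-in for the runtime PM usage count */
};

/* Config space is reachable only with an active parent and a device not in D3cold. */
static int model_config_accessible(const struct model_dev *d)
{
	if (d->parent && d->parent->state != MODEL_ACTIVE)
		return 0;
	return d->state != MODEL_D3COLD;
}

/* Same order as the commit message: parent first, then the device if it is in D3cold. */
static void model_config_get(struct model_dev *d)
{
	if (d->parent) {
		d->parent->usage++;
		d->parent->state = MODEL_ACTIVE;
	}
	d->usage++;			/* hold the device without forcing a resume */
	if (d->state == MODEL_D3COLD)
		d->state = MODEL_ACTIVE;
}

/* Drop both references; the later re-suspend is not modeled here. */
static void model_config_put(struct model_dev *d)
{
	d->usage--;
	if (d->parent)
		d->parent->usage--;
}
```

In this model a device that is merely suspended (not in D3cold) is left alone by model_config_get(); only its parent is woken, which matches the D3cold check in the patch.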

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Alan Stern Aug. 3, 2012, 2:46 p.m. UTC | #1
On Fri, 3 Aug 2012, Huang Ying wrote:

> +static void
> +pci_config_pm_runtime_put(struct pci_dev *pdev)
> +{
> +	struct device *dev = &pdev->dev;
> +	struct device *parent = dev->parent;
> +
> +	pm_runtime_put(dev);
> +	if (parent)
> +		pm_runtime_put(parent);
> +}

This is just the sort of thing Rafael and I have been talking about.  
Why do an asynchronous put, going to all the trouble of using the 
workqueue, if the idle routine is just going to call 
pm_schedule_suspend()?

Why not call pm_runtime_put_sync() instead?

Alan Stern

Rafael J. Wysocki Aug. 4, 2012, 9:37 p.m. UTC | #2
On Friday, August 03, 2012, Alan Stern wrote:
> On Fri, 3 Aug 2012, Huang Ying wrote:
> 
> > +static void
> > +pci_config_pm_runtime_put(struct pci_dev *pdev)
> > +{
> > +	struct device *dev = &pdev->dev;
> > +	struct device *parent = dev->parent;
> > +
> > +	pm_runtime_put(dev);
> > +	if (parent)
> > +		pm_runtime_put(parent);
> > +}
> 
> This is just the sort of thing Rafael and I have been talking about.  
> Why do an asynchronous put, going to all the trouble of using the 
> workqueue, if the idle routine is just going to call 
> pm_schedule_suspend()?

If that's PCI, it will call pm_runtime_suspend().  That probably _should_ be
pm_schedule_suspend(), but it isn't at the moment.

> Why not call pm_runtime_put_sync() instead?

I guess because the caller doesn't care whether or not the devices will be
suspended immediately and we seem to have agreed already that the added
workqueue overhead is minimal.

If the _idle() routine were to call pm_schedule_suspend(), though, I'd
agree that the overhead would be absolutely unnecessary.

Thanks,
Rafael
Rafael J. Wysocki Aug. 4, 2012, 9:44 p.m. UTC | #3
On Saturday, August 04, 2012, Rafael J. Wysocki wrote:
> On Friday, August 03, 2012, Alan Stern wrote:
> > On Fri, 3 Aug 2012, Huang Ying wrote:
> > 
> > > +static void
> > > +pci_config_pm_runtime_put(struct pci_dev *pdev)
> > > +{
> > > +	struct device *dev = &pdev->dev;
> > > +	struct device *parent = dev->parent;
> > > +
> > > +	pm_runtime_put(dev);
> > > +	if (parent)
> > > +		pm_runtime_put(parent);
> > > +}
> > 
> > This is just the sort of thing Rafael and I have been talking about.  
> > Why do an asynchronous put, going to all the trouble of using the 
> > workqueue, if the idle routine is just going to call 
> > pm_schedule_suspend()?
> 
> If that's PCI, it will call pm_runtime_suspend().  That probably _should_ be
> pm_schedule_suspend(), but it isn't at the moment.
> 
> > Why not call pm_runtime_put_sync() instead?
> 
> I guess because the caller doesn't care whether or not the devices will be
> suspended immediately and we seem to have agreed already that the added
> workqueue overhead is minimal.
> 
> If the _idle() routine were to call pm_schedule_suspend(), though, I'd
> agree that the overhead would be absolutely unnecessary.

Sorry, I should have had a closer look at pcie_port_runtime_idle() before
replying.

You're right, pm_runtime_put_sync() should be used for the parent.

Thanks,
Rafael
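
For readers following the review, here is a minimal sketch of the put helper with the change agreed on above: an asynchronous put for the device and pm_runtime_put_sync() for the parent. To make it compile outside the kernel, struct device, struct pci_dev, pm_runtime_put() and pm_runtime_put_sync() are replaced by simplified userspace stand-ins, and model_runtime_idle() is a hypothetical model of the port's idle callback from the patch. It illustrates the suggestion and is not the follow-up patch itself.

```c
#include <stddef.h>	/* NULL */

/* Userspace stand-ins so the sketch compiles outside the kernel. */
struct device {
	struct device *parent;
	int usage_count;	/* modeled runtime PM usage counter */
	int idle_work_queued;	/* idle checks deferred to a (modeled) workqueue */
	int suspend_delay_ms;	/* delay armed by the idle callback, 0 = none */
};

struct pci_dev {
	struct device dev;
};

/* Hypothetical model of pcie_port_runtime_idle(): arm a 10 ms delayed suspend. */
static void model_runtime_idle(struct device *dev)
{
	dev->suspend_delay_ms = 10;
}

/* Stand-in for pm_runtime_put(): drop a reference and defer the idle check. */
static int pm_runtime_put(struct device *dev)
{
	if (--dev->usage_count == 0)
		dev->idle_work_queued++;
	return 0;
}

/* Stand-in for pm_runtime_put_sync(): drop a reference and run the idle check now. */
static int pm_runtime_put_sync(struct device *dev)
{
	if (--dev->usage_count == 0)
		model_runtime_idle(dev);
	return 0;
}

/* The helper from the patch, with the synchronous put for the parent. */
static void pci_config_pm_runtime_put(struct pci_dev *pdev)
{
	struct device *dev = &pdev->dev;
	struct device *parent = dev->parent;

	pm_runtime_put(dev);
	if (parent)
		pm_runtime_put_sync(parent);
}
```

In the model, the parent's delayed suspend is armed directly from the caller's context, with no work item queued for it, which is the overhead Alan's question was about.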

Patch

--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -458,6 +458,35 @@  boot_vga_show(struct device *dev, struct
 }
 struct device_attribute vga_attr = __ATTR_RO(boot_vga);
 
+static void
+pci_config_pm_runtime_get(struct pci_dev *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct device *parent = dev->parent;
+
+	if (parent)
+		pm_runtime_get_sync(parent);
+	pm_runtime_get_noresume(dev);
+	/*
+	 * pdev->current_state is set to PCI_D3cold during suspending,
+	 * so wait until suspending completes
+	 */
+	pm_runtime_barrier(dev);
+	if (pdev->current_state == PCI_D3cold)
+		pm_runtime_resume(dev);
+}
+
+static void
+pci_config_pm_runtime_put(struct pci_dev *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct device *parent = dev->parent;
+
+	pm_runtime_put(dev);
+	if (parent)
+		pm_runtime_put(parent);
+}
+
 static ssize_t
 pci_read_config(struct file *filp, struct kobject *kobj,
 		struct bin_attribute *bin_attr,
@@ -484,6 +513,8 @@  pci_read_config(struct file *filp, struc
 		size = count;
 	}
 
+	pci_config_pm_runtime_get(dev);
+
 	if ((off & 1) && size) {
 		u8 val;
 		pci_user_read_config_byte(dev, off, &val);
@@ -529,6 +560,8 @@  pci_read_config(struct file *filp, struc
 		--size;
 	}
 
+	pci_config_pm_runtime_put(dev);
+
 	return count;
 }
 
@@ -549,6 +582,8 @@  pci_write_config(struct file* filp, stru
 		count = size;
 	}
 	
+	pci_config_pm_runtime_get(dev);
+
 	if ((off & 1) && size) {
 		pci_user_write_config_byte(dev, off, data[off - init_off]);
 		off++;
@@ -587,6 +622,8 @@  pci_write_config(struct file* filp, stru
 		--size;
 	}
 
+	pci_config_pm_runtime_put(dev);
+
 	return count;
 }
 
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -140,9 +140,17 @@  static int pcie_port_runtime_resume(stru
 {
 	return 0;
 }
+
+static int pcie_port_runtime_idle(struct device *dev)
+{
+	/* Delay for a short while to prevent too frequent suspend/resume */
+	pm_schedule_suspend(dev, 10);
+	return -EBUSY;
+}
 #else
 #define pcie_port_runtime_suspend	NULL
 #define pcie_port_runtime_resume	NULL
+#define pcie_port_runtime_idle		NULL
 #endif
 
 static const struct dev_pm_ops pcie_portdrv_pm_ops = {
@@ -155,6 +163,7 @@  static const struct dev_pm_ops pcie_port
 	.resume_noirq	= pcie_port_resume_noirq,
 	.runtime_suspend = pcie_port_runtime_suspend,
 	.runtime_resume = pcie_port_runtime_resume,
+	.runtime_idle	= pcie_port_runtime_idle,
 };
 
 #define PCIE_PORTDRV_PM_OPS	(&pcie_portdrv_pm_ops)
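
The effect of the delay in pcie_port_runtime_idle() above can be illustrated with a small userspace model. It assumes a simulated millisecond clock and a hypothetical model_port type; nothing here is kernel API. The sketch counts how often a port has to be resumed for a burst of configuration accesses, once with no delay and once with the 10 ms delay the patch uses.

```c
/* Hypothetical userspace model of a port with a delayed runtime suspend. */
struct model_port {
	int active;		/* 1 while the port is powered */
	long suspend_at_ms;	/* pending suspend deadline, -1 = none */
	int resume_count;	/* how many times the port had to be resumed */
};

/* Advance the modeled clock: fire the pending suspend if its deadline has passed. */
static void model_tick(struct model_port *p, long now_ms)
{
	if (p->active && p->suspend_at_ms >= 0 && now_ms >= p->suspend_at_ms) {
		p->active = 0;
		p->suspend_at_ms = -1;
	}
}

/* One config access at now_ms: resume if needed, then re-arm the suspend delay. */
static void model_access(struct model_port *p, long now_ms, long delay_ms)
{
	model_tick(p, now_ms);
	if (!p->active) {
		p->active = 1;
		p->resume_count++;
	}
	/* The access itself would happen here; afterwards the port goes idle. */
	p->suspend_at_ms = now_ms + delay_ms;
}

/* Count resumes for n accesses spaced gap_ms apart, starting from a suspended port. */
static int model_count_resumes(int n, long gap_ms, long delay_ms)
{
	struct model_port p = { 0, -1, 0 };
	int i;

	for (i = 0; i < n; i++)
		model_access(&p, i * gap_ms, delay_ms);
	return p.resume_count;
}
```

With accesses 1 ms apart, the model resumes the port on every access when the delay is 0 and only once when the delay is 10 ms. Accesses spaced further apart than the delay still resume the port each time.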