diff mbox

pci: fix kexec with power state D3

Message ID 49B37D38.7060304@kernel.org
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

Yinghai Lu March 8, 2009, 8:09 a.m. UTC
Impact: second kernel by kexec will have some pci devices working

Found one system with 82575EB, in the kernel that is kexeced, probe igb
failed with -2.

it looks like the same behavior happened on forcedeth.
try to check system_state to make sure if put it on D3

Jesse Brandeburg said that we should do that check in core code instead of
every device driver.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 drivers/pci/pci.c |   17 +++++++++++++++++
 1 file changed, 17 insertions(+)

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Rafael J. Wysocki March 8, 2009, 10:08 a.m. UTC | #1
On Sunday 08 March 2009, Yinghai Lu wrote:
> 
> Impact: second kernel by kexec will have some pci devices working
> 
> Found one system with 82575EB, in the kernel that is kexeced, probe igb
> failed with -2.
> 
> it looks like the same behavior happened on forcedeth.
> try to check system_state to make sure if put it on D3

This is not enough, because the PM code doesn't change system_state and
it uses pci_set_power_state too.

> Jesse Brandeburg said that we should do that check in core code instead of
> every device driver.

Well, I'm not really sure.  The drivers are where the bug is.

> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> 
> ---
>  drivers/pci/pci.c |   17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 
> Index: linux-2.6/drivers/pci/pci.c
> ===================================================================
> --- linux-2.6.orig/drivers/pci/pci.c
> +++ linux-2.6/drivers/pci/pci.c
> @@ -593,6 +593,14 @@ int pci_set_power_state(struct pci_dev *
>  	if (state == PCI_D3hot && (dev->dev_flags & PCI_DEV_FLAGS_NO_D3))
>  		return 0;
>  
> +	/*
> +	 * Apparently it is not possible to reinitialise from D3 hot,
> +	 * only put the device into D3 if we really go for poweroff.
> +	 */
> +	if (system_state != SYSTEM_POWER_OFF &&
> +	    (state == PCI_D3hot || state == PCI_D3cold))
> +		return 0;
> +

This breaks suspend/hibernation, doesn't it?  Surely, we want to put devices
into D3 when going for suspend, for example.

That's apart from the fact that the 'state == PCI_D3cold' is redundant.

>  	error = pci_raw_set_power_state(dev, state, true);
>  
>  	if (state > PCI_D0 && platform_pci_power_manageable(dev)) {
> @@ -1124,6 +1132,15 @@ int pci_enable_wake(struct pci_dev *dev,
>  	int error = 0;
>  	bool pme_done = false;
>  
> +	/*
> +	 * Apparently it is not possible to reinitialise from D3 hot,
> +	 * only put the device into D3 if we really go for poweroff.
> +	 * we only need to enable wake when we are going to power off
> +	 */
> +	if (enable && system_state != SYSTEM_POWER_OFF &&
> +	    (state == PCI_D3hot || state == PCI_D3cold))
> +		return 0;
> +
>  	if (enable && !device_may_wakeup(&dev->dev))
>  		return -EINVAL;

I don't like this at all, sorry.

I thought we were supposed to avoid using system_state in such a way,
apart from the other things.

Thanks,
Rafael
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ingo Molnar March 8, 2009, 10:15 a.m. UTC | #2
* Rafael J. Wysocki <rjw@sisk.pl> wrote:

> On Sunday 08 March 2009, Yinghai Lu wrote:
> > 
> > Impact: second kernel by kexec will have some pci devices working
> > 
> > Found one system with 82575EB, in the kernel that is kexeced, probe igb
> > failed with -2.
> > 
> > it looks like the same behavior happened on forcedeth.
> > try to check system_state to make sure if put it on D3
> 
> This is not enough, because the PM code doesn't change 
> system_state and it uses pci_set_power_state too.
> 
> > Jesse Brandeburg said that we should do that check in core 
> > code instead of every device driver.
> 
> Well, I'm not really sure.  The drivers are where the bug is.

So Yinghai has now written a driver fix and a generic code fix 
as well and both are being rejected pointing to the other side? 

This really sucks for users (who'd be happy with any of the 
patches) and the disagreement needs to be resolved ASAP.

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki March 8, 2009, 10:28 a.m. UTC | #3
On Sunday 08 March 2009, Ingo Molnar wrote:
> 
> * Rafael J. Wysocki <rjw@sisk.pl> wrote:
> 
> > On Sunday 08 March 2009, Yinghai Lu wrote:
> > > 
> > > Impact: second kernel by kexec will have some pci devices working
> > > 
> > > Found one system with 82575EB, in the kernel that is kexeced, probe igb
> > > failed with -2.
> > > 
> > > it looks like the same behavior happened on forcedeth.
> > > try to check system_state to make sure if put it on D3
> > 
> > This is not enough, because the PM code doesn't change 
> > system_state and it uses pci_set_power_state too.
> > 
> > > Jesse Brandeburg said that we should do that check in core 
> > > code instead of every device driver.
> > 
> > Well, I'm not really sure.  The drivers are where the bug is.
> 
> So Yinghai has now written a driver fix and a generic code fix 
> as well and both are being rejected pointing to the other side? 
> 
> This really sucks for users (who'd be happy with any of the 
> patches) and the disagreement needs to be resolved ASAP.

I hope you read my other comment (that you deleted from the reply)?

If we're going to fix this at the core level along the Yinghai's patch lines,
we'll have to introduce a few additional values for system_state and make
suspend and hibernation code use them.

Do you think it's worth the effort, given that only a couple of drivers have
been found to be affected so far?

Rafael
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki March 8, 2009, 10:33 a.m. UTC | #4
On Sunday 08 March 2009, Rafael J. Wysocki wrote:
> On Sunday 08 March 2009, Ingo Molnar wrote:
> > 
> > * Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > 
> > > On Sunday 08 March 2009, Yinghai Lu wrote:
> > > > 
> > > > Impact: second kernel by kexec will have some pci devices working
> > > > 
> > > > Found one system with 82575EB, in the kernel that is kexeced, probe igb
> > > > failed with -2.
> > > > 
> > > > it looks like the same behavior happened on forcedeth.
> > > > try to check system_state to make sure if put it on D3
> > > 
> > > This is not enough, because the PM code doesn't change 
> > > system_state and it uses pci_set_power_state too.
> > > 
> > > > Jesse Brandeburg said that we should do that check in core 
> > > > code instead of every device driver.
> > > 
> > > Well, I'm not really sure.  The drivers are where the bug is.
> > 
> > So Yinghai has now written a driver fix and a generic code fix 
> > as well and both are being rejected pointing to the other side? 
> > 
> > This really sucks for users (who'd be happy with any of the 
> > patches) and the disagreement needs to be resolved ASAP.
> 
> I hope you read my other comment (that you deleted from the reply)?
> 
> If we're going to fix this at the core level along the Yinghai's patch lines,
> we'll have to introduce a few additional values for system_state and make
> suspend and hibernation code use them.
> 
> Do you think it's worth the effort, given that only a couple of drivers have
> been found to be affected so far?

And, quite frankly, I'm not sure if users will be happy with the $subject
patch, because it _really_ breaks things (well, the kexec users who don't use
suspend might be, but surely suspend users who don't use kexec won't).

Thanks,
Rafael
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ingo Molnar March 8, 2009, 11:08 a.m. UTC | #5
* Rafael J. Wysocki <rjw@sisk.pl> wrote:

> On Sunday 08 March 2009, Rafael J. Wysocki wrote:
> > On Sunday 08 March 2009, Ingo Molnar wrote:
> > > 
> > > * Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > > 
> > > > On Sunday 08 March 2009, Yinghai Lu wrote:
> > > > > 
> > > > > Impact: second kernel by kexec will have some pci devices working
> > > > > 
> > > > > Found one system with 82575EB, in the kernel that is kexeced, probe igb
> > > > > failed with -2.
> > > > > 
> > > > > it looks like the same behavior happened on forcedeth.
> > > > > try to check system_state to make sure if put it on D3
> > > > 
> > > > This is not enough, because the PM code doesn't change 
> > > > system_state and it uses pci_set_power_state too.
> > > > 
> > > > > Jesse Brandeburg said that we should do that check in core 
> > > > > code instead of every device driver.
> > > > 
> > > > Well, I'm not really sure.  The drivers are where the bug is.
> > > 
> > > So Yinghai has now written a driver fix and a generic code fix 
> > > as well and both are being rejected pointing to the other side? 
> > > 
> > > This really sucks for users (who'd be happy with any of the 
> > > patches) and the disagreement needs to be resolved ASAP.
> > 
> > I hope you read my other comment (that you deleted from the reply)?
> > 
> > If we're going to fix this at the core level along the Yinghai's patch lines,
> > we'll have to introduce a few additional values for system_state and make
> > suspend and hibernation code use them.
> > 
> > Do you think it's worth the effort, given that only a couple of drivers have
> > been found to be affected so far?
> 
> And, quite frankly, I'm not sure if users will be happy with 
> the $subject patch, because it _really_ breaks things (well, 
> the kexec users who don't use suspend might be, but surely 
> suspend users who don't use kexec won't).

Please note that i havent reviewed the patches and i did not 
take any sides in the discussion - i just flagged the maintainer 
ping-pong. As long as we pick one of the patches (or a third 
one) within a bound amount of time we should be fine :)

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jesse Barnes March 20, 2009, 1:49 a.m. UTC | #6
> > And, quite frankly, I'm not sure if users will be happy with 
> > the $subject patch, because it _really_ breaks things (well, 
> > the kexec users who don't use suspend might be, but surely 
> > suspend users who don't use kexec won't).
> 
> Please note that i havent reviewed the patches and i did not 
> take any sides in the discussion - i just flagged the maintainer 
> ping-pong. As long as we pick one of the patches (or a third 
> one) within a bound amount of time we should be fine :)

I'll defer to Rafael here; he's been working the most in this area.
The changelog wasn't very complete for the original patch, but it
sounds like in the kexec case the newly booted kernel will get an igb
device in D3 which it can't handle?  That really does sound like a
driver bug, not something we should mess with in the core.
Rafael J. Wysocki March 20, 2009, 11:29 a.m. UTC | #7
On Friday 20 March 2009, Jesse Barnes wrote:
> > > And, quite frankly, I'm not sure if users will be happy with 
> > > the $subject patch, because it _really_ breaks things (well, 
> > > the kexec users who don't use suspend might be, but surely 
> > > suspend users who don't use kexec won't).
> > 
> > Please note that i havent reviewed the patches and i did not 
> > take any sides in the discussion - i just flagged the maintainer 
> > ping-pong. As long as we pick one of the patches (or a third 
> > one) within a bound amount of time we should be fine :)
> 
> I'll defer to Rafael here; he's been working the most in this area.
> The changelog wasn't very complete for the original patch, but it
> sounds like in the kexec case the newly booted kernel will get an igb
> device in D3 which it can't handle?  That really does sound like a
> driver bug, not something we should mess with in the core.

I have already posted an alternative patch for igb that's been reported to
fix the problem.

Thanks,
Rafael
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

Index: linux-2.6/drivers/pci/pci.c
===================================================================
--- linux-2.6.orig/drivers/pci/pci.c
+++ linux-2.6/drivers/pci/pci.c
@@ -593,6 +593,14 @@  int pci_set_power_state(struct pci_dev *
 	if (state == PCI_D3hot && (dev->dev_flags & PCI_DEV_FLAGS_NO_D3))
 		return 0;
 
+	/*
+	 * Apparently it is not possible to reinitialise from D3 hot,
+	 * only put the device into D3 if we really go for poweroff.
+	 */
+	if (system_state != SYSTEM_POWER_OFF &&
+	    (state == PCI_D3hot || state == PCI_D3cold))
+		return 0;
+
 	error = pci_raw_set_power_state(dev, state, true);
 
 	if (state > PCI_D0 && platform_pci_power_manageable(dev)) {
@@ -1124,6 +1132,15 @@  int pci_enable_wake(struct pci_dev *dev,
 	int error = 0;
 	bool pme_done = false;
 
+	/*
+	 * Apparently it is not possible to reinitialise from D3 hot,
+	 * only put the device into D3 if we really go for poweroff.
+	 * we only need to enable wake when we are going to power off
+	 */
+	if (enable && system_state != SYSTEM_POWER_OFF &&
+	    (state == PCI_D3hot || state == PCI_D3cold))
+		return 0;
+
 	if (enable && !device_may_wakeup(&dev->dev))
 		return -EINVAL;