[2/2] PCI: Warn if power_state != PCI_D0 when write MSI message

Submitted by Yijing Wang on Sept. 12, 2013, 8:20 a.m.

Details

Message ID 1378974012-13364-1-git-send-email-wangyijing@huawei.com
State Rejected
Headers show

Commit Message

Yijing Wang Sept. 12, 2013, 8:20 a.m.
When we setup MSI, if device power state is not equal to PCI_D0,
system will silently ignore writing MSI message, but pci_enable_msi()
still return 0 which seems setup successfully. So we should warn here
to help diagnose this issue.

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
---
 drivers/pci/msi.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

Comments

Bjorn Helgaas Sept. 27, 2013, 9:03 p.m.
[+cc Ben]

On Thu, Sep 12, 2013 at 2:20 AM, Yijing Wang <wangyijing@huawei.com> wrote:
> When we setup MSI, if device power state is not equal to PCI_D0,
> system will silently ignore writing MSI message, but pci_enable_msi()
> still return 0 which seems setup successfully. So we should warn here
> to help diagnose this issue.
>
> Signed-off-by: Yijing Wang <wangyijing@huawei.com>
> ---
>  drivers/pci/msi.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
> index aca7578..25ed59d 100644
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -291,6 +291,8 @@ void __write_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
>  {
>         if (entry->dev->current_state != PCI_D0) {
>                 /* Don't touch the hardware now */
> +               dev_warn(&entry->dev->dev,
> +                       "current_state != PCI_D0, ignore writing MSI message!\n");

Is there a real problem here?  If there is a real problem, a printk
doesn't help fix it.  If there's no problem, I don't see the point of
this printk.

I would expect that if the device is not in D0, we should remember the
hardware updates that need to be made, and if the device returns to
D0, we should apply the updates then.  If that's the case this is not
an error and we shouldn't warn about it.

Bjorn

>         } else if (entry->msi_attrib.is_msix) {
>                 void __iomem *base;
>                 base = entry->mask_base +
> --
> 1.7.1
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ben Hutchings Sept. 28, 2013, 2:28 a.m.
On Fri, 2013-09-27 at 15:03 -0600, Bjorn Helgaas wrote:
> [+cc Ben]
> 
> On Thu, Sep 12, 2013 at 2:20 AM, Yijing Wang <wangyijing@huawei.com> wrote:
> > When we setup MSI, if device power state is not equal to PCI_D0,
> > system will silently ignore writing MSI message, but pci_enable_msi()
> > still return 0 which seems setup successfully. So we should warn here
> > to help diagnose this issue.
> >
> > Signed-off-by: Yijing Wang <wangyijing@huawei.com>
> > ---
> >  drivers/pci/msi.c |    2 ++
> >  1 files changed, 2 insertions(+), 0 deletions(-)
> >
> > diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
> > index aca7578..25ed59d 100644
> > --- a/drivers/pci/msi.c
> > +++ b/drivers/pci/msi.c
> > @@ -291,6 +291,8 @@ void __write_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
> >  {
> >         if (entry->dev->current_state != PCI_D0) {
> >                 /* Don't touch the hardware now */
> > +               dev_warn(&entry->dev->dev,
> > +                       "current_state != PCI_D0, ignore writing MSI message!\n");
> 
> Is there a real problem here?  If there is a real problem, a printk
> doesn't help fix it.  If there's no problem, I don't see the point of
> this printk.
> 
> I would expect that if the device is not in D0, we should remember the
> hardware updates that need to be made, and if the device returns to
> D0, we should apply the updates then.  If that's the case this is not
> an error and we shouldn't warn about it.
[...]

During system suspend all IRQ affinities are changed to the boot CPU as
other CPUs are disabled, and then I think they are reverted during
system resume.  This certainly used to be done while most devices were
suspended.  That's why we have to check for the device power state here.
If suspend/resume is still done in the same order then we shouldn't log
a warning about this.

Yijing, why not add a check for this in pci_enable_msi() itself if you
think it's that important to warn about?

Ben.
Yijing Wang Sept. 29, 2013, 1:32 a.m.
>> @@ -291,6 +291,8 @@ void __write_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
>>  {
>>         if (entry->dev->current_state != PCI_D0) {
>>                 /* Don't touch the hardware now */
>> +               dev_warn(&entry->dev->dev,
>> +                       "current_state != PCI_D0, ignore writing MSI message!\n");
> 
> Is there a real problem here?  If there is a real problem, a printk
> doesn't help fix it.  If there's no problem, I don't see the point of
> this printk.
> 
> I would expect that if the device is not in D0, we should remember the
> hardware updates that need to be made, and if the device returns to
> D0, we should apply the updates then.  If that's the case this is not
> an error and we shouldn't warn about it.

My colleagues found some msi irqs can not work normally in their driver which works
normally in SLES11 SP1(2.6.32 kernel). I review their code and found that they forget to
call pci_enable_device(), so their device always in PCI_UNKNOWN state despite the really PCI Power
State in PM register is PCI_D0 state. In this case, users call pci_enable_msi() always return success,
but actually, nothing was wrote to the MSI registers.

So I think we should warn here, if someone use pci_enable_msi(), but their devices are not in PCI_D0.

Thanks!
Yijing.

> 
>>         } else if (entry->msi_attrib.is_msix) {
>>                 void __iomem *base;
>>                 base = entry->mask_base +
>> --
>> 1.7.1
>>
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
>
Yijing Wang Sept. 29, 2013, 1:38 a.m.
>> I would expect that if the device is not in D0, we should remember the
>> hardware updates that need to be made, and if the device returns to
>> D0, we should apply the updates then.  If that's the case this is not
>> an error and we shouldn't warn about it.
> [...]
> 
> During system suspend all IRQ affinities are changed to the boot CPU as
> other CPUs are disabled, and then I think they are reverted during
> system resume.  This certainly used to be done while most devices were
> suspended.  That's why we have to check for the device power state here.
> If suspend/resume is still done in the same order then we shouldn't log
> a warning about this.
> 
> Yijing, why not add a check for this in pci_enable_msi() itself if you
> think it's that important to warn about?

This is a  good idea, better than just warn a inconspicuous message.

Thanks!
Yijing.

Patch hide | download patch | download mbox

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index aca7578..25ed59d 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -291,6 +291,8 @@  void __write_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
 {
 	if (entry->dev->current_state != PCI_D0) {
 		/* Don't touch the hardware now */
+		dev_warn(&entry->dev->dev,
+			"current_state != PCI_D0, ignore writing MSI message!\n");
 	} else if (entry->msi_attrib.is_msix) {
 		void __iomem *base;
 		base = entry->mask_base +