diff mbox

[part1,1/4] PCI, pciehp: Turn on link again after power off the slot

Message ID 1346622458-30595-2-git-send-email-yinghai@kernel.org
State Rejected
Headers show

Commit Message

Yinghai Lu Sept. 2, 2012, 9:47 p.m. UTC
During debugging hotplug surpise support, found slot status reg does not report
present status anymore after removing the card. That present bit
does not get cleared even that card is removed. So no interrupt is generated
later after reinsert the card.

That problem is caused by commit:
| commit 2debd9289997fc5d1c0043b41201a8b40d5e11d0
|
|    PCI: pciehp: Disable/enable link during slot power off/on

Try turn on link after power off.
With the fix present bit will report correctly, also still avoid the aer
during the power off.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Cc: stable@vger.kernel.org
---
 drivers/pci/hotplug/pciehp_hpc.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

Comments

Prarit Bhargava Sept. 6, 2012, 2:34 p.m. UTC | #1
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Cc: stable@vger.kernel.org

>diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
>index 13b2eaf..7cee8db 100644
>--- a/drivers/pci/hotplug/pciehp_hpc.c
>+++ b/drivers/pci/hotplug/pciehp_hpc.c
>@@ -631,6 +631,8 @@ int pciehp_power_off_slot(struct slot * slot)
> 	slot_cmd = POWER_OFF;
> 	cmd_mask = PCI_EXP_SLTCTL_PCC;
> 	retval = pcie_write_cmd(ctrl, slot_cmd, cmd_mask);
>+	/* need to enable link again for present bit report */
>+	pciehp_link_enable(ctrl);

You're not checking the return value of pciehp_link_enable().  I wonder if
enabling the link is the only way to get the present bit to report correctly...

Also, should the pciehp_link_enable() be called in pciehp_power_on_slot() if
we already reenable it on the power off?

P.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yinghai Lu Sept. 6, 2012, 5:30 p.m. UTC | #2
On Thu, Sep 6, 2012 at 7:34 AM, Prarit Bhargava <prarit@redhat.com> wrote:
> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> Cc: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
> Cc: stable@vger.kernel.org
>
>>diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
>>index 13b2eaf..7cee8db 100644
>>--- a/drivers/pci/hotplug/pciehp_hpc.c
>>+++ b/drivers/pci/hotplug/pciehp_hpc.c
>>@@ -631,6 +631,8 @@ int pciehp_power_off_slot(struct slot * slot)
>>       slot_cmd = POWER_OFF;
>>       cmd_mask = PCI_EXP_SLTCTL_PCC;
>>       retval = pcie_write_cmd(ctrl, slot_cmd, cmd_mask);
>>+      /* need to enable link again for present bit report */
>>+      pciehp_link_enable(ctrl);
>
> You're not checking the return value of pciehp_link_enable().  I wonder if
> enabling the link is the only way to get the present bit to report correctly...

that is silicon problem. and no one noticed that before.

>
> Also, should the pciehp_link_enable() be called in pciehp_power_on_slot() if
> we already reenable it on the power off?

we call that in pciehp_power_on_slot for another case:
BIOS could have link disable on pcie slot if there is no card plugged
in that slot.

Thanks

Yinghai
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas Sept. 17, 2012, 11:06 p.m. UTC | #3
On Thu, Sep 6, 2012 at 11:30 AM, Yinghai Lu <yinghai@kernel.org> wrote:
> On Thu, Sep 6, 2012 at 7:34 AM, Prarit Bhargava <prarit@redhat.com> wrote:
>> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
>> Cc: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
>> Cc: stable@vger.kernel.org
>>
>>>diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
>>>index 13b2eaf..7cee8db 100644
>>>--- a/drivers/pci/hotplug/pciehp_hpc.c
>>>+++ b/drivers/pci/hotplug/pciehp_hpc.c
>>>@@ -631,6 +631,8 @@ int pciehp_power_off_slot(struct slot * slot)
>>>       slot_cmd = POWER_OFF;
>>>       cmd_mask = PCI_EXP_SLTCTL_PCC;
>>>       retval = pcie_write_cmd(ctrl, slot_cmd, cmd_mask);
>>>+      /* need to enable link again for present bit report */
>>>+      pciehp_link_enable(ctrl);
>>
>> You're not checking the return value of pciehp_link_enable().  I wonder if
>> enabling the link is the only way to get the present bit to report correctly...
>
> that is silicon problem. and no one noticed that before.
>
>>
>> Also, should the pciehp_link_enable() be called in pciehp_power_on_slot() if
>> we already reenable it on the power off?
>
> we call that in pciehp_power_on_slot for another case:
> BIOS could have link disable on pcie slot if there is no card plugged
> in that slot.

So now we have two situations that should be the same, but they're not:

  1) Slot is empty at boot, and BIOS leaves link disabled.  Link
remains disabled until pciehp_power_on_slot() enables it.
  2) We hot-remove a card, so the slot is now empty, but
pciehp_power_off_slot() leaves the link enabled.  Link stays *enabled*
until another disable/enable sequence in pciehp_power_off_slot().

Why should these be different?  In both cases the slot is empty, but
in case 1 the link is disabled, while in case 2 it is enabled.  If we
can detect a hot-add in case 1, why can't we detect it after a
hot-remove even if the link is disabled?

Or maybe case 1 is broken on this box, and hot-add to that slot
doesn't work correctly either?
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yinghai Lu Sept. 17, 2012, 11:25 p.m. UTC | #4
On Mon, Sep 17, 2012 at 4:06 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>
> So now we have two situations that should be the same, but they're not:
>
>   1) Slot is empty at boot, and BIOS leaves link disabled.  Link
> remains disabled until pciehp_power_on_slot() enables it.
>   2) We hot-remove a card, so the slot is now empty, but
> pciehp_power_off_slot() leaves the link enabled.  Link stays *enabled*
> until another disable/enable sequence in pciehp_power_off_slot().
>
> Why should these be different?  In both cases the slot is empty, but
> in case 1 the link is disabled, while in case 2 it is enabled.  If we
> can detect a hot-add in case 1, why can't we detect it after a
> hot-remove even if the link is disabled?

silicon problem. assume that case is not well tested.

we filed the bug to vendor, and they can not explain that so far.
system hotplug fpga report correct present status to the chipset.
looks like the silicon get confused somehow.

>
> Or maybe case 1 is broken on this box, and hot-add to that slot
> doesn't work correctly either?

works. with and without this patch.

Yinghai
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas Sept. 17, 2012, 11:52 p.m. UTC | #5
On Mon, Sep 17, 2012 at 5:25 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> On Mon, Sep 17, 2012 at 4:06 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>>
>> So now we have two situations that should be the same, but they're not:
>>
>>   1) Slot is empty at boot, and BIOS leaves link disabled.  Link
>> remains disabled until pciehp_power_on_slot() enables it.
>>   2) We hot-remove a card, so the slot is now empty, but
>> pciehp_power_off_slot() leaves the link enabled.  Link stays *enabled*
>> until another disable/enable sequence in pciehp_power_off_slot().
>>
>> Why should these be different?  In both cases the slot is empty, but
>> in case 1 the link is disabled, while in case 2 it is enabled.  If we
>> can detect a hot-add in case 1, why can't we detect it after a
>> hot-remove even if the link is disabled?
>
> silicon problem. assume that case is not well tested.
>
> we filed the bug to vendor, and they can not explain that so far.
> system hotplug fpga report correct present status to the chipset.
> looks like the silicon get confused somehow.
>
>>
>> Or maybe case 1 is broken on this box, and hot-add to that slot
>> doesn't work correctly either?
>
> works. with and without this patch.

This patch is just black magic then.  I don't want to intentionally
treat a slot that's empty at boot differently than one that is empty
after a hot-remove.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yinghai Lu Sept. 17, 2012, 11:56 p.m. UTC | #6
On Mon, Sep 17, 2012 at 4:52 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>
> This patch is just black magic then.  I don't want to intentionally
> treat a slot that's empty at boot differently than one that is empty
> after a hot-remove.

sure. Please drop this one.

Thanks

Yinghai
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
index 13b2eaf..7cee8db 100644
--- a/drivers/pci/hotplug/pciehp_hpc.c
+++ b/drivers/pci/hotplug/pciehp_hpc.c
@@ -631,6 +631,8 @@  int pciehp_power_off_slot(struct slot * slot)
 	slot_cmd = POWER_OFF;
 	cmd_mask = PCI_EXP_SLTCTL_PCC;
 	retval = pcie_write_cmd(ctrl, slot_cmd, cmd_mask);
+	/* need to enable link again for present bit report */
+	pciehp_link_enable(ctrl);
 	if (retval) {
 		ctrl_err(ctrl, "Write command failed!\n");
 		return retval;