diff mbox

[v2] PCI: pciehp: Check link state before accessing device during removal

Message ID 546E7120.5080505@gmail.com
State Changes Requested
Headers show

Commit Message

Rajat Jain Nov. 20, 2014, 10:54 p.m. UTC
While removing a card, we can't assume the presence to mean that the
access to card is OK. That is because the cause of removal may be a
link down event, and the card may still be physically present. Thus,
instead of presence, use the link state to decide whether or not it is
OK to access the card devices.

Here are the problem symptoms:
During the removal of a card due to link down, sometimes the following
error is seen (because pciehp_unconfigure_device() reads 0xFF from
bridge control register as the link is down, which cause it to assume
that the VGA bit is set):

pciehp 0000:21:05.0:pcie24: pcie_isr: intr_loc 100
pciehp 0000:21:05.0:pcie24: Data Link Layer State change
pciehp 0000:21:05.0:pcie24: slot(5): Link Down event
pciehp 0000:21:05.0:pcie24: Disabling domain:bus:device=0000:60:00
pciehp 0000:21:05.0:pcie24: pciehp_unconfigure_device: domain:bus:dev = 0000:60:00
pciehp 0000:21:05.0:pcie24: Cannot remove display device 0000:60:00.0

Ofcourse, when the link comes back up, the device addition fails too:

pciehp 0000:21:05.0:pcie24: pcie_isr: intr_loc 100
pciehp 0000:21:05.0:pcie24: Data Link Layer State change
pciehp 0000:21:05.0:pcie24: pciehp_check_link_active: lnk_status = 6011
pciehp 0000:21:05.0:pcie24: slot(5): Link Up event
pciehp 0000:21:05.0:pcie24: Enabling domain:bus:device=0000:60:00
pciehp 0000:21:05.0:pcie24: pciehp_check_link_active: lnk_status = 6011
pciehp 0000:21:05.0:pcie24: pciehp_check_link_status: lnk_status = 6011
pciehp 0000:21:05.0:pcie24: Device 0000:60:00.0 already exists at 0000:60:00, cannot hot-add
pciehp 0000:21:05.0:pcie24: Cannot add device at 0000:60:00

The problem is not seen with this patch applied. The device removal and
insertion works as expected.

Signed-off-by: Rajat Jain <rajatxjain@gmail.com>
Signed-off-by: Rajat Jain <rajatjain@juniper.net>
Signed-off-by: Guenter Roeck <groeck@juniper.net>
---
v2: Use the already initialized "ctrl" instead of "p_slot->ctrl"

 drivers/pci/hotplug/pciehp_pci.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Comments

Rajat Jain Dec. 8, 2014, 6:15 a.m. UTC | #1
Hello Bjorn,

Just checking if you got a chance to look at this.

Thanks,

Rajat


On Thu, Nov 20, 2014 at 2:54 PM, Rajat Jain <rajatxjain@gmail.com> wrote:
> While removing a card, we can't assume the presence to mean that the
> access to card is OK. That is because the cause of removal may be a
> link down event, and the card may still be physically present. Thus,
> instead of presence, use the link state to decide whether or not it is
> OK to access the card devices.
>
> Here are the problem symptoms:
> During the removal of a card due to link down, sometimes the following
> error is seen (because pciehp_unconfigure_device() reads 0xFF from
> bridge control register as the link is down, which cause it to assume
> that the VGA bit is set):
>
> pciehp 0000:21:05.0:pcie24: pcie_isr: intr_loc 100
> pciehp 0000:21:05.0:pcie24: Data Link Layer State change
> pciehp 0000:21:05.0:pcie24: slot(5): Link Down event
> pciehp 0000:21:05.0:pcie24: Disabling domain:bus:device=0000:60:00
> pciehp 0000:21:05.0:pcie24: pciehp_unconfigure_device: domain:bus:dev = 0000:60:00
> pciehp 0000:21:05.0:pcie24: Cannot remove display device 0000:60:00.0
>
> Ofcourse, when the link comes back up, the device addition fails too:
>
> pciehp 0000:21:05.0:pcie24: pcie_isr: intr_loc 100
> pciehp 0000:21:05.0:pcie24: Data Link Layer State change
> pciehp 0000:21:05.0:pcie24: pciehp_check_link_active: lnk_status = 6011
> pciehp 0000:21:05.0:pcie24: slot(5): Link Up event
> pciehp 0000:21:05.0:pcie24: Enabling domain:bus:device=0000:60:00
> pciehp 0000:21:05.0:pcie24: pciehp_check_link_active: lnk_status = 6011
> pciehp 0000:21:05.0:pcie24: pciehp_check_link_status: lnk_status = 6011
> pciehp 0000:21:05.0:pcie24: Device 0000:60:00.0 already exists at 0000:60:00, cannot hot-add
> pciehp 0000:21:05.0:pcie24: Cannot add device at 0000:60:00
>
> The problem is not seen with this patch applied. The device removal and
> insertion works as expected.
>
> Signed-off-by: Rajat Jain <rajatxjain@gmail.com>
> Signed-off-by: Rajat Jain <rajatjain@juniper.net>
> Signed-off-by: Guenter Roeck <groeck@juniper.net>
> ---
> v2: Use the already initialized "ctrl" instead of "p_slot->ctrl"
>
>  drivers/pci/hotplug/pciehp_pci.c |    8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/pci/hotplug/pciehp_pci.c b/drivers/pci/hotplug/pciehp_pci.c
> index 9e69403..911f85b 100644
> --- a/drivers/pci/hotplug/pciehp_pci.c
> +++ b/drivers/pci/hotplug/pciehp_pci.c
> @@ -77,7 +77,7 @@ int pciehp_unconfigure_device(struct slot *p_slot)
>  {
>         int rc = 0;
>         u8 bctl = 0;
> -       u8 presence = 0;
> +       bool link_active = false;
>         struct pci_dev *dev, *temp;
>         struct pci_bus *parent = p_slot->ctrl->pcie->port->subordinate;
>         u16 command;
> @@ -85,7 +85,7 @@ int pciehp_unconfigure_device(struct slot *p_slot)
>
>         ctrl_dbg(ctrl, "%s: domain:bus:dev = %04x:%02x:00\n",
>                  __func__, pci_domain_nr(parent), parent->number);
> -       pciehp_get_adapter_status(p_slot, &presence);
> +       link_active = pciehp_check_link_active(ctrl);
>
>         pci_lock_rescan_remove();
>
> @@ -98,7 +98,7 @@ int pciehp_unconfigure_device(struct slot *p_slot)
>         list_for_each_entry_safe_reverse(dev, temp, &parent->devices,
>                                          bus_list) {
>                 pci_dev_get(dev);
> -               if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE && presence) {
> +               if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE && link_active) {
>                         pci_read_config_byte(dev, PCI_BRIDGE_CONTROL, &bctl);
>                         if (bctl & PCI_BRIDGE_CTL_VGA) {
>                                 ctrl_err(ctrl,
> @@ -114,7 +114,7 @@ int pciehp_unconfigure_device(struct slot *p_slot)
>                  * Ensure that no new Requests will be generated from
>                  * the device.
>                  */
> -               if (presence) {
> +               if (link_active) {
>                         pci_read_config_word(dev, PCI_COMMAND, &command);
>                         command &= ~(PCI_COMMAND_MASTER | PCI_COMMAND_SERR);
>                         command |= PCI_COMMAND_INTX_DISABLE;
> --
> 1.7.9.5
>
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas Dec. 11, 2014, 12:26 a.m. UTC | #2
On Thu, Nov 20, 2014 at 02:54:24PM -0800, Rajat Jain wrote:
> While removing a card, we can't assume the presence to mean that the
> access to card is OK. That is because the cause of removal may be a
> link down event, and the card may still be physically present. Thus,
> instead of presence, use the link state to decide whether or not it is
> OK to access the card devices.
> 
> Here are the problem symptoms:
> During the removal of a card due to link down, sometimes the following
> error is seen (because pciehp_unconfigure_device() reads 0xFF from
> bridge control register as the link is down, which cause it to assume
> that the VGA bit is set):
> 
> pciehp 0000:21:05.0:pcie24: pcie_isr: intr_loc 100
> pciehp 0000:21:05.0:pcie24: Data Link Layer State change
> pciehp 0000:21:05.0:pcie24: slot(5): Link Down event
> pciehp 0000:21:05.0:pcie24: Disabling domain:bus:device=0000:60:00
> pciehp 0000:21:05.0:pcie24: pciehp_unconfigure_device: domain:bus:dev = 0000:60:00
> pciehp 0000:21:05.0:pcie24: Cannot remove display device 0000:60:00.0
> 
> Ofcourse, when the link comes back up, the device addition fails too:
> 
> pciehp 0000:21:05.0:pcie24: pcie_isr: intr_loc 100
> pciehp 0000:21:05.0:pcie24: Data Link Layer State change
> pciehp 0000:21:05.0:pcie24: pciehp_check_link_active: lnk_status = 6011
> pciehp 0000:21:05.0:pcie24: slot(5): Link Up event
> pciehp 0000:21:05.0:pcie24: Enabling domain:bus:device=0000:60:00
> pciehp 0000:21:05.0:pcie24: pciehp_check_link_active: lnk_status = 6011
> pciehp 0000:21:05.0:pcie24: pciehp_check_link_status: lnk_status = 6011
> pciehp 0000:21:05.0:pcie24: Device 0000:60:00.0 already exists at 0000:60:00, cannot hot-add
> pciehp 0000:21:05.0:pcie24: Cannot add device at 0000:60:00
> 
> The problem is not seen with this patch applied. The device removal and
> insertion works as expected.
> 
> Signed-off-by: Rajat Jain <rajatxjain@gmail.com>
> Signed-off-by: Rajat Jain <rajatjain@juniper.net>
> Signed-off-by: Guenter Roeck <groeck@juniper.net>
> ---
> v2: Use the already initialized "ctrl" instead of "p_slot->ctrl"
> 
>  drivers/pci/hotplug/pciehp_pci.c |    8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/pci/hotplug/pciehp_pci.c b/drivers/pci/hotplug/pciehp_pci.c
> index 9e69403..911f85b 100644
> --- a/drivers/pci/hotplug/pciehp_pci.c
> +++ b/drivers/pci/hotplug/pciehp_pci.c
> @@ -77,7 +77,7 @@ int pciehp_unconfigure_device(struct slot *p_slot)
>  {
>  	int rc = 0;
>  	u8 bctl = 0;
> -	u8 presence = 0;
> +	bool link_active = false;
>  	struct pci_dev *dev, *temp;
>  	struct pci_bus *parent = p_slot->ctrl->pcie->port->subordinate;
>  	u16 command;
> @@ -85,7 +85,7 @@ int pciehp_unconfigure_device(struct slot *p_slot)
>  
>  	ctrl_dbg(ctrl, "%s: domain:bus:dev = %04x:%02x:00\n",
>  		 __func__, pci_domain_nr(parent), parent->number);
> -	pciehp_get_adapter_status(p_slot, &presence);
> +	link_active = pciehp_check_link_active(ctrl);
>  
>  	pci_lock_rescan_remove();
>  
> @@ -98,7 +98,7 @@ int pciehp_unconfigure_device(struct slot *p_slot)
>  	list_for_each_entry_safe_reverse(dev, temp, &parent->devices,
>  					 bus_list) {
>  		pci_dev_get(dev);
> -		if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE && presence) {
> +		if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE && link_active) {
>  			pci_read_config_byte(dev, PCI_BRIDGE_CONTROL, &bctl);
>  			if (bctl & PCI_BRIDGE_CTL_VGA) {
>  				ctrl_err(ctrl,

Why do we even have this code to check for VGA devices?  I looked (briefly)
and couldn't find anything in the spec that prohibits removal of VGA
devices.

If we do need it (and it looks like most or all hotplug drivers copied it),
isn't there still a race?  Can't we have the following sequence?

  - pciehp_check_link_active()		# returns true
  - Link goes down
  - pci_read_config_byte()		# fails because link is down

Bjorn

> @@ -114,7 +114,7 @@ int pciehp_unconfigure_device(struct slot *p_slot)
>  		 * Ensure that no new Requests will be generated from
>  		 * the device.
>  		 */
> -		if (presence) {
> +		if (link_active) {
>  			pci_read_config_word(dev, PCI_COMMAND, &command);
>  			command &= ~(PCI_COMMAND_MASTER | PCI_COMMAND_SERR);
>  			command |= PCI_COMMAND_INTX_DISABLE;
> -- 
> 1.7.9.5
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Guenter Roeck Dec. 11, 2014, 4:38 p.m. UTC | #3
On Wed, Dec 10, 2014 at 05:26:30PM -0700, Bjorn Helgaas wrote:
> On Thu, Nov 20, 2014 at 02:54:24PM -0800, Rajat Jain wrote:
> > While removing a card, we can't assume the presence to mean that the
> > access to card is OK. That is because the cause of removal may be a
> > link down event, and the card may still be physically present. Thus,
> > instead of presence, use the link state to decide whether or not it is
> > OK to access the card devices.
> > 
> > Here are the problem symptoms:
> > During the removal of a card due to link down, sometimes the following
> > error is seen (because pciehp_unconfigure_device() reads 0xFF from
> > bridge control register as the link is down, which cause it to assume
> > that the VGA bit is set):
> > 
> > pciehp 0000:21:05.0:pcie24: pcie_isr: intr_loc 100
> > pciehp 0000:21:05.0:pcie24: Data Link Layer State change
> > pciehp 0000:21:05.0:pcie24: slot(5): Link Down event
> > pciehp 0000:21:05.0:pcie24: Disabling domain:bus:device=0000:60:00
> > pciehp 0000:21:05.0:pcie24: pciehp_unconfigure_device: domain:bus:dev = 0000:60:00
> > pciehp 0000:21:05.0:pcie24: Cannot remove display device 0000:60:00.0
> > 
> > Ofcourse, when the link comes back up, the device addition fails too:
> > 
> > pciehp 0000:21:05.0:pcie24: pcie_isr: intr_loc 100
> > pciehp 0000:21:05.0:pcie24: Data Link Layer State change
> > pciehp 0000:21:05.0:pcie24: pciehp_check_link_active: lnk_status = 6011
> > pciehp 0000:21:05.0:pcie24: slot(5): Link Up event
> > pciehp 0000:21:05.0:pcie24: Enabling domain:bus:device=0000:60:00
> > pciehp 0000:21:05.0:pcie24: pciehp_check_link_active: lnk_status = 6011
> > pciehp 0000:21:05.0:pcie24: pciehp_check_link_status: lnk_status = 6011
> > pciehp 0000:21:05.0:pcie24: Device 0000:60:00.0 already exists at 0000:60:00, cannot hot-add
> > pciehp 0000:21:05.0:pcie24: Cannot add device at 0000:60:00
> > 
> > The problem is not seen with this patch applied. The device removal and
> > insertion works as expected.
> > 
> > Signed-off-by: Rajat Jain <rajatxjain@gmail.com>
> > Signed-off-by: Rajat Jain <rajatjain@juniper.net>
> > Signed-off-by: Guenter Roeck <groeck@juniper.net>
> > ---
> > v2: Use the already initialized "ctrl" instead of "p_slot->ctrl"
> > 
> >  drivers/pci/hotplug/pciehp_pci.c |    8 ++++----
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/pci/hotplug/pciehp_pci.c b/drivers/pci/hotplug/pciehp_pci.c
> > index 9e69403..911f85b 100644
> > --- a/drivers/pci/hotplug/pciehp_pci.c
> > +++ b/drivers/pci/hotplug/pciehp_pci.c
> > @@ -77,7 +77,7 @@ int pciehp_unconfigure_device(struct slot *p_slot)
> >  {
> >  	int rc = 0;
> >  	u8 bctl = 0;
> > -	u8 presence = 0;
> > +	bool link_active = false;
> >  	struct pci_dev *dev, *temp;
> >  	struct pci_bus *parent = p_slot->ctrl->pcie->port->subordinate;
> >  	u16 command;
> > @@ -85,7 +85,7 @@ int pciehp_unconfigure_device(struct slot *p_slot)
> >  
> >  	ctrl_dbg(ctrl, "%s: domain:bus:dev = %04x:%02x:00\n",
> >  		 __func__, pci_domain_nr(parent), parent->number);
> > -	pciehp_get_adapter_status(p_slot, &presence);
> > +	link_active = pciehp_check_link_active(ctrl);
> >  
> >  	pci_lock_rescan_remove();
> >  
> > @@ -98,7 +98,7 @@ int pciehp_unconfigure_device(struct slot *p_slot)
> >  	list_for_each_entry_safe_reverse(dev, temp, &parent->devices,
> >  					 bus_list) {
> >  		pci_dev_get(dev);
> > -		if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE && presence) {
> > +		if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE && link_active) {
> >  			pci_read_config_byte(dev, PCI_BRIDGE_CONTROL, &bctl);
> >  			if (bctl & PCI_BRIDGE_CTL_VGA) {
> >  				ctrl_err(ctrl,
> 
> Why do we even have this code to check for VGA devices?  I looked (briefly)
> and couldn't find anything in the spec that prohibits removal of VGA
> devices.
> 
For my part I don't know. I only know that I had to integrate the patch into
our images since I hit the problem repeatedly. Usually I wait with integrating
Rajat's patches until you accept them, but this one was too disruptive.

I would argue that while the patch may not be perfect, at least it improves
the situation substantially.

> If we do need it (and it looks like most or all hotplug drivers copied it),
> isn't there still a race?  Can't we have the following sequence?
> 
>   - pciehp_check_link_active()		# returns true
>   - Link goes down
>   - pci_read_config_byte()		# fails because link is down
> 
I would guess so. Question is how to address it. Read the configuration byte
first, then check if the link is down ? Check if link is still up after reading
the configuration byte ? Add a note that there may be a potential race condition
and do nothing until it is actually seen ?

Guenter
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas Dec. 11, 2014, 8:26 p.m. UTC | #4
On Thu, Dec 11, 2014 at 9:38 AM, Guenter Roeck <groeck@juniper.net> wrote:
> On Wed, Dec 10, 2014 at 05:26:30PM -0700, Bjorn Helgaas wrote:
>> On Thu, Nov 20, 2014 at 02:54:24PM -0800, Rajat Jain wrote:
>> > While removing a card, we can't assume the presence to mean that the
>> > access to card is OK. That is because the cause of removal may be a
>> > link down event, and the card may still be physically present. Thus,
>> > instead of presence, use the link state to decide whether or not it is
>> > OK to access the card devices.
>> >
>> > Here are the problem symptoms:
>> > During the removal of a card due to link down, sometimes the following
>> > error is seen (because pciehp_unconfigure_device() reads 0xFF from
>> > bridge control register as the link is down, which cause it to assume
>> > that the VGA bit is set):
>> >
>> > pciehp 0000:21:05.0:pcie24: pcie_isr: intr_loc 100
>> > pciehp 0000:21:05.0:pcie24: Data Link Layer State change
>> > pciehp 0000:21:05.0:pcie24: slot(5): Link Down event
>> > pciehp 0000:21:05.0:pcie24: Disabling domain:bus:device=0000:60:00
>> > pciehp 0000:21:05.0:pcie24: pciehp_unconfigure_device: domain:bus:dev = 0000:60:00
>> > pciehp 0000:21:05.0:pcie24: Cannot remove display device 0000:60:00.0
>> >
>> > Ofcourse, when the link comes back up, the device addition fails too:
>> >
>> > pciehp 0000:21:05.0:pcie24: pcie_isr: intr_loc 100
>> > pciehp 0000:21:05.0:pcie24: Data Link Layer State change
>> > pciehp 0000:21:05.0:pcie24: pciehp_check_link_active: lnk_status = 6011
>> > pciehp 0000:21:05.0:pcie24: slot(5): Link Up event
>> > pciehp 0000:21:05.0:pcie24: Enabling domain:bus:device=0000:60:00
>> > pciehp 0000:21:05.0:pcie24: pciehp_check_link_active: lnk_status = 6011
>> > pciehp 0000:21:05.0:pcie24: pciehp_check_link_status: lnk_status = 6011
>> > pciehp 0000:21:05.0:pcie24: Device 0000:60:00.0 already exists at 0000:60:00, cannot hot-add
>> > pciehp 0000:21:05.0:pcie24: Cannot add device at 0000:60:00
>> >
>> > The problem is not seen with this patch applied. The device removal and
>> > insertion works as expected.
>> >
>> > Signed-off-by: Rajat Jain <rajatxjain@gmail.com>
>> > Signed-off-by: Rajat Jain <rajatjain@juniper.net>
>> > Signed-off-by: Guenter Roeck <groeck@juniper.net>
>> > ---
>> > v2: Use the already initialized "ctrl" instead of "p_slot->ctrl"
>> >
>> >  drivers/pci/hotplug/pciehp_pci.c |    8 ++++----
>> >  1 file changed, 4 insertions(+), 4 deletions(-)
>> >
>> > diff --git a/drivers/pci/hotplug/pciehp_pci.c b/drivers/pci/hotplug/pciehp_pci.c
>> > index 9e69403..911f85b 100644
>> > --- a/drivers/pci/hotplug/pciehp_pci.c
>> > +++ b/drivers/pci/hotplug/pciehp_pci.c
>> > @@ -77,7 +77,7 @@ int pciehp_unconfigure_device(struct slot *p_slot)
>> >  {
>> >     int rc = 0;
>> >     u8 bctl = 0;
>> > -   u8 presence = 0;
>> > +   bool link_active = false;
>> >     struct pci_dev *dev, *temp;
>> >     struct pci_bus *parent = p_slot->ctrl->pcie->port->subordinate;
>> >     u16 command;
>> > @@ -85,7 +85,7 @@ int pciehp_unconfigure_device(struct slot *p_slot)
>> >
>> >     ctrl_dbg(ctrl, "%s: domain:bus:dev = %04x:%02x:00\n",
>> >              __func__, pci_domain_nr(parent), parent->number);
>> > -   pciehp_get_adapter_status(p_slot, &presence);
>> > +   link_active = pciehp_check_link_active(ctrl);
>> >
>> >     pci_lock_rescan_remove();
>> >
>> > @@ -98,7 +98,7 @@ int pciehp_unconfigure_device(struct slot *p_slot)
>> >     list_for_each_entry_safe_reverse(dev, temp, &parent->devices,
>> >                                      bus_list) {
>> >             pci_dev_get(dev);
>> > -           if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE && presence) {
>> > +           if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE && link_active) {
>> >                     pci_read_config_byte(dev, PCI_BRIDGE_CONTROL, &bctl);
>> >                     if (bctl & PCI_BRIDGE_CTL_VGA) {
>> >                             ctrl_err(ctrl,
>>
>> Why do we even have this code to check for VGA devices?  I looked (briefly)
>> and couldn't find anything in the spec that prohibits removal of VGA
>> devices.
>>
> For my part I don't know. I only know that I had to integrate the patch into
> our images since I hit the problem repeatedly. Usually I wait with integrating
> Rajat's patches until you accept them, but this one was too disruptive.
>
> I would argue that while the patch may not be perfect, at least it improves
> the situation substantially.

I don't think removing the VGA checks is the way to fix the problem
you're seeing.  But I do want to investigate this code since we're in
the area.

>> If we do need it (and it looks like most or all hotplug drivers copied it),
>> isn't there still a race?  Can't we have the following sequence?
>>
>>   - pciehp_check_link_active()                # returns true
>>   - Link goes down
>>   - pci_read_config_byte()            # fails because link is down
>>
> I would guess so. Question is how to address it. Read the configuration byte
> first, then check if the link is down ? Check if link is still up after reading
> the configuration byte ? Add a note that there may be a potential race condition
> and do nothing until it is actually seen ?

I think we should just read PCI_BRIDGE_CONTROL and look for a 0xff
value.  That's not a legal value for the register, so if we see it, it
should be pretty safe to assume the link is down or the device is not
present at all.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Guenter Roeck Dec. 11, 2014, 8:45 p.m. UTC | #5
On Thu, Dec 11, 2014 at 01:26:47PM -0700, Bjorn Helgaas wrote:
> On Thu, Dec 11, 2014 at 9:38 AM, Guenter Roeck <groeck@juniper.net> wrote:
> > On Wed, Dec 10, 2014 at 05:26:30PM -0700, Bjorn Helgaas wrote:
> >> On Thu, Nov 20, 2014 at 02:54:24PM -0800, Rajat Jain wrote:
> >> > While removing a card, we can't assume the presence to mean that the
> >> > access to card is OK. That is because the cause of removal may be a
> >> > link down event, and the card may still be physically present. Thus,
> >> > instead of presence, use the link state to decide whether or not it is
> >> > OK to access the card devices.
> >> >
> >> > Here are the problem symptoms:
> >> > During the removal of a card due to link down, sometimes the following
> >> > error is seen (because pciehp_unconfigure_device() reads 0xFF from
> >> > bridge control register as the link is down, which cause it to assume
> >> > that the VGA bit is set):
> >> >
> >> > pciehp 0000:21:05.0:pcie24: pcie_isr: intr_loc 100
> >> > pciehp 0000:21:05.0:pcie24: Data Link Layer State change
> >> > pciehp 0000:21:05.0:pcie24: slot(5): Link Down event
> >> > pciehp 0000:21:05.0:pcie24: Disabling domain:bus:device=0000:60:00
> >> > pciehp 0000:21:05.0:pcie24: pciehp_unconfigure_device: domain:bus:dev = 0000:60:00
> >> > pciehp 0000:21:05.0:pcie24: Cannot remove display device 0000:60:00.0
> >> >
> >> > Ofcourse, when the link comes back up, the device addition fails too:
> >> >
> >> > pciehp 0000:21:05.0:pcie24: pcie_isr: intr_loc 100
> >> > pciehp 0000:21:05.0:pcie24: Data Link Layer State change
> >> > pciehp 0000:21:05.0:pcie24: pciehp_check_link_active: lnk_status = 6011
> >> > pciehp 0000:21:05.0:pcie24: slot(5): Link Up event
> >> > pciehp 0000:21:05.0:pcie24: Enabling domain:bus:device=0000:60:00
> >> > pciehp 0000:21:05.0:pcie24: pciehp_check_link_active: lnk_status = 6011
> >> > pciehp 0000:21:05.0:pcie24: pciehp_check_link_status: lnk_status = 6011
> >> > pciehp 0000:21:05.0:pcie24: Device 0000:60:00.0 already exists at 0000:60:00, cannot hot-add
> >> > pciehp 0000:21:05.0:pcie24: Cannot add device at 0000:60:00
> >> >
> >> > The problem is not seen with this patch applied. The device removal and
> >> > insertion works as expected.
> >> >
> >> > Signed-off-by: Rajat Jain <rajatxjain@gmail.com>
> >> > Signed-off-by: Rajat Jain <rajatjain@juniper.net>
> >> > Signed-off-by: Guenter Roeck <groeck@juniper.net>
> >> > ---
> >> > v2: Use the already initialized "ctrl" instead of "p_slot->ctrl"
> >> >
> >> >  drivers/pci/hotplug/pciehp_pci.c |    8 ++++----
> >> >  1 file changed, 4 insertions(+), 4 deletions(-)
> >> >
> >> > diff --git a/drivers/pci/hotplug/pciehp_pci.c b/drivers/pci/hotplug/pciehp_pci.c
> >> > index 9e69403..911f85b 100644
> >> > --- a/drivers/pci/hotplug/pciehp_pci.c
> >> > +++ b/drivers/pci/hotplug/pciehp_pci.c
> >> > @@ -77,7 +77,7 @@ int pciehp_unconfigure_device(struct slot *p_slot)
> >> >  {
> >> >     int rc = 0;
> >> >     u8 bctl = 0;
> >> > -   u8 presence = 0;
> >> > +   bool link_active = false;
> >> >     struct pci_dev *dev, *temp;
> >> >     struct pci_bus *parent = p_slot->ctrl->pcie->port->subordinate;
> >> >     u16 command;
> >> > @@ -85,7 +85,7 @@ int pciehp_unconfigure_device(struct slot *p_slot)
> >> >
> >> >     ctrl_dbg(ctrl, "%s: domain:bus:dev = %04x:%02x:00\n",
> >> >              __func__, pci_domain_nr(parent), parent->number);
> >> > -   pciehp_get_adapter_status(p_slot, &presence);
> >> > +   link_active = pciehp_check_link_active(ctrl);
> >> >
> >> >     pci_lock_rescan_remove();
> >> >
> >> > @@ -98,7 +98,7 @@ int pciehp_unconfigure_device(struct slot *p_slot)
> >> >     list_for_each_entry_safe_reverse(dev, temp, &parent->devices,
> >> >                                      bus_list) {
> >> >             pci_dev_get(dev);
> >> > -           if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE && presence) {
> >> > +           if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE && link_active) {
> >> >                     pci_read_config_byte(dev, PCI_BRIDGE_CONTROL, &bctl);
> >> >                     if (bctl & PCI_BRIDGE_CTL_VGA) {
> >> >                             ctrl_err(ctrl,
> >>
> >> Why do we even have this code to check for VGA devices?  I looked (briefly)
> >> and couldn't find anything in the spec that prohibits removal of VGA
> >> devices.
> >>
> > For my part I don't know. I only know that I had to integrate the patch into
> > our images since I hit the problem repeatedly. Usually I wait with integrating
> > Rajat's patches until you accept them, but this one was too disruptive.
> >
> > I would argue that while the patch may not be perfect, at least it improves
> > the situation substantially.
> 
> I don't think removing the VGA checks is the way to fix the problem
> you're seeing.  But I do want to investigate this code since we're in
> the area.
> 
I agree. Removing the VGA check would attempt to fix something that
isn't known to be broken, and might have undesirable side effects.

> >> If we do need it (and it looks like most or all hotplug drivers copied it),
> >> isn't there still a race?  Can't we have the following sequence?
> >>
> >>   - pciehp_check_link_active()                # returns true
> >>   - Link goes down
> >>   - pci_read_config_byte()            # fails because link is down
> >>
> > I would guess so. Question is how to address it. Read the configuration byte
> > first, then check if the link is down ? Check if link is still up after reading
> > the configuration byte ? Add a note that there may be a potential race condition
> > and do nothing until it is actually seen ?
> 
> I think we should just read PCI_BRIDGE_CONTROL and look for a 0xff
> value.  That's not a legal value for the register, so if we see it, it
> should be pretty safe to assume the link is down or the device is not
> present at all.
> 
Something like
	if (bctl != 0xff && (bctl & PCI_BRIDGE_CTL_VGA)) {
in addition to Rajat's changes ?

I think it would be good to keep the change Rajat proposed, ie to check
the link state instead of presence. Question then is if you'd want a new
revision of Rajat's patch or another patch on top of it with the bctl
related change.

Thanks,
Guenter
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas Dec. 11, 2014, 9 p.m. UTC | #6
On Thu, Dec 11, 2014 at 1:45 PM, Guenter Roeck <groeck@juniper.net> wrote:
> On Thu, Dec 11, 2014 at 01:26:47PM -0700, Bjorn Helgaas wrote:
>> On Thu, Dec 11, 2014 at 9:38 AM, Guenter Roeck <groeck@juniper.net> wrote:
>> > On Wed, Dec 10, 2014 at 05:26:30PM -0700, Bjorn Helgaas wrote:
>> >> On Thu, Nov 20, 2014 at 02:54:24PM -0800, Rajat Jain wrote:

>> >> If we do need it (and it looks like most or all hotplug drivers copied it),
>> >> isn't there still a race?  Can't we have the following sequence?
>> >>
>> >>   - pciehp_check_link_active()                # returns true
>> >>   - Link goes down
>> >>   - pci_read_config_byte()            # fails because link is down
>> >>
>> > I would guess so. Question is how to address it. Read the configuration byte
>> > first, then check if the link is down ? Check if link is still up after reading
>> > the configuration byte ? Add a note that there may be a potential race condition
>> > and do nothing until it is actually seen ?
>>
>> I think we should just read PCI_BRIDGE_CONTROL and look for a 0xff
>> value.  That's not a legal value for the register, so if we see it, it
>> should be pretty safe to assume the link is down or the device is not
>> present at all.
>>
> Something like
>         if (bctl != 0xff && (bctl & PCI_BRIDGE_CTL_VGA)) {
> in addition to Rajat's changes ?
>
> I think it would be good to keep the change Rajat proposed, ie to check
> the link state instead of presence. Question then is if you'd want a new
> revision of Rajat's patch or another patch on top of it with the bctl
> related change.

Why do we need the link state or the presence check?  It seems like
those are sort of a 90% solution, and doing them provides the illusion
of value but without real value.  If we think that checking for 0xff
is a 100% solution, we should rely on that and not bother with
anything else.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Guenter Roeck Dec. 11, 2014, 9:39 p.m. UTC | #7
On Thu, Dec 11, 2014 at 02:00:06PM -0700, Bjorn Helgaas wrote:
> On Thu, Dec 11, 2014 at 1:45 PM, Guenter Roeck <groeck@juniper.net> wrote:
> > On Thu, Dec 11, 2014 at 01:26:47PM -0700, Bjorn Helgaas wrote:
> >> On Thu, Dec 11, 2014 at 9:38 AM, Guenter Roeck <groeck@juniper.net> wrote:
> >> > On Wed, Dec 10, 2014 at 05:26:30PM -0700, Bjorn Helgaas wrote:
> >> >> On Thu, Nov 20, 2014 at 02:54:24PM -0800, Rajat Jain wrote:
> 
> >> >> If we do need it (and it looks like most or all hotplug drivers copied it),
> >> >> isn't there still a race?  Can't we have the following sequence?
> >> >>
> >> >>   - pciehp_check_link_active()                # returns true
> >> >>   - Link goes down
> >> >>   - pci_read_config_byte()            # fails because link is down
> >> >>
> >> > I would guess so. Question is how to address it. Read the configuration byte
> >> > first, then check if the link is down ? Check if link is still up after reading
> >> > the configuration byte ? Add a note that there may be a potential race condition
> >> > and do nothing until it is actually seen ?
> >>
> >> I think we should just read PCI_BRIDGE_CONTROL and look for a 0xff
> >> value.  That's not a legal value for the register, so if we see it, it
> >> should be pretty safe to assume the link is down or the device is not
> >> present at all.
> >>
> > Something like
> >         if (bctl != 0xff && (bctl & PCI_BRIDGE_CTL_VGA)) {
> > in addition to Rajat's changes ?
> >
> > I think it would be good to keep the change Rajat proposed, ie to check
> > the link state instead of presence. Question then is if you'd want a new
> > revision of Rajat's patch or another patch on top of it with the bctl
> > related change.
> 
> Why do we need the link state or the presence check?  It seems like
> those are sort of a 90% solution, and doing them provides the illusion
> of value but without real value.  If we think that checking for 0xff
> is a 100% solution, we should rely on that and not bother with
> anything else.
> 
Interesting thought. Let me play with that.

Guenter
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/pci/hotplug/pciehp_pci.c b/drivers/pci/hotplug/pciehp_pci.c
index 9e69403..911f85b 100644
--- a/drivers/pci/hotplug/pciehp_pci.c
+++ b/drivers/pci/hotplug/pciehp_pci.c
@@ -77,7 +77,7 @@  int pciehp_unconfigure_device(struct slot *p_slot)
 {
 	int rc = 0;
 	u8 bctl = 0;
-	u8 presence = 0;
+	bool link_active = false;
 	struct pci_dev *dev, *temp;
 	struct pci_bus *parent = p_slot->ctrl->pcie->port->subordinate;
 	u16 command;
@@ -85,7 +85,7 @@  int pciehp_unconfigure_device(struct slot *p_slot)
 
 	ctrl_dbg(ctrl, "%s: domain:bus:dev = %04x:%02x:00\n",
 		 __func__, pci_domain_nr(parent), parent->number);
-	pciehp_get_adapter_status(p_slot, &presence);
+	link_active = pciehp_check_link_active(ctrl);
 
 	pci_lock_rescan_remove();
 
@@ -98,7 +98,7 @@  int pciehp_unconfigure_device(struct slot *p_slot)
 	list_for_each_entry_safe_reverse(dev, temp, &parent->devices,
 					 bus_list) {
 		pci_dev_get(dev);
-		if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE && presence) {
+		if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE && link_active) {
 			pci_read_config_byte(dev, PCI_BRIDGE_CONTROL, &bctl);
 			if (bctl & PCI_BRIDGE_CTL_VGA) {
 				ctrl_err(ctrl,
@@ -114,7 +114,7 @@  int pciehp_unconfigure_device(struct slot *p_slot)
 		 * Ensure that no new Requests will be generated from
 		 * the device.
 		 */
-		if (presence) {
+		if (link_active) {
 			pci_read_config_word(dev, PCI_COMMAND, &command);
 			command &= ~(PCI_COMMAND_MASTER | PCI_COMMAND_SERR);
 			command |= PCI_COMMAND_INTX_DISABLE;