diff mbox

[2/2] pci-hyperv: properly handle device eject

Message ID 1473724441-21207-2-git-send-email-longli@exchange.microsoft.com
State Changes Requested
Headers show

Commit Message

Long Li Sept. 12, 2016, 11:54 p.m. UTC
From: Long Li <longli@microsoft.com>

A PCI_EJECT message can arrive at the same time we are calling pci_scan_child_bus in the workqueue for the previous PCI_BUS_RELATIONS message, in this case we could potentailly modify the bus from two places. Properly lock the bus access.

Signed-off-by: Long Li <longli@microsoft.com>
---
 drivers/pci/host/pci-hyperv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Dexuan Cui Sept. 13, 2016, 9:50 a.m. UTC | #1
> From: devel [mailto:driverdev-devel-bounces@linuxdriverproject.org] On Behalf
> Of Long Li
> Sent: Tuesday, September 13, 2016 7:54
> ...
> A PCI_EJECT message can arrive at the same time we are calling
> pci_scan_child_bus in the workqueue for the previous PCI_BUS_RELATIONS
> message, in this case we could potentailly modify the bus from two places.
> Properly lock the bus access.
> 
> --- a/drivers/pci/host/pci-hyperv.c
> +++ b/drivers/pci/host/pci-hyperv.c
> @@ -1587,7 +1587,7 @@ static void hv_eject_device_work(struct work_struct
> *work)
>         pdev = pci_get_domain_bus_and_slot(hpdev->hbus->sysdata.domain, 0,
>                                            wslot);
>         if (pdev) {
> -               pci_stop_and_remove_bus_device(pdev);
> +               pci_stop_and_remove_bus_device_locked(pdev);
>                 pci_dev_put(pdev);
>         }

The _locked version tries to get the mutex pci_rescan_remove_lock.

But it looks pci_scan_child_bus() doesn't try to get the mutex(?), so how can
this patch make sure the 2 code paths are not running simultaneously?

Thanks,
-- Dexuan
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Long Li Sept. 13, 2016, 5:33 p.m. UTC | #2
> -----Original Message-----
> From: Dexuan Cui
> Sent: Tuesday, September 13, 2016 2:51 AM
> To: Long Li <longli@microsoft.com>; KY Srinivasan <kys@microsoft.com>;
> Haiyang Zhang <haiyangz@microsoft.com>; Bjorn Helgaas
> <bhelgaas@google.com>
> Cc: devel@linuxdriverproject.org; linux-kernel@vger.kernel.org; linux-
> pci@vger.kernel.org
> Subject: RE: [PATCH 2/2] pci-hyperv: properly handle device eject
> 
> > From: devel [mailto:driverdev-devel-bounces@linuxdriverproject.org] On
> > Behalf Of Long Li
> > Sent: Tuesday, September 13, 2016 7:54 ...
> > A PCI_EJECT message can arrive at the same time we are calling
> > pci_scan_child_bus in the workqueue for the previous
> PCI_BUS_RELATIONS
> > message, in this case we could potentailly modify the bus from two places.
> > Properly lock the bus access.
> >
> > --- a/drivers/pci/host/pci-hyperv.c
> > +++ b/drivers/pci/host/pci-hyperv.c
> > @@ -1587,7 +1587,7 @@ static void hv_eject_device_work(struct
> > work_struct
> > *work)
> >         pdev = pci_get_domain_bus_and_slot(hpdev->hbus->sysdata.domain,
> 0,
> >                                            wslot);
> >         if (pdev) {
> > -               pci_stop_and_remove_bus_device(pdev);
> > +               pci_stop_and_remove_bus_device_locked(pdev);
> >                 pci_dev_put(pdev);
> >         }
> 
> The _locked version tries to get the mutex pci_rescan_remove_lock.
> 
> But it looks pci_scan_child_bus() doesn't try to get the mutex(?), so how can
> this patch make sure the 2 code paths are not running simultaneously?

Thanks for the review.

The lock is to protect the following call to pci_scan_child_bus() in pci_devices_present_work():

                /*
                 * Tell the core to rescan bus
                 * because there may have been changes.
                 */
                pci_lock_rescan_remove();
                pci_scan_child_bus(hbus->pci_bus);
                pci_unlock_rescan_remove();

This race condition has shown up in the tests.

You raised a valid concern in create_root_hv_pci_bus(). There might be another race condition there. I'll look into this.

> 
> Thanks,
> -- Dexuan
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Long Li Sept. 13, 2016, 5:41 p.m. UTC | #3
> -----Original Message-----
> From: devel [mailto:driverdev-devel-bounces@linuxdriverproject.org] On
> Behalf Of Long Li
> Sent: Tuesday, September 13, 2016 10:33 AM
> To: Dexuan Cui <decui@microsoft.com>; KY Srinivasan
> <kys@microsoft.com>; Haiyang Zhang <haiyangz@microsoft.com>; Bjorn
> Helgaas <bhelgaas@google.com>
> Cc: devel@linuxdriverproject.org; linux-kernel@vger.kernel.org; linux-
> pci@vger.kernel.org
> Subject: RE: [PATCH 2/2] pci-hyperv: properly handle device eject
> 
> This sender failed our fraud detection checks and may not be who they
> appear to be. Learn about spoofing at http://aka.ms/LearnAboutSpoofing
> 
> > -----Original Message-----
> > From: Dexuan Cui
> > Sent: Tuesday, September 13, 2016 2:51 AM
> > To: Long Li <longli@microsoft.com>; KY Srinivasan <kys@microsoft.com>;
> > Haiyang Zhang <haiyangz@microsoft.com>; Bjorn Helgaas
> > <bhelgaas@google.com>
> > Cc: devel@linuxdriverproject.org; linux-kernel@vger.kernel.org; linux-
> > pci@vger.kernel.org
> > Subject: RE: [PATCH 2/2] pci-hyperv: properly handle device eject
> >
> > > From: devel [mailto:driverdev-devel-bounces@linuxdriverproject.org]
> > > On Behalf Of Long Li
> > > Sent: Tuesday, September 13, 2016 7:54 ...
> > > A PCI_EJECT message can arrive at the same time we are calling
> > > pci_scan_child_bus in the workqueue for the previous
> > PCI_BUS_RELATIONS
> > > message, in this case we could potentailly modify the bus from two
> places.
> > > Properly lock the bus access.
> > >
> > > --- a/drivers/pci/host/pci-hyperv.c
> > > +++ b/drivers/pci/host/pci-hyperv.c
> > > @@ -1587,7 +1587,7 @@ static void hv_eject_device_work(struct
> > > work_struct
> > > *work)
> > >         pdev =
> > > pci_get_domain_bus_and_slot(hpdev->hbus->sysdata.domain,
> > 0,
> > >                                            wslot);
> > >         if (pdev) {
> > > -               pci_stop_and_remove_bus_device(pdev);
> > > +               pci_stop_and_remove_bus_device_locked(pdev);
> > >                 pci_dev_put(pdev);
> > >         }
> >
> > The _locked version tries to get the mutex pci_rescan_remove_lock.
> >
> > But it looks pci_scan_child_bus() doesn't try to get the mutex(?), so
> > how can this patch make sure the 2 code paths are not running
> simultaneously?
> 
> Thanks for the review.
> 
> The lock is to protect the following call to pci_scan_child_bus() in
> pci_devices_present_work():
> 
>                 /*
>                  * Tell the core to rescan bus
>                  * because there may have been changes.
>                  */
>                 pci_lock_rescan_remove();
>                 pci_scan_child_bus(hbus->pci_bus);
>                 pci_unlock_rescan_remove();
> 
> This race condition has shown up in the tests.
> 
> You raised a valid concern in create_root_hv_pci_bus(). There might be
> another race condition there. I'll look into this.

I think this code is safe here. If we reach the code pci_stop_and_remove_bus_device_locked, create_root_hv_pci_bus() is already called.

> 
> >
> > Thanks,
> > -- Dexuan
> _______________________________________________
> devel mailing list
> devel@linuxdriverproject.org
> https://na01.safelinks.protection.outlook.com/?url=http%3a%2f%2fdriverde
> v.linuxdriverproject.org%2fmailman%2flistinfo%2fdriverdev-
> devel&data=02%7c01%7clongli%40microsoft.com%7c3d12ee6d87c140eb5114
> 08d3dbfc1713%7c72f988bf86f141af91ab2d7cd011db47%7c1%7c0%7c6360938
> 48185348266&sdata=a2GYqIBsQAFxszkKg3fl1nqqPgvZHh%2bAY2255RgrvUU
> %3d
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dexuan Cui Sept. 14, 2016, 5:45 a.m. UTC | #4
> From: Long Li
> Sent: Wednesday, September 14, 2016 1:41
> 
> I think this code is safe here. If we reach the code
> pci_stop_and_remove_bus_device_locked, create_root_hv_pci_bus() is already
> called.

When hv_pci_probe() -> create_root_hv_pci_bus() -> pci_scan_child_bus() is running
on one cpu, I think nothing in the current code can prevent 
hv_eject_device_work() -> pci_stop_and_remove_bus_device_locked()
from running on another cpu?

The race window is pretty small however.

Thanks,
-- Dexuan
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Long Li Sept. 14, 2016, 4:31 p.m. UTC | #5
> -----Original Message-----
> From: Dexuan Cui
> Sent: Tuesday, September 13, 2016 10:45 PM
> To: Long Li <longli@microsoft.com>; KY Srinivasan <kys@microsoft.com>;
> Haiyang Zhang <haiyangz@microsoft.com>; Bjorn Helgaas
> <bhelgaas@google.com>
> Cc: devel@linuxdriverproject.org; linux-kernel@vger.kernel.org; linux-
> pci@vger.kernel.org
> Subject: RE: [PATCH 2/2] pci-hyperv: properly handle device eject
> 
> > From: Long Li
> > Sent: Wednesday, September 14, 2016 1:41
> >
> > I think this code is safe here. If we reach the code
> > pci_stop_and_remove_bus_device_locked, create_root_hv_pci_bus() is
> > already called.
> 
> When hv_pci_probe() -> create_root_hv_pci_bus() -> pci_scan_child_bus()
> is running on one cpu, I think nothing in the current code can prevent
> hv_eject_device_work() -> pci_stop_and_remove_bus_device_locked()
> from running on another cpu?
> 
> The race window is pretty small however.

This is a valid race condition. I'll work on a V2 patch. Thanks!

> 
> Thanks,
> -- Dexuan
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c
index 3c2b330..ca77009 100644
--- a/drivers/pci/host/pci-hyperv.c
+++ b/drivers/pci/host/pci-hyperv.c
@@ -1587,7 +1587,7 @@  static void hv_eject_device_work(struct work_struct *work)
 	pdev = pci_get_domain_bus_and_slot(hpdev->hbus->sysdata.domain, 0,
 					   wslot);
 	if (pdev) {
-		pci_stop_and_remove_bus_device(pdev);
+		pci_stop_and_remove_bus_device_locked(pdev);
 		pci_dev_put(pdev);
 	}