[v6,01/30] PCI: Fix race condition in pci_enable/disable_device()
diff mbox series

Message ID 20191024171228.877974-2-s.miroshnichenko@yadro.com
State New
Delegated to: Bjorn Helgaas
Headers show
Series
  • PCI: Allow BAR movement during hotplug
Related show

Commit Message

Sergey Miroshnichenko Oct. 24, 2019, 5:11 p.m. UTC
This is a yet another approach to fix an old [1-2] concurrency issue, when:
 - two or more devices are being hot-added into a bridge which was
   initially empty;
 - a bridge with two or more devices is being hot-added;
 - during boot, if BIOS/bootloader/firmware doesn't pre-enable bridges.

The problem is that a bridge is reported as enabled before the MEM/IO bits
are actually written to the PCI_COMMAND register, so another driver thread
starts memory requests through the not-yet-enabled bridge:

 CPU0                                        CPU1

 pci_enable_device_mem()                     pci_enable_device_mem()
   pci_enable_bridge()                         pci_enable_bridge()
     pci_is_enabled()
       return false;
     atomic_inc_return(enable_cnt)
     Start actual enabling the bridge
     ...                                         pci_is_enabled()
     ...                                           return true;
     ...                                     Start memory requests <-- FAIL
     ...
     Set the PCI_COMMAND_MEMORY bit <-- Must wait for this

Protect the pci_enable/disable_device() and pci_enable_bridge(), which is
similar to the previous solution from commit 40f11adc7cd9 ("PCI: Avoid race
while enabling upstream bridges"), but adding a per-device mutexes and
preventing the dev->enable_cnt from from incrementing early.

CC: Srinath Mannam <srinath.mannam@broadcom.com>
CC: Marta Rybczynska <mrybczyn@kalray.eu>
Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>

[1] https://lore.kernel.org/linux-pci/1501858648-22228-1-git-send-email-srinath.mannam@broadcom.com/T/#u
    [RFC PATCH v3] pci: Concurrency issue during pci enable bridge

[2] https://lore.kernel.org/linux-pci/744877924.5841545.1521630049567.JavaMail.zimbra@kalray.eu/T/#u
    [RFC PATCH] nvme: avoid race-conditions when enabling devices
---
 drivers/pci/pci.c   | 26 ++++++++++++++++++++++----
 drivers/pci/probe.c |  1 +
 include/linux/pci.h |  1 +
 3 files changed, 24 insertions(+), 4 deletions(-)

Comments

Carlo Pisani Oct. 25, 2019, 2:33 p.m. UTC | #1
pci_bus 0000:00: root bus resource [mem 0x50000000-0x5fffffff]
pci_bus 0000:00: root bus resource [io  0x18800000-0x188fffff]
pci_bus 0000:00: root bus resource [??? 0x00000000 flags 0x0]
pci_bus 0000:00: No busn resource found for root bus, will use [bus 00-ff]
pci 0000:00:00.0: [Firmware Bug]: reg 0x14: invalid BAR (can't size)
pci 0000:00:00.0: [Firmware Bug]: reg 0x18: invalid BAR (can't size)
pci 0000:00:04.0: BAR 0: assigned [mem 0x50000000-0x5000ffff]
pci 0000:00:05.0: BAR 1: assigned [mem 0x50010000-0x50010fff]
pci 0000:00:05.0: BAR 3: assigned [mem 0x50011000-0x50011fff]
pci 0000:00:0a.0: BAR 1: assigned [mem 0x50012000-0x50012fff]
pci 0000:00:0a.0: BAR 3: assigned [mem 0x50013000-0x50013fff]
pci 0000:00:02.0: BAR 0: assigned [io  0x18800000-0x188000ff]
pci 0000:00:02.0: BAR 1: assigned [mem 0x50014000-0x500140ff]
pci 0000:00:03.0: BAR 0: assigned [io  0x18800400-0x188004ff]
pci 0000:00:03.0: BAR 1: assigned [mem 0x50014100-0x500141ff]
pci 0000:00:05.0: BAR 0: assigned [io  0x18800800-0x1880081f]
pci 0000:00:05.0: BAR 2: assigned [io  0x18800820-0x1880083f]
pci 0000:00:0a.0: BAR 0: assigned [io  0x18800840-0x1880085f]
pci 0000:00:0a.0: BAR 2: assigned [io  0x18800860-0x1880087f]


00:00.0 Non-VGA unclassified device: Integrated Device Technology,
Inc. Device 0000
        Subsystem: Device 0214:011d
        Flags: bus master, 66MHz, medium devsel, latency 60, IRQ 140
        Memory at <unassigned> (32-bit, prefetchable)
        I/O ports at <ignored>
        I/O ports at <ignored>

00:02.0 Ethernet controller: VIA Technologies, Inc. VT6105/VT6106S
[Rhine-III] (rev 86)
        Subsystem: AST Research Inc Device 086c
        Flags: bus master, stepping, medium devsel, latency 64, IRQ 142
        I/O ports at 18800000 [size=256]
        Memory at 50014000 (32-bit, non-prefetchable) [size=256]
        Capabilities: [40] Power Management version 2
        Kernel driver in use: via-rhine

00:03.0 Ethernet controller: VIA Technologies, Inc. VT6105/VT6106S
[Rhine-III] (rev 86)
        Subsystem: AST Research Inc Device 086c
        Flags: bus master, stepping, medium devsel, latency 64, IRQ 143
        I/O ports at 18800400 [size=256]
        Memory at 50014100 (32-bit, non-prefetchable) [size=256]
        Capabilities: [40] Power Management version 2
        Kernel driver in use: via-rhine

00:04.0 Network controller: Atheros Communications Inc. Device 0029 (rev 01)
        Subsystem: Atheros Communications Inc. Device 2091
        Flags: bus master, 66MHz, medium devsel, latency 168, IRQ 142
        Memory at 50000000 (32-bit, non-prefetchable) [size=64K]
        Capabilities: [44] Power Management version 2
        Kernel driver in use: ath9k

00:05.0 Serial controller: Oxford Semiconductor Ltd OX16PCI954 (Quad
16950 UART) function 0 (Uart) (rev 01) (prog-if 06 [)
        Subsystem: Oxford Semiconductor Ltd Device 0000
        Flags: medium devsel, IRQ 143
        I/O ports at 18800800 [size=32]
        Memory at 50010000 (32-bit, non-prefetchable) [size=4K]
        I/O ports at 18800820 [size=32]
        Memory at 50011000 (32-bit, non-prefetchable) [size=4K]
        Capabilities: [40] Power Management version 2
        Kernel driver in use: serial

00:05.1 Non-VGA unclassified device: Oxford Semiconductor Ltd
OX16PCI954 (Quad 16950 UART) function 0 (Disabled) (rev 01)
        Subsystem: Oxford Semiconductor Ltd Device 0000
        Flags: medium devsel, IRQ 143
        I/O ports at <unassigned> [disabled]
        I/O ports at <unassigned> [disabled]
        I/O ports at <unassigned> [disabled]
        Capabilities: [40] Power Management version 2

00:0a.0 Serial controller: Oxford Semiconductor Ltd OX16PCI954 (Quad
16950 UART) function 0 (Uart) (rev 01) (prog-if 06 [)
        Subsystem: Oxford Semiconductor Ltd Device 0000
        Flags: medium devsel, IRQ 140
        I/O ports at 18800840 [size=32]
        Memory at 50012000 (32-bit, non-prefetchable) [size=4K]
        I/O ports at 18800860 [size=32]
        Memory at 50013000 (32-bit, non-prefetchable) [size=4K]
        Capabilities: [40] Power Management version 2
        Kernel driver in use: serial

00:0a.1 Non-VGA unclassified device: Oxford Semiconductor Ltd
OX16PCI954 (Quad 16950 UART) function 0 (Disabled) (rev 01)
        Subsystem: Oxford Semiconductor Ltd Device 0000
        Flags: medium devsel, IRQ 140
        I/O ports at <unassigned> [disabled]
        I/O ports at <unassigned> [disabled]
        I/O ports at <unassigned> [disabled]
        Capabilities: [40] Power Management version 2


hi guys
I have a couple of miniPCI Oxford Semiconductor Ltd OX16PCI954 cards
installed, and the dmesg looks weird

espeially these lines
pci_bus 0000:00: root bus resource [mem 0x50000000-0x5fffffff]
pci_bus 0000:00: root bus resource [io  0x18800000-0x188fffff]
pci_bus 0000:00: root bus resource [??? 0x00000000 flags 0x0]
pci_bus 0000:00: No busn resource found for root bus, will use [bus 00-ff]
pci 0000:00:00.0: [Firmware Bug]: reg 0x14: invalid BAR (can't size)
pci 0000:00:00.0: [Firmware Bug]: reg 0x18: invalid BAR (can't size)

besides, I am experimenting crashes happening in burn-in tests, and I
do suspect it's something related to the newly added cards

any enlightenment?
Bjorn Helgaas Oct. 25, 2019, 4:37 p.m. UTC | #2
On Fri, Oct 25, 2019 at 04:33:13PM +0200, Carlo Pisani wrote:
> pci_bus 0000:00: root bus resource [mem 0x50000000-0x5fffffff]
> pci_bus 0000:00: root bus resource [io  0x18800000-0x188fffff]
> pci_bus 0000:00: root bus resource [??? 0x00000000 flags 0x0]
> pci_bus 0000:00: No busn resource found for root bus, will use [bus 00-ff]
> pci 0000:00:00.0: [Firmware Bug]: reg 0x14: invalid BAR (can't size)
> pci 0000:00:00.0: [Firmware Bug]: reg 0x18: invalid BAR (can't size)
> pci 0000:00:04.0: BAR 0: assigned [mem 0x50000000-0x5000ffff]
> pci 0000:00:05.0: BAR 1: assigned [mem 0x50010000-0x50010fff]
> pci 0000:00:05.0: BAR 3: assigned [mem 0x50011000-0x50011fff]
> pci 0000:00:0a.0: BAR 1: assigned [mem 0x50012000-0x50012fff]
> pci 0000:00:0a.0: BAR 3: assigned [mem 0x50013000-0x50013fff]
> pci 0000:00:02.0: BAR 0: assigned [io  0x18800000-0x188000ff]
> pci 0000:00:02.0: BAR 1: assigned [mem 0x50014000-0x500140ff]
> pci 0000:00:03.0: BAR 0: assigned [io  0x18800400-0x188004ff]
> pci 0000:00:03.0: BAR 1: assigned [mem 0x50014100-0x500141ff]
> pci 0000:00:05.0: BAR 0: assigned [io  0x18800800-0x1880081f]
> pci 0000:00:05.0: BAR 2: assigned [io  0x18800820-0x1880083f]
> pci 0000:00:0a.0: BAR 0: assigned [io  0x18800840-0x1880085f]
> pci 0000:00:0a.0: BAR 2: assigned [io  0x18800860-0x1880087f]
> 
> 
> 00:00.0 Non-VGA unclassified device: Integrated Device Technology,
> Inc. Device 0000
>         Subsystem: Device 0214:011d
>         Flags: bus master, 66MHz, medium devsel, latency 60, IRQ 140
>         Memory at <unassigned> (32-bit, prefetchable)
>         I/O ports at <ignored>
>         I/O ports at <ignored>
> 
> 00:02.0 Ethernet controller: VIA Technologies, Inc. VT6105/VT6106S
> [Rhine-III] (rev 86)
>         Subsystem: AST Research Inc Device 086c
>         Flags: bus master, stepping, medium devsel, latency 64, IRQ 142
>         I/O ports at 18800000 [size=256]
>         Memory at 50014000 (32-bit, non-prefetchable) [size=256]
>         Capabilities: [40] Power Management version 2
>         Kernel driver in use: via-rhine
> 
> 00:03.0 Ethernet controller: VIA Technologies, Inc. VT6105/VT6106S
> [Rhine-III] (rev 86)
>         Subsystem: AST Research Inc Device 086c
>         Flags: bus master, stepping, medium devsel, latency 64, IRQ 143
>         I/O ports at 18800400 [size=256]
>         Memory at 50014100 (32-bit, non-prefetchable) [size=256]
>         Capabilities: [40] Power Management version 2
>         Kernel driver in use: via-rhine
> 
> 00:04.0 Network controller: Atheros Communications Inc. Device 0029 (rev 01)
>         Subsystem: Atheros Communications Inc. Device 2091
>         Flags: bus master, 66MHz, medium devsel, latency 168, IRQ 142
>         Memory at 50000000 (32-bit, non-prefetchable) [size=64K]
>         Capabilities: [44] Power Management version 2
>         Kernel driver in use: ath9k
> 
> 00:05.0 Serial controller: Oxford Semiconductor Ltd OX16PCI954 (Quad
> 16950 UART) function 0 (Uart) (rev 01) (prog-if 06 [)
>         Subsystem: Oxford Semiconductor Ltd Device 0000
>         Flags: medium devsel, IRQ 143
>         I/O ports at 18800800 [size=32]
>         Memory at 50010000 (32-bit, non-prefetchable) [size=4K]
>         I/O ports at 18800820 [size=32]
>         Memory at 50011000 (32-bit, non-prefetchable) [size=4K]
>         Capabilities: [40] Power Management version 2
>         Kernel driver in use: serial
> 
> 00:05.1 Non-VGA unclassified device: Oxford Semiconductor Ltd
> OX16PCI954 (Quad 16950 UART) function 0 (Disabled) (rev 01)
>         Subsystem: Oxford Semiconductor Ltd Device 0000
>         Flags: medium devsel, IRQ 143
>         I/O ports at <unassigned> [disabled]
>         I/O ports at <unassigned> [disabled]
>         I/O ports at <unassigned> [disabled]
>         Capabilities: [40] Power Management version 2
> 
> 00:0a.0 Serial controller: Oxford Semiconductor Ltd OX16PCI954 (Quad
> 16950 UART) function 0 (Uart) (rev 01) (prog-if 06 [)
>         Subsystem: Oxford Semiconductor Ltd Device 0000
>         Flags: medium devsel, IRQ 140
>         I/O ports at 18800840 [size=32]
>         Memory at 50012000 (32-bit, non-prefetchable) [size=4K]
>         I/O ports at 18800860 [size=32]
>         Memory at 50013000 (32-bit, non-prefetchable) [size=4K]
>         Capabilities: [40] Power Management version 2
>         Kernel driver in use: serial
> 
> 00:0a.1 Non-VGA unclassified device: Oxford Semiconductor Ltd
> OX16PCI954 (Quad 16950 UART) function 0 (Disabled) (rev 01)
>         Subsystem: Oxford Semiconductor Ltd Device 0000
>         Flags: medium devsel, IRQ 140
>         I/O ports at <unassigned> [disabled]
>         I/O ports at <unassigned> [disabled]
>         I/O ports at <unassigned> [disabled]
>         Capabilities: [40] Power Management version 2
> 
> 
> hi guys
> I have a couple of miniPCI Oxford Semiconductor Ltd OX16PCI954 cards
> installed, and the dmesg looks weird
> 
> espeially these lines
> pci_bus 0000:00: root bus resource [mem 0x50000000-0x5fffffff]
> pci_bus 0000:00: root bus resource [io  0x18800000-0x188fffff]
> pci_bus 0000:00: root bus resource [??? 0x00000000 flags 0x0]
> pci_bus 0000:00: No busn resource found for root bus, will use [bus 00-ff]

These resources are supplied to the PCI core, probably from DT.  A
complete dmesg log would show more.

> pci 0000:00:00.0: [Firmware Bug]: reg 0x14: invalid BAR (can't size)
> pci 0000:00:00.0: [Firmware Bug]: reg 0x18: invalid BAR (can't size)

> besides, I am experimenting crashes happening in burn-in tests, and I
> do suspect it's something related to the newly added cards

If you take the cards out do the lines you mention above change?

What sort of crashes do you see?  I assume it doesn't crash without
the cards?

It *looks* like the miniPCI cards should be these devices:

  00:05.0 Serial controller: Oxford Semiconductor Ltd OX16PCI954 (Quad 16950 UART) function 0 (Uart) (rev 01) (prog-if 06 [)
  00:0a.0 Serial controller: Oxford Semiconductor Ltd OX16PCI954 (Quad 16950 UART) function 0 (Uart) (rev 01) (prog-if 06 [)

which are unrelated to the 00:00.0 device with the broken BAR.

Patch
diff mbox series

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index a97e2571a527..44d0d12c80cf 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1643,6 +1643,8 @@  static void pci_enable_bridge(struct pci_dev *dev)
 	struct pci_dev *bridge;
 	int retval;
 
+	mutex_lock(&dev->enable_mutex);
+
 	bridge = pci_upstream_bridge(dev);
 	if (bridge)
 		pci_enable_bridge(bridge);
@@ -1650,6 +1652,7 @@  static void pci_enable_bridge(struct pci_dev *dev)
 	if (pci_is_enabled(dev)) {
 		if (!dev->is_busmaster)
 			pci_set_master(dev);
+		mutex_unlock(&dev->enable_mutex);
 		return;
 	}
 
@@ -1658,11 +1661,14 @@  static void pci_enable_bridge(struct pci_dev *dev)
 		pci_err(dev, "Error enabling bridge (%d), continuing\n",
 			retval);
 	pci_set_master(dev);
+	mutex_unlock(&dev->enable_mutex);
 }
 
 static int pci_enable_device_flags(struct pci_dev *dev, unsigned long flags)
 {
 	struct pci_dev *bridge;
+	/* Enable-locking of bridges is performed within the pci_enable_bridge() */
+	bool need_lock = !dev->subordinate;
 	int err;
 	int i, bars = 0;
 
@@ -1678,8 +1684,13 @@  static int pci_enable_device_flags(struct pci_dev *dev, unsigned long flags)
 		dev->current_state = (pmcsr & PCI_PM_CTRL_STATE_MASK);
 	}
 
-	if (atomic_inc_return(&dev->enable_cnt) > 1)
+	if (need_lock)
+		mutex_lock(&dev->enable_mutex);
+	if (pci_is_enabled(dev)) {
+		if (need_lock)
+			mutex_unlock(&dev->enable_mutex);
 		return 0;		/* already enabled */
+	}
 
 	bridge = pci_upstream_bridge(dev);
 	if (bridge)
@@ -1694,8 +1705,10 @@  static int pci_enable_device_flags(struct pci_dev *dev, unsigned long flags)
 			bars |= (1 << i);
 
 	err = do_pci_enable_device(dev, bars);
-	if (err < 0)
-		atomic_dec(&dev->enable_cnt);
+	if (err >= 0)
+		atomic_inc(&dev->enable_cnt);
+	if (need_lock)
+		mutex_unlock(&dev->enable_mutex);
 	return err;
 }
 
@@ -1939,15 +1952,20 @@  void pci_disable_device(struct pci_dev *dev)
 	if (dr)
 		dr->enabled = 0;
 
+	mutex_lock(&dev->enable_mutex);
 	dev_WARN_ONCE(&dev->dev, atomic_read(&dev->enable_cnt) <= 0,
 		      "disabling already-disabled device");
 
-	if (atomic_dec_return(&dev->enable_cnt) != 0)
+	if (atomic_dec_return(&dev->enable_cnt) != 0) {
+		mutex_unlock(&dev->enable_mutex);
 		return;
+	}
 
 	do_pci_disable_device(dev);
 
 	dev->is_busmaster = 0;
+
+	mutex_unlock(&dev->enable_mutex);
 }
 EXPORT_SYMBOL(pci_disable_device);
 
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 3d5271a7a849..d4f21e413638 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2158,6 +2158,7 @@  struct pci_dev *pci_alloc_dev(struct pci_bus *bus)
 	INIT_LIST_HEAD(&dev->bus_list);
 	dev->dev.type = &pci_dev_type;
 	dev->bus = pci_bus_get(bus);
+	mutex_init(&dev->enable_mutex);
 
 	return dev;
 }
diff --git a/include/linux/pci.h b/include/linux/pci.h
index f9088c89a534..525140e3a460 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -425,6 +425,7 @@  struct pci_dev {
 	unsigned int	no_vf_scan:1;		/* Don't scan for VFs after IOV enablement */
 	pci_dev_flags_t dev_flags;
 	atomic_t	enable_cnt;	/* pci_enable_device has been called */
+	struct mutex	enable_mutex;
 
 	u32		saved_config_space[16]; /* Config space saved at suspend time */
 	struct hlist_head saved_cap_space;