diff mbox series

[resend,v2] PCI: pciehp: Use down_read/write_nested(reset_lock) to fix lockdep errors

Message ID 20211217141709.379663-1-hdegoede@redhat.com
State New
Headers show
Series [resend,v2] PCI: pciehp: Use down_read/write_nested(reset_lock) to fix lockdep errors | expand

Commit Message

Hans de Goede Dec. 17, 2021, 2:17 p.m. UTC
Use down_read_nested() and down_write_nested() when taking the
ctrl->reset_lock rw-sem, passing the number of PCIe hotplug controllers in
the path to the PCI root bus as lock subclass parameter. This fixes the
following false-positive lockdep report when unplugging a Lenovo X1C8 from
a Lenovo 2nd gen TB3 dock:

[   28.583853] pcieport 0000:06:01.0: pciehp: Slot(1): Link Down
[   28.583891] pcieport 0000:06:01.0: pciehp: Slot(1): Card not present
[   28.584849] ============================================
[   28.584854] WARNING: possible recursive locking detected
[   28.584858] 5.16.0-rc2+ #621 Not tainted
[   28.584864] --------------------------------------------
[   28.584867] irq/124-pciehp/86 is trying to acquire lock:
[   28.584873] ffff8e5ac4299ef8 (&ctrl->reset_lock){.+.+}-{3:3}, at: pciehp_check_presence+0x23/0x80
[   28.584904]
               but task is already holding lock:
[   28.584908] ffff8e5ac4298af8 (&ctrl->reset_lock){.+.+}-{3:3}, at: pciehp_ist+0xf3/0x180
[   28.584929]
               other info that might help us debug this:
[   28.584933]  Possible unsafe locking scenario:

[   28.584936]        CPU0
[   28.584939]        ----
[   28.584942]   lock(&ctrl->reset_lock);
[   28.584949]   lock(&ctrl->reset_lock);
[   28.584955]
                *** DEADLOCK ***

[   28.584959]  May be due to missing lock nesting notation

[   28.584963] 3 locks held by irq/124-pciehp/86:
[   28.584970]  #0: ffff8e5ac4298af8 (&ctrl->reset_lock){.+.+}-{3:3}, at: pciehp_ist+0xf3/0x180
[   28.584991]  #1: ffffffffa3b024e8 (pci_rescan_remove_lock){+.+.}-{3:3}, at: pciehp_unconfigure_device+0x31/0x110
[   28.585012]  #2: ffff8e5ac1ee2248 (&dev->mutex){....}-{3:3}, at: device_release_driver+0x1c/0x40
[   28.585037]
               stack backtrace:
[   28.585042] CPU: 4 PID: 86 Comm: irq/124-pciehp Not tainted 5.16.0-rc2+ #621
[   28.585052] Hardware name: LENOVO 20U90SIT19/20U90SIT19, BIOS N2WET30W (1.20 ) 08/26/2021
[   28.585059] Call Trace:
[   28.585064]  <TASK>
[   28.585073]  dump_stack_lvl+0x59/0x73
[   28.585087]  __lock_acquire.cold+0xc5/0x2c6
[   28.585106]  ? find_held_lock+0x2b/0x80
[   28.585124]  lock_acquire+0xb5/0x2b0
[   28.585132]  ? pciehp_check_presence+0x23/0x80
[   28.585144]  ? lock_is_held_type+0xa8/0x120
[   28.585161]  down_read+0x3e/0x50
[   28.585172]  ? pciehp_check_presence+0x23/0x80
[   28.585183]  pciehp_check_presence+0x23/0x80
[   28.585194]  pciehp_runtime_resume+0x5c/0xa0
[   28.585206]  ? pci_msix_init+0x60/0x60
[   28.585214]  device_for_each_child+0x45/0x70
[   28.585227]  pcie_port_device_runtime_resume+0x20/0x30
[   28.585236]  pci_pm_runtime_resume+0xa7/0xc0
[   28.585246]  ? pci_pm_freeze_noirq+0x100/0x100
[   28.585257]  __rpm_callback+0x41/0x110
[   28.585271]  ? pci_pm_freeze_noirq+0x100/0x100
[   28.585281]  rpm_callback+0x59/0x70
[   28.585293]  rpm_resume+0x512/0x7b0
[   28.585309]  __pm_runtime_resume+0x4a/0x90
[   28.585322]  __device_release_driver+0x28/0x240
[   28.585338]  device_release_driver+0x26/0x40
[   28.585351]  pci_stop_bus_device+0x68/0x90
[   28.585363]  pci_stop_bus_device+0x2c/0x90
[   28.585373]  pci_stop_and_remove_bus_device+0xe/0x20
[   28.585384]  pciehp_unconfigure_device+0x6c/0x110
[   28.585396]  ? __pm_runtime_resume+0x58/0x90
[   28.585409]  pciehp_disable_slot+0x5b/0xe0
[   28.585421]  pciehp_handle_presence_or_link_change+0xc3/0x2f0
[   28.585436]  pciehp_ist+0x179/0x180
[   28.585449]  ? disable_irq_nosync+0x10/0x10
[   28.585460]  irq_thread_fn+0x1d/0x60
[   28.585470]  ? irq_thread+0x81/0x1a0
[   28.585480]  irq_thread+0xcb/0x1a0
[   28.585491]  ? irq_thread_fn+0x60/0x60
[   28.585502]  ? irq_thread_check_affinity+0xb0/0xb0
[   28.585514]  kthread+0x165/0x190
[   28.585522]  ? set_kthread_struct+0x40/0x40
[   28.585531]  ret_from_fork+0x1f/0x30
[   28.585554]  </TASK>

This lockdep warning is triggered because with Thunderbolt, hotplug ports
are nested. When removing multiple devices in a daisy-chain, each hotplug
port's reset_lock may be acquired recursively. It's never the same lock,
so the lockdep splat is a false positive.

Because locks at the same hierarchy level are never acquired recursively,
a per-level lockdep class is sufficient to fix the lockdep warning.

The choice to use one lockdep subclass per pcie-hotplug controller in
the path to the root-bus was made to conserve class keys because their
number is limited and the complexity grows quadratically with number of
keys according to Documentation/locking/lockdep-design.rst.

Link: https://lore.kernel.org/linux-pci/20190402021933.GA2966@mit.edu/
Link: https://lore.kernel.org/linux-pci/de684a28-9038-8fc6-27ca-3f6f2f6400d7@redhat.com/
Cc: stable@vger.kernel.org
Reported-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
---
Changes in v2:
- Only use a subclass for each hotplug capable port/parent in the path to
  the PCI root bus, instead of one for each level in the PCI hierarchy,
  to avoid hitting MAX_LOCKDEP_SUBCLASSES
- Drop the "PCI: Add a pci_dev_depth() helper function" since we now need
  a special version of this to only count hotplug ports
- Various commit message improvements
---
 drivers/pci/hotplug/pciehp.h      |  3 +++
 drivers/pci/hotplug/pciehp_core.c |  2 +-
 drivers/pci/hotplug/pciehp_hpc.c  | 21 ++++++++++++++++++---
 3 files changed, 22 insertions(+), 4 deletions(-)

Comments

Bjorn Helgaas Jan. 11, 2022, 5:14 p.m. UTC | #1
On Fri, Dec 17, 2021 at 03:17:09PM +0100, Hans de Goede wrote:
> Use down_read_nested() and down_write_nested() when taking the
> ctrl->reset_lock rw-sem, passing the number of PCIe hotplug controllers in
> the path to the PCI root bus as lock subclass parameter. This fixes the
> following false-positive lockdep report when unplugging a Lenovo X1C8 from
> a Lenovo 2nd gen TB3 dock:
> 
> [   28.583853] pcieport 0000:06:01.0: pciehp: Slot(1): Link Down
> [   28.583891] pcieport 0000:06:01.0: pciehp: Slot(1): Card not present
> [   28.584849] ============================================
> [   28.584854] WARNING: possible recursive locking detected
> [   28.584858] 5.16.0-rc2+ #621 Not tainted
> [   28.584864] --------------------------------------------
> [   28.584867] irq/124-pciehp/86 is trying to acquire lock:
> [   28.584873] ffff8e5ac4299ef8 (&ctrl->reset_lock){.+.+}-{3:3}, at: pciehp_check_presence+0x23/0x80
> [   28.584904]
>                but task is already holding lock:
> [   28.584908] ffff8e5ac4298af8 (&ctrl->reset_lock){.+.+}-{3:3}, at: pciehp_ist+0xf3/0x180
> [   28.584929]
>                other info that might help us debug this:
> [   28.584933]  Possible unsafe locking scenario:
> 
> [   28.584936]        CPU0
> [   28.584939]        ----
> [   28.584942]   lock(&ctrl->reset_lock);
> [   28.584949]   lock(&ctrl->reset_lock);
> [   28.584955]
>                 *** DEADLOCK ***
> 
> [   28.584959]  May be due to missing lock nesting notation
> 
> [   28.584963] 3 locks held by irq/124-pciehp/86:
> [   28.584970]  #0: ffff8e5ac4298af8 (&ctrl->reset_lock){.+.+}-{3:3}, at: pciehp_ist+0xf3/0x180
> [   28.584991]  #1: ffffffffa3b024e8 (pci_rescan_remove_lock){+.+.}-{3:3}, at: pciehp_unconfigure_device+0x31/0x110
> [   28.585012]  #2: ffff8e5ac1ee2248 (&dev->mutex){....}-{3:3}, at: device_release_driver+0x1c/0x40
> [   28.585037]
>                stack backtrace:
> [   28.585042] CPU: 4 PID: 86 Comm: irq/124-pciehp Not tainted 5.16.0-rc2+ #621
> [   28.585052] Hardware name: LENOVO 20U90SIT19/20U90SIT19, BIOS N2WET30W (1.20 ) 08/26/2021
> [   28.585059] Call Trace:
> [   28.585064]  <TASK>
> [   28.585073]  dump_stack_lvl+0x59/0x73
> [   28.585087]  __lock_acquire.cold+0xc5/0x2c6
> [   28.585106]  ? find_held_lock+0x2b/0x80
> [   28.585124]  lock_acquire+0xb5/0x2b0
> [   28.585132]  ? pciehp_check_presence+0x23/0x80
> [   28.585144]  ? lock_is_held_type+0xa8/0x120
> [   28.585161]  down_read+0x3e/0x50
> [   28.585172]  ? pciehp_check_presence+0x23/0x80
> [   28.585183]  pciehp_check_presence+0x23/0x80
> [   28.585194]  pciehp_runtime_resume+0x5c/0xa0
> [   28.585206]  ? pci_msix_init+0x60/0x60
> [   28.585214]  device_for_each_child+0x45/0x70
> [   28.585227]  pcie_port_device_runtime_resume+0x20/0x30
> [   28.585236]  pci_pm_runtime_resume+0xa7/0xc0
> [   28.585246]  ? pci_pm_freeze_noirq+0x100/0x100
> [   28.585257]  __rpm_callback+0x41/0x110
> [   28.585271]  ? pci_pm_freeze_noirq+0x100/0x100
> [   28.585281]  rpm_callback+0x59/0x70
> [   28.585293]  rpm_resume+0x512/0x7b0
> [   28.585309]  __pm_runtime_resume+0x4a/0x90
> [   28.585322]  __device_release_driver+0x28/0x240
> [   28.585338]  device_release_driver+0x26/0x40
> [   28.585351]  pci_stop_bus_device+0x68/0x90
> [   28.585363]  pci_stop_bus_device+0x2c/0x90
> [   28.585373]  pci_stop_and_remove_bus_device+0xe/0x20
> [   28.585384]  pciehp_unconfigure_device+0x6c/0x110
> [   28.585396]  ? __pm_runtime_resume+0x58/0x90
> [   28.585409]  pciehp_disable_slot+0x5b/0xe0
> [   28.585421]  pciehp_handle_presence_or_link_change+0xc3/0x2f0
> [   28.585436]  pciehp_ist+0x179/0x180
> [   28.585449]  ? disable_irq_nosync+0x10/0x10
> [   28.585460]  irq_thread_fn+0x1d/0x60
> [   28.585470]  ? irq_thread+0x81/0x1a0
> [   28.585480]  irq_thread+0xcb/0x1a0
> [   28.585491]  ? irq_thread_fn+0x60/0x60
> [   28.585502]  ? irq_thread_check_affinity+0xb0/0xb0
> [   28.585514]  kthread+0x165/0x190
> [   28.585522]  ? set_kthread_struct+0x40/0x40
> [   28.585531]  ret_from_fork+0x1f/0x30
> [   28.585554]  </TASK>
> 
> This lockdep warning is triggered because with Thunderbolt, hotplug ports
> are nested. When removing multiple devices in a daisy-chain, each hotplug
> port's reset_lock may be acquired recursively. It's never the same lock,
> so the lockdep splat is a false positive.
> 
> Because locks at the same hierarchy level are never acquired recursively,
> a per-level lockdep class is sufficient to fix the lockdep warning.
> 
> The choice to use one lockdep subclass per pcie-hotplug controller in
> the path to the root-bus was made to conserve class keys because their
> number is limited and the complexity grows quadratically with number of
> keys according to Documentation/locking/lockdep-design.rst.
> 
> Link: https://lore.kernel.org/linux-pci/20190402021933.GA2966@mit.edu/
> Link: https://lore.kernel.org/linux-pci/de684a28-9038-8fc6-27ca-3f6f2f6400d7@redhat.com/
> Cc: stable@vger.kernel.org
> Reported-by: "Theodore Ts'o" <tytso@mit.edu>
> Reviewed-by: Lukas Wunner <lukas@wunner.de>
> Signed-off-by: Hans de Goede <hdegoede@redhat.com>

Applied to pci/hotplug for v5.17, thanks, Hans!

> ---
> Changes in v2:
> - Only use a subclass for each hotplug capable port/parent in the path to
>   the PCI root bus, instead of one for each level in the PCI hierarchy,
>   to avoid hitting MAX_LOCKDEP_SUBCLASSES
> - Drop the "PCI: Add a pci_dev_depth() helper function" since we now need
>   a special version of this to only count hotplug ports
> - Various commit message improvements
> ---
>  drivers/pci/hotplug/pciehp.h      |  3 +++
>  drivers/pci/hotplug/pciehp_core.c |  2 +-
>  drivers/pci/hotplug/pciehp_hpc.c  | 21 ++++++++++++++++++---
>  3 files changed, 22 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/pci/hotplug/pciehp.h b/drivers/pci/hotplug/pciehp.h
> index 918dccbc74b6..e0a614acee05 100644
> --- a/drivers/pci/hotplug/pciehp.h
> +++ b/drivers/pci/hotplug/pciehp.h
> @@ -75,6 +75,8 @@ extern int pciehp_poll_time;
>   * @reset_lock: prevents access to the Data Link Layer Link Active bit in the
>   *	Link Status register and to the Presence Detect State bit in the Slot
>   *	Status register during a slot reset which may cause them to flap
> + * @depth: Number of additional hotplug ports in the path to the root bus,
> + *	used as lock subclass for @reset_lock
>   * @ist_running: flag to keep user request waiting while IRQ thread is running
>   * @request_result: result of last user request submitted to the IRQ thread
>   * @requester: wait queue to wake up on completion of user request,
> @@ -106,6 +108,7 @@ struct controller {
>  
>  	struct hotplug_slot hotplug_slot;	/* hotplug core interface */
>  	struct rw_semaphore reset_lock;
> +	unsigned int depth;
>  	unsigned int ist_running;
>  	int request_result;
>  	wait_queue_head_t requester;
> diff --git a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c
> index f34114d45259..4042d87d539d 100644
> --- a/drivers/pci/hotplug/pciehp_core.c
> +++ b/drivers/pci/hotplug/pciehp_core.c
> @@ -166,7 +166,7 @@ static void pciehp_check_presence(struct controller *ctrl)
>  {
>  	int occupied;
>  
> -	down_read(&ctrl->reset_lock);
> +	down_read_nested(&ctrl->reset_lock, ctrl->depth);
>  	mutex_lock(&ctrl->state_lock);
>  
>  	occupied = pciehp_card_present_or_link_active(ctrl);
> diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
> index 83a0fa119cae..963fb50528da 100644
> --- a/drivers/pci/hotplug/pciehp_hpc.c
> +++ b/drivers/pci/hotplug/pciehp_hpc.c
> @@ -583,7 +583,7 @@ static void pciehp_ignore_dpc_link_change(struct controller *ctrl,
>  	 * the corresponding link change may have been ignored above.
>  	 * Synthesize it to ensure that it is acted on.
>  	 */
> -	down_read(&ctrl->reset_lock);
> +	down_read_nested(&ctrl->reset_lock, ctrl->depth);
>  	if (!pciehp_check_link_active(ctrl))
>  		pciehp_request(ctrl, PCI_EXP_SLTSTA_DLLSC);
>  	up_read(&ctrl->reset_lock);
> @@ -746,7 +746,7 @@ static irqreturn_t pciehp_ist(int irq, void *dev_id)
>  	 * Disable requests have higher priority than Presence Detect Changed
>  	 * or Data Link Layer State Changed events.
>  	 */
> -	down_read(&ctrl->reset_lock);
> +	down_read_nested(&ctrl->reset_lock, ctrl->depth);
>  	if (events & DISABLE_SLOT)
>  		pciehp_handle_disable_request(ctrl);
>  	else if (events & (PCI_EXP_SLTSTA_PDC | PCI_EXP_SLTSTA_DLLSC))
> @@ -906,7 +906,7 @@ int pciehp_reset_slot(struct hotplug_slot *hotplug_slot, bool probe)
>  	if (probe)
>  		return 0;
>  
> -	down_write(&ctrl->reset_lock);
> +	down_write_nested(&ctrl->reset_lock, ctrl->depth);
>  
>  	if (!ATTN_BUTTN(ctrl)) {
>  		ctrl_mask |= PCI_EXP_SLTCTL_PDCE;
> @@ -962,6 +962,20 @@ static inline void dbg_ctrl(struct controller *ctrl)
>  
>  #define FLAG(x, y)	(((x) & (y)) ? '+' : '-')
>  
> +static inline int pcie_hotplug_depth(struct pci_dev *dev)
> +{
> +	struct pci_bus *bus = dev->bus;
> +	int depth = 0;
> +
> +	while (bus->parent) {
> +		bus = bus->parent;
> +		if (bus->self && bus->self->is_hotplug_bridge)
> +			depth++;
> +	}
> +
> +	return depth;
> +}
> +
>  struct controller *pcie_init(struct pcie_device *dev)
>  {
>  	struct controller *ctrl;
> @@ -975,6 +989,7 @@ struct controller *pcie_init(struct pcie_device *dev)
>  		return NULL;
>  
>  	ctrl->pcie = dev;
> +	ctrl->depth = pcie_hotplug_depth(dev->port);
>  	pcie_capability_read_dword(pdev, PCI_EXP_SLTCAP, &slot_cap);
>  
>  	if (pdev->hotplug_user_indicators)
> -- 
> 2.33.1
>
Lukas Wunner Jan. 12, 2022, 8:28 a.m. UTC | #2
On Tue, Jan 11, 2022 at 11:14:47AM -0600, Bjorn Helgaas wrote:
> On Fri, Dec 17, 2021 at 03:17:09PM +0100, Hans de Goede wrote:
> > Use down_read_nested() and down_write_nested() when taking the
> > ctrl->reset_lock rw-sem, passing the number of PCIe hotplug controllers in
> > the path to the PCI root bus as lock subclass parameter. This fixes the
> > following false-positive lockdep report when unplugging a Lenovo X1C8 from
> > a Lenovo 2nd gen TB3 dock:
[...]
> Applied to pci/hotplug for v5.17, thanks, Hans!

I've realized only now that Hans reported this issue already in August 2020
and opened a bugzilla for it:

https://bugzilla.kernel.org/show_bug.cgi?id=208855

The status can now be set to RESOLVED FIXED.  I don't have permission
to do that but perhaps either of you, Bjorn or Hans, has?

Also, the commit could optionally be amended with a Link: tag to that
bugzilla entry.

Thanks,

Lukas
Hans de Goede Jan. 12, 2022, 12:39 p.m. UTC | #3
Hi,

On 1/12/22 09:28, Lukas Wunner wrote:
> On Tue, Jan 11, 2022 at 11:14:47AM -0600, Bjorn Helgaas wrote:
>> On Fri, Dec 17, 2021 at 03:17:09PM +0100, Hans de Goede wrote:
>>> Use down_read_nested() and down_write_nested() when taking the
>>> ctrl->reset_lock rw-sem, passing the number of PCIe hotplug controllers in
>>> the path to the PCI root bus as lock subclass parameter. This fixes the
>>> following false-positive lockdep report when unplugging a Lenovo X1C8 from
>>> a Lenovo 2nd gen TB3 dock:
> [...]
>> Applied to pci/hotplug for v5.17, thanks, Hans!
> 
> I've realized only now that Hans reported this issue already in August 2020
> and opened a bugzilla for it:
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=208855

Ah I completely forgot about having filed that bug, good catch.

> The status can now be set to RESOLVED FIXED.  I don't have permission
> to do that but perhaps either of you, Bjorn or Hans, has?

I have added a comment and closed the bug now. Note that you can email
the kernel.org admins with your bugzilla login + a friendly requests
to give you some more bugzilla rights. I did that a while ago when
I hit similar issues doing triage of bugzilla.kernel.org bugs.

> Also, the commit could optionally be amended with a Link: tag to that
> bugzilla entry.

There isn't really any new info in the bugzilla though, so I guess
the commit is fine as is. With that said if Bjorn wants to add it
that is fine too of course.

Regards,

Hans
Bjorn Helgaas Jan. 12, 2022, 12:48 p.m. UTC | #4
On Wed, Jan 12, 2022 at 01:39:07PM +0100, Hans de Goede wrote:
> Hi,
> 
> On 1/12/22 09:28, Lukas Wunner wrote:
> > On Tue, Jan 11, 2022 at 11:14:47AM -0600, Bjorn Helgaas wrote:
> >> On Fri, Dec 17, 2021 at 03:17:09PM +0100, Hans de Goede wrote:
> >>> Use down_read_nested() and down_write_nested() when taking the
> >>> ctrl->reset_lock rw-sem, passing the number of PCIe hotplug controllers in
> >>> the path to the PCI root bus as lock subclass parameter. This fixes the
> >>> following false-positive lockdep report when unplugging a Lenovo X1C8 from
> >>> a Lenovo 2nd gen TB3 dock:
> > [...]
> >> Applied to pci/hotplug for v5.17, thanks, Hans!
> > 
> > I've realized only now that Hans reported this issue already in August 2020
> > and opened a bugzilla for it:
> > 
> > https://bugzilla.kernel.org/show_bug.cgi?id=208855
> 
> Ah I completely forgot about having filed that bug, good catch.
> 
> > The status can now be set to RESOLVED FIXED.  I don't have permission
> > to do that but perhaps either of you, Bjorn or Hans, has?
> 
> I have added a comment and closed the bug now. Note that you can email
> the kernel.org admins with your bugzilla login + a friendly requests
> to give you some more bugzilla rights. I did that a while ago when
> I hit similar issues doing triage of bugzilla.kernel.org bugs.
> 
> > Also, the commit could optionally be amended with a Link: tag to that
> > bugzilla entry.
> 
> There isn't really any new info in the bugzilla though, so I guess
> the commit is fine as is. With that said if Bjorn wants to add it
> that is fine too of course.

Thanks, I added it.  It does have a link to Ted's original email,
which includes a complete dmesg log.

Of course, that makes the commit referenced by
https://bugzilla.kernel.org/show_bug.cgi?id=208855#c2 obsolete, but
that happens anyway because I often rebase to add things like this.

Bjorn
diff mbox series

Patch

diff --git a/drivers/pci/hotplug/pciehp.h b/drivers/pci/hotplug/pciehp.h
index 918dccbc74b6..e0a614acee05 100644
--- a/drivers/pci/hotplug/pciehp.h
+++ b/drivers/pci/hotplug/pciehp.h
@@ -75,6 +75,8 @@  extern int pciehp_poll_time;
  * @reset_lock: prevents access to the Data Link Layer Link Active bit in the
  *	Link Status register and to the Presence Detect State bit in the Slot
  *	Status register during a slot reset which may cause them to flap
+ * @depth: Number of additional hotplug ports in the path to the root bus,
+ *	used as lock subclass for @reset_lock
  * @ist_running: flag to keep user request waiting while IRQ thread is running
  * @request_result: result of last user request submitted to the IRQ thread
  * @requester: wait queue to wake up on completion of user request,
@@ -106,6 +108,7 @@  struct controller {
 
 	struct hotplug_slot hotplug_slot;	/* hotplug core interface */
 	struct rw_semaphore reset_lock;
+	unsigned int depth;
 	unsigned int ist_running;
 	int request_result;
 	wait_queue_head_t requester;
diff --git a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c
index f34114d45259..4042d87d539d 100644
--- a/drivers/pci/hotplug/pciehp_core.c
+++ b/drivers/pci/hotplug/pciehp_core.c
@@ -166,7 +166,7 @@  static void pciehp_check_presence(struct controller *ctrl)
 {
 	int occupied;
 
-	down_read(&ctrl->reset_lock);
+	down_read_nested(&ctrl->reset_lock, ctrl->depth);
 	mutex_lock(&ctrl->state_lock);
 
 	occupied = pciehp_card_present_or_link_active(ctrl);
diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
index 83a0fa119cae..963fb50528da 100644
--- a/drivers/pci/hotplug/pciehp_hpc.c
+++ b/drivers/pci/hotplug/pciehp_hpc.c
@@ -583,7 +583,7 @@  static void pciehp_ignore_dpc_link_change(struct controller *ctrl,
 	 * the corresponding link change may have been ignored above.
 	 * Synthesize it to ensure that it is acted on.
 	 */
-	down_read(&ctrl->reset_lock);
+	down_read_nested(&ctrl->reset_lock, ctrl->depth);
 	if (!pciehp_check_link_active(ctrl))
 		pciehp_request(ctrl, PCI_EXP_SLTSTA_DLLSC);
 	up_read(&ctrl->reset_lock);
@@ -746,7 +746,7 @@  static irqreturn_t pciehp_ist(int irq, void *dev_id)
 	 * Disable requests have higher priority than Presence Detect Changed
 	 * or Data Link Layer State Changed events.
 	 */
-	down_read(&ctrl->reset_lock);
+	down_read_nested(&ctrl->reset_lock, ctrl->depth);
 	if (events & DISABLE_SLOT)
 		pciehp_handle_disable_request(ctrl);
 	else if (events & (PCI_EXP_SLTSTA_PDC | PCI_EXP_SLTSTA_DLLSC))
@@ -906,7 +906,7 @@  int pciehp_reset_slot(struct hotplug_slot *hotplug_slot, bool probe)
 	if (probe)
 		return 0;
 
-	down_write(&ctrl->reset_lock);
+	down_write_nested(&ctrl->reset_lock, ctrl->depth);
 
 	if (!ATTN_BUTTN(ctrl)) {
 		ctrl_mask |= PCI_EXP_SLTCTL_PDCE;
@@ -962,6 +962,20 @@  static inline void dbg_ctrl(struct controller *ctrl)
 
 #define FLAG(x, y)	(((x) & (y)) ? '+' : '-')
 
+static inline int pcie_hotplug_depth(struct pci_dev *dev)
+{
+	struct pci_bus *bus = dev->bus;
+	int depth = 0;
+
+	while (bus->parent) {
+		bus = bus->parent;
+		if (bus->self && bus->self->is_hotplug_bridge)
+			depth++;
+	}
+
+	return depth;
+}
+
 struct controller *pcie_init(struct pcie_device *dev)
 {
 	struct controller *ctrl;
@@ -975,6 +989,7 @@  struct controller *pcie_init(struct pcie_device *dev)
 		return NULL;
 
 	ctrl->pcie = dev;
+	ctrl->depth = pcie_hotplug_depth(dev->port);
 	pcie_capability_read_dword(pdev, PCI_EXP_SLTCAP, &slot_cap);
 
 	if (pdev->hotplug_user_indicators)