mbox series

[00/32] Rework pciehp event handling & add runtime PM

Message ID cover.1529173804.git.lukas@wunner.de
Headers show
Series Rework pciehp event handling & add runtime PM | expand

Message

Lukas Wunner June 16, 2018, 7:25 p.m. UTC
Rework pciehp to use modern, threaded IRQ handling.  The slot is powered
on and off synchronously in the IRQ thread, no indirection via a work
queue anymore.

When the slot is enabled/disabled by the user via sysfs or an Attention
Button press, a request is sent to the IRQ thread.  The IRQ thread is
thus the sole entity enabling/disabling the slot.

The IRQ thread can cope with missed events, e.g. if a card is inserted
and immediately pulled out before the IRQ thread had a chance to react.
It also tolerates an initially unstable link as observed in the wild by
Stefan Roese.

Finally, runtime PM support is added.  This was the original motivation
of the series because runtime suspending hotplug ports is needed to power
down Thunderbolt controllers on idle, which saves ~1.5W per controller.
Runtime resuming ports takes tenths of milliseconds during which events
may be missed, this in turn necessitated the event handling rework.

I've pushed the series to GitHub to ease reviewing/fetching:
https://github.com/l1k/linux/commits/pciehp_runpm_v2

Please review and test.

Thanks,

Lukas


Lukas Wunner (32):
  PCI: hotplug: Don't leak pci_slot on registration failure
  PCI: pciehp: Fix UAF on unplug
  PCI: pciehp: Fix deadlock on unplug
  PCI: pciehp: Fix unprotected list iteration in IRQ handler
  PCI: pciehp: Drop unnecessary NULL pointer check
  PCI: pciehp: Declare pciehp_unconfigure_device() void
  PCI: pciehp: Document struct slot and struct controller
  genirq: Synchronize only with single thread on free_irq()
  PCI: pciehp: Convert to threaded IRQ
  PCI: pciehp: Convert to threaded polling
  PCI: pciehp: Stop blinking on slot enable failure
  PCI: pciehp: Handle events synchronously
  PCI: pciehp: Drop slot workqueue
  PCI: hotplug: Demidlayer registration with the core
  PCI: pciehp: Publish to user space last on probe
  PCI: pciehp: Track enable/disable status
  PCI: pciehp: Enable/disable exclusively from IRQ thread
  PCI: pciehp: Drop enable/disable lock
  PCI: pciehp: Declare pciehp_enable/disable_slot() static
  PCI: pciehp: Tolerate initially unstable link
  PCI: pciehp: Become resilient to missed events
  PCI: pciehp: Always enable occupied slot on probe
  PCI: pciehp: Avoid slot access during reset
  PCI: portdrv: Deduplicate PM callback iterator
  PCI: pciehp: Clear spurious events earlier on resume
  PCI: pciehp: Obey compulsory command delay after resume
  PCI: pciehp: Support interrupts sent from D3hot
  PCI: pciehp: Resume to D0 on enable/disable
  PCI: pciehp: Resume parent to D0 on config space access
  PCI: sysfs: Resume to D0 on function reset
  PCI: Whitelist native hotplug ports for runtime D3
  PCI: Whitelist Thunderbolt ports for runtime D3

 drivers/pci/hotplug/acpiphp_core.c      |  22 +-
 drivers/pci/hotplug/cpci_hotplug_core.c |  14 +-
 drivers/pci/hotplug/cpqphp_core.c       |  16 +-
 drivers/pci/hotplug/ibmphp_core.c       |  15 +-
 drivers/pci/hotplug/ibmphp_ebda.c       |  20 --
 drivers/pci/hotplug/pci_hotplug_core.c  | 134 +++++++--
 drivers/pci/hotplug/pciehp.h            | 117 ++++++--
 drivers/pci/hotplug/pciehp_core.c       | 104 ++++---
 drivers/pci/hotplug/pciehp_ctrl.c       | 352 ++++++++++--------------
 drivers/pci/hotplug/pciehp_hpc.c        | 261 ++++++++++--------
 drivers/pci/hotplug/pciehp_pci.c        |  42 ++-
 drivers/pci/hotplug/pcihp_skeleton.c    |  16 +-
 drivers/pci/hotplug/pnv_php.c           |   5 +-
 drivers/pci/hotplug/rpaphp_core.c       |   2 +-
 drivers/pci/hotplug/rpaphp_slot.c       |  13 +-
 drivers/pci/hotplug/s390_pci_hpc.c      |  13 +-
 drivers/pci/hotplug/sgi_hotplug.c       |   9 +-
 drivers/pci/hotplug/shpchp_core.c       |  20 +-
 drivers/pci/pci-sysfs.c                 |   2 +
 drivers/pci/pci.c                       |  21 +-
 drivers/pci/pcie/portdrv.h              |   2 +
 drivers/pci/pcie/portdrv_core.c         |  30 +-
 drivers/pci/pcie/portdrv_pci.c          |   2 +
 drivers/platform/x86/asus-wmi.c         |  12 +-
 drivers/platform/x86/eeepc-laptop.c     |  12 +-
 include/linux/pci_hotplug.h             |  15 +-
 kernel/irq/manage.c                     |  21 +-
 27 files changed, 687 insertions(+), 605 deletions(-)

Comments

Mika Westerberg June 21, 2018, 12:19 p.m. UTC | #1
On Sat, Jun 16, 2018 at 09:25:00PM +0200, Lukas Wunner wrote:
> Rework pciehp to use modern, threaded IRQ handling.  The slot is powered
> on and off synchronously in the IRQ thread, no indirection via a work
> queue anymore.
> 
> When the slot is enabled/disabled by the user via sysfs or an Attention
> Button press, a request is sent to the IRQ thread.  The IRQ thread is
> thus the sole entity enabling/disabling the slot.
> 
> The IRQ thread can cope with missed events, e.g. if a card is inserted
> and immediately pulled out before the IRQ thread had a chance to react.
> It also tolerates an initially unstable link as observed in the wild by
> Stefan Roese.
> 
> Finally, runtime PM support is added.  This was the original motivation
> of the series because runtime suspending hotplug ports is needed to power
> down Thunderbolt controllers on idle, which saves ~1.5W per controller.
> Runtime resuming ports takes tenths of milliseconds during which events
> may be missed, this in turn necessitated the event handling rework.
> 
> I've pushed the series to GitHub to ease reviewing/fetching:
> https://github.com/l1k/linux/commits/pciehp_runpm_v2
> 
> Please review and test.

I've tested this series on every native PCIe hotplug system I have here,
including one non-Thunderbolt (just hotplugging NVMes) and I did not
observe any issues. There are few minor comments for few patches but in
general this is really nice cleanup.
Patel, Mayurkumar June 27, 2018, 1:35 p.m. UTC | #2
>-----Original Message-----
>From: Mika Westerberg [mailto:mika.westerberg@linux.intel.com]
>Sent: Thursday, June 21, 2018 2:20 PM
>To: Lukas Wunner <lukas@wunner.de>
>Cc: Bjorn Helgaas <bhelgaas@google.com>; Wysocki, Rafael J <rafael.j.wysocki@intel.com>; Raj, Ashok <ashok.raj@intel.com>; Busch,
>Keith <keith.busch@intel.com>; Yinghai Lu <yinghai@kernel.org>; Sinan Kaya <okaya@kernel.org>; linux-pci@vger.kernel.org; Greg Kroah-
>Hartman <greg@kroah.com>; Thomas Gleixner <tglx@linutronix.de>; Patel, Mayurkumar <mayurkumar.patel@intel.com>; Kenji Kaneshige
><kaneshige.kenji@jp.fujitsu.com>; Stefan Roese <sr@denx.de>; Rajat Jain <rajatja@google.com>; Alex Williamson
><alex.williamson@redhat.com>; Andreas Noever <andreas.noever@gmail.com>
>Subject: Re: [PATCH 00/32] Rework pciehp event handling & add runtime PM
>
>On Sat, Jun 16, 2018 at 09:25:00PM +0200, Lukas Wunner wrote:
>> Rework pciehp to use modern, threaded IRQ handling.  The slot is powered
>> on and off synchronously in the IRQ thread, no indirection via a work
>> queue anymore.
>>
>> When the slot is enabled/disabled by the user via sysfs or an Attention
>> Button press, a request is sent to the IRQ thread.  The IRQ thread is
>> thus the sole entity enabling/disabling the slot.
>>
>> The IRQ thread can cope with missed events, e.g. if a card is inserted
>> and immediately pulled out before the IRQ thread had a chance to react.
>> It also tolerates an initially unstable link as observed in the wild by
>> Stefan Roese.
>>
>> Finally, runtime PM support is added.  This was the original motivation
>> of the series because runtime suspending hotplug ports is needed to power
>> down Thunderbolt controllers on idle, which saves ~1.5W per controller.
>> Runtime resuming ports takes tenths of milliseconds during which events
>> may be missed, this in turn necessitated the event handling rework.
>>
>> I've pushed the series to GitHub to ease reviewing/fetching:
>> https://github.com/l1k/linux/commits/pciehp_runpm_v2
>>
>> Please review and test.
>
>I've tested this series on every native PCIe hotplug system I have here,
>including one non-Thunderbolt (just hotplugging NVMes) and I did not
>observe any issues. There are few minor comments for few patches but in
>general this is really nice cleanup.


I have not found any issues with this test series on my test setup related to PCIe hotplug.

Intel Deutschland GmbH
Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de
Managing Directors: Christin Eisenschmid, Christian Lamprechter
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928
Bjorn Helgaas July 12, 2018, 10:28 p.m. UTC | #3
On Sat, Jun 16, 2018 at 09:25:00PM +0200, Lukas Wunner wrote:
> Rework pciehp to use modern, threaded IRQ handling.  The slot is powered
> on and off synchronously in the IRQ thread, no indirection via a work
> queue anymore.
> 
> When the slot is enabled/disabled by the user via sysfs or an Attention
> Button press, a request is sent to the IRQ thread.  The IRQ thread is
> thus the sole entity enabling/disabling the slot.
> 
> The IRQ thread can cope with missed events, e.g. if a card is inserted
> and immediately pulled out before the IRQ thread had a chance to react.
> It also tolerates an initially unstable link as observed in the wild by
> Stefan Roese.
> 
> Finally, runtime PM support is added.  This was the original motivation
> of the series because runtime suspending hotplug ports is needed to power
> down Thunderbolt controllers on idle, which saves ~1.5W per controller.
> Runtime resuming ports takes tenths of milliseconds during which events
> may be missed, this in turn necessitated the event handling rework.
> 
> I've pushed the series to GitHub to ease reviewing/fetching:
> https://github.com/l1k/linux/commits/pciehp_runpm_v2

These patches all showed up in random order in my inbox, I think
because they were sent with zero delay between them, which made them a
bit of a hassle to read and apply via email.  If you repost these (I
think I saw something about an issue with patch 3), can you use a
second or two of delay between them?
Lukas Wunner July 13, 2018, 7:54 a.m. UTC | #4
On Thu, Jul 12, 2018 at 05:28:16PM -0500, Bjorn Helgaas wrote:
> On Sat, Jun 16, 2018 at 09:25:00PM +0200, Lukas Wunner wrote:
> > Rework pciehp to use modern, threaded IRQ handling.  The slot is powered
> > on and off synchronously in the IRQ thread, no indirection via a work
> > queue anymore.
> > 
> > When the slot is enabled/disabled by the user via sysfs or an Attention
> > Button press, a request is sent to the IRQ thread.  The IRQ thread is
> > thus the sole entity enabling/disabling the slot.
> > 
> > The IRQ thread can cope with missed events, e.g. if a card is inserted
> > and immediately pulled out before the IRQ thread had a chance to react.
> > It also tolerates an initially unstable link as observed in the wild by
> > Stefan Roese.
> > 
> > Finally, runtime PM support is added.  This was the original motivation
> > of the series because runtime suspending hotplug ports is needed to power
> > down Thunderbolt controllers on idle, which saves ~1.5W per controller.
> > Runtime resuming ports takes tenths of milliseconds during which events
> > may be missed, this in turn necessitated the event handling rework.
> > 
> > I've pushed the series to GitHub to ease reviewing/fetching:
> > https://github.com/l1k/linux/commits/pciehp_runpm_v2
> 
> These patches all showed up in random order in my inbox, I think
> because they were sent with zero delay between them, which made them a
> bit of a hassle to read and apply via email.

I habitually use a 20 second delay between each message, but I got
unlucky and the mail servers delivered them in an incorrect order.

Since you're using Mutt, you can work around this problem by first
limiting the displayed messages to just this series by pressing the
"l" key and typing:
~f "Lukas Wunner" ~d 16/06

This will only show messages with a From header matching my name and
a Date header of June 16.  (You may need to type 06/16 depending on
your locale.)  You can go back to seeing all messages by pressing "l"
again and typing "." or "~a".

Now press the "o" key and then the "s" key.  This sorts by subject,
so you've got the order that you need to apply patches.  I think the
default is to order by reverse-date-received, you can go back to that
by pressing "O" (shift-O) and then "r".

FWIW you can verify my claim that the messages were sent out 20 seconds
apart by looking at the bottom-most Received: header.  (Press "h" to
see the full header in Mutt.)  This is the one of patch [00/32]:

Received: from localhost (unknown [redacted])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by h08.hostsharing.net (Postfix) with ESMTPSA id 8479E603E110;
	Sat, 16 Jun 2018 21:25:00 +0200 (CEST)

And this is patch [01/32]:

Received: from localhost (unknown [redacted])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by h08.hostsharing.net (Postfix) with ESMTPSA id 25AE9603E110;
	Sat, 16 Jun 2018 21:25:21 +0200 (CEST)

It was actually 21 seconds here, I guess the TLS setup took a little
longer.  Each subsequent patch is likewise spaced apart by 20 secs.


> If you repost these (I think I saw something about an issue with
> patch 3), can you use a second or two of delay between them?

Would you like me to resend the whole series?  The ones at the
beginning of the series (with the exception of patch 3) are probably
uncontroversial, I was hoping that you could maybe apply some of them,
or comment on some of the patches in the series so that I can rework
them before reposting, thereby avoiding spamming everyone with identical
patches a second time.

Thanks,

Lukas
Bjorn Helgaas July 13, 2018, 11:43 a.m. UTC | #5
On Fri, Jul 13, 2018 at 09:54:49AM +0200, Lukas Wunner wrote:
> On Thu, Jul 12, 2018 at 05:28:16PM -0500, Bjorn Helgaas wrote:

> > These patches all showed up in random order in my inbox, I think
> > because they were sent with zero delay between them, which made them a
> > bit of a hassle to read and apply via email.
> 
> I habitually use a 20 second delay between each message, but I got
> unlucky and the mail servers delivered them in an incorrect order.

Strange, I wonder what happened.

> > If you repost these (I think I saw something about an issue with
> > patch 3), can you use a second or two of delay between them?
> 
> Would you like me to resend the whole series?  The ones at the
> beginning of the series (with the exception of patch 3) are probably
> uncontroversial, I was hoping that you could maybe apply some of them,
> or comment on some of the patches in the series so that I can rework
> them before reposting, thereby avoiding spamming everyone with identical
> patches a second time.

No need to resend for that, I had already sorted (but I didn't think to use
mutt sorting, so thanks for that!) and applied them on a test branch (see
https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/log/?h=pci/06-16-lukas-pciehp)
Bjorn Helgaas July 16, 2018, 2:20 p.m. UTC | #6
On Sat, Jun 16, 2018 at 09:25:00PM +0200, Lukas Wunner wrote:
> Rework pciehp to use modern, threaded IRQ handling.  The slot is powered
> on and off synchronously in the IRQ thread, no indirection via a work
> queue anymore.
> 
> When the slot is enabled/disabled by the user via sysfs or an Attention
> Button press, a request is sent to the IRQ thread.  The IRQ thread is
> thus the sole entity enabling/disabling the slot.
> 
> The IRQ thread can cope with missed events, e.g. if a card is inserted
> and immediately pulled out before the IRQ thread had a chance to react.
> It also tolerates an initially unstable link as observed in the wild by
> Stefan Roese.
> 
> Finally, runtime PM support is added.  This was the original motivation
> of the series because runtime suspending hotplug ports is needed to power
> down Thunderbolt controllers on idle, which saves ~1.5W per controller.
> Runtime resuming ports takes tenths of milliseconds during which events
> may be missed, this in turn necessitated the event handling rework.
> 
> I've pushed the series to GitHub to ease reviewing/fetching:
> https://github.com/l1k/linux/commits/pciehp_runpm_v2

My current test branch:

  https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/log/?h=pci/06-16-lukas-pciehp

has this series with these changes:

  - Drop the genirq patch (already merged via tip)

  - Add one blank line (pcie_cleanup_slot())

  - A few trivial changelog updates (mostly to use lkml.kernel.org
    links to reduce dependency on 3rd party archives)

Do you plan any other updates?  The open questions I see are:

  - You mentioned withdrawing "03/32 PCI: pciehp: Fix deadlock on
    unplug".  I tried simply dropping that, but that caused a conflict
    that I didn't try to resolve.

  - Mika had a few questions/comments that are still dangling.

  - Whether to include "02/32 PCI: pciehp: Fix UAF on unplug" in the
    v4.19 merge window or in v4.18.

Bjorn
Lukas Wunner July 19, 2018, 9:43 a.m. UTC | #7
On Mon, Jul 16, 2018 at 09:20:54AM -0500, Bjorn Helgaas wrote:
> On Sat, Jun 16, 2018 at 09:25:00PM +0200, Lukas Wunner wrote:
> > Rework pciehp to use modern, threaded IRQ handling.  The slot is powered
> > on and off synchronously in the IRQ thread, no indirection via a work
> > queue anymore.
> > 
> > When the slot is enabled/disabled by the user via sysfs or an Attention
> > Button press, a request is sent to the IRQ thread.  The IRQ thread is
> > thus the sole entity enabling/disabling the slot.
> > 
> > The IRQ thread can cope with missed events, e.g. if a card is inserted
> > and immediately pulled out before the IRQ thread had a chance to react.
> > It also tolerates an initially unstable link as observed in the wild by
> > Stefan Roese.
> > 
> > Finally, runtime PM support is added.  This was the original motivation
> > of the series because runtime suspending hotplug ports is needed to power
> > down Thunderbolt controllers on idle, which saves ~1.5W per controller.
> > Runtime resuming ports takes tenths of milliseconds during which events
> > may be missed, this in turn necessitated the event handling rework.
> > 
> > I've pushed the series to GitHub to ease reviewing/fetching:
> > https://github.com/l1k/linux/commits/pciehp_runpm_v2
> 
> My current test branch:
> 
>   https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/log/?h=pci/06-16-lukas-pciehp
> 
> has this series with these changes:
> 
>   - Drop the genirq patch (already merged via tip)
> 
>   - Add one blank line (pcie_cleanup_slot())
> 
>   - A few trivial changelog updates (mostly to use lkml.kernel.org
>     links to reduce dependency on 3rd party archives)

I've reviewed the branch and diffed every commit with the original patch
and this all LGTM.  I'll try to adhere more closely to the desired style
in the future, i.e. use kernel.org links and order the Cc: to the bottom.


> Do you plan any other updates?  The open questions I see are:
> 
>   - You mentioned withdrawing "03/32 PCI: pciehp: Fix deadlock on
>     unplug".  I tried simply dropping that, but that caused a conflict
>     that I didn't try to resolve.

Yes, a single patch succeeding it won't apply cleanly if patch 03/32 is
omitted, namely "06/32 PCI: pciehp: Declare pciehp_unconfigure_device()
void".  However resolving the conflict is straightforward, I'm including
a replacement patch below.


>   - Mika had a few questions/comments that are still dangling.

I could resolve those with two further replacement patches:

- "17/32 PCI: pciehp: Enable/disable exclusively from IRQ thread"
  => Deduplicate code to detect a change in slot occupancy
     by introducing a small helper.

- "23/32 PCI: pciehp: Avoid slot access during reset"
  => Amend commit message to justify usage of rw_semaphore.

However ISTR that you dislike replacement patches because they're
more complicated for you to handle.  Would you prefer me to repost
the full series instead?

Further open points:

- Mika suggested adding a few breaks to switch/case statements to avoid
  unintentional fall-throughs if the code is later extended.  I think
  it might be good to do that in a separate commit that is applied on
  top of this series, and at the same time mark all intentional
  fall-throughs as such for -Wimplicit-fallthrough.
  BTW, you may see a merge conflict between the pci/06-16-lukas-pciehp
  and the pci/misc branch because you've already applied Gustavo's patch
  to the latter and it touches pciehp_ctrl.c.

- The commit message of "27/32 PCI: pciehp: Support interrupts sent from
  D3hot" could optionally be extended by your comment that the "Downstream
  Port" term includes both Root Ports and Switch Downstream Ports.

- Mika voiced a concern that "32/32 PCI: Whitelist Thunderbolt ports for
  runtime D3" should probably be constrained to Apple systems, this is
  pending a reply to the mail I sent yesterday evening.


>   - Whether to include "02/32 PCI: pciehp: Fix UAF on unplug" in the
>     v4.19 merge window or in v4.18.

Personally I think submitting the fix during the 4.19 merge window is
sufficient, considering that it'll already open in two to three weeks
and the bug has been present for years.

Thanks,

Lukas

-- >8 --
Subject: [PATCH] PCI: pciehp: Declare pciehp_unconfigure_device() void

Since commit 0f4bd8014db5 ("PCI: hotplug: Drop checking of PCI_BRIDGE_
CONTROL in *_unconfigure_device()"), pciehp_unconfigure_device() can no
longer fail, so declare it and its sole caller remove_board() void, in
keeping with the usual kernel pattern that enablement can fail, but
disablement cannot.  No functional change intended.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
---
 drivers/pci/hotplug/pciehp.h      |  2 +-
 drivers/pci/hotplug/pciehp_ctrl.c | 11 ++++-------
 drivers/pci/hotplug/pciehp_pci.c  |  4 +---
 3 files changed, 6 insertions(+), 11 deletions(-)

diff --git a/drivers/pci/hotplug/pciehp.h b/drivers/pci/hotplug/pciehp.h
index 79b9b5f..9bb9931 100644
--- a/drivers/pci/hotplug/pciehp.h
+++ b/drivers/pci/hotplug/pciehp.h
@@ -129,7 +129,7 @@ struct controller {
 int pciehp_sysfs_disable_slot(struct slot *slot);
 void pciehp_queue_interrupt_event(struct slot *slot, u32 event_type);
 int pciehp_configure_device(struct slot *p_slot);
-int pciehp_unconfigure_device(struct slot *p_slot);
+void pciehp_unconfigure_device(struct slot *p_slot);
 void pciehp_queue_pushbutton_work(struct work_struct *work);
 struct controller *pcie_init(struct pcie_device *dev);
 int pcie_init_notification(struct controller *ctrl);
diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c
index 5bbd28d..163947b 100644
--- a/drivers/pci/hotplug/pciehp_ctrl.c
+++ b/drivers/pci/hotplug/pciehp_ctrl.c
@@ -133,14 +133,11 @@ static int board_added(struct slot *p_slot)
  * remove_board - Turns off slot and LEDs
  * @p_slot: slot where board is being removed
  */
-static int remove_board(struct slot *p_slot)
+static void remove_board(struct slot *p_slot)
 {
-	int retval;
 	struct controller *ctrl = p_slot->ctrl;
 
-	retval = pciehp_unconfigure_device(p_slot);
-	if (retval)
-		return retval;
+	pciehp_unconfigure_device(p_slot);
 
 	if (POWER_CTRL(ctrl)) {
 		pciehp_power_off_slot(p_slot);
@@ -155,7 +152,6 @@ static int remove_board(struct slot *p_slot)
 
 	/* turn off Green LED */
 	pciehp_green_led_off(p_slot);
-	return 0;
 }
 
 struct power_work_info {
@@ -435,7 +431,8 @@ int pciehp_disable_slot(struct slot *p_slot)
 		}
 	}
 
-	return remove_board(p_slot);
+	remove_board(p_slot);
+	return 0;
 }
 
 int pciehp_sysfs_enable_slot(struct slot *p_slot)
diff --git a/drivers/pci/hotplug/pciehp_pci.c b/drivers/pci/hotplug/pciehp_pci.c
index acc360d..ec3f065 100644
--- a/drivers/pci/hotplug/pciehp_pci.c
+++ b/drivers/pci/hotplug/pciehp_pci.c
@@ -76,9 +76,8 @@ int pciehp_configure_device(struct slot *p_slot)
 	return ret;
 }
 
-int pciehp_unconfigure_device(struct slot *p_slot)
+void pciehp_unconfigure_device(struct slot *p_slot)
 {
-	int rc = 0;
 	u8 presence = 0;
 	struct pci_dev *dev, *temp;
 	struct pci_bus *parent = p_slot->ctrl->pcie->port->subordinate;
@@ -121,5 +120,4 @@ int pciehp_unconfigure_device(struct slot *p_slot)
 	}
 
 	pci_unlock_rescan_remove();
-	return rc;
 }
Bjorn Helgaas July 19, 2018, 7:05 p.m. UTC | #8
On Thu, Jul 19, 2018 at 11:43:15AM +0200, Lukas Wunner wrote:
> On Mon, Jul 16, 2018 at 09:20:54AM -0500, Bjorn Helgaas wrote:

> >   - You mentioned withdrawing "03/32 PCI: pciehp: Fix deadlock on
> >     unplug".  I tried simply dropping that, but that caused a conflict
> >     that I didn't try to resolve.
> 
> Yes, a single patch succeeding it won't apply cleanly if patch 03/32 is
> omitted, namely "06/32 PCI: pciehp: Declare pciehp_unconfigure_device()
> void".  However resolving the conflict is straightforward, I'm including
> a replacement patch below.

Thanks, I'll pull that in shortly.

> >   - Mika had a few questions/comments that are still dangling.
> 
> I could resolve those with two further replacement patches:
> 
> - "17/32 PCI: pciehp: Enable/disable exclusively from IRQ thread"
>   => Deduplicate code to detect a change in slot occupancy
>      by introducing a small helper.
> 
> - "23/32 PCI: pciehp: Avoid slot access during reset"
>   => Amend commit message to justify usage of rw_semaphore.
> 
> However ISTR that you dislike replacement patches because they're
> more complicated for you to handle.  Would you prefer me to repost
> the full series instead?

No need, if you want to post updates for those, I can incorporate
those, too.

This is an extremely well-done series; thanks for all the work you've
done on it!

Bjorn
Bjorn Helgaas July 19, 2018, 10:50 p.m. UTC | #9
On Thu, Jul 19, 2018 at 11:43:15AM +0200, Lukas Wunner wrote:
> On Mon, Jul 16, 2018 at 09:20:54AM -0500, Bjorn Helgaas wrote:
> > On Sat, Jun 16, 2018 at 09:25:00PM +0200, Lukas Wunner wrote:
> > > Rework pciehp to use modern, threaded IRQ handling.  The slot is powered
> > > on and off synchronously in the IRQ thread, no indirection via a work
> > > queue anymore.
> > > 
> > > When the slot is enabled/disabled by the user via sysfs or an Attention
> > > Button press, a request is sent to the IRQ thread.  The IRQ thread is
> > > thus the sole entity enabling/disabling the slot.
> > > 
> > > The IRQ thread can cope with missed events, e.g. if a card is inserted
> > > and immediately pulled out before the IRQ thread had a chance to react.
> > > It also tolerates an initially unstable link as observed in the wild by
> > > Stefan Roese.
> > > 
> > > Finally, runtime PM support is added.  This was the original motivation
> > > of the series because runtime suspending hotplug ports is needed to power
> > > down Thunderbolt controllers on idle, which saves ~1.5W per controller.
> > > Runtime resuming ports takes tenths of milliseconds during which events
> > > may be missed, this in turn necessitated the event handling rework.
> > > 
> > > I've pushed the series to GitHub to ease reviewing/fetching:
> > > https://github.com/l1k/linux/commits/pciehp_runpm_v2
> > 
> > My current test branch:
> > 
> >   https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/log/?h=pci/06-16-lukas-pciehp
> > 
> > has this series with these changes:
> > 
> >   - Drop the genirq patch (already merged via tip)
> > 
> >   - Add one blank line (pcie_cleanup_slot())
> > 
> >   - A few trivial changelog updates (mostly to use lkml.kernel.org
> >     links to reduce dependency on 3rd party archives)
> 
> I've reviewed the branch and diffed every commit with the original patch
> and this all LGTM.  I'll try to adhere more closely to the desired style
> in the future, i.e. use kernel.org links and order the Cc: to the bottom.
> 
> 
> > Do you plan any other updates?  The open questions I see are:
> > 
> >   - You mentioned withdrawing "03/32 PCI: pciehp: Fix deadlock on
> >     unplug".  I tried simply dropping that, but that caused a conflict
> >     that I didn't try to resolve.
> 
> Yes, a single patch succeeding it won't apply cleanly if patch 03/32 is
> omitted, namely "06/32 PCI: pciehp: Declare pciehp_unconfigure_device()
> void".  However resolving the conflict is straightforward, I'm including
> a replacement patch below.

I applied the replacement and put everything on pci/hotplug for v4.19.

This is fantastic.  I really appreciate all your work, and especially
your clear, concise changelogs.

> >   - Mika had a few questions/comments that are still dangling.
> 
> I could resolve those with two further replacement patches:
> 
> - "17/32 PCI: pciehp: Enable/disable exclusively from IRQ thread"
>   => Deduplicate code to detect a change in slot occupancy
>      by introducing a small helper.
> 
> - "23/32 PCI: pciehp: Avoid slot access during reset"
>   => Amend commit message to justify usage of rw_semaphore.
> 
> However ISTR that you dislike replacement patches because they're
> more complicated for you to handle.  Would you prefer me to repost
> the full series instead?
> 
> Further open points:
> 
> - Mika suggested adding a few breaks to switch/case statements to avoid
>   unintentional fall-throughs if the code is later extended.  I think
>   it might be good to do that in a separate commit that is applied on
>   top of this series, and at the same time mark all intentional
>   fall-throughs as such for -Wimplicit-fallthrough.
>   BTW, you may see a merge conflict between the pci/06-16-lukas-pciehp
>   and the pci/misc branch because you've already applied Gustavo's patch
>   to the latter and it touches pciehp_ctrl.c.
> 
> - The commit message of "27/32 PCI: pciehp: Support interrupts sent from
>   D3hot" could optionally be extended by your comment that the "Downstream
>   Port" term includes both Root Ports and Switch Downstream Ports.
> 
> - Mika voiced a concern that "32/32 PCI: Whitelist Thunderbolt ports for
>   runtime D3" should probably be constrained to Apple systems, this is
>   pending a reply to the mail I sent yesterday evening.

If you send any of the above updates, I'll gladly update the
pci/hotplug branch.  You can either send replacement patches or
incremental ones that I can fold into existing commits.

Bjorn
Lukas Wunner July 28, 2018, 5:44 a.m. UTC | #10
On Thu, Jul 19, 2018 at 05:50:16PM -0500, Bjorn Helgaas wrote:
> On Thu, Jul 19, 2018 at 11:43:15AM +0200, Lukas Wunner wrote:
> > On Mon, Jul 16, 2018 at 09:20:54AM -0500, Bjorn Helgaas wrote:
> > >   - Mika had a few questions/comments that are still dangling.
> > 
> > I could resolve those with two further replacement patches:
> > 
> > - "17/32 PCI: pciehp: Enable/disable exclusively from IRQ thread"
> >   => Deduplicate code to detect a change in slot occupancy
> >      by introducing a small helper.
> > 
> > - "23/32 PCI: pciehp: Avoid slot access during reset"
> >   => Amend commit message to justify usage of rw_semaphore.
> > 
> > Further open points:
> > 
> > - Mika suggested adding a few breaks to switch/case statements to avoid
> >   unintentional fall-throughs if the code is later extended.  I think
> >   it might be good to do that in a separate commit that is applied on
> >   top of this series, and at the same time mark all intentional
> >   fall-throughs as such for -Wimplicit-fallthrough.
> >   BTW, you may see a merge conflict between the pci/06-16-lukas-pciehp
> >   and the pci/misc branch because you've already applied Gustavo's patch
> >   to the latter and it touches pciehp_ctrl.c.
> > 
> > - The commit message of "27/32 PCI: pciehp: Support interrupts sent from
> >   D3hot" could optionally be extended by your comment that the "Downstream
> >   Port" term includes both Root Ports and Switch Downstream Ports.
> 
> If you send any of the above updates, I'll gladly update the
> pci/hotplug branch.  You can either send replacement patches or
> incremental ones that I can fold into existing commits.

Bjorn, Mika, everyone, please excuse the delay.

I've just posted 4 patches to linux-pci@ to address all the above-listed
review comments that are still outstanding:

* 2 replacement patches for existing commits on Bjorn's pci/hotplug branch:
    PCI: pciehp: Avoid slot access during reset
    PCI: pciehp: Support interrupts sent from D3hot
  No code changes, only the commit messages have been updated.
  Specifics are below the three dash separator in each patch.

* 2 patches that are intended to be applied on top of the branch:
    PCI: pciehp: Avoid implicit fallthroughs in switch statements
    PCI: pciehp: Deduplicate presence check on probe & resume

With this approach I hope to minimize the work for Bjorn by avoiding
any rebase conflicts.

Thanks!

Lukas