Patchwork [BUG] WARN_ON(!context) in drivers/pci/hotplug/acpiphp_glue.c

Submitter Rafael J. Wysocki
Date Oct. 11, 2013, 11:13 a.m.
Message ID <2096980.ZtIelfoX39@vostro.rjw.lan>
Permalink /patch/282712/
State Superseded

Comments

Rafael J. Wysocki - Oct. 11, 2013, 11:13 a.m.
On Thursday, October 10, 2013 06:42:56 PM Linus Torvalds wrote:
> On Thu, Oct 10, 2013 at 6:45 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> >
> >         /* Register slots for ejectable funtions only. */
> > -       if (acpi_pci_check_ejectable(pbus, handle)  || is_dock_device(handle)) {
> > +       if ((acpi_pci_check_ejectable(pbus, handle) || is_dock_device(handle))
> > +           && !(pdev && device_is_managed_by_native_pciehp(pdev))) {
> >                 unsigned long long sun;
> >                 int retval;
> 
> I can't even begin to say whether this is a good solution or not,
> because that if-conditional makes me want to go out and kill some
> homeless people to let my aggressions out.
> 
> Can we please agree to *never* write code like this? Ever?
> 
> Use a well-named inline helper function where the name describes what
> the f*ck the code is trying to do, and then comment the separate
> issues. Because none of the above line noise makes me go "Ahh, it's
> the test for an ejectable function".
> 
> What the heck _is_ an "ejectable function" anyway? The only comment
> there just makes the code even less sensible.
> 
> Please?

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Subject: ACPI / hotplug / PCI: Accept coexistence with native PCIe hotplug

Allow ACPIPHP (ACPI-based PCI hotplug) to handle event signaling for
devices that have already been claimed by the native PCIe hotplug
(pciehp).

The ACPI hotplug events are essentially re-scan, remove and eject
requests.  Re-scan and remove should work regardless, because they
may be triggered by user space via sysfs and the ACPI eject (_EJ0)
should work if the BIOS wants us to use it.  There may be an issue
if the BIOS signals ACPI eject and wants us to use the native eject,
but that doesn't work without this change anyway.

This change prevents the WARN_ON() in acpiphp_enumerate_slots() from
triggering unnecessarily for bridges whose parents are managed by
pciehp.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/pci/hotplug/acpiphp_glue.c |   32 ++++++++++++++++++++++++++------
 1 file changed, 26 insertions(+), 6 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steven Rostedt - Oct. 11, 2013, 1:01 p.m.
On Fri, 11 Oct 2013 13:13:47 +0200
"Rafael J. Wysocki" <rjw@rjwysocki.net> wrote:
> 
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Subject: ACPI / hotplug / PCI: Accept coexistence with native PCIe hotplug
> 
> Allow ACPIPHP (ACPI-based PCI hotplug) to handle event signaling for
> devices that have already been claimed by the native PCIe hotplug
> (pciehp).
> 
> The ACPI hotplug events are essentially re-scan, remove and eject
> requests.  Re-scan and remove should work regardless, because they
> may be triggered by user space via sysfs and the ACPI eject (_EJ0)
> should work if the BIOS wants us to use it.  There may be an issue
> if the BIOS signals ACPI eject and wants us to use the native eject,
> but that doesn't work without this change anyway.
> 
> This change prevents the WARN_ON() in acpiphp_enumerate_slots() from
> triggering unnecessarily for bridges whose parents are managed by
> pciehp.
> 

Reported-by: Steven Rostedt <rostedt@goodmis.org>
Tested-by: Steven Rostedt <rostedt@goodmis.org>

-- Steve

> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/pci/hotplug/acpiphp_glue.c |   32 ++++++++++++++++++++++++++------
>  1 file changed, 26 insertions(+), 6 deletions(-)
> 
> Index: linux-pm/drivers/pci/hotplug/acpiphp_glue.c
> ===================================================================
> --- linux-pm.orig/drivers/pci/hotplug/acpiphp_glue.c
> +++ linux-pm/drivers/pci/hotplug/acpiphp_glue.c
> @@ -259,6 +259,31 @@ static void acpiphp_dock_release(void *d
>  	put_bridge(context->func.parent);
>  }
>  
> +/**
> + * slot_should_be_exposed - Check whether or not to expose a slot to userland.
> + * @bridge: ACPIPHP bridge the slot belongs to.
> + * @handle: ACPI handle of a device in the slot.
> + */
> +static inline bool slot_should_be_exposed(struct acpiphp_bridge *bridge,
> +					  acpi_handle handle)
> +{
> +	struct pci_bus *pbus = bridge->pci_bus;
> +	struct pci_dev *pdev = bridge->pci_dev;
> +
> +	/*
> +	 * Do not expose slots whose bridges are managed by pciehp, because they
> +	 * will be exposed to user space by the pciehp driver.
> +	 */
> +	if (pdev && device_is_managed_by_native_pciehp(pdev))
> +		return false;
> +
> +	/*
> +	 * Expose slots for devices with either _EJ0 or _RMV and for devices
> +	 * on docking stations.
> +	 */
> +	return acpi_pci_check_ejectable(pbus, handle) || is_dock_device(handle);
> +}
> +
>  /* callback routine to register each ACPI PCI slot object */
>  static acpi_status register_slot(acpi_handle handle, u32 lvl, void *data,
>  				 void **rv)
> @@ -271,12 +296,8 @@ static acpi_status register_slot(acpi_ha
>  	unsigned long long adr;
>  	int device, function;
>  	struct pci_bus *pbus = bridge->pci_bus;
> -	struct pci_dev *pdev = bridge->pci_dev;
>  	u32 val;
>  
> -	if (pdev && device_is_managed_by_native_pciehp(pdev))
> -		return AE_OK;
> -
>  	status = acpi_evaluate_integer(handle, "_ADR", NULL, &adr);
>  	if (ACPI_FAILURE(status)) {
>  		acpi_handle_warn(handle, "can't evaluate _ADR (%#x)\n", status);
> @@ -325,8 +346,7 @@ static acpi_status register_slot(acpi_ha
>  
>  	list_add_tail(&slot->node, &bridge->slots);
>  
> -	/* Register slots for ejectable funtions only. */
> -	if (acpi_pci_check_ejectable(pbus, handle)  || is_dock_device(handle)) {
> +	if (slot_should_be_exposed(bridge, handle)) {
>  		unsigned long long sun;
>  		int retval;
>  

Bjorn Helgaas - Oct. 11, 2013, 3:08 p.m.
On Fri, Oct 11, 2013 at 7:01 AM, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Fri, 11 Oct 2013 13:13:47 +0200
> "Rafael J. Wysocki" <rjw@rjwysocki.net> wrote:
>>
>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> Subject: ACPI / hotplug / PCI: Accept coexistence with native PCIe hotplug
>>
>> Allow ACPIPHP (ACPI-based PCI hotplug) to handle event signaling for
>> devices that have already been claimed by the native PCIe hotplug
>> (pciehp).
>>
>> The ACPI hotplug events are essentially re-scan, remove and eject
>> requests.  Re-scan and remove should work regardless, because they
>> may be triggered by user space via sysfs and the ACPI eject (_EJ0)
>> should work if the BIOS wants us to use it.  There may be an issue
>> if the BIOS signals ACPI eject and wants us to use the native eject,
>> but that doesn't work without this change anyway.
>>
>> This change prevents the WARN_ON() in acpiphp_enumerate_slots() from
>> triggering unnecessarily for bridges whose parents are managed by
>> pciehp.
>>
>
> Reported-by: Steven Rostedt <rostedt@goodmis.org>
> Tested-by: Steven Rostedt <rostedt@goodmis.org>

I opened https://bugzilla.kernel.org/show_bug.cgi?id=62831 for this
issue because I don't think this question of how to coordinate acpiphp
and pciehp is completely resolved, and I think it might be interesting
to have the complete dmesg, acpidump, and "lspci -vv" output attached
there for future reference.

Steve, I tried to attach the dmesg you mentioned yesterday in IRC, but
the link didn't work for me.  Would you mind attaching this info?

Rafael, I assume you'll probably merge this through your tree.  Would
you mind adding a reference to this bugzilla in the changelog?  I do
have a "convert to dynamic debug" acpiphp patch in my "next" branch
(bd950799), but I suspect you have several more interesting ones in
your tree.

Bjorn

>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> ---
>>  drivers/pci/hotplug/acpiphp_glue.c |   32 ++++++++++++++++++++++++++------
>>  1 file changed, 26 insertions(+), 6 deletions(-)
>>
>> Index: linux-pm/drivers/pci/hotplug/acpiphp_glue.c
>> ===================================================================
>> --- linux-pm.orig/drivers/pci/hotplug/acpiphp_glue.c
>> +++ linux-pm/drivers/pci/hotplug/acpiphp_glue.c
>> @@ -259,6 +259,31 @@ static void acpiphp_dock_release(void *d
>>       put_bridge(context->func.parent);
>>  }
>>
>> +/**
>> + * slot_should_be_exposed - Check whether or not to expose a slot to userland.
>> + * @bridge: ACPIPHP bridge the slot belongs to.
>> + * @handle: ACPI handle of a device in the slot.
>> + */
>> +static inline bool slot_should_be_exposed(struct acpiphp_bridge *bridge,
>> +                                       acpi_handle handle)
>> +{
>> +     struct pci_bus *pbus = bridge->pci_bus;
>> +     struct pci_dev *pdev = bridge->pci_dev;
>> +
>> +     /*
>> +      * Do not expose slots whose bridges are managed by pciehp, because they
>> +      * will be exposed to user space by the pciehp driver.
>> +      */
>> +     if (pdev && device_is_managed_by_native_pciehp(pdev))
>> +             return false;
>> +
>> +     /*
>> +      * Expose slots for devices with either _EJ0 or _RMV and for devices
>> +      * on docking stations.
>> +      */
>> +     return acpi_pci_check_ejectable(pbus, handle) || is_dock_device(handle);
>> +}
>> +
>>  /* callback routine to register each ACPI PCI slot object */
>>  static acpi_status register_slot(acpi_handle handle, u32 lvl, void *data,
>>                                void **rv)
>> @@ -271,12 +296,8 @@ static acpi_status register_slot(acpi_ha
>>       unsigned long long adr;
>>       int device, function;
>>       struct pci_bus *pbus = bridge->pci_bus;
>> -     struct pci_dev *pdev = bridge->pci_dev;
>>       u32 val;
>>
>> -     if (pdev && device_is_managed_by_native_pciehp(pdev))
>> -             return AE_OK;
>> -
>>       status = acpi_evaluate_integer(handle, "_ADR", NULL, &adr);
>>       if (ACPI_FAILURE(status)) {
>>               acpi_handle_warn(handle, "can't evaluate _ADR (%#x)\n", status);
>> @@ -325,8 +346,7 @@ static acpi_status register_slot(acpi_ha
>>
>>       list_add_tail(&slot->node, &bridge->slots);
>>
>> -     /* Register slots for ejectable funtions only. */
>> -     if (acpi_pci_check_ejectable(pbus, handle)  || is_dock_device(handle)) {
>> +     if (slot_should_be_exposed(bridge, handle)) {
>>               unsigned long long sun;
>>               int retval;
>>
>
Linus Torvalds - Oct. 11, 2013, 5:21 p.m.
On Fri, Oct 11, 2013 at 4:13 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> +/**
> + * slot_should_be_exposed - Check whether or not to expose a slot to userland.
> + * @bridge: ACPIPHP bridge the slot belongs to.
> + * @handle: ACPI handle of a device in the slot.
> + */
> +static inline bool slot_should_be_exposed(struct acpiphp_bridge *bridge,
> +                                         acpi_handle handle)

Thanks, that looks much better.

I do worry that we now seem to add the slot to all the acpiphp lists
even if it is managed by pciehp. That gets rid of the warning Steven
saw (because now it always has that context), but I'm left wondering
how much pciehp and acpiphp will fight over the slot.

Yes, the acpiphp_register_hotplug_slot() doesn't get called, but we
still do register_hotplug_dock_device(), for example. How does that
interact with pciehp that thinks it owns the slot?

Or am I misreading the code? It's more readable, and no longer makes
me homicidal, but I don't actually know the code itself.

             Linus
Rafael J. Wysocki - Oct. 11, 2013, 9:58 p.m.
On Friday, October 11, 2013 10:21:35 AM Linus Torvalds wrote:
> On Fri, Oct 11, 2013 at 4:13 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > +/**
> > + * slot_should_be_exposed - Check whether or not to expose a slot to userland.
> > + * @bridge: ACPIPHP bridge the slot belongs to.
> > + * @handle: ACPI handle of a device in the slot.
> > + */
> > +static inline bool slot_should_be_exposed(struct acpiphp_bridge *bridge,
> > +                                         acpi_handle handle)
> 
> Thanks, that looks much better.
> 
> I do worry that we now seem to add the slot to all the acpiphp lists
> even if it is managed by pciehp. That gets rid of the warning Steven
> saw (because now it always has that context), but I'm left wondering
> how much pciehp and acpiphp will fight over the slot.
>
> Yes, the acpiphp_register_hotplug_slot() doesn't get called, but we
> still do register_hotplug_dock_device(), for example. How does that
> interact with pciehp that thinks it owns the slot?

Well, owning the slot doesn't really mean much here, because the "rescan"
and "remove" things may always be triggered by user space via sysfs from
under the PCI device in question (regardless of whether or not pciehp
thinks that it "owns" that device).  So if they are triggered by an ACPI
notify instead, that should still be fine.

Ejects are more of a gray area, but they do the "remove" first and only
then they go for an actual "eject".  Question is if we should execute
_EJ0 provided that it's actually present for the pciehp slots (which we will
do with the patch applied).  It might be safer to trigger the native eject
then, but again I'd be surprised if _EJ0 didn't work anyway (if there is a
system in which _EJ0 is available for a device handled by pciehp in the first
place).

As far as docking stations go, the undock is done by ACPI anyway and it will
carry out "remove" for all devices under the dock, so the patch doesn't change
this particular case as far as I can say.

> Or am I misreading the code? It's more readable, and no longer makes
> me homicidal, but I don't actually know the code itself.

I think you're reading it correctly, it really makes acpiphp see all slots
even if pciehp sees them too.  So the change is somewhat risky.

That said the risk doesn't seem to be huge and there seem to be cases in
which it actually would be useful to have both acpiphp and pciehp signaling
available for the same device.  For example, even if the BIOS told us that
we could use the native mechanism (pciehp), it may not actually work.  That is,
we may not get any hotplug interrupts from PCIe ports due to platform bugs of
some sort and we may get ACPI notifications instead (because the platform
designer knew about those bugs and thought it would be smart to use ACPI to
work around them).

There are bug reports indicating things like that, so we were going to allow
acpiphp and pciehp to handle the same devices anyway at one point.  I thought
we might as well try to do it now and see how it goes.  Still, if you think
it's too risky for this stage of the cycle, I'll just send a patch removing
the WARN_ON() and we'll revisit that thing in 3.13.

Rafael


Patch

Index: linux-pm/drivers/pci/hotplug/acpiphp_glue.c
===================================================================
--- linux-pm.orig/drivers/pci/hotplug/acpiphp_glue.c
+++ linux-pm/drivers/pci/hotplug/acpiphp_glue.c
@@ -259,6 +259,31 @@  static void acpiphp_dock_release(void *d
 	put_bridge(context->func.parent);
 }
 
+/**
+ * slot_should_be_exposed - Check whether or not to expose a slot to userland.
+ * @bridge: ACPIPHP bridge the slot belongs to.
+ * @handle: ACPI handle of a device in the slot.
+ */
+static inline bool slot_should_be_exposed(struct acpiphp_bridge *bridge,
+					  acpi_handle handle)
+{
+	struct pci_bus *pbus = bridge->pci_bus;
+	struct pci_dev *pdev = bridge->pci_dev;
+
+	/*
+	 * Do not expose slots whose bridges are managed by pciehp, because they
+	 * will be exposed to user space by the pciehp driver.
+	 */
+	if (pdev && device_is_managed_by_native_pciehp(pdev))
+		return false;
+
+	/*
+	 * Expose slots for devices with either _EJ0 or _RMV and for devices
+	 * on docking stations.
+	 */
+	return acpi_pci_check_ejectable(pbus, handle) || is_dock_device(handle);
+}
+
 /* callback routine to register each ACPI PCI slot object */
 static acpi_status register_slot(acpi_handle handle, u32 lvl, void *data,
 				 void **rv)
@@ -271,12 +296,8 @@  static acpi_status register_slot(acpi_ha
 	unsigned long long adr;
 	int device, function;
 	struct pci_bus *pbus = bridge->pci_bus;
-	struct pci_dev *pdev = bridge->pci_dev;
 	u32 val;
 
-	if (pdev && device_is_managed_by_native_pciehp(pdev))
-		return AE_OK;
-
 	status = acpi_evaluate_integer(handle, "_ADR", NULL, &adr);
 	if (ACPI_FAILURE(status)) {
 		acpi_handle_warn(handle, "can't evaluate _ADR (%#x)\n", status);
@@ -325,8 +346,7 @@  static acpi_status register_slot(acpi_ha
 
 	list_add_tail(&slot->node, &bridge->slots);
 
-	/* Register slots for ejectable funtions only. */
-	if (acpi_pci_check_ejectable(pbus, handle)  || is_dock_device(handle)) {
+	if (slot_should_be_exposed(bridge, handle)) {
 		unsigned long long sun;
 		int retval;
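
The behaviour discussed in the thread can be modelled in user space. The sketch below is not kernel code: struct mock_slot and its boolean fields are hypothetical stand-ins for the results of device_is_managed_by_native_pciehp(), acpi_pci_check_ejectable() and is_dock_device(), and the two "tracked" functions only approximate what the diff does to register_slot(), assuming no other early return (such as an _ADR failure) is taken. It is meant to show the point Linus raised: after the patch, a slot under a pciehp-managed bridge is still added to acpiphp's lists, but it is not exposed to user space by acpiphp.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in; NOT the kernel's acpiphp structures. */
struct mock_slot {
	bool has_bridge_dev;     /* models bridge->pci_dev != NULL */
	bool bridge_uses_pciehp; /* models device_is_managed_by_native_pciehp() */
	bool ejectable;          /* models acpi_pci_check_ejectable() (_EJ0 or _RMV) */
	bool on_dock;            /* models is_dock_device() */
};

/* Mirrors the decision order of the helper added by the patch. */
bool slot_should_be_exposed(const struct mock_slot *s)
{
	/* pciehp exposes these slots itself, so acpiphp must not. */
	if (s->has_bridge_dev && s->bridge_uses_pciehp)
		return false;

	return s->ejectable || s->on_dock;
}

/*
 * Before the patch, register_slot() returned AE_OK early for
 * pciehp-managed bridges, so acpiphp kept no context for the slot.
 */
bool slot_tracked_before_patch(const struct mock_slot *s)
{
	return !(s->has_bridge_dev && s->bridge_uses_pciehp);
}

/*
 * After the patch the early return is gone, so (in this simplified
 * model) every slot reaches list_add_tail() and is tracked.
 */
bool slot_tracked_after_patch(const struct mock_slot *s)
{
	(void)s;
	return true;
}

/* Small self-check of the model for one pciehp-managed slot. */
void model_self_check(void)
{
	struct mock_slot native = { true, true, true, false };

	assert(!slot_tracked_before_patch(&native));
	assert(slot_tracked_after_patch(&native));
	assert(!slot_should_be_exposed(&native));
}
```

In this model the pciehp-managed slot goes from "not tracked, not exposed" to "tracked, not exposed", which matches Rafael's description that acpiphp now sees all slots even if pciehp sees them too.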