Patchwork [3/3] PCI Hotplug: workaround for Thunderbolt on Intel DZ77RE-75K motherboard

login
register
mail settings
Submitter Kirill A. Shutemov
Date Dec. 13, 2012, 8:08 p.m.
Message ID <20121213200857.GA24939@otc-wbsnb-06>
Download mbox | patch
Permalink /patch/206209/
State Rejected
Headers show

Comments

Kirill A. Shutemov - Dec. 13, 2012, 8:08 p.m.
On Thu, Dec 13, 2012 at 10:48:20AM -0800, Greg KH wrote:
> On Thu, Dec 13, 2012 at 05:31:48PM +0200, Kirill A. Shutemov wrote:
> > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> > 
> > BIOS on Intel DZ77RE-75K motherboard notifies OS about Thunderbolt
> > hotplug before devices behind Thunderbolt are ready to be enumerated.
> > 
> > Let's delay enumeration by 2 seconds.
> > 
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > ---
> >  drivers/pci/hotplug/acpiphp_glue.c |   16 +++++++++++++++-
> >  1 file changed, 15 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
> > index 1a2b3ca..165987a 100644
> > --- a/drivers/pci/hotplug/acpiphp_glue.c
> > +++ b/drivers/pci/hotplug/acpiphp_glue.c
> > @@ -49,6 +49,7 @@
> >  #include <linux/mutex.h>
> >  #include <linux/slab.h>
> >  #include <linux/acpi.h>
> > +#include <linux/dmi.h>
> >  
> >  #include "../pci.h"
> >  #include "acpiphp.h"
> > @@ -1327,6 +1328,19 @@ out:
> >  static void handle_hotplug_event_bridge(acpi_handle handle, u32 type,
> >  					void *context)
> >  {
> > +	unsigned long delay = 0;
> > +	const char *board_name;
> > +
> > +	board_name = dmi_get_system_info(DMI_BOARD_NAME);
> > +	/*
> > +	 * BIOS on Intel DZ77RE-75K motherboard notifies OS about Thunderbolt
> > +	 * hotplug before devices behind Thunderbolt are ready to be
> > +	 * enumerated.
> > +	 * Let's delay enumeration by 2 seconds.
> > +	 */
> > +	if (board_name && !strcmp(board_name, "DZ77RE-75K"))
> > +		delay = 2 * HZ;
> 
> Again, no objection to this patch as-is, but should we make it a bit
> more "general" and provide a quirk table to make this type of fix easier
> for other motherboards?  Shouldn't we also match on a manufacturer field
> as well as the board_name?  The kernel provides a very easy to match on
> any arbritary set of DMI fields for this kind of problem that we could
> use here.

Something like this (untested):
Yinghai Lu - Dec. 13, 2012, 8:25 p.m.
On Thu, Dec 13, 2012 at 12:08 PM, Kirill A. Shutemov
<kirill.shutemov@linux.intel.com> wrote:
> On Thu, Dec 13, 2012 at 10:48:20AM -0800, Greg KH wrote:
>> On Thu, Dec 13, 2012 at 05:31:48PM +0200, Kirill A. Shutemov wrote:
>> > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
>> >
>> > BIOS on Intel DZ77RE-75K motherboard notifies OS about Thunderbolt
>> > hotplug before devices behind Thunderbolt are ready to be enumerated.
>> >
>> > Let's delay enumeration by 2 seconds.
>> >
>> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>> > ---
>> >  drivers/pci/hotplug/acpiphp_glue.c |   16 +++++++++++++++-
>> >  1 file changed, 15 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
>> > index 1a2b3ca..165987a 100644
>> > --- a/drivers/pci/hotplug/acpiphp_glue.c
>> > +++ b/drivers/pci/hotplug/acpiphp_glue.c
>> > @@ -49,6 +49,7 @@
>> >  #include <linux/mutex.h>
>> >  #include <linux/slab.h>
>> >  #include <linux/acpi.h>
>> > +#include <linux/dmi.h>
>> >
>> >  #include "../pci.h"
>> >  #include "acpiphp.h"
>> > @@ -1327,6 +1328,19 @@ out:
>> >  static void handle_hotplug_event_bridge(acpi_handle handle, u32 type,
>> >                                     void *context)
>> >  {
>> > +   unsigned long delay = 0;
>> > +   const char *board_name;
>> > +
>> > +   board_name = dmi_get_system_info(DMI_BOARD_NAME);
>> > +   /*
>> > +    * BIOS on Intel DZ77RE-75K motherboard notifies OS about Thunderbolt
>> > +    * hotplug before devices behind Thunderbolt are ready to be
>> > +    * enumerated.
>> > +    * Let's delay enumeration by 2 seconds.
>> > +    */
>> > +   if (board_name && !strcmp(board_name, "DZ77RE-75K"))
>> > +           delay = 2 * HZ;
>>

Linus will not be happy with those kind of delay.

is there any way for kernel to retry before device is declared not there.

pcie hotplug spec: need to retry several times in 1000ms before
delcaring the devices is not present.

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=2f5d8e4ff947ad6673397083b48719cd6c59cd61
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Linus Torvalds - Dec. 13, 2012, 8:30 p.m.
On Thu, Dec 13, 2012 at 12:25 PM, Yinghai Lu <yinghai@kernel.org> wrote:
>
> Linus will not be happy with those kind of delay.

Indeed. And the DMI check is bogus too, since the "there can be
delays" is apparently part of the pcie hotplug spec.

So do the sane thing. Retry a few times, with increasingly long delays
(ie something like start with 10ms, then double the delay until you
hit 1s, and then just give up: end result, ~2s total wait, but 10ms
for any sane device that doesn't suck).

No DMI checks, no hacks, not insane default delays.

               Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Kirill A. Shutemov - Dec. 13, 2012, 8:49 p.m.
On Thu, Dec 13, 2012 at 12:30:04PM -0800, Linus Torvalds wrote:
> On Thu, Dec 13, 2012 at 12:25 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> >
> > Linus will not be happy with those kind of delay.
> 
> Indeed. And the DMI check is bogus too, since the "there can be
> delays" is apparently part of the pcie hotplug spec.

It's ACPI PCI hotplug, not PCIe native hotplug. PCIe hotplug spec is not
relevant.

IIUC, in ACPI case devices should be ready to be enumerated, before you
get notification. Rafael, is it correct?

> So do the sane thing. Retry a few times, with increasingly long delays
> (ie something like start with 10ms, then double the delay until you
> hit 1s, and then just give up: end result, ~2s total wait, but 10ms
> for any sane device that doesn't suck).

PCI rescan is expensive and generate noise in dmesg. We'll end up with
tons of useless messages.
Linus Torvalds - Dec. 13, 2012, 8:54 p.m.
On Thu, Dec 13, 2012 at 12:49 PM, Kirill A. Shutemov
<kirill.shutemov@linux.intel.com> wrote:
>
> PCI rescan is expensive and generate noise in dmesg. We'll end up with
> tons of useless messages.

So?

That's still better than adding random DMI string checks.

And christ, this is at hotplug time only. Nobody cares.

            Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki - Dec. 13, 2012, 10:06 p.m.
On Thursday, December 13, 2012 10:49:53 PM Kirill A. Shutemov wrote:
> On Thu, Dec 13, 2012 at 12:30:04PM -0800, Linus Torvalds wrote:
> > On Thu, Dec 13, 2012 at 12:25 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> > >
> > > Linus will not be happy with those kind of delay.
> > 
> > Indeed. And the DMI check is bogus too, since the "there can be
> > delays" is apparently part of the pcie hotplug spec.
> 
> It's ACPI PCI hotplug, not PCIe native hotplug. PCIe hotplug spec is not
> relevant.
> 
> IIUC, in ACPI case devices should be ready to be enumerated, before you
> get notification. Rafael, is it correct?

Not necessarily.  The ACPI nodes will be, the device themselves are still PCI
devices. :-)

Thanks,
Rafael
Kirill A. Shutemov - Dec. 14, 2012, 10:34 a.m.
On Thu, Dec 13, 2012 at 12:30:04PM -0800, Linus Torvalds wrote:
> On Thu, Dec 13, 2012 at 12:25 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> >
> > Linus will not be happy with those kind of delay.
> 
> Indeed. And the DMI check is bogus too, since the "there can be
> delays" is apparently part of the pcie hotplug spec.
> 
> So do the sane thing. Retry a few times, with increasingly long delays
> (ie something like start with 10ms, then double the delay until you
> hit 1s, and then just give up: end result, ~2s total wait, but 10ms
> for any sane device that doesn't suck).
> 
> No DMI checks, no hacks, not insane default delays.

I've realized that there's no strong criteria of hotplug success in ACPI
PCI Hotplug. We can't know when we should stop retrying.

In Thunderbolt case before any devices hotplugged you only see a root
port. Thunderbolt host controller is powered off and kernel can't see it.

On hotplug BIOS enables the host controller, initialize it and notify OS
about hotplug.

Normally kernel will enumerate 6 ports on Thunderbolt host controller, 2
ports on device Thunderbolt controller and target device itself. All this
for simple non-chained case. With device chaining the hierarchy is even
more complex.

On DZ77RE-75K motherboard without the workaround kernel will discover only
ports on host controller, but not device ports or device.

So kernel will find devices on broken implementation, not all of them.

Even worse: there's no way to distinguish between plug and unplug events
and kernel uses the same code path for both cases.

Patch

diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
index 1a2b3ca..c4db634 100644
--- a/drivers/pci/hotplug/acpiphp_glue.c
+++ b/drivers/pci/hotplug/acpiphp_glue.c
@@ -49,6 +49,7 @@ 
 #include <linux/mutex.h>
 #include <linux/slab.h>
 #include <linux/acpi.h>
+#include <linux/dmi.h>
 
 #include "../pci.h"
 #include "acpiphp.h"
@@ -1327,6 +1328,31 @@  out:
 static void handle_hotplug_event_bridge(acpi_handle handle, u32 type,
 					void *context)
 {
+	unsigned long delay = 0;
+	const struct dmi_system_id *match;
+	static const struct dmi_system_id sysids[] = {
+		/*
+		 * BIOS on Intel DZ77RE-75K motherboard notifies OS about
+		 * Thunderbolt hotplug before devices behind Thunderbolt are
+		 * ready to be enumerated.
+		 * Let's delay enumeration by 2 seconds.
+		 */
+		{
+			.ident = "DZ77RE-75K",
+			.matches = {
+				DMI_MATCH(DMI_BOARD_VENDOR,
+						"Intel Corporation"),
+				DMI_MATCH(DMI_BOARD_NAME,
+						"DZ77RE-75K"),
+			},
+			.driver_data = (void *) (2 * HZ), /* delay */
+		},
+	};
+
+	match = dmi_first_match(sysids);
+	if (match)
+		delay = (unsigned long) match->driver_data;
+
 	/*
 	 * Currently the code adds all hotplug events to the kacpid_wq
 	 * queue when it should add hotplug events to the kacpi_hotplug_wq.
@@ -1336,7 +1362,7 @@  static void handle_hotplug_event_bridge(acpi_handle handle, u32 type,
 	 * don't deadlock on hotplug actions.
 	 */
 	alloc_acpiphp_hp_work(handle, type, context,
-			      _handle_hotplug_event_bridge, 0);
+			      _handle_hotplug_event_bridge, delay);
 }
 
 static void _handle_hotplug_event_func(struct work_struct *work)