Patchwork [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

Submitter Jiri Kosina
Date March 14, 2013, 3:39 p.m.
Message ID <alpine.LNX.2.00.1303141634440.30118@pobox.suse.cz>
Permalink /patch/227718/
State Not Applicable

Comments

Jiri Kosina - March 14, 2013, 3:39 p.m.
On Thu, 14 Mar 2013, Alan Stern wrote:

> > > Can you try to do a git bisect for this?  Is the sluggish system 
> > > response clear enough that you can tell reliably when it is present and 
> > > when it isn't?
> > 
> > That was my first thought, but unfortunately I am afraid there will be 
> > a point at which I will easily make a bisection mistake, as the 
> > responsiveness of the system varies over time, so it's not really a 
> > 100% objective measure.
> 
> All right.
> 
> There have been only three significant changes to uhci-hcd since last 
> summer, and two of them appear to be completely unrelated to this 
> issue.  The three commits are
> 
> 	3171fcabb169  USB: uhci: beautify source code
> 	13996ca7afd5  USB: uhci: check buffer length to avoid memory 
> 			overflow
> 	0f815a0a700b  USB: UHCI: fix IRQ race during initialization
> 
> Reverting the first two almost certainly will not have any effect, but
> you may as well try it anyway.  The third commit may be relevant.

I have reverted all three commits, and the "nobody cared" is still there. 

> If you revert all three and still see the problem then it must be
> caused by changes outside of the USB stack.  Differences in interrupt
> routing could be a result of changes to PCI or ACPI.  Have you compared 
> the current /proc/interrupts with versions from earlier kernels without 
> this problem?

The diff of stripped-down (without CPU statistics) /proc/interrupts from 
some oldish working 3.1 and the current tree:


IRQ16 is routed differently (usb4 vs usb6), so that might be relevant.
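
For readers who want to reproduce this kind of comparison, here is a minimal sketch of how a /proc/interrupts dump could be stripped of its per-CPU counters before diffing. The function name `strip_cpu_counts` and the sample counter values are made up for illustration; this is not the script that was actually used for the listings compared here.

```python
def strip_cpu_counts(text):
    """Drop the CPU header row and the per-CPU counters from /proc/interrupts text."""
    out = []
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # the header row ("CPU0 CPU1 ...") contains no colon
        tokens = rest.split()
        # Skip the leading purely numeric columns (one counter per CPU).
        i = 0
        while i < len(tokens) and tokens[i].isdigit():
            i += 1
        out.append(f"{label.strip():>3}:{' '.join(tokens[i:])}")
    return "\n".join(out)


# Hypothetical two-CPU sample; the counter values are invented.
sample = (
    "           CPU0       CPU1\n"
    " 16:        120         30   IO-APIC-fasteoi   uhci_hcd:usb4\n"
    "NMI:          0          0   Non-maskable interrupts\n"
)
print(strip_cpu_counts(sample))
```

Running the function on saved copies of /proc/interrupts from both kernels and feeding the results to `diff -u` should give a comparison that is not drowned in changing counters.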

> Is occurrence of the "nobody cared" connected with any particular 
> device?  Somebody reported a similar problem not long ago (although IIRC 
> it was for OHCI rather than UHCI) which appeared to be related to 
> activity on the built-in webcam.

Will check this. No external devices are plugged in; I think the only 
internal one it has is a bluetooth chip. I'll try turning it off.

Jiri Kosina - March 14, 2013, 3:47 p.m.
On Thu, 14 Mar 2013, Jiri Kosina wrote:

> > Is occurrence of the "nobody cared" connected with any particular 
> > device?  Somebody reported a similar problem not long ago (although IIRC 
> > it was for OHCI rather than UHCI) which appeared to be related to 
> > activity on the built-in webcam.
> 
> Will check this. No external devices are plugged in, I think the only 
> internal one it has is bluetooth chip. I'll try turning it off.

That didn't help (I disabled it via hard rfkill and it vanished from 
lsusb), i.e. it happens even with only the hubs being there.

Alan Stern - March 14, 2013, 4:10 p.m.
On Thu, 14 Mar 2013, Jiri Kosina wrote:

> I have reverted all three commits, and the "nobody cared" is still there. 
> 
> > If you revert all three and still see the problem then it must be
> > caused by changes outside of the USB stack.  Differences in interrupt
> > routing could be a result of changes to PCI or ACPI.  Have you compared 
> > the current /proc/interrupts with versions from earlier kernels without 
> > this problem?
> 
> The diff of stripped-down (without CPU statistics) /proc/interrupts from 
> some oldish working 3.1 and the current tree:
> 
> --- /tmp/interrupts-old.txt	2013-03-14 16:30:46.938710286 +0100
> +++ /tmp/interrupts-new.txt	2013-03-14 16:30:18.954571413 +0100
> @@ -3,27 +3,28 @@
>    8:IO-APIC-edge      rtc0
>    9:IO-APIC-fasteoi   acpi
>   12:IO-APIC-edge      i8042
> - 16:IO-APIC-fasteoi   uhci_hcd:usb6
> - 17:IO-APIC-fasteoi   uhci_hcd:usb7
> - 18:IO-APIC-fasteoi   ata_generic, uhci_hcd:usb8
> - 19:IO-APIC-fasteoi   ehci_hcd:usb2
> - 20:IO-APIC-fasteoi   uhci_hcd:usb3
> - 21:IO-APIC-fasteoi   uhci_hcd:usb4
> - 22:IO-APIC-fasteoi   uhci_hcd:usb5
> - 23:IO-APIC-fasteoi   ehci_hcd:usb1
> + 16:IO-APIC-fasteoi   uhci_hcd:usb4
> + 17:IO-APIC-fasteoi   uhci_hcd:usb5
> + 18:IO-APIC-fasteoi   ata_generic, uhci_hcd:usb6
> + 19:IO-APIC-fasteoi   ehci_hcd:usb8
> + 20:IO-APIC-fasteoi   uhci_hcd:usb1
> + 21:IO-APIC-fasteoi   uhci_hcd:usb2
> + 22:IO-APIC-fasteoi   uhci_hcd:usb3
> + 23:IO-APIC-fasteoi   ehci_hcd:usb7, i801_smbus
>   40:PCI-MSI-edge      PCIe PME
>   41:PCI-MSI-edge      PCIe PME
>   42:PCI-MSI-edge      PCIe PME
>   43:PCI-MSI-edge      ahci
>   44:PCI-MSI-edge      i915
>   45:PCI-MSI-edge      eth0
> - 46:PCI-MSI-edge      iwlagn
> + 46:PCI-MSI-edge      iwlwifi
>   47:PCI-MSI-edge      snd_hda_intel
>  NMI:Non-maskable interrupts
>  LOC:Local timer interrupts
>  SPU:Spurious interrupts
>  PMI:Performance monitoring interrupts
>  IWI:IRQ work interrupts
> +RTR:APIC ICR read retries
>  RES:Rescheduling interrupts
>  CAL:Function call interrupts
>  TLB:TLB shootdowns
> 
> IRQ16 is routed differently (usb4 vs usb6), so that might be relevant.

It looks like the order of probing changed.  The old kernel did 
ehci-hcd before uhci-hcd and the new kernel did them in the opposite 
order.  Consequently usb3-usb8 in the old kernel (the UHCI devices) are 
the same as usb1-usb6 in the new kernel.  Likewise, usb1-usb2 in the 
old kernel are usb7-usb8 in the new kernel.

In fact, the only major difference appears to be i801_smbus on IRQ 23.  
It's hard to see how that could have any effect.
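
The renumbering argument above can be checked mechanically. The following is only a sketch: the `OLD` and `NEW` tables are transcribed by hand from the IRQ 16-23 lines of the quoted diff, and the helper names `bus_map` and `driver_changes` are made up for illustration.

```python
# IRQ -> handlers, transcribed from the IRQ 16-23 lines of the quoted diff.
OLD = {
    16: ["uhci_hcd:usb6"], 17: ["uhci_hcd:usb7"],
    18: ["ata_generic", "uhci_hcd:usb8"], 19: ["ehci_hcd:usb2"],
    20: ["uhci_hcd:usb3"], 21: ["uhci_hcd:usb4"],
    22: ["uhci_hcd:usb5"], 23: ["ehci_hcd:usb1"],
}
NEW = {
    16: ["uhci_hcd:usb4"], 17: ["uhci_hcd:usb5"],
    18: ["ata_generic", "uhci_hcd:usb6"], 19: ["ehci_hcd:usb8"],
    20: ["uhci_hcd:usb1"], 21: ["uhci_hcd:usb2"],
    22: ["uhci_hcd:usb3"], 23: ["ehci_hcd:usb7", "i801_smbus"],
}


def bus_map(old, new):
    """Pair each old usbN with the new usbN served by the same driver on the same IRQ."""
    mapping = {}
    for irq, handlers in old.items():
        for handler in handlers:
            driver, _, bus = handler.partition(":")
            if not bus:
                continue  # not a USB host controller entry
            for candidate in new.get(irq, []):
                if candidate.startswith(driver + ":"):
                    mapping[bus] = candidate.split(":")[1]
    return mapping


def driver_changes(old, new):
    """Return the IRQs whose set of drivers differs once bus numbers are ignored."""
    def drivers(handlers):
        return sorted(h.split(":")[0] for h in handlers)

    changes = {}
    for irq in sorted(set(old) | set(new)):
        before, after = drivers(old.get(irq, [])), drivers(new.get(irq, []))
        if before != after:
            changes[irq] = (before, after)
    return changes


print(bus_map(OLD, NEW))
print(driver_changes(OLD, NEW))
```

With bus numbers ignored, the only IRQ in this range whose driver list changes is 23, where i801_smbus has been added; every UHCI bus number is simply shifted down by two and the two EHCI buses move to usb7 and usb8.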

> > Is occurrence of the "nobody cared" connected with any particular 
> > device?  Somebody reported a similar problem not long ago (although IIRC 
> > it was for OHCI rather than UHCI) which appeared to be related to 
> > activity on the built-in webcam.
> 
> Will check this. No external devices are plugged in; I think the only 
> internal one it has is a bluetooth chip. I'll try turning it off.

All right.

One other thing you could try: Transplant the entire uhci-hcd driver 
from 3.1 (or whatever) into 3.9-rc1.  It should go okay -- you may have 
to apply by hand the appropriate parts of commits bc677d5b6464, 
90ab5ee94171, and 9ffc93f203c1.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Alan Stern - March 14, 2013, 4:13 p.m.
On Thu, 14 Mar 2013, Jiri Kosina wrote:

> > There have been only three significant changes to uhci-hcd since last 
> > summer, and two of them appear to be completely unrelated to this 
> > issue.  The three commits are
> > 
> > 	3171fcabb169  USB: uhci: beautify source code
> > 	13996ca7afd5  USB: uhci: check buffer length to avoid memory 
> > 			overflow
> > 	0f815a0a700b  USB: UHCI: fix IRQ race during initialization
> > 
> > Reverting the first two almost certainly will not have any effect, but
> > you may as well try it anyway.  The third commit may be relevant.
> 
> I have reverted all three commits, and the "nobody cared" is still there. 

There's one other commit I failed to find at first: 840008bb5162 (USB: 
UHCI: notify usbcore about port resumes).  Probably not relevant, but 
you should check to make sure.

Alan Stern

Patch

--- /tmp/interrupts-old.txt	2013-03-14 16:30:46.938710286 +0100
+++ /tmp/interrupts-new.txt	2013-03-14 16:30:18.954571413 +0100
@@ -3,27 +3,28 @@ 
   8:IO-APIC-edge      rtc0
   9:IO-APIC-fasteoi   acpi
  12:IO-APIC-edge      i8042
- 16:IO-APIC-fasteoi   uhci_hcd:usb6
- 17:IO-APIC-fasteoi   uhci_hcd:usb7
- 18:IO-APIC-fasteoi   ata_generic, uhci_hcd:usb8
- 19:IO-APIC-fasteoi   ehci_hcd:usb2
- 20:IO-APIC-fasteoi   uhci_hcd:usb3
- 21:IO-APIC-fasteoi   uhci_hcd:usb4
- 22:IO-APIC-fasteoi   uhci_hcd:usb5
- 23:IO-APIC-fasteoi   ehci_hcd:usb1
+ 16:IO-APIC-fasteoi   uhci_hcd:usb4
+ 17:IO-APIC-fasteoi   uhci_hcd:usb5
+ 18:IO-APIC-fasteoi   ata_generic, uhci_hcd:usb6
+ 19:IO-APIC-fasteoi   ehci_hcd:usb8
+ 20:IO-APIC-fasteoi   uhci_hcd:usb1
+ 21:IO-APIC-fasteoi   uhci_hcd:usb2
+ 22:IO-APIC-fasteoi   uhci_hcd:usb3
+ 23:IO-APIC-fasteoi   ehci_hcd:usb7, i801_smbus
  40:PCI-MSI-edge      PCIe PME
  41:PCI-MSI-edge      PCIe PME
  42:PCI-MSI-edge      PCIe PME
  43:PCI-MSI-edge      ahci
  44:PCI-MSI-edge      i915
  45:PCI-MSI-edge      eth0
- 46:PCI-MSI-edge      iwlagn
+ 46:PCI-MSI-edge      iwlwifi
  47:PCI-MSI-edge      snd_hda_intel
 NMI:Non-maskable interrupts
 LOC:Local timer interrupts
 SPU:Spurious interrupts
 PMI:Performance monitoring interrupts
 IWI:IRQ work interrupts
+RTR:APIC ICR read retries
 RES:Rescheduling interrupts
 CAL:Function call interrupts
 TLB:TLB shootdowns