Patchwork [PATCH] Quirk to fix suspend/resume on another Lenovo ThinkPad Edge, model 030246G.

Submitter Thomas Schwinge
Date Feb. 13, 2011, 2:17 p.m.
Message ID <1297606666-13207-1-git-send-email-thomas@schwinge.name>
Permalink /patch/83077/
State New

Comments

Thomas Schwinge - Feb. 13, 2011, 2:17 p.m.
BugLink: http://launchpad.net/bugs/702434
Signed-off-by: Thomas Schwinge <thomas@schwinge.name>
---

Hello!

I had to apply this additional patch on top of ubuntu-maverick's
Ubuntu-2.6.35-26.46 (my patch builds on top of Manoj's
b95ee31d81f578162310e346a0b3277a65ac4a4d) in order to get suspend/resume
working on my girlfriend's ThinkPad Edge.  Previously, the machine would
sort of resume, but the screen stayed dark; remote SSH login was still
possible, though.


Regards,
 Thomas


 arch/x86/kernel/acpi/boot.c |    9 +++++++++
 1 files changed, 9 insertions(+), 0 deletions(-)
Stefan Bader - Feb. 14, 2011, 1:41 p.m.
Hi Thomas,

First, thanks for the test and the patch, and sorry for starting to nag here.
As you mention, this patch adds to the other quirks that Manoj got added to
Maverick (and I thought he had said these were submitted upstream). I just
don't see anything there, which makes me uneasy that all of this might come
back with Natty.

So I would like to take this opportunity to ask: what happened to those?

-Stefan

Stefan Bader - Feb. 14, 2011, 2:27 p.m.
So I did a bit of research, and it seems that the root cause of the problem is
a BIOS bug that causes IRQ0 to use the wrong polarity (at least on systems with
AMD SB800). Skipping the mapping just avoided the problem because the other
IRQs were not wrong.

The upstream discussion "Quirk to fix suspend/resume on Lenovo Edge 11,13,14,15"
ended with a test patch and the prospect of a proper fix asap.
Andreas, I know there is always lots of stuff to do; do you have any outlook there?
Thomas, is your girlfriend's Edge an AMD-based system, too?

-Stefan

Below is the relevant mail, with some explanation and debugging hints, for convenience.

---

Seems that we've identified the root cause.

I wondered why systems with the problem have the IOAPIC pin configured
with polarity=1 (low active).  That was different from what the working
systems used.

Switching the configuration to the usual polarity=0 (high active)
fixed the issue.

The explanation is that when the HPET interrupt is triggered, the signal
goes from low to high. (AFAIK the HPET spec even mentions that HPET
interrupts are all active high.)

Now if the IO-APIC pin is configured as low active, it just ignores this
signal change. It only triggers later, when for the next interrupt the
signal goes from high to low and high again. (That happens the first time
after resume when the HPET counter has wrapped around.)

Setting the correct polarity fixes the detection of the first HPET
interrupt after resume.
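
As a rough illustration of the reasoning above, here is a toy model in C.
It is not how the IO-APIC is implemented; it only assumes an edge-triggered
pin that fires on the transition into its configured active level, with the
line idling low. The names pin_fires and first_fire_index are made up for
this sketch.

```c
#include <stddef.h>

/*
 * Toy model (an assumption for illustration, not real hardware behaviour):
 * polarity 0 = high active, the pin fires on a low -> high transition;
 * polarity 1 = low active,  the pin fires on a high -> low transition.
 */
static int pin_fires(int polarity, int prev_level, int level)
{
	if (polarity == 0)
		return prev_level == 0 && level == 1;
	return prev_level == 1 && level == 0;
}

/*
 * Return the index of the first sample at which the pin fires, or -1 if it
 * never fires. The line is assumed to be low before samples[0].
 */
static int first_fire_index(int polarity, const int *samples, size_t n)
{
	int prev = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (pin_fires(polarity, prev, samples[i]))
			return (int)i;
		prev = samples[i];
	}
	return -1;
}
```

With the samples {1, 0, 1} (signal rises, falls, rises again),
first_fire_index(0, ...) gives 0 while first_fire_index(1, ...) gives 1:
in this model the low-active setting misses the first rising edge and only
reacts once the signal has dropped again.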

To confirm that your systems behave similarly, you should boot with the
"apic=debug" kernel parameter. The output for the IO APIC should show
polarity=1 for IO APIC pin 2, e.g.

 [    0.158179] IO APIC #2......
  ...
 [    0.158205]  NR Dst Mask Trig IRR Pol Stat Dmod Deli Vect:
 [    0.158210]  00 000 1    0    0   0   0    0    0    00
 [    0.158217]  01 003 0    0    0   0   0    1    1    31
 [    0.158224]  02 003 0    0    0   1   0    1    1    30

Furthermore, you can check with the attached test patch whether changing
the polarity fixes the problem on your system. With this patch, the IO APIC
debug output should change to

 [    0.156170] IO APIC #2......
  ...
 [    0.156197]  NR Dst Mask Trig IRR Pol Stat Dmod Deli Vect:
 [    0.156202]  00 000 1    0    0   0   0    0    0    00
 [    0.156209]  01 003 0    0    0   0   0    1    1    31
 [    0.156216]  02 003 0    0    0   0   0    1    1    30
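
If you want to check such output mechanically, something along these lines
might do. It is only a sketch: it assumes the row has already been stripped
of its "[ timestamp]" prefix and that the columns are exactly those shown in
the header line above (NR Dst Mask Trig IRR Pol Stat Dmod Deli Vect). The
function names are invented for this example.

```c
#include <stdio.h>

/*
 * Parse one row of the IRQ redirection table as printed with apic=debug.
 * NR, Dst and Vect are read as hex, the remaining columns as decimal.
 * Returns 1 and fills *pin and *pol on success, 0 if the row does not parse.
 */
static int parse_redir_row(const char *row, unsigned *pin, unsigned *pol)
{
	unsigned dst, mask, trig, irr, stat, dmod, deli, vect;

	return sscanf(row, "%x %x %u %u %u %u %u %u %u %x",
		      pin, &dst, &mask, &trig, &irr, pol,
		      &stat, &dmod, &deli, &vect) == 10;
}

/* 1 if the row describes pin 2 with Pol=1 (low active), 0 otherwise. */
static int row_shows_suspect_pin2(const char *row)
{
	unsigned pin, pol;

	return parse_redir_row(row, &pin, &pol) && pin == 2 && pol == 1;
}
```

Fed the pin 2 row from the first listing, row_shows_suspect_pin2() returns 1;
for the pin 2 row from the second listing it returns 0.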


I'll come up with an SB800 quirk asap. (Of course we'll also try to
fix the respective BIOSes but too often BIOS updates are only
available for a limited time period.)
Thomas Schwinge - Feb. 14, 2011, 2:43 p.m.
Hello!

On Mon, 14 Feb 2011 15:27:10 +0100, Stefan Bader <stefan.bader@canonical.com> wrote:
> On 02/14/2011 02:41 PM, Stefan Bader wrote:
> > First, thanks for the test and the patch, and sorry for starting to nag here.
> > As you mention, this patch adds to the other quirks that Manoj got added to
> > Maverick (and I thought he had said these were submitted upstream). I just
> > don't see anything there, which makes me uneasy that all of this might come
> > back with Natty.
> > 
> > So I would like to take this opportunity to ask: what happened to those?

I cannot comment on that -- I only replicated what Manoj had done.


> So I did a bit of research, and it seems that the root cause of the problem is
> a BIOS bug that causes IRQ0 to use the wrong polarity (at least on systems with
> AMD SB800). Skipping the mapping just avoided the problem because the other
> IRQs were not wrong.
> 
> The upstream discussion "Quirk to fix suspend/resume on Lenovo Edge 11,13,14,15"
> ended with a test patch and the prospect of a proper fix asap.
> Andreas, I know there is always lots of stuff to do; do you have any outlook there?
> Thomas, is your girlfriend's Edge an AMD-based system, too?

Yes, it's this one:
<http://www.campuspoint.de/shop/notebooks/hrst/lenovo/edge/edge-15/nvldhge-2.html>

    Chipset:        AMD RS880M chipset

I can also provide further debugging output, etc.


Regards,
 Thomas
Thomas Schwinge - Feb. 14, 2011, 7:16 p.m.
Hello!

On Mon, 14 Feb 2011 15:27:10 +0100, Stefan Bader <stefan.bader@canonical.com> wrote:
> Seems that we've identified the root cause.
> 
> I wondered why systems with the problem have the IOAPIC pin configured
> with polarity=1 (low active).  That was different from what the working
> systems used.
> 
> Switching the configuration to the usual polarity=0 (high active)
> fixed the issue.
> 
> The explanation is that when the HPET interrupt is triggered, the signal
> goes from low to high. (AFAIK the HPET spec even mentions that HPET
> interrupts are all active high.)
> 
> Now if the IO-APIC pin is configured as low active, it just ignores this
> signal change. It only triggers later, when for the next interrupt the
> signal goes from high to low and high again. (That happens the first time
> after resume when the HPET counter has wrapped around.)
> 
> Setting the correct polarity fixes the detection of the first HPET
> interrupt after resume.
> 
> To confirm that your systems behave similarly, you should boot with the
> "apic=debug" kernel parameter. The output for the IO APIC should show
> polarity=1 for IO APIC pin 2, e.g.
> 
>  [    0.158179] IO APIC #2......
>   ...
>  [    0.158205]  NR Dst Mask Trig IRR Pol Stat Dmod Deli Vect:
>  [    0.158210]  00 000 1    0    0   0   0    0    0    00
>  [    0.158217]  01 003 0    0    0   0   0    1    1    31
>  [    0.158224]  02 003 0    0    0   1   0    1    1    30

Confirmed; see below.

> Furthermore, you can check with the attached test patch whether changing
> the polarity fixes the problem on your system. With this patch, the IO APIC
> debug output should change to
> 
>  [    0.156170] IO APIC #2......
>   ...
>  [    0.156197]  NR Dst Mask Trig IRR Pol Stat Dmod Deli Vect:
>  [    0.156202]  00 000 1    0    0   0   0    0    0    00
>  [    0.156209]  01 003 0    0    0   0   0    1    1    31
>  [    0.156216]  02 003 0    0    0   0   0    1    1    30
> 
> 
> I'll come up with an SB800 quirk asap. (Of course we'll also try to
> fix the respective BIOSes but too often BIOS updates are only
> available for a limited time period.)

In case this helps someone: here is a diff of the "apic=debug" dmesg output
of the original Ubuntu maverick linux-image-2.6.35-26-generic 2.6.35-26.46
package (suspend/resume broken) vs. my rebuilt one with the patch / hack
applied that I posted in this thread (suspend/resume functional).
(I stripped some uninteresting bits from the diff.)

    - Linux version 2.6.35-26-generic (buildd@crested) (gcc version 4.4.5 (Ubuntu/Linaro 4.4.4-14ubuntu5) ) #46-Ubuntu SMP Sun Jan 30 06:59:07 UTC 2011 (Ubuntu 2.6.35-26.46-generic 2.6.35.10)
    + Linux version 2.6.35-26-generic (root@Paddy) (gcc version 4.4.5 (Ubuntu/Linaro 4.4.4-14ubuntu5) ) #46 SMP Sun Feb 13 13:09:55 CET 2011 (Ubuntu 2.6.35-26.46-generic 2.6.35.10)
    [...]
    + ------------[ cut here ]------------
    + WARNING: at /home/thomi/tmp/linux-2.6.35/arch/x86/kernel/acpi/boot.c:1345 dmi_ignore_irq0_timer_override+0x2f/0x51()
    + Hardware name: 030246G
    + ati_ixp4x0 quirk not complete.
    + Modules linked in:
    + Pid: 0, comm: swapper Not tainted 2.6.35-26-generic #46
    + Call Trace:
    +  [<ffffffff8106093f>] warn_slowpath_common+0x7f/0xc0
    +  [<ffffffff81060a36>] warn_slowpath_fmt+0x46/0x50
    +  [<ffffffff81af84bc>] dmi_ignore_irq0_timer_override+0x2f/0x51
    +  [<ffffffff8147d71d>] dmi_check_system+0x3d/0x60
    +  [<ffffffff81af8c0d>] acpi_boot_table_init+0x10/0x85
    +  [<ffffffff81af2b73>] setup_arch+0x68c/0x7a3
    +  [<ffffffff81aed9ec>] start_kernel+0xdd/0x390
    +  [<ffffffff81aed341>] x86_64_start_reservations+0x12c/0x130
    +  [<ffffffff81aed43f>] x86_64_start_kernel+0xfa/0x109
    + ---[ end trace a7919e7f17c0a725 ]---
    + ThinkPad Edge detected: Ignoring BIOS IRQ0 pin2 override
    [...]
      ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
      IOAPIC[0]: apic_id 2, version 33, address 0xfec00000, GSI 0-23
      ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 low level)
    - ACPI: IRQ0 used by override.
    - ACPI: IRQ2 used by override.
    + ACPI: BIOS IRQ0 pin2 override ignored.
      ACPI: IRQ9 used by override.
      Using ACPI (MADT) for SMP configuration information
      ACPI: HPET id: 0x43538210 base: 0xfed00000
    [...]
      enabled ExtINT on CPU#0
      ENABLING IO-APIC IRQs
      init IO_APIC IRQs
    -  2-0 (apicid-pin) not connected
    + IOAPIC[0]: Set routing entry (2-0 -> 0x30 -> IRQ 0 Mode:0 Active:0)
      IOAPIC[0]: Set routing entry (2-1 -> 0x31 -> IRQ 1 Mode:0 Active:0)
    - IOAPIC[0]: Set routing entry (2-2 -> 0x30 -> IRQ 0 Mode:0 Active:1)
      IOAPIC[0]: Set routing entry (2-3 -> 0x33 -> IRQ 3 Mode:0 Active:0)
      IOAPIC[0]: Set routing entry (2-4 -> 0x34 -> IRQ 4 Mode:0 Active:0)
      IOAPIC[0]: Set routing entry (2-5 -> 0x35 -> IRQ 5 Mode:0 Active:0)
    [...]
      IOAPIC[0]: Set routing entry (2-14 -> 0x3e -> IRQ 14 Mode:0 Active:0)
      IOAPIC[0]: Set routing entry (2-15 -> 0x3f -> IRQ 15 Mode:0 Active:0)
       2-16 2-17 2-18 2-19 2-20 2-21 2-22 2-23 (apicid-pin) not connected
    - ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
    + ..TIMER: vector=0x30 apic1=0 pin1=0 apic2=-1 pin2=-1
    + ..MP-BIOS bug: 8254 timer not connected to IO-APIC
    + ...trying to set up timer (IRQ0) through the 8259A ...
    + ..... (found apic 0 pin 0) ...
    + ....... works.
      CPU0: AMD Athlon(tm) II P340 Dual-Core Processor stepping 03
      Using local APIC timer interrupts.
      calibrating APIC timer ...
    [...]
      printing PIC contents
    - ... PIC  IMR: ffff
    + ... PIC  IMR: fffe
      ... PIC  IRR: 0201
      ... PIC  ISR: 0000
      ... PIC ELCR: 0c20
    [...]
      ... APIC IRR field:
      0000000000000000000000000000000000000000000000000000000000000000
      ... APIC ESR: 00000000
    - ... APIC ICR: 000008fd
    + ... APIC ICR: 000008ef
      ... APIC ICR2: 02000000
      ... APIC LVTT: 000300ef
      ... APIC LVTPC: 00000400
    [...]
      ... APIC EILVT2: 00010000
      ... APIC EILVT3: 00010000
      
    - number of MP IRQ sources: 15.
    + number of MP IRQ sources: 16.
      number of IO-APIC #2 registers: 24.
      testing the IO APIC.......................
    [...]
      .......     : Boot DT    : 0
      .... IRQ redirection table:
       NR Dst Mask Trig IRR Pol Stat Dmod Deli Vect:
    -  00 000 1    0    0   0   0    0    0    00
    +  00 003 0    0    0   0   0    1    1    30
       01 003 0    0    0   0   0    1    1    31
    -  02 003 0    0    0   1   0    1    1    30
    +  02 003 1    0    0   0   0    0    0    32
       03 003 0    0    0   0   0    1    1    33
       04 003 0    0    0   0   0    1    1    34
       05 003 0    0    0   0   0    1    1    35
    [...]
       16 000 1    0    0   0   0    0    0    00
       17 000 1    0    0   0   0    0    0    00
      IRQ to pin mappings:
    - IRQ0 -> 0:2
    + IRQ0 -> 0:0
      IRQ1 -> 0:1
    + IRQ2 -> 0:2
      IRQ3 -> 0:3
      IRQ4 -> 0:4
      IRQ5 -> 0:5


Regards,
 Thomas
Stefan Bader - Feb. 16, 2011, 9:17 a.m.
Hi Andreas,

I know there are probably already enough people poking you about when there
might be a generic fix. I just wanted to point out that Thomas reported an
RS880M chipset (in case that makes a difference from the SB800 that seemed to
be reported before).

-Stefan

Andreas Herrmann - Feb. 23, 2011, 2:58 p.m.
On Wed, Feb 16, 2011 at 04:17:02AM -0500, Stefan Bader wrote:
> Hi Andreas,
> 
> I know there are probably already enough people poking you about when there
> might be a generic fix. I just wanted to point out that Thomas reported an
> RS880M chipset (in case that makes a difference from the SB800 that seemed to
> be reported before).

Hi Stefan,

The patch is prepared.
I just want to verify it on one known buggy system here.
I'll send it out by the end of today or tomorrow.


Andreas

Stefan Bader - Feb. 24, 2011, 4:30 p.m.
On 02/24/2011 03:56 PM, Andreas Herrmann wrote:
> On Wed, Feb 23, 2011 at 03:58:40PM +0100, Andreas Herrmann wrote:
>> On Wed, Feb 16, 2011 at 04:17:02AM -0500, Stefan Bader wrote:
>>> Hi Andreas,
>>>
>>> I know there are probably already enough people poking you about when there
>>> might be a generic fix. I just wanted to point out that Thomas reported an
>>> RS880M chipset (in case that makes a difference from the SB800 that seemed to
>>> be reported before).
>>
>> Hi Stefan,
>>
>> The patch is prepared.
>> I just want to verify it on one known buggy system here.
>> I'll send it out by the end of today or tomorrow.
> 
> I've just submitted the patch below to LKML
> (http://marc.info/?i=20110224145346.GD3658@alberich.amd.com)
> 
> 
> Andreas
> 

Cool. Thanks.

/me ponders whether this might be worth having a command-line quirk for...

> ---
> On some SB800 systems polarity for IOAPIC pin2 is wrongly specified as
> low active by BIOS. This caused system hangs after resume from S3 when
> HPET was used in one-shot mode on such systems because a timer
> interrupt was missed (HPET signal is high active).
> 
> For more details see http://marc.info/?l=linux-kernel&m=129623757413868
> 
> Cc: stable@kernel.org # 37.x, 32.x
> Tested-by: Manoj Iyer <manoj.iyer@canonical.com>
> Tested-by: Andre Przywara <andre.przywara@amd.com>
> Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
> ---
>  arch/x86/include/asm/acpi.h    |    1 +
>  arch/x86/kernel/acpi/boot.c    |   16 ++++++++++++----
>  arch/x86/kernel/early-quirks.c |   16 +++++++---------
>  3 files changed, 20 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/x86/include/asm/acpi.h b/arch/x86/include/asm/acpi.h
> index 211ca3f..4ea15ca 100644
> --- a/arch/x86/include/asm/acpi.h
> +++ b/arch/x86/include/asm/acpi.h
> @@ -88,6 +88,7 @@ extern int acpi_disabled;
>  extern int acpi_pci_disabled;
>  extern int acpi_skip_timer_override;
>  extern int acpi_use_timer_override;
> +extern int acpi_fix_pin2_polarity;
>  
>  extern u8 acpi_sci_flags;
>  extern int acpi_sci_override_gsi;
> diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
> index b3a7113..ff25db6 100644
> --- a/arch/x86/kernel/acpi/boot.c
> +++ b/arch/x86/kernel/acpi/boot.c
> @@ -72,6 +72,7 @@ u8 acpi_sci_flags __initdata;
>  int acpi_sci_override_gsi __initdata;
>  int acpi_skip_timer_override __initdata;
>  int acpi_use_timer_override __initdata;
> +int acpi_fix_pin2_polarity __initdata;
>  
>  #ifdef CONFIG_X86_LOCAL_APIC
>  static u64 acpi_lapic_addr __initdata = APIC_DEFAULT_PHYS_BASE;
> @@ -415,10 +416,17 @@ acpi_parse_int_src_ovr(struct acpi_subtable_header * header,
>  		return 0;
>  	}
>  
> -	if (acpi_skip_timer_override &&
> -	    intsrc->source_irq == 0 && intsrc->global_irq == 2) {
> -		printk(PREFIX "BIOS IRQ0 pin2 override ignored.\n");
> -		return 0;
> +	if (intsrc->source_irq == 0 && intsrc->global_irq == 2) {
> +		if (acpi_skip_timer_override) {
> +			printk(PREFIX "BIOS IRQ0 pin2 override ignored.\n");
> +			return 0;
> +		}
> +		if (acpi_fix_pin2_polarity &&
> +		    (intsrc->inti_flags & ACPI_MADT_POLARITY_MASK)) {
> +			intsrc->inti_flags &= ~ACPI_MADT_POLARITY_MASK;
> +			printk(PREFIX "BIOS IRQ0 pin2 override: "
> +			       "forcing polarity to high active.\n");
> +		}
>  	}
>  
>  	mp_override_legacy_irq(intsrc->source_irq,
> diff --git a/arch/x86/kernel/early-quirks.c b/arch/x86/kernel/early-quirks.c
> index 76b8cd9..9efbdcc 100644
> --- a/arch/x86/kernel/early-quirks.c
> +++ b/arch/x86/kernel/early-quirks.c
> @@ -143,15 +143,10 @@ static void __init ati_bugs(int num, int slot, int func)
>  
>  static u32 __init ati_sbx00_rev(int num, int slot, int func)
>  {
> -	u32 old, d;
> +	u32 d;
>  
> -	d = read_pci_config(num, slot, func, 0x70);
> -	old = d;
> -	d &= ~(1<<8);
> -	write_pci_config(num, slot, func, 0x70, d);
>  	d = read_pci_config(num, slot, func, 0x8);
>  	d &= 0xff;
> -	write_pci_config(num, slot, func, 0x70, old);
>  
>  	return d;
>  }
> @@ -160,13 +155,16 @@ static void __init ati_bugs_contd(int num, int slot, int func)
>  {
>  	u32 d, rev;
>  
> -	if (acpi_use_timer_override)
> -		return;
> -
>  	rev = ati_sbx00_rev(num, slot, func);
> +	if (rev >= 0x40)
> +		acpi_fix_pin2_polarity = 1;
> +
>  	if (rev > 0x13)
>  		return;
>  
> +	if (acpi_use_timer_override)
> +		return;
> +
>  	/* check for IRQ0 interrupt swap */
>  	d = read_pci_config(num, slot, func, 0x64);
>  	if (!(d & (1<<14)))
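
For readers who want to see the core of the quoted patch in isolation, here
is a small userspace sketch of its two decisions: the revision check and the
clearing of the polarity bits. It is not the kernel code. The EX_ names and
struct ex_override are local to this example; the mask value assumes the
MADT flag layout in which polarity occupies bits 0-1 (3 = active low,
0 = conforms to the bus), and the patch's own message calls the cleared
state "high active".

```c
#include <stdint.h>

/* Local stand-ins; values assume the MADT interrupt-override flag layout. */
#define EX_POLARITY_MASK 0x3u	/* bits 0-1: polarity */

/* Simplified stand-in for an interrupt source override entry. */
struct ex_override {
	uint8_t  source_irq;
	uint32_t global_irq;
	uint16_t inti_flags;
};

/* Mirrors the revision test in the patch: 0x40 and above wants the fix. */
static int ex_wants_polarity_fix(uint32_t sb_rev)
{
	return sb_rev >= 0x40;
}

/*
 * If this is the IRQ0 -> pin 2 override and the fix is enabled, clear the
 * polarity bits and leave the trigger bits alone.
 * Returns 1 if the flags were changed, 0 otherwise.
 */
static int ex_fix_pin2_polarity(struct ex_override *o, int fix_enabled)
{
	if (o->source_irq != 0 || o->global_irq != 2)
		return 0;
	if (!fix_enabled || !(o->inti_flags & EX_POLARITY_MASK))
		return 0;
	o->inti_flags &= (uint16_t)~EX_POLARITY_MASK;
	return 1;
}
```

For an override with flags 0xF ("low level", as in the dmesg output earlier
in this thread) and a revision of 0x40, the flags become 0xC: the trigger
bits stay and only the polarity bits are cleared. With a revision of 0x13 the
entry is left untouched.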

Patch

diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index 715abe9..8debd3b 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -1461,6 +1461,15 @@  static struct dmi_system_id __initdata acpi_dmi_table[] = {
 		     DMI_MATCH(DMI_PRODUCT_NAME, "030222U"),
 		    },
 	},
+	/* Lenovo ThinkPad Edge, model 030246G */
+	{
+	 .callback = dmi_ignore_irq0_timer_override,
+	 .ident = "ThinkPad Edge",
+	 .matches = {
+		     DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
+		     DMI_MATCH(DMI_PRODUCT_NAME, "030246G"),
+		    },
+	},
 	{}
 };
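
To illustrate how a quirk table like the one patched above is used, here is
a simplified userspace mock. It is a sketch under assumptions, not the
kernel's DMI code: the ex_ types and functions are hypothetical, the table
ends at an entry with a NULL ident (the kernel uses an empty entry), the
strings are compared by substring as DMI_MATCH does, and the callback's
return value is ignored.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a DMI quirk table entry. */
struct ex_dmi_entry {
	const char *ident;	/* NULL marks the end of the table */
	const char *vendor;
	const char *product;
	int (*callback)(const struct ex_dmi_entry *e);
};

/* Stand-in for the flag the real callback sets. */
static int ex_skip_timer_override;

static int ex_ignore_irq0_timer_override(const struct ex_dmi_entry *e)
{
	(void)e;
	ex_skip_timer_override = 1;
	return 0;
}

static const struct ex_dmi_entry ex_table[] = {
	{ "ThinkPad Edge", "LENOVO", "030222U", ex_ignore_irq0_timer_override },
	{ "ThinkPad Edge", "LENOVO", "030246G", ex_ignore_irq0_timer_override },
	{ NULL, NULL, NULL, NULL }
};

/* Run the callback of every matching entry; return the number of matches. */
static int ex_check_system(const struct ex_dmi_entry *t,
			   const char *vendor, const char *product)
{
	int hits = 0;

	for (; t->ident; t++) {
		if (strstr(vendor, t->vendor) && strstr(product, t->product)) {
			hits++;
			if (t->callback)
				t->callback(t);
		}
	}
	return hits;
}
```

Calling ex_check_system(ex_table, "LENOVO", "030246G") finds one match and
sets ex_skip_timer_override. Without the second table entry the machine in
this thread would not match at all, which is why each affected model needed
its own entry.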