Patchwork Re: [PATCH 7/7] ahci: work around bug with level interrupts

login
register
mail settings
Submitter Jan Kiszka
Date Feb. 3, 2011, 10:30 a.m.
Message ID <4D4A83B6.3030007@siemens.com>
Download mbox | patch
Permalink /patch/81628/
State New
Headers show

Comments

Jan Kiszka - Feb. 3, 2011, 10:30 a.m.
On 2011-02-02 15:39, Alexander Graf wrote:
> When using level based interrupts, the interrupt is treated the same as an
> edge triggered one: leaving the line up does not retrigger the interrupt.
> 
> In fact, when not lowering the line, we won't ever get a new interrupt inside
> the guest. So let's always retrigger an interrupt as soon as the OS ack'ed
> something on the device. This way we're sure the guest doesn't starve on
> interrupts until someone fixes the actual interrupt path.
> 
> Signed-off-by: Alexander Graf <agraf@suse.de>
> 
> ---
> 
> v2 -> v3:
> 
>   - add comment about the interrupt hack
> 
> v3 -> v4:
> 
>   - embed non-workaround version in the code
> ---
>  hw/ide/ahci.c |   13 +++++++++++++
>  1 files changed, 13 insertions(+), 0 deletions(-)
> 
> diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
> index 98bdf70..10377ca 100644
> --- a/hw/ide/ahci.c
> +++ b/hw/ide/ahci.c
> @@ -152,12 +152,25 @@ static void ahci_check_irq(AHCIState *s)
>          }
>      }
>  
> +    /* XXX We always lower the interrupt line here because of a bug with
> +           interrupt handling in Qemu. When leaving a line up, the interrupt
> +           does not get retriggered automatically currently. Once that bug is
> +           fixed, this workaround is not necessary anymore and we only need to
> +           lower in the else branch. */
> +#if 0
>      if (s->control_regs.irqstatus &&
>          (s->control_regs.ghc & HOST_CTL_IRQ_EN)) {
>              ahci_irq_raise(s, NULL);
>      } else {
>          ahci_irq_lower(s, NULL);
>      }
> +#else
> +    ahci_irq_lower(s, NULL);
> +    if (s->control_regs.irqstatus &&
> +        (s->control_regs.ghc & HOST_CTL_IRQ_EN)) {
> +            ahci_irq_raise(s, NULL);
> +    }
> +#endif
>  }
>  
>  static void ahci_trigger_irq(AHCIState *s, AHCIDevice *d,

Could you check if this quick-hack obsoletes the above hack?

I was hoping it has some influence on our 64-bit Windows issue, but it
hasn't, or it's still buggy, or both. However, its intention is to
reassert still pending level-triggered IRQs on APIC EOI. This logic is
missing in the user space model but exists in KVM's kernel model (I was
asking for a test against the latter but unfortunately did not receive
an answer - I bet you already don't need your patch over qemu-kvm).

Thanks,
Jan

---
 hw/apic.c   |    9 ++++++---
 hw/ioapic.c |   43 +++++++++++++++++++++++++++++++++++++++++--
 hw/ioapic.h |   20 ++++++++++++++++++++
 3 files changed, 67 insertions(+), 5 deletions(-)
 create mode 100644 hw/ioapic.h
Alexander Graf - Feb. 3, 2011, 10:38 a.m.
On 03.02.2011, at 11:30, Jan Kiszka wrote:

> On 2011-02-02 15:39, Alexander Graf wrote:
>> When using level based interrupts, the interrupt is treated the same as an
>> edge triggered one: leaving the line up does not retrigger the interrupt.
>> 
>> In fact, when not lowering the line, we won't ever get a new interrupt inside
>> the guest. So let's always retrigger an interrupt as soon as the OS ack'ed
>> something on the device. This way we're sure the guest doesn't starve on
>> interrupts until someone fixes the actual interrupt path.
>> 
>> Signed-off-by: Alexander Graf <agraf@suse.de>
>> 
>> ---
>> 
>> v2 -> v3:
>> 
>>  - add comment about the interrupt hack
>> 
>> v3 -> v4:
>> 
>>  - embed non-workaround version in the code
>> ---
>> hw/ide/ahci.c |   13 +++++++++++++
>> 1 files changed, 13 insertions(+), 0 deletions(-)
>> 
>> diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
>> index 98bdf70..10377ca 100644
>> --- a/hw/ide/ahci.c
>> +++ b/hw/ide/ahci.c
>> @@ -152,12 +152,25 @@ static void ahci_check_irq(AHCIState *s)
>>         }
>>     }
>> 
>> +    /* XXX We always lower the interrupt line here because of a bug with
>> +           interrupt handling in Qemu. When leaving a line up, the interrupt
>> +           does not get retriggered automatically currently. Once that bug is
>> +           fixed, this workaround is not necessary anymore and we only need to
>> +           lower in the else branch. */
>> +#if 0
>>     if (s->control_regs.irqstatus &&
>>         (s->control_regs.ghc & HOST_CTL_IRQ_EN)) {
>>             ahci_irq_raise(s, NULL);
>>     } else {
>>         ahci_irq_lower(s, NULL);
>>     }
>> +#else
>> +    ahci_irq_lower(s, NULL);
>> +    if (s->control_regs.irqstatus &&
>> +        (s->control_regs.ghc & HOST_CTL_IRQ_EN)) {
>> +            ahci_irq_raise(s, NULL);
>> +    }
>> +#endif
>> }
>> 
>> static void ahci_trigger_irq(AHCIState *s, AHCIDevice *d,
> 
> Could you check if this quick-hack obsoletes the above hack?
> 
> I was hoping it has some influence on our 64-bit Windows issue, but it
> hasn't, or it's still buggy, or both. However, its intention is to
> reassert still pending level-triggered IRQs on APIC EOI. This logic is
> missing in the user space model but exists in KVM's kernel model (I was
> asking for a test against the latter but unfortunately did not receive
> an answer - I bet you already don't need your patch over qemu-kvm).

Oh, sorry for not replying there. I work on qemu.git, so the in-kernel apic is out of question for testing. I tried merging qemu-kvm.git and qemu.git several times and every time just wasted a few hours of my life, so I gave up on testing things on qemu-kvm.git.

The patch works though. If everybody agrees to take it, we can drop patch 7/7 of my ahci set.


Alex
Jan Kiszka - Feb. 3, 2011, 11:19 a.m.
On 2011-02-03 11:38, Alexander Graf wrote:
> 
> On 03.02.2011, at 11:30, Jan Kiszka wrote:
> 
>> On 2011-02-02 15:39, Alexander Graf wrote:
>>> When using level based interrupts, the interrupt is treated the same as an
>>> edge triggered one: leaving the line up does not retrigger the interrupt.
>>>
>>> In fact, when not lowering the line, we won't ever get a new interrupt inside
>>> the guest. So let's always retrigger an interrupt as soon as the OS ack'ed
>>> something on the device. This way we're sure the guest doesn't starve on
>>> interrupts until someone fixes the actual interrupt path.
>>>
>>> Signed-off-by: Alexander Graf <agraf@suse.de>
>>>
>>> ---
>>>
>>> v2 -> v3:
>>>
>>>  - add comment about the interrupt hack
>>>
>>> v3 -> v4:
>>>
>>>  - embed non-workaround version in the code
>>> ---
>>> hw/ide/ahci.c |   13 +++++++++++++
>>> 1 files changed, 13 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
>>> index 98bdf70..10377ca 100644
>>> --- a/hw/ide/ahci.c
>>> +++ b/hw/ide/ahci.c
>>> @@ -152,12 +152,25 @@ static void ahci_check_irq(AHCIState *s)
>>>         }
>>>     }
>>>
>>> +    /* XXX We always lower the interrupt line here because of a bug with
>>> +           interrupt handling in Qemu. When leaving a line up, the interrupt
>>> +           does not get retriggered automatically currently. Once that bug is
>>> +           fixed, this workaround is not necessary anymore and we only need to
>>> +           lower in the else branch. */
>>> +#if 0
>>>     if (s->control_regs.irqstatus &&
>>>         (s->control_regs.ghc & HOST_CTL_IRQ_EN)) {
>>>             ahci_irq_raise(s, NULL);
>>>     } else {
>>>         ahci_irq_lower(s, NULL);
>>>     }
>>> +#else
>>> +    ahci_irq_lower(s, NULL);
>>> +    if (s->control_regs.irqstatus &&
>>> +        (s->control_regs.ghc & HOST_CTL_IRQ_EN)) {
>>> +            ahci_irq_raise(s, NULL);
>>> +    }
>>> +#endif
>>> }
>>>
>>> static void ahci_trigger_irq(AHCIState *s, AHCIDevice *d,
>>
>> Could you check if this quick-hack obsoletes the above hack?
>>
>> I was hoping it has some influence on our 64-bit Windows issue, but it
>> hasn't, or it's still buggy, or both. However, its intention is to
>> reassert still pending level-triggered IRQs on APIC EOI. This logic is
>> missing in the user space model but exists in KVM's kernel model (I was
>> asking for a test against the latter but unfortunately did not receive
>> an answer - I bet you already don't need your patch over qemu-kvm).
> 
> Oh, sorry for not replying there. I work on qemu.git, so the in-kernel apic is out of question for testing. I tried merging qemu-kvm.git and qemu.git several times and every time just wasted a few hours of my life, so I gave up on testing things on qemu-kvm.git.

Ah, I see. Marcelo recently merged back, resolving the tricky bits, and
now it should be much easier. Anyway.

> 
> The patch works though. If everybody agrees to take it, we can drop patch 7/7 of my ahci set.

Cool! I've a few more minor things to clean up here and will sent that
as a series later today, also for 0.14.

Jan

Patch

diff --git a/hw/apic.c b/hw/apic.c
index ff581f0..05a115f 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -18,6 +18,7 @@ 
  */
 #include "hw.h"
 #include "apic.h"
+#include "ioapic.h"
 #include "qemu-timer.h"
 #include "host-utils.h"
 #include "sysbus.h"
@@ -57,7 +58,8 @@ 
 
 #define ESR_ILLEGAL_ADDRESS (1 << 7)
 
-#define APIC_SV_ENABLE (1 << 8)
+#define APIC_SV_DIRECTED_IO             (1<<12)
+#define APIC_SV_ENABLE                  (1<<8)
 
 #define MAX_APICS 255
 #define MAX_APIC_WORDS 8
@@ -420,8 +422,9 @@  static void apic_eoi(APICState *s)
     if (isrv < 0)
         return;
     reset_bit(s->isr, isrv);
-    /* XXX: send the EOI packet to the APIC bus to allow the I/O APIC to
-            set the remote IRR bit for level triggered interrupts. */
+    if (!(s->spurious_vec & APIC_SV_DIRECTED_IO) && get_bit(s->tmr, isrv)) {
+        ioapic_eio_broadcast(isrv);
+    }
     apic_update_irq(s);
 }
 
diff --git a/hw/ioapic.c b/hw/ioapic.c
index 2109568..1d23474 100644
--- a/hw/ioapic.c
+++ b/hw/ioapic.c
@@ -23,6 +23,7 @@ 
 #include "hw.h"
 #include "pc.h"
 #include "apic.h"
+#include "ioapic.h"
 #include "qemu-timer.h"
 #include "host-utils.h"
 #include "sysbus.h"
@@ -36,7 +37,10 @@ 
 #define DPRINTF(fmt, ...)
 #endif
 
-#define IOAPIC_LVT_MASKED 		(1<<16)
+#define MAX_IOAPICS                     1
+
+#define IOAPIC_LVT_MASKED               (1ULL << 16)
+#define IOAPIC_LVT_REMOTE_IRR           (1ULL << 14)
 
 #define IOAPIC_TRIGGER_EDGE		0
 #define IOAPIC_TRIGGER_LEVEL		1
@@ -61,6 +65,8 @@  struct IOAPICState {
     uint64_t ioredtbl[IOAPIC_NUM_PINS];
 };
 
+static IOAPICState *ioapics[MAX_IOAPICS];
+
 static void ioapic_service(IOAPICState *s)
 {
     uint8_t i;
@@ -83,8 +89,11 @@  static void ioapic_service(IOAPICState *s)
                 dest_mode = (entry >> 11) & 1;
                 delivery_mode = (entry >> 8) & 7;
                 polarity = (entry >> 13) & 1;
-                if (trig_mode == IOAPIC_TRIGGER_EDGE)
+                if (trig_mode == IOAPIC_TRIGGER_EDGE) {
                     s->irr &= ~mask;
+                } else {
+                    s->ioredtbl[i] |= IOAPIC_LVT_REMOTE_IRR;
+                }
                 if (delivery_mode == IOAPIC_DM_EXTINT)
                     vector = pic_read_irq(isa_pic);
                 else
@@ -131,6 +140,29 @@  static void ioapic_set_irq(void *opaque, int vector, int level)
     }
 }
 
+void ioapic_eio_broadcast(int vector)
+{
+    IOAPICState *s;
+    uint64_t entry;
+    int i, n;
+
+    for (i = 0; i < MAX_IOAPICS; i++) {
+        s = ioapics[i];
+        if (!s) {
+            continue;
+        }
+        for (n = 0; n < IOAPIC_NUM_PINS; n++) {
+            entry = s->ioredtbl[n];
+            if ((entry & IOAPIC_LVT_REMOTE_IRR) && (entry & 0xff) == vector) {
+                s->ioredtbl[n] = entry & ~IOAPIC_LVT_REMOTE_IRR;
+                if (!(entry & IOAPIC_LVT_MASKED) && (s->irr & (1 << n))) {
+                    ioapic_service(s);
+                }
+            }
+        }
+    }
+}
+
 static uint32_t ioapic_mem_readl(void *opaque, target_phys_addr_t addr)
 {
     IOAPICState *s = opaque;
@@ -240,6 +272,11 @@  static int ioapic_init1(SysBusDevice *dev)
 {
     IOAPICState *s = FROM_SYSBUS(IOAPICState, dev);
     int io_memory;
+    static int ioapic_no;
+
+    if (ioapic_no >= MAX_IOAPICS) {
+        return -1;
+    }
 
     io_memory = cpu_register_io_memory(ioapic_mem_read,
                                        ioapic_mem_write, s,
@@ -248,6 +285,8 @@  static int ioapic_init1(SysBusDevice *dev)
 
     qdev_init_gpio_in(&dev->qdev, ioapic_set_irq, IOAPIC_NUM_PINS);
 
+    ioapics[ioapic_no++] = s;
+
     return 0;
 }
 
diff --git a/hw/ioapic.h b/hw/ioapic.h
new file mode 100644
index 0000000..c441567
--- /dev/null
+++ b/hw/ioapic.h
@@ -0,0 +1,20 @@ 
+/*
+ *  ioapic.c IOAPIC emulation logic
+ *
+ *  Copyright (c) 2011 Jan Kiszka, Siemens AG
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+void ioapic_eio_broadcast(int vector);