[v2,6/6] i8259: add -no-spurious-interrupt-hack option

Message ID 1345703083-25322-7-git-send-email-mmogilvi_qemu@miniinfo.net
State New

Commit Message

Matthew Ogilvie Aug. 23, 2012, 6:24 a.m. UTC
This patch provides a way to optionally suppress spurious interrupts,
as a workaround for systems described below:

Some old operating systems do not handle spurious interrupts well,
and qemu tends to generate them significantly more often than
real hardware.

Examples:
  - Microport UNIX System V/386 v 2.1 (ca 1987)
    (The main problem I'm fixing: Without this patch, it panics
    sporadically when accessing the hard disk.)
  - AT&T UNIX System V/386 Release 4.0 Version 2.1a (ca 1991)
    See screenshot in "QEMU Official OS Support List":
    http://www.claunia.com/qemu/objectManager.php?sClass=application&iId=9
    (I don't have this system to test.)
  - A report about OS/2 boot lockup from 2004 by Hampa Hug:
    http://lists.nongnu.org/archive/html/qemu-devel/2004-09/msg00367.html
    (My patch was partially inspired by his.)
    Also: http://lists.nongnu.org/archive/html/qemu-devel/2005-06/msg00243.html
    (I don't have this system to test.)

Signed-off-by: Matthew Ogilvie <mmogilvi_qemu@miniinfo.net>
---

Note: checkpatch.pl gives an error about initializing the global
"int no_spurious_interrupt_hack = 0;", even though existing lines
near it are doing the same thing.  Should I give precedence to
checkpatch.pl, or nearby code?

There was no version 1 of this patch; this was the last thing I had to
work around to get UNIX running.

High level symptoms:
   1. Despite using this UNIX system for nearly 10 years (ca 1987-1996)
      on an early 80386, I don't remember ever seeing any crash like
      this.  I vaguely remember I may have had one or two crashes for
      which I don't have other explanations that perhaps could have
      been this, but I don't remember the error messages to confirm it.
   2. It is somewhat random when UNIX crashes when running in qemu.
       - Sometimes it crashes the first time the floppy-based installer
         tries to access the hard disk (partition table?).
       - Other times (though fairly rarely), it actually finishes
         formatting and copying the first disk's files to the
         hard disk without crashing.
       - On the other hand, I've never seen it successfully boot from
         the hard disk without this patch.  An attempt to boot from
         the hard drive always panics quite early.
   3. I tried -win2k-hack instead, thinking maybe the hard disk is just
      responding faster than UNIX expected.  But it doesn't seem
      to have any effect.  UNIX still panics sporadically the same way.
       - TANGENT: I was going to see if my patch provides an
         alternative fix for installing Windows 2000, but
         I was unable to reproduce the original -win2k-hack problem at
         all (with neither -win2k-hack NOR this patch).  Maybe
         some other change has fixed it some other way?  Or maybe
         it is only an issue in configurations I didn't test?
         (KVM instead of TCG?  Less RAM?  Something else?)
            It might be worth doing a little more investigation,
         and eliminating the -win2k-hack option if appropriate.
   4. If I enable KVM, I get a different error very early in
      bootup (in splx function instead of splint), and this patch
      doesn't help.

Comments

Jan Kiszka Aug. 24, 2012, 5:40 a.m. UTC | #1
On 2012-08-23 08:24, Matthew Ogilvie wrote:
> This patch provides a way to optionally suppress spurious interrupts,
> as a workaround for systems described below:
> 
> Some old operating systems do not handle spurious interrupts well,
> and qemu tends to generate them significantly more often than
> real hardware.
> 
> Examples:
>   - Microport UNIX System V/386 v 2.1 (ca 1987)
>     (The main problem I'm fixing: Without this patch, it panics
>     sporadically when accessing the hard disk.)
>   - AT&T UNIX System V/386 Release 4.0 Version 2.1a (ca 1991)
>     See screenshot in "QEMU Official OS Support List":
>     http://www.claunia.com/qemu/objectManager.php?sClass=application&iId=9
>     (I don't have this system to test.)
>   - A report about OS/2 boot lockup from 2004 by Hampa Hug:
>     http://lists.nongnu.org/archive/html/qemu-devel/2004-09/msg00367.html
>     (My patch was partially inspired by his.)
>     Also: http://lists.nongnu.org/archive/html/qemu-devel/2005-06/msg00243.html
>     (I don't have this system to test.)
> 
> Signed-off-by: Matthew Ogilvie <mmogilvi_qemu@miniinfo.net>
> ---
> 
> Note: checkpatch.pl gives an error about initializing the global
> "int no_spurious_interrupt_hack = 0;", even though existing lines
> near it are doing the same thing.  Should I give precedence to
> checkpatch.pl, or nearby code?
> 
> There was no version 1 of this patch; this was the last thing I had to
> work around to get UNIX running.
> 
> High level symptoms:
>    1. Despite using this UNIX system for nearly 10 years (ca 1987-1996)
>       on an early 80386, I don't remember ever seeing any crash like
>       this.  I vaguely remember I may have had one or two crashes for
>       which I don't have other explanations that perhaps could have
>       been this, but I don't remember the error messages to confirm it.
>    2. It is somewhat random when UNIX crashes when running in qemu.
>        - Sometimes it crashes the first time the floppy-based installer
>          tries to access the hard disk (partition table?).
>        - Other times (though fairly rarely), it actually finishes
>          formatting and copying the first disk's files to the
>          hard disk without crashing.
>        - On the other hand, I've never seen it successfully boot from
>          the hard disk without this patch.  An attempt to boot from
>          the hard drive always panics quite early.
>    3. I tried -win2k-hack instead, thinking maybe the hard disk is just
>       responding faster than UNIX expected.  But it doesn't seem
>       to have any effect.  UNIX still panics sporadically the same way.
>        - TANGENT: I was going to see if my patch provides an
>          alternative fix for installing Windows 2000, but
>          I was unable to reproduce the original -win2k-hack problem at
>          all (with neither -win2k-hack NOR this patch).  Maybe
>          some other change has fixed it some other way?  Or maybe
>          it is only an issue in configurations I didn't test?
>          (KVM instead of TCG?  Less RAM?  Something else?)
>             It might be worth doing a little more investigation,
>          and eliminating the -win2k-hack option if appropriate.
>    4. If I enable KVM, I get a different error very early in
>       bootup (in splx function instead of splint), and this patch
>       doesn't help.
> 
> ============
> My low level analysis of what is going on:
> 
> It is hard to track down all the details, but based on logging a
> lot of qemu IRQ stuff, and setting a breakpoint in the earliest
> panic-related UNIX function using gdb, it looks like:
> 
>    1. It is near the end of servicing a previous IRQ14 from the
>       hard disk.
>    2. The processor has interrupts disabled (I think), while UNIX
>       clears the slave 8259's IMR (mask) register (sets it to 0), allowing
>       all interrupts to be passed on to the master.
>    3. While in that state, IRQ14 is raised (on the slave), which
>       gets propagated to the master (IRQ2), but the CPU
>       is not interrupted yet.
>    4. UNIX then masks the slave 8259's IMR register
>       completely (sets to 0xff).
>    5. Because the master elcr register is set (by BIOS; UNIX never
>       touches it) to edge trigger for IRQ2, the master latched on
>       to IRQ2 earlier, and continues to assert the processor's INT line
>       (the env->interrupt_request&CPU_INTERRUPT_HARD bit) even
>       after all slave IRQs have been masked off (clearing the input
>       IRQ2).
>    6. Finally, UNIX enables CPU interrupts and the interrupt is delivered
>       to the CPU, which ends up as a spurious IRQ15 due to the
>       slave's imr register.  UNIX doesn't know what to do with
>       that, and panics/halts.
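
The six steps above can be replayed in a small toy model. The following C sketch is a simplified illustration under stated assumptions, not QEMU's i8259 code: the names `ToyPic`, `toy_set_edge`, `toy_cascade`, `toy_ack`, and `toy_replay_sequence` are hypothetical, in-service tracking and priority handling are left out, and the master is assumed to have IRQ2 unmasked and edge triggered.

```c
#include <stdint.h>

/* Toy model of one 8259 PIC.  Hypothetical names; this is not QEMU's code. */
typedef struct {
    uint8_t irr;      /* latched interrupt requests */
    uint8_t imr;      /* interrupt mask register (1 = masked) */
    uint8_t last_irr; /* last seen input levels, used for edge detection */
} ToyPic;

/* Lowest-numbered pending and unmasked input, or -1 if there is none. */
static int toy_pending(const ToyPic *p)
{
    uint8_t active = (uint8_t)(p->irr & ~p->imr);
    for (int i = 0; i < 8; i++) {
        if (active & (1u << i)) {
            return i;
        }
    }
    return -1;
}

/* Edge-triggered input: a rising edge latches the request; a falling
 * edge only updates last_irr and leaves the latch set. */
static void toy_set_edge(ToyPic *p, int line, int level)
{
    uint8_t bit = (uint8_t)(1u << line);
    if (level) {
        if (!(p->last_irr & bit)) {
            p->irr |= bit;
        }
        p->last_irr |= bit;
    } else {
        p->last_irr &= (uint8_t)~bit;
    }
}

/* The slave's output drives the master's IRQ2 input. */
static void toy_cascade(ToyPic *master, const ToyPic *slave)
{
    toy_set_edge(master, 2, toy_pending(slave) >= 0);
}

/* Interrupt acknowledge: returns the IRQ number the CPU sees (0..15),
 * or -1 if the master has nothing pending. */
static int toy_ack(ToyPic *master, ToyPic *slave)
{
    int irq = toy_pending(master);
    if (irq < 0) {
        return -1;
    }
    master->irr &= (uint8_t)~(1u << irq);
    if (irq != 2) {
        return irq;
    }
    int irq2 = toy_pending(slave);
    if (irq2 < 0) {
        return 8 + 7;   /* nothing deliverable on the slave: spurious IRQ15 */
    }
    slave->irr &= (uint8_t)~(1u << irq2);
    return 8 + irq2;
}

/* Replays steps 2 to 6 of the analysis with CPU interrupts "disabled"
 * (no acknowledge) until the very end. */
int toy_replay_sequence(void)
{
    ToyPic master = {0, 0, 0};
    ToyPic slave = {0, 0, 0};

    slave.imr = 0x00;                 /* step 2: slave mask cleared */
    toy_cascade(&master, &slave);
    toy_set_edge(&slave, 6, 1);       /* step 3: IRQ14 is slave input 6 */
    toy_cascade(&master, &slave);     /* master latches IRQ2 */
    slave.imr = 0xff;                 /* step 4: slave fully masked */
    toy_cascade(&master, &slave);     /* step 5: IRQ2 input drops, latch stays */
    return toy_ack(&master, &slave);  /* step 6: CPU finally acknowledges */
}
```

Under these assumptions `toy_replay_sequence()` returns 15: the master's latched IRQ2 survives the slave being masked, and the acknowledge then finds nothing deliverable on the slave.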
> 
> I'm not sure why it only sporadically hits this sequence of events.
> There doesn't seem to be other IRQs asserted or serviced anywhere
> in the near past; the last several were all IRQ14's.  But I can't
> help feeling I'm not reading the log output correctly or something,
> because that doesn't make sense.  Maybe there is some kind
> of a-few-instructions delay before a CPU interrupt is actually
> delivered after interrupts are enabled, or some delay in raising
> IRQ14 after a hard drive operation is requested, and such delays
> need to fall into a narrow window of opportunity left by UNIX?
> 
> I can get a disassembly of the UNIX kernel using a "coff"-enabled
> build of GNU objdump, giving function names but not much else.
> But I haven't studied it in enough detail to actually find the
> relevant code path that is manipulating imr as described above.
> However, this old post outlines some of the high level theory
> of UNIX spl*() functions:
> http://www.linuxmisc.com/29-unix-internals/4e6c1f6fa2e41670.htm
> 
> If anyone wants to look into this further, I can provide access to the
> initial boot install floppy, at least.  Email me.  (Without the rest
> of the install disks, it isn't much use for anything except testing
> virtual machines like qemu against rare corner cases...)
> 
> ============
> Alternative Approaches:
> 
> An alternative to this patch that might work (I haven't tried) would
> be to have BIOS set the master's elcr register 0x04 bit, making IRQ2
> level triggered instead of edge triggered.  I'm not sure what other
> effects this might have.  Maybe it would actually be a more accurate
> model (I haven't checked documentation; maybe "slave mode" of an
> IRQ line into the master is supposed to be level triggered?)
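
The difference between the two trigger modes can be illustrated with a toy model of the master's IRQ2 input. This C sketch is an illustration under stated assumptions, not QEMU's code: `ToyMaster`, `toy_master_set_irq`, and `toy_irq2_latched_after_pulse` are hypothetical names, and the sketch takes no position on whether real chipsets permit setting the ELCR bit for IRQ2.

```c
#include <stdint.h>

/* Toy model of the master PIC's request latch.  Hypothetical names. */
typedef struct {
    uint8_t irr;      /* latched interrupt requests */
    uint8_t last_irr; /* last seen input levels, used for edge detection */
    uint8_t elcr;     /* one bit per line: 1 = level triggered, 0 = edge */
} ToyMaster;

static void toy_master_set_irq(ToyMaster *m, int line, int level)
{
    uint8_t bit = (uint8_t)(1u << line);
    if (m->elcr & bit) {
        /* Level triggered: the request simply follows the input. */
        if (level) {
            m->irr |= bit;
            m->last_irr |= bit;
        } else {
            m->irr &= (uint8_t)~bit;
            m->last_irr &= (uint8_t)~bit;
        }
    } else {
        /* Edge triggered: a rising edge latches the request and a
         * falling edge does not clear it. */
        if (level) {
            if (!(m->last_irr & bit)) {
                m->irr |= bit;
            }
            m->last_irr |= bit;
        } else {
            m->last_irr &= (uint8_t)~bit;
        }
    }
}

/* Raise IRQ2 and drop it again before any acknowledge, as happens when
 * the slave is masked.  Returns 1 if the request is still latched. */
int toy_irq2_latched_after_pulse(uint8_t elcr)
{
    ToyMaster m = {0, 0, elcr};
    toy_master_set_irq(&m, 2, 1);
    toy_master_set_irq(&m, 2, 0);
    return (m.irr & 0x04) != 0;
}
```

In this model `toy_irq2_latched_after_pulse(0x00)` returns 1 (edge mode keeps the request, which is what later surfaces as the spurious interrupt), while `toy_irq2_latched_after_pulse(0x04)` returns 0 (level mode drops it with the input).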
> 
> Or perhaps find a way to model the minimum timescale that an interrupt
> request needs to be active to be recognized?
> 
> Or maybe my analysis isn't correct; I wasn't able to find the
> relevant code path in the UNIX kernel.
> 
> ============
> 
>  cpu-exec.c      | 12 +++++++-----
>  hw/i8259.c      | 18 ++++++++++++++++++
>  qemu-options.hx | 12 ++++++++++++
>  sysemu.h        |  1 +
>  vl.c            |  4 ++++
>  5 files changed, 42 insertions(+), 5 deletions(-)
> 
> diff --git a/cpu-exec.c b/cpu-exec.c
> index 134b3c4..c309847 100644
> --- a/cpu-exec.c
> +++ b/cpu-exec.c
> @@ -329,11 +329,15 @@ int cpu_exec(CPUArchState *env)
>                                                            0);
>                              env->interrupt_request &= ~(CPU_INTERRUPT_HARD | CPU_INTERRUPT_VIRQ);
>                              intno = cpu_get_pic_interrupt(env);
> -                            qemu_log_mask(CPU_LOG_TB_IN_ASM, "Servicing hardware INT=0x%02x\n", intno);
> -                            do_interrupt_x86_hardirq(env, intno, 1);
> -                            /* ensure that no TB jump will be modified as
> -                               the program flow was changed */
> -                            next_tb = 0;
> +                            if (intno >= 0) {
> +                                qemu_log_mask(CPU_LOG_TB_IN_ASM,
> +                                              "Servicing hardware INT=0x%02x\n",
> +                                              intno);
> +                                do_interrupt_x86_hardirq(env, intno, 1);
> +                                /* ensure that no TB jump will be modified as
> +                                   the program flow was changed */
> +                                next_tb = 0;
> +                            }
>  #if !defined(CONFIG_USER_ONLY)
>                          } else if ((interrupt_request & CPU_INTERRUPT_VIRQ) &&
>                                     (env->eflags & IF_MASK) && 
> diff --git a/hw/i8259.c b/hw/i8259.c
> index 6587666..7ecb7e1 100644
> --- a/hw/i8259.c
> +++ b/hw/i8259.c
> @@ -26,6 +26,7 @@
>  #include "isa.h"
>  #include "monitor.h"
>  #include "qemu-timer.h"
> +#include "sysemu.h"
>  #include "i8259_internal.h"
>  
>  /* debug PIC */
> @@ -193,6 +194,20 @@ int pic_read_irq(DeviceState *d)
>                  pic_intack(slave_pic, irq2);
>              } else {
>                  /* spurious IRQ on slave controller */
> +                if (no_spurious_interrupt_hack) {
> +                    /* Pretend it was delivered and acknowledged.  If
> +                     * it was spurious due to slave_pic->imr, then
> +                     * as soon as the mask is cleared, the slave will
> +                     * re-trigger IRQ2 on the master.  If it is spurious for
> +                     * some other reason, make sure we don't keep trying
> +                     * to half-process the same spurious interrupt over
> +                     * and over again.
> +                     */
> +                    s->irr &= ~(1<<irq);
> +                    s->last_irr &= ~(1<<irq);
> +                    s->isr &= ~(1<<irq);
> +                    return -1;
> +                }
>                  irq2 = 7;
>              }
>              intno = slave_pic->irq_base + irq2;
> @@ -202,6 +217,9 @@ int pic_read_irq(DeviceState *d)
>          pic_intack(s, irq);
>      } else {
>          /* spurious IRQ on host controller */
> +        if (no_spurious_interrupt_hack) {
> +            return -1;
> +        }
>          irq = 7;
>          intno = s->irq_base + irq;
>      }
> diff --git a/qemu-options.hx b/qemu-options.hx
> index 03e13ec..57bb0b4 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -1188,6 +1188,18 @@ Windows 2000 is installed, you no longer need this option (this option
>  slows down the IDE transfers).
>  ETEXI
>  
> +DEF("no-spurious-interrupt-hack", 0, QEMU_OPTION_no_spurious_interrupt_hack,
> +    "-no-spurious-interrupt-hack     disable delivery of spurious interrupts\n",
> +    QEMU_ARCH_I386)
> +STEXI
> +@item -no-spurious-interrupt-hack
> +@findex -no-spurious-interrupt-hack
> +Use it as a workaround for operating systems that drive PICs in a way that
> +can generate spurious interrupts, but the OS doesn't handle spurious
> +interrupts gracefully.  (e.g. late 80s/early 90s versions of ATT UNIX
> +and derivatives)

Has to mention or even actively warn that it doesn't work with KVM and
its in-kernel irqchip (as that PIC model lacks your hack).

However, I strongly suspect you are nastily papering over an issue in
some device model. So I would prefer to dig deeper before installing
this in upstream (also due to its dependency on the userspace PIC model).

Jan
Matthew Ogilvie Aug. 24, 2012, 8:05 a.m. UTC | #2
On Fri, Aug 24, 2012 at 07:40:36AM +0200, Jan Kiszka wrote:
> On 2012-08-23 08:24, Matthew Ogilvie wrote:
> > This patch provides a way to optionally suppress spurious interrupts,

[snip]

> > I'm not sure why it only sporadically hits this sequence of events.
> > There doesn't seem to be other IRQs asserted or serviced anywhere
> > in the near past; the last several were all IRQ14's.  But I can't
> > help feeling I'm not reading the log output correctly or something,
> > because that doesn't make sense.  Maybe there is some kind
> > of a-few-instructions delay before a CPU interrupt is actually
> > delivered after interrupts are enabled, or some delay in raising
> > IRQ14 after a hard drive operation is requested, and such delays
> > need to fall into a narrow window of opportunity left by UNIX?
> > 
> > I can get a disassembly of the UNIX kernel using a "coff"-enabled
> > build of GNU objdump, giving function names but not much else.
> > But I haven't studied it in enough detail to actually find the
> > relevant code path that is manipulating imr as described above.
> > However, this old post outlines some of the high level theory
> > of UNIX spl*() functions:
> > http://www.linuxmisc.com/29-unix-internals/4e6c1f6fa2e41670.htm
> > 
> > If anyone wants to look into this further, I can provide access to the
> > initial boot install floppy, at least.  Email me.  (Without the rest
> > of the install disks, it isn't much use for anything except testing
> > virtual machines like qemu against rare corner cases...)
> > 
> > ============
> > Alternative Approaches:
> > 
> > An alternative to this patch that might work (I haven't tried) would
> > be to have BIOS set the master's elcr register 0x04 bit, making IRQ2
> > level triggered instead of edge triggered.  I'm not sure what other
> > effects this might have.  Maybe it would actually be a more accurate
> > model (I haven't checked documentation; maybe "slave mode" of an
> > IRQ line into the master is supposed to be level triggered?)
> > 
> > Or perhaps find a way to model the minimum timescale that an interrupt
> > request needs to be active to be recognized?
> > 
> > Or maybe my analysis isn't correct; I wasn't able to find the
> > relevant code path in the UNIX kernel.

[snip]

> 
> Has to mention or even actively warn that it doesn't work with KVM and
> its in-kernel irqchip (as that PIC model lacks your hack).

I'll make an incremental patch to the documentation soon.

> 
> However, I strongly suspect you are nastily papering over an issue in
> some device model. So I would prefer to dig deeper before installing
> this in upstream (also due to its dependency on the userspace PIC model).

This is certainly possible.  I'm not an expert on the whole interrupt
subsystem design in a PC.  But other than the wild speculation above
(making IRQ2 level triggered via elcr, or some kind of timing preventing the
edge triggering from catching a very short blip), I'm not sure what
to look for.

- Matthew Ogilvie
Jan Kiszka Aug. 24, 2012, 8:16 a.m. UTC | #3
On 2012-08-24 10:05, Matthew Ogilvie wrote:
> On Fri, Aug 24, 2012 at 07:40:36AM +0200, Jan Kiszka wrote:
>> On 2012-08-23 08:24, Matthew Ogilvie wrote:
>>> This patch provides a way to optionally suppress spurious interrupts,
> 
> [snip]
> 
>>> I'm not sure why it only sporadically hits this sequence of events.
>>> There doesn't seem to be other IRQs asserted or serviced anywhere
>>> in the near past; the last several were all IRQ14's.  But I can't
>>> help feeling I'm not reading the log output correctly or something,
>>> because that doesn't make sense.  Maybe there is some kind
>>> of a-few-instructions delay before a CPU interrupt is actually
>>> delivered after interrupts are enabled, or some delay in raising
>>> IRQ14 after a hard drive operation is requested, and such delays
>>> need to fall into a narrow window of opportunity left by UNIX?
>>>
>>> I can get a disassembly of the UNIX kernel using a "coff"-enabled
>>> build of GNU objdump, giving function names but not much else.
>>> But I haven't studied it in enough detail to actually find the
>>> relevant code path that is manipulating imr as described above.
>>> However, this old post outlines some of the high level theory
>>> of UNIX spl*() functions:
>>> http://www.linuxmisc.com/29-unix-internals/4e6c1f6fa2e41670.htm
>>>
>>> If anyone wants to look into this further, I can provide access to the
>>> initial boot install floppy, at least.  Email me.  (Without the rest
>>> of the install disks, it isn't much use for anything except testing
>>> virtual machines like qemu against rare corner cases...)
>>>
>>> ============
>>> Alternative Approaches:
>>>
>>> An alternative to this patch that might work (I haven't tried) would
>>> be to have BIOS set the master's elcr register 0x04 bit, making IRQ2
>>> level triggered instead of edge triggered.  I'm not sure what other
>>> effects this might have.  Maybe it would actually be a more accurate
>>> model (I haven't checked documentation; maybe "slave mode" of an
>>> IRQ line into the master is supposed to be level triggered?)
>>>
>>> Or perhaps find a way to model the minimum timescale that an interrupt
>>> request needs to be active to be recognized?
>>>
>>> Or maybe my analysis isn't correct; I wasn't able to find the
>>> relevant code path in the UNIX kernel.
> 
> [snip]
> 
>>
>> Has to mention or even actively warn that it doesn't work with KVM and
>> its in-kernel irqchip (as that PIC model lacks your hack).
> 
> I'll make an incremental patch to the documentation soon.
> 
>>
>> However, I strongly suspect you are nastily papering over an issue in
>> some device model. So I would prefer to dig deeper before installing
>> this in upstream (also due to its dependency on the userspace PIC model).
> 
> This is certainly possible.  I'm not an expert on the whole interrupt
> subsystem design in a PC.  But other than the wild speculation above
> (making IRQ2 level triggered via elcr, or some kind of timing preventing the
> edge triggering from catching a very short blip), I'm not sure what
> to look for.

Sorry, but as long as we do not understand the problem better, I'm
against such a hack option in upstream. These things have barely any
users and can therefore easily break as no one tests them when
refactoring code. Not to speak of how this is introduced (new top-level
command line switches are "out of fashion").

Jan
Anthony Liguori Aug. 27, 2012, 1:55 p.m. UTC | #4
Matthew Ogilvie <mmogilvi_qemu@miniinfo.net> writes:

> This patch provides a way to optionally suppress spurious interrupts,
> as a workaround for systems described below:
>
> Some old operating systems do not handle spurious interrupts well,
> and qemu tends to generate them significantly more often than
> real hardware.

This is the wrong approach.  You add a LostTickPolicy property to the
i8259 device.

Regards,

Anthony Liguori

>
> Examples:
>   - Microport UNIX System V/386 v 2.1 (ca 1987)
>     (The main problem I'm fixing: Without this patch, it panics
>     sporadically when accessing the hard disk.)
>   - AT&T UNIX System V/386 Release 4.0 Version 2.1a (ca 1991)
>     See screenshot in "QEMU Official OS Support List":
>     http://www.claunia.com/qemu/objectManager.php?sClass=application&iId=9
>     (I don't have this system to test.)
>   - A report about OS/2 boot lockup from 2004 by Hampa Hug:
>     http://lists.nongnu.org/archive/html/qemu-devel/2004-09/msg00367.html
>     (My patch was partially inspired by his.)
>     Also: http://lists.nongnu.org/archive/html/qemu-devel/2005-06/msg00243.html
>     (I don't have this system to test.)
>
> Signed-off-by: Matthew Ogilvie <mmogilvi_qemu@miniinfo.net>
> ---
>
> Note: checkpatch.pl gives an error about initializing the global
> "int no_spurious_interrupt_hack = 0;", even though existing lines
> near it are doing the same thing.  Should I give precedence to
> checkpatch.pl, or nearby code?
>
> There was no version 1 of this patch; this was the last thing I had to
> work around to get UNIX running.
>
> High level symptoms:
>    1. Despite using this UNIX system for nearly 10 years (ca 1987-1996)
>       on an early 80386, I don't remember ever seeing any crash like
>       this.  I vaguely remember I may have had one or two crashes for
>       which I don't have other explanations that perhaps could have
>       been this, but I don't remember the error messages to confirm it.
>    2. It is somewhat random when UNIX crashes when running in qemu.
>        - Sometimes it crashes the first time the floppy-based installer
>          tries to access the hard disk (partition table?).
>        - Other times (though fairly rarely), it actually finishes
>          formatting and copying the first disk's files to the
>          hard disk without crashing.
>        - On the other hand, I've never seen it successfully boot from
>          the hard disk without this patch.  An attempt to boot from
>          the hard drive always panics quite early.
>    3. I tried -win2k-hack instead, thinking maybe the hard disk is just
>       responding faster than UNIX expected.  But it doesn't seem
>       to have any effect.  UNIX still panics sporadically the same way.
>        - TANGENT: I was going to see if my patch provides an
>          alternative fix for installing Windows 2000, but
>          I was unable to reproduce the original -win2k-hack problem at
>          all (with neither -win2k-hack NOR this patch).  Maybe
>          some other change has fixed it some other way?  Or maybe
>          it is only an issue in configurations I didn't test?
>          (KVM instead of TCG?  Less RAM?  Something else?)
>             It might be worth doing a little more investigation,
>          and eliminating the -win2k-hack option if appropriate.
>    4. If I enable KVM, I get a different error very early in
>       bootup (in splx function instead of splint), and this patch
>       doesn't help.
>
> ============
> My low level analysis of what is going on:
>
> It is hard to track down all the details, but based on logging a
> lot of qemu IRQ stuff, and setting a breakpoint in the earliest
> panic-related UNIX function using gdb, it looks like:
>
>    1. It is near the end of servicing a previous IRQ14 from the
>       hard disk.
>    2. The processor has interrupts disabled (I think), while UNIX
>       clears the slave 8259's IMR (mask) register (sets it to 0), allowing
>       all interrupts to be passed on to the master.
>    3. While in that state, IRQ14 is raised (on the slave), which
>       gets propagated to the master (IRQ2), but the CPU
>       is not interrupted yet.
>    4. UNIX then masks the slave 8259's IMR register
>       completely (sets to 0xff).
>    5. Because the master elcr register is set (by BIOS; UNIX never
>       touches it) to edge trigger for IRQ2, the master latched on
>       to IRQ2 earlier, and continues to assert the processor's INT line
>       (the env->interrupt_request&CPU_INTERRUPT_HARD bit) even
>       after all slave IRQs have been masked off (clearing the input
>       IRQ2).
>    6. Finally, UNIX enables CPU interrupts and the interrupt is delivered
>       to the CPU, which ends up as a spurious IRQ15 due to the
>       slave's imr register.  UNIX doesn't know what to do with
>       that, and panics/halts.
>
> I'm not sure why it only sporadically hits this sequence of events.
> There doesn't seem to be other IRQs asserted or serviced anywhere
> in the near past; the last several were all IRQ14's.  But I can't
> help feeling I'm not reading the log output correctly or something,
> because that doesn't make sense.  Maybe there is some kind
> of a-few-instructions delay before a CPU interrupt is actually
> delivered after interrupts are enabled, or some delay in raising
> IRQ14 after a hard drive operation is requested, and such delays
> need to fall into a narrow window of opportunity left by UNIX?
>
> I can get a disassembly of the UNIX kernel using a "coff"-enabled
> build of GNU objdump, giving function names but not much else.
> But I haven't studied it in enough detail to actually find the
> relevant code path that is manipulating imr as described above.
> However, this old post outlines some of the high level theory
> of UNIX spl*() functions:
> http://www.linuxmisc.com/29-unix-internals/4e6c1f6fa2e41670.htm
>
> If anyone wants to look into this further, I can provide access to the
> initial boot install floppy, at least.  Email me.  (Without the rest
> of the install disks, it isn't much use for anything except testing
> virtual machines like qemu against rare corner cases...)
>
> ============
> Alternative Approaches:
>
> An alternative to this patch that might work (I haven't tried) would
> be to have BIOS set the master's elcr register 0x04 bit, making IRQ2
> level triggered instead of edge triggered.  I'm not sure what other
> effects this might have.  Maybe it would actually be a more accurate
> model (I haven't checked documentation; maybe "slave mode" of an
> IRQ line into the master is supposed to be level triggered?)
>
> Or perhaps find a way to model the minimum timescale that an interrupt
> request needs to be active to be recognized?
>
> Or maybe my analysis isn't correct; I wasn't able to find the
> relevant code path in the UNIX kernel.
>
> ============
>
>  cpu-exec.c      | 12 +++++++-----
>  hw/i8259.c      | 18 ++++++++++++++++++
>  qemu-options.hx | 12 ++++++++++++
>  sysemu.h        |  1 +
>  vl.c            |  4 ++++
>  5 files changed, 42 insertions(+), 5 deletions(-)
>
> diff --git a/cpu-exec.c b/cpu-exec.c
> index 134b3c4..c309847 100644
> --- a/cpu-exec.c
> +++ b/cpu-exec.c
> @@ -329,11 +329,15 @@ int cpu_exec(CPUArchState *env)
>                                                            0);
>                              env->interrupt_request &= ~(CPU_INTERRUPT_HARD | CPU_INTERRUPT_VIRQ);
>                              intno = cpu_get_pic_interrupt(env);
> -                            qemu_log_mask(CPU_LOG_TB_IN_ASM, "Servicing hardware INT=0x%02x\n", intno);
> -                            do_interrupt_x86_hardirq(env, intno, 1);
> -                            /* ensure that no TB jump will be modified as
> -                               the program flow was changed */
> -                            next_tb = 0;
> +                            if (intno >= 0) {
> +                                qemu_log_mask(CPU_LOG_TB_IN_ASM,
> +                                              "Servicing hardware INT=0x%02x\n",
> +                                              intno);
> +                                do_interrupt_x86_hardirq(env, intno, 1);
> +                                /* ensure that no TB jump will be modified as
> +                                   the program flow was changed */
> +                                next_tb = 0;
> +                            }
>  #if !defined(CONFIG_USER_ONLY)
>                          } else if ((interrupt_request & CPU_INTERRUPT_VIRQ) &&
>                                     (env->eflags & IF_MASK) && 
> diff --git a/hw/i8259.c b/hw/i8259.c
> index 6587666..7ecb7e1 100644
> --- a/hw/i8259.c
> +++ b/hw/i8259.c
> @@ -26,6 +26,7 @@
>  #include "isa.h"
>  #include "monitor.h"
>  #include "qemu-timer.h"
> +#include "sysemu.h"
>  #include "i8259_internal.h"
>  
>  /* debug PIC */
> @@ -193,6 +194,20 @@ int pic_read_irq(DeviceState *d)
>                  pic_intack(slave_pic, irq2);
>              } else {
>                  /* spurious IRQ on slave controller */
> +                if (no_spurious_interrupt_hack) {
> +                    /* Pretend it was delivered and acknowledged.  If
> +                     * it was spurious due to slave_pic->imr, then
> +                     * as soon as the mask is cleared, the slave will
> +                     * re-trigger IRQ2 on the master.  If it is spurious for
> +                     * some other reason, make sure we don't keep trying
> +                     * to half-process the same spurious interrupt over
> +                     * and over again.
> +                     */
> +                    s->irr &= ~(1<<irq);
> +                    s->last_irr &= ~(1<<irq);
> +                    s->isr &= ~(1<<irq);
> +                    return -1;
> +                }
>                  irq2 = 7;
>              }
>              intno = slave_pic->irq_base + irq2;
> @@ -202,6 +217,9 @@ int pic_read_irq(DeviceState *d)
>          pic_intack(s, irq);
>      } else {
>          /* spurious IRQ on host controller */
> +        if (no_spurious_interrupt_hack) {
> +            return -1;
> +        }
>          irq = 7;
>          intno = s->irq_base + irq;
>      }
> diff --git a/qemu-options.hx b/qemu-options.hx
> index 03e13ec..57bb0b4 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -1188,6 +1188,18 @@ Windows 2000 is installed, you no longer need this option (this option
>  slows down the IDE transfers).
>  ETEXI
>  
> +DEF("no-spurious-interrupt-hack", 0, QEMU_OPTION_no_spurious_interrupt_hack,
> +    "-no-spurious-interrupt-hack     disable delivery of spurious interrupts\n",
> +    QEMU_ARCH_I386)
> +STEXI
> +@item -no-spurious-interrupt-hack
> +@findex -no-spurious-interrupt-hack
> +Use it as a workaround for operating systems that drive PICs in a way that
> +can generate spurious interrupts, but the OS doesn't handle spurious
> +interrupts gracefully.  (e.g. late 80s/early 90s versions of ATT UNIX
> +and derivatives)
> +ETEXI
> +
>  HXCOMM Deprecated by -rtc
>  DEF("rtc-td-hack", 0, QEMU_OPTION_rtc_td_hack, "", QEMU_ARCH_I386)
>  
> diff --git a/sysemu.h b/sysemu.h
> index 65552ac..0170109 100644
> --- a/sysemu.h
> +++ b/sysemu.h
> @@ -117,6 +117,7 @@ extern int graphic_depth;
>  extern DisplayType display_type;
>  extern const char *keyboard_layout;
>  extern int win2k_install_hack;
> +extern int no_spurious_interrupt_hack;
>  extern int alt_grab;
>  extern int ctrl_grab;
>  extern int usb_enabled;
> diff --git a/vl.c b/vl.c
> index 16d04a2..6de41c1 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -204,6 +204,7 @@ CharDriverState *serial_hds[MAX_SERIAL_PORTS];
>  CharDriverState *parallel_hds[MAX_PARALLEL_PORTS];
>  CharDriverState *virtcon_hds[MAX_VIRTIO_CONSOLES];
>  int win2k_install_hack = 0;
> +int no_spurious_interrupt_hack = 0;
>  int usb_enabled = 0;
>  int singlestep = 0;
>  int smp_cpus = 1;
> @@ -3046,6 +3047,9 @@ int main(int argc, char **argv, char **envp)
>              case QEMU_OPTION_win2k_hack:
>                  win2k_install_hack = 1;
>                  break;
> +            case QEMU_OPTION_no_spurious_interrupt_hack:
> +                no_spurious_interrupt_hack = 1;
> +                break;
>              case QEMU_OPTION_rtc_td_hack: {
>                  static GlobalProperty slew_lost_ticks[] = {
>                      {
> -- 
> 1.7.10.2.484.gcd07cc5
Paolo Bonzini Aug. 27, 2012, 2:23 p.m. UTC | #5
Il 27/08/2012 15:55, Anthony Liguori ha scritto:
>> > This patch provides a way to optionally suppress spurious interrupts,
>> > as a workaround for systems described below:
>> >
>> > Some old operating systems do not handle spurious interrupts well,
>> > and qemu tends to generate them significantly more often than
>> > real hardware.
> This is the wrong approach.  You add a LostTickPolicy property to the
> i8259 device.

Isn't the i8254 the one that would need a LostTickPolicy?

But this seems like a bug that is either in the i8259 emulation, or in
the firmware.  Your own suggestion of setting IRQ2 to level-triggered in
SeaBIOS is definitely a good one.

Paolo
Anthony Liguori Aug. 27, 2012, 3:50 p.m. UTC | #6
Paolo Bonzini <pbonzini@redhat.com> writes:

> Il 27/08/2012 15:55, Anthony Liguori ha scritto:
>>> > This patch provides a way to optionally suppress spurious interrupts,
>>> > as a workaround for systems described below:
>>> >
>>> > Some old operating systems do not handle spurious interrupts well,
>>> > and qemu tends to generate them significantly more often than
>>> > real hardware.
>> This is the wrong approach.  You add a LostTickPolicy property to the
>> i8259 device.
>
> Isn't the i8254 the one that would need a LostTickPolicy?

You're right, I too quickly read this patch and misunderstood what it's
doing.

Regards,

Anthony Liguori

>
> But this seems like a bug that is either in the i8259 emulation, or in
> the firmware.  Your own suggestion of setting IRQ2 to level-triggered in
> SeaBIOS is definitely a good one.

>
> Paolo

Patch

============
My low level analysis of what is going on:

It is hard to track down all the details, but based on logging a
lot of qemu IRQ stuff, and setting a breakpoint in the earliest
panic-related UNIX function using gdb, it looks like:

   1. It is near the end of servicing a previous IRQ14 from the
      hard disk.
   2. The processor has interrupts disabled (I think), while UNIX
      clears the slave 8259's IMR (mask) register (sets it to 0), allowing
      all interrupts to be passed on to the master.
   3. While in that state, IRQ14 is raised (on the slave), which
      gets propagated to the master (IRQ2), but the CPU
      is not interrupted yet.
   4. UNIX then masks the slave 8259's IMR register
      completely (sets to 0xff).
   5. Because the master elcr register is set (by BIOS; UNIX never
      touches it) to edge trigger for IRQ2, the master latched on
      to IRQ2 earlier, and continues to assert the processor's INT line
      (the env->interrupt_request&CPU_INTERRUPT_HARD bit) even
      after all slave IRQs have been masked off (clearing the input
      IRQ2).
   6. Finally, UNIX enables CPU interrupts and the interrupt is delivered
      to the CPU, which ends up as a spurious IRQ15 due to the
      slave's imr register.  UNIX doesn't know what to do with
      that, and panics/halts.
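
The sequence in steps 2 through 6 can be replayed with a toy model of
the cascade.  This is only a sketch, not QEMU code: the ToyPic struct
and the toy_*() helpers are hypothetical names invented for the
illustration, in-service (ISR) priority is ignored, and the master's
IRQ2 input is assumed to be edge triggered as described in step 5.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy 8259 state.  Field names echo hw/i8259.c, but this is not QEMU code. */
typedef struct ToyPic {
    uint8_t irr;      /* latched interrupt requests */
    uint8_t imr;      /* interrupt mask register: 1 = masked */
    uint8_t last_irr; /* previous input levels, used for edge detection */
} ToyPic;

/* Edge-triggered input: latch on a rising edge.  A falling edge only
 * re-arms the detector; it does not clear the latched request. */
static void toy_set_irq_edge(ToyPic *p, int irq, bool level)
{
    uint8_t mask = (uint8_t)(1u << irq);

    if (level) {
        if (!(p->last_irr & mask)) {
            p->irr |= mask;
        }
        p->last_irr |= mask;
    } else {
        p->last_irr &= (uint8_t)~mask;
    }
}

/* Lowest-numbered unmasked request, or -1 if there is none. */
static int toy_pending(const ToyPic *p)
{
    uint8_t pending = (uint8_t)(p->irr & ~p->imr);

    for (int i = 0; i < 8; i++) {
        if (pending & (1u << i)) {
            return i;
        }
    }
    return -1;
}

/* Acknowledge cycle: IRQ number (0..15) the guest sees, or -1 if the
 * master has nothing pending. */
static int toy_ack(const ToyPic *master, const ToyPic *slave)
{
    int irq = toy_pending(master);

    if (irq != 2) {
        return irq;
    }
    int irq2 = toy_pending(slave);
    /* Nothing unmasked on the slave: reported as spurious IRQ15. */
    return 8 + (irq2 >= 0 ? irq2 : 7);
}

/* Replays steps 2-6 of the analysis above. */
static int toy_replay_steps(void)
{
    ToyPic master = {0}, slave = {0};

    slave.imr = 0x00;                       /* step 2: unmask slave */
    toy_set_irq_edge(&slave, 6, true);      /* step 3: IRQ14 raised */
    toy_set_irq_edge(&master, 2, toy_pending(&slave) >= 0);
    slave.imr = 0xff;                       /* step 4: mask slave */
    toy_set_irq_edge(&master, 2, toy_pending(&slave) >= 0);
    return toy_ack(&master, &slave);        /* steps 5-6: CPU acks */
}
```

In this model toy_replay_steps() returns 15: the master still has IRQ2
latched when the CPU acknowledges, but the slave has nothing unmasked
to offer.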

I'm not sure why it only sporadically hits this sequence of events.
There doesn't seem to be other IRQs asserted or serviced anywhere
in the near past; the last several were all IRQ14's.  But I can't
help feeling I'm not reading the log output correctly or something,
because that doesn't make sense.  Maybe there is some kind
of a-few-instructions delay before a CPU interrupt is actually
delivered after interrupts are enabled, or some delay in raising
IRQ14 after a hard drive operation is requested, and such delays
need to fall into a narrow window of opportunity left by UNIX?

I can get a disassembly of the UNIX kernel using a "coff"-enabled
build of GNU objdump, giving function names but not much else.
But I haven't studied it in enough detail to actually find the
relevant code path that is manipulating imr as described above.
However, this old post outlines some of the high level theory
of UNIX spl*() functions:
http://www.linuxmisc.com/29-unix-internals/4e6c1f6fa2e41670.htm

If anyone wants to look into this further, I can provide access to the
initial boot install floppy, at least.  Email me.  (Without the rest
of the install disks, it isn't much use for anything except testing
virtual machines like qemu against rare corner cases...)

============
Alternative Approaches:

An alternative to this patch that might work (I haven't tried) would
be to have BIOS set the master's elcr register 0x04 bit, making IRQ2
level triggered instead of edge triggered.  I'm not sure what other
effects this might have.  Maybe it would actually be a more accurate
model (I haven't checked documentation; maybe "slave mode" of an
IRQ line into the master is supposed to be level triggered?)
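
To illustrate why the elcr bit would matter, here is a minimal sketch
of how an input line is latched in the two modes.  The ToyInput struct
and the function names are hypothetical, made up for this example; the
sketch only models the latching rule and says nothing about whether a
real chipset or QEMU's elcr emulation permits setting that bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy input stage of one 8259 (hypothetical, not QEMU code). */
typedef struct ToyInput {
    uint8_t irr;      /* pending requests */
    uint8_t last_irr; /* previous input levels */
    uint8_t elcr;     /* per-line mode: 1 = level, 0 = edge */
} ToyInput;

static void toy_input_set(ToyInput *p, int irq, bool level)
{
    uint8_t mask = (uint8_t)(1u << irq);

    if (p->elcr & mask) {
        /* Level triggered: the request simply follows the line. */
        if (level) {
            p->irr |= mask;
            p->last_irr |= mask;
        } else {
            p->irr &= (uint8_t)~mask;
            p->last_irr &= (uint8_t)~mask;
        }
    } else {
        /* Edge triggered: a rising edge latches the request, and a
         * falling edge leaves it latched. */
        if (level) {
            if (!(p->last_irr & mask)) {
                p->irr |= mask;
            }
            p->last_irr |= mask;
        } else {
            p->last_irr &= (uint8_t)~mask;
        }
    }
}

/* Raise IRQ2 and drop it again before the CPU acknowledges; report
 * whether the master still has a request pending on that line. */
static bool toy_irq2_pending_after_pulse(uint8_t elcr)
{
    ToyInput master = { .elcr = elcr };

    toy_input_set(&master, 2, true);
    toy_input_set(&master, 2, false);
    return (master.irr & 0x04) != 0;
}
```

With elcr = 0x00 the request stays pending after the line drops, which
is the situation that leads to the spurious interrupt.  With elcr =
0x04 the request goes away together with the line, so there would be
nothing left to deliver.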

Or perhaps find a way to model the minimum timescale that an interrupt
request needs to be active to be recognized?

Or maybe my analysis isn't correct; I wasn't able to find the
relevant code path in the UNIX kernel.

============

 cpu-exec.c      | 12 +++++++-----
 hw/i8259.c      | 18 ++++++++++++++++++
 qemu-options.hx | 12 ++++++++++++
 sysemu.h        |  1 +
 vl.c            |  4 ++++
 5 files changed, 42 insertions(+), 5 deletions(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index 134b3c4..c309847 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -329,11 +329,15 @@  int cpu_exec(CPUArchState *env)
                                                           0);
                             env->interrupt_request &= ~(CPU_INTERRUPT_HARD | CPU_INTERRUPT_VIRQ);
                             intno = cpu_get_pic_interrupt(env);
-                            qemu_log_mask(CPU_LOG_TB_IN_ASM, "Servicing hardware INT=0x%02x\n", intno);
-                            do_interrupt_x86_hardirq(env, intno, 1);
-                            /* ensure that no TB jump will be modified as
-                               the program flow was changed */
-                            next_tb = 0;
+                            if (intno >= 0) {
+                                qemu_log_mask(CPU_LOG_TB_IN_ASM,
+                                              "Servicing hardware INT=0x%02x\n",
+                                              intno);
+                                do_interrupt_x86_hardirq(env, intno, 1);
+                                /* ensure that no TB jump will be modified as
+                                   the program flow was changed */
+                                next_tb = 0;
+                            }
 #if !defined(CONFIG_USER_ONLY)
                         } else if ((interrupt_request & CPU_INTERRUPT_VIRQ) &&
                                    (env->eflags & IF_MASK) && 
diff --git a/hw/i8259.c b/hw/i8259.c
index 6587666..7ecb7e1 100644
--- a/hw/i8259.c
+++ b/hw/i8259.c
@@ -26,6 +26,7 @@ 
 #include "isa.h"
 #include "monitor.h"
 #include "qemu-timer.h"
+#include "sysemu.h"
 #include "i8259_internal.h"
 
 /* debug PIC */
@@ -193,6 +194,20 @@  int pic_read_irq(DeviceState *d)
                 pic_intack(slave_pic, irq2);
             } else {
                 /* spurious IRQ on slave controller */
+                if (no_spurious_interrupt_hack) {
+                    /* Pretend it was delivered and acknowledged.  If
+                     * it was spurious due to slave_pic->imr, then
+                     * as soon as the mask is cleared, the slave will
+                     * re-trigger IRQ2 on the master.  If it is spurious for
+                     * some other reason, make sure we don't keep trying
+                     * to half-process the same spurious interrupt over
+                     * and over again.
+                     */
+                    s->irr &= ~(1<<irq);
+                    s->last_irr &= ~(1<<irq);
+                    s->isr &= ~(1<<irq);
+                    return -1;
+                }
                 irq2 = 7;
             }
             intno = slave_pic->irq_base + irq2;
@@ -202,6 +217,9 @@  int pic_read_irq(DeviceState *d)
         pic_intack(s, irq);
     } else {
         /* spurious IRQ on host controller */
+        if (no_spurious_interrupt_hack) {
+            return -1;
+        }
         irq = 7;
         intno = s->irq_base + irq;
     }
diff --git a/qemu-options.hx b/qemu-options.hx
index 03e13ec..57bb0b4 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -1188,6 +1188,18 @@  Windows 2000 is installed, you no longer need this option (this option
 slows down the IDE transfers).
 ETEXI
 
+DEF("no-spurious-interrupt-hack", 0, QEMU_OPTION_no_spurious_interrupt_hack,
+    "-no-spurious-interrupt-hack     disable delivery of spurious interrupts\n",
+    QEMU_ARCH_I386)
+STEXI
+@item -no-spurious-interrupt-hack
+@findex -no-spurious-interrupt-hack
+Use it as a workaround for operating systems that drive PICs in a way that
+can generate spurious interrupts, but the OS doesn't handle spurious
+interrupts gracefully.  (e.g. late 80s/early 90s versions of ATT UNIX
+and derivatives)
+ETEXI
+
 HXCOMM Deprecated by -rtc
 DEF("rtc-td-hack", 0, QEMU_OPTION_rtc_td_hack, "", QEMU_ARCH_I386)
 
diff --git a/sysemu.h b/sysemu.h
index 65552ac..0170109 100644
--- a/sysemu.h
+++ b/sysemu.h
@@ -117,6 +117,7 @@  extern int graphic_depth;
 extern DisplayType display_type;
 extern const char *keyboard_layout;
 extern int win2k_install_hack;
+extern int no_spurious_interrupt_hack;
 extern int alt_grab;
 extern int ctrl_grab;
 extern int usb_enabled;
diff --git a/vl.c b/vl.c
index 16d04a2..6de41c1 100644
--- a/vl.c
+++ b/vl.c
@@ -204,6 +204,7 @@  CharDriverState *serial_hds[MAX_SERIAL_PORTS];
 CharDriverState *parallel_hds[MAX_PARALLEL_PORTS];
 CharDriverState *virtcon_hds[MAX_VIRTIO_CONSOLES];
 int win2k_install_hack = 0;
+int no_spurious_interrupt_hack = 0;
 int usb_enabled = 0;
 int singlestep = 0;
 int smp_cpus = 1;
@@ -3046,6 +3047,9 @@  int main(int argc, char **argv, char **envp)
             case QEMU_OPTION_win2k_hack:
                 win2k_install_hack = 1;
                 break;
+            case QEMU_OPTION_no_spurious_interrupt_hack:
+                no_spurious_interrupt_hack = 1;
+                break;
             case QEMU_OPTION_rtc_td_hack: {
                 static GlobalProperty slew_lost_ticks[] = {
                     {