Patchwork [8/8] ahci: fix !msi interrupts

login
register
mail settings
Submitter Alexander Graf
Date Dec. 20, 2010, 9:13 p.m.
Message ID <1292879604-22268-9-git-send-email-agraf@suse.de>
Download mbox | patch
Permalink /patch/76223/
State New
Headers show

Comments

Alexander Graf - Dec. 20, 2010, 9:13 p.m.
When not using MSI, receiving an interrupt while the interrupt line is active
pulses the interrupt line. Without this, guests don't realize that a new
interrupt occured.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 hw/ide/ahci.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)
Jan Kiszka - Jan. 17, 2011, 2:37 p.m.
On 2010-12-20 22:13, Alexander Graf wrote:
> When not using MSI, receiving an interrupt while the interrupt line is active
> pulses the interrupt line. Without this, guests don't realize that a new
> interrupt occured.

This doesn't look OK. The device model should look at the currently used
mode and switch between edge and level triggering accordingly. As it
appears like this is what it already does, this change may just paper
over the real issue.

Jan

> 
> Signed-off-by: Alexander Graf <agraf@suse.de>
> ---
>  hw/ide/ahci.c |    3 +--
>  1 files changed, 1 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
> index 97aef68..4c920da 100644
> --- a/hw/ide/ahci.c
> +++ b/hw/ide/ahci.c
> @@ -153,11 +153,10 @@ static void ahci_check_irq(AHCIState *s)
>          }
>      }
>  
> +    ahci_irq_lower(s, NULL);
>      if (s->control_regs.irqstatus &&
>          (s->control_regs.ghc & HOST_CTL_IRQ_EN)) {
>              ahci_irq_raise(s, NULL);
> -    } else {
> -        ahci_irq_lower(s, NULL);
>      }
>  }
>
Alexander Graf - Jan. 17, 2011, 4 p.m.
Jan Kiszka wrote:
> On 2010-12-20 22:13, Alexander Graf wrote:
>   
>> When not using MSI, receiving an interrupt while the interrupt line is active
>> pulses the interrupt line. Without this, guests don't realize that a new
>> interrupt occured.
>>     
>
> This doesn't look OK. The device model should look at the currently used
> mode and switch between edge and level triggering accordingly. As it
> appears like this is what it already does, this change may just paper
> over the real issue.
>   

Well, I have this internal abstraction to make edge and level triggered
interrupt triggering easier. irq_lower is a simple nop for the edge case.


Alex
Jan Kiszka - Jan. 17, 2011, 4:03 p.m.
On 2011-01-17 17:00, Alexander Graf wrote:
> Jan Kiszka wrote:
>> On 2010-12-20 22:13, Alexander Graf wrote:
>>   
>>> When not using MSI, receiving an interrupt while the interrupt line is active
>>> pulses the interrupt line. Without this, guests don't realize that a new
>>> interrupt occured.
>>>     
>>
>> This doesn't look OK. The device model should look at the currently used
>> mode and switch between edge and level triggering accordingly. As it
>> appears like this is what it already does, this change may just paper
>> over the real issue.
>>   
> 
> Well, I have this internal abstraction to make edge and level triggered
> interrupt triggering easier. irq_lower is a simple nop for the edge case.
> 

I'm concerned about the artificial edge you generate for the level
triggered case. That's not like real hw behaves. If you need it,
something else might still be broken.

Jan
Alexander Graf - Jan. 17, 2011, 4:04 p.m.
Jan Kiszka wrote:
> On 2011-01-17 17:00, Alexander Graf wrote:
>   
>> Jan Kiszka wrote:
>>     
>>> On 2010-12-20 22:13, Alexander Graf wrote:
>>>   
>>>       
>>>> When not using MSI, receiving an interrupt while the interrupt line is active
>>>> pulses the interrupt line. Without this, guests don't realize that a new
>>>> interrupt occured.
>>>>     
>>>>         
>>> This doesn't look OK. The device model should look at the currently used
>>> mode and switch between edge and level triggering accordingly. As it
>>> appears like this is what it already does, this change may just paper
>>> over the real issue.
>>>   
>>>       
>> Well, I have this internal abstraction to make edge and level triggered
>> interrupt triggering easier. irq_lower is a simple nop for the edge case.
>>
>>     
>
> I'm concerned about the artificial edge you generate for the level
> triggered case. That's not like real hw behaves. If you need it,
> something else might still be broken.
>   

Hrm. So worst case we generate a spurious interrupt?


Alex
Jan Kiszka - Jan. 17, 2011, 4:20 p.m.
On 2011-01-17 17:04, Alexander Graf wrote:
> Jan Kiszka wrote:
>> On 2011-01-17 17:00, Alexander Graf wrote:
>>   
>>> Jan Kiszka wrote:
>>>     
>>>> On 2010-12-20 22:13, Alexander Graf wrote:
>>>>   
>>>>       
>>>>> When not using MSI, receiving an interrupt while the interrupt line is active
>>>>> pulses the interrupt line. Without this, guests don't realize that a new
>>>>> interrupt occured.
>>>>>     
>>>>>         
>>>> This doesn't look OK. The device model should look at the currently used
>>>> mode and switch between edge and level triggering accordingly. As it
>>>> appears like this is what it already does, this change may just paper
>>>> over the real issue.
>>>>   
>>>>       
>>> Well, I have this internal abstraction to make edge and level triggered
>>> interrupt triggering easier. irq_lower is a simple nop for the edge case.
>>>
>>>     
>>
>> I'm concerned about the artificial edge you generate for the level
>> triggered case. That's not like real hw behaves. If you need it,
>> something else might still be broken.
>>   
> 
> Hrm. So worst case we generate a spurious interrupt?
> 

Worse might also be that unknown issue that force you to inject an IRQ
here. We don't know. That's probably worst.

Jan
Alexander Graf - Jan. 17, 2011, 4:33 p.m.
Jan Kiszka wrote:
> On 2011-01-17 17:04, Alexander Graf wrote:
>   
>> Jan Kiszka wrote:
>>     
>>> On 2011-01-17 17:00, Alexander Graf wrote:
>>>   
>>>       
>>>> Jan Kiszka wrote:
>>>>     
>>>>         
>>>>> On 2010-12-20 22:13, Alexander Graf wrote:
>>>>>   
>>>>>       
>>>>>           
>>>>>> When not using MSI, receiving an interrupt while the interrupt line is active
>>>>>> pulses the interrupt line. Without this, guests don't realize that a new
>>>>>> interrupt occured.
>>>>>>     
>>>>>>         
>>>>>>             
>>>>> This doesn't look OK. The device model should look at the currently used
>>>>> mode and switch between edge and level triggering accordingly. As it
>>>>> appears like this is what it already does, this change may just paper
>>>>> over the real issue.
>>>>>   
>>>>>       
>>>>>           
>>>> Well, I have this internal abstraction to make edge and level triggered
>>>> interrupt triggering easier. irq_lower is a simple nop for the edge case.
>>>>
>>>>     
>>>>         
>>> I'm concerned about the artificial edge you generate for the level
>>> triggered case. That's not like real hw behaves. If you need it,
>>> something else might still be broken.
>>>   
>>>       
>> Hrm. So worst case we generate a spurious interrupt?
>>
>>     
>
> Worse might also be that unknown issue that force you to inject an IRQ
> here. We don't know. That's probably worst.
>   

Well, IIRC the issue was that usually a level high interrupt line would
simply retrigger an interrupt after enabling the interrupt line in the
APIC again. Unless my memory completely fails on me, that didn't happen,
so I added the manual retrigger on a partial command ACK in ahci code.


Alex
Jan Kiszka - Jan. 17, 2011, 4:48 p.m.
On 2011-01-17 17:33, Alexander Graf wrote:
> Jan Kiszka wrote:
>> On 2011-01-17 17:04, Alexander Graf wrote:
>>   
>>> Jan Kiszka wrote:
>>>     
>>>> On 2011-01-17 17:00, Alexander Graf wrote:
>>>>   
>>>>       
>>>>> Jan Kiszka wrote:
>>>>>     
>>>>>         
>>>>>> On 2010-12-20 22:13, Alexander Graf wrote:
>>>>>>   
>>>>>>       
>>>>>>           
>>>>>>> When not using MSI, receiving an interrupt while the interrupt line is active
>>>>>>> pulses the interrupt line. Without this, guests don't realize that a new
>>>>>>> interrupt occured.
>>>>>>>     
>>>>>>>         
>>>>>>>             
>>>>>> This doesn't look OK. The device model should look at the currently used
>>>>>> mode and switch between edge and level triggering accordingly. As it
>>>>>> appears like this is what it already does, this change may just paper
>>>>>> over the real issue.
>>>>>>   
>>>>>>       
>>>>>>           
>>>>> Well, I have this internal abstraction to make edge and level triggered
>>>>> interrupt triggering easier. irq_lower is a simple nop for the edge case.
>>>>>
>>>>>     
>>>>>         
>>>> I'm concerned about the artificial edge you generate for the level
>>>> triggered case. That's not like real hw behaves. If you need it,
>>>> something else might still be broken.
>>>>   
>>>>       
>>> Hrm. So worst case we generate a spurious interrupt?
>>>
>>>     
>>
>> Worse might also be that unknown issue that force you to inject an IRQ
>> here. We don't know. That's probably worst.
>>   
> 
> Well, IIRC the issue was that usually a level high interrupt line would
> simply retrigger an interrupt after enabling the interrupt line in the
> APIC again. Unless my memory completely fails on me, that didn't happen,
> so I added the manual retrigger on a partial command ACK in ahci code.
> 

How many other device models require this workaround? And is this a
limitation of a specific irqchip model or of the irq layer (I can't
believe it's the latter though)? All fairly suspicious...

Jan
Alexander Graf - Jan. 17, 2011, 4:50 p.m.
Jan Kiszka wrote:
> On 2011-01-17 17:33, Alexander Graf wrote:
>   
>> Jan Kiszka wrote:
>>     
>>> On 2011-01-17 17:04, Alexander Graf wrote:
>>>   
>>>       
>>>> Jan Kiszka wrote:
>>>>     
>>>>         
>>>>> On 2011-01-17 17:00, Alexander Graf wrote:
>>>>>   
>>>>>       
>>>>>           
>>>>>> Jan Kiszka wrote:
>>>>>>     
>>>>>>         
>>>>>>             
>>>>>>> On 2010-12-20 22:13, Alexander Graf wrote:
>>>>>>>   
>>>>>>>       
>>>>>>>           
>>>>>>>               
>>>>>>>> When not using MSI, receiving an interrupt while the interrupt line is active
>>>>>>>> pulses the interrupt line. Without this, guests don't realize that a new
>>>>>>>> interrupt occured.
>>>>>>>>     
>>>>>>>>         
>>>>>>>>             
>>>>>>>>                 
>>>>>>> This doesn't look OK. The device model should look at the currently used
>>>>>>> mode and switch between edge and level triggering accordingly. As it
>>>>>>> appears like this is what it already does, this change may just paper
>>>>>>> over the real issue.
>>>>>>>   
>>>>>>>       
>>>>>>>           
>>>>>>>               
>>>>>> Well, I have this internal abstraction to make edge and level triggered
>>>>>> interrupt triggering easier. irq_lower is a simple nop for the edge case.
>>>>>>
>>>>>>     
>>>>>>         
>>>>>>             
>>>>> I'm concerned about the artificial edge you generate for the level
>>>>> triggered case. That's not like real hw behaves. If you need it,
>>>>> something else might still be broken.
>>>>>   
>>>>>       
>>>>>           
>>>> Hrm. So worst case we generate a spurious interrupt?
>>>>
>>>>     
>>>>         
>>> Worse might also be that unknown issue that force you to inject an IRQ
>>> here. We don't know. That's probably worst.
>>>   
>>>       
>> Well, IIRC the issue was that usually a level high interrupt line would
>> simply retrigger an interrupt after enabling the interrupt line in the
>> APIC again. Unless my memory completely fails on me, that didn't happen,
>> so I added the manual retrigger on a partial command ACK in ahci code.
>>
>>     
>
> How many other device models require this workaround? And is this a
> limitation of a specific irqchip model or of the irq layer (I can't
> believe it's the latter though)? All fairly suspicious...
>   

I don't know :).


Alex
Gerd Hoffmann - Jan. 18, 2011, 9:08 a.m.
Hi,

>> Worse might also be that unknown issue that force you to inject an IRQ
>> here. We don't know. That's probably worst.
>
> Well, IIRC the issue was that usually a level high interrupt line would
> simply retrigger an interrupt after enabling the interrupt line in the
> APIC again.

edge triggered interrupts wouldn't though.

> Unless my memory completely fails on me, that didn't happen,
> so I added the manual retrigger on a partial command ACK in ahci code.

That re-trigger smells like you are not clearing the interrupt line 
where you should.  For starters try calling ahci_check_irq() after guest 
writes to PORT_IRQ_STAT.

cheers,
   Gerd
Alexander Graf - Jan. 18, 2011, 12:05 p.m.
On 18.01.2011, at 10:08, Gerd Hoffmann wrote:

>  Hi,
> 
>>> Worse might also be that unknown issue that force you to inject an IRQ
>>> here. We don't know. That's probably worst.
>> 
>> Well, IIRC the issue was that usually a level high interrupt line would
>> simply retrigger an interrupt after enabling the interrupt line in the
>> APIC again.
> 
> edge triggered interrupts wouldn't though.

The code change doesn't change anything for edge triggered interrupts - they work fine. Only !msi (== level) ones are broken.

> 
>> Unless my memory completely fails on me, that didn't happen,
>> so I added the manual retrigger on a partial command ACK in ahci code.
> 
> That re-trigger smells like you are not clearing the interrupt line where you should.  For starters try calling ahci_check_irq() after guest writes to PORT_IRQ_STAT.

The problem happened when I had the following:

ahci irq bits = 0
<events happen>
ahci irq bits = 1 | 2
irq line trigger
guest clears 1
ahci bits = 2
<no irq line trigger, since we're still irq high>

On normal hardware, the high state of the irq line would simply trigger another interrupt in the guest when it gets ACKed on the LAPIC. Somehow it doesn't get triggered here. I don't see why clearing the interrupt line would help? It's not what real hardware would do, no?


Alex
Jan Kiszka - Jan. 18, 2011, 12:58 p.m.
On 2011-01-18 13:05, Alexander Graf wrote:
> 
> On 18.01.2011, at 10:08, Gerd Hoffmann wrote:
> 
>>  Hi,
>>
>>>> Worse might also be that unknown issue that force you to inject an IRQ
>>>> here. We don't know. That's probably worst.
>>>
>>> Well, IIRC the issue was that usually a level high interrupt line would
>>> simply retrigger an interrupt after enabling the interrupt line in the
>>> APIC again.
>>
>> edge triggered interrupts wouldn't though.
> 
> The code change doesn't change anything for edge triggered interrupts - they work fine. Only !msi (== level) ones are broken.
> 
>>
>>> Unless my memory completely fails on me, that didn't happen,
>>> so I added the manual retrigger on a partial command ACK in ahci code.
>>
>> That re-trigger smells like you are not clearing the interrupt line where you should.  For starters try calling ahci_check_irq() after guest writes to PORT_IRQ_STAT.
> 
> The problem happened when I had the following:
> 
> ahci irq bits = 0
> <events happen>
> ahci irq bits = 1 | 2
> irq line trigger
> guest clears 1
> ahci bits = 2
> <no irq line trigger, since we're still irq high>
> 
> On normal hardware, the high state of the irq line would simply trigger another interrupt in the guest when it gets ACKed on the LAPIC. Somehow it doesn't get triggered here. I don't see why clearing the interrupt line would help? It's not what real hardware would do, no?
> 

No, real devices would simply leave the line asserted as long as there
is a reason to.

So again my question: With which irqchips do you see this effect, kvm's
in-kernel model and/or qemu's user space model? If there is an issue
with retriggering a CPU interrupt while the source is still asserted,
that probably needs to be fixed.

Jan
Gerd Hoffmann - Jan. 18, 2011, 1:02 p.m.
Hi,

>> edge triggered interrupts wouldn't though.
>
> The code change doesn't change anything for edge triggered
> interrupts
- they work fine. Only !msi (== level) ones are broken.

apic irqs can be both edge and level triggered too ...

>> That re-trigger smells like you are not clearing the interrupt line
>> where you should.  For starters try calling ahci_check_irq() after
>> guest writes to PORT_IRQ_STAT.

>
> The problem happened when I had the following:
>
> ahci irq bits = 0
> <events happen>
> ahci irq bits = 1 | 2
> irq line trigger
> guest clears 1
> ahci bits = 2
> <no irq line trigger, since we're still irq high>



> On normal hardware, the high state of the irq line would simply
trigger another interrupt in the guest when it gets ACKed on the LAPIC.
Somehow it doesn't get triggered here. I don't see why clearing the
interrupt line would help? It's not what real hardware would do, no?

I think the guest and the ahci emulation simply disagree on whenever a 
irq is pending or not.  Guest thinks it has cleared the IRQ, but the 
code path it has taken didn't trigger ahci_irq_lower().

It is probably related to how the per-port and global irq status play 
together.  It isn't covered very well in the specs :-(

If an interrupt condition happens a bit in pr->irq_stat will be set. 
When the contition is enabled the port bit in irqstatus will be set, 
which in turn will trigger an irq.  That is the easy part.

Now the guest checks irqstatus to find the port, then checks 
pr->irq_stat to find the condition, acks it by clearing the bits it has 
seen in pr->irq_stat.  What does happen with the port bit in irqstatus 
now?  Will it be cleared automatically?  Must the guest clear it 
explicitly?  What happens in case another irq condition happened which 
the guest didn't ack (yet), will the guest be able to clear the bit?

cheers,
   Gerd
Aurelien Jarno - Jan. 22, 2011, 1:13 p.m.
On Tue, Jan 18, 2011 at 01:58:25PM +0100, Jan Kiszka wrote:
> On 2011-01-18 13:05, Alexander Graf wrote:
> > 
> > On 18.01.2011, at 10:08, Gerd Hoffmann wrote:
> > 
> >>  Hi,
> >>
> >>>> Worse might also be that unknown issue that force you to inject an IRQ
> >>>> here. We don't know. That's probably worst.
> >>>
> >>> Well, IIRC the issue was that usually a level high interrupt line would
> >>> simply retrigger an interrupt after enabling the interrupt line in the
> >>> APIC again.
> >>
> >> edge triggered interrupts wouldn't though.
> > 
> > The code change doesn't change anything for edge triggered interrupts - they work fine. Only !msi (== level) ones are broken.
> > 
> >>
> >>> Unless my memory completely fails on me, that didn't happen,
> >>> so I added the manual retrigger on a partial command ACK in ahci code.
> >>
> >> That re-trigger smells like you are not clearing the interrupt line where you should.  For starters try calling ahci_check_irq() after guest writes to PORT_IRQ_STAT.
> > 
> > The problem happened when I had the following:
> > 
> > ahci irq bits = 0
> > <events happen>
> > ahci irq bits = 1 | 2
> > irq line trigger
> > guest clears 1
> > ahci bits = 2
> > <no irq line trigger, since we're still irq high>
> > 
> > On normal hardware, the high state of the irq line would simply trigger another interrupt in the guest when it gets ACKed on the LAPIC. Somehow it doesn't get triggered here. I don't see why clearing the interrupt line would help? It's not what real hardware would do, no?
> > 
> 
> No, real devices would simply leave the line asserted as long as there
> is a reason to.
> 
> So again my question: With which irqchips do you see this effect, kvm's
> in-kernel model and/or qemu's user space model? If there is an issue
> with retriggering a CPU interrupt while the source is still asserted,
> that probably needs to be fixed.
> 

I guess it might be related to a problem identified long time ago on
some targets, and that leads to the following #ifdef in i8259.c:

| /* all targets should do this rather than acking the IRQ in the cpu */
| #if defined(TARGET_MIPS) || defined(TARGET_PPC) || defined(TARGET_ALPHA)

For your information it has been fixed on MIPS in commit 
4de9b249d37c1b382cc3e5a21fad1b4a11cec2fa.

Basically when an interrupt line is high, the interrupt is getting
delivered to the CPU. This part works correctly on x86. The CPU will
take a corresponding action, basically either disabling the interrupt
at the CPU or controller level or doing something on the device so that
it lower its IRQ line. This new IRQ line level should then propagate
through the various controllers, which should also lower their IRQ line
if no other interrupt line are active. This ACK process should then
continue up to the CPU.

For x86 the interrupt state is cleared as soon as the interrupt is
signaled to the CPU (see cpu-exec.c line 407), therefore if an interrupt
is still pending, it won't be seen by the CPU. It's probably what you
observed with AHCI.
Edgar Iglesias - Jan. 22, 2011, 2:14 p.m.
On Sat, Jan 22, 2011 at 02:13:03PM +0100, Aurelien Jarno wrote:
> On Tue, Jan 18, 2011 at 01:58:25PM +0100, Jan Kiszka wrote:
> > On 2011-01-18 13:05, Alexander Graf wrote:
> > > 
> > > On 18.01.2011, at 10:08, Gerd Hoffmann wrote:
> > > 
> > >>  Hi,
> > >>
> > >>>> Worse might also be that unknown issue that force you to inject an IRQ
> > >>>> here. We don't know. That's probably worst.
> > >>>
> > >>> Well, IIRC the issue was that usually a level high interrupt line would
> > >>> simply retrigger an interrupt after enabling the interrupt line in the
> > >>> APIC again.
> > >>
> > >> edge triggered interrupts wouldn't though.
> > > 
> > > The code change doesn't change anything for edge triggered interrupts - they work fine. Only !msi (== level) ones are broken.
> > > 
> > >>
> > >>> Unless my memory completely fails on me, that didn't happen,
> > >>> so I added the manual retrigger on a partial command ACK in ahci code.
> > >>
> > >> That re-trigger smells like you are not clearing the interrupt line where you should.  For starters try calling ahci_check_irq() after guest writes to PORT_IRQ_STAT.
> > > 
> > > The problem happened when I had the following:
> > > 
> > > ahci irq bits = 0
> > > <events happen>
> > > ahci irq bits = 1 | 2
> > > irq line trigger
> > > guest clears 1
> > > ahci bits = 2
> > > <no irq line trigger, since we're still irq high>
> > > 
> > > On normal hardware, the high state of the irq line would simply trigger another interrupt in the guest when it gets ACKed on the LAPIC. Somehow it doesn't get triggered here. I don't see why clearing the interrupt line would help? It's not what real hardware would do, no?
> > > 
> > 
> > No, real devices would simply leave the line asserted as long as there
> > is a reason to.
> > 
> > So again my question: With which irqchips do you see this effect, kvm's
> > in-kernel model and/or qemu's user space model? If there is an issue
> > with retriggering a CPU interrupt while the source is still asserted,
> > that probably needs to be fixed.
> > 
> 
> I guess it might be related to a problem identified long time ago on
> some targets, and that leads to the following #ifdef in i8259.c:
> 
> | /* all targets should do this rather than acking the IRQ in the cpu */
> | #if defined(TARGET_MIPS) || defined(TARGET_PPC) || defined(TARGET_ALPHA)
> 
> For your information it has been fixed on MIPS in commit 
> 4de9b249d37c1b382cc3e5a21fad1b4a11cec2fa.
> 
> Basically when an interrupt line is high, the interrupt is getting
> delivered to the CPU. This part works correctly on x86. The CPU will
> take a corresponding action, basically either disabling the interrupt
> at the CPU or controller level or doing something on the device so that
> it lower its IRQ line. This new IRQ line level should then propagate
> through the various controllers, which should also lower their IRQ line
> if no other interrupt line are active. This ACK process should then
> continue up to the CPU.

I totally agree.

 
> For x86 the interrupt state is cleared as soon as the interrupt is
> signaled to the CPU (see cpu-exec.c line 407), therefore if an interrupt
> is still pending, it won't be seen by the CPU. It's probably what you
> observed with AHCI. 

Yes, essentially, the CPU_INTERRUPT_HARD signal is an input to the CPU.
The CPU cannot drive it directly. To lower it, it must take some kind
of indirect action (IO or whatever) to clear the condition that is
forcing it high. Any assignments to clear or set the CPU_INTERRUPT_HARD
signal from within the CPU core are likely wrong..

FWIW, PPC code in cpu-exec.c:443 looks suspicious aswell...

Cheers
Alexander Graf - Jan. 22, 2011, 2:28 p.m.
On 22.01.2011, at 14:13, Aurelien Jarno wrote:

> On Tue, Jan 18, 2011 at 01:58:25PM +0100, Jan Kiszka wrote:
>> On 2011-01-18 13:05, Alexander Graf wrote:
>>> 
>>> On 18.01.2011, at 10:08, Gerd Hoffmann wrote:
>>> 
>>>> Hi,
>>>> 
>>>>>> Worse might also be that unknown issue that force you to inject an IRQ
>>>>>> here. We don't know. That's probably worst.
>>>>> 
>>>>> Well, IIRC the issue was that usually a level high interrupt line would
>>>>> simply retrigger an interrupt after enabling the interrupt line in the
>>>>> APIC again.
>>>> 
>>>> edge triggered interrupts wouldn't though.
>>> 
>>> The code change doesn't change anything for edge triggered interrupts - they work fine. Only !msi (== level) ones are broken.
>>> 
>>>> 
>>>>> Unless my memory completely fails on me, that didn't happen,
>>>>> so I added the manual retrigger on a partial command ACK in ahci code.
>>>> 
>>>> That re-trigger smells like you are not clearing the interrupt line where you should.  For starters try calling ahci_check_irq() after guest writes to PORT_IRQ_STAT.
>>> 
>>> The problem happened when I had the following:
>>> 
>>> ahci irq bits = 0
>>> <events happen>
>>> ahci irq bits = 1 | 2
>>> irq line trigger
>>> guest clears 1
>>> ahci bits = 2
>>> <no irq line trigger, since we're still irq high>
>>> 
>>> On normal hardware, the high state of the irq line would simply trigger another interrupt in the guest when it gets ACKed on the LAPIC. Somehow it doesn't get triggered here. I don't see why clearing the interrupt line would help? It's not what real hardware would do, no?
>>> 
>> 
>> No, real devices would simply leave the line asserted as long as there
>> is a reason to.
>> 
>> So again my question: With which irqchips do you see this effect, kvm's
>> in-kernel model and/or qemu's user space model? If there is an issue
>> with retriggering a CPU interrupt while the source is still asserted,
>> that probably needs to be fixed.
>> 
> 
> I guess it might be related to a problem identified long time ago on
> some targets, and that leads to the following #ifdef in i8259.c:
> 
> | /* all targets should do this rather than acking the IRQ in the cpu */
> | #if defined(TARGET_MIPS) || defined(TARGET_PPC) || defined(TARGET_ALPHA)
> 
> For your information it has been fixed on MIPS in commit 
> 4de9b249d37c1b382cc3e5a21fad1b4a11cec2fa.
> 
> Basically when an interrupt line is high, the interrupt is getting
> delivered to the CPU. This part works correctly on x86. The CPU will
> take a corresponding action, basically either disabling the interrupt
> at the CPU or controller level or doing something on the device so that
> it lower its IRQ line. This new IRQ line level should then propagate
> through the various controllers, which should also lower their IRQ line
> if no other interrupt line are active. This ACK process should then
> continue up to the CPU.
> 
> For x86 the interrupt state is cleared as soon as the interrupt is
> signaled to the CPU (see cpu-exec.c line 407), therefore if an interrupt
> is still pending, it won't be seen by the CPU. It's probably what you
> observed with AHCI. 

I'm not sure what the best way to solve it would be, but maybe a callback on iret would work? Then the interrupt can be checked on again.

Alternatively, env->interrupt_request really should get cleared by the LAPIC when the interrupt line is pulled down there.


Alex
Alexander Graf - Jan. 22, 2011, 2:32 p.m.
On 22.01.2011, at 15:14, Edgar E. Iglesias wrote:

> On Sat, Jan 22, 2011 at 02:13:03PM +0100, Aurelien Jarno wrote:
>> On Tue, Jan 18, 2011 at 01:58:25PM +0100, Jan Kiszka wrote:
>>> On 2011-01-18 13:05, Alexander Graf wrote:
>>>> 
>>>> On 18.01.2011, at 10:08, Gerd Hoffmann wrote:
>>>> 
>>>>> Hi,
>>>>> 
>>>>>>> Worse might also be that unknown issue that force you to inject an IRQ
>>>>>>> here. We don't know. That's probably worst.
>>>>>> 
>>>>>> Well, IIRC the issue was that usually a level high interrupt line would
>>>>>> simply retrigger an interrupt after enabling the interrupt line in the
>>>>>> APIC again.
>>>>> 
>>>>> edge triggered interrupts wouldn't though.
>>>> 
>>>> The code change doesn't change anything for edge triggered interrupts - they work fine. Only !msi (== level) ones are broken.
>>>> 
>>>>> 
>>>>>> Unless my memory completely fails on me, that didn't happen,
>>>>>> so I added the manual retrigger on a partial command ACK in ahci code.
>>>>> 
>>>>> That re-trigger smells like you are not clearing the interrupt line where you should.  For starters try calling ahci_check_irq() after guest writes to PORT_IRQ_STAT.
>>>> 
>>>> The problem happened when I had the following:
>>>> 
>>>> ahci irq bits = 0
>>>> <events happen>
>>>> ahci irq bits = 1 | 2
>>>> irq line trigger
>>>> guest clears 1
>>>> ahci bits = 2
>>>> <no irq line trigger, since we're still irq high>
>>>> 
>>>> On normal hardware, the high state of the irq line would simply trigger another interrupt in the guest when it gets ACKed on the LAPIC. Somehow it doesn't get triggered here. I don't see why clearing the interrupt line would help? It's not what real hardware would do, no?
>>>> 
>>> 
>>> No, real devices would simply leave the line asserted as long as there
>>> is a reason to.
>>> 
>>> So again my question: With which irqchips do you see this effect, kvm's
>>> in-kernel model and/or qemu's user space model? If there is an issue
>>> with retriggering a CPU interrupt while the source is still asserted,
>>> that probably needs to be fixed.
>>> 
>> 
>> I guess it might be related to a problem identified long time ago on
>> some targets, and that leads to the following #ifdef in i8259.c:
>> 
>> | /* all targets should do this rather than acking the IRQ in the cpu */
>> | #if defined(TARGET_MIPS) || defined(TARGET_PPC) || defined(TARGET_ALPHA)
>> 
>> For your information it has been fixed on MIPS in commit 
>> 4de9b249d37c1b382cc3e5a21fad1b4a11cec2fa.
>> 
>> Basically when an interrupt line is high, the interrupt is getting
>> delivered to the CPU. This part works correctly on x86. The CPU will
>> take a corresponding action, basically either disabling the interrupt
>> at the CPU or controller level or doing something on the device so that
>> it lower its IRQ line. This new IRQ line level should then propagate
>> through the various controllers, which should also lower their IRQ line
>> if no other interrupt line are active. This ACK process should then
>> continue up to the CPU.
> 
> I totally agree.
> 
> 
>> For x86 the interrupt state is cleared as soon as the interrupt is
>> signaled to the CPU (see cpu-exec.c line 407), therefore if an interrupt
>> is still pending, it won't be seen by the CPU. It's probably what you
>> observed with AHCI. 
> 
> Yes, essentially, the CPU_INTERRUPT_HARD signal is an input to the CPU.
> The CPU cannot drive it directly. To lower it, it must take some kind
> of indirect action (IO or whatever) to clear the condition that is
> forcing it high. Any assignments to clear or set the CPU_INTERRUPT_HARD
> signal from within the CPU core are likely wrong..
> 
> FWIW, PPC code in cpu-exec.c:443 looks suspicious aswell...

How's that suspicious? As long as pending_interrupts is only reset on actual interrupt delivery, everything's fine. Interrupts like the decrementor are not level based on some PPCs, so the actual semantics are implementation dependent.

Either way, I agree that getting interrupts right is very hard :).


Alex
Aurelien Jarno - Jan. 22, 2011, 2:34 p.m.
On Sat, Jan 22, 2011 at 03:28:06PM +0100, Alexander Graf wrote:
> 
> On 22.01.2011, at 14:13, Aurelien Jarno wrote:
> 
> > On Tue, Jan 18, 2011 at 01:58:25PM +0100, Jan Kiszka wrote:
> >> On 2011-01-18 13:05, Alexander Graf wrote:
> >>> 
> >>> On 18.01.2011, at 10:08, Gerd Hoffmann wrote:
> >>> 
> >>>> Hi,
> >>>> 
> >>>>>> Worse might also be that unknown issue that force you to inject an IRQ
> >>>>>> here. We don't know. That's probably worst.
> >>>>> 
> >>>>> Well, IIRC the issue was that usually a level high interrupt line would
> >>>>> simply retrigger an interrupt after enabling the interrupt line in the
> >>>>> APIC again.
> >>>> 
> >>>> edge triggered interrupts wouldn't though.
> >>> 
> >>> The code change doesn't change anything for edge triggered interrupts - they work fine. Only !msi (== level) ones are broken.
> >>> 
> >>>> 
> >>>>> Unless my memory completely fails on me, that didn't happen,
> >>>>> so I added the manual retrigger on a partial command ACK in ahci code.
> >>>> 
> >>>> That re-trigger smells like you are not clearing the interrupt line where you should.  For starters try calling ahci_check_irq() after guest writes to PORT_IRQ_STAT.
> >>> 
> >>> The problem happened when I had the following:
> >>> 
> >>> ahci irq bits = 0
> >>> <events happen>
> >>> ahci irq bits = 1 | 2
> >>> irq line trigger
> >>> guest clears 1
> >>> ahci bits = 2
> >>> <no irq line trigger, since we're still irq high>
> >>> 
> >>> On normal hardware, the high state of the irq line would simply trigger another interrupt in the guest when it gets ACKed on the LAPIC. Somehow it doesn't get triggered here. I don't see why clearing the interrupt line would help? It's not what real hardware would do, no?
> >>> 
> >> 
> >> No, real devices would simply leave the line asserted as long as there
> >> is a reason to.
> >> 
> >> So again my question: With which irqchips do you see this effect, kvm's
> >> in-kernel model and/or qemu's user space model? If there is an issue
> >> with retriggering a CPU interrupt while the source is still asserted,
> >> that probably needs to be fixed.
> >> 
> > 
> > I guess it might be related to a problem identified long time ago on
> > some targets, and that leads to the following #ifdef in i8259.c:
> > 
> > | /* all targets should do this rather than acking the IRQ in the cpu */
> > | #if defined(TARGET_MIPS) || defined(TARGET_PPC) || defined(TARGET_ALPHA)
> > 
> > For your information it has been fixed on MIPS in commit 
> > 4de9b249d37c1b382cc3e5a21fad1b4a11cec2fa.
> > 
> > Basically when an interrupt line is high, the interrupt is getting
> > delivered to the CPU. This part works correctly on x86. The CPU will
> > take a corresponding action, basically either disabling the interrupt
> > at the CPU or controller level or doing something on the device so that
> > it lower its IRQ line. This new IRQ line level should then propagate
> > through the various controllers, which should also lower their IRQ line
> > if no other interrupt line are active. This ACK process should then
> > continue up to the CPU.
> > 
> > For x86 the interrupt state is cleared as soon as the interrupt is
> > signaled to the CPU (see cpu-exec.c line 407), therefore if an interrupt
> > is still pending, it won't be seen by the CPU. It's probably what you
> > observed with AHCI. 
> 
> I'm not sure what the best way to solve it would be, but maybe a callback on iret would work? Then the interrupt can be checked on again.

This looks really like a hack, and the one you put in AHCI is probably
better than this one

> Alternatively, env->interrupt_request really should get cleared by the LAPIC when the interrupt line is pulled down there.

I think this is the way to go, as it matches what happens on real
hardware. The changes might be a bit more invasive though.
Aurelien Jarno - Jan. 22, 2011, 2:40 p.m.
On Sat, Jan 22, 2011 at 03:32:36PM +0100, Alexander Graf wrote:
> 
> On 22.01.2011, at 15:14, Edgar E. Iglesias wrote:
> 
> > On Sat, Jan 22, 2011 at 02:13:03PM +0100, Aurelien Jarno wrote:
> >> On Tue, Jan 18, 2011 at 01:58:25PM +0100, Jan Kiszka wrote:
> >>> On 2011-01-18 13:05, Alexander Graf wrote:
> >>>> 
> >>>> On 18.01.2011, at 10:08, Gerd Hoffmann wrote:
> >>>> 
> >>>>> Hi,
> >>>>> 
> >>>>>>> Worse might also be that unknown issue that force you to inject an IRQ
> >>>>>>> here. We don't know. That's probably worst.
> >>>>>> 
> >>>>>> Well, IIRC the issue was that usually a level high interrupt line would
> >>>>>> simply retrigger an interrupt after enabling the interrupt line in the
> >>>>>> APIC again.
> >>>>> 
> >>>>> edge triggered interrupts wouldn't though.
> >>>> 
> >>>> The code change doesn't change anything for edge triggered interrupts - they work fine. Only !msi (== level) ones are broken.
> >>>> 
> >>>>> 
> >>>>>> Unless my memory completely fails on me, that didn't happen,
> >>>>>> so I added the manual retrigger on a partial command ACK in ahci code.
> >>>>> 
> >>>>> That re-trigger smells like you are not clearing the interrupt line where you should.  For starters try calling ahci_check_irq() after guest writes to PORT_IRQ_STAT.
> >>>> 
> >>>> The problem happened when I had the following:
> >>>> 
> >>>> ahci irq bits = 0
> >>>> <events happen>
> >>>> ahci irq bits = 1 | 2
> >>>> irq line trigger
> >>>> guest clears 1
> >>>> ahci bits = 2
> >>>> <no irq line trigger, since we're still irq high>
> >>>> 
> >>>> On normal hardware, the high state of the irq line would simply trigger another interrupt in the guest when it gets ACKed on the LAPIC. Somehow it doesn't get triggered here. I don't see why clearing the interrupt line would help? It's not what real hardware would do, no?
> >>>> 
> >>> 
> >>> No, real devices would simply leave the line asserted as long as there
> >>> is a reason to.
> >>> 
> >>> So again my question: With which irqchips do you see this effect, kvm's
> >>> in-kernel model and/or qemu's user space model? If there is an issue
> >>> with retriggering a CPU interrupt while the source is still asserted,
> >>> that probably needs to be fixed.
> >>> 
> >> 
> >> I guess it might be related to a problem identified long time ago on
> >> some targets, and that leads to the following #ifdef in i8259.c:
> >> 
> >> | /* all targets should do this rather than acking the IRQ in the cpu */
> >> | #if defined(TARGET_MIPS) || defined(TARGET_PPC) || defined(TARGET_ALPHA)
> >> 
> >> For your information it has been fixed on MIPS in commit 
> >> 4de9b249d37c1b382cc3e5a21fad1b4a11cec2fa.
> >> 
> >> Basically when an interrupt line is high, the interrupt is getting
> >> delivered to the CPU. This part works correctly on x86. The CPU will
> >> take a corresponding action, basically either disabling the interrupt
> >> at the CPU or controller level or doing something on the device so that
> >> it lower its IRQ line. This new IRQ line level should then propagate
> >> through the various controllers, which should also lower their IRQ line
> >> if no other interrupt line are active. This ACK process should then
> >> continue up to the CPU.
> > 
> > I totally agree.
> > 
> > 
> >> For x86 the interrupt state is cleared as soon as the interrupt is
> >> signaled to the CPU (see cpu-exec.c line 407), therefore if an interrupt
> >> is still pending, it won't be seen by the CPU. It's probably what you
> >> observed with AHCI. 
> > 
> > Yes, essentially, the CPU_INTERRUPT_HARD signal is an input to the CPU.
> > The CPU cannot drive it directly. To lower it, it must take some kind
> > of indirect action (IO or whatever) to clear the condition that is
> > forcing it high. Any assignments to clear or set the CPU_INTERRUPT_HARD
> > signal from within the CPU core are likely wrong..
> > 
> > FWIW, PPC code in cpu-exec.c:443 looks suspicious aswell...
> 
> How's that suspicious? As long as pending_interrupts is only reset on actual interrupt delivery, everything's fine. Interrupts like the decrementor are not level based on some PPCs, so the actual semantics are implementation dependent.
> 

It's suspicious because it's not the right place to do that. It should
be done in the interrupt controller. This seems to be already done in
hw/ppc.c when updating pending_interrupts, so this code seems to be
useless.
Edgar Iglesias - Jan. 23, 2011, 3:34 a.m.
On Sat, Jan 22, 2011 at 03:40:34PM +0100, Aurelien Jarno wrote:
> On Sat, Jan 22, 2011 at 03:32:36PM +0100, Alexander Graf wrote:
> > 
> > On 22.01.2011, at 15:14, Edgar E. Iglesias wrote:
> > 
> > > On Sat, Jan 22, 2011 at 02:13:03PM +0100, Aurelien Jarno wrote:
> > >> On Tue, Jan 18, 2011 at 01:58:25PM +0100, Jan Kiszka wrote:
> > >>> On 2011-01-18 13:05, Alexander Graf wrote:
> > >>>> 
> > >>>> On 18.01.2011, at 10:08, Gerd Hoffmann wrote:
> > >>>> 
> > >>>>> Hi,
> > >>>>> 
> > >>>>>>> Worse might also be that unknown issue that force you to inject an IRQ
> > >>>>>>> here. We don't know. That's probably worst.
> > >>>>>> 
> > >>>>>> Well, IIRC the issue was that usually a level high interrupt line would
> > >>>>>> simply retrigger an interrupt after enabling the interrupt line in the
> > >>>>>> APIC again.
> > >>>>> 
> > >>>>> edge triggered interrupts wouldn't though.
> > >>>> 
> > >>>> The code change doesn't change anything for edge triggered interrupts - they work fine. Only !msi (== level) ones are broken.
> > >>>> 
> > >>>>> 
> > >>>>>> Unless my memory completely fails on me, that didn't happen,
> > >>>>>> so I added the manual retrigger on a partial command ACK in ahci code.
> > >>>>> 
> > >>>>> That re-trigger smells like you are not clearing the interrupt line where you should.  For starters try calling ahci_check_irq() after guest writes to PORT_IRQ_STAT.
> > >>>> 
> > >>>> The problem happened when I had the following:
> > >>>> 
> > >>>> ahci irq bits = 0
> > >>>> <events happen>
> > >>>> ahci irq bits = 1 | 2
> > >>>> irq line trigger
> > >>>> guest clears 1
> > >>>> ahci bits = 2
> > >>>> <no irq line trigger, since we're still irq high>
> > >>>> 
> > >>>> On normal hardware, the high state of the irq line would simply trigger another interrupt in the guest when it gets ACKed on the LAPIC. Somehow it doesn't get triggered here. I don't see why clearing the interrupt line would help? It's not what real hardware would do, no?
> > >>>> 
> > >>> 
> > >>> No, real devices would simply leave the line asserted as long as there
> > >>> is a reason to.
> > >>> 
> > >>> So again my question: With which irqchips do you see this effect, kvm's
> > >>> in-kernel model and/or qemu's user space model? If there is an issue
> > >>> with retriggering a CPU interrupt while the source is still asserted,
> > >>> that probably needs to be fixed.
> > >>> 
> > >> 
> > >> I guess it might be related to a problem identified long time ago on
> > >> some targets, and that leads to the following #ifdef in i8259.c:
> > >> 
> > >> | /* all targets should do this rather than acking the IRQ in the cpu */
> > >> | #if defined(TARGET_MIPS) || defined(TARGET_PPC) || defined(TARGET_ALPHA)
> > >> 
> > >> For your information it has been fixed on MIPS in commit 
> > >> 4de9b249d37c1b382cc3e5a21fad1b4a11cec2fa.
> > >> 
> > >> Basically when an interrupt line is high, the interrupt is getting
> > >> delivered to the CPU. This part works correctly on x86. The CPU will
> > >> take a corresponding action, basically either disabling the interrupt
> > >> at the CPU or controller level or doing something on the device so that
> > >> it lower its IRQ line. This new IRQ line level should then propagate
> > >> through the various controllers, which should also lower their IRQ line
> > >> if no other interrupt line are active. This ACK process should then
> > >> continue up to the CPU.
> > > 
> > > I totally agree.
> > > 
> > > 
> > >> For x86 the interrupt state is cleared as soon as the interrupt is
> > >> signaled to the CPU (see cpu-exec.c line 407), therefore if an interrupt
> > >> is still pending, it won't be seen by the CPU. It's probably what you
> > >> observed with AHCI. 
> > > 
> > > Yes, essentially, the CPU_INTERRUPT_HARD signal is an input to the CPU.
> > > The CPU cannot drive it directly. To lower it, it must take some kind
> > > of indirect action (IO or whatever) to clear the condition that is
> > > forcing it high. Any assignments to clear or set the CPU_INTERRUPT_HARD
> > > signal from within the CPU core are likely wrong..
> > > 
> > > FWIW, PPC code in cpu-exec.c:443 looks suspicious aswell...
> > 
> > How's that suspicious? As long as pending_interrupts is only reset on actual interrupt delivery, everything's fine. Interrupts like the decrementor are not level based on some PPCs, so the actual semantics are implementation dependent.
> > 
> 
> It's suspicious because it's not the right place to do that. It should
> be done in the interrupt controller. This seems to be already done in
> hw/ppc.c when updating pending_interrupts, so this code seems to be
> useless.

Exactly.

The CPU model is trying to drive an input signal. We should find a way to
somehow forbid that in C.

Cheers

Patch

diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
index 97aef68..4c920da 100644
--- a/hw/ide/ahci.c
+++ b/hw/ide/ahci.c
@@ -153,11 +153,10 @@  static void ahci_check_irq(AHCIState *s)
         }
     }
 
+    ahci_irq_lower(s, NULL);
     if (s->control_regs.irqstatus &&
         (s->control_regs.ghc & HOST_CTL_IRQ_EN)) {
             ahci_irq_raise(s, NULL);
-    } else {
-        ahci_irq_lower(s, NULL);
     }
 }