Re: Errors on MMIO read access on VM suspend / resume operations

Message ID 4D3EFF01.9080608@linux.vnet.ibm.com
State New

Commit Message

Stefan Berger Jan. 25, 2011, 4:49 p.m. UTC
On 01/25/2011 02:26 AM, Jan Kiszka wrote:
>
> Do you see a chance to look closer at the issue yourself? E.g.
> instrument the kernel's irqchip models and dump their states once your
> guest is stuck?
The device runs on IRQ 3. So I applied this patch here.




While it's still working I see this here with the levels changing 0-1-0. 
Though then it stops and levels are only at '1'.

[ 1773.833824] kvm_pic_set_irq
[ 1773.833827] pic_set_irq1 131: level=0, irr = 5b
[ 1773.834161] kvm_pic_set_irq
[ 1773.834163] pic_set_irq1 131: level=1, irr = 5b
[ 1773.834193] kvm_pic_set_irq
[ 1773.834195] pic_set_irq1 131: level=0, irr = 5b
[ 1773.835028] kvm_pic_set_irq
[ 1773.835031] pic_set_irq1 131: level=1, irr = 5b
[ 1773.835542] kvm_pic_set_irq
[ 1773.835545] pic_set_irq1 131: level=1, irr = 5b
[ 1773.889892] kvm_pic_set_irq
[ 1773.889894] pic_set_irq1 131: level=1, irr = 5b
[ 1791.258793] pic_set_irq1 119: level=1, irr = d9
[ 1791.258824] pic_set_irq1 119: level=0, irr = d1
[ 1791.402476] pic_set_irq1 119: level=1, irr = d9
[ 1791.402534] pic_set_irq1 119: level=0, irr = d1
[ 1791.402538] pic_set_irq1 119: level=1, irr = d9
[...]


I believe the last 5 shown calls can be ignored. After that the 
interrupts don't go through anymore.
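
A possible reason why those calls have no preceding kvm_pic_set_irq line:
the check added to pic_set_irq1() only sees the per-chip pin number
(irq & 7), while the check in kvm_pic_set_irq() sees the global IRQ number.
The sketch below merely reproduces that arithmetic; `struct pic_pin` and
`split_irq` are hypothetical names for illustration, not kernel code.

```c
/* Hypothetical helper type: which 8259 chip and which pin on it. */
struct pic_pin {
    int chip; /* 0 = master, 1 = slave */
    int pin;  /* 0..7 */
};

/*
 * Same arithmetic as the call site shown in the patch context:
 *   pic_set_irq1(&s->pics[irq >> 3], irq & 7, level)
 */
static struct pic_pin split_irq(int irq)
{
    struct pic_pin p;

    p.chip = irq >> 3; /* upper bit selects the chip */
    p.pin = irq & 7;   /* lower three bits select the pin */
    return p;
}
```

split_irq(3) yields chip 0, pin 3 and split_irq(11) yields chip 1, pin 3,
so both would satisfy an `irq == 3` test placed inside pic_set_irq1().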

In the device model I see interrupts being raised and cleared. After the 
last one was cleared in 'my' device model, only interrupts are raised. 
This looks as if the interrupt handler in the guest Linux was never
run, thus the IRQ is never cleared and we're stuck.
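
For reference, the edge-triggered branch visible in the patch context
latches a request only on a 0-to-1 transition, so a line that stays at '1'
cannot produce a further request. Below is a minimal sketch of that
behaviour, not the kernel code: `struct edge_input`, `edge_set_irq` and
`edge_ack` are hypothetical names, the line setting the irr bit is assumed
(it is not shown between the hunks), and the acknowledge step is a
simplification.

```c
#include <stdint.h>

/* Simplified, hypothetical model of the inputs of one edge-triggered PIC. */
struct edge_input {
    uint8_t irr;      /* latched interrupt requests, one bit per pin */
    uint8_t last_irr; /* last level seen on each pin */
};

/* Returns 1 if this call latched a new request, 0 otherwise. */
static int edge_set_irq(struct edge_input *s, int pin, int level)
{
    uint8_t mask = (uint8_t)(1u << pin);
    int latched = 0;

    if (level) {
        if ((s->last_irr & mask) == 0) { /* rising edge seen */
            latched = !(s->irr & mask);
            s->irr |= mask;              /* assumed: not shown in the hunks */
        }
        s->last_irr |= mask;
    } else {
        s->last_irr &= (uint8_t)~mask;   /* remember that the line dropped */
    }
    return latched;
}

/* Assumed acknowledge step: the latched request is consumed on delivery. */
static void edge_ack(struct edge_input *s, int pin)
{
    s->irr &= (uint8_t)~(1u << pin);
}
```

In this model, raising the pin, acknowledging it and raising it again
without lowering it in between latches nothing the second time; only a
lower-then-raise sequence creates a new request.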



Regards,
     Stefan

Comments

Jan Kiszka Jan. 26, 2011, 8:14 a.m. UTC | #1
On 2011-01-25 17:49, Stefan Berger wrote:
> On 01/25/2011 02:26 AM, Jan Kiszka wrote:
>>
>> Do you see a chance to look closer at the issue yourself? E.g.
>> instrument the kernel's irqchip models and dump their states once your
>> guest is stuck?
> The device runs on IRQ 3. So I applied this patch here.
> 
> diff --git a/arch/x86/kvm/i8259.c b/arch/x86/kvm/i8259.c
> index 3cece05..8f4f94c 100644
> --- a/arch/x86/kvm/i8259.c
> +++ b/arch/x86/kvm/i8259.c
> @@ -106,7 +106,7 @@ static inline int pic_set_irq1(struct kvm_kpic_state *s, int irq, int level)
>  {
>      int mask, ret = 1;
>      mask = 1 << irq;
> -    if (s->elcr & mask)    /* level triggered */
> +    if (s->elcr & mask)    /* level triggered */ {
>          if (level) {
>              ret = !(s->irr & mask);
>              s->irr |= mask;
> @@ -115,7 +115,10 @@ static inline int pic_set_irq1(struct kvm_kpic_state *s, int irq, int level)
>              s->irr &= ~mask;
>              s->last_irr &= ~mask;
>          }
> -    else    /* edge triggered */
> +if (irq == 3)
> +    printk("%s %d: level=%d, irr = %x\n", __FUNCTION__,__LINE__,level, s->irr);
> +        }
> +    else    /* edge triggered */ {
>          if (level) {
>              if ((s->last_irr & mask) == 0) {
>                  ret = !(s->irr & mask);
> @@ -124,7 +127,9 @@ static inline int pic_set_irq1(struct kvm_kpic_state *s, int irq, int level)
>              s->last_irr |= mask;
>          } else
>              s->last_irr &= ~mask;
> -
> +if (irq == 3)
> +    printk("%s %d: level=%d, irr = %x\n", __FUNCTION__,__LINE__,level, s->irr);
> +        }
>      return (s->imr & mask) ? -1 : ret;
>  }
> 
> @@ -206,6 +211,8 @@ int kvm_pic_set_irq(void *opaque, int irq, int level)
> 
>      pic_lock(s);
>      if (irq >= 0 && irq < PIC_NUM_PINS) {
> +if (irq == 3)
> +printk("%s\n", __FUNCTION__);
>          ret = pic_set_irq1(&s->pics[irq >> 3], irq & 7, level);
>          pic_update_irq(s);
>          trace_kvm_pic_set_irq(irq >> 3, irq & 7, s->pics[irq >> 3].elcr,
> 
> 
> 
> While it's still working I see this here with the levels changing 0-1-0.
> Though then it stops and levels are only at '1'.
> 
> [ 1773.833824] kvm_pic_set_irq
> [ 1773.833827] pic_set_irq1 131: level=0, irr = 5b
> [ 1773.834161] kvm_pic_set_irq
> [ 1773.834163] pic_set_irq1 131: level=1, irr = 5b
> [ 1773.834193] kvm_pic_set_irq
> [ 1773.834195] pic_set_irq1 131: level=0, irr = 5b
> [ 1773.835028] kvm_pic_set_irq
> [ 1773.835031] pic_set_irq1 131: level=1, irr = 5b
> [ 1773.835542] kvm_pic_set_irq
> [ 1773.835545] pic_set_irq1 131: level=1, irr = 5b
> [ 1773.889892] kvm_pic_set_irq
> [ 1773.889894] pic_set_irq1 131: level=1, irr = 5b
> [ 1791.258793] pic_set_irq1 119: level=1, irr = d9
> [ 1791.258824] pic_set_irq1 119: level=0, irr = d1
> [ 1791.402476] pic_set_irq1 119: level=1, irr = d9
> [ 1791.402534] pic_set_irq1 119: level=0, irr = d1
> [ 1791.402538] pic_set_irq1 119: level=1, irr = d9
> [...]
> 
> 
> I believe the last 5 shown calls can be ignored. After that the
> interrupts don't go through anymore.
> 
> In the device model I see interrupts being raised and cleared. After the
> last one was cleared in 'my' device model, only interrupts are raised.
> This looks as if the interrupt handler in the guest Linux was never
> run, thus the IRQ is never cleared and we're stuck.
> 

User space is responsible for both setting and clearing that line. IRQ3
means you are using some serial device model? Then you should check what
its state is.

Moreover, a complete picture of the kernel/user space interaction should
be obtainable by using ftrace for capturing kvm events.

Jan
Stefan Berger Jan. 26, 2011, 12:05 p.m. UTC | #2
On 01/26/2011 03:14 AM, Jan Kiszka wrote:
> User space is responsible for both setting and clearing that line. IRQ3
> means you are using some serial device model? Then you should check what
> its state is.
Good hint. I moved it now to IRQ11 and it works fine now (with kvm-git) 
from what I can see. There was no UART on IRQ3 before, though, but 
certainly it was the wrong IRQ for it.
> Moreover, a complete picture of the kernel/user space interaction should
> be obtainable by using ftrace for capturing kvm events.
>
Should it be working on IRQ3? If so, I'd look into it when I get a chance...
    Stefan
Jan Kiszka Jan. 26, 2011, 12:09 p.m. UTC | #3
On 2011-01-26 13:05, Stefan Berger wrote:
> On 01/26/2011 03:14 AM, Jan Kiszka wrote:
>> User space is responsible for both setting and clearing that line. IRQ3
>> means you are using some serial device model? Then you should check what
>> its state is.
> Good hint. I moved it now to IRQ11 and it works fine now (with kvm-git)
> from what I can see. There was no UART on IRQ3 before, though, but
> certainly it was the wrong IRQ for it.
>> Moreover, a complete picture of the kernel/user space interaction should
>> be obtainable by using ftrace for capturing kvm events.
>>
> Should it be working on IRQ3? If so, I'd look into it when I get a
> chance...

I don't know your customizations, so it's hard to tell if that should
work or not. IRQ3 is intended to be used by ISA devices on the PC
machine. Are you adding an ISA model, or what is your use case?

Jan
Stefan Berger Jan. 26, 2011, 1:08 p.m. UTC | #4
On 01/26/2011 07:09 AM, Jan Kiszka wrote:
> On 2011-01-26 13:05, Stefan Berger wrote:
>> On 01/26/2011 03:14 AM, Jan Kiszka wrote:
>>> User space is responsible for both setting and clearing that line. IRQ3
>>> means you are using some serial device model? Then you should check what
>>> its state is.
>> Good hint. I moved it now to IRQ11 and it works fine now (with kvm-git)
>> from what I can see. There was no UART on IRQ3 before, though, but
>> certainly it was the wrong IRQ for it.
>>> Moreover, a complete picture of the kernel/user space interaction should
>>> be obtainable by using ftrace for capturing kvm events.
>>>
>> Should it be working on IRQ3? If so, I'd look into it when I get a
>> chance...
> I don't know your customizations, so it's hard to tell if that should
> work or not. IRQ3 is intended to be used by ISA devices on the PC
> machine. Are you adding an ISA model, or what is your use case?
>
The use case is to add a TPM device interface.

http://xenbits.xensource.com/xen-unstable.hg?file/1e56ac73b9b9/tools/ioemu/hw/tpm_tis.c

This one typically is connected to the LPC bus.

    Stefan

Jan Kiszka Jan. 26, 2011, 1:15 p.m. UTC | #5
On 2011-01-26 14:08, Stefan Berger wrote:
> On 01/26/2011 07:09 AM, Jan Kiszka wrote:
>> On 2011-01-26 13:05, Stefan Berger wrote:
>>> On 01/26/2011 03:14 AM, Jan Kiszka wrote:
>>>> User space is responsible for both setting and clearing that line. IRQ3
>>>> means you are using some serial device model? Then you should check
>>>> what
>>>> its state is.
>>> Good hint. I moved it now to IRQ11 and it works fine now (with kvm-git)
>>> from what I can see. There was no UART on IRQ3 before, though, but
>>> certainly it was the wrong IRQ for it.
>>>> Moreover, a complete picture of the kernel/user space interaction should
>>>> be obtainable by using ftrace for capturing kvm events.
>>>>
>>> Should it be working on IRQ3? If so, I'd look into it when I get a
>>> chance...
>> I don't know your customizations, so it's hard to tell if that should
>> work or not. IRQ3 is intended to be used by ISA devices on the PC
>> machine. Are you adding an ISA model, or what is your use case?
>>
> The use case is to add a TPM device interface.
> 
> http://xenbits.xensource.com/xen-unstable.hg?file/1e56ac73b9b9/tools/ioemu/hw/tpm_tis.c
> 
> 
> This one typically is connected to the LPC bus.

I see. Do you also have the xen-free version of it? Maybe there are
still issues with proper qdev integration etc.

Jan
Jan Kiszka Jan. 26, 2011, 1:31 p.m. UTC | #6
On 2011-01-26 14:15, Jan Kiszka wrote:
> On 2011-01-26 14:08, Stefan Berger wrote:
>> On 01/26/2011 07:09 AM, Jan Kiszka wrote:
>>> On 2011-01-26 13:05, Stefan Berger wrote:
>>>> On 01/26/2011 03:14 AM, Jan Kiszka wrote:
>>>>> User space is responsible for both setting and clearing that line. IRQ3
>>>>> means you are using some serial device model? Then you should check
>>>>> what
>>>>> its state is.
>>>> Good hint. I moved it now to IRQ11 and it works fine now (with kvm-git)
>>>> from what I can see. There was no UART on IRQ3 before, though, but
>>>> certainly it was the wrong IRQ for it.
>>>>> Moreover, a complete picture of the kernel/user space interaction should
>>>>> be obtainable by using ftrace for capturing kvm events.
>>>>>
>>>> Should it be working on IRQ3? If so, I'd look into it when I get a
>>>> chance...
>>> I don't know your customizations, so it's hard to tell if that should
>>> work or not. IRQ3 is intended to be used by ISA devices on the PC
>>> machine. Are you adding an ISA model, or what is your use case?
>>>
>> The use case is to add a TPM device interface.
>>
>> http://xenbits.xensource.com/xen-unstable.hg?file/1e56ac73b9b9/tools/ioemu/hw/tpm_tis.c
>>
>>
>> This one typically is connected to the LPC bus.
> 
> I see. Do you also have the xen-free version of it? Maybe there are
> still issues with proper qdev integration etc.
> 

Without knowing the hardware spec or what is actually behind set_irq,
this looks at least suspicious:

[...]
if (off == TPM_REG_INT_STATUS) {
    /* clearing of interrupt flags */
    if ((val & INTERRUPTS_SUPPORTED) &&
        (s->loc[locty].ints & INTERRUPTS_SUPPORTED)) {
        s->set_irq(s->irq_opaque, s->irq, 0);
        s->irq_pending = 0;
    }
    s->loc[locty].ints &= ~(val & INTERRUPTS_SUPPORTED);
} else
[...]

The code does not check if there are ints left after masking out those
provided in val. Does that device already de-assert the line if you only
clear a single interrupt reason?
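
To illustrate the concern, here is a minimal sketch, not the TPM TIS code,
of a write-one-to-clear status handler that lowers the line only once no
supported interrupt reason remains pending. All names (`struct dev_state`,
`int_status_write`, `SUPPORTED_INTS`) and the mask value are hypothetical,
and whether the real device behaves this way is exactly the open question.

```c
#include <stdint.h>

#define SUPPORTED_INTS 0x0fu /* hypothetical mask of supported reasons */

/* Hypothetical device state, for illustration only. */
struct dev_state {
    uint32_t ints;      /* pending interrupt reasons */
    int      irq_level; /* current level of the interrupt line */
};

/* Guest writes 'val' to the status register: each set bit clears a reason. */
static void int_status_write(struct dev_state *s, uint32_t val)
{
    s->ints &= ~(val & SUPPORTED_INTS);

    /* De-assert only when nothing supported is left pending. */
    if ((s->ints & SUPPORTED_INTS) == 0)
        s->irq_level = 0;
}
```

With two reasons pending, clearing the first leaves the line asserted in
this model; it drops only after the second one is cleared as well.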

BTW, irq_pending looks redundant, at least when using the qemu irq
subsystem.

Jan
Stefan Berger Jan. 26, 2011, 1:52 p.m. UTC | #7
On 01/26/2011 08:31 AM, Jan Kiszka wrote:
> On 2011-01-26 14:15, Jan Kiszka wrote:
>> On 2011-01-26 14:08, Stefan Berger wrote:
>>> On 01/26/2011 07:09 AM, Jan Kiszka wrote:
>>>> On 2011-01-26 13:05, Stefan Berger wrote:
>>>>> On 01/26/2011 03:14 AM, Jan Kiszka wrote:
>>>>>> On 2011-01-25 17:49, Stefan Berger wrote:
>>>>>>> On 01/25/2011 02:26 AM, Jan Kiszka wrote:
>>>>>>>> Do you see a chance to look closer at the issue yourself? E.g.
>>>>>>>> instrument the kernel's irqchip models and dump their states once
>>>>>>>> your
>>>>>>>> guest is stuck?
>>>>>>> The device runs on iRQ 3. So I applied this patch here.
>>>>>>>
>>>>>>> diff --git a/arch/x86/kvm/i8259.c b/arch/x86/kvm/i8259.c
>>>>>>> index 3cece05..8f4f94c 100644
>>>>>>> --- a/arch/x86/kvm/i8259.c
>>>>>>> +++ b/arch/x86/kvm/i8259.c
>>>>>>> @@ -106,7 +106,7 @@ static inline int pic_set_irq1(struct
>>>>>>> kvm_kpic_state
>>>>>>> *s, int irq, int level)
>>>>>>>     {
>>>>>>>         int mask, ret = 1;
>>>>>>>         mask = 1<<     irq;
>>>>>>> -    if (s->elcr&     mask)    /* level triggered */
>>>>>>> +    if (s->elcr&     mask)    /* level triggered */ {
>>>>>>>             if (level) {
>>>>>>>                 ret = !(s->irr&     mask);
>>>>>>>                 s->irr |= mask;
>>>>>>> @@ -115,7 +115,10 @@ static inline int pic_set_irq1(struct
>>>>>>> kvm_kpic_state *s, int irq, int level)
>>>>>>>                 s->irr&= ~mask;
>>>>>>>                 s->last_irr&= ~mask;
>>>>>>>             }
>>>>>>> -    else    /* edge triggered */
>>>>>>> +if (irq == 3)
>>>>>>> +    printk("%s %d: level=%d, irr = %x\n",
>>>>>>> __FUNCTION__,__LINE__,level,
>>>>>>> s->irr);
>>>>>>> +        }
>>>>>>> +    else    /* edge triggered */ {
>>>>>>>             if (level) {
>>>>>>>                 if ((s->last_irr&     mask) == 0) {
>>>>>>>                     ret = !(s->irr&     mask);
>>>>>>> @@ -124,7 +127,9 @@ static inline int pic_set_irq1(struct
>>>>>>> kvm_kpic_state
>>>>>>> *s, int irq, int level)
>>>>>>>                 s->last_irr |= mask;
>>>>>>>             } else
>>>>>>>                 s->last_irr&= ~mask;
>>>>>>> -
>>>>>>> +if (irq == 3)
>>>>>>> +    printk("%s %d: level=%d, irr = %x\n",
>>>>>>> __FUNCTION__,__LINE__,level,
>>>>>>> s->irr);
>>>>>>> +        }
>>>>>>>         return (s->imr&     mask) ? -1 : ret;
>>>>>>>     }
>>>>>>>
>>>>>>> @@ -206,6 +211,8 @@ int kvm_pic_set_irq(void *opaque, int irq, int
>>>>>>> level)
>>>>>>>
>>>>>>>         pic_lock(s);
>>>>>>>         if (irq>= 0&&     irq<     PIC_NUM_PINS) {
>>>>>>> +if (irq == 3)
>>>>>>> +printk("%s\n", __FUNCTION__);
>>>>>>>             ret = pic_set_irq1(&s->pics[irq>>     3], irq&     7, level);
>>>>>>>             pic_update_irq(s);
>>>>>>>             trace_kvm_pic_set_irq(irq>>     3, irq&     7, s->pics[irq>>
>>>>>>> 3].elcr,
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> While it's still working I see this here with the levels changing
>>>>>>> 0-1-0.
>>>>>>> Though then it stops and levels are only at '1'.
>>>>>>>
>>>>>>> [ 1773.833824] kvm_pic_set_irq
>>>>>>> [ 1773.833827] pic_set_irq1 131: level=0, irr = 5b
>>>>>>> [ 1773.834161] kvm_pic_set_irq
>>>>>>> [ 1773.834163] pic_set_irq1 131: level=1, irr = 5b
>>>>>>> [ 1773.834193] kvm_pic_set_irq
>>>>>>> [ 1773.834195] pic_set_irq1 131: level=0, irr = 5b
>>>>>>> [ 1773.835028] kvm_pic_set_irq
>>>>>>> [ 1773.835031] pic_set_irq1 131: level=1, irr = 5b
>>>>>>> [ 1773.835542] kvm_pic_set_irq
>>>>>>> [ 1773.835545] pic_set_irq1 131: level=1, irr = 5b
>>>>>>> [ 1773.889892] kvm_pic_set_irq
>>>>>>> [ 1773.889894] pic_set_irq1 131: level=1, irr = 5b
>>>>>>> [ 1791.258793] pic_set_irq1 119: level=1, irr = d9
>>>>>>> [ 1791.258824] pic_set_irq1 119: level=0, irr = d1
>>>>>>> [ 1791.402476] pic_set_irq1 119: level=1, irr = d9
>>>>>>> [ 1791.402534] pic_set_irq1 119: level=0, irr = d1
>>>>>>> [ 1791.402538] pic_set_irq1 119: level=1, irr = d9
>>>>>>> [...]
>>>>>>>
>>>>>>>
>>>>>>> I believe the last 5 shown calls can be ignored. After that the
>>>>>>> interrupts don't go through anymore.
>>>>>>>
>>>>>>> In the device model I see interrupts being raised and cleared.
>>>>>>> After the
>>>>>>> last one was cleared in 'my' device model, only interrupts are raised.
>>>>>>> This looks like as if the interrupt handler in the guest Linux was
>>>>>>> never
>>>>>>> run, thus the IRQ is never cleared and we're stuck.
>>>>>>>
>>>>>> User space is responsible for both setting and clearing that line. IRQ3
>>>>>> means you are using some serial device model? Then you should check
>>>>>> what
>>>>>> its state is.
>>>>> Good hint. I moved it now to IRQ11 and it works fine now (with kvm-git)
>>>>> from what I can see. There was no UART on IRQ3 before, though, but
>>>>> certainly it was the wrong IRQ for it.
>>>>>> Moreover, a complete picture of the kernel/user space interaction
>>>>>> should
>>>>>> be obtainable by using ftrace for capturing kvm events.
>>>>>>
>>>>> Should it be working on IRQ3? If so, I'd look into it when I get a
>>>>> chance...
>>>> I don't know your customizations, so it's hard to tell if that should
>>>> work or not. IRQ3 is intended to be used by ISA devices on the PC
>>>> machine. Are you adding an ISA model, or what is your use case?
>>>>
>>> The use case is to add a TPM device interface.
>>>
>>> http://xenbits.xensource.com/xen-unstable.hg?file/1e56ac73b9b9/tools/ioemu/hw/tpm_tis.c
>>>
>>>
>>> This one typically is connected to the LPC bus.
>> I see. Do you also have the xen-free version of it? Maybe there are
>> still issues with proper qdev integration etc.
>>
> Without knowing the hardware spec or what is actually behind set_irq,
> this looks at least suspicious:
>
> [...]
> if (off == TPM_REG_INT_STATUS) {
>      /* clearing of interrupt flags */
>      if ((val&  INTERRUPTS_SUPPORTED)&&
>          (s->loc[locty].ints&  INTERRUPTS_SUPPORTED)) {
>          s->set_irq(s->irq_opaque, s->irq, 0);
>          s->irq_pending = 0;
>      }
>      s->loc[locty].ints&= ~(val&  INTERRUPTS_SUPPORTED);
> } else
> [...]
>
> The code does not check if there are ints left after masking out those
> provided in val.
> Does that device already de-asserts the line if you only clear a single
> interrupt reason?
>
> BTW, irq_pending looks redundant, at least when using the qemu irq
> subsystem.
The code has changed substantially in the meantime. The Xen repository
code is more than 3 years old, and I had to go back through the
xen-unstable history to find it. The link was merely meant to show what
type of device is being added. As I said, some other things need to come
together first before this will become available.

    Stefan
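
For readers following the printk output earlier in the thread, the behaviour of pic_set_irq1 can be reproduced with a small user-space model. This is a sketch only, not kernel code: `pic_model` and `model_set_irq1` are hypothetical names, and the lines that the diff elides between hunks are filled in on the assumption that they match the upstream function.

```c
#include <stdint.h>

/* Simplified model of one 8259 PIC; field names follow the patch context. */
struct pic_model {
    uint8_t irr;        /* interrupt request register */
    uint8_t last_irr;   /* last seen line levels, used for edge detection */
    uint8_t imr;        /* interrupt mask register */
    uint8_t elcr;       /* bit set = pin is level triggered */
};

/*
 * Returns -1 if the pin is masked, 0 if the request was coalesced with an
 * already pending IRR bit, and 1 otherwise.
 */
static int model_set_irq1(struct pic_model *s, int irq, int level)
{
    int mask = 1 << irq;
    int ret = 1;

    if (s->elcr & mask) {               /* level triggered */
        if (level) {
            ret = !(s->irr & mask);
            s->irr |= mask;
            s->last_irr |= mask;
        } else {
            s->irr &= ~mask;
            s->last_irr &= ~mask;
        }
    } else {                            /* edge triggered */
        if (level) {
            if ((s->last_irr & mask) == 0) {
                ret = !(s->irr & mask);
                s->irr |= mask;
            }
            s->last_irr |= mask;
        } else {
            s->last_irr &= ~mask;       /* IRR stays set until acknowledged */
        }
    }
    return (s->imr & mask) ? -1 : ret;
}
```

In this model an edge-triggered pin keeps its IRR bit set when the level drops, which appears consistent with the "131" lines in the log, where irr stays at 5b for both level=0 and level=1. A level-triggered pin clears the bit when the level drops, as in the "119" lines, where irr toggles between d9 and d1.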

Patch

diff --git a/arch/x86/kvm/i8259.c b/arch/x86/kvm/i8259.c
index 3cece05..8f4f94c 100644
--- a/arch/x86/kvm/i8259.c
+++ b/arch/x86/kvm/i8259.c
@@ -106,7 +106,7 @@ static inline int pic_set_irq1(struct kvm_kpic_state *s, int irq, int level)
 {
 	int mask, ret = 1;
 	mask = 1 << irq;
-	if (s->elcr & mask)	/* level triggered */
+	if (s->elcr & mask)	/* level triggered */ {
 		if (level) {
 			ret = !(s->irr & mask);
 			s->irr |= mask;
@@ -115,7 +115,10 @@ static inline int pic_set_irq1(struct kvm_kpic_state *s, int irq, int level)
 			s->irr &= ~mask;
 			s->last_irr &= ~mask;
 		}
-	else	/* edge triggered */
+if (irq == 3)
+    printk("%s %d: level=%d, irr = %x\n", __FUNCTION__,__LINE__,level, s->irr);
+        }
+	else	/* edge triggered */ {
 		if (level) {
 			if ((s->last_irr & mask) == 0) {
 				ret = !(s->irr & mask);
@@ -124,7 +127,9 @@ static inline int pic_set_irq1(struct kvm_kpic_state *s, int irq, int level)
 			s->last_irr |= mask;
 		} else
 			s->last_irr &= ~mask;
-
+if (irq == 3)
+    printk("%s %d: level=%d, irr = %x\n", __FUNCTION__,__LINE__,level, s->irr);
+        }
 	return (s->imr & mask) ? -1 : ret;
 }
 
@@ -206,6 +211,8 @@ int kvm_pic_set_irq(void *opaque, int irq, int level)
 
 	pic_lock(s);
 	if (irq >= 0 && irq < PIC_NUM_PINS) {
+if (irq == 3)
+printk("%s\n", __FUNCTION__);
 		ret = pic_set_irq1(&s->pics[irq >> 3], irq & 7, level);
 		pic_update_irq(s);
 		trace_kvm_pic_set_irq(irq >> 3, irq & 7, s->pics[irq >> 3].elcr,