Patchwork [v2,1/2] qxl: send interrupt after migration in case ram->int_pending != 0, RHBZ #732949

Submitter Yonit Halperin
Date Aug. 31, 2011, 12:37 p.m.
Message ID <1314794254-11624-1-git-send-email-yhalperi@redhat.com>
Download mbox | patch
Permalink /patch/112548/
State New
Headers show

Comments

Yonit Halperin - Aug. 31, 2011, 12:37 p.m.
if qxl_send_events was called from spice server context, and then
migration had completed before a call to pipe_read, the target
guest qxl driver didn't get the interrupt. In addition,
qxl_send_events ignored further interrupts of the same kind, since
ram->int_pending was set. As a result, the guest driver was stuck
or very slow (when the wait for the interrupt had a timeout).
---
 hw/qxl.c |    9 +++++++--
 1 files changed, 7 insertions(+), 2 deletions(-)
Gerd Hoffmann - Aug. 31, 2011, 12:42 p.m.
On 08/31/11 14:37, Yonit Halperin wrote:
> if qxl_send_events was called from spice server context, and then
> migration had completed before a call to pipe_read, the target
> guest qxl driver didn't get the interrupt. In addition,
> qxl_send_events ignored further interrupts of the same kind, since
> ram->int_pending was set. As a result, the guest driver was stacked
> or very slow (when the waiting for the interrupt was with timeout).

Looks fine except for this:

=== checkpatch complains ===
WARNING: line over 80 characters
#22: FILE: hw/qxl.c:1468:
+         * migration ended, qxl_set_irq for these events might not have been called

cheers,
   Gerd

PS: /me suggests to check out
http://blog.vmsplice.net/2011/03/how-to-automatically-run-checkpatchpl.html
Michael S. Tsirkin - Sept. 1, 2011, 7:36 p.m.
On Wed, Aug 31, 2011 at 03:37:33PM +0300, Yonit Halperin wrote:
> if qxl_send_events was called from spice server context, and then
> migration had completed before a call to pipe_read, the target
> guest qxl driver didn't get the interrupt.

This is a general issue with interrupt migration, and PCI core has code
to handle this, migrating interrupts.  So rather than work around this
in qxl I'd like us to first understand whether there really exists such
a problem, since if yes it would affect other devices.

Could you help with that please?

> In addition,
> qxl_send_events ignored further interrupts of the same kind, since
> ram->int_pending was set.

Maybe this is the only issue?
A way to check would be to call
    uint32_t pending = le32_to_cpu(d->ram->int_pending);
    uint32_t mask    = le32_to_cpu(d->ram->int_mask);
    int level = !!(pending & mask);
    qxl_ring_set_dirty(d);

instead of qxl_set_irq, and see if that is enough.

Note: I don't object to reusing qxl_set_irq in
production, just let us make sure we don't hide bugs.
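Michael's suggested check can be exercised in isolation. A minimal, self-contained sketch of the level computation his snippet outlines (le32_to_cpu is reimplemented portably here, since QEMU's helper isn't available outside the tree; irq_level is an illustrative name, not a QEMU function):

```c
#include <stdint.h>
#include <string.h>

/* Portable stand-in for QEMU's le32_to_cpu(): interpret the four bytes
 * of v as a little-endian 32-bit value, regardless of host endianness. */
static uint32_t le32_to_cpu(uint32_t v)
{
    uint8_t b[4];
    memcpy(b, &v, 4);
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* The interrupt line level from Michael's snippet: asserted only when
 * at least one pending event is not masked off by the guest driver. */
static int irq_level(uint32_t int_pending, uint32_t int_mask)
{
    uint32_t pending = le32_to_cpu(int_pending);
    uint32_t mask    = le32_to_cpu(int_mask);
    return !!(pending & mask);
}
```

If calling only this (plus qxl_ring_set_dirty) on resume is enough to unwedge the guest, the problem is purely the stale int_pending bookkeeping rather than a lost IRQ edge.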

> As a result, the guest driver was stacked
> or very slow (when the waiting for the interrupt was with timeout).

You need to sign off :)

> ---
>  hw/qxl.c |    9 +++++++--
>  1 files changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/qxl.c b/hw/qxl.c
> index b34bccf..c7edc60 100644
> --- a/hw/qxl.c
> +++ b/hw/qxl.c
> @@ -1362,7 +1362,6 @@ static void pipe_read(void *opaque)
>      qxl_set_irq(d);
>  }
>  
> -/* called from spice server thread context only */
>  static void qxl_send_events(PCIQXLDevice *d, uint32_t events)
>  {
>      uint32_t old_pending;
> @@ -1463,7 +1462,13 @@ static void qxl_vm_change_state_handler(void *opaque, int running, int reason)
>      PCIQXLDevice *qxl = opaque;
>      qemu_spice_vm_change_state_handler(&qxl->ssd, running, reason);
>  
> -    if (!running && qxl->mode == QXL_MODE_NATIVE) {
> +    if (running) {
> +        /*
> +         * if qxl_send_events was called from spice server context before
> +         * migration ended, qxl_set_irq for these events might not have been called
> +         */
> +         qxl_set_irq(qxl);
> +    } else if (qxl->mode == QXL_MODE_NATIVE) {
>          /* dirty all vram (which holds surfaces) and devram (primary surface)
>           * to make sure they are saved */
>          /* FIXME #1: should go out during "live" stage */
> -- 
> 1.7.4.4
>
Yonit Halperin - Sept. 4, 2011, 5:38 a.m.
On 09/01/2011 10:36 PM, Michael S. Tsirkin wrote:
> On Wed, Aug 31, 2011 at 03:37:33PM +0300, Yonit Halperin wrote:
>> if qxl_send_events was called from spice server context, and then
>> migration had completed before a call to pipe_read, the target
>> guest qxl driver didn't get the interrupt.
>
> This is a general issue with interrupt migration, and PCI core has code
> to handle this, migrating interrupts.  So rather than work around this
> in qxl I'd like us to first understand whether there really exists such
> a problem, since if yes it would affect other devices.
>
> Could you help with that please?
>
I think this issue is spice-specific: the problem is that when a 
spice_server thread issues a request for an interrupt, the request is 
passed to the qemu thread through a pipe. The pipe's contents are not 
saved during migration. Thus, any pending interrupt requests are purged 
when migration completes.
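Why one lost pipe byte wedges the driver follows from the test-and-set pattern described above for qxl_send_events: events are OR'ed into ram->int_pending, and the qemu thread is only signalled when the bit was not already set. A toy model (the struct, names, and the woke_qemu_thread flag standing in for the real pipe write are illustrative, not QEMU code):

```c
#include <stdint.h>

/* Toy model of qxl_send_events' event accounting: an event bit is OR'ed
 * into int_pending, and the qemu thread is only woken (one byte written
 * down the pipe) when that bit was not already pending. */
typedef struct {
    uint32_t int_pending;
    int woke_qemu_thread;   /* stands in for write() to the pipe */
} FakeQXL;

static int send_events(FakeQXL *d, uint32_t events)
{
    uint32_t old_pending = d->int_pending;
    d->int_pending |= events;           /* atomic OR in the real code */
    if ((old_pending & events) == events) {
        return 0;                       /* already pending: no new wakeup */
    }
    d->woke_qemu_thread = 1;
    return 1;
}
```

If the wakeup is dropped (pipe contents not migrated), int_pending stays set on the target, so later events of the same kind take the "already pending" branch and the guest never receives the interrupt; raising the IRQ once when the VM resumes, as the patch does, clears that state.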
>> In addition,
>> qxl_send_events ignored further interrupts of the same kind, since
>> ram->int_pending was set.
>
> Maybe this is the only issue?
> A way to check would be to call
>      uint32_t pending = le32_to_cpu(d->ram->int_pending);
>      uint32_t mask    = le32_to_cpu(d->ram->int_mask);
>      int level = !!(pending & mask);
>      qxl_ring_set_dirty(d);
>
> instead of qxl_set_irq, and see if that is enough.
>
I was talking about the check in qxl_send_events
> Note: I don't object to reusing qxl_set_irq in
> production, just let us make sure we don't hide bugs.
>
>> As a result, the guest driver was stacked
>> or very slow (when the waiting for the interrupt was with timeout).
>
> You need to sign off :)
>
>> ---
>>   hw/qxl.c |    9 +++++++--
>>   1 files changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/qxl.c b/hw/qxl.c
>> index b34bccf..c7edc60 100644
>> --- a/hw/qxl.c
>> +++ b/hw/qxl.c
>> @@ -1362,7 +1362,6 @@ static void pipe_read(void *opaque)
>>       qxl_set_irq(d);
>>   }
>>
>> -/* called from spice server thread context only */
>>   static void qxl_send_events(PCIQXLDevice *d, uint32_t events)
>>   {
>>       uint32_t old_pending;
>> @@ -1463,7 +1462,13 @@ static void qxl_vm_change_state_handler(void *opaque, int running, int reason)
>>       PCIQXLDevice *qxl = opaque;
>>       qemu_spice_vm_change_state_handler(&qxl->ssd, running, reason);
>>
>> -    if (!running && qxl->mode == QXL_MODE_NATIVE) {
>> +    if (running) {
>> +        /*
>> +         * if qxl_send_events was called from spice server context before
>> +         * migration ended, qxl_set_irq for these events might not have been called
>> +         */
>> +         qxl_set_irq(qxl);
>> +    } else if (qxl->mode == QXL_MODE_NATIVE) {
>>           /* dirty all vram (which holds surfaces) and devram (primary surface)
>>            * to make sure they are saved */
>>           /* FIXME #1: should go out during "live" stage */
>> --
>> 1.7.4.4
>>

Patch

diff --git a/hw/qxl.c b/hw/qxl.c
index b34bccf..c7edc60 100644
--- a/hw/qxl.c
+++ b/hw/qxl.c
@@ -1362,7 +1362,6 @@  static void pipe_read(void *opaque)
     qxl_set_irq(d);
 }
 
-/* called from spice server thread context only */
 static void qxl_send_events(PCIQXLDevice *d, uint32_t events)
 {
     uint32_t old_pending;
@@ -1463,7 +1462,13 @@  static void qxl_vm_change_state_handler(void *opaque, int running, int reason)
     PCIQXLDevice *qxl = opaque;
     qemu_spice_vm_change_state_handler(&qxl->ssd, running, reason);
 
-    if (!running && qxl->mode == QXL_MODE_NATIVE) {
+    if (running) {
+        /*
+         * if qxl_send_events was called from spice server context before
+         * migration ended, qxl_set_irq for these events might not have been called
+         */
+         qxl_set_irq(qxl);
+    } else if (qxl->mode == QXL_MODE_NATIVE) {
         /* dirty all vram (which holds surfaces) and devram (primary surface)
          * to make sure they are saved */
         /* FIXME #1: should go out during "live" stage */