[1/1] virtio: fallback from irqfd to non-irqfd notify

Message ID 08ca0c91-4a6d-1750-ed79-a0f6e2ca7eaf@linux.vnet.ibm.com
State New
Commit Message

Halil Pasic March 1, 2017, 4:08 p.m. UTC
On 03/01/2017 03:29 PM, Paolo Bonzini wrote:
> 
> 
> On 01/03/2017 14:22, Halil Pasic wrote:
>> Here a trace:
>>
>> 135871@1488304024.512533:virtio_blk_req_complete req 0x2aa6b117e10 status 0
>> 135871@1488304024.512541:virtio_notify_irqfd vdev 0x2aa6b0e19d8 vq 0x2aa6b4c0870
>> 135871@1488304024.522607:virtio_blk_req_complete req 0x2aa6b118980 status 0
>> 135871@1488304024.522616:virtio_blk_req_complete req 0x2aa6b119260 status 0
>> 135871@1488304024.522627:virtio_notify_irqfd vdev 0x2aa6b0e19d8 vq 0x2aa6b4c0870
>> 135871@1488304024.527386:virtio_blk_req_complete req 0x2aa6b118980 status 0
>> 135871@1488304024.527431:virtio_notify_irqfd vdev 0x2aa6b0e19d8 vq 0x2aa6b4c0870
>> 135871@1488304024.528611:virtio_guest_notifier_read vdev 0x2aa6b0e61c8 vq 0x2aa6b4de880
>> 135871@1488304024.528628:virtio_guest_notifier_read vdev 0x2aa6b0e61c8 vq 0x2aa6b4de8f8
>> 135871@1488304024.528753:virtio_blk_data_plane_stop dataplane 0x2aa6b0e5540
>>                          ^== DATAPLANE STOP  
>> 135871@1488304024.530709:virtio_blk_req_complete req 0x2aa6b117e10 status 0
>> 135871@1488304024.530752:virtio_guest_notifier_read vdev 0x2aa6b0e19d8 vq 0x2aa6b4c0870
>>                          ^== comes from k->set_guest_notifiers(qbus->parent, nvqs, false);
>>                              in virtio_blk_data_plane_stop and done immediately after
>>                              irqfd is cleaned up by the transport
>> 135871@1488304024.530836:virtio_notify_irqfd vdev 0x2aa6b0e19d8 vq 0x2aa6b4c0870
>> halil: error in event_notifier_set: Bad file descriptor
>>                          ^== here we have the problem
>>
>> If you want a stack trace, that can be arranged too.
>>
>>> like a reset should cause it (the only call in virtio-blk is from
>>> virtio_blk_data_plane_stop), and then the guest doesn't care anymore
>>> about interrupts.
>> I do not understand this with 'doesn't care anymore about interrupts'.
>> I was debugging a virtio-blk device being stuck waiting for a host
>> notification (interrupt) after migration.
> 
> Ok, this explains it better then.  The issue is that
> virtio_blk_data_plane_stop doesn't flush the bottom half, which you want
> to do when the caller is, for example, virtio_ccw_vmstate_change.
> 
> Does it work if you call to qemu_bh_cancel(s->bh) and notify_guest_bh(s)
> after
> 
>     blk_set_aio_context(s->conf->conf.blk, qemu_get_aio_context());
> 
> ?
> 

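With

--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -260,6 +260,8 @@ void virtio_blk_data_plane_stop(VirtIODevice *vdev)
 
     /* Drain and switch bs back to the QEMU main loop */
     blk_set_aio_context(s->conf->conf.blk, qemu_get_aio_context());
+    qemu_bh_cancel(s->bh);
+    notify_guest_bh(s);
 
applied I do not see the problem any more. I will most likely
turn this into a patch tomorrow. I would like to give it some more testing and
thinking (see questions below) until tomorrow.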

I should probably cc stable, or?

I would also like to do some diagnostic stuff if virtio_notify_irqfd fails.
Maybe assert success for event_notifier_set. Would that be OK with you?

I have a couple of questions about the ways of the dataplane code. If
you are too busy, feel free to not answer -- I will keep thinking myself.

Q1. For this to work correctly, it seems to me, we need to be sure that
virtio_blk_req_complete cannot happen between the newly added
notify_guest_bh(s);
and the point where
vblk->dataplane_started = false;
becomes visible. How is this ensured?
Q2. virtio_blk_data_plane_stop should be called from the thread/context
associated with the main event loop, and with that
vblk->dataplane_started = false too. But I think dataplane_started
may end up being used from a different thread (e.g. req_complete).
How does the sequencing work there and/or is it even important?

Regards,
Halil
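
For readers following the trace, here is a minimal sketch of the race and of the proposed fix. It is a simplified model, not QEMU code: the Dataplane struct, its fields and all function names below are hypothetical stand-ins for the bottom half (s->bh), the irqfd and notify_guest_bh(). It only illustrates why a deferred notification that is still scheduled when the irqfd goes away gets lost, and why cancelling and running it synchronously in the stop path avoids that.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the dataplane state; not QEMU's real struct. */
typedef struct {
    bool bh_scheduled;     /* a deferred guest notification is pending */
    bool irqfd_valid;      /* the irqfd has not been cleaned up yet */
    int  guest_interrupts; /* notifications that reached the guest */
    int  lost;             /* notifications attempted on a dead irqfd */
} Dataplane;

/* Deferred work: deliver the batched notification through the irqfd. */
static void notify_guest(Dataplane *s)
{
    if (s->irqfd_valid) {
        s->guest_interrupts++;
    } else {
        s->lost++;         /* corresponds to "Bad file descriptor" in the trace */
    }
}

/* Request completion only schedules the deferred notification. */
static void req_complete(Dataplane *s)
{
    s->bh_scheduled = true;
}

/* What the event loop does later: run the pending deferred work, if any. */
static void run_pending(Dataplane *s)
{
    if (s->bh_scheduled) {
        s->bh_scheduled = false;
        notify_guest(s);
    }
}

/* Stop path without the fix: the irqfd goes away while work is pending. */
static void stop_without_flush(Dataplane *s)
{
    s->irqfd_valid = false;
}

/* Stop path with the fix: cancel the deferred work and run it right away,
 * while the irqfd is still usable, then tear the irqfd down. */
static void stop_with_flush(Dataplane *s)
{
    if (s->bh_scheduled) {
        s->bh_scheduled = false;   /* models qemu_bh_cancel(s->bh) */
        notify_guest(s);           /* models notify_guest_bh(s) */
    }
    s->irqfd_valid = false;
}
```

In this model, a completion followed by stop_without_flush() and a later run_pending() loses the interrupt, while the same sequence with stop_with_flush() delivers it before teardown. The sketch assumes, as Paolo's explanation suggests, that no new completion can arrive once the device has been drained.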

Comments

Michael S. Tsirkin March 1, 2017, 4:53 p.m. UTC | #1
On Wed, Mar 01, 2017 at 05:08:39PM +0100, Halil Pasic wrote:
> 
> 
> On 03/01/2017 03:29 PM, Paolo Bonzini wrote:
> > 
> > 
> > On 01/03/2017 14:22, Halil Pasic wrote:
> >> Here a trace:
> >>
> >> 135871@1488304024.512533:virtio_blk_req_complete req 0x2aa6b117e10 status 0
> >> 135871@1488304024.512541:virtio_notify_irqfd vdev 0x2aa6b0e19d8 vq 0x2aa6b4c0870
> >> 135871@1488304024.522607:virtio_blk_req_complete req 0x2aa6b118980 status 0
> >> 135871@1488304024.522616:virtio_blk_req_complete req 0x2aa6b119260 status 0
> >> 135871@1488304024.522627:virtio_notify_irqfd vdev 0x2aa6b0e19d8 vq 0x2aa6b4c0870
> >> 135871@1488304024.527386:virtio_blk_req_complete req 0x2aa6b118980 status 0
> >> 135871@1488304024.527431:virtio_notify_irqfd vdev 0x2aa6b0e19d8 vq 0x2aa6b4c0870
> >> 135871@1488304024.528611:virtio_guest_notifier_read vdev 0x2aa6b0e61c8 vq 0x2aa6b4de880
> >> 135871@1488304024.528628:virtio_guest_notifier_read vdev 0x2aa6b0e61c8 vq 0x2aa6b4de8f8
> >> 135871@1488304024.528753:virtio_blk_data_plane_stop dataplane 0x2aa6b0e5540
> >>                          ^== DATAPLANE STOP  
> >> 135871@1488304024.530709:virtio_blk_req_complete req 0x2aa6b117e10 status 0
> >> 135871@1488304024.530752:virtio_guest_notifier_read vdev 0x2aa6b0e19d8 vq 0x2aa6b4c0870
> >>                          ^== comes from k->set_guest_notifiers(qbus->parent, nvqs, false);
> >>                              in virtio_blk_data_plane_stop and done immediately after
> >>                              irqfd is cleaned up by the transport
> >> 135871@1488304024.530836:virtio_notify_irqfd vdev 0x2aa6b0e19d8 vq 0x2aa6b4c0870
> >> halil: error in event_notifier_set: Bad file descriptor
> >>                          ^== here we have the problem
> >>
> >> If you want a stack trace, that can be arranged too.
> >>
> >>> like a reset should cause it (the only call in virtio-blk is from
> >>> virtio_blk_data_plane_stop), and then the guest doesn't care anymore
> >>> about interrupts.
> >> I do not understand this with 'doesn't care anymore about interrupts'.
> >> I was debugging a virtio-blk device being stuck waiting for a host
> >> notification (interrupt) after migration.
> > 
> > Ok, this explains it better then.  The issue is that
> > virtio_blk_data_plane_stop doesn't flush the bottom half, which you want
> > to do when the caller is, for example, virtio_ccw_vmstate_change.
> > 
> > Does it work if you call to qemu_bh_cancel(s->bh) and notify_guest_bh(s)
> > after
> > 
> >     blk_set_aio_context(s->conf->conf.blk, qemu_get_aio_context());
> > 
> > ?
> > 
> 
> With
> 
> --- a/hw/block/dataplane/virtio-blk.c
> +++ b/hw/block/dataplane/virtio-blk.c
> @@ -260,6 +260,8 @@ void virtio_blk_data_plane_stop(VirtIODevice *vdev)
>  
>      /* Drain and switch bs back to the QEMU main loop */
>      blk_set_aio_context(s->conf->conf.blk, qemu_get_aio_context());
> +    qemu_bh_cancel(s->bh);
> +    notify_guest_bh(s);
>  
> applied I do not see the problem any more. I will most likely
> turn this into a patch tomorrow. I would like to give it some more testing and
> thinking (see questions below) until tomorrow.
> 
> I should probably cc stable, or?
> 
> I would also like to do some diagnostic stuff if virtio_notify_irqfd fails.
> Maybe assert success for event_notifier_set. Would that be OK with you?

Main reason it can't fail is because we don't close the fd.
Given no callers check the return status, I'd be inclined to go further
and stick that assert into event_notifier_set, convert that
function to int.
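
A rough sketch of what such a check could look like, for illustration only. It assumes a Linux eventfd and uses hypothetical helper names (notifier_set, notifier_set_checked); it is not the actual event_notifier_set implementation. The point is that writing to an fd that has already been closed fails with EBADF, which is the error seen in the trace, and that an assert placed at this level would catch it at the first failing notification.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical helper: signal an eventfd. Returns 0 on success or -errno.
 * EAGAIN is treated as success here, on the assumption that a full counter
 * means a notification is already pending. */
static int notifier_set(int fd)
{
    uint64_t one = 1;
    ssize_t ret;

    do {
        ret = write(fd, &one, sizeof(one));
    } while (ret < 0 && errno == EINTR);

    if (ret < 0 && errno != EAGAIN) {
        return -errno;
    }
    return 0;
}

/* Variant that treats any failure as a programming error. */
static void notifier_set_checked(int fd)
{
    int ret = notifier_set(fd);

    assert(ret == 0);
    (void)ret;   /* keep the build quiet when NDEBUG is defined */
}
```

With a valid eventfd both helpers succeed. After close(fd), notifier_set(fd) returns -EBADF, and notifier_set_checked(fd) would abort.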

> I have a couple of questions about the ways of the dataplane code. If
> you are too busy, feel free to not answer -- I will keep thinking myself.
> 
> Q1. For this to work correctly, it seems to me, we need to be sure that
> virtio_blk_req_complete cannot happen between the newly added
> notify_guest_bh(s);
> and the point where
> vblk->dataplane_started = false;
> becomes visible. How is this ensured?
> Q2. virtio_blk_data_plane_stop should be called from the thread/context
> associated with the main event loop, and with that
> vblk->dataplane_started = false too. But I think dataplane_started
> may end up being used from a different thread (e.g. req_complete).
> How does the sequencing work there and/or is it even important?
> 
> Regards,
> Halil

Paolo Bonzini March 1, 2017, 7:53 p.m. UTC | #2
On 01/03/2017 17:08, Halil Pasic wrote:
> applied I do not see the problem any more. I will most likely
> turn this into a patch tomorrow. I would like to give it some more testing and
> thinking (see questions below) until tomorrow.
> 
> I should probably cc stable, or?

Yes, please do!

> 
> Q1. For this to work correctly, it seems to me, we need to be sure that
> virtio_blk_req_complete cannot happen between the newly added
> notify_guest_bh(s);
> and the point where
> vblk->dataplane_started = false;
> becomes visible. How is this ensured?

blk_set_aio_context drains the block device, and the event notifiers are
not active anymore so draining the block device coincides with the last
call to virtio_blk_req_complete.

Please add a comment - it's a good observation.

> Q2. virtio_blk_data_plane_stop should be called from the thread/context
> associated with the main event loop, and with that
> vblk->dataplane_started = false too. But I think dataplane_started
> may end up being used from a different thread (e.g. req_complete).

1) virtio_queue_aio_set_host_notifier_handler stops the event notifiers

2) virtio_bus_set_host_notifier invokes them one last time before exiting

Note that this could call again virtio_queue_notify_vq and hence
virtio_device_start_ioeventfd, but dataplane won't be reactivated
because vblk->dataplane_started is still true.
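
To illustrate the guard described above, a simplified sketch follows. The Dev struct, its fields and the functions are hypothetical, and the teardown steps are collapsed into comments; this is not the real virtio-blk code. It shows only that a final handler invocation during stop cannot restart the dataplane as long as the started flag is cleared last.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device state, not QEMU's real structures. */
typedef struct {
    bool started;      /* plays the role of vblk->dataplane_started */
    bool stopping;     /* guards against re-entering stop */
    int  start_calls;  /* how often start actually did work */
    int  final_kicks;  /* handler invocations made during teardown */
} Dev;

static void dev_start(Dev *d)
{
    if (d->started) {
        return;        /* already running: nothing to do */
    }
    d->started = true;
    d->start_calls++;
}

/* Kick handler on the non-dataplane path: it tries to start the dataplane. */
static void handle_kick(Dev *d)
{
    dev_start(d);
}

static void dev_stop(Dev *d)
{
    if (!d->started || d->stopping) {
        return;
    }
    d->stopping = true;

    /* 1) the event notifiers would be detached here */
    /* 2) the handler is invoked one last time for a kick that may be pending */
    d->final_kicks++;
    handle_kick(d);    /* no restart: started is still true at this point */

    d->started = false;   /* cleared only after the final invocation */
    d->stopping = false;
}
```

If the flag were cleared before the final handler invocation, handle_kick() would call dev_start() and bring the dataplane back up in the middle of teardown, which is the situation the ordering is meant to rule out.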

> How does the sequencing work there and/or is it even important?

It is important and not really easy to get right---as shown by the bug
you found, in fact.

Thanks,

Paolo

Halil Pasic March 2, 2017, 1:14 p.m. UTC | #3
On 03/01/2017 08:53 PM, Paolo Bonzini wrote:
> 
> 
> On 01/03/2017 17:08, Halil Pasic wrote:
>> applied I do not see the problem any more. I will most likely
>> turn this into a patch tomorrow. I would like to give it some more testing and
>> thinking (see questions below) until tomorrow.
>>
>> I should probably cc stable, or?
> 
> Yes, please do!
> 
>>
>> Q1. For this to work correctly, it seems to me, we need to be sure that
>> virtio_blk_req_complete cannot happen between the newly added
>> notify_guest_bh(s);
>> and the point where
>> vblk->dataplane_started = false;
>> becomes visible. How is this ensured?
> 
> blk_set_aio_context drains the block device, and the event notifiers are
> not active anymore so draining the block device coincides with the last
> call to virtio_blk_req_complete.
> 
> Please add a comment - it's a good observation.
> 
>> Q2. virtio_blk_data_plane_stop should be called from the thread/context
>> associated with the main event loop, and with that
>> vblk->dataplane_started = false too. But I think dataplane_started
>> may end up being used from a different thread (e.g. req_complete).
> 
> 1) virtio_queue_aio_set_host_notifier_handler stops the event notifiers
> 
> 2) virtio_bus_set_host_notifier invokes them one last time before exiting
> 
> Note that this could call again virtio_queue_notify_vq and hence
> virtio_device_start_ioeventfd, but dataplane won't be reactivated
> because vblk->dataplane_started is still true.
> 
>> How does the sequencing work there and/or is it even important?
> 
> It is important and not really easy to get right---as shown by the bug
> you found, in fact.
> 

Thank you very much for the explanations. I have just sent a patch
based on what we discussed here. I think I now roughly understand how
this is supposed to work regarding concurrency, but I guess I will
have to just trust you to some extent.

Patch

--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -260,6 +260,8 @@  void virtio_blk_data_plane_stop(VirtIODevice *vdev)
 
     /* Drain and switch bs back to the QEMU main loop */
     blk_set_aio_context(s->conf->conf.blk, qemu_get_aio_context());
+    qemu_bh_cancel(s->bh);
+    notify_guest_bh(s);
 
applied I do not see the problem any more. I will most likely
turn this into a patch tomorrow. I would like to give it some more testing and