Message ID | CABUUfwPRe+OKF9xdsq9cSUeOn3+Tqv16BidEccihr8At2wumzA@mail.gmail.com
---|---
State | New
On Thu, Jun 25, 2015 at 01:01:29PM +0200, Thibaut Collet wrote:
> On Thu, Jun 25, 2015 at 11:59 AM, Jason Wang <jasowang@redhat.com> wrote:
> > On 06/24/2015 07:05 PM, Michael S. Tsirkin wrote:
> > > On Wed, Jun 24, 2015 at 04:31:15PM +0800, Jason Wang wrote:
> > > > On 06/23/2015 01:49 PM, Michael S. Tsirkin wrote:
> > > > > On Tue, Jun 23, 2015 at 10:12:17AM +0800, Jason Wang wrote:
> > > > > > On 06/18/2015 11:16 PM, Thibaut Collet wrote:
> > > > > > > On Tue, Jun 16, 2015 at 10:05 AM, Jason Wang <jasowang@redhat.com> wrote:
> > > > > > > > On 06/16/2015 03:24 PM, Thibaut Collet wrote:
> > > > > > > > > If my understanding is correct, on a resume operation we have the
> > > > > > > > > following callback trace:
> > > > > > > > > 1. virtio_pci_restore function that calls all restore callbacks of
> > > > > > > > >    virtio devices
> > > > > > > > > 2. virtnet_restore that calls try_fill_recv function for each virtual queue
> > > > > > > > > 3. try_fill_recv function kicks the virtual queue (through the
> > > > > > > > >    virtqueue_kick function)
> > > > > > > > Yes, but this happens only after pm resume, not migration. Migration is
> > > > > > > > totally transparent to the guest.
> > > > > > >
> > > > > > > Hi Jason,
> > > > > > >
> > > > > > > After a deeper look in the migration code of QEMU, a resume event is
> > > > > > > always sent when the live migration is finished.
> > > > > > > On a live migration we have the following callback trace:
> > > > > > > 1. The VM on the new host is set to the state RUN_STATE_INMIGRATE and the
> > > > > > >    autostart boolean to 1, and the qemu_start_incoming_migration function
> > > > > > >    is called (see function main of vl.c)
> > > > > > > .....
> > > > > > > 2. call of process_incoming_migration function in the
> > > > > > >    migration/migration.c file, whatever the way the live migration is done
> > > > > > >    (tcp:, fd:, unix:, exec: ...)
> > > > > > > 3. call of process_incoming_migration_co function in migration/migration.c
> > > > > > > 4. call of vm_start function in vl.c (otherwise the migrated VM stays
> > > > > > >    in the paused state; the autostart boolean is set to 1 by the main
> > > > > > >    function in vl.c)
> > > > > > > 5. call of vm_start function that sets the VM in the RUN_STATE_RUNNING state
> > > > > > > 6. call of qapi_event_send_resume function that sends a resume event to the VM
> > > > > > AFAIK, this function sends the resume event to the qemu monitor, not to the VM.
> > > > > >
> > > > > > > So when a live migration has ended:
> > > > > > > 1. a resume event is sent to the guest
> > > > > > > 2. on the reception of this resume event the virtual queues are kicked
> > > > > > >    by the guest
> > > > > > > 3. the vhost-user backend catches this kick and can emit a RARP to a guest
> > > > > > >    that does not support GUEST_ANNOUNCE
> > > > > > >
> > > > > > > This solution, like the solution based on detection of the DRIVER_OK status
> > > > > > > suggested by Michael, allows the backend to send the RARP to a legacy guest
> > > > > > > without involving QEMU or adding an ioctl to vhost-user.
> > > > > > A question here is: does the vhost-user code pass the status to the
> > > > > > backend? If not, how can a userspace backend detect DRIVER_OK?
> > > > > Sorry, I must have been unclear.
> > > > > vhost core calls VHOST_NET_SET_BACKEND on DRIVER_OK.
> > > > > Unfortunately vhost-user currently translates it to VHOST_USER_NONE.
> > > > Looks like VHOST_NET_SET_BACKEND was only used for the tap backend.
> > > >
> > > > > As a workaround, I think kicking the ioeventfds once you get
> > > > > VHOST_NET_SET_BACKEND will work.
> > > > Maybe just an eventfd_set() in vhost_net_start(). But is this
> > > > "workaround" elegant enough to be documented? Is it better to do this
> > > > explicitly with a new feature?
> > > If you are going to do this anyway, there are a couple of other changes
> > > we should do; in particular, decide what we want to do with the control vq.
> >
> > If I understand correctly, you mean VIRTIO_NET_CTRL_MQ and
> > VIRTIO_NET_CTRL_GUEST_OFFLOADS? Looks like both of these were broken.
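The RARP announcement discussed above can be sketched as a standalone frame builder. This is illustrative only and not taken from any implementation in this thread; the constants are the standard RARP-over-Ethernet values (EtherType 0x8035, opcode 3), and the function name is made up:

```c
#include <stdint.h>
#include <string.h>

/* Sketch: build the gratuitous RARP frame a vhost-user backend could
 * inject after migration so switches re-learn a legacy guest's MAC.
 * The 60-byte buffer is the Ethernet minimum frame size (padded). */
static size_t build_rarp(uint8_t buf[60], const uint8_t mac[6])
{
    memset(buf, 0, 60);
    memset(buf, 0xff, 6);            /* dst MAC: broadcast           */
    memcpy(buf + 6, mac, 6);         /* src MAC: the guest's MAC     */
    buf[12] = 0x80; buf[13] = 0x35;  /* EtherType: RARP              */
    buf[15] = 0x01;                  /* hardware type: Ethernet (1)  */
    buf[16] = 0x08;                  /* protocol type: IPv4 (0x0800) */
    buf[18] = 0x06;                  /* hardware address length      */
    buf[19] = 0x04;                  /* protocol address length      */
    buf[21] = 0x03;                  /* opcode: reverse request      */
    memcpy(buf + 22, mac, 6);        /* sender hardware address      */
    memcpy(buf + 32, mac, 6);        /* target hardware address      */
    return 60;                       /* sender/target IPs left zero  */
}
```

The frame carries no useful IP information; its only purpose is to make the guest's source MAC visible on the new host's network segment.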
> > Need more thought; maybe new kinds of requests.
>
> Are there any objections to adding VHOST_NET_SET_BACKEND support to
> vhost-user with a patch like this:
>
>  hw/net/vhost_net.c     |  8 ++++++++
>  hw/virtio/vhost-user.c | 10 +++++++++-
>  2 files changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
> index 907e002..7a008c0 100644
> --- a/hw/net/vhost_net.c
> +++ b/hw/net/vhost_net.c
> @@ -234,6 +234,14 @@ static int vhost_net_start_one(struct vhost_net *net,
>                  goto fail;
>              }
>          }
> +    } else if (net->nc->info->type == NET_CLIENT_OPTIONS_KIND_VHOST_USER) {
> +        file.fd = 0;
> +        for (file.index = 0; file.index < net->dev.nvqs; ++file.index) {
> +            const VhostOps *vhost_ops = net->dev.vhost_ops;
> +            int r = vhost_ops->vhost_call(&net->dev, VHOST_NET_SET_BACKEND,
> +                                          &file);
> +            assert(r >= 0);
> +        }
>      }
>      return 0;
> fail:
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index d6f2163..32c6bd9 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -41,6 +41,7 @@ typedef enum VhostUserRequest {
>      VHOST_USER_SET_VRING_KICK = 12,
>      VHOST_USER_SET_VRING_CALL = 13,
>      VHOST_USER_SET_VRING_ERR = 14,
> +    VHOST_USER_NET_SET_BACKEND = 15,
>      VHOST_USER_MAX
>  } VhostUserRequest;
>
> @@ -104,7 +105,8 @@ static unsigned long int ioctl_to_vhost_user_request[VHOST_USER_MAX] = {
>      VHOST_GET_VRING_BASE, /* VHOST_USER_GET_VRING_BASE */
>      VHOST_SET_VRING_KICK, /* VHOST_USER_SET_VRING_KICK */
>      VHOST_SET_VRING_CALL, /* VHOST_USER_SET_VRING_CALL */
> -    VHOST_SET_VRING_ERR   /* VHOST_USER_SET_VRING_ERR */
> +    VHOST_SET_VRING_ERR,  /* VHOST_USER_SET_VRING_ERR */
> +    VHOST_NET_SET_BACKEND /* VHOST_USER_NET_SET_BACKEND */
>  };
>
>  static VhostUserRequest vhost_user_request_translate(unsigned long int request)
> @@ -287,6 +289,12 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request,
>              msg.u64 |= VHOST_USER_VRING_NOFD_MASK;
>          }
>          break;
> +
> +    case VHOST_NET_SET_BACKEND:
> +        memcpy(&msg.file, arg, sizeof(struct vhost_vring_state));
> +        msg.size = sizeof(m.state);
> +        break;
> +
>      default:
>          error_report("vhost-user trying to send unhandled ioctl");
>          return -1;
>
> This message will be sent when the guest is ready and can be used by the
> vhost-user backend to send the RARP to a legacy guest.
>
> This solution avoids adding a new message and has no impact on the control vq.

I think that you can't add messages to the protocol unconditionally. For
example, snabbswitch simply crashes if it gets an unknown message.
Either this needs a new feature bit, or implement
[PATCH RFC] vhost-user: protocol extensions
making it safe to add new messages.
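The feature-bit alternative raised above can be sketched as a simple negotiation guard. The bit number and function name below are hypothetical, chosen only for illustration; a real bit would be assigned by the vhost-user specification:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical protocol feature bit for the new request. */
#define VHOST_USER_PROTOCOL_F_NET_SET_BACKEND 1

/* Send the new message only if the backend advertised support during
 * feature negotiation; a backend that never set the bit (snabbswitch,
 * in the example above) is never exposed to an unknown message. */
static bool can_send_net_set_backend(uint64_t negotiated_protocol_features)
{
    return (negotiated_protocol_features &
            (1ULL << VHOST_USER_PROTOCOL_F_NET_SET_BACKEND)) != 0;
}
```

The front-end would consult this predicate in vhost_net_start_one() before issuing the request, falling back to the old behavior for backends that did not negotiate the bit.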