
[2/2] vhost-user: only seek a reply if needed in set_mem_table

Message ID 1473323650-13298-3-git-send-email-maxime.coquelin@redhat.com
State New

Commit Message

Maxime Coquelin Sept. 8, 2016, 8:34 a.m. UTC
The goal of this patch is to request a sync (reply_ack or
get_features) in set_mem_table only when necessary.

It should not be necessary the first time we set the table,
or when we add a new region which hasn't been merged with an
existing one.

Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Prerna Saxena <prerna.saxena@nutanix.com>
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 hw/virtio/vhost-user.c    |  7 +++++++
 hw/virtio/vhost.c         | 10 ++++++++++
 include/hw/virtio/vhost.h |  1 +
 3 files changed, 18 insertions(+)
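
The rule described in the commit message can be modelled in isolation. The following is a minimal sketch, not the QEMU code: `struct mem_table`, `table_add`, `table_remove` and `table_sent` are hypothetical names, and `req_sync` plays the role of the patch's `mem_changed_req_sync` flag. The merge test here only looks at guest-physical ranges, which is a simplification of the real region-merging logic. Note that later in this thread the approach is judged unsafe and the patch is dropped.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for QEMU's vhost structures. */
#define MAX_REGIONS 8

struct region {
    uint64_t guest_phys_addr;
    uint64_t size;
};

struct mem_table {
    struct region regions[MAX_REGIONS];
    size_t nregions;
    bool req_sync;   /* plays the role of mem_changed_req_sync */
};

/* True when [a, a+alen) and [b, b+blen) overlap or are adjacent. */
static bool touches(uint64_t a, uint64_t alen, uint64_t b, uint64_t blen)
{
    return a <= b + blen && b <= a + alen;
}

/*
 * Add a region. If it touches an existing region, that mapping is
 * widened in place and a sync is flagged; otherwise the region is
 * appended and no sync is requested. Returns false if the table is full.
 */
static bool table_add(struct mem_table *t, uint64_t addr, uint64_t size)
{
    for (size_t i = 0; i < t->nregions; i++) {
        struct region *r = &t->regions[i];
        if (touches(r->guest_phys_addr, r->size, addr, size)) {
            uint64_t start = r->guest_phys_addr < addr ? r->guest_phys_addr : addr;
            uint64_t end_old = r->guest_phys_addr + r->size;
            uint64_t end_new = addr + size;
            r->guest_phys_addr = start;
            r->size = (end_old > end_new ? end_old : end_new) - start;
            t->req_sync = true;   /* existing mapping updated */
            return true;
        }
    }
    if (t->nregions == MAX_REGIONS) {
        return false;
    }
    t->regions[t->nregions].guest_phys_addr = addr;
    t->regions[t->nregions].size = size;
    t->nregions++;
    return true;
}

/* Remove the region starting at addr; removal always flags a sync. */
static bool table_remove(struct mem_table *t, uint64_t addr)
{
    for (size_t i = 0; i < t->nregions; i++) {
        if (t->regions[i].guest_phys_addr == addr) {
            t->regions[i] = t->regions[t->nregions - 1];
            t->nregions--;
            t->req_sync = true;
            return true;
        }
    }
    return false;
}

/* Called after the table is sent: report whether to wait, then reset. */
static bool table_sent(struct mem_table *t)
{
    bool wait = t->req_sync;
    t->req_sync = false;
    return wait;
}
```

In this model, two disjoint `table_add` calls leave `table_sent` returning false, while adding a range adjacent to an existing one, or removing a region, makes the next `table_sent` return true.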

Comments

Prerna Saxena Sept. 8, 2016, 11:33 a.m. UTC | #1
Hi Maxime,


On 08/09/16 2:04 pm, "Maxime Coquelin" <maxime.coquelin@redhat.com> wrote:

>The goal of this patch is to only request a sync (reply_ack,
>or get_features) in set_mem_table only when necessary.
>
>It should not be necessary the first time we set the table,
>or when we add a new regions which hadn't been merged with an
>existing ones.

I don't think so.
This patch does not help solve the issue.
The hang introduced by the original use of get_features() in set_mem_table was traced to the use of TCG mode in the vhost-user test. This has now been fixed by:

-----
commit cdafe929615ec5eca71bcd5a3d12bab5678e5886
Author: Eduardo Habkost <ehabkost@redhat.com>
Date:   Fri Sep 2 15:59:43 2016 -0300


    vhost-user-test: Use libqos instead of pxe-virtio.rom
    
    vhost-user-test relies on iPXE just to initialize the virtio-net
    device, and doesn't do any actual packet tx/rx testing.
    
    In addition to that, the test relies on TCG, which is
    incompatible with vhost. The test only worked by accident: a bug
    in the memory backend initialization made memory regions not have
    the DIRTY_MEMORY_CODE bit set in dirty_log_mask.
    
    This changes vhost-user-test to initialize the virtio-net device
    using libqos, and not use TCG nor pxe-virtio.rom.
    
    Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>


-------

So I think the original hang has been fixed by patch 1/2 of this series alone.

Regarding Patch 2/2:
This patch seems to wait for responses from set_mem_table only for certain updates of memory regions, which violates the definition of the REPLY_ACK feature. The feature expects the client to send a response for every call of set_mem_table, yet here QEMU exits the set_mem_table() function in some cases without waiting for the reply that is going to come in.
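
To illustrate the concern, here is a small self-contained model of what can go wrong when a reply is requested but never read. It is only a sketch: the `struct channel` FIFO, the request codes and the `master_*`/`slave_*` helpers are hypothetical and do not correspond to QEMU functions or real protocol values. The assumption is that replies arrive in order on one socket, so an unread acknowledgement is the first thing the next waiter sees.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request codes for the model; not the real protocol values. */
enum req { REQ_SET_MEM_TABLE = 1, REQ_SET_VRING_ADDR = 2 };

#define CHANNEL_DEPTH 16

/* FIFO of replies, in the order the master would read them from the socket. */
struct channel {
    enum req pending[CHANNEL_DEPTH];
    size_t head, tail;
};

/* Slave side: when the request asks for a reply, one is always queued. */
static void slave_handle(struct channel *ch, enum req r, bool need_reply)
{
    if (need_reply && ch->tail < CHANNEL_DEPTH) {
        ch->pending[ch->tail++] = r;
    }
}

/* Master side: read the next reply and check it answers `expected`. */
static bool master_wait_reply(struct channel *ch, enum req expected)
{
    if (ch->head == ch->tail) {
        return false;            /* nothing to read */
    }
    return ch->pending[ch->head++] == expected;
}

/*
 * Send a request. `need_reply` is what the message asks of the slave,
 * `wait` is whether the master actually reads the reply afterwards.
 * Returns false if a reply was read but belonged to another request.
 */
static bool master_send(struct channel *ch, enum req r,
                        bool need_reply, bool wait)
{
    slave_handle(ch, r, need_reply);
    return wait ? master_wait_reply(ch, r) : true;
}
```

In this model, sending REQ_SET_MEM_TABLE with `need_reply` set but `wait` false leaves a stale acknowledgement queued, and the next request that does wait reads it instead of its own reply. Sending with both flags false keeps the channel in step.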

As for using this approach with get_features, we have already debated that on the list: https://lists.nongnu.org/archive/html/qemu-devel/2016-07/msg00689.html
To quote:
"I do not entirely agree with that. The first set_mem_table command is not much
different from subsequent set_mem_table calls."

Regards,
Prerna
Marc-Andre Lureau Sept. 8, 2016, 11:56 a.m. UTC | #2
Hi

----- Original Message -----
> Regarding Patch 2/2:
> This patch seems to filter responses from set_mem_table only for certain
> updates of memory regions. It violates the definition of the REPLY_ACK
> feature. This feature expects the client to send a response for every call
> of set_mem_table. And here, qemu exits the set_mem_table() function in some
> cases without even waiting for the reply that is going to come in.
> 

I agree with Prerna here.

Furthermore, I haven't closely followed the recent developments, and the commit message doesn't explain why it should not be necessary to sync the first time set-mem-table is called. Could you elaborate on that?

thanks
Michael S. Tsirkin Sept. 8, 2016, 3:06 p.m. UTC | #3
On Thu, Sep 08, 2016 at 07:56:38AM -0400, Marc-André Lureau wrote:
> Hi
> 
> ----- Original Message -----
> > Regarding Patch 2/2:
> > This patch seems to filter responses from set_mem_table only for certain
> > updates of memory regions. It violates the definition of the REPLY_ACK
> > feature. This feature expects the client to send a response for every call
> > of set_mem_table. And here, qemu exits the set_mem_table() function in some
> > cases without even waiting for the reply that is going to come in.
> > 
> 
> Agreed with Prerna here,
> 
> Furthermore, I haven't followed closely the recents developments, and the commit message doesn't explain why it should not be necessary to sync the first time set-mem-table is called. Could you develop that?
> 
> thanks

Basically if we send set mem table when backend is not started,
there is no reason to wait for a response.
Michael S. Tsirkin Sept. 8, 2016, 3:15 p.m. UTC | #4
On Thu, Sep 08, 2016 at 10:34:10AM +0200, Maxime Coquelin wrote:
> The goal of this patch is to only request a sync (reply_ack,
> or get_features) in set_mem_table only when necessary.
> 
> It should not be necessary the first time we set the table,
> or when we add a new regions which hadn't been merged with an
> existing ones.

I'm not sure I get the second part. If we don't sync,
can't the guest's use of memory bypass the request?
Might this cause the backend to fail?
I guess the backend could try to recover by flushing the
message queue, but if so, we should probably document this.
And if not, why do we care about merged regions?

> Suggested-by: Michael S. Tsirkin <mst@redhat.com>
> Cc: Prerna Saxena <prerna.saxena@nutanix.com>
> Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> ---
>  hw/virtio/vhost-user.c    |  7 +++++++
>  hw/virtio/vhost.c         | 10 ++++++++++
>  include/hw/virtio/vhost.h |  1 +
>  3 files changed, 18 insertions(+)
> 
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index 1a7d53c..ca41728 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -531,6 +531,11 @@ static int vhost_user_set_mem_table(struct vhost_dev *dev,
>  
>      vhost_user_write(dev, &msg, fds, fd_num);
>  
> +    if (!dev->mem_changed_req_sync) {
> +        /* The update only add regions, skip the sync */
> +        return 0;
> +    }
> +
>      if (reply_supported) {
>          return process_message_reply(dev, msg.request);
>      } else {

This still sets VHOST_USER_NEED_REPLY_MASK - I think we
should clear reply_supported and avoid setting that in
requests.
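
One way to read this suggestion is sketched below. It is an illustration only: `set_mem_table_flags` is a hypothetical helper, and the two flag values are assumed to match the definitions in hw/virtio/vhost-user.c at the time of this thread. The point is that the need-reply bit is only put in the request header when the master really intends to read the reply.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values assumed to mirror hw/virtio/vhost-user.c; treat as illustrative. */
#define VHOST_USER_VERSION          0x1
#define VHOST_USER_NEED_REPLY_MASK  (0x1 << 3)

/*
 * Hypothetical helper: compute the header flags for a set_mem_table
 * request. The reply is requested only if REPLY_ACK was negotiated
 * and the caller has decided that a sync is needed for this update.
 */
static uint32_t set_mem_table_flags(bool reply_ack_negotiated,
                                    bool sync_needed)
{
    uint32_t flags = VHOST_USER_VERSION;
    bool reply_supported = reply_ack_negotiated && sync_needed;

    if (reply_supported) {
        flags |= VHOST_USER_NEED_REPLY_MASK;
    }
    return flags;
}
```

With this shape, a skipped sync never leaves an unread acknowledgement behind, because the backend is not asked to send one in the first place.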


> @@ -541,6 +546,8 @@ static int vhost_user_set_mem_table(struct vhost_dev *dev,
>          vhost_user_get_features(dev, &features);
>      }
>  
> +    dev->mem_changed_req_sync = false;
> +
>      return 0;
>  }
>  
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 3d0c807..e653067 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -303,7 +303,11 @@ static void vhost_dev_assign_memory(struct vhost_dev *dev,
>          reg->guest_phys_addr = start_addr;
>          reg->userspace_addr = uaddr;
>          ++to;
> +    } else {
> +        /* Existing mapping updated, sync is required */
> +        dev->mem_changed_req_sync = true;
>      }
> +
>      assert(to <= dev->mem->nregions + 1);
>      dev->mem->nregions = to;
>  }
> @@ -533,6 +537,7 @@ static void vhost_set_memory(MemoryListener *listener,
>      } else {
>          /* Remove old mapping for this memory, if any. */
>          vhost_dev_unassign_memory(dev, start_addr, size);
> +        dev->mem_changed_req_sync = true;
>      }
>      dev->mem_changed_start_addr = MIN(dev->mem_changed_start_addr, start_addr);
>      dev->mem_changed_end_addr = MAX(dev->mem_changed_end_addr, start_addr + size - 1);
> @@ -1126,6 +1131,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
>      hdev->log_enabled = false;
>      hdev->started = false;
>      hdev->memory_changed = false;
> +    hdev->mem_changed_req_sync = false;
>      memory_listener_register(&hdev->memory_listener, &address_space_memory);
>      QLIST_INSERT_HEAD(&vhost_devices, hdev, entry);
>      return 0;
> @@ -1301,6 +1307,10 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
>      if (r < 0) {
>          goto fail_features;
>      }
> +
> +    /* First time the mem table is set, skip sync for completion */
> +    hdev->mem_changed_req_sync = false;
> +
>      r = hdev->vhost_ops->vhost_set_mem_table(hdev, hdev->mem);
>      if (r < 0) {
>          VHOST_OPS_DEBUG("vhost_set_mem_table failed");


Kind of asymmetrical. How about we set it to false on stop,
and to true on start? Seems cleaner to me ...


> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> index e433089..4bbf36a 100644
> --- a/include/hw/virtio/vhost.h
> +++ b/include/hw/virtio/vhost.h
> @@ -55,6 +55,7 @@ struct vhost_dev {
>      uint64_t log_size;
>      Error *migration_blocker;
>      bool memory_changed;
> +    bool mem_changed_req_sync;
>      hwaddr mem_changed_start_addr;
>      hwaddr mem_changed_end_addr;
>      const VhostOps *vhost_ops;
> -- 
> 2.7.4
Michael S. Tsirkin Sept. 8, 2016, 3:17 p.m. UTC | #5
On Thu, Sep 08, 2016 at 11:33:34AM +0000, Prerna Saxena wrote:
> Hi Maxime,
> 
> 
> On 08/09/16 2:04 pm, "Maxime Coquelin" <maxime.coquelin@redhat.com> wrote:
> 
> >The goal of this patch is to only request a sync (reply_ack,
> >or get_features) in set_mem_table only when necessary.
> >
> >It should not be necessary the first time we set the table,
> >or when we add a new regions which hadn't been merged with an
> >existing ones.
> 
> 
> I don’t think so. 
> This patch is not helping us solve the issue.
> The hang introduced by original use of get_features() in set_mem_table
> was traced down to use of TCG mode for vhost-user test.

Right but we don't know for sure there are no backends
that do this kind of simplistic thing, and assume that
get features does not happen later. And for most setups
without memory hotplug, they would be right.


> This has now
> been fixed via:
> 
> -----
> commit cdafe929615ec5eca71bcd5a3d12bab5678e5886
> Author: Eduardo Habkost <ehabkost@redhat.com>
> Date:   Fri Sep 2 15:59:43 2016 -0300
> 
> 
>     vhost-user-test: Use libqos instead of pxe-virtio.rom
>     
>     vhost-user-test relies on iPXE just to initialize the virtio-net
>     device, and doesn't do any actual packet tx/rx testing.
>     
>     In addition to that, the test relies on TCG, which is
>     incompatible with vhost. The test only worked by accident: a bug
>     in the memory backend initialization made memory regions not have
>     the DIRTY_MEMORY_CODE bit set in dirty_log_mask.
>     
>     This changes vhost-user-test to initialize the virtio-net device
>     using libqos, and not use TCG nor pxe-virtio.rom.
>     
>     Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
> 
> -------
> 
> So I think the original hang seems to have been fixed with Patch 1/2 of this series alone.
> 
> Regarding Patch 2/2:
> This patch seems to filter responses from set_mem_table only for certain updates of memory regions. It violates the definition of the REPLY_ACK feature. This feature expects the client to send a response for every call of set_mem_table. And here, qemu exits the set_mem_table() function in some cases without even waiting for the reply that is going to come in.
> 
> As for use of this approach with get_features, we have already debated that on the list before : https://lists.nongnu.org/archive/html/qemu-devel/2016-07/msg00689.html
> To quote:
> "I do not entirely agree with that. The first set_mem_table command is not much
> different from subsequent set_mem_table calls."
> 
> Regards,
> Prerna
>
Maxime Coquelin Sept. 12, 2016, 7:27 a.m. UTC | #6
On 09/08/2016 05:15 PM, Michael S. Tsirkin wrote:
> On Thu, Sep 08, 2016 at 10:34:10AM +0200, Maxime Coquelin wrote:
>> The goal of this patch is to only request a sync (reply_ack,
>> or get_features) in set_mem_table only when necessary.
>>
>> It should not be necessary the first time we set the table,
>> or when we add a new regions which hadn't been merged with an
>> existing ones.
>
> I'm not sure I get the second part. If we don't sync,
> can't use of memory by guest bypass the request?
> Might this cause the backend to fail?
> I guess backend could try to recover by flushing the
> message queue, but if so, we probably should document this.
> And if not, why do we care about merged regions?

You are right, this does not work for the second part;
it was a misunderstanding on my side.

Now, for the first set_mem_table call, Prerna is right:
since the client has negotiated reply_ack, we should wait for the reply.


I propose we drop patch 2, and only pick the first one.

Michael, is that OK with you?

Thanks,
Maxime

Patch

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 1a7d53c..ca41728 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -531,6 +531,11 @@  static int vhost_user_set_mem_table(struct vhost_dev *dev,
 
     vhost_user_write(dev, &msg, fds, fd_num);
 
+    if (!dev->mem_changed_req_sync) {
+        /* The update only adds regions, skip the sync */
+        return 0;
+    }
+
     if (reply_supported) {
         return process_message_reply(dev, msg.request);
     } else {
@@ -541,6 +546,8 @@  static int vhost_user_set_mem_table(struct vhost_dev *dev,
         vhost_user_get_features(dev, &features);
     }
 
+    dev->mem_changed_req_sync = false;
+
     return 0;
 }
 
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 3d0c807..e653067 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -303,7 +303,11 @@  static void vhost_dev_assign_memory(struct vhost_dev *dev,
         reg->guest_phys_addr = start_addr;
         reg->userspace_addr = uaddr;
         ++to;
+    } else {
+        /* Existing mapping updated, sync is required */
+        dev->mem_changed_req_sync = true;
     }
+
     assert(to <= dev->mem->nregions + 1);
     dev->mem->nregions = to;
 }
@@ -533,6 +537,7 @@  static void vhost_set_memory(MemoryListener *listener,
     } else {
         /* Remove old mapping for this memory, if any. */
         vhost_dev_unassign_memory(dev, start_addr, size);
+        dev->mem_changed_req_sync = true;
     }
     dev->mem_changed_start_addr = MIN(dev->mem_changed_start_addr, start_addr);
     dev->mem_changed_end_addr = MAX(dev->mem_changed_end_addr, start_addr + size - 1);
@@ -1126,6 +1131,7 @@  int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
     hdev->log_enabled = false;
     hdev->started = false;
     hdev->memory_changed = false;
+    hdev->mem_changed_req_sync = false;
     memory_listener_register(&hdev->memory_listener, &address_space_memory);
     QLIST_INSERT_HEAD(&vhost_devices, hdev, entry);
     return 0;
@@ -1301,6 +1307,10 @@  int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
     if (r < 0) {
         goto fail_features;
     }
+
+    /* First time the mem table is set, skip sync for completion */
+    hdev->mem_changed_req_sync = false;
+
     r = hdev->vhost_ops->vhost_set_mem_table(hdev, hdev->mem);
     if (r < 0) {
         VHOST_OPS_DEBUG("vhost_set_mem_table failed");
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index e433089..4bbf36a 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -55,6 +55,7 @@  struct vhost_dev {
     uint64_t log_size;
     Error *migration_blocker;
     bool memory_changed;
+    bool mem_changed_req_sync;
     hwaddr mem_changed_start_addr;
     hwaddr mem_changed_end_addr;
     const VhostOps *vhost_ops;