[1/2] Vring: vring listeners' priority should be higher than kvm's

Message ID 1368060022-16911-1-git-send-email-qemulist@gmail.com
State New

Commit Message

pingfan liu May 9, 2013, 12:40 a.m. UTC
From: Liu Ping Fan <pingfank@linux.vnet.ibm.com>

Host threads which handle vrings should have a higher MemoryListener
priority than kvm. With the current code, consider the following scenario:
  kvm_region_add() runs before vhost_region_add(); then, in the guest,
the vring's desc[i] can refer to addressX in a new region the guest
already knows about. But vhost does not know about this new region yet,
and the vring handler will fail.

Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
---
 hw/virtio/dataplane/hostmem.c |    2 +-
 hw/virtio/vhost.c             |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
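
To make the scenario concrete, here is a minimal sketch of the kind of
guest-physical-to-host-virtual lookup a vring handler performs; the
structure and helper names below are hypothetical, for illustration only,
not the actual vhost code. If desc[i] points into a region the listener
has not yet been told about, the translation fails:

#include <stdint.h>
#include <stddef.h>

struct mem_region {
    uint64_t guest_phys_addr;   /* start of the region in guest memory */
    uint64_t memory_size;       /* length of the region */
    uint64_t userspace_addr;    /* mapping in the host process */
};

/* Translate a guest-physical address to a host virtual address. */
static void *gpa_to_hva(struct mem_region *regions, size_t n, uint64_t gpa)
{
    size_t i;

    for (i = 0; i < n; i++) {
        struct mem_region *r = &regions[i];

        if (gpa >= r->guest_phys_addr &&
            gpa - r->guest_phys_addr < r->memory_size) {
            return (void *)(uintptr_t)(r->userspace_addr +
                                       (gpa - r->guest_phys_addr));
        }
    }
    return NULL;  /* region not in our table yet -> handler must fail */
}

A vring handler that gets NULL back here can only fail the request,
which is exactly the failure the commit message describes.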

Comments

Stefan Hajnoczi May 9, 2013, 8:30 a.m. UTC | #1
On Thu, May 09, 2013 at 08:40:21AM +0800, Liu Ping Fan wrote:
> From: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
> 
> Host threads which handle vrings should have a higher MemoryListener
> priority than kvm. With the current code, consider the following scenario:
>   kvm_region_add() runs before vhost_region_add(); then, in the guest,
> the vring's desc[i] can refer to addressX in a new region the guest
> already knows about. But vhost does not know about this new region yet,
> and the vring handler will fail.

Is there a concrete scenario where this happens?

I can think of situations like the ioeventfd being readable before
vhost/hostmem is populated.  But I don't see how that's related to the
priority of kvm_region_add().

Stefan
Michael S. Tsirkin May 9, 2013, 8:44 a.m. UTC | #2
On Thu, May 09, 2013 at 08:40:21AM +0800, Liu Ping Fan wrote:
> From: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
> 
> Host threads which handle vrings should have a higher MemoryListener
> priority than kvm. With the current code, consider the following scenario:
>   kvm_region_add() runs before vhost_region_add(); then, in the guest,
> the vring's desc[i] can refer to addressX in a new region the guest
> already knows about. But vhost does not know about this new region yet,
> and the vring handler will fail.
> 
> Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>

Is this seen in testing, or are you describing a theoretical
scenario? Please make this clear in the commit log.

> ---
>  hw/virtio/dataplane/hostmem.c |    2 +-
>  hw/virtio/vhost.c             |    2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/virtio/dataplane/hostmem.c b/hw/virtio/dataplane/hostmem.c
> index 37292ff..67cbce1 100644
> --- a/hw/virtio/dataplane/hostmem.c
> +++ b/hw/virtio/dataplane/hostmem.c
> @@ -158,7 +158,7 @@ void hostmem_init(HostMem *hostmem)
>          .eventfd_del = hostmem_listener_eventfd_dummy,
>          .coalesced_mmio_add = hostmem_listener_coalesced_mmio_dummy,
>          .coalesced_mmio_del = hostmem_listener_coalesced_mmio_dummy,
> -        .priority = 10,
> +        .priority = 9,
>      };
>  
>      memory_listener_register(&hostmem->listener, &address_space_memory);
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index fbabf99..91c313b 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -856,7 +856,7 @@ int vhost_dev_init(struct vhost_dev *hdev, int devfd, const char *devpath,
>          .log_global_stop = vhost_log_global_stop,
>          .eventfd_add = vhost_eventfd_add,
>          .eventfd_del = vhost_eventfd_del,
> -        .priority = 10
> +        .priority = 9
>      };
>      hdev->mem = g_malloc0(offsetof(struct vhost_memory, regions));
>      hdev->n_mem_sections = 0;
> -- 
> 1.7.4.4
pingfan liu May 9, 2013, 8:54 a.m. UTC | #3
On Thu, May 9, 2013 at 4:44 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Thu, May 09, 2013 at 08:40:21AM +0800, Liu Ping Fan wrote:
>> From: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
>>
>> Host threads which handle vrings should have a higher MemoryListener
>> priority than kvm. With the current code, consider the following scenario:
>>   kvm_region_add() runs before vhost_region_add(); then, in the guest,
>> the vring's desc[i] can refer to addressX in a new region the guest
>> already knows about. But vhost does not know about this new region yet,
>> and the vring handler will fail.
>>
>> Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
>
> Is this seen in testing, or are you describing a theoretical
> scenario? Please make this clear in the commit log.
>
A theoretical scenario. I think the vcpu thread and vhost run
asynchronously, so we need this method to synchronize them.
>> ---
>>  hw/virtio/dataplane/hostmem.c |    2 +-
>>  hw/virtio/vhost.c             |    2 +-
>>  2 files changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/virtio/dataplane/hostmem.c b/hw/virtio/dataplane/hostmem.c
>> index 37292ff..67cbce1 100644
>> --- a/hw/virtio/dataplane/hostmem.c
>> +++ b/hw/virtio/dataplane/hostmem.c
>> @@ -158,7 +158,7 @@ void hostmem_init(HostMem *hostmem)
>>          .eventfd_del = hostmem_listener_eventfd_dummy,
>>          .coalesced_mmio_add = hostmem_listener_coalesced_mmio_dummy,
>>          .coalesced_mmio_del = hostmem_listener_coalesced_mmio_dummy,
>> -        .priority = 10,
>> +        .priority = 9,
>>      };
>>
>>      memory_listener_register(&hostmem->listener, &address_space_memory);
>> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
>> index fbabf99..91c313b 100644
>> --- a/hw/virtio/vhost.c
>> +++ b/hw/virtio/vhost.c
>> @@ -856,7 +856,7 @@ int vhost_dev_init(struct vhost_dev *hdev, int devfd, const char *devpath,
>>          .log_global_stop = vhost_log_global_stop,
>>          .eventfd_add = vhost_eventfd_add,
>>          .eventfd_del = vhost_eventfd_del,
>> -        .priority = 10
>> +        .priority = 9
>>      };
>>      hdev->mem = g_malloc0(offsetof(struct vhost_memory, regions));
>>      hdev->n_mem_sections = 0;
>> --
>> 1.7.4.4
pingfan liu May 9, 2013, 9 a.m. UTC | #4
On Thu, May 9, 2013 at 4:30 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> On Thu, May 09, 2013 at 08:40:21AM +0800, Liu Ping Fan wrote:
>> From: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
>>
>> Host threads which handle vrings should have a higher MemoryListener
>> priority than kvm. With the current code, consider the following scenario:
>>   kvm_region_add() runs before vhost_region_add(); then, in the guest,
>> the vring's desc[i] can refer to addressX in a new region the guest
>> already knows about. But vhost does not know about this new region yet,
>> and the vring handler will fail.
>
> Is there a concrete scenario where this happens?
>
> I can think of situations like the ioeventfd being readable before
> vhost/hostmem is populated.  But I don't see how that's related to the
> priority of kvm_region_add().
>
For kvm, i.e. in the guest: vring_desc.addr can point to a chunk of
data in the newly added memory, and the guest can then kick vhost. vhost
has not added this new region yet, so its local lookup table cannot
translate the new address, and the vring handler will fail. If vhost's
priority is higher than kvm's, it will learn about the new region
earlier than kvm.

> Stefan
Paolo Bonzini May 9, 2013, 2:58 p.m. UTC | #5
On 09/05/2013 10:54, liu ping fan wrote:
> On Thu, May 9, 2013 at 4:44 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
>> On Thu, May 09, 2013 at 08:40:21AM +0800, Liu Ping Fan wrote:
>>> From: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
>>>
>>> Host threads which handle vrings should have a higher MemoryListener
>>> priority than kvm. With the current code, consider the following scenario:
>>>   kvm_region_add() runs before vhost_region_add(); then, in the guest,
>>> the vring's desc[i] can refer to addressX in a new region the guest
>>> already knows about. But vhost does not know about this new region yet,
>>> and the vring handler will fail.
>>>
>>> Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
>>
>> Is this seen in testing, or are you describing a theoretical
>> scenario? Please make this clear in the commit log.
>>
> A theoretical scenario. I think the vcpu thread and vhost run
> asynchronously, so we need this method to synchronize them.

But why should this matter for hostmem?  It doesn't communicate in any
way with the hypervisor.

Paolo
Stefan Hajnoczi May 9, 2013, 3:26 p.m. UTC | #6
On Thu, May 09, 2013 at 05:00:20PM +0800, liu ping fan wrote:
> On Thu, May 9, 2013 at 4:30 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> > On Thu, May 09, 2013 at 08:40:21AM +0800, Liu Ping Fan wrote:
> >> From: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
> >>
> >> Host threads which handle vrings should have a higher MemoryListener
> >> priority than kvm. With the current code, consider the following scenario:
> >>   kvm_region_add() runs before vhost_region_add(); then, in the guest,
> >> the vring's desc[i] can refer to addressX in a new region the guest
> >> already knows about. But vhost does not know about this new region yet,
> >> and the vring handler will fail.
> >
> > Is there a concrete scenario where this happens?
> >
> > I can think of situations like the ioeventfd being readable before
> > vhost/hostmem is populated.  But I don't see how that's related to the
> > priority of kvm_region_add().
> >
> For kvm, i.e. in the guest: vring_desc.addr can point to a chunk of
> data in the newly added memory, and the guest can then kick vhost. vhost
> has not added this new region yet, so its local lookup table cannot
> translate the new address, and the vring handler will fail. If vhost's
> priority is higher than kvm's, it will learn about the new region
> earlier than kvm.

Isn't the real solution to ensure that the memory API is up-to-date
before we notify the guest of memory hotplug?

I still don't see a kvm vs vhost race.  I see a guest vs vhost race
which priority doesn't fix.

Stefan
pingfan liu May 10, 2013, 6:03 a.m. UTC | #7
On Thu, May 9, 2013 at 11:26 PM, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> On Thu, May 09, 2013 at 05:00:20PM +0800, liu ping fan wrote:
>> On Thu, May 9, 2013 at 4:30 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
>> > On Thu, May 09, 2013 at 08:40:21AM +0800, Liu Ping Fan wrote:
>> >> From: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
>> >>
>> >> Host threads which handle vrings should have a higher MemoryListener
>> >> priority than kvm. With the current code, consider the following scenario:
>> >>   kvm_region_add() runs before vhost_region_add(); then, in the guest,
>> >> the vring's desc[i] can refer to addressX in a new region the guest
>> >> already knows about. But vhost does not know about this new region yet,
>> >> and the vring handler will fail.
>> >
>> > Is there a concrete scenario where this happens?
>> >
>> > I can think of situations like the ioeventfd being readable before
>> > vhost/hostmem is populated.  But I don't see how that's related to the
>> > priority of kvm_region_add().
>> >
>> For kvm, i.e. in the guest: vring_desc.addr can point to a chunk of
>> data in the newly added memory, and the guest can then kick vhost. vhost
>> has not added this new region yet, so its local lookup table cannot
>> translate the new address, and the vring handler will fail. If vhost's
>> priority is higher than kvm's, it will learn about the new region
>> earlier than kvm.
>
> Isn't the real solution to ensure that the memory API is up-to-date
> before we notify the guest of memory hotplug?
>
No, it is not.

> I still don't see a kvm vs vhost race.  I see a guest vs vhost race
> which priority doesn't fix.
>
Yes, you are right.
The required ordering is vhost > guest and kvm > guest, so vhost == kvm
is OK. But can vhost's priority be higher, and why was 10 chosen rather than zero?

If the dependency only lies between MemoryListeners and the guest, not
between listeners, then is the priority meaningless? I think we should
make sure about this, because if we convert the core listeners to RCU
style, we will definitely break the ordering of region_add/del, i.e.
both add and del will come after kvm's.

> Stefan
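
For context on "the sequence of region_add/del": listener callbacks are
dispatched over a single priority-sorted list, walked forward for
"add"-type events and in reverse for "del"-type events. A simplified
sketch of the dispatch macro in memory.c of this era (trimmed for
illustration, not the exact upstream code):

#define MEMORY_LISTENER_CALL_GLOBAL(_callback, _direction, _args...)    \
    do {                                                                \
        MemoryListener *_listener;                                      \
                                                                        \
        switch (_direction) {                                           \
        case Forward:                                                   \
            /* ascending priority: lowest .priority value runs first */ \
            QTAILQ_FOREACH(_listener, &memory_listeners, link) {        \
                _listener->_callback(_listener, ##_args);               \
            }                                                           \
            break;                                                      \
        case Reverse:                                                   \
            /* teardown happens in the opposite order of setup */       \
            QTAILQ_FOREACH_REVERSE(_listener, &memory_listeners,        \
                                   memory_listeners, link) {            \
                _listener->_callback(_listener, ##_args);               \
            }                                                           \
            break;                                                      \
        }                                                               \
    } while (0)

Because the list is kept in ascending priority order and region_add is a
Forward walk, a numerically lower .priority value runs earlier on add;
that is why the patch lowers vhost/hostmem from 10 to 9, putting their
region_add ahead of kvm's listener (which sits at priority 10).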
Stefan Hajnoczi May 10, 2013, 7:12 a.m. UTC | #8
On Fri, May 10, 2013 at 02:03:34PM +0800, liu ping fan wrote:
> On Thu, May 9, 2013 at 11:26 PM, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > On Thu, May 09, 2013 at 05:00:20PM +0800, liu ping fan wrote:
> >> On Thu, May 9, 2013 at 4:30 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> >> > On Thu, May 09, 2013 at 08:40:21AM +0800, Liu Ping Fan wrote:
> >> >> From: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
> >> >>
> >> >> Host threads which handle vrings should have a higher MemoryListener
> >> >> priority than kvm. With the current code, consider the following scenario:
> >> >>   kvm_region_add() runs before vhost_region_add(); then, in the guest,
> >> >> the vring's desc[i] can refer to addressX in a new region the guest
> >> >> already knows about. But vhost does not know about this new region yet,
> >> >> and the vring handler will fail.
> >> >
> >> > Is there a concrete scenario where this happens?
> >> >
> >> > I can think of situations like the ioeventfd being readable before
> >> > vhost/hostmem is populated.  But I don't see how that's related to the
> >> > priority of kvm_region_add().
> >> >
> >> For kvm, i.e. in the guest: vring_desc.addr can point to a chunk of
> >> data in the newly added memory, and the guest can then kick vhost. vhost
> >> has not added this new region yet, so its local lookup table cannot
> >> translate the new address, and the vring handler will fail. If vhost's
> >> priority is higher than kvm's, it will learn about the new region
> >> earlier than kvm.
> >
> > Isn't the real solution to ensure that the memory API is up-to-date
> > before we notify the guest of memory hotplug?
> >
> No, it is not.
> 
> > I still don't see a kvm vs vhost race.  I see a guest vs vhost race
> > which priority doesn't fix.
> >
> Yes, you are right.
> The required ordering is vhost > guest and kvm > guest, so vhost == kvm
> is OK. But can vhost's priority be higher, and why was 10 chosen rather than zero?
> 
> If the dependency only lies between MemoryListeners and the guest, not
> between listeners, then is the priority meaningless? I think we should
> make sure about this, because if we convert the core listeners to RCU
> style, we will definitely break the ordering of region_add/del, i.e.
> both add and del will come after kvm's.

Okay, so now we're left with the question "what are the ordering
dependencies between memory listeners?".

I poked around with git-blame(1) but didn't find an explanation.  The
best I can come up with is that the core listeners in exec.c update
QEMU's guest RAM and I/O port mappings, which kvm/vhost/xen should be
able to query. Therefore the exec.c listeners have priority 0 or 1.

BTW the commit that introduced priorities is:

  commit 72e22d2fe17b85e56b4f0c437c61c6e2de97b308
  Author: Avi Kivity <avi@redhat.com>
  Date:   Wed Feb 8 15:05:50 2012 +0200

      memory: switch memory listeners to a QTAILQ

Stefan
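
For reference, the priority-ordered insertion that commit introduced
looks roughly like this (lightly simplified; the version current in 2013
also takes an AddressSpace filter argument):

void memory_listener_register(MemoryListener *listener)
{
    MemoryListener *other;

    if (QTAILQ_EMPTY(&memory_listeners)
        || listener->priority >= QTAILQ_LAST(&memory_listeners,
                                             memory_listeners)->priority) {
        /* Highest priority so far: append at the tail. */
        QTAILQ_INSERT_TAIL(&memory_listeners, listener, link);
    } else {
        /* Insert before the first listener with a strictly greater
         * priority, so equal priorities keep registration order. */
        QTAILQ_FOREACH(other, &memory_listeners, link) {
            if (listener->priority < other->priority) {
                break;
            }
        }
        QTAILQ_INSERT_BEFORE(other, listener, link);
    }
}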
pingfan liu May 10, 2013, 9:04 a.m. UTC | #9
On Fri, May 10, 2013 at 3:12 PM, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> On Fri, May 10, 2013 at 02:03:34PM +0800, liu ping fan wrote:
>> On Thu, May 9, 2013 at 11:26 PM, Stefan Hajnoczi <stefanha@redhat.com> wrote:
>> > On Thu, May 09, 2013 at 05:00:20PM +0800, liu ping fan wrote:
>> >> On Thu, May 9, 2013 at 4:30 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
>> >> > On Thu, May 09, 2013 at 08:40:21AM +0800, Liu Ping Fan wrote:
>> >> >> From: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
>> >> >>
>> >> >> Host threads which handle vrings should have a higher MemoryListener
>> >> >> priority than kvm. With the current code, consider the following scenario:
>> >> >>   kvm_region_add() runs before vhost_region_add(); then, in the guest,
>> >> >> the vring's desc[i] can refer to addressX in a new region the guest
>> >> >> already knows about. But vhost does not know about this new region yet,
>> >> >> and the vring handler will fail.
>> >> >
>> >> > Is there a concrete scenario where this happens?
>> >> >
>> >> > I can think of situations like the ioeventfd being readable before
>> >> > vhost/hostmem is populated.  But I don't see how that's related to the
>> >> > priority of kvm_region_add().
>> >> >
>> >> For kvm, i.e. in the guest: vring_desc.addr can point to a chunk of
>> >> data in the newly added memory, and the guest can then kick vhost. vhost
>> >> has not added this new region yet, so its local lookup table cannot
>> >> translate the new address, and the vring handler will fail. If vhost's
>> >> priority is higher than kvm's, it will learn about the new region
>> >> earlier than kvm.
>> >
>> > Isn't the real solution to ensure that the memory API is up-to-date
>> > before we notify the guest of memory hotplug?
>> >
>> No, it is not.
>>
>> > I still don't see a kvm vs vhost race.  I see a guest vs vhost race
>> > which priority doesn't fix.
>> >
>> Yes, you are right.
>> The required ordering is vhost > guest and kvm > guest, so vhost == kvm
>> is OK. But can vhost's priority be higher, and why was 10 chosen rather than zero?
>>
>> If the dependency only lies between MemoryListeners and the guest, not
>> between listeners, then is the priority meaningless? I think we should
>> make sure about this, because if we convert the core listeners to RCU
>> style, we will definitely break the ordering of region_add/del, i.e.
>> both add and del will come after kvm's.
>
> Okay, so now we're left with the question "what are the ordering
> dependencies between memory listeners?".
>
> I poked around with git-blame(1) but didn't find an explanation.  The
> best I can come up with is that the core listeners in exec.c update
> QEMU's guest RAM and I/O port mappings, which kvm/vhost/xen should be
> able to query. Therefore the exec.c listeners have priority 0 or 1.
>
I think
  1. vhost has no relation to the exec.c listeners, right?
  2. kvm does relate to the exec.c listeners, but as discussed, the
real dependency is between the guest and the exec.c listeners.
      So kvm just needs to be ready before the guest, and this is not
enforced by "priority".
  3. xen? I do not know, but I guess it is like kvm.

So can we ignore the priority? It seems redundant, since we rely on
other mechanisms.
> BTW the commit that introduced priorities is:
>
>   commit 72e22d2fe17b85e56b4f0c437c61c6e2de97b308
>   Author: Avi Kivity <avi@redhat.com>
>   Date:   Wed Feb 8 15:05:50 2012 +0200
>
>       memory: switch memory listeners to a QTAILQ
>
> Stefan

Patch

diff --git a/hw/virtio/dataplane/hostmem.c b/hw/virtio/dataplane/hostmem.c
index 37292ff..67cbce1 100644
--- a/hw/virtio/dataplane/hostmem.c
+++ b/hw/virtio/dataplane/hostmem.c
@@ -158,7 +158,7 @@  void hostmem_init(HostMem *hostmem)
         .eventfd_del = hostmem_listener_eventfd_dummy,
         .coalesced_mmio_add = hostmem_listener_coalesced_mmio_dummy,
         .coalesced_mmio_del = hostmem_listener_coalesced_mmio_dummy,
-        .priority = 10,
+        .priority = 9,
     };
 
     memory_listener_register(&hostmem->listener, &address_space_memory);
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index fbabf99..91c313b 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -856,7 +856,7 @@  int vhost_dev_init(struct vhost_dev *hdev, int devfd, const char *devpath,
         .log_global_stop = vhost_log_global_stop,
         .eventfd_add = vhost_eventfd_add,
         .eventfd_del = vhost_eventfd_del,
-        .priority = 10
+        .priority = 9
     };
     hdev->mem = g_malloc0(offsetof(struct vhost_memory, regions));
     hdev->n_mem_sections = 0;