Patchwork [11/21] ioport: insert event_tap_ioport() to ioport_write().

Submitter Yoshiaki Tamura
Date Nov. 25, 2010, 6:06 a.m.
Message ID <1290665220-26478-12-git-send-email-tamura.yoshiaki@lab.ntt.co.jp>
Permalink /patch/72988/
State New

Comments

Yoshiaki Tamura - Nov. 25, 2010, 6:06 a.m.
Record ioport event to replay it upon failover.

Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
---
 ioport.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)
Michael S. Tsirkin - Nov. 28, 2010, 9:40 a.m.
On Thu, Nov 25, 2010 at 03:06:50PM +0900, Yoshiaki Tamura wrote:
> Record ioport event to replay it upon failover.
> 
> Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>

Interesting. This will have to be extended to support ioeventfd.
Since each eventfd is really just a binary trigger
it should be enough to read out the fd state.
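
As a rough illustration of the "binary trigger" point: an eventfd holds a
64-bit counter, and a single read() returns the accumulated count and resets
it to zero, so any number of signals collapses into one "was it triggered?"
answer.  The sketch below is standalone Linux C, not QEMU code.  The helper
names eventfd_was_triggered() and eventfd_signal_once() are made up for this
example, and the fd is assumed to have been created with EFD_NONBLOCK.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Illustrative helper (hypothetical name, not QEMU code): drain a
 * non-blocking eventfd and report whether it was signalled since the
 * previous read.  Returns 1 if at least one signal was pending, else 0.
 */
static int eventfd_was_triggered(int fd)
{
    uint64_t count;
    ssize_t r = read(fd, &count, sizeof(count));

    /* With EFD_NONBLOCK, read() fails with EAGAIN when the counter is 0. */
    return r == (ssize_t)sizeof(count);
}

/*
 * Illustrative helper (hypothetical name): add 1 to the eventfd counter.
 * This stands in for whoever signals the fd.  Returns 0 on success.
 */
static int eventfd_signal_once(int fd)
{
    uint64_t one = 1;

    return write(fd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}
```

Signalling twice and then reading once still yields a single "triggered"
result, and the next read reports nothing pending, which is why reading out
the fd state should be enough to capture it.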

> ---
>  ioport.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
> 
> diff --git a/ioport.c b/ioport.c
> index aa4188a..74aebf5 100644
> --- a/ioport.c
> +++ b/ioport.c
> @@ -27,6 +27,7 @@
>  
>  #include "ioport.h"
>  #include "trace.h"
> +#include "event-tap.h"
>  
>  /***********************************************************/
>  /* IO Port */
> @@ -76,6 +77,7 @@ static void ioport_write(int index, uint32_t address, uint32_t data)
>          default_ioport_writel
>      };
>      IOPortWriteFunc *func = ioport_write_table[index][address];
> +    event_tap_ioport(index, address, data);
>      if (!func)
>          func = default_func[index];
>      func(ioport_opaque[address], address, data);
> -- 
> 1.7.1.2
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yoshiaki Tamura - Nov. 28, 2010, noon
2010/11/28 Michael S. Tsirkin <mst@redhat.com>:
> On Thu, Nov 25, 2010 at 03:06:50PM +0900, Yoshiaki Tamura wrote:
>> Record ioport event to replay it upon failover.
>>
>> Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
>
> Interesting. This will have to be extended to support ioeventfd.
> Since each eventfd is really just a binary trigger
> it should be enough to read out the fd state.

Haven't thought about eventfd yet.  Will try doing it in the next
spin.

Yoshi

Yoshiaki Tamura - Dec. 16, 2010, 7:37 a.m.
2010/11/28 Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>:
> 2010/11/28 Michael S. Tsirkin <mst@redhat.com>:
>> On Thu, Nov 25, 2010 at 03:06:50PM +0900, Yoshiaki Tamura wrote:
>>> Record ioport event to replay it upon failover.
>>>
>>> Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
>>
>> Interesting. This will have to be extended to support ioeventfd.
>> Since each eventfd is really just a binary trigger
>> it should be enough to read out the fd state.
>
> Haven't thought about eventfd yet.  Will try doing it in the next
> spin.

Hi Michael,

I looked into eventfd and realized it's currently only used with vhost.
However, I believe vhost bypasses the net layer in qemu, so there is no way
for Kemari to detect the outputs.  To me, it doesn't make sense to extend this
patch to support eventfd...

Thanks,

Yoshi

Michael S. Tsirkin - Dec. 16, 2010, 9:22 a.m.
On Thu, Dec 16, 2010 at 04:37:41PM +0900, Yoshiaki Tamura wrote:
> 2010/11/28 Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>:
> > 2010/11/28 Michael S. Tsirkin <mst@redhat.com>:
> >> On Thu, Nov 25, 2010 at 03:06:50PM +0900, Yoshiaki Tamura wrote:
> >>> Record ioport event to replay it upon failover.
> >>>
> >>> Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
> >>
> >> Interesting. This will have to be extended to support ioeventfd.
> >> Since each eventfd is really just a binary trigger
> >> it should be enough to read out the fd state.
> >
> > Haven't thought about eventfd yet.  Will try doing it in the next
> > spin.
> 
> Hi Michael,
> 
> I looked into eventfd and realized it's only used with vhost now.

There are patches on list to use it for block/userspace net.

>  However, I
> believe vhost bypass the net layer in qemu, and there is no way for Kemari to
> detect the outputs.  To me, it doesn't make sense to extend this patch to
> support eventfd...
Yoshiaki Tamura - Dec. 16, 2010, 9:50 a.m.
2010/12/16 Michael S. Tsirkin <mst@redhat.com>:
> On Thu, Dec 16, 2010 at 04:37:41PM +0900, Yoshiaki Tamura wrote:
>> 2010/11/28 Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>:
>> > 2010/11/28 Michael S. Tsirkin <mst@redhat.com>:
>> >> On Thu, Nov 25, 2010 at 03:06:50PM +0900, Yoshiaki Tamura wrote:
>> >>> Record ioport event to replay it upon failover.
>> >>>
>> >>> Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
>> >>
>> >> Interesting. This will have to be extended to support ioeventfd.
>> >> Since each eventfd is really just a binary trigger
>> >> it should be enough to read out the fd state.
>> >
>> > Haven't thought about eventfd yet.  Will try doing it in the next
>> > spin.
>>
>> Hi Michael,
>>
>> I looked into eventfd and realized it's only used with vhost now.
>
> There are patches on list to use it for block/userspace net.

Thanks.  Now I understand.
In that case, would inserting an event-tap function into the following code
be appropriate?

int event_notifier_test_and_clear(EventNotifier *e)
{
    uint64_t value;
    int r = read(e->fd, &value, sizeof(value));
    return r == sizeof(value);
}
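
To make the question concrete, here is a minimal sketch of where such a hook
could sit.  It is only an assumption about the shape of the change:
event_tap_notifier() and the tapped_notifications counter are hypothetical
names that do not exist in this series, and the EventNotifier struct is a
stand-in that models just the fd field.

```c
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Stand-in for QEMU's EventNotifier; only the fd field is modelled here. */
typedef struct EventNotifier {
    int fd;
} EventNotifier;

/* Hypothetical bookkeeping: how many notifications the tap has seen. */
static unsigned long tapped_notifications;

/* Hypothetical tap hook; a real one would record the event for replay. */
static void event_tap_notifier(EventNotifier *e)
{
    (void)e;
    tapped_notifications++;
}

int event_notifier_test_and_clear(EventNotifier *e)
{
    uint64_t value;
    ssize_t r = read(e->fd, &value, sizeof(value));

    if (r != (ssize_t)sizeof(value)) {
        return 0;               /* nothing pending: do not tap */
    }
    event_tap_notifier(e);      /* tap only when a notification was consumed */
    return 1;
}
```

Tapping after the successful read means a notification is recorded only
once it has actually been consumed, which is one possible choice; recording
it before the handler runs may or may not be what failover replay needs.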

>
>>  However, I
>> believe vhost bypass the net layer in qemu, and there is no way for Kemari to
>> detect the outputs.  To me, it doesn't make sense to extend this patch to
>> support eventfd...
Michael S. Tsirkin - Dec. 16, 2010, 9:54 a.m.
On Thu, Dec 16, 2010 at 06:50:04PM +0900, Yoshiaki Tamura wrote:
> 2010/12/16 Michael S. Tsirkin <mst@redhat.com>:
> > On Thu, Dec 16, 2010 at 04:37:41PM +0900, Yoshiaki Tamura wrote:
> >> 2010/11/28 Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>:
> >> > 2010/11/28 Michael S. Tsirkin <mst@redhat.com>:
> >> >> On Thu, Nov 25, 2010 at 03:06:50PM +0900, Yoshiaki Tamura wrote:
> >> >>> Record ioport event to replay it upon failover.
> >> >>>
> >> >>> Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
> >> >>
> >> >> Interesting. This will have to be extended to support ioeventfd.
> >> >> Since each eventfd is really just a binary trigger
> >> >> it should be enough to read out the fd state.
> >> >
> >> > Haven't thought about eventfd yet.  Will try doing it in the next
> >> > spin.
> >>
> >> Hi Michael,
> >>
> >> I looked into eventfd and realized it's only used with vhost now.
> >
> > There are patches on list to use it for block/userspace net.
> 
> Thanks.  Now I understand.
> In that case, inserting an even-tap function to the following code
> should be appropriate?
> 
> int event_notifier_test_and_clear(EventNotifier *e)
> {
>     uint64_t value;
>     int r = read(e->fd, &value, sizeof(value));
>     return r == sizeof(value);
> }

Possibly.

> >
> >>  However, I
> >> believe vhost bypass the net layer in qemu, and there is no way for Kemari to
> >> detect the outputs.


Then maybe you should check for this combination and either disable
vhost-net on the backend when kemari is active or fail.
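
The two options could look roughly like the sketch below.  Everything in it
is hypothetical: NetConfig, KemariVhostPolicy and kemari_check_vhost() are
illustrative names, not existing QEMU or Kemari interfaces.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical policy: what to do when Kemari and vhost-net are combined. */
typedef enum {
    KEMARI_VHOST_FAIL,      /* refuse to start */
    KEMARI_VHOST_DISABLE    /* turn vhost off and use userspace virtio-net */
} KemariVhostPolicy;

/* Hypothetical configuration; the field names are illustrative only. */
typedef struct {
    bool kemari_active;
    bool vhost_on;
} NetConfig;

/*
 * Returns 0 if the configuration is usable (possibly after switching
 * vhost off), or -1 if the combination is rejected.
 */
static int kemari_check_vhost(NetConfig *cfg, KemariVhostPolicy policy)
{
    if (!cfg->kemari_active || !cfg->vhost_on) {
        return 0;           /* no conflict */
    }
    if (policy == KEMARI_VHOST_DISABLE) {
        fprintf(stderr, "kemari: disabling vhost-net on the backend\n");
        cfg->vhost_on = false;
        return 0;
    }
    fprintf(stderr, "kemari: cannot be used together with vhost=on\n");
    return -1;
}
```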

> >>  To me, it doesn't make sense to extend this patch to
> >> support eventfd...
Stefan Hajnoczi - Dec. 16, 2010, 4:27 p.m.
On Thu, Dec 16, 2010 at 9:50 AM, Yoshiaki Tamura
<tamura.yoshiaki@lab.ntt.co.jp> wrote:
> 2010/12/16 Michael S. Tsirkin <mst@redhat.com>:
>> On Thu, Dec 16, 2010 at 04:37:41PM +0900, Yoshiaki Tamura wrote:
>>> 2010/11/28 Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>:
>>> > 2010/11/28 Michael S. Tsirkin <mst@redhat.com>:
>>> >> On Thu, Nov 25, 2010 at 03:06:50PM +0900, Yoshiaki Tamura wrote:
>>> >>> Record ioport event to replay it upon failover.
>>> >>>
>>> >>> Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
>>> >>
>>> >> Interesting. This will have to be extended to support ioeventfd.
>>> >> Since each eventfd is really just a binary trigger
>>> >> it should be enough to read out the fd state.
>>> >
>>> > Haven't thought about eventfd yet.  Will try doing it in the next
>>> > spin.
>>>
>>> Hi Michael,
>>>
>>> I looked into eventfd and realized it's only used with vhost now.
>>
>> There are patches on list to use it for block/userspace net.
>
> Thanks.  Now I understand.
> In that case, inserting an even-tap function to the following code
> should be appropriate?
>
> int event_notifier_test_and_clear(EventNotifier *e)
> {
>    uint64_t value;
>    int r = read(e->fd, &value, sizeof(value));
>    return r == sizeof(value);
> }
>
>>
>>>  However, I
>>> believe vhost bypass the net layer in qemu, and there is no way for Kemari to
>>> detect the outputs.  To me, it doesn't make sense to extend this patch to
>>> support eventfd...

Here is the userspace ioeventfd patch series:
http://www.mail-archive.com/qemu-devel@nongnu.org/msg49208.html

Instead of switching to QEMU userspace to handle the virtqueue kick
pio write, we signal the eventfd inside the kernel and resume guest
code execution.  The I/O thread can then process the virtqueue kick in
parallel to guest code execution.

I think this can still be tied into Kemari.  If you are switching to a
pure net/block-layer event tap instead of pio/mmio, then I think it
should just work.

For vhost it would be more difficult to integrate with Kemari.

Stefan
Yoshiaki Tamura - Dec. 17, 2010, 4:19 p.m.
2010/12/17 Stefan Hajnoczi <stefanha@gmail.com>:
> On Thu, Dec 16, 2010 at 9:50 AM, Yoshiaki Tamura
> <tamura.yoshiaki@lab.ntt.co.jp> wrote:
>> 2010/12/16 Michael S. Tsirkin <mst@redhat.com>:
>>> On Thu, Dec 16, 2010 at 04:37:41PM +0900, Yoshiaki Tamura wrote:
>>>> 2010/11/28 Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>:
>>>> > 2010/11/28 Michael S. Tsirkin <mst@redhat.com>:
>>>> >> On Thu, Nov 25, 2010 at 03:06:50PM +0900, Yoshiaki Tamura wrote:
>>>> >>> Record ioport event to replay it upon failover.
>>>> >>>
>>>> >>> Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
>>>> >>
>>>> >> Interesting. This will have to be extended to support ioeventfd.
>>>> >> Since each eventfd is really just a binary trigger
>>>> >> it should be enough to read out the fd state.
>>>> >
>>>> > Haven't thought about eventfd yet.  Will try doing it in the next
>>>> > spin.
>>>>
>>>> Hi Michael,
>>>>
>>>> I looked into eventfd and realized it's only used with vhost now.
>>>
>>> There are patches on list to use it for block/userspace net.
>>
>> Thanks.  Now I understand.
>> In that case, inserting an even-tap function to the following code
>> should be appropriate?
>>
>> int event_notifier_test_and_clear(EventNotifier *e)
>> {
>>    uint64_t value;
>>    int r = read(e->fd, &value, sizeof(value));
>>    return r == sizeof(value);
>> }
>>
>>>
>>>>  However, I
>>>> believe vhost bypass the net layer in qemu, and there is no way for Kemari to
>>>> detect the outputs.  To me, it doesn't make sense to extend this patch to
>>>> support eventfd...
>
> Here is the userspace ioeventfd patch series:
> http://www.mail-archive.com/qemu-devel@nongnu.org/msg49208.html
>
> Instead of switching to QEMU userspace to handle the virtqueue kick
> pio write, we signal the eventfd inside the kernel and resume guest
> code execution.  The I/O thread can then process the virtqueue kick in
> parallel to guest code execution.
>
> I think this can still be tied into Kemari.  If you are switching to a
> pure net/block-layer event tap instead of pio/mmio, then I think it
> should just work.

That will take a while, until we work out how to set the correct
callbacks on the secondary upon failover.  BTW, do you have a
plan to move the eventfd framework to an upper layer, like
pio/mmio?  Not only would Kemari work for free, other emulators
should be able to benefit from it too.

> For vhost it would be more difficult to integrate with Kemari.

At this point, it's impossible.  As Michael said, I should
prevent starting Kemari when vhost=on.

Yoshi

Stefan Hajnoczi - Dec. 18, 2010, 8:36 a.m.
On Fri, Dec 17, 2010 at 4:19 PM, Yoshiaki Tamura
<tamura.yoshiaki@lab.ntt.co.jp> wrote:
> 2010/12/17 Stefan Hajnoczi <stefanha@gmail.com>:
>> On Thu, Dec 16, 2010 at 9:50 AM, Yoshiaki Tamura
>> <tamura.yoshiaki@lab.ntt.co.jp> wrote:
>>> 2010/12/16 Michael S. Tsirkin <mst@redhat.com>:
>>>> On Thu, Dec 16, 2010 at 04:37:41PM +0900, Yoshiaki Tamura wrote:
>>>>> 2010/11/28 Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>:
>>>>> > 2010/11/28 Michael S. Tsirkin <mst@redhat.com>:
>>>>> >> On Thu, Nov 25, 2010 at 03:06:50PM +0900, Yoshiaki Tamura wrote:
>>>>> >>> Record ioport event to replay it upon failover.
>>>>> >>>
>>>>> >>> Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
>>>>> >>
>>>>> >> Interesting. This will have to be extended to support ioeventfd.
>>>>> >> Since each eventfd is really just a binary trigger
>>>>> >> it should be enough to read out the fd state.
>>>>> >
>>>>> > Haven't thought about eventfd yet.  Will try doing it in the next
>>>>> > spin.
>>>>>
>>>>> Hi Michael,
>>>>>
>>>>> I looked into eventfd and realized it's only used with vhost now.
>>>>
>>>> There are patches on list to use it for block/userspace net.
>>>
>>> Thanks.  Now I understand.
>>> In that case, inserting an even-tap function to the following code
>>> should be appropriate?
>>>
>>> int event_notifier_test_and_clear(EventNotifier *e)
>>> {
>>>    uint64_t value;
>>>    int r = read(e->fd, &value, sizeof(value));
>>>    return r == sizeof(value);
>>> }
>>>
>>>>
>>>>>  However, I
>>>>> believe vhost bypass the net layer in qemu, and there is no way for Kemari to
>>>>> detect the outputs.  To me, it doesn't make sense to extend this patch to
>>>>> support eventfd...
>>
>> Here is the userspace ioeventfd patch series:
>> http://www.mail-archive.com/qemu-devel@nongnu.org/msg49208.html
>>
>> Instead of switching to QEMU userspace to handle the virtqueue kick
>> pio write, we signal the eventfd inside the kernel and resume guest
>> code execution.  The I/O thread can then process the virtqueue kick in
>> parallel to guest code execution.
>>
>> I think this can still be tied into Kemari.  If you are switching to a
>> pure net/block-layer event tap instead of pio/mmio, then I think it
>> should just work.
>
> That should take a while until we solve how to set correct
> callbacks to the secondary upon failover.  BTW, do you have a
> plan to move the eventfd framework to the upper layer as
> pio/mmio.  Not only Kemari works for free, other emulators should
> be able to benefit from it.

I'm not sure I understand the question but I have considered making
ioeventfd a first-class interface like register_ioport_write().  In
some ways that would be cleaner than the way we use ioeventfd in vhost
and virtio-pci today.

>> For vhost it would be more difficult to integrate with Kemari.
>
> At this point, it's impossible.  As Michael said, I should
> prevent starting Kemari when vhost=on.

If you add some functionality to vhost it might be possible, although
that would slow it down.  So perhaps for the near future using vhost
with Kemari is pointless anyway since you won't be able to reach the
performance that vhost-net can achieve.

Stefan

Patch

diff --git a/ioport.c b/ioport.c
index aa4188a..74aebf5 100644
--- a/ioport.c
+++ b/ioport.c
@@ -27,6 +27,7 @@ 
 
 #include "ioport.h"
 #include "trace.h"
+#include "event-tap.h"
 
 /***********************************************************/
 /* IO Port */
@@ -76,6 +77,7 @@  static void ioport_write(int index, uint32_t address, uint32_t data)
         default_ioport_writel
     };
     IOPortWriteFunc *func = ioport_write_table[index][address];
+    event_tap_ioport(index, address, data);
     if (!func)
         func = default_func[index];
     func(ioport_opaque[address], address, data);
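
The diff above only adds the call site; the body of event_tap_ioport() is
not part of this patch.  As a way to picture "record ioport event to replay
it upon failover", here is a minimal, self-contained sketch of a
record-and-replay log.  All names in it (IoportEvent, IoportLog,
ioport_log_record(), ioport_log_replay(), demo_write()) are hypothetical
and are not taken from the Kemari series.

```c
#include <stddef.h>
#include <stdint.h>

#define TAP_LOG_MAX 64

/* One recorded write: the same three values ioport_write() receives. */
typedef struct {
    int index;
    uint32_t address;
    uint32_t data;
} IoportEvent;

/* Fixed-size log of pending events (hypothetical structure). */
typedef struct {
    IoportEvent ev[TAP_LOG_MAX];
    size_t len;
} IoportLog;

/* Signature of the function that re-issues a write during replay. */
typedef void IoportWriteFn(int index, uint32_t address, uint32_t data);

/* Append one event to the log; returns -1 if the log is full. */
static int ioport_log_record(IoportLog *log, int index,
                             uint32_t address, uint32_t data)
{
    if (log->len == TAP_LOG_MAX) {
        return -1;
    }
    log->ev[log->len++] = (IoportEvent){ index, address, data };
    return 0;
}

/* Re-issue every logged write in order, then empty the log. */
static size_t ioport_log_replay(IoportLog *log, IoportWriteFn *write_fn)
{
    size_t n = log->len;

    for (size_t i = 0; i < n; i++) {
        write_fn(log->ev[i].index, log->ev[i].address, log->ev[i].data);
    }
    log->len = 0;
    return n;
}

/* Example sink used only for demonstration: remembers the last write. */
static unsigned demo_write_calls;
static uint32_t demo_last_data;

static void demo_write(int index, uint32_t address, uint32_t data)
{
    (void)index;
    (void)address;
    demo_write_calls++;
    demo_last_data = data;
}
```

In this model the primary would call ioport_log_record() at the point where
the patch calls event_tap_ioport(), and the secondary would call
ioport_log_replay() with its own write routine after failover.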