Message ID | 1290665220-26478-12-git-send-email-tamura.yoshiaki@lab.ntt.co.jp |
---|---|
State | New |
On Thu, Nov 25, 2010 at 03:06:50PM +0900, Yoshiaki Tamura wrote:
> Record ioport event to replay it upon failover.
>
> Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>

Interesting. This will have to be extended to support ioeventfd.
Since each eventfd is really just a binary trigger
it should be enough to read out the fd state.

> ---
>  ioport.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/ioport.c b/ioport.c
> index aa4188a..74aebf5 100644
> --- a/ioport.c
> +++ b/ioport.c
> @@ -27,6 +27,7 @@
>
>  #include "ioport.h"
>  #include "trace.h"
> +#include "event-tap.h"
>
>  /***********************************************************/
>  /* IO Port */
> @@ -76,6 +77,7 @@ static void ioport_write(int index, uint32_t address, uint32_t data)
>          default_ioport_writel
>      };
>      IOPortWriteFunc *func = ioport_write_table[index][address];
> +    event_tap_ioport(index, address, data);
>      if (!func)
>          func = default_func[index];
>      func(ioport_opaque[address], address, data);
> --
> 1.7.1.2
>
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
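The "binary trigger" remark above can be illustrated with a small Linux sketch. It assumes a non-blocking eventfd in the default (non-semaphore) mode; the helper names `eventfd_read_state` and `eventfd_state_after_kicks` are made up for this illustration and appear in neither QEMU nor the patch. However many times the fd is signalled, a single read reports only "pending" or "not pending".

```c
#include <errno.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Read out the state of a non-blocking eventfd.
 * Returns 1 if it was signalled since the last read, 0 if not, -1 on error.
 * Note that the read itself resets the kernel counter to zero.
 */
static int eventfd_read_state(int fd)
{
    uint64_t count;
    ssize_t r = read(fd, &count, sizeof(count));

    if (r == (ssize_t)sizeof(count)) {
        return 1;               /* count >= 1, however many kicks arrived */
    }
    if (r < 0 && errno == EAGAIN) {
        return 0;               /* counter was zero: nothing pending */
    }
    return -1;
}

/* Hypothetical demo helper: signal the fd `kicks` times, then read its state. */
static int eventfd_state_after_kicks(int kicks)
{
    uint64_t one = 1;
    int fd = eventfd(0, EFD_NONBLOCK);
    int state;

    if (fd < 0) {
        return -1;
    }
    for (int i = 0; i < kicks; i++) {
        if (write(fd, &one, sizeof(one)) != (ssize_t)sizeof(one)) {
            close(fd);
            return -1;
        }
    }
    state = eventfd_read_state(fd);
    close(fd);
    return state;
}
```

Because reading clears the counter, code that only wants to observe the state (for example to record it for a secondary) would presumably have to signal the fd again afterwards; that step is left out of this sketch.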
2010/11/28 Michael S. Tsirkin <mst@redhat.com>:
> On Thu, Nov 25, 2010 at 03:06:50PM +0900, Yoshiaki Tamura wrote:
>> Record ioport event to replay it upon failover.
>>
>> Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
>
> Interesting. This will have to be extended to support ioeventfd.
> Since each eventfd is really just a binary trigger
> it should be enough to read out the fd state.

Haven't thought about eventfd yet. Will try doing it in the next
spin.

Yoshi
2010/11/28 Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>:
> 2010/11/28 Michael S. Tsirkin <mst@redhat.com>:
>> Interesting. This will have to be extended to support ioeventfd.
>> Since each eventfd is really just a binary trigger
>> it should be enough to read out the fd state.
>
> Haven't thought about eventfd yet. Will try doing it in the next
> spin.

Hi Michael,

I looked into eventfd and realized it's only used with vhost now. However, I
believe vhost bypasses the net layer in qemu, and there is no way for Kemari to
detect the outputs. To me, it doesn't make sense to extend this patch to
support eventfd...

Thanks,

Yoshi
On Thu, Dec 16, 2010 at 04:37:41PM +0900, Yoshiaki Tamura wrote:
> Hi Michael,
>
> I looked into eventfd and realized it's only used with vhost now.

There are patches on list to use it for block/userspace net.

> However, I
> believe vhost bypasses the net layer in qemu, and there is no way for Kemari to
> detect the outputs. To me, it doesn't make sense to extend this patch to
> support eventfd...
>
> Thanks,
>
> Yoshi
2010/12/16 Michael S. Tsirkin <mst@redhat.com>:
> On Thu, Dec 16, 2010 at 04:37:41PM +0900, Yoshiaki Tamura wrote:
>> Hi Michael,
>>
>> I looked into eventfd and realized it's only used with vhost now.
>
> There are patches on list to use it for block/userspace net.

Thanks. Now I understand.
In that case, would inserting an event-tap function into the following
code be appropriate?

int event_notifier_test_and_clear(EventNotifier *e)
{
    uint64_t value;
    int r = read(e->fd, &value, sizeof(value));
    return r == sizeof(value);
}

>> However, I
>> believe vhost bypasses the net layer in qemu, and there is no way for Kemari to
>> detect the outputs. To me, it doesn't make sense to extend this patch to
>> support eventfd...
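One way the proposal above might look is sketched below. This is only an illustration under stated assumptions: `event_tap_eventfd` and the `tapped_events` counter are hypothetical names that do not exist in the patch series, the `EventNotifier` struct is a minimal stand-in for QEMU's type, and `tap_demo` exists purely to exercise the function. The tap is called only when a notification was actually pending, so an empty read records nothing.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Minimal stand-in for QEMU's EventNotifier; only the fd field is used here. */
typedef struct EventNotifier {
    int fd;
} EventNotifier;

/* Hypothetical tap: counts the notifications it saw. Not part of the patch. */
static unsigned long tapped_events;

static void event_tap_eventfd(EventNotifier *e)
{
    (void)e;
    tapped_events++;
}

int event_notifier_test_and_clear(EventNotifier *e)
{
    uint64_t value;
    ssize_t r = read(e->fd, &value, sizeof(value));

    if (r == (ssize_t)sizeof(value)) {
        event_tap_eventfd(e);   /* record only when a notification was pending */
        return 1;
    }
    return 0;
}

/* Hypothetical demo: one signal, then two test-and-clear calls. */
static int tap_demo(void)
{
    EventNotifier e = { .fd = eventfd(0, EFD_NONBLOCK) };
    uint64_t one = 1;
    int seen = 0;

    if (e.fd < 0) {
        return -1;
    }
    if (write(e.fd, &one, sizeof(one)) == (ssize_t)sizeof(one)) {
        seen += event_notifier_test_and_clear(&e);  /* pending: returns 1 */
        seen += event_notifier_test_and_clear(&e);  /* already cleared: returns 0 */
    }
    close(e.fd);
    return seen;
}
```

Whether this function is the right place for a tap is exactly the open question in the thread; the sketch only shows that the hook itself would be small.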
On Thu, Dec 16, 2010 at 06:50:04PM +0900, Yoshiaki Tamura wrote:
> 2010/12/16 Michael S. Tsirkin <mst@redhat.com>:
> > There are patches on list to use it for block/userspace net.
>
> Thanks. Now I understand.
> In that case, would inserting an event-tap function into the following
> code be appropriate?
>
> int event_notifier_test_and_clear(EventNotifier *e)
> {
>     uint64_t value;
>     int r = read(e->fd, &value, sizeof(value));
>     return r == sizeof(value);
> }

Possibly.

> >> However, I
> >> believe vhost bypasses the net layer in qemu, and there is no way for Kemari to
> >> detect the outputs.

Then maybe you should check for this combination and either disable
vhost-net on the backend when Kemari is active or fail.

> >> To me, it doesn't make sense to extend this patch to
> >> support eventfd...
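The "check for this combination ... or fail" suggestion could be sketched roughly as follows. Everything here is an assumption made for illustration: `NetBackendInfo` and `kemari_find_vhost_backend` are hypothetical names and do not correspond to QEMU structures or to anything in the Kemari series.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of a configured net backend; not a QEMU type. */
typedef struct NetBackendInfo {
    const char *name;
    bool vhost_enabled;
} NetBackendInfo;

/*
 * Returns NULL if Kemari may start, or the name of the first backend that
 * uses vhost and therefore hides its output from the event tap.
 */
static const char *kemari_find_vhost_backend(const NetBackendInfo *backends,
                                             size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (backends[i].vhost_enabled) {
            return backends[i].name;
        }
    }
    return NULL;
}
```

A caller would refuse to start replication (or turn vhost off on that backend) when the result is non-NULL, and could use the returned name in the error message.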
On Thu, Dec 16, 2010 at 9:50 AM, Yoshiaki Tamura
<tamura.yoshiaki@lab.ntt.co.jp> wrote:
> 2010/12/16 Michael S. Tsirkin <mst@redhat.com>:
>> There are patches on list to use it for block/userspace net.
>
> Thanks. Now I understand.
> In that case, would inserting an event-tap function into the following
> code be appropriate?
>
> int event_notifier_test_and_clear(EventNotifier *e)
> {
>     uint64_t value;
>     int r = read(e->fd, &value, sizeof(value));
>     return r == sizeof(value);
> }
>
>>> However, I
>>> believe vhost bypasses the net layer in qemu, and there is no way for Kemari to
>>> detect the outputs. To me, it doesn't make sense to extend this patch to
>>> support eventfd...

Here is the userspace ioeventfd patch series:
http://www.mail-archive.com/qemu-devel@nongnu.org/msg49208.html

Instead of switching to QEMU userspace to handle the virtqueue kick
pio write, we signal the eventfd inside the kernel and resume guest
code execution. The I/O thread can then process the virtqueue kick in
parallel to guest code execution.

I think this can still be tied into Kemari. If you are switching to a
pure net/block-layer event tap instead of pio/mmio, then I think it
should just work.

For vhost it would be more difficult to integrate with Kemari.

Stefan
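The ioeventfd flow described above (the kernel signals an eventfd, and the I/O thread later notices it) can be approximated in plain userspace. In this sketch, which is not taken from the referenced patch series, an ordinary `write()` stands in for the kernel-side signal and `io_loop_iteration` is a hypothetical single pass of an I/O loop; real QEMU would process the virtqueue where the counter is incremented.

```c
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Hypothetical single pass of an I/O loop watching one kick eventfd.
 * Returns 1 if a kick was handled, 0 if nothing was pending, -1 on error.
 */
static int io_loop_iteration(int kick_fd, int timeout_ms, unsigned *kicks_handled)
{
    struct pollfd pfd = { .fd = kick_fd, .events = POLLIN };
    uint64_t count;
    int ready = poll(&pfd, 1, timeout_ms);

    if (ready < 0) {
        return -1;
    }
    if (ready == 0 || !(pfd.revents & POLLIN)) {
        return 0;
    }
    if (read(kick_fd, &count, sizeof(count)) != (ssize_t)sizeof(count)) {
        return -1;
    }
    (*kicks_handled)++;         /* the virtqueue would be processed here */
    return 1;
}

/* Hypothetical demo: two signals arrive before the loop runs. */
static int io_loop_demo(void)
{
    unsigned handled = 0;
    uint64_t one = 1;
    int fd = eventfd(0, EFD_NONBLOCK);

    if (fd < 0) {
        return -1;
    }
    /* These writes stand in for the kernel signalling on guest pio writes. */
    if (write(fd, &one, sizeof(one)) != (ssize_t)sizeof(one) ||
        write(fd, &one, sizeof(one)) != (ssize_t)sizeof(one)) {
        close(fd);
        return -1;
    }
    io_loop_iteration(fd, 0, &handled);   /* handles both kicks at once */
    io_loop_iteration(fd, 0, &handled);   /* nothing left to handle */
    close(fd);
    return (int)handled;
}
```

The demo shows that two kicks collapse into one wake-up, so the I/O thread sees only that a kick happened, not how many pio writes caused it. A pio-level tap that counts individual writes would therefore not see the same thing as a tap placed in the I/O loop.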
2010/12/17 Stefan Hajnoczi <stefanha@gmail.com>:
> Here is the userspace ioeventfd patch series:
> http://www.mail-archive.com/qemu-devel@nongnu.org/msg49208.html
>
> Instead of switching to QEMU userspace to handle the virtqueue kick
> pio write, we signal the eventfd inside the kernel and resume guest
> code execution. The I/O thread can then process the virtqueue kick in
> parallel to guest code execution.
>
> I think this can still be tied into Kemari. If you are switching to a
> pure net/block-layer event tap instead of pio/mmio, then I think it
> should just work.

That will take a while, until we work out how to set the correct
callbacks on the secondary upon failover. BTW, do you have a
plan to move the eventfd framework to an upper layer, like
pio/mmio? Not only would Kemari work for free, other emulators
should be able to benefit from it as well.

> For vhost it would be more difficult to integrate with Kemari.

At this point, it's impossible. As Michael said, I should
prevent starting Kemari when vhost=on.

Yoshi
On Fri, Dec 17, 2010 at 4:19 PM, Yoshiaki Tamura
<tamura.yoshiaki@lab.ntt.co.jp> wrote:
> 2010/12/17 Stefan Hajnoczi <stefanha@gmail.com>:
>> I think this can still be tied into Kemari. If you are switching to a
>> pure net/block-layer event tap instead of pio/mmio, then I think it
>> should just work.
>
> That will take a while, until we work out how to set the correct
> callbacks on the secondary upon failover. BTW, do you have a
> plan to move the eventfd framework to an upper layer, like
> pio/mmio? Not only would Kemari work for free, other emulators
> should be able to benefit from it as well.

I'm not sure I understand the question but I have considered making
ioeventfd a first-class interface like register_ioport_write(). In
some ways that would be cleaner than the way we use ioeventfd in vhost
and virtio-pci today.

>> For vhost it would be more difficult to integrate with Kemari.
>
> At this point, it's impossible. As Michael said, I should
> prevent starting Kemari when vhost=on.

If you add some functionality to vhost it might be possible, although
that would slow it down. So perhaps for the near future using vhost
with Kemari is pointless anyway since you won't be able to reach the
performance that vhost-net can achieve.

Stefan
diff --git a/ioport.c b/ioport.c
index aa4188a..74aebf5 100644
--- a/ioport.c
+++ b/ioport.c
@@ -27,6 +27,7 @@
 
 #include "ioport.h"
 #include "trace.h"
+#include "event-tap.h"
 
 /***********************************************************/
 /* IO Port */
@@ -76,6 +77,7 @@ static void ioport_write(int index, uint32_t address, uint32_t data)
         default_ioport_writel
     };
     IOPortWriteFunc *func = ioport_write_table[index][address];
+    event_tap_ioport(index, address, data);
     if (!func)
         func = default_func[index];
     func(ioport_opaque[address], address, data);
Record ioport event to replay it upon failover.

Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
---
 ioport.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)
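To make "record ... to replay it upon failover" concrete, here is a minimal record-and-replay sketch. It is not the event-tap.c implementation from this series, which is not shown on this page. The `IoportEvent` struct, the fixed-size log, and the `event_log_record`, `event_log_replay` and `sum_data` helpers are all hypothetical; only the `(index, address, data)` triple is taken from the `event_tap_ioport()` call in the diff.

```c
#include <stddef.h>
#include <stdint.h>

/* One recorded ioport write; fields mirror the arguments of ioport_write(). */
typedef struct IoportEvent {
    int index;
    uint32_t address;
    uint32_t data;
} IoportEvent;

#define EVENT_LOG_MAX 64

static IoportEvent event_log[EVENT_LOG_MAX];
static size_t event_log_len;

/* Hypothetical recorder: returns 0 on success, -1 if the log is full. */
static int event_log_record(int index, uint32_t address, uint32_t data)
{
    if (event_log_len == EVENT_LOG_MAX) {
        return -1;
    }
    event_log[event_log_len++] = (IoportEvent){ index, address, data };
    return 0;
}

typedef void (*IoportReplayFunc)(int index, uint32_t address, uint32_t data,
                                 void *opaque);

/* Replay events in recording order, clear the log, return how many ran. */
static size_t event_log_replay(IoportReplayFunc fn, void *opaque)
{
    size_t n = event_log_len;

    for (size_t i = 0; i < n; i++) {
        fn(event_log[i].index, event_log[i].address, event_log[i].data, opaque);
    }
    event_log_len = 0;
    return n;
}

/* Demo callback: accumulates the data values instead of touching a device. */
static void sum_data(int index, uint32_t address, uint32_t data, void *opaque)
{
    (void)index;
    (void)address;
    *(uint64_t *)opaque += data;
}
```

On a real secondary the replay callback would presumably re-issue the write through the ioport handlers rather than sum the values; the callback parameter keeps that choice outside the log itself.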