Message ID | 5f91d600c749e66de107f60298c5ebd36645beff.1288892774.git.mst@redhat.com |
---|---|
State | New |
Headers | show |
"Michael S. Tsirkin" <mst@redhat.com> wrote: > There's no reason for tap to run when VM is stopped. > If we let it, it confuses the bridge on TX > and corrupts DMA memory on RX. > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com> once here, what handlers make sense to run while stopped? /me can think of the normal console, non live migration, loadvm and not much more. Perhaps it is easier to just move the other way around? Later, Juan.
On 11/15/2010 08:52 AM, Juan Quintela wrote: > "Michael S. Tsirkin"<mst@redhat.com> wrote: > >> There's no reason for tap to run when VM is stopped. >> If we let it, it confuses the bridge on TX >> and corrupts DMA memory on RX. >> >> Signed-off-by: Michael S. Tsirkin<mst@redhat.com> >> > once here, what handlers make sense to run while stopped? > /me can think of the normal console, non live migration, loadvm and not > much more. Perhaps it is easier to just move the other way around? > I'm not sure I concur that this is really a problem. Semantically, I don't think that stop has to imply that the guest memory no longer changes. Regards, Anthony Liguori > Later, Juan. >
On Mon, Nov 15, 2010 at 08:55:07AM -0600, Anthony Liguori wrote: > On 11/15/2010 08:52 AM, Juan Quintela wrote: > >"Michael S. Tsirkin"<mst@redhat.com> wrote: > >>There's no reason for tap to run when VM is stopped. > >>If we let it, it confuses the bridge on TX > >>and corrupts DMA memory on RX. > >> > >>Signed-off-by: Michael S. Tsirkin<mst@redhat.com> > >once here, what handlers make sense to run while stopped? > >/me can think of the normal console, non live migration, loadvm and not > >much more. Perhaps it is easier to just move the other way around? > > I'm not sure I concur that this is really a problem. > Semantically, I don't think that stop has to imply that the guest > memory no longer changes. > > Regards, > > Anthony Liguori > > >Later, Juan. Well, I do not really know about vmstop that is not for migration. For most vmstop calls are for migration. And there, the problems are very real. First, it's not just memory. At least for network transmit, sending out packets with the same MAC from two locations is a problem. See? For memory, it is much worse: any memory changes can either get discarded or not. This breaks consistency guarantees that guest relies upon. Imagine virtio index getting updated but content not being updated. See?
On Mon, Nov 15, 2010 at 03:52:54PM +0100, Juan Quintela wrote: > "Michael S. Tsirkin" <mst@redhat.com> wrote: > > There's no reason for tap to run when VM is stopped. > > If we let it, it confuses the bridge on TX > > and corrupts DMA memory on RX. > > > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com> > > once here, what handlers make sense to run while stopped? > /me can think of the normal console, non live migration, loadvm and not > much more. Perhaps it is easier to just move the other way around? > > Later, Juan. vnc? SDL? Yes, more devices need to be stopped than not, but I tread carefully to avoid breaking existing functionality. If you could solve it for all devices in one swoop, that'd be great. I'm not up to it.
On 11/15/2010 09:18 AM, Michael S. Tsirkin wrote: > On Mon, Nov 15, 2010 at 08:55:07AM -0600, Anthony Liguori wrote: > >> On 11/15/2010 08:52 AM, Juan Quintela wrote: >> >>> "Michael S. Tsirkin"<mst@redhat.com> wrote: >>> >>>> There's no reason for tap to run when VM is stopped. >>>> If we let it, it confuses the bridge on TX >>>> and corrupts DMA memory on RX. >>>> >>>> Signed-off-by: Michael S. Tsirkin<mst@redhat.com> >>>> >>> once here, what handlers make sense to run while stopped? >>> /me can think of the normal console, non live migration, loadvm and not >>> much more. Perhaps it is easier to just move the other way around? >>> >> I'm not sure I concur that this is really a problem. >> Semantically, I don't think that stop has to imply that the guest >> memory no longer changes. >> >> Regards, >> >> Anthony Liguori >> >> >>> Later, Juan. >>> > Well, I do not really know about vmstop that is not for migration. > They are separate. For instance, we don't rely on stop to pause pending disk I/O because we don't serialize pending disk I/O operations. Instead, we flush all pending I/O and rely on the fact that disk I/O requests are always submitted in the context of a vCPU operation. This assumption breaks down though with ioeventfd so we need to revisit it. > For most vmstop calls are for migration. And there, the problems are very > real. > > First, it's not just memory. At least for network transmit, sending out > packets with the same MAC from two locations is a problem. See? > I agree it's a problem but I'm not sure that just marking fd handlers really helps. Bottom halves can also trigger transmits. I think that if we put something in the network layer that just queued packets if the vm is stopped, it would be a more robust solution to the problem. > For memory, it is much worse: any memory changes can either get > discarded or not. This breaks consistency guarantees that guest relies > upon. Imagine virtio index getting updated but content not being > updated. See? > If you suppress any I/O then the memory changes don't matter because the same changes will happen on the destination too. I think this basic problem is the same as Kemari. We can either attempt to totally freeze a guest which means stopping all callbacks that are device related or we can prevent I/O from happening which should introduce enough determinism to fix the problem in practice. Regards, Anthony Liguori
On Mon, Nov 15, 2010 at 10:09:29AM -0600, Anthony Liguori wrote: > On 11/15/2010 09:18 AM, Michael S. Tsirkin wrote: > >On Mon, Nov 15, 2010 at 08:55:07AM -0600, Anthony Liguori wrote: > >>On 11/15/2010 08:52 AM, Juan Quintela wrote: > >>>"Michael S. Tsirkin"<mst@redhat.com> wrote: > >>>>There's no reason for tap to run when VM is stopped. > >>>>If we let it, it confuses the bridge on TX > >>>>and corrupts DMA memory on RX. > >>>> > >>>>Signed-off-by: Michael S. Tsirkin<mst@redhat.com> > >>>once here, what handlers make sense to run while stopped? > >>>/me can think of the normal console, non live migration, loadvm and not > >>>much more. Perhaps it is easier to just move the other way around? > >>I'm not sure I concur that this is really a problem. > >>Semantically, I don't think that stop has to imply that the guest > >>memory no longer changes. > >> > >>Regards, > >> > >>Anthony Liguori > >> > >>>Later, Juan. > >Well, I do not really know about vmstop that is not for migration. > > They are separate. For instance, we don't rely on stop to pause > pending disk I/O because we don't serialize pending disk I/O > operations. Instead, we flush all pending I/O and rely on the fact > that disk I/O requests are always submitted in the context of a vCPU > operation. This assumption breaks down though with ioeventfd so we > need to revisit it. > > >For most vmstop calls are for migration. And there, the problems are very > >real. > > > >First, it's not just memory. At least for network transmit, sending out > >packets with the same MAC from two locations is a problem. See? > > I agree it's a problem but I'm not sure that just marking fd > handlers really helps. > > Bottom halves can also trigger transmits. Are there any system ones? Can we just stop processing them? > I think that if we put > something in the network layer that just queued packets if the vm is > stopped, it would be a more robust solution to the problem. Will only work for -net. The problem is for anything that can trigger activity when vm is stopped. > >For memory, it is much worse: any memory changes can either get > >discarded or not. This breaks consistency guarantees that guest relies > >upon. Imagine virtio index getting updated but content not being > >updated. See? > > If you suppress any I/O then the memory changes don't matter because > the same changes will happen on the destination too. They matter, and same changes won't happen. Example: virtio used index is in page 1, it can point at data in page 2. device writes into data, *then* into index. Order matters, but won't be preserved: migration assumes memory does not change after vmstop, and so it might send old values for data but new values for index. Result will be invalid data coming into guest. On the destination guest will pick up the index and get bad (stale) data. > I think this basic problem is the same as Kemari. We can either > attempt to totally freeze a guest which means stopping all callbacks > that are device related or we can prevent I/O from happening which > should introduce enough determinism to fix the problem in practice. > > Regards, > > Anthony Liguori See above. IMO it's a different problem. Unlike Kemari, I don't really see any drawbacks to stop all callbacks. Do you?
On 11/15/2010 02:26 PM, Michael S. Tsirkin wrote: > Are there any system ones? > There's one in qemu-char.c. Not sure it's easy to just say for the future that there can't be BHs that get delivered during vmstop. > Can we just stop processing them? > We ran into a lot of problems trying to do this with emulating synchronous I/O in the block layer. If we can avoid the stop-dispatching handlers approach, I think we'll have far fewer problems. >> I think that if we put >> something in the network layer that just queued packets if the vm is >> stopped, it would be a more robust solution to the problem. >> > Will only work for -net. The problem is for anything > that can trigger activity when vm is stopped. > Activity or external I/O? My assertion is that if we can stop things from hitting the disk, escaping the network, or leaving a character device, we're solid. >>> For memory, it is much worse: any memory changes can either get >>> discarded or not. This breaks consistency guarantees that guest relies >>> upon. Imagine virtio index getting updated but content not being >>> updated. See? >>> >> If you suppress any I/O then the memory changes don't matter because >> the same changes will happen on the destination too. >> > They matter, and same changes won't happen. > Example: > > virtio used index is in page 1, it can point at data in page 2. > device writes into data, *then* into index. Order matters, > but won't be preserved: migration assumes memory does not > change after vmstop, and so it might send old values for > data but new values for index. Result will be invalid > data coming into guest. > No, this scenario is invalid. Migration copies data while the guest is live and whenever a change happens, updates the dirty bitmap (that's why cpu_physical_memory_rw matters). Once the VM is stopped, we never return to the main loop before completing migration so nothing else gets an opportunity to run. This means when we send a page, we guarantee it won't be made dirty against until after the migration completes. Once the migration completes, the contents of memory may change but that's okay. As long as you stop packets from being sent, if you need to resume the guest, it'll pick up where it left off. > On the destination guest will pick up the index and > get bad (stale) data. > If you're seeing this happen with vhost, you aren't updating dirty bits correctly. AFAICT, this cannot happen with userspace device models. > >> I think this basic problem is the same as Kemari. We can either >> attempt to totally freeze a guest which means stopping all callbacks >> that are device related or we can prevent I/O from happening which >> should introduce enough determinism to fix the problem in practice. >> >> Regards, >> >> Anthony Liguori >> > > See above. IMO it's a different problem. Unlike Kemari, > I don't really see any drawbacks to stop all callbacks. > Do you? > I think it's going to be extremely difficult to get right and keep right. A bunch of code needs to be converted to make us safe and then becomes brittle as it's very simple to change things in such a way that we're broken again. Regards, Anthony Liguori
On Mon, Nov 15, 2010 at 4:09 PM, Anthony Liguori <anthony@codemonkey.ws> wrote: > On 11/15/2010 09:18 AM, Michael S. Tsirkin wrote: >> >> On Mon, Nov 15, 2010 at 08:55:07AM -0600, Anthony Liguori wrote: >> >>> >>> On 11/15/2010 08:52 AM, Juan Quintela wrote: >>> >>>> >>>> "Michael S. Tsirkin"<mst@redhat.com> wrote: >>>> >>>>> >>>>> There's no reason for tap to run when VM is stopped. >>>>> If we let it, it confuses the bridge on TX >>>>> and corrupts DMA memory on RX. >>>>> >>>>> Signed-off-by: Michael S. Tsirkin<mst@redhat.com> >>>>> >>>> >>>> once here, what handlers make sense to run while stopped? >>>> /me can think of the normal console, non live migration, loadvm and not >>>> much more. Perhaps it is easier to just move the other way around? >>>> >>> >>> I'm not sure I concur that this is really a problem. >>> Semantically, I don't think that stop has to imply that the guest >>> memory no longer changes. >>> >>> Regards, >>> >>> Anthony Liguori >>> >>> >>>> >>>> Later, Juan. >>>> >> >> Well, I do not really know about vmstop that is not for migration. >> > > They are separate. For instance, we don't rely on stop to pause pending > disk I/O because we don't serialize pending disk I/O operations. Instead, > we flush all pending I/O and rely on the fact that disk I/O requests are > always submitted in the context of a vCPU operation. This assumption breaks > down though with ioeventfd so we need to revisit it. vhost has a solution for this: register a VMChangeStateHandler function that stops ioeventfd while vm_stop(0) is in effect. Then poll to see if there is work pending. I will add equivalent functionality to virtio-ioeventfd. Stefan
On 11/15/2010 02:53 PM, Stefan Hajnoczi wrote: > > vhost has a solution for this: register a VMChangeStateHandler > function that stops ioeventfd while vm_stop(0) is in effect. Then > poll to see if there is work pending. > > I will add equivalent functionality to virtio-ioeventfd. > I still think that stopping this at the net/block layer is going to be a lot more robust in the long term. All net/block traffic flows through a single set of interfaces whereas getting the devices to completely stop themselves requires touching every device and making sure that noone adds back vmstop-unfriendly behavior down the road. Regards, Anthony Liguori > Stefan >
On Mon, Nov 15, 2010 at 03:05:39PM -0600, Anthony Liguori wrote: > On 11/15/2010 02:53 PM, Stefan Hajnoczi wrote: > > > >vhost has a solution for this: register a VMChangeStateHandler > >function that stops ioeventfd while vm_stop(0) is in effect. Then > >poll to see if there is work pending. > > > >I will add equivalent functionality to virtio-ioeventfd. > > I still think that stopping this at the net/block layer is going to > be a lot more robust in the long term. > All net/block traffic flows > through a single set of interfaces whereas getting the devices to > completely stop themselves requires touching every device and making > sure that noone adds back vmstop-unfriendly behavior down the road. > > Regards, > > Anthony Liguori So I'm not sure I understand what is proposed here. Care posting a patch?
On 11/15/2010 04:44 PM, Michael S. Tsirkin wrote: > So I'm not sure I understand what is proposed here. > Care posting a patch? > I *think* just adding: if (vm_running == 0) { return qemu_net_queue_append(queue, sender, flags, data, size, NULL); } To qemu_net_queue_send() along with a state notifier to kick the queue on start would do the trick. Regards, Anthony Liguori
diff --git a/hw/vhost_net.c b/hw/vhost_net.c index c068be1..7061d6c 100644 --- a/hw/vhost_net.c +++ b/hw/vhost_net.c @@ -139,7 +139,7 @@ int vhost_net_start(struct vhost_net *net, } net->vc->info->poll(net->vc, false); - qemu_set_fd_handler(net->backend, NULL, NULL, NULL); + qemu_set_fd_handler3(true, net->backend, NULL, NULL, NULL, NULL); file.fd = net->backend; for (file.index = 0; file.index < net->dev.nvqs; ++file.index) { r = ioctl(net->dev.control, VHOST_NET_SET_BACKEND, &file); diff --git a/net/tap.c b/net/tap.c index 4afb314..246ef01 100644 --- a/net/tap.c +++ b/net/tap.c @@ -71,7 +71,8 @@ static void tap_writable(void *opaque); static void tap_update_fd_handler(TAPState *s) { - qemu_set_fd_handler2(s->fd, + qemu_set_fd_handler3(true, + s->fd, s->read_poll ? tap_can_send : NULL, s->read_poll ? tap_send : NULL, s->write_poll ? tap_writable : NULL,
There's no reason for tap to run when VM is stopped. If we let it, it confuses the bridge on TX and corrupts DMA memory on RX. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> --- hw/vhost_net.c | 2 +- net/tap.c | 3 ++- 2 files changed, 3 insertions(+), 2 deletions(-)