| Message ID | 152044905701.22965.8924520463227226351.stgit@bahia.lan |
|---|---|
| State | New |
| Series | virtio_net: flush uncompleted TX on reset |
Greg Kurz <groug@kaod.org> wrote on 08/03/2018 12:27:37 AM:

> From: Greg Kurz <groug@kaod.org>
> To: qemu-devel@nongnu.org
> Cc: "Michael S. Tsirkin" <mst@redhat.com>, Jason Wang
> <jasowang@redhat.com>, R Nageswara Sastry <nasastry@in.ibm.com>
> Date: 08/03/2018 12:27 AM
> Subject: [PATCH] virtio_net: flush uncompleted TX on reset
>
> If the backend could not transmit a packet right away for some reason,
> the packet is queued for asynchronous sending. The corresponding vq
> element is tracked in the async_tx.elem field of the VirtIONetQueue,
> for later freeing when the transmission is complete.
>
> If a reset happens before completion, virtio_net_tx_complete() will push
> async_tx.elem back to the guest anyway, and we end up with the inuse flag
> of the vq being equal to -1. The next call to virtqueue_pop() is then
> likely to fail with "Virtqueue size exceeded".
>
> This can be reproduced easily by starting a guest without a net backend,
> doing a system reset when it is booted, and finally snapshotting it.
>
> The appropriate fix is to ensure that such an asynchronous transmission
> cannot survive a device reset. So for all queues, we first try to send
> the packet again, and eventually we purge it if the backend still could
> not deliver it.
>
> Reported-by: R. Nageswara Sastry <nasastry@in.ibm.com>
> Tested-by: R. Nageswara Sastry <nasastry@in.ibm.com>
> Buglink: https://github.com/open-power-host-os/qemu/issues/37
> Signed-off-by: Greg Kurz <groug@kaod.org>
> ---
>  hw/net/virtio-net.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> index 188744e17d57..eea3cdb2c700 100644
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -422,6 +422,7 @@ static RxFilterInfo *virtio_net_query_rxfilter(NetClientState *nc)
>  static void virtio_net_reset(VirtIODevice *vdev)
>  {
>      VirtIONet *n = VIRTIO_NET(vdev);
> +    int i;
>
>      /* Reset back to compatibility mode */
>      n->promisc = 1;
> @@ -445,6 +446,16 @@ static void virtio_net_reset(VirtIODevice *vdev)
>      memcpy(&n->mac[0], &n->nic->conf->macaddr, sizeof(n->mac));
>      qemu_format_nic_info_str(qemu_get_queue(n->nic), n->mac);
>      memset(n->vlans, 0, MAX_VLAN >> 3);
> +
> +    /* Flush any async TX */
> +    for (i = 0; i < n->max_queues; i++) {
> +        NetClientState *nc = qemu_get_subqueue(n->nic, i);
> +
> +        if (!qemu_net_queue_flush(nc->peer->incoming_queue)) {
> +            qemu_net_queue_purge(nc->peer->incoming_queue, nc);
> +        }
> +        assert(!virtio_net_get_subqueue(nc)->async_tx.elem);
> +    }
>  }
>
>  static void peer_test_vnet_hdr(VirtIONet *n)

Without patch:

(qemu) system_reset
(qemu) savevm 1
Virtqueue size exceeded
(qemu) loadvm 1
VQ 1 size 0x100 < last_avail_idx 0x0 - used_idx 0x1
Failed to load virtio-net:virtio
error while loading state for instance 0x0 of device 'pci@800000020000000:00.0/virtio-net'
Error -1 while loading VM state

With patch:

(qemu) system_reset
(qemu) savevm 1
(qemu) loadvm 1
Ping ?

On Thu, 8 Mar 2018 10:38:56 +0530
"Nageswara R Sastry" <nasastry@in.ibm.com> wrote:

> Greg Kurz <groug@kaod.org> wrote on 08/03/2018 12:27:37 AM:
> > Subject: [PATCH] virtio_net: flush uncompleted TX on reset
> [... full patch quoted ...]
>
> Without patch:
> (qemu) system_reset
> (qemu) savevm 1
> Virtqueue size exceeded
> (qemu) loadvm 1
> VQ 1 size 0x100 < last_avail_idx 0x0 - used_idx 0x1
> Failed to load virtio-net:virtio
> error while loading state for instance 0x0 of device
> 'pci@800000020000000:00.0/virtio-net'
> Error -1 while loading VM state
>
> With patch:
> (qemu) system_reset
> (qemu) savevm 1
> (qemu) loadvm 1
On 08/03/2018 02:57, Greg Kurz wrote:
> If the backend could not transmit a packet right away for some reason,
> the packet is queued for asynchronous sending. The corresponding vq
> element is tracked in the async_tx.elem field of the VirtIONetQueue,
> for later freeing when the transmission is complete.
> [...]
> @@ -445,6 +446,16 @@ static void virtio_net_reset(VirtIODevice *vdev)
> [...]
> +    /* Flush any async TX */
> +    for (i = 0; i < n->max_queues; i++) {
> +        NetClientState *nc = qemu_get_subqueue(n->nic, i);
> +
> +        if (!qemu_net_queue_flush(nc->peer->incoming_queue)) {
> +            qemu_net_queue_purge(nc->peer->incoming_queue, nc);
> +        }

Looks like we can use qemu_flush_or_purge_queued_packets(nc->peer) here.

But a question: you said it could be reproduced without a backend; in that
case nc->peer should be NULL, I believe, or we wouldn't even get here,
since qemu_sendv_packet_async() won't return zero?

Thanks

> +        assert(!virtio_net_get_subqueue(nc)->async_tx.elem);
> +    }
>  }
>
>  static void peer_test_vnet_hdr(VirtIONet *n)
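For reference, the suggestion above would collapse the flush/purge pair into a single helper call. A sketch of the reset loop rewritten that way (this is not runnable on its own and assumes qemu_flush_or_purge_queued_packets() is exported from net/net.c, where the thread notes it would first have to be made extern, and that its second parameter requests purging when flushing fails):

```
/* Hypothetical rewrite of the reset loop using the net-layer helper.
 * Assumes qemu_flush_or_purge_queued_packets(NetClientState *, bool purge)
 * has been made extern; verify the signature against the tree in use. */
for (i = 0; i < n->max_queues; i++) {
    NetClientState *nc = qemu_get_subqueue(n->nic, i);

    /* Retry delivery of any queued packet; purge it if delivery still fails. */
    qemu_flush_or_purge_queued_packets(nc->peer, true);
    assert(!virtio_net_get_subqueue(nc)->async_tx.elem);
}
```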
On Fri, 16 Mar 2018 16:05:17 +0800
Jason Wang <jasowang@redhat.com> wrote:

> On 08/03/2018 02:57, Greg Kurz wrote:
> > If the backend could not transmit a packet right away for some reason,
> > the packet is queued for asynchronous sending.
> [... full patch quoted ...]
> > +        if (!qemu_net_queue_flush(nc->peer->incoming_queue)) {
> > +            qemu_net_queue_purge(nc->peer->incoming_queue, nc);
> > +        }
>
> Looks like we can use qemu_flush_or_purge_queued_packets(nc->peer) here.
>

It should be made extern first, but we can use it indeed.

> But a question: you said it could be reproduced without a backend; in that
> case nc->peer should be NULL, I believe, or we wouldn't even get here,
> since qemu_sendv_packet_async() won't return zero?
>

My bad, I didn't use appropriate wording. The issue is always reproducible
if you only pass '-net nic,model=virtio', without any other -net option that
would provide a functional network, i.e.:

-device virtio-net-pci,netdev=netdev0 -netdev hubport,id=netdev0,hubid=0

So, we do have a hubport peer, but since it isn't connected to the host
network, net_hub_port_can_receive() returns 0, and so does
qemu_sendv_packet_async().

What about the following change in the changelog?

"This can be reproduced easily by starting a guest with a hubport
backend that is not connected to a functional network, e.g.,

-device virtio-net-pci,netdev=netdev0 -netdev hubport,id=netdev0,hubid=0"

Cheers,

--
Greg

> Thanks
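The reproduction setup described above can be sketched as a QEMU invocation. Machine type, memory size, and disk image below are placeholders; only the virtio-net/hubport pair and the monitor commands come from the thread:

```shell
# Hypothetical invocation: adjust machine/image options to your setup.
# The key part is a virtio-net NIC whose hubport peer has no host connection.
qemu-system-ppc64 -M pseries -m 2G \
    -drive file=guest.qcow2 \
    -device virtio-net-pci,netdev=netdev0 \
    -netdev hubport,id=netdev0,hubid=0 \
    -monitor stdio

# Once the guest has booted, in the HMP monitor:
#   (qemu) system_reset
#   (qemu) savevm 1     # without the fix: "Virtqueue size exceeded"
#   (qemu) loadvm 1     # without the fix: loading the VM state fails
```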
On 16/03/2018 17:58, Greg Kurz wrote:
> On Fri, 16 Mar 2018 16:05:17 +0800
> Jason Wang <jasowang@redhat.com> wrote:
> [... full patch and discussion quoted ...]
>
> What about the following change in the changelog?
>
> "This can be reproduced easily by starting a guest with a hubport
> backend that is not connected to a functional network, e.g.,
>
> -device virtio-net-pci,netdev=netdev0 -netdev hubport,id=netdev0,hubid=0"
>
> Cheers,

Looks good to me.

Thanks
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 188744e17d57..eea3cdb2c700 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -422,6 +422,7 @@ static RxFilterInfo *virtio_net_query_rxfilter(NetClientState *nc)
 static void virtio_net_reset(VirtIODevice *vdev)
 {
     VirtIONet *n = VIRTIO_NET(vdev);
+    int i;
 
     /* Reset back to compatibility mode */
     n->promisc = 1;
@@ -445,6 +446,16 @@ static void virtio_net_reset(VirtIODevice *vdev)
     memcpy(&n->mac[0], &n->nic->conf->macaddr, sizeof(n->mac));
     qemu_format_nic_info_str(qemu_get_queue(n->nic), n->mac);
     memset(n->vlans, 0, MAX_VLAN >> 3);
+
+    /* Flush any async TX */
+    for (i = 0; i < n->max_queues; i++) {
+        NetClientState *nc = qemu_get_subqueue(n->nic, i);
+
+        if (!qemu_net_queue_flush(nc->peer->incoming_queue)) {
+            qemu_net_queue_purge(nc->peer->incoming_queue, nc);
+        }
+        assert(!virtio_net_get_subqueue(nc)->async_tx.elem);
+    }
 }
 
 static void peer_test_vnet_hdr(VirtIONet *n)
If the backend could not transmit a packet right away for some reason,
the packet is queued for asynchronous sending. The corresponding vq
element is tracked in the async_tx.elem field of the VirtIONetQueue,
for later freeing when the transmission is complete.

If a reset happens before completion, virtio_net_tx_complete() will push
async_tx.elem back to the guest anyway, and we end up with the inuse flag
of the vq being equal to -1. The next call to virtqueue_pop() is then
likely to fail with "Virtqueue size exceeded".

This can be reproduced easily by starting a guest without a net backend,
doing a system reset when it is booted, and finally snapshotting it.

The appropriate fix is to ensure that such an asynchronous transmission
cannot survive a device reset. So for all queues, we first try to send
the packet again, and eventually we purge it if the backend still could
not deliver it.

Reported-by: R. Nageswara Sastry <nasastry@in.ibm.com>
Buglink: https://github.com/open-power-host-os/qemu/issues/37
Signed-off-by: Greg Kurz <groug@kaod.org>
---
 hw/net/virtio-net.c | 11 +++++++++++
 1 file changed, 11 insertions(+)