mbox series

[v1,0/2] virtio: fix for assertion failure: virtio_net_get_subqueue(nc)->async_tx.elem failed

Message ID 20230129025150.119972-1-xuanzhuo@linux.alibaba.com
Headers show
Series virtio: fix for assertion failure: virtio_net_get_subqueue(nc)->async_tx.elem failed | expand

Message

Xuan Zhuo Jan. 29, 2023, 2:51 a.m. UTC
In the current design, we stop the device from operating on the vring
during per-queue reset by resetting the structure VirtQueue.

But before the reset operation, when recycling some resources, we should
stop referencing new vring resources.

This bug is caused by this reason.

    https://gitlab.com/qemu-project/qemu/-/issues/1451

Before we reset the structure, we called the ->queue_reset callback to let the
device reclaim resources. Here virtio-net tries to release the packets sent
asynchronously, but during this process virtio_net_flush_tx() will be called,
and new data will be sent again. This leads to asserted.

     assert(!virtio_net_get_subqueue(nc)->async_tx.elem);

This patch set introduce new item "reset" into struct VirtQueue, then device can
know this virtqueue is per-queue reset state.

v1:
    1. rename "reset" to disabled_by_reset
    2. add api: virtio_queue_reset_state()

Xuan Zhuo (2):
  virtio: struct VirtQueue introduce reset
  virtio-net: virtio_net_flush_tx() check for per-queue reset

 hw/net/virtio-net.c        |  3 ++-
 hw/virtio/virtio.c         | 15 +++++++++++++++
 include/hw/virtio/virtio.h |  1 +
 3 files changed, 18 insertions(+), 1 deletion(-)

--
2.32.0.3.g01195cf9f

Comments

Michael S. Tsirkin Jan. 29, 2023, 7:26 a.m. UTC | #1
On Sun, Jan 29, 2023 at 10:51:48AM +0800, Xuan Zhuo wrote:
> In the current design, we stop the device from operating on the vring
> during per-queue reset by resetting the structure VirtQueue.
> 
> But before the reset operation, when recycling some resources, we should
> stop referencing new vring resources.
> 
> This bug is caused by this reason.
> 
>     https://gitlab.com/qemu-project/qemu/-/issues/1451
> 
> Before we reset the structure, we called the ->queue_reset callback to let the
> device reclaim resources. Here virtio-net tries to release the packets sent
> asynchronously, but during this process virtio_net_flush_tx() will be called,
> and new data will be sent again. This leads to asserted.
> 
>      assert(!virtio_net_get_subqueue(nc)->async_tx.elem);
> 
> This patch set introduce new item "reset" into struct VirtQueue, then device can
> know this virtqueue is per-queue reset state.

Better but I still don't exacctly understand what this state means.
Sent some questions on the patches themselves.
Thanks!


> v1:
>     1. rename "reset" to disabled_by_reset
>     2. add api: virtio_queue_reset_state()
> 
> Xuan Zhuo (2):
>   virtio: struct VirtQueue introduce reset
>   virtio-net: virtio_net_flush_tx() check for per-queue reset
> 
>  hw/net/virtio-net.c        |  3 ++-
>  hw/virtio/virtio.c         | 15 +++++++++++++++
>  include/hw/virtio/virtio.h |  1 +
>  3 files changed, 18 insertions(+), 1 deletion(-)
> 
> --
> 2.32.0.3.g01195cf9f