diff mbox series

[v1,2/2] virtio-net: virtio_net_flush_tx() check for per-queue reset

Message ID 20230129025150.119972-3-xuanzhuo@linux.alibaba.com
State New
Headers show
Series virtio: fix for assertion failure: virtio_net_get_subqueue(nc)->async_tx.elem failed | expand

Commit Message

Xuan Zhuo Jan. 29, 2023, 2:51 a.m. UTC
Check whether it is per-queue reset state in virtio_net_flush_tx().

Before per-queue reset, we need to recover async tx resources. At this
time, virtio_net_flush_tx() is called, but we should not try to send
new packets, so virtio_net_flush_tx() should check the current
per-queue reset state.

Fixes: 7dc6be52 ("virtio-net: support queue reset")
Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
 hw/net/virtio-net.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Jason Wang Jan. 29, 2023, 6:23 a.m. UTC | #1
On Sun, Jan 29, 2023 at 10:52 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>
> Check whether it is per-queue reset state in virtio_net_flush_tx().
>
> Before per-queue reset, we need to recover async tx resources. At this
> time, virtio_net_flush_tx() is called, but we should not try to send
> new packets, so virtio_net_flush_tx() should check the current
> per-queue reset state.
>
> Fixes: 7dc6be52 ("virtio-net: support queue reset")
> Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> Reported-by: Alexander Bulekov <alxndr@bu.edu>
> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> ---
>  hw/net/virtio-net.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> index 3ae909041a..fba6451a50 100644
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
>      VirtQueueElement *elem;
>      int32_t num_packets = 0;
>      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> +        virtio_queue_reset_state(q->tx_vq)) {

We have other places that check DRIVER_OK do we need to check queue
reset as well?

E.g:
virtio_net_can_receive()
virtio_net_tx_{timer|bh}()

Thanks

>          return num_packets;
>      }
>
> --
> 2.32.0.3.g01195cf9f
>
Michael S. Tsirkin Jan. 29, 2023, 7:25 a.m. UTC | #2
On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote:
> Check whether it is per-queue reset state in virtio_net_flush_tx().
> 
> Before per-queue reset, we need to recover async tx resources. At this
> time, virtio_net_flush_tx() is called, but we should not try to send
> new packets, so virtio_net_flush_tx() should check the current
> per-queue reset state.


What does "at this time" mean here?
Do you in fact mean it's called from flush_or_purge_queued_packets?
What does the call stack look like?

If yes introducing a vq state just so virtio_net_flush_tx
knows we are in the process of reset would be a bad idea.
We want something much more local, ideally on stack even ...


> 
> Fixes: 7dc6be52 ("virtio-net: support queue reset")
> Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> Reported-by: Alexander Bulekov <alxndr@bu.edu>
> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> ---
>  hw/net/virtio-net.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> index 3ae909041a..fba6451a50 100644
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
>      VirtQueueElement *elem;
>      int32_t num_packets = 0;
>      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> +        virtio_queue_reset_state(q->tx_vq)) {
>          return num_packets;
>      }
>  
> -- 
> 2.32.0.3.g01195cf9f
Xuan Zhuo Jan. 29, 2023, 7:28 a.m. UTC | #3
On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote:
> > Check whether it is per-queue reset state in virtio_net_flush_tx().
> >
> > Before per-queue reset, we need to recover async tx resources. At this
> > time, virtio_net_flush_tx() is called, but we should not try to send
> > new packets, so virtio_net_flush_tx() should check the current
> > per-queue reset state.
>
>
> What does "at this time" mean here?
> Do you in fact mean it's called from flush_or_purge_queued_packets?

Yes

virtio_queue_reset
	k->queue_reset
		virtio_net_queue_reset
			flush_or_purge_queued_packets
				qemu_flush_or_purge_queued_packets
					.....
					(callback) virtio_net_tx_complete
						virtio_net_flush_tx <-- here send new packet. We need stop it.


Because it is inside the callback, I can't pass information through the stack. I
originally thought it was a general situation, so I wanted to put it in
struct VirtQueue.

If it is not very suitable, it may be better to put it in VirtIONetQueue.

Thanks.

> What does the call stack look like?
>
> If yes introducing a vq state just so virtio_net_flush_tx
> knows we are in the process of reset would be a bad idea.
> We want something much more local, ideally on stack even ...
>
>
> >
> > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > ---
> >  hw/net/virtio-net.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > index 3ae909041a..fba6451a50 100644
> > --- a/hw/net/virtio-net.c
> > +++ b/hw/net/virtio-net.c
> > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> >      VirtQueueElement *elem;
> >      int32_t num_packets = 0;
> >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > +        virtio_queue_reset_state(q->tx_vq)) {
> >          return num_packets;
> >      }
> >
> > --
> > 2.32.0.3.g01195cf9f
>
Xuan Zhuo Jan. 29, 2023, 7:43 a.m. UTC | #4
On Sun, 29 Jan 2023 14:23:21 +0800, Jason Wang <jasowang@redhat.com> wrote:
> On Sun, Jan 29, 2023 at 10:52 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> >
> > Check whether it is per-queue reset state in virtio_net_flush_tx().
> >
> > Before per-queue reset, we need to recover async tx resources. At this
> > time, virtio_net_flush_tx() is called, but we should not try to send
> > new packets, so virtio_net_flush_tx() should check the current
> > per-queue reset state.
> >
> > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > ---
> >  hw/net/virtio-net.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > index 3ae909041a..fba6451a50 100644
> > --- a/hw/net/virtio-net.c
> > +++ b/hw/net/virtio-net.c
> > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> >      VirtQueueElement *elem;
> >      int32_t num_packets = 0;
> >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > +        virtio_queue_reset_state(q->tx_vq)) {
>
> We have other places that check DRIVER_OK do we need to check queue
> reset as well?

I checked it again. I still think that the location of other checking DRIVER_OK
does not need to check the queue reset.

Thanks.


>
> E.g:
> virtio_net_can_receive()
> virtio_net_tx_{timer|bh}()
>
> Thanks
>
> >          return num_packets;
> >      }
> >
> > --
> > 2.32.0.3.g01195cf9f
> >
>
Michael S. Tsirkin Jan. 29, 2023, 8:12 a.m. UTC | #5
On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote:
> On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote:
> > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > >
> > > Before per-queue reset, we need to recover async tx resources. At this
> > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > new packets, so virtio_net_flush_tx() should check the current
> > > per-queue reset state.
> >
> >
> > What does "at this time" mean here?
> > Do you in fact mean it's called from flush_or_purge_queued_packets?
> 
> Yes
> 
> virtio_queue_reset
> 	k->queue_reset
> 		virtio_net_queue_reset
> 			flush_or_purge_queued_packets
> 				qemu_flush_or_purge_queued_packets
> 					.....
> 					(callback) virtio_net_tx_complete
> 						virtio_net_flush_tx <-- here send new packet. We need stop it.
> 
> 
> Because it is inside the callback, I can't pass information through the stack. I
> originally thought it was a general situation, so I wanted to put it in
> struct VirtQueue.
> 
> If it is not very suitable, it may be better to put it in VirtIONetQueue.
> 
> Thanks.

Hmm maybe. Another idea: isn't virtio_net_tx_complete called
with length 0 here? Are there other cases where length is 0?


> > What does the call stack look like?
> >
> > If yes introducing a vq state just so virtio_net_flush_tx
> > knows we are in the process of reset would be a bad idea.
> > We want something much more local, ideally on stack even ...
> >
> >
> > >
> > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > ---
> > >  hw/net/virtio-net.c | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > index 3ae909041a..fba6451a50 100644
> > > --- a/hw/net/virtio-net.c
> > > +++ b/hw/net/virtio-net.c
> > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > >      VirtQueueElement *elem;
> > >      int32_t num_packets = 0;
> > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > +        virtio_queue_reset_state(q->tx_vq)) {

btw this sounds like you are asking it to reset some state.

> > >          return num_packets;

and then

    ret = virtio_net_flush_tx(q);
    if (ret >= n->tx_burst)


will reschedule automatically won't it?

also why check in virtio_net_flush_tx and not virtio_net_tx_complete?


> > >      }
> > >
> > > --
> > > 2.32.0.3.g01195cf9f
> >
Xuan Zhuo Jan. 29, 2023, 8:23 a.m. UTC | #6
On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote:
> > On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote:
> > > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > > >
> > > > Before per-queue reset, we need to recover async tx resources. At this
> > > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > > new packets, so virtio_net_flush_tx() should check the current
> > > > per-queue reset state.
> > >
> > >
> > > What does "at this time" mean here?
> > > Do you in fact mean it's called from flush_or_purge_queued_packets?
> >
> > Yes
> >
> > virtio_queue_reset
> > 	k->queue_reset
> > 		virtio_net_queue_reset
> > 			flush_or_purge_queued_packets
> > 				qemu_flush_or_purge_queued_packets
> > 					.....
> > 					(callback) virtio_net_tx_complete
> > 						virtio_net_flush_tx <-- here send new packet. We need stop it.
> >
> >
> > Because it is inside the callback, I can't pass information through the stack. I
> > originally thought it was a general situation, so I wanted to put it in
> > struct VirtQueue.
> >
> > If it is not very suitable, it may be better to put it in VirtIONetQueue.
> >
> > Thanks.
>
> Hmm maybe. Another idea: isn't virtio_net_tx_complete called
> with length 0 here? Are there other cases where length is 0?
>
>
> > > What does the call stack look like?
> > >
> > > If yes introducing a vq state just so virtio_net_flush_tx
> > > knows we are in the process of reset would be a bad idea.
> > > We want something much more local, ideally on stack even ...
> > >
> > >
> > > >
> > > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > ---
> > > >  hw/net/virtio-net.c | 3 ++-
> > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > index 3ae909041a..fba6451a50 100644
> > > > --- a/hw/net/virtio-net.c
> > > > +++ b/hw/net/virtio-net.c
> > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > > >      VirtQueueElement *elem;
> > > >      int32_t num_packets = 0;
> > > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > > +        virtio_queue_reset_state(q->tx_vq)) {
>
> btw this sounds like you are asking it to reset some state.
>
> > > >          return num_packets;
>
> and then
>
>     ret = virtio_net_flush_tx(q);
>     if (ret >= n->tx_burst)
>
>
> will reschedule automatically won't it?
>
> also why check in virtio_net_flush_tx and not virtio_net_tx_complete?

virtio_net_flush_tx may been called by timer.

Thanks.

>
>
> > > >      }
> > > >
> > > > --
> > > > 2.32.0.3.g01195cf9f
> > >
>
Michael S. Tsirkin Jan. 29, 2023, 11:57 a.m. UTC | #7
On Sun, Jan 29, 2023 at 04:23:08PM +0800, Xuan Zhuo wrote:
> On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote:
> > > On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote:
> > > > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > > > >
> > > > > Before per-queue reset, we need to recover async tx resources. At this
> > > > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > > > new packets, so virtio_net_flush_tx() should check the current
> > > > > per-queue reset state.
> > > >
> > > >
> > > > What does "at this time" mean here?
> > > > Do you in fact mean it's called from flush_or_purge_queued_packets?
> > >
> > > Yes
> > >
> > > virtio_queue_reset
> > > 	k->queue_reset
> > > 		virtio_net_queue_reset
> > > 			flush_or_purge_queued_packets
> > > 				qemu_flush_or_purge_queued_packets
> > > 					.....
> > > 					(callback) virtio_net_tx_complete
> > > 						virtio_net_flush_tx <-- here send new packet. We need stop it.
> > >
> > >
> > > Because it is inside the callback, I can't pass information through the stack. I
> > > originally thought it was a general situation, so I wanted to put it in
> > > struct VirtQueue.
> > >
> > > If it is not very suitable, it may be better to put it in VirtIONetQueue.
> > >
> > > Thanks.
> >
> > Hmm maybe. Another idea: isn't virtio_net_tx_complete called
> > with length 0 here? Are there other cases where length is 0?
> >
> >
> > > > What does the call stack look like?
> > > >
> > > > If yes introducing a vq state just so virtio_net_flush_tx
> > > > knows we are in the process of reset would be a bad idea.
> > > > We want something much more local, ideally on stack even ...
> > > >
> > > >
> > > > >
> > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > > ---
> > > > >  hw/net/virtio-net.c | 3 ++-
> > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > > index 3ae909041a..fba6451a50 100644
> > > > > --- a/hw/net/virtio-net.c
> > > > > +++ b/hw/net/virtio-net.c
> > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > > > >      VirtQueueElement *elem;
> > > > >      int32_t num_packets = 0;
> > > > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > > > +        virtio_queue_reset_state(q->tx_vq)) {
> >
> > btw this sounds like you are asking it to reset some state.
> >
> > > > >          return num_packets;
> >
> > and then
> >
> >     ret = virtio_net_flush_tx(q);
> >     if (ret >= n->tx_burst)
> >
> >
> > will reschedule automatically won't it?
> >
> > also why check in virtio_net_flush_tx and not virtio_net_tx_complete?
> 
> virtio_net_flush_tx may been called by timer.
> 
> Thanks.

timer won't run while flush_or_purge_queued_packets is in progress.

> >
> >
> > > > >      }
> > > > >
> > > > > --
> > > > > 2.32.0.3.g01195cf9f
> > > >
> >
Xuan Zhuo Jan. 29, 2023, 12:03 p.m. UTC | #8
On Sun, 29 Jan 2023 06:57:29 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Sun, Jan 29, 2023 at 04:23:08PM +0800, Xuan Zhuo wrote:
> > On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote:
> > > > On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote:
> > > > > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > > > > >
> > > > > > Before per-queue reset, we need to recover async tx resources. At this
> > > > > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > > > > new packets, so virtio_net_flush_tx() should check the current
> > > > > > per-queue reset state.
> > > > >
> > > > >
> > > > > What does "at this time" mean here?
> > > > > Do you in fact mean it's called from flush_or_purge_queued_packets?
> > > >
> > > > Yes
> > > >
> > > > virtio_queue_reset
> > > > 	k->queue_reset
> > > > 		virtio_net_queue_reset
> > > > 			flush_or_purge_queued_packets
> > > > 				qemu_flush_or_purge_queued_packets
> > > > 					.....
> > > > 					(callback) virtio_net_tx_complete
> > > > 						virtio_net_flush_tx <-- here send new packet. We need stop it.
> > > >
> > > >
> > > > Because it is inside the callback, I can't pass information through the stack. I
> > > > originally thought it was a general situation, so I wanted to put it in
> > > > struct VirtQueue.
> > > >
> > > > If it is not very suitable, it may be better to put it in VirtIONetQueue.
> > > >
> > > > Thanks.
> > >
> > > Hmm maybe. Another idea: isn't virtio_net_tx_complete called
> > > with length 0 here? Are there other cases where length is 0?
> > >
> > >
> > > > > What does the call stack look like?
> > > > >
> > > > > If yes introducing a vq state just so virtio_net_flush_tx
> > > > > knows we are in the process of reset would be a bad idea.
> > > > > We want something much more local, ideally on stack even ...
> > > > >
> > > > >
> > > > > >
> > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > > > ---
> > > > > >  hw/net/virtio-net.c | 3 ++-
> > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > > > index 3ae909041a..fba6451a50 100644
> > > > > > --- a/hw/net/virtio-net.c
> > > > > > +++ b/hw/net/virtio-net.c
> > > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > > > > >      VirtQueueElement *elem;
> > > > > >      int32_t num_packets = 0;
> > > > > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > > > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > > > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > > > > +        virtio_queue_reset_state(q->tx_vq)) {
> > >
> > > btw this sounds like you are asking it to reset some state.
> > >
> > > > > >          return num_packets;
> > >
> > > and then
> > >
> > >     ret = virtio_net_flush_tx(q);
> > >     if (ret >= n->tx_burst)
> > >
> > >
> > > will reschedule automatically won't it?
> > >
> > > also why check in virtio_net_flush_tx and not virtio_net_tx_complete?
> >
> > virtio_net_flush_tx may been called by timer.
> >
> > Thanks.
>
> timer won't run while flush_or_purge_queued_packets is in progress.

Is timer not executed during the VMEXIT process? Otherwise, we still have to
consider that after the flush_or_purge_queued_packets, this process before the
structure is cleared.

Even if it can be processed in virtio_net_tx_complete, is there any good way?
This is a callback, it is not convenient to pass the parameters.

Thanks


>
> > >
> > >
> > > > > >      }
> > > > > >
> > > > > > --
> > > > > > 2.32.0.3.g01195cf9f
> > > > >
> > >
>
Michael S. Tsirkin Jan. 29, 2023, 12:15 p.m. UTC | #9
On Sun, Jan 29, 2023 at 08:03:42PM +0800, Xuan Zhuo wrote:
> On Sun, 29 Jan 2023 06:57:29 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > On Sun, Jan 29, 2023 at 04:23:08PM +0800, Xuan Zhuo wrote:
> > > On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote:
> > > > > On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote:
> > > > > > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > > > > > >
> > > > > > > Before per-queue reset, we need to recover async tx resources. At this
> > > > > > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > > > > > new packets, so virtio_net_flush_tx() should check the current
> > > > > > > per-queue reset state.
> > > > > >
> > > > > >
> > > > > > What does "at this time" mean here?
> > > > > > Do you in fact mean it's called from flush_or_purge_queued_packets?
> > > > >
> > > > > Yes
> > > > >
> > > > > virtio_queue_reset
> > > > > 	k->queue_reset
> > > > > 		virtio_net_queue_reset
> > > > > 			flush_or_purge_queued_packets
> > > > > 				qemu_flush_or_purge_queued_packets
> > > > > 					.....
> > > > > 					(callback) virtio_net_tx_complete
> > > > > 						virtio_net_flush_tx <-- here send new packet. We need stop it.
> > > > >
> > > > >
> > > > > Because it is inside the callback, I can't pass information through the stack. I
> > > > > originally thought it was a general situation, so I wanted to put it in
> > > > > struct VirtQueue.
> > > > >
> > > > > If it is not very suitable, it may be better to put it in VirtIONetQueue.
> > > > >
> > > > > Thanks.
> > > >
> > > > Hmm maybe. Another idea: isn't virtio_net_tx_complete called
> > > > with length 0 here? Are there other cases where length is 0?
> > > >
> > > >
> > > > > > What does the call stack look like?
> > > > > >
> > > > > > If yes introducing a vq state just so virtio_net_flush_tx
> > > > > > knows we are in the process of reset would be a bad idea.
> > > > > > We want something much more local, ideally on stack even ...
> > > > > >
> > > > > >
> > > > > > >
> > > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > > > > ---
> > > > > > >  hw/net/virtio-net.c | 3 ++-
> > > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > > >
> > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > > > > index 3ae909041a..fba6451a50 100644
> > > > > > > --- a/hw/net/virtio-net.c
> > > > > > > +++ b/hw/net/virtio-net.c
> > > > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > > > > > >      VirtQueueElement *elem;
> > > > > > >      int32_t num_packets = 0;
> > > > > > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > > > > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > > > > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > > > > > +        virtio_queue_reset_state(q->tx_vq)) {
> > > >
> > > > btw this sounds like you are asking it to reset some state.
> > > >
> > > > > > >          return num_packets;
> > > >
> > > > and then
> > > >
> > > >     ret = virtio_net_flush_tx(q);
> > > >     if (ret >= n->tx_burst)
> > > >
> > > >
> > > > will reschedule automatically won't it?
> > > >
> > > > also why check in virtio_net_flush_tx and not virtio_net_tx_complete?
> > >
> > > virtio_net_flush_tx may been called by timer.
> > >
> > > Thanks.
> >
> > timer won't run while flush_or_purge_queued_packets is in progress.
> 
> Is timer not executed during the VMEXIT process? Otherwise, we still have to
> consider that after the flush_or_purge_queued_packets, this process before the
> structure is cleared.



void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index)
{
    VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);

    if (k->queue_reset) {
        k->queue_reset(vdev, queue_index);
    }

    __virtio_queue_reset(vdev, queue_index);
}


No timers do not run between  k->queue_reset and __virtio_queue_reset.


> Even if it can be processed in virtio_net_tx_complete, is there any good way?
> This is a callback, it is not convenient to pass the parameters.
> 
> Thanks


How about checking that length is 0?

> 
> >
> > > >
> > > >
> > > > > > >      }
> > > > > > >
> > > > > > > --
> > > > > > > 2.32.0.3.g01195cf9f
> > > > > >
> > > >
> >
Xuan Zhuo Jan. 29, 2023, 12:28 p.m. UTC | #10
On Sun, 29 Jan 2023 07:15:47 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Sun, Jan 29, 2023 at 08:03:42PM +0800, Xuan Zhuo wrote:
> > On Sun, 29 Jan 2023 06:57:29 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > On Sun, Jan 29, 2023 at 04:23:08PM +0800, Xuan Zhuo wrote:
> > > > On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote:
> > > > > > On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote:
> > > > > > > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > > > > > > >
> > > > > > > > Before per-queue reset, we need to recover async tx resources. At this
> > > > > > > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > > > > > > new packets, so virtio_net_flush_tx() should check the current
> > > > > > > > per-queue reset state.
> > > > > > >
> > > > > > >
> > > > > > > What does "at this time" mean here?
> > > > > > > Do you in fact mean it's called from flush_or_purge_queued_packets?
> > > > > >
> > > > > > Yes
> > > > > >
> > > > > > virtio_queue_reset
> > > > > > 	k->queue_reset
> > > > > > 		virtio_net_queue_reset
> > > > > > 			flush_or_purge_queued_packets
> > > > > > 				qemu_flush_or_purge_queued_packets
> > > > > > 					.....
> > > > > > 					(callback) virtio_net_tx_complete
> > > > > > 						virtio_net_flush_tx <-- here send new packet. We need stop it.
> > > > > >
> > > > > >
> > > > > > Because it is inside the callback, I can't pass information through the stack. I
> > > > > > originally thought it was a general situation, so I wanted to put it in
> > > > > > struct VirtQueue.
> > > > > >
> > > > > > If it is not very suitable, it may be better to put it in VirtIONetQueue.
> > > > > >
> > > > > > Thanks.
> > > > >
> > > > > Hmm maybe. Another idea: isn't virtio_net_tx_complete called
> > > > > with length 0 here? Are there other cases where length is 0?
> > > > >
> > > > >
> > > > > > > What does the call stack look like?
> > > > > > >
> > > > > > > If yes introducing a vq state just so virtio_net_flush_tx
> > > > > > > knows we are in the process of reset would be a bad idea.
> > > > > > > We want something much more local, ideally on stack even ...
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > > > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > > > > > ---
> > > > > > > >  hw/net/virtio-net.c | 3 ++-
> > > > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > > > >
> > > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > > > > > index 3ae909041a..fba6451a50 100644
> > > > > > > > --- a/hw/net/virtio-net.c
> > > > > > > > +++ b/hw/net/virtio-net.c
> > > > > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > > > > > > >      VirtQueueElement *elem;
> > > > > > > >      int32_t num_packets = 0;
> > > > > > > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > > > > > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > > > > > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > > > > > > +        virtio_queue_reset_state(q->tx_vq)) {
> > > > >
> > > > > btw this sounds like you are asking it to reset some state.
> > > > >
> > > > > > > >          return num_packets;
> > > > >
> > > > > and then
> > > > >
> > > > >     ret = virtio_net_flush_tx(q);
> > > > >     if (ret >= n->tx_burst)
> > > > >
> > > > >
> > > > > will reschedule automatically won't it?
> > > > >
> > > > > also why check in virtio_net_flush_tx and not virtio_net_tx_complete?
> > > >
> > > > virtio_net_flush_tx may been called by timer.
> > > >
> > > > Thanks.
> > >
> > > timer won't run while flush_or_purge_queued_packets is in progress.
> >
> > Is timer not executed during the VMEXIT process? Otherwise, we still have to
> > consider that after the flush_or_purge_queued_packets, this process before the
> > structure is cleared.
>
>
>
> void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index)
> {
>     VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
>
>     if (k->queue_reset) {
>         k->queue_reset(vdev, queue_index);
>     }
>
>     __virtio_queue_reset(vdev, queue_index);
> }
>
>
> No timers do not run between  k->queue_reset and __virtio_queue_reset.
>
>
> > Even if it can be processed in virtio_net_tx_complete, is there any good way?
> > This is a callback, it is not convenient to pass the parameters.
> >
> > Thanks
>
>
> How about checking that length is 0?


I think you refer to the second parameter of virtio_net_tx_complete "len".

void qemu_net_queue_purge(NetQueue *queue, NetClientState *from)
{
    NetPacket *packet, *next;

    QTAILQ_FOREACH_SAFE(packet, &queue->packets, entry, next) {
        if (packet->sender == from) {
            QTAILQ_REMOVE(&queue->packets, packet, entry);
            queue->nq_count--;
            if (packet->sent_cb) {
                packet->sent_cb(packet->sender, 0);
            }
            g_free(packet);
        }
    }
}

qemu_net_queue_purge pass 0 as len. This function has been called in large
quantities. I am not sure if it will have a bad impact. I will try if there is a
good way to pass the parameters on the stack.

Thanks.

>
> >
> > >
> > > > >
> > > > >
> > > > > > > >      }
> > > > > > > >
> > > > > > > > --
> > > > > > > > 2.32.0.3.g01195cf9f
> > > > > > >
> > > > >
> > >
>
Xuan Zhuo Jan. 30, 2023, 2:15 a.m. UTC | #11
On Sun, 29 Jan 2023 07:15:47 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Sun, Jan 29, 2023 at 08:03:42PM +0800, Xuan Zhuo wrote:
> > On Sun, 29 Jan 2023 06:57:29 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > On Sun, Jan 29, 2023 at 04:23:08PM +0800, Xuan Zhuo wrote:
> > > > On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote:
> > > > > > On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote:
> > > > > > > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > > > > > > >
> > > > > > > > Before per-queue reset, we need to recover async tx resources. At this
> > > > > > > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > > > > > > new packets, so virtio_net_flush_tx() should check the current
> > > > > > > > per-queue reset state.
> > > > > > >
> > > > > > >
> > > > > > > What does "at this time" mean here?
> > > > > > > Do you in fact mean it's called from flush_or_purge_queued_packets?
> > > > > >
> > > > > > Yes
> > > > > >
> > > > > > virtio_queue_reset
> > > > > > 	k->queue_reset
> > > > > > 		virtio_net_queue_reset
> > > > > > 			flush_or_purge_queued_packets
> > > > > > 				qemu_flush_or_purge_queued_packets
> > > > > > 					.....
> > > > > > 					(callback) virtio_net_tx_complete
> > > > > > 						virtio_net_flush_tx <-- here send new packet. We need stop it.
> > > > > >
> > > > > >
> > > > > > Because it is inside the callback, I can't pass information through the stack. I
> > > > > > originally thought it was a general situation, so I wanted to put it in
> > > > > > struct VirtQueue.
> > > > > >
> > > > > > If it is not very suitable, it may be better to put it in VirtIONetQueue.
> > > > > >
> > > > > > Thanks.
> > > > >
> > > > > Hmm maybe. Another idea: isn't virtio_net_tx_complete called
> > > > > with length 0 here? Are there other cases where length is 0?
> > > > >
> > > > >
> > > > > > > What does the call stack look like?
> > > > > > >
> > > > > > > If yes introducing a vq state just so virtio_net_flush_tx
> > > > > > > knows we are in the process of reset would be a bad idea.
> > > > > > > We want something much more local, ideally on stack even ...
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > > > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > > > > > ---
> > > > > > > >  hw/net/virtio-net.c | 3 ++-
> > > > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > > > >
> > > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > > > > > index 3ae909041a..fba6451a50 100644
> > > > > > > > --- a/hw/net/virtio-net.c
> > > > > > > > +++ b/hw/net/virtio-net.c
> > > > > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > > > > > > >      VirtQueueElement *elem;
> > > > > > > >      int32_t num_packets = 0;
> > > > > > > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > > > > > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > > > > > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > > > > > > +        virtio_queue_reset_state(q->tx_vq)) {
> > > > >
> > > > > btw this sounds like you are asking it to reset some state.
> > > > >
> > > > > > > >          return num_packets;
> > > > >
> > > > > and then
> > > > >
> > > > >     ret = virtio_net_flush_tx(q);
> > > > >     if (ret >= n->tx_burst)
> > > > >
> > > > >
> > > > > will reschedule automatically won't it?
> > > > >
> > > > > also why check in virtio_net_flush_tx and not virtio_net_tx_complete?
> > > >
> > > > virtio_net_flush_tx may been called by timer.
> > > >
> > > > Thanks.
> > >
> > > timer won't run while flush_or_purge_queued_packets is in progress.
> >
> > Is timer not executed during the VMEXIT process? Otherwise, we still have to
> > consider that after the flush_or_purge_queued_packets, this process before the
> > structure is cleared.
>
>
>
> void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index)
> {
>     VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
>
>     if (k->queue_reset) {
>         k->queue_reset(vdev, queue_index);
>     }
>
>     __virtio_queue_reset(vdev, queue_index);
> }
>
>
> No timers do not run between  k->queue_reset and __virtio_queue_reset.
>
>
> > Even if it can be processed in virtio_net_tx_complete, is there any good way?
> > This is a callback, it is not convenient to pass the parameters.
> >
> > Thanks
>
>
> How about checking that length is 0?


I think that check length is not a good way. This modifys the semantics of 0. It is
not friendly to the future maintenance. On the other hand, qemu_net_queue_purge()
will pass 0, and this function is called by many places.

How about we add an api in queue.c to replace the sent_cb callback on queue?

Thanks.


>
> >
> > >
> > > > >
> > > > >
> > > > > > > >      }
> > > > > > > >
> > > > > > > > --
> > > > > > > > 2.32.0.3.g01195cf9f
> > > > > > >
> > > > >
> > >
>
Jason Wang Jan. 30, 2023, 3:01 a.m. UTC | #12
On Sun, Jan 29, 2023 at 3:44 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>
> On Sun, 29 Jan 2023 14:23:21 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > On Sun, Jan 29, 2023 at 10:52 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > >
> > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > >
> > > Before per-queue reset, we need to recover async tx resources. At this
> > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > new packets, so virtio_net_flush_tx() should check the current
> > > per-queue reset state.
> > >
> > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > ---
> > >  hw/net/virtio-net.c | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > index 3ae909041a..fba6451a50 100644
> > > --- a/hw/net/virtio-net.c
> > > +++ b/hw/net/virtio-net.c
> > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > >      VirtQueueElement *elem;
> > >      int32_t num_packets = 0;
> > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > +        virtio_queue_reset_state(q->tx_vq)) {
> >
> > We have other places that check DRIVER_OK do we need to check queue
> > reset as well?
>
> I checked it again. I still think that the location of other checking DRIVER_OK
> does not need to check the queue reset.

For example, if we don't disable can_receive() when the queue is
reset, it means rx may go for virtio_net_receive_rcu(). It means the
Qemu is still trying to process the traffic from the network backend
like tap which may waste cpu cycles.

I think the correct way is to return false when the queue is reset in
can_receive(), then the backend poll will be disabled (e.g TAP). When
the queue is enabled again, qemu_flush_queued_packets() will wake up
the backend polling.

Having had time to check other places but it would be better to
mention why it doesn't need a check in the changelog.

Thanks

>
> Thanks.
>
>
> >
> > E.g:
> > virtio_net_can_receive()
> > virtio_net_tx_{timer|bh}()
> >
> > Thanks
> >
> > >          return num_packets;
> > >      }
> > >
> > > --
> > > 2.32.0.3.g01195cf9f
> > >
> >
>
Xuan Zhuo Jan. 30, 2023, 3:38 a.m. UTC | #13
On Mon, 30 Jan 2023 11:01:40 +0800, Jason Wang <jasowang@redhat.com> wrote:
> On Sun, Jan 29, 2023 at 3:44 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> >
> > On Sun, 29 Jan 2023 14:23:21 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > > On Sun, Jan 29, 2023 at 10:52 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > > >
> > > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > > >
> > > > Before per-queue reset, we need to recover async tx resources. At this
> > > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > > new packets, so virtio_net_flush_tx() should check the current
> > > > per-queue reset state.
> > > >
> > > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > ---
> > > >  hw/net/virtio-net.c | 3 ++-
> > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > index 3ae909041a..fba6451a50 100644
> > > > --- a/hw/net/virtio-net.c
> > > > +++ b/hw/net/virtio-net.c
> > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > > >      VirtQueueElement *elem;
> > > >      int32_t num_packets = 0;
> > > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > > +        virtio_queue_reset_state(q->tx_vq)) {
> > >
> > > We have other places that check DRIVER_OK do we need to check queue
> > > reset as well?
> >
> > I checked it again. I still think that the location of other checking DRIVER_OK
> > does not need to check the queue reset.
>
> For example, if we don't disable can_receive() when the queue is
> reset, it means rx may go for virtio_net_receive_rcu(). It means the
> Qemu is still trying to process the traffic from the network backend
> like tap which may waste cpu cycles.
>
> I think the correct way is to return false when the queue is reset in
> can_receive(), then the backend poll will be disabled (e.g TAP). When
> the queue is enabled again, qemu_flush_queued_packets() will wake up
> the backend polling.
>
> Having had time to check other places but it would be better to
> mention why it doesn't need a check in the changelog.


static bool virtio_net_can_receive(NetClientState *nc)
{
    VirtIONet *n = qemu_get_nic_opaque(nc);
    VirtIODevice *vdev = VIRTIO_DEVICE(n);
    VirtIONetQueue *q = virtio_net_get_subqueue(nc);

    if (!vdev->vm_running) {
        return false;
    }

    if (nc->queue_index >= n->curr_queue_pairs) {
        return false;
    }

    if (!virtio_queue_ready(q->rx_vq) ||
        !(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
        return false;
    }

    return true;
}

int virtio_queue_ready(VirtQueue *vq)
{
    return vq->vring.avail != 0;
}


static void __virtio_queue_reset(VirtIODevice *vdev, uint32_t i)
{
    vdev->vq[i].vring.desc = 0;
    vdev->vq[i].vring.avail = 0;
    vdev->vq[i].vring.used = 0;
    vdev->vq[i].last_avail_idx = 0;
    vdev->vq[i].shadow_avail_idx = 0;
    vdev->vq[i].used_idx = 0;
    vdev->vq[i].last_avail_wrap_counter = true;
    vdev->vq[i].shadow_avail_wrap_counter = true;
    vdev->vq[i].used_wrap_counter = true;
    virtio_queue_set_vector(vdev, i, VIRTIO_NO_VECTOR);
    vdev->vq[i].signalled_used = 0;
    vdev->vq[i].signalled_used_valid = false;
    vdev->vq[i].notification = true;
    vdev->vq[i].vring.num = vdev->vq[i].vring.num_default;
    vdev->vq[i].inuse = 0;
    virtio_virtqueue_reset_region_cache(&vdev->vq[i]);
}

In the implementation of Per-Queue Reset, for RX, we stop RX by setting vdev->vq[i].vring.avail to 0.
Then callback can_receive will return False.


Thanks.


>
> Thanks
>
> >
> > Thanks.
> >
> >
> > >
> > > E.g:
> > > virtio_net_can_receive()
> > > virtio_net_tx_{timer|bh}()
> > >
> > > Thanks
> > >
> > > >          return num_packets;
> > > >      }
> > > >
> > > > --
> > > > 2.32.0.3.g01195cf9f
> > > >
> > >
> >
>
Jason Wang Jan. 30, 2023, 3:53 a.m. UTC | #14
On Mon, Jan 30, 2023 at 11:42 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>
> On Mon, 30 Jan 2023 11:01:40 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > On Sun, Jan 29, 2023 at 3:44 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > >
> > > On Sun, 29 Jan 2023 14:23:21 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > > > On Sun, Jan 29, 2023 at 10:52 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > > > >
> > > > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > > > >
> > > > > Before per-queue reset, we need to recover async tx resources. At this
> > > > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > > > new packets, so virtio_net_flush_tx() should check the current
> > > > > per-queue reset state.
> > > > >
> > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > > ---
> > > > >  hw/net/virtio-net.c | 3 ++-
> > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > > index 3ae909041a..fba6451a50 100644
> > > > > --- a/hw/net/virtio-net.c
> > > > > +++ b/hw/net/virtio-net.c
> > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > > > >      VirtQueueElement *elem;
> > > > >      int32_t num_packets = 0;
> > > > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > > > +        virtio_queue_reset_state(q->tx_vq)) {
> > > >
> > > > We have other places that check DRIVER_OK do we need to check queue
> > > > reset as well?
> > >
> > > I checked it again. I still think that the location of other checking DRIVER_OK
> > > does not need to check the queue reset.
> >
> > For example, if we don't disable can_receive() when the queue is
> > reset, it means rx may go for virtio_net_receive_rcu(). It means the
> > Qemu is still trying to process the traffic from the network backend
> > like tap which may waste cpu cycles.
> >
> > I think the correct way is to return false when the queue is reset in
> > can_receive(), then the backend poll will be disabled (e.g TAP). When
> > the queue is enabled again, qemu_flush_queued_packets() will wake up
> > the backend polling.
> >
> > Having had time to check other places but it would be better to
> > mention why it doesn't need a check in the changelog.
>
>
> static bool virtio_net_can_receive(NetClientState *nc)
> {
>     VirtIONet *n = qemu_get_nic_opaque(nc);
>     VirtIODevice *vdev = VIRTIO_DEVICE(n);
>     VirtIONetQueue *q = virtio_net_get_subqueue(nc);
>
>     if (!vdev->vm_running) {
>         return false;
>     }
>
>     if (nc->queue_index >= n->curr_queue_pairs) {
>         return false;
>     }
>
>     if (!virtio_queue_ready(q->rx_vq) ||
>         !(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
>         return false;
>     }
>
>     return true;
> }
>
> int virtio_queue_ready(VirtQueue *vq)
> {
>     return vq->vring.avail != 0;
> }
>
>
> static void __virtio_queue_reset(VirtIODevice *vdev, uint32_t i)
> {
>     vdev->vq[i].vring.desc = 0;
>     vdev->vq[i].vring.avail = 0;
>     vdev->vq[i].vring.used = 0;
>     vdev->vq[i].last_avail_idx = 0;
>     vdev->vq[i].shadow_avail_idx = 0;
>     vdev->vq[i].used_idx = 0;
>     vdev->vq[i].last_avail_wrap_counter = true;
>     vdev->vq[i].shadow_avail_wrap_counter = true;
>     vdev->vq[i].used_wrap_counter = true;
>     virtio_queue_set_vector(vdev, i, VIRTIO_NO_VECTOR);
>     vdev->vq[i].signalled_used = 0;
>     vdev->vq[i].signalled_used_valid = false;
>     vdev->vq[i].notification = true;
>     vdev->vq[i].vring.num = vdev->vq[i].vring.num_default;
>     vdev->vq[i].inuse = 0;
>     virtio_virtqueue_reset_region_cache(&vdev->vq[i]);
> }
>
> In the implementation of Per-Queue Reset, for RX, we stop RX by setting vdev->vq[i].vring.avail to 0.

Ok, but this is kind of fragile (especially when vIOMMU is enabled).
I'd add an explicit check for reset there.

(probably on top).

Thanks

> Then callback can_receive will return False.
>
>
> Thanks.
>
>
> >
> > Thanks
> >
> > >
> > > Thanks.
> > >
> > >
> > > >
> > > > E.g:
> > > > virtio_net_can_receive()
> > > > virtio_net_tx_{timer|bh}()
> > > >
> > > > Thanks
> > > >
> > > > >          return num_packets;
> > > > >      }
> > > > >
> > > > > --
> > > > > 2.32.0.3.g01195cf9f
> > > > >
> > > >
> > >
> >
>
Michael S. Tsirkin Jan. 30, 2023, 5:32 a.m. UTC | #15
On Mon, Jan 30, 2023 at 10:15:12AM +0800, Xuan Zhuo wrote:
> On Sun, 29 Jan 2023 07:15:47 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > On Sun, Jan 29, 2023 at 08:03:42PM +0800, Xuan Zhuo wrote:
> > > On Sun, 29 Jan 2023 06:57:29 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > On Sun, Jan 29, 2023 at 04:23:08PM +0800, Xuan Zhuo wrote:
> > > > > On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote:
> > > > > > > On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote:
> > > > > > > > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > > > > > > > >
> > > > > > > > > Before per-queue reset, we need to recover async tx resources. At this
> > > > > > > > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > > > > > > > new packets, so virtio_net_flush_tx() should check the current
> > > > > > > > > per-queue reset state.
> > > > > > > >
> > > > > > > >
> > > > > > > > What does "at this time" mean here?
> > > > > > > > Do you in fact mean it's called from flush_or_purge_queued_packets?
> > > > > > >
> > > > > > > Yes
> > > > > > >
> > > > > > > virtio_queue_reset
> > > > > > > 	k->queue_reset
> > > > > > > 		virtio_net_queue_reset
> > > > > > > 			flush_or_purge_queued_packets
> > > > > > > 				qemu_flush_or_purge_queued_packets
> > > > > > > 					.....
> > > > > > > 					(callback) virtio_net_tx_complete
> > > > > > > 						virtio_net_flush_tx <-- here send new packet. We need stop it.
> > > > > > >
> > > > > > >
> > > > > > > Because it is inside the callback, I can't pass information through the stack. I
> > > > > > > originally thought it was a general situation, so I wanted to put it in
> > > > > > > struct VirtQueue.
> > > > > > >
> > > > > > > If it is not very suitable, it may be better to put it in VirtIONetQueue.
> > > > > > >
> > > > > > > Thanks.
> > > > > >
> > > > > > Hmm maybe. Another idea: isn't virtio_net_tx_complete called
> > > > > > with length 0 here? Are there other cases where length is 0?
> > > > > >
> > > > > >
> > > > > > > > What does the call stack look like?
> > > > > > > >
> > > > > > > > If yes introducing a vq state just so virtio_net_flush_tx
> > > > > > > > knows we are in the process of reset would be a bad idea.
> > > > > > > > We want something much more local, ideally on stack even ...
> > > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > > > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > > > > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > > > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > > > > > > ---
> > > > > > > > >  hw/net/virtio-net.c | 3 ++-
> > > > > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > > > > >
> > > > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > > > > > > index 3ae909041a..fba6451a50 100644
> > > > > > > > > --- a/hw/net/virtio-net.c
> > > > > > > > > +++ b/hw/net/virtio-net.c
> > > > > > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > > > > > > > >      VirtQueueElement *elem;
> > > > > > > > >      int32_t num_packets = 0;
> > > > > > > > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > > > > > > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > > > > > > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > > > > > > > +        virtio_queue_reset_state(q->tx_vq)) {
> > > > > >
> > > > > > btw this sounds like you are asking it to reset some state.
> > > > > >
> > > > > > > > >          return num_packets;
> > > > > >
> > > > > > and then
> > > > > >
> > > > > >     ret = virtio_net_flush_tx(q);
> > > > > >     if (ret >= n->tx_burst)
> > > > > >
> > > > > >
> > > > > > will reschedule automatically won't it?
> > > > > >
> > > > > > also why check in virtio_net_flush_tx and not virtio_net_tx_complete?
> > > > >
> > > > > virtio_net_flush_tx may been called by timer.
> > > > >
> > > > > Thanks.
> > > >
> > > > timer won't run while flush_or_purge_queued_packets is in progress.
> > >
> > > Is timer not executed during the VMEXIT process? Otherwise, we still have to
> > > consider that after the flush_or_purge_queued_packets, this process before the
> > > structure is cleared.
> >
> >
> >
> > void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index)
> > {
> >     VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
> >
> >     if (k->queue_reset) {
> >         k->queue_reset(vdev, queue_index);
> >     }
> >
> >     __virtio_queue_reset(vdev, queue_index);
> > }
> >
> >
> > No timers do not run between  k->queue_reset and __virtio_queue_reset.
> >
> >
> > > Even if it can be processed in virtio_net_tx_complete, is there any good way?
> > > This is a callback, it is not convenient to pass the parameters.
> > >
> > > Thanks
> >
> >
> > How about checking that length is 0?
> 
> 
> I think that check length is not a good way. This modifys the semantics of 0. It is
> not friendly to the future maintenance. On the other hand, qemu_net_queue_purge()
> will pass 0, and this function is called by many places.
> 
> How about we add an api in queue.c to replace the sent_cb callback on queue?
> 
> Thanks.

OK I guess. Jason?

> 
> >
> > >
> > > >
> > > > > >
> > > > > >
> > > > > > > > >      }
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > 2.32.0.3.g01195cf9f
> > > > > > > >
> > > > > >
> > > >
> >
Michael S. Tsirkin Jan. 30, 2023, 5:50 a.m. UTC | #16
On Mon, Jan 30, 2023 at 11:53:18AM +0800, Jason Wang wrote:
> On Mon, Jan 30, 2023 at 11:42 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> >
> > On Mon, 30 Jan 2023 11:01:40 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > > On Sun, Jan 29, 2023 at 3:44 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > > >
> > > > On Sun, 29 Jan 2023 14:23:21 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > > > > On Sun, Jan 29, 2023 at 10:52 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > > > > >
> > > > > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > > > > >
> > > > > > Before per-queue reset, we need to recover async tx resources. At this
> > > > > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > > > > new packets, so virtio_net_flush_tx() should check the current
> > > > > > per-queue reset state.
> > > > > >
> > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > > > ---
> > > > > >  hw/net/virtio-net.c | 3 ++-
> > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > > > index 3ae909041a..fba6451a50 100644
> > > > > > --- a/hw/net/virtio-net.c
> > > > > > +++ b/hw/net/virtio-net.c
> > > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > > > > >      VirtQueueElement *elem;
> > > > > >      int32_t num_packets = 0;
> > > > > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > > > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > > > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > > > > +        virtio_queue_reset_state(q->tx_vq)) {
> > > > >
> > > > > We have other places that check DRIVER_OK do we need to check queue
> > > > > reset as well?
> > > >
> > > > I checked it again. I still think that the location of other checking DRIVER_OK
> > > > does not need to check the queue reset.
> > >
> > > For example, if we don't disable can_receive() when the queue is
> > > reset, it means rx may go for virtio_net_receive_rcu(). It means the
> > > Qemu is still trying to process the traffic from the network backend
> > > like tap which may waste cpu cycles.
> > >
> > > I think the correct way is to return false when the queue is reset in
> > > can_receive(), then the backend poll will be disabled (e.g TAP). When
> > > the queue is enabled again, qemu_flush_queued_packets() will wake up
> > > the backend polling.
> > >
> > > Having had time to check other places but it would be better to
> > > mention why it doesn't need a check in the changelog.
> >
> >
> > static bool virtio_net_can_receive(NetClientState *nc)
> > {
> >     VirtIONet *n = qemu_get_nic_opaque(nc);
> >     VirtIODevice *vdev = VIRTIO_DEVICE(n);
> >     VirtIONetQueue *q = virtio_net_get_subqueue(nc);
> >
> >     if (!vdev->vm_running) {
> >         return false;
> >     }
> >
> >     if (nc->queue_index >= n->curr_queue_pairs) {
> >         return false;
> >     }
> >
> >     if (!virtio_queue_ready(q->rx_vq) ||
> >         !(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> >         return false;
> >     }
> >
> >     return true;
> > }
> >
> > int virtio_queue_ready(VirtQueue *vq)
> > {
> >     return vq->vring.avail != 0;
> > }
> >
> >
> > static void __virtio_queue_reset(VirtIODevice *vdev, uint32_t i)
> > {
> >     vdev->vq[i].vring.desc = 0;
> >     vdev->vq[i].vring.avail = 0;
> >     vdev->vq[i].vring.used = 0;
> >     vdev->vq[i].last_avail_idx = 0;
> >     vdev->vq[i].shadow_avail_idx = 0;
> >     vdev->vq[i].used_idx = 0;
> >     vdev->vq[i].last_avail_wrap_counter = true;
> >     vdev->vq[i].shadow_avail_wrap_counter = true;
> >     vdev->vq[i].used_wrap_counter = true;
> >     virtio_queue_set_vector(vdev, i, VIRTIO_NO_VECTOR);
> >     vdev->vq[i].signalled_used = 0;
> >     vdev->vq[i].signalled_used_valid = false;
> >     vdev->vq[i].notification = true;
> >     vdev->vq[i].vring.num = vdev->vq[i].vring.num_default;
> >     vdev->vq[i].inuse = 0;
> >     virtio_virtqueue_reset_region_cache(&vdev->vq[i]);
> > }
> >
> > In the implementation of Per-Queue Reset, for RX, we stop RX by setting vdev->vq[i].vring.avail to 0.
> 
> Ok, but this is kind of fragile (especially when vIOMMU is enabled).
> I'd add an explicit check for reset there.

It's not great in that spec says avail 0 is actually legal.
But I don't really want to see more and more checks.
If we are doing cleanups, the right way is probably a new "live" flag
that transports can set correctly from the combination of
DRIVER_OK, desc, kick, queue_enable, queue_reset and so on.

> (probably on top).
> 
> Thanks
> 
> > Then callback can_receive will return False.
> >
> >
> > Thanks.
> >
> >
> > >
> > > Thanks
> > >
> > > >
> > > > Thanks.
> > > >
> > > >
> > > > >
> > > > > E.g:
> > > > > virtio_net_can_receive()
> > > > > virtio_net_tx_{timer|bh}()
> > > > >
> > > > > Thanks
> > > > >
> > > > > >          return num_packets;
> > > > > >      }
> > > > > >
> > > > > > --
> > > > > > 2.32.0.3.g01195cf9f
> > > > > >
> > > > >
> > > >
> > >
> >
Jason Wang Jan. 30, 2023, 7:49 a.m. UTC | #17
On Mon, Jan 30, 2023 at 1:32 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Mon, Jan 30, 2023 at 10:15:12AM +0800, Xuan Zhuo wrote:
> > On Sun, 29 Jan 2023 07:15:47 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > On Sun, Jan 29, 2023 at 08:03:42PM +0800, Xuan Zhuo wrote:
> > > > On Sun, 29 Jan 2023 06:57:29 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > On Sun, Jan 29, 2023 at 04:23:08PM +0800, Xuan Zhuo wrote:
> > > > > > On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote:
> > > > > > > > On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote:
> > > > > > > > > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > > > > > > > > >
> > > > > > > > > > Before per-queue reset, we need to recover async tx resources. At this
> > > > > > > > > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > > > > > > > > new packets, so virtio_net_flush_tx() should check the current
> > > > > > > > > > per-queue reset state.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > What does "at this time" mean here?
> > > > > > > > > Do you in fact mean it's called from flush_or_purge_queued_packets?
> > > > > > > >
> > > > > > > > Yes
> > > > > > > >
> > > > > > > > virtio_queue_reset
> > > > > > > >   k->queue_reset
> > > > > > > >           virtio_net_queue_reset
> > > > > > > >                   flush_or_purge_queued_packets
> > > > > > > >                           qemu_flush_or_purge_queued_packets
> > > > > > > >                                   .....
> > > > > > > >                                   (callback) virtio_net_tx_complete
> > > > > > > >                                           virtio_net_flush_tx <-- here send new packet. We need stop it.
> > > > > > > >
> > > > > > > >
> > > > > > > > Because it is inside the callback, I can't pass information through the stack. I
> > > > > > > > originally thought it was a general situation, so I wanted to put it in
> > > > > > > > struct VirtQueue.
> > > > > > > >
> > > > > > > > If it is not very suitable, it may be better to put it in VirtIONetQueue.
> > > > > > > >
> > > > > > > > Thanks.
> > > > > > >
> > > > > > > Hmm maybe. Another idea: isn't virtio_net_tx_complete called
> > > > > > > with length 0 here? Are there other cases where length is 0?
> > > > > > >
> > > > > > >
> > > > > > > > > What does the call stack look like?
> > > > > > > > >
> > > > > > > > > If yes introducing a vq state just so virtio_net_flush_tx
> > > > > > > > > knows we are in the process of reset would be a bad idea.
> > > > > > > > > We want something much more local, ideally on stack even ...
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > > > > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > > > > > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > > > > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > > > > > > > ---
> > > > > > > > > >  hw/net/virtio-net.c | 3 ++-
> > > > > > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > > > > > >
> > > > > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > > > > > > > index 3ae909041a..fba6451a50 100644
> > > > > > > > > > --- a/hw/net/virtio-net.c
> > > > > > > > > > +++ b/hw/net/virtio-net.c
> > > > > > > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > > > > > > > > >      VirtQueueElement *elem;
> > > > > > > > > >      int32_t num_packets = 0;
> > > > > > > > > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > > > > > > > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > > > > > > > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > > > > > > > > +        virtio_queue_reset_state(q->tx_vq)) {
> > > > > > >
> > > > > > > btw this sounds like you are asking it to reset some state.
> > > > > > >
> > > > > > > > > >          return num_packets;
> > > > > > >
> > > > > > > and then
> > > > > > >
> > > > > > >     ret = virtio_net_flush_tx(q);
> > > > > > >     if (ret >= n->tx_burst)
> > > > > > >
> > > > > > >
> > > > > > > will reschedule automatically won't it?
> > > > > > >
> > > > > > > also why check in virtio_net_flush_tx and not virtio_net_tx_complete?
> > > > > >
> > > > > > virtio_net_flush_tx may been called by timer.

We stop timer/bh during device reset, do we need to do the same with vq reset?

> > > > > >
> > > > > > Thanks.
> > > > >
> > > > > timer won't run while flush_or_purge_queued_packets is in progress.
> > > >
> > > > Is timer not executed during the VMEXIT process? Otherwise, we still have to
> > > > consider that after the flush_or_purge_queued_packets, this process before the
> > > > structure is cleared.
> > >
> > >
> > >
> > > void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index)
> > > {
> > >     VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
> > >
> > >     if (k->queue_reset) {
> > >         k->queue_reset(vdev, queue_index);
> > >     }
> > >
> > >     __virtio_queue_reset(vdev, queue_index);
> > > }
> > >
> > >
> > > No timers do not run between  k->queue_reset and __virtio_queue_reset.
> > >
> > >
> > > > Even if it can be processed in virtio_net_tx_complete, is there any good way?
> > > > This is a callback, it is not convenient to pass the parameters.
> > > >
> > > > Thanks
> > >
> > >
> > > How about checking that length is 0?
> >
> >
> > I think that check length is not a good way. This modifys the semantics of 0.

0 seems to mean "purge" and

> It is
> > not friendly to the future maintenance. On the other hand, qemu_net_queue_purge()
> > will pass 0, and this function is called by many places.

That's exactly what we want actually, when do purge we don't need a flush?

> >
> > How about we add an api in queue.c to replace the sent_cb callback on queue?
> >
> > Thanks.
>
> OK I guess. Jason?

Not sure, anything different from adding a check in
virtio_net_tx_complete()? (assuming bh and timer is cancelled or
deleted).

Thanks

>
> >
> > >
> > > >
> > > > >
> > > > > > >
> > > > > > >
> > > > > > > > > >      }
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > 2.32.0.3.g01195cf9f
> > > > > > > > >
> > > > > > >
> > > > >
> > >
>
Xuan Zhuo Jan. 30, 2023, 7:53 a.m. UTC | #18
On Mon, 30 Jan 2023 15:49:36 +0800, Jason Wang <jasowang@redhat.com> wrote:
> On Mon, Jan 30, 2023 at 1:32 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Mon, Jan 30, 2023 at 10:15:12AM +0800, Xuan Zhuo wrote:
> > > On Sun, 29 Jan 2023 07:15:47 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > On Sun, Jan 29, 2023 at 08:03:42PM +0800, Xuan Zhuo wrote:
> > > > > On Sun, 29 Jan 2023 06:57:29 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > On Sun, Jan 29, 2023 at 04:23:08PM +0800, Xuan Zhuo wrote:
> > > > > > > On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote:
> > > > > > > > > On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > > > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote:
> > > > > > > > > > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > > > > > > > > > >
> > > > > > > > > > > Before per-queue reset, we need to recover async tx resources. At this
> > > > > > > > > > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > > > > > > > > > new packets, so virtio_net_flush_tx() should check the current
> > > > > > > > > > > per-queue reset state.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > What does "at this time" mean here?
> > > > > > > > > > Do you in fact mean it's called from flush_or_purge_queued_packets?
> > > > > > > > >
> > > > > > > > > Yes
> > > > > > > > >
> > > > > > > > > virtio_queue_reset
> > > > > > > > >   k->queue_reset
> > > > > > > > >           virtio_net_queue_reset
> > > > > > > > >                   flush_or_purge_queued_packets
> > > > > > > > >                           qemu_flush_or_purge_queued_packets
> > > > > > > > >                                   .....
> > > > > > > > >                                   (callback) virtio_net_tx_complete
> > > > > > > > >                                           virtio_net_flush_tx <-- here send new packet. We need stop it.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Because it is inside the callback, I can't pass information through the stack. I
> > > > > > > > > originally thought it was a general situation, so I wanted to put it in
> > > > > > > > > struct VirtQueue.
> > > > > > > > >
> > > > > > > > > If it is not very suitable, it may be better to put it in VirtIONetQueue.
> > > > > > > > >
> > > > > > > > > Thanks.
> > > > > > > >
> > > > > > > > Hmm maybe. Another idea: isn't virtio_net_tx_complete called
> > > > > > > > with length 0 here? Are there other cases where length is 0?
> > > > > > > >
> > > > > > > >
> > > > > > > > > > What does the call stack look like?
> > > > > > > > > >
> > > > > > > > > > If yes introducing a vq state just so virtio_net_flush_tx
> > > > > > > > > > knows we are in the process of reset would be a bad idea.
> > > > > > > > > > We want something much more local, ideally on stack even ...
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > > > > > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > > > > > > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > > > > > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > > > > > > > > ---
> > > > > > > > > > >  hw/net/virtio-net.c | 3 ++-
> > > > > > > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > > > > > > >
> > > > > > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > > > > > > > > index 3ae909041a..fba6451a50 100644
> > > > > > > > > > > --- a/hw/net/virtio-net.c
> > > > > > > > > > > +++ b/hw/net/virtio-net.c
> > > > > > > > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > > > > > > > > > >      VirtQueueElement *elem;
> > > > > > > > > > >      int32_t num_packets = 0;
> > > > > > > > > > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > > > > > > > > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > > > > > > > > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > > > > > > > > > +        virtio_queue_reset_state(q->tx_vq)) {
> > > > > > > >
> > > > > > > > btw this sounds like you are asking it to reset some state.
> > > > > > > >
> > > > > > > > > > >          return num_packets;
> > > > > > > >
> > > > > > > > and then
> > > > > > > >
> > > > > > > >     ret = virtio_net_flush_tx(q);
> > > > > > > >     if (ret >= n->tx_burst)
> > > > > > > >
> > > > > > > >
> > > > > > > > will reschedule automatically won't it?
> > > > > > > >
> > > > > > > > also why check in virtio_net_flush_tx and not virtio_net_tx_complete?
> > > > > > >
> > > > > > > virtio_net_flush_tx may been called by timer.
>
> We stop timer/bh during device reset, do we need to do the same with vq reset?
>
> > > > > > >
> > > > > > > Thanks.
> > > > > >
> > > > > > timer won't run while flush_or_purge_queued_packets is in progress.
> > > > >
> > > > > Is timer not executed during the VMEXIT process? Otherwise, we still have to
> > > > > consider that after the flush_or_purge_queued_packets, this process before the
> > > > > structure is cleared.
> > > >
> > > >
> > > >
> > > > void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index)
> > > > {
> > > >     VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
> > > >
> > > >     if (k->queue_reset) {
> > > >         k->queue_reset(vdev, queue_index);
> > > >     }
> > > >
> > > >     __virtio_queue_reset(vdev, queue_index);
> > > > }
> > > >
> > > >
> > > > No timers do not run between  k->queue_reset and __virtio_queue_reset.
> > > >
> > > >
> > > > > Even if it can be processed in virtio_net_tx_complete, is there any good way?
> > > > > This is a callback, it is not convenient to pass the parameters.
> > > > >
> > > > > Thanks
> > > >
> > > >
> > > > How about checking that length is 0?
> > >
> > >
> > > I think that check length is not a good way. This modifys the semantics of 0.
>
> 0 seems to mean "purge" and
>
> > It is
> > > not friendly to the future maintenance. On the other hand, qemu_net_queue_purge()
> > > will pass 0, and this function is called by many places.
>
> That's exactly what we want actually, when do purge we don't need a flush?

Yes, but I'm not sure. If we stop flush, there will be any other effects.

On the other hand, if we use "0" as a judgment condition, do you mean only the
implementation of the purge in the flush_or_purge_queued_packets()?

>
> > >
> > > How about we add an api in queue.c to replace the sent_cb callback on queue?
> > >
> > > Thanks.
> > > > OK I guess. Jason?
>
> Not sure, anything different from adding a check in
> virtio_net_tx_complete()? (assuming bh and timer is cancelled or
> deleted).

We replaced the sent_cb with a function without flush.

Thanks.


>
> Thanks
>
> >
> > >
> > > >
> > > > >
> > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > > > >      }
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > 2.32.0.3.g01195cf9f
> > > > > > > > > >
> > > > > > > >
> > > > > >
> > > >
> >
>
Jason Wang Jan. 30, 2023, 8:40 a.m. UTC | #19
On Mon, Jan 30, 2023 at 4:03 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>
> On Mon, 30 Jan 2023 15:49:36 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > On Mon, Jan 30, 2023 at 1:32 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Mon, Jan 30, 2023 at 10:15:12AM +0800, Xuan Zhuo wrote:
> > > > On Sun, 29 Jan 2023 07:15:47 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > On Sun, Jan 29, 2023 at 08:03:42PM +0800, Xuan Zhuo wrote:
> > > > > > On Sun, 29 Jan 2023 06:57:29 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > On Sun, Jan 29, 2023 at 04:23:08PM +0800, Xuan Zhuo wrote:
> > > > > > > > On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > > On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote:
> > > > > > > > > > On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > > > > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote:
> > > > > > > > > > > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > > > > > > > > > > >
> > > > > > > > > > > > Before per-queue reset, we need to recover async tx resources. At this
> > > > > > > > > > > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > > > > > > > > > > new packets, so virtio_net_flush_tx() should check the current
> > > > > > > > > > > > per-queue reset state.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > What does "at this time" mean here?
> > > > > > > > > > > Do you in fact mean it's called from flush_or_purge_queued_packets?
> > > > > > > > > >
> > > > > > > > > > Yes
> > > > > > > > > >
> > > > > > > > > > virtio_queue_reset
> > > > > > > > > >   k->queue_reset
> > > > > > > > > >           virtio_net_queue_reset
> > > > > > > > > >                   flush_or_purge_queued_packets
> > > > > > > > > >                           qemu_flush_or_purge_queued_packets
> > > > > > > > > >                                   .....
> > > > > > > > > >                                   (callback) virtio_net_tx_complete
> > > > > > > > > >                                           virtio_net_flush_tx <-- here send new packet. We need stop it.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Because it is inside the callback, I can't pass information through the stack. I
> > > > > > > > > > originally thought it was a general situation, so I wanted to put it in
> > > > > > > > > > struct VirtQueue.
> > > > > > > > > >
> > > > > > > > > > If it is not very suitable, it may be better to put it in VirtIONetQueue.
> > > > > > > > > >
> > > > > > > > > > Thanks.
> > > > > > > > >
> > > > > > > > > Hmm maybe. Another idea: isn't virtio_net_tx_complete called
> > > > > > > > > with length 0 here? Are there other cases where length is 0?
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > > What does the call stack look like?
> > > > > > > > > > >
> > > > > > > > > > > If yes introducing a vq state just so virtio_net_flush_tx
> > > > > > > > > > > knows we are in the process of reset would be a bad idea.
> > > > > > > > > > > We want something much more local, ideally on stack even ...
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > > > > > > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > > > > > > > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > > > > > > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > > > > > > > > > ---
> > > > > > > > > > > >  hw/net/virtio-net.c | 3 ++-
> > > > > > > > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > > > > > > > >
> > > > > > > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > > > > > > > > > index 3ae909041a..fba6451a50 100644
> > > > > > > > > > > > --- a/hw/net/virtio-net.c
> > > > > > > > > > > > +++ b/hw/net/virtio-net.c
> > > > > > > > > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > > > > > > > > > > >      VirtQueueElement *elem;
> > > > > > > > > > > >      int32_t num_packets = 0;
> > > > > > > > > > > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > > > > > > > > > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > > > > > > > > > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > > > > > > > > > > +        virtio_queue_reset_state(q->tx_vq)) {
> > > > > > > > >
> > > > > > > > > btw this sounds like you are asking it to reset some state.
> > > > > > > > >
> > > > > > > > > > > >          return num_packets;
> > > > > > > > >
> > > > > > > > > and then
> > > > > > > > >
> > > > > > > > >     ret = virtio_net_flush_tx(q);
> > > > > > > > >     if (ret >= n->tx_burst)
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > will reschedule automatically won't it?
> > > > > > > > >
> > > > > > > > > also why check in virtio_net_flush_tx and not virtio_net_tx_complete?
> > > > > > > >
> > > > > > > > virtio_net_flush_tx may been called by timer.
> >
> > We stop timer/bh during device reset, do we need to do the same with vq reset?
> >
> > > > > > > >
> > > > > > > > Thanks.
> > > > > > >
> > > > > > > timer won't run while flush_or_purge_queued_packets is in progress.
> > > > > >
> > > > > > Is timer not executed during the VMEXIT process? Otherwise, we still have to
> > > > > > consider that after the flush_or_purge_queued_packets, this process before the
> > > > > > structure is cleared.
> > > > >
> > > > >
> > > > >
> > > > > void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index)
> > > > > {
> > > > >     VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
> > > > >
> > > > >     if (k->queue_reset) {
> > > > >         k->queue_reset(vdev, queue_index);
> > > > >     }
> > > > >
> > > > >     __virtio_queue_reset(vdev, queue_index);
> > > > > }
> > > > >
> > > > >
> > > > > No timers do not run between  k->queue_reset and __virtio_queue_reset.
> > > > >
> > > > >
> > > > > > Even if it can be processed in virtio_net_tx_complete, is there any good way?
> > > > > > This is a callback, it is not convenient to pass the parameters.
> > > > > >
> > > > > > Thanks
> > > > >
> > > > >
> > > > > How about checking that length is 0?
> > > >
> > > >
> > > > I think that check length is not a good way. This modifys the semantics of 0.
> >
> > 0 seems to mean "purge" and
> >
> > > It is
> > > > not friendly to the future maintenance. On the other hand, qemu_net_queue_purge()
> > > > will pass 0, and this function is called by many places.
> >
> > That's exactly what we want actually, when do purge we don't need a flush?
>
> Yes, but I'm not sure. If we stop flush, there will be any other effects.

So we did:

virtio_net_queue_reset():
    nc = qemu_get_subqueue(n->nic, vq2q(queue_index);
    flush_or_purge_queued_packets(nc);
        qemu_flush_or_purge_queued_packets(nc->peer, true); // [1]
            if (qemu_net_queue_flush(nc->incoming_queue)) {
            ....
            } else if (purge) {
                qemu_net_queue_purge(nc->incoming_queue, nc->peer);
                        packet->send_cb()
                            virtio_net_tx_complete()
                                virtio_net_flush_tx()
                                    qemu_sendv_packet_async() // [2]
            }

We try to flush the tap's incoming queue and if we fail we will purge
in [1]. But the sent_cb() tries to send more packets which could be
queued to the tap incoming queue [2]. This breaks the semantic of
qemu_flush_or_purge_queued_packets().

>
> On the other hand, if we use "0" as a judgment condition, do you mean only the
> implementation of the purge in the flush_or_purge_queued_packets()?

It should be all the users of qemu_net_queue_purge(). The rest users
seems all fine:

virtio_net_vhost_status(), if we do flush, it may end up with touching
vring during vhost is running.
filters: all do a flush before.

>
> >
> > > >
> > > > How about we add an api in queue.c to replace the sent_cb callback on queue?
> > > >
> > > > Thanks.
> > > > > OK I guess. Jason?
> >
> > Not sure, anything different from adding a check in
> > virtio_net_tx_complete()? (assuming bh and timer is cancelled or
> > deleted).
>
> We replaced the sent_cb with a function without flush.

I meant it won't be different from adding a

if (virtio_queue_is_rest())

somewhere in virtio_net_tx_complete()?

Thanks

>
> Thanks.
>
>
> >
> > Thanks
> >
> > >
> > > >
> > > > >
> > > > > >
> > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > > >      }
> > > > > > > > > > > >
> > > > > > > > > > > > --
> > > > > > > > > > > > 2.32.0.3.g01195cf9f
> > > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > >
> > >
> >
>
Jason Wang Jan. 30, 2023, 8:41 a.m. UTC | #20
On Mon, Jan 30, 2023 at 1:50 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Mon, Jan 30, 2023 at 11:53:18AM +0800, Jason Wang wrote:
> > On Mon, Jan 30, 2023 at 11:42 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > >
> > > On Mon, 30 Jan 2023 11:01:40 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > > > On Sun, Jan 29, 2023 at 3:44 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > > > >
> > > > > On Sun, 29 Jan 2023 14:23:21 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > > > > > On Sun, Jan 29, 2023 at 10:52 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > > > > > >
> > > > > > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > > > > > >
> > > > > > > Before per-queue reset, we need to recover async tx resources. At this
> > > > > > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > > > > > new packets, so virtio_net_flush_tx() should check the current
> > > > > > > per-queue reset state.
> > > > > > >
> > > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > > > > ---
> > > > > > >  hw/net/virtio-net.c | 3 ++-
> > > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > > >
> > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > > > > index 3ae909041a..fba6451a50 100644
> > > > > > > --- a/hw/net/virtio-net.c
> > > > > > > +++ b/hw/net/virtio-net.c
> > > > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > > > > > >      VirtQueueElement *elem;
> > > > > > >      int32_t num_packets = 0;
> > > > > > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > > > > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > > > > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > > > > > +        virtio_queue_reset_state(q->tx_vq)) {
> > > > > >
> > > > > > We have other places that check DRIVER_OK do we need to check queue
> > > > > > reset as well?
> > > > >
> > > > > I checked it again. I still think that the location of other checking DRIVER_OK
> > > > > does not need to check the queue reset.
> > > >
> > > > For example, if we don't disable can_receive() when the queue is
> > > > reset, it means rx may go for virtio_net_receive_rcu(). It means the
> > > > Qemu is still trying to process the traffic from the network backend
> > > > like tap which may waste cpu cycles.
> > > >
> > > > I think the correct way is to return false when the queue is reset in
> > > > can_receive(), then the backend poll will be disabled (e.g TAP). When
> > > > the queue is enabled again, qemu_flush_queued_packets() will wake up
> > > > the backend polling.
> > > >
> > > > Having had time to check other places but it would be better to
> > > > mention why it doesn't need a check in the changelog.
> > >
> > >
> > > static bool virtio_net_can_receive(NetClientState *nc)
> > > {
> > >     VirtIONet *n = qemu_get_nic_opaque(nc);
> > >     VirtIODevice *vdev = VIRTIO_DEVICE(n);
> > >     VirtIONetQueue *q = virtio_net_get_subqueue(nc);
> > >
> > >     if (!vdev->vm_running) {
> > >         return false;
> > >     }
> > >
> > >     if (nc->queue_index >= n->curr_queue_pairs) {
> > >         return false;
> > >     }
> > >
> > >     if (!virtio_queue_ready(q->rx_vq) ||
> > >         !(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > >         return false;
> > >     }
> > >
> > >     return true;
> > > }
> > >
> > > int virtio_queue_ready(VirtQueue *vq)
> > > {
> > >     return vq->vring.avail != 0;
> > > }
> > >
> > >
> > > static void __virtio_queue_reset(VirtIODevice *vdev, uint32_t i)
> > > {
> > >     vdev->vq[i].vring.desc = 0;
> > >     vdev->vq[i].vring.avail = 0;
> > >     vdev->vq[i].vring.used = 0;
> > >     vdev->vq[i].last_avail_idx = 0;
> > >     vdev->vq[i].shadow_avail_idx = 0;
> > >     vdev->vq[i].used_idx = 0;
> > >     vdev->vq[i].last_avail_wrap_counter = true;
> > >     vdev->vq[i].shadow_avail_wrap_counter = true;
> > >     vdev->vq[i].used_wrap_counter = true;
> > >     virtio_queue_set_vector(vdev, i, VIRTIO_NO_VECTOR);
> > >     vdev->vq[i].signalled_used = 0;
> > >     vdev->vq[i].signalled_used_valid = false;
> > >     vdev->vq[i].notification = true;
> > >     vdev->vq[i].vring.num = vdev->vq[i].vring.num_default;
> > >     vdev->vq[i].inuse = 0;
> > >     virtio_virtqueue_reset_region_cache(&vdev->vq[i]);
> > > }
> > >
> > > In the implementation of Per-Queue Reset, for RX, we stop RX by setting vdev->vq[i].vring.avail to 0.
> >
> > Ok, but this is kind of fragile (especially when vIOMMU is enabled).
> > I'd add an explicit check for reset there.
>
> It's not great in that spec says avail 0 is actually legal.
> But I don't really want to see more and more checks.
> If we are doing cleanups, the right way is probably a new "live" flag
> that transports can set correctly from the combination of
> DRIVER_OK, desc, kick, queue_enable, queue_reset and so on.

I second this, but for kick, we can only do that unless it is mandated
by the spec (otherwise we may break drivers silently).

Thanks

>
> > (probably on top).
> >
> > Thanks
> >
> > > Then callback can_receive will return False.
> > >
> > >
> > > Thanks.
> > >
> > >
> > > >
> > > > Thanks
> > > >
> > > > >
> > > > > Thanks.
> > > > >
> > > > >
> > > > > >
> > > > > > E.g:
> > > > > > virtio_net_can_receive()
> > > > > > virtio_net_tx_{timer|bh}()
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > > >          return num_packets;
> > > > > > >      }
> > > > > > >
> > > > > > > --
> > > > > > > 2.32.0.3.g01195cf9f
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
>
Xuan Zhuo Jan. 30, 2023, 10:24 a.m. UTC | #21
On Mon, 30 Jan 2023 16:40:08 +0800, Jason Wang <jasowang@redhat.com> wrote:
> On Mon, Jan 30, 2023 at 4:03 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> >
> > On Mon, 30 Jan 2023 15:49:36 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > > On Mon, Jan 30, 2023 at 1:32 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Mon, Jan 30, 2023 at 10:15:12AM +0800, Xuan Zhuo wrote:
> > > > > On Sun, 29 Jan 2023 07:15:47 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > On Sun, Jan 29, 2023 at 08:03:42PM +0800, Xuan Zhuo wrote:
> > > > > > > On Sun, 29 Jan 2023 06:57:29 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > On Sun, Jan 29, 2023 at 04:23:08PM +0800, Xuan Zhuo wrote:
> > > > > > > > > On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > > > On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote:
> > > > > > > > > > > On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > > > > > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote:
> > > > > > > > > > > > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > > > > > > > > > > > >
> > > > > > > > > > > > > Before per-queue reset, we need to recover async tx resources. At this
> > > > > > > > > > > > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > > > > > > > > > > > new packets, so virtio_net_flush_tx() should check the current
> > > > > > > > > > > > > per-queue reset state.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > What does "at this time" mean here?
> > > > > > > > > > > > Do you in fact mean it's called from flush_or_purge_queued_packets?
> > > > > > > > > > >
> > > > > > > > > > > Yes
> > > > > > > > > > >
> > > > > > > > > > > virtio_queue_reset
> > > > > > > > > > >   k->queue_reset
> > > > > > > > > > >           virtio_net_queue_reset
> > > > > > > > > > >                   flush_or_purge_queued_packets
> > > > > > > > > > >                           qemu_flush_or_purge_queued_packets
> > > > > > > > > > >                                   .....
> > > > > > > > > > >                                   (callback) virtio_net_tx_complete
> > > > > > > > > > >                                           virtio_net_flush_tx <-- here send new packet. We need stop it.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Because it is inside the callback, I can't pass information through the stack. I
> > > > > > > > > > > originally thought it was a general situation, so I wanted to put it in
> > > > > > > > > > > struct VirtQueue.
> > > > > > > > > > >
> > > > > > > > > > > If it is not very suitable, it may be better to put it in VirtIONetQueue.
> > > > > > > > > > >
> > > > > > > > > > > Thanks.
> > > > > > > > > >
> > > > > > > > > > Hmm maybe. Another idea: isn't virtio_net_tx_complete called
> > > > > > > > > > with length 0 here? Are there other cases where length is 0?
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > > What does the call stack look like?
> > > > > > > > > > > >
> > > > > > > > > > > > If yes introducing a vq state just so virtio_net_flush_tx
> > > > > > > > > > > > knows we are in the process of reset would be a bad idea.
> > > > > > > > > > > > We want something much more local, ideally on stack even ...
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > > > > > > > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > > > > > > > > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > > > > > > > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > > > > > > > > > > ---
> > > > > > > > > > > > >  hw/net/virtio-net.c | 3 ++-
> > > > > > > > > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > > > > > > > > >
> > > > > > > > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > > > > > > > > > > index 3ae909041a..fba6451a50 100644
> > > > > > > > > > > > > --- a/hw/net/virtio-net.c
> > > > > > > > > > > > > +++ b/hw/net/virtio-net.c
> > > > > > > > > > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > > > > > > > > > > > >      VirtQueueElement *elem;
> > > > > > > > > > > > >      int32_t num_packets = 0;
> > > > > > > > > > > > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > > > > > > > > > > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > > > > > > > > > > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > > > > > > > > > > > +        virtio_queue_reset_state(q->tx_vq)) {
> > > > > > > > > >
> > > > > > > > > > btw this sounds like you are asking it to reset some state.
> > > > > > > > > >
> > > > > > > > > > > > >          return num_packets;
> > > > > > > > > >
> > > > > > > > > > and then
> > > > > > > > > >
> > > > > > > > > >     ret = virtio_net_flush_tx(q);
> > > > > > > > > >     if (ret >= n->tx_burst)
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > will reschedule automatically won't it?
> > > > > > > > > >
> > > > > > > > > > also why check in virtio_net_flush_tx and not virtio_net_tx_complete?
> > > > > > > > >
> > > > > > > > > virtio_net_flush_tx may been called by timer.
> > >
> > > We stop timer/bh during device reset, do we need to do the same with vq reset?
> > >
> > > > > > > > >
> > > > > > > > > Thanks.
> > > > > > > >
> > > > > > > > timer won't run while flush_or_purge_queued_packets is in progress.
> > > > > > >
> > > > > > > Is timer not executed during the VMEXIT process? Otherwise, we still have to
> > > > > > > consider that after the flush_or_purge_queued_packets, this process before the
> > > > > > > structure is cleared.
> > > > > >
> > > > > >
> > > > > >
> > > > > > void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index)
> > > > > > {
> > > > > >     VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
> > > > > >
> > > > > >     if (k->queue_reset) {
> > > > > >         k->queue_reset(vdev, queue_index);
> > > > > >     }
> > > > > >
> > > > > >     __virtio_queue_reset(vdev, queue_index);
> > > > > > }
> > > > > >
> > > > > >
> > > > > > No timers do not run between  k->queue_reset and __virtio_queue_reset.
> > > > > >
> > > > > >
> > > > > > > Even if it can be processed in virtio_net_tx_complete, is there any good way?
> > > > > > > This is a callback, it is not convenient to pass the parameters.
> > > > > > >
> > > > > > > Thanks
> > > > > >
> > > > > >
> > > > > > How about checking that length is 0?
> > > > >
> > > > >
> > > > > I think that check length is not a good way. This modifys the semantics of 0.
> > >
> > > 0 seems to mean "purge" and
> > >
> > > > It is
> > > > > not friendly to the future maintenance. On the other hand, qemu_net_queue_purge()
> > > > > will pass 0, and this function is called by many places.
> > >
> > > That's exactly what we want actually, when do purge we don't need a flush?
> >
> > Yes, but I'm not sure. If we stop flush, there will be any other effects.
>
> So we did:
>
> virtio_net_queue_reset():
>     nc = qemu_get_subqueue(n->nic, vq2q(queue_index);
>     flush_or_purge_queued_packets(nc);
>         qemu_flush_or_purge_queued_packets(nc->peer, true); // [1]
>             if (qemu_net_queue_flush(nc->incoming_queue)) {
>             ....
>             } else if (purge) {
>                 qemu_net_queue_purge(nc->incoming_queue, nc->peer);
>                         packet->send_cb()
>                             virtio_net_tx_complete()
>                                 virtio_net_flush_tx()
>                                     qemu_sendv_packet_async() // [2]
>             }
>
> We try to flush the tap's incoming queue and if we fail we will purge
> in [1]. But the sent_cb() tries to send more packets which could be
> queued to the tap incoming queue [2]. This breaks the semantic of
> qemu_flush_or_purge_queued_packets().

Sounds like good news, and I think so too.

>
> >
> > On the other hand, if we use "0" as a judgment condition, do you mean only the
> > implementation of the purge in the flush_or_purge_queued_packets()?
>
> It should be all the users of qemu_net_queue_purge(). The rest users
> seems all fine:
>
> virtio_net_vhost_status(), if we do flush, it may end up with touching
> vring during vhost is running.
> filters: all do a flush before.
>
> >
> > >
> > > > >
> > > > > How about we add an api in queue.c to replace the sent_cb callback on queue?
> > > > >
> > > > > Thanks.
> > > > > > OK I guess. Jason?
> > >
> > > Not sure, anything different from adding a check in
> > > virtio_net_tx_complete()? (assuming bh and timer is cancelled or
> > > deleted).
> >
> > We replaced the sent_cb with a function without flush.
>
> I meant it won't be different from adding a
>
> if (virtio_queue_is_rest())
>
> somewhere in virtio_net_tx_complete()?


Only modify on the stack, without using a variable like disabled_by_reset.

Thanks.


>
> Thanks
>
> >
> > Thanks.
> >
> >
> > >
> > > Thanks
> > >
> > > >
> > > > >
> > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > > >      }
> > > > > > > > > > > > >
> > > > > > > > > > > > > --
> > > > > > > > > > > > > 2.32.0.3.g01195cf9f
> > > > > > > > > > > >
> > > > > > > > > >
> > > > > > > >
> > > > > >
> > > >
> > >
> >
>
Jason Wang Jan. 31, 2023, 3:27 a.m. UTC | #22
On Mon, Jan 30, 2023 at 6:26 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>
> On Mon, 30 Jan 2023 16:40:08 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > On Mon, Jan 30, 2023 at 4:03 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > >
> > > On Mon, 30 Jan 2023 15:49:36 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > > > On Mon, Jan 30, 2023 at 1:32 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Mon, Jan 30, 2023 at 10:15:12AM +0800, Xuan Zhuo wrote:
> > > > > > On Sun, 29 Jan 2023 07:15:47 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > On Sun, Jan 29, 2023 at 08:03:42PM +0800, Xuan Zhuo wrote:
> > > > > > > > On Sun, 29 Jan 2023 06:57:29 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > > On Sun, Jan 29, 2023 at 04:23:08PM +0800, Xuan Zhuo wrote:
> > > > > > > > > > On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > > > > On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote:
> > > > > > > > > > > > On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > > > > > > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote:
> > > > > > > > > > > > > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Before per-queue reset, we need to recover async tx resources. At this
> > > > > > > > > > > > > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > > > > > > > > > > > > new packets, so virtio_net_flush_tx() should check the current
> > > > > > > > > > > > > > per-queue reset state.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > What does "at this time" mean here?
> > > > > > > > > > > > > Do you in fact mean it's called from flush_or_purge_queued_packets?
> > > > > > > > > > > >
> > > > > > > > > > > > Yes
> > > > > > > > > > > >
> > > > > > > > > > > > virtio_queue_reset
> > > > > > > > > > > >   k->queue_reset
> > > > > > > > > > > >           virtio_net_queue_reset
> > > > > > > > > > > >                   flush_or_purge_queued_packets
> > > > > > > > > > > >                           qemu_flush_or_purge_queued_packets
> > > > > > > > > > > >                                   .....
> > > > > > > > > > > >                                   (callback) virtio_net_tx_complete
> > > > > > > > > > > >                                           virtio_net_flush_tx <-- here send new packet. We need stop it.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Because it is inside the callback, I can't pass information through the stack. I
> > > > > > > > > > > > originally thought it was a general situation, so I wanted to put it in
> > > > > > > > > > > > struct VirtQueue.
> > > > > > > > > > > >
> > > > > > > > > > > > If it is not very suitable, it may be better to put it in VirtIONetQueue.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks.
> > > > > > > > > > >
> > > > > > > > > > > Hmm maybe. Another idea: isn't virtio_net_tx_complete called
> > > > > > > > > > > with length 0 here? Are there other cases where length is 0?
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > > What does the call stack look like?
> > > > > > > > > > > > >
> > > > > > > > > > > > > If yes introducing a vq state just so virtio_net_flush_tx
> > > > > > > > > > > > > knows we are in the process of reset would be a bad idea.
> > > > > > > > > > > > > We want something much more local, ideally on stack even ...
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > > > > > > > > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > > > > > > > > > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > > > > > > > > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > > > > > > > > > > > ---
> > > > > > > > > > > > > >  hw/net/virtio-net.c | 3 ++-
> > > > > > > > > > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > > > > > > > > > > > index 3ae909041a..fba6451a50 100644
> > > > > > > > > > > > > > --- a/hw/net/virtio-net.c
> > > > > > > > > > > > > > +++ b/hw/net/virtio-net.c
> > > > > > > > > > > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > > > > > > > > > > > > >      VirtQueueElement *elem;
> > > > > > > > > > > > > >      int32_t num_packets = 0;
> > > > > > > > > > > > > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > > > > > > > > > > > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > > > > > > > > > > > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > > > > > > > > > > > > +        virtio_queue_reset_state(q->tx_vq)) {
> > > > > > > > > > >
> > > > > > > > > > > btw this sounds like you are asking it to reset some state.
> > > > > > > > > > >
> > > > > > > > > > > > > >          return num_packets;
> > > > > > > > > > >
> > > > > > > > > > > and then
> > > > > > > > > > >
> > > > > > > > > > >     ret = virtio_net_flush_tx(q);
> > > > > > > > > > >     if (ret >= n->tx_burst)
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > will reschedule automatically won't it?
> > > > > > > > > > >
> > > > > > > > > > > also why check in virtio_net_flush_tx and not virtio_net_tx_complete?
> > > > > > > > > >
> > > > > > > > > > virtio_net_flush_tx may been called by timer.
> > > >
> > > > We stop timer/bh during device reset, do we need to do the same with vq reset?
> > > >
> > > > > > > > > >
> > > > > > > > > > Thanks.
> > > > > > > > >
> > > > > > > > > timer won't run while flush_or_purge_queued_packets is in progress.
> > > > > > > >
> > > > > > > > Is timer not executed during the VMEXIT process? Otherwise, we still have to
> > > > > > > > consider that after the flush_or_purge_queued_packets, this process before the
> > > > > > > > structure is cleared.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index)
> > > > > > > {
> > > > > > >     VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
> > > > > > >
> > > > > > >     if (k->queue_reset) {
> > > > > > >         k->queue_reset(vdev, queue_index);
> > > > > > >     }
> > > > > > >
> > > > > > >     __virtio_queue_reset(vdev, queue_index);
> > > > > > > }
> > > > > > >
> > > > > > >
> > > > > > > No timers do not run between  k->queue_reset and __virtio_queue_reset.
> > > > > > >
> > > > > > >
> > > > > > > > Even if it can be processed in virtio_net_tx_complete, is there any good way?
> > > > > > > > This is a callback, it is not convenient to pass the parameters.
> > > > > > > >
> > > > > > > > Thanks
> > > > > > >
> > > > > > >
> > > > > > > How about checking that length is 0?
> > > > > >
> > > > > >
> > > > > > I think that check length is not a good way. This modifys the semantics of 0.
> > > >
> > > > 0 seems to mean "purge" and
> > > >
> > > > > It is
> > > > > > not friendly to the future maintenance. On the other hand, qemu_net_queue_purge()
> > > > > > will pass 0, and this function is called by many places.
> > > >
> > > > That's exactly what we want actually, when do purge we don't need a flush?
> > >
> > > Yes, but I'm not sure. If we stop flush, there will be any other effects.
> >
> > So we did:
> >
> > virtio_net_queue_reset():
> >     nc = qemu_get_subqueue(n->nic, vq2q(queue_index);
> >     flush_or_purge_queued_packets(nc);
> >         qemu_flush_or_purge_queued_packets(nc->peer, true); // [1]
> >             if (qemu_net_queue_flush(nc->incoming_queue)) {
> >             ....
> >             } else if (purge) {
> >                 qemu_net_queue_purge(nc->incoming_queue, nc->peer);
> >                         packet->send_cb()
> >                             virtio_net_tx_complete()
> >                                 virtio_net_flush_tx()
> >                                     qemu_sendv_packet_async() // [2]
> >             }
> >
> > We try to flush the tap's incoming queue and if we fail we will purge
> > in [1]. But the sent_cb() tries to send more packets which could be
> > queued to the tap incoming queue [2]. This breaks the semantic of
> > qemu_flush_or_purge_queued_packets().
>
> Sounds like good news, and I think so too.
>
> >
> > >
> > > On the other hand, if we use "0" as a judgment condition, do you mean only the
> > > implementation of the purge in the flush_or_purge_queued_packets()?
> >
> > It should be all the users of qemu_net_queue_purge(). The rest users
> > seems all fine:
> >
> > virtio_net_vhost_status(), if we do flush, it may end up with touching
> > vring during vhost is running.
> > filters: all do a flush before.
> >
> > >
> > > >
> > > > > >
> > > > > > How about we add an api in queue.c to replace the sent_cb callback on queue?
> > > > > >
> > > > > > Thanks.
> > > > > > > OK I guess. Jason?
> > > >
> > > > Not sure, anything different from adding a check in
> > > > virtio_net_tx_complete()? (assuming bh and timer is cancelled or
> > > > deleted).
> > >
> > > We replaced the sent_cb with a function without flush.
> >
> > I meant it won't be different from adding a
> >
> > if (virtio_queue_is_rest())
> >
> > somewhere in virtio_net_tx_complete()?
>
>
> Only modify on the stack, without using a variable like disabled_by_reset.

Ok, but per discussion above, it looks to me we can simply check len
against zero which seems simpler. (And it may fix other possible bugs)

Thanks

>
> Thanks.
>
>
> >
> > Thanks
> >
> > >
> > > Thanks.
> > >
> > >
> > > >
> > > > Thanks
> > > >
> > > > >
> > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > > >      }
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > --
> > > > > > > > > > > > > > 2.32.0.3.g01195cf9f
> > > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > >
> > > >
> > >
> >
>
Xuan Zhuo Jan. 31, 2023, 7:17 a.m. UTC | #23
On Tue, 31 Jan 2023 11:27:42 +0800, Jason Wang <jasowang@redhat.com> wrote:
> On Mon, Jan 30, 2023 at 6:26 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> >
> > On Mon, 30 Jan 2023 16:40:08 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > > On Mon, Jan 30, 2023 at 4:03 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > > >
> > > > On Mon, 30 Jan 2023 15:49:36 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > > > > On Mon, Jan 30, 2023 at 1:32 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > >
> > > > > > On Mon, Jan 30, 2023 at 10:15:12AM +0800, Xuan Zhuo wrote:
> > > > > > > On Sun, 29 Jan 2023 07:15:47 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > On Sun, Jan 29, 2023 at 08:03:42PM +0800, Xuan Zhuo wrote:
> > > > > > > > > On Sun, 29 Jan 2023 06:57:29 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > > > On Sun, Jan 29, 2023 at 04:23:08PM +0800, Xuan Zhuo wrote:
> > > > > > > > > > > On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > > > > > On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote:
> > > > > > > > > > > > > On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > > > > > > > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote:
> > > > > > > > > > > > > > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Before per-queue reset, we need to recover async tx resources. At this
> > > > > > > > > > > > > > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > > > > > > > > > > > > > new packets, so virtio_net_flush_tx() should check the current
> > > > > > > > > > > > > > > per-queue reset state.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > What does "at this time" mean here?
> > > > > > > > > > > > > > Do you in fact mean it's called from flush_or_purge_queued_packets?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Yes
> > > > > > > > > > > > >
> > > > > > > > > > > > > virtio_queue_reset
> > > > > > > > > > > > >   k->queue_reset
> > > > > > > > > > > > >           virtio_net_queue_reset
> > > > > > > > > > > > >                   flush_or_purge_queued_packets
> > > > > > > > > > > > >                           qemu_flush_or_purge_queued_packets
> > > > > > > > > > > > >                                   .....
> > > > > > > > > > > > >                                   (callback) virtio_net_tx_complete
> > > > > > > > > > > > >                                           virtio_net_flush_tx <-- here send new packet. We need stop it.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Because it is inside the callback, I can't pass information through the stack. I
> > > > > > > > > > > > > originally thought it was a general situation, so I wanted to put it in
> > > > > > > > > > > > > struct VirtQueue.
> > > > > > > > > > > > >
> > > > > > > > > > > > > If it is not very suitable, it may be better to put it in VirtIONetQueue.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks.
> > > > > > > > > > > >
> > > > > > > > > > > > Hmm maybe. Another idea: isn't virtio_net_tx_complete called
> > > > > > > > > > > > with length 0 here? Are there other cases where length is 0?
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > > > What does the call stack look like?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > If yes introducing a vq state just so virtio_net_flush_tx
> > > > > > > > > > > > > > knows we are in the process of reset would be a bad idea.
> > > > > > > > > > > > > > We want something much more local, ideally on stack even ...
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > > > > > > > > > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > > > > > > > > > > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > > > > > > > > > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > > > > > > > > > > > > ---
> > > > > > > > > > > > > > >  hw/net/virtio-net.c | 3 ++-
> > > > > > > > > > > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > > > > > > > > > > > > index 3ae909041a..fba6451a50 100644
> > > > > > > > > > > > > > > --- a/hw/net/virtio-net.c
> > > > > > > > > > > > > > > +++ b/hw/net/virtio-net.c
> > > > > > > > > > > > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > > > > > > > > > > > > > >      VirtQueueElement *elem;
> > > > > > > > > > > > > > >      int32_t num_packets = 0;
> > > > > > > > > > > > > > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > > > > > > > > > > > > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > > > > > > > > > > > > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > > > > > > > > > > > > > +        virtio_queue_reset_state(q->tx_vq)) {
> > > > > > > > > > > >
> > > > > > > > > > > > btw this sounds like you are asking it to reset some state.
> > > > > > > > > > > >
> > > > > > > > > > > > > > >          return num_packets;
> > > > > > > > > > > >
> > > > > > > > > > > > and then
> > > > > > > > > > > >
> > > > > > > > > > > >     ret = virtio_net_flush_tx(q);
> > > > > > > > > > > >     if (ret >= n->tx_burst)
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > will reschedule automatically won't it?
> > > > > > > > > > > >
> > > > > > > > > > > > also why check in virtio_net_flush_tx and not virtio_net_tx_complete?
> > > > > > > > > > >
> > > > > > > > > > > virtio_net_flush_tx may been called by timer.
> > > > >
> > > > > We stop timer/bh during device reset, do we need to do the same with vq reset?
> > > > >
> > > > > > > > > > >
> > > > > > > > > > > Thanks.
> > > > > > > > > >
> > > > > > > > > > timer won't run while flush_or_purge_queued_packets is in progress.
> > > > > > > > >
> > > > > > > > > Is timer not executed during the VMEXIT process? Otherwise, we still have to
> > > > > > > > > consider that after the flush_or_purge_queued_packets, this process before the
> > > > > > > > > structure is cleared.
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index)
> > > > > > > > {
> > > > > > > >     VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
> > > > > > > >
> > > > > > > >     if (k->queue_reset) {
> > > > > > > >         k->queue_reset(vdev, queue_index);
> > > > > > > >     }
> > > > > > > >
> > > > > > > >     __virtio_queue_reset(vdev, queue_index);
> > > > > > > > }
> > > > > > > >
> > > > > > > >
> > > > > > > > No timers do not run between  k->queue_reset and __virtio_queue_reset.
> > > > > > > >
> > > > > > > >
> > > > > > > > > Even if it can be processed in virtio_net_tx_complete, is there any good way?
> > > > > > > > > This is a callback, it is not convenient to pass the parameters.
> > > > > > > > >
> > > > > > > > > Thanks
> > > > > > > >
> > > > > > > >
> > > > > > > > How about checking that length is 0?
> > > > > > >
> > > > > > >
> > > > > > > I think that check length is not a good way. This modifys the semantics of 0.
> > > > >
> > > > > 0 seems to mean "purge" and
> > > > >
> > > > > > It is
> > > > > > > not friendly to the future maintenance. On the other hand, qemu_net_queue_purge()
> > > > > > > will pass 0, and this function is called by many places.
> > > > >
> > > > > That's exactly what we want actually, when do purge we don't need a flush?
> > > >
> > > > Yes, but I'm not sure. If we stop flush, there will be any other effects.
> > >
> > > So we did:
> > >
> > > virtio_net_queue_reset():
> > >     nc = qemu_get_subqueue(n->nic, vq2q(queue_index);
> > >     flush_or_purge_queued_packets(nc);
> > >         qemu_flush_or_purge_queued_packets(nc->peer, true); // [1]
> > >             if (qemu_net_queue_flush(nc->incoming_queue)) {
> > >             ....
> > >             } else if (purge) {
> > >                 qemu_net_queue_purge(nc->incoming_queue, nc->peer);
> > >                         packet->send_cb()
> > >                             virtio_net_tx_complete()
> > >                                 virtio_net_flush_tx()
> > >                                     qemu_sendv_packet_async() // [2]
> > >             }
> > >
> > > We try to flush the tap's incoming queue and if we fail we will purge
> > > in [1]. But the sent_cb() tries to send more packets which could be
> > > queued to the tap incoming queue [2]. This breaks the semantic of
> > > qemu_flush_or_purge_queued_packets().
> >
> > Sounds like good news, and I think so too.
> >
> > >
> > > >
> > > > On the other hand, if we use "0" as a judgment condition, do you mean only the
> > > > implementation of the purge in the flush_or_purge_queued_packets()?
> > >
> > > It should be all the users of qemu_net_queue_purge(). The rest users
> > > seems all fine:
> > >
> > > virtio_net_vhost_status(), if we do flush, it may end up with touching
> > > vring during vhost is running.
> > > filters: all do a flush before.
> > >
> > > >
> > > > >
> > > > > > >
> > > > > > > How about we add an api in queue.c to replace the sent_cb callback on queue?
> > > > > > >
> > > > > > > Thanks.
> > > > > > > > OK I guess. Jason?
> > > > >
> > > > > Not sure, anything different from adding a check in
> > > > > virtio_net_tx_complete()? (assuming bh and timer is cancelled or
> > > > > deleted).
> > > >
> > > > We replaced the sent_cb with a function without flush.
> > >
> > > I meant it won't be different from adding a
> > >
> > > if (virtio_queue_is_rest())
> > >
> > > somewhere in virtio_net_tx_complete()?
> >
> >
> > Only modify on the stack, without using a variable like disabled_by_reset.
>
> Ok, but per discussion above, it looks to me we can simply check len
> against zero which seems simpler. (And it may fix other possible bugs)


Yes, if '0' means purge and we should not flush new packet for purge, that is a
good way.

I will post patch soon.

Thanks.


>
> Thanks
>
> >
> > Thanks.
> >
> >
> > >
> > > Thanks
> > >
> > > >
> > > > Thanks.
> > > >
> > > >
> > > > >
> > > > > Thanks
> > > > >
> > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > > > >      }
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > 2.32.0.3.g01195cf9f
> > > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > >
> > > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
Xuan Zhuo Jan. 31, 2023, 7:38 a.m. UTC | #24
On Mon, 30 Jan 2023 15:49:36 +0800, Jason Wang <jasowang@redhat.com> wrote:
> On Mon, Jan 30, 2023 at 1:32 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Mon, Jan 30, 2023 at 10:15:12AM +0800, Xuan Zhuo wrote:
> > > On Sun, 29 Jan 2023 07:15:47 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > On Sun, Jan 29, 2023 at 08:03:42PM +0800, Xuan Zhuo wrote:
> > > > > On Sun, 29 Jan 2023 06:57:29 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > On Sun, Jan 29, 2023 at 04:23:08PM +0800, Xuan Zhuo wrote:
> > > > > > > On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote:
> > > > > > > > > On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > > > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote:
> > > > > > > > > > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > > > > > > > > > >
> > > > > > > > > > > Before per-queue reset, we need to recover async tx resources. At this
> > > > > > > > > > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > > > > > > > > > new packets, so virtio_net_flush_tx() should check the current
> > > > > > > > > > > per-queue reset state.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > What does "at this time" mean here?
> > > > > > > > > > Do you in fact mean it's called from flush_or_purge_queued_packets?
> > > > > > > > >
> > > > > > > > > Yes
> > > > > > > > >
> > > > > > > > > virtio_queue_reset
> > > > > > > > >   k->queue_reset
> > > > > > > > >           virtio_net_queue_reset
> > > > > > > > >                   flush_or_purge_queued_packets
> > > > > > > > >                           qemu_flush_or_purge_queued_packets
> > > > > > > > >                                   .....
> > > > > > > > >                                   (callback) virtio_net_tx_complete
> > > > > > > > >                                           virtio_net_flush_tx <-- here send new packet. We need stop it.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Because it is inside the callback, I can't pass information through the stack. I
> > > > > > > > > originally thought it was a general situation, so I wanted to put it in
> > > > > > > > > struct VirtQueue.
> > > > > > > > >
> > > > > > > > > If it is not very suitable, it may be better to put it in VirtIONetQueue.
> > > > > > > > >
> > > > > > > > > Thanks.
> > > > > > > >
> > > > > > > > Hmm maybe. Another idea: isn't virtio_net_tx_complete called
> > > > > > > > with length 0 here? Are there other cases where length is 0?
> > > > > > > >
> > > > > > > >
> > > > > > > > > > What does the call stack look like?
> > > > > > > > > >
> > > > > > > > > > If yes introducing a vq state just so virtio_net_flush_tx
> > > > > > > > > > knows we are in the process of reset would be a bad idea.
> > > > > > > > > > We want something much more local, ideally on stack even ...
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > > > > > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > > > > > > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > > > > > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > > > > > > > > ---
> > > > > > > > > > >  hw/net/virtio-net.c | 3 ++-
> > > > > > > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > > > > > > >
> > > > > > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > > > > > > > > index 3ae909041a..fba6451a50 100644
> > > > > > > > > > > --- a/hw/net/virtio-net.c
> > > > > > > > > > > +++ b/hw/net/virtio-net.c
> > > > > > > > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > > > > > > > > > >      VirtQueueElement *elem;
> > > > > > > > > > >      int32_t num_packets = 0;
> > > > > > > > > > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > > > > > > > > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > > > > > > > > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > > > > > > > > > +        virtio_queue_reset_state(q->tx_vq)) {
> > > > > > > >
> > > > > > > > btw this sounds like you are asking it to reset some state.
> > > > > > > >
> > > > > > > > > > >          return num_packets;
> > > > > > > >
> > > > > > > > and then
> > > > > > > >
> > > > > > > >     ret = virtio_net_flush_tx(q);
> > > > > > > >     if (ret >= n->tx_burst)
> > > > > > > >
> > > > > > > >
> > > > > > > > will reschedule automatically won't it?
> > > > > > > >
> > > > > > > > also why check in virtio_net_flush_tx and not virtio_net_tx_complete?
> > > > > > >
> > > > > > > virtio_net_flush_tx may been called by timer.
>
> We stop timer/bh during device reset, do we need to do the same with vq reset?


Should I stop timer/bh?

Thanks.


>
> > > > > > >
> > > > > > > Thanks.
> > > > > >
> > > > > > timer won't run while flush_or_purge_queued_packets is in progress.
> > > > >
> > > > > Is timer not executed during the VMEXIT process? Otherwise, we still have to
> > > > > consider that after the flush_or_purge_queued_packets, this process before the
> > > > > structure is cleared.
> > > >
> > > >
> > > >
> > > > void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index)
> > > > {
> > > >     VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
> > > >
> > > >     if (k->queue_reset) {
> > > >         k->queue_reset(vdev, queue_index);
> > > >     }
> > > >
> > > >     __virtio_queue_reset(vdev, queue_index);
> > > > }
> > > >
> > > >
> > > > No timers do not run between  k->queue_reset and __virtio_queue_reset.
> > > >
> > > >
> > > > > Even if it can be processed in virtio_net_tx_complete, is there any good way?
> > > > > This is a callback, it is not convenient to pass the parameters.
> > > > >
> > > > > Thanks
> > > >
> > > >
> > > > How about checking that length is 0?
> > >
> > >
> > > I think that check length is not a good way. This modifys the semantics of 0.
>
> 0 seems to mean "purge" and
>
> > It is
> > > not friendly to the future maintenance. On the other hand, qemu_net_queue_purge()
> > > will pass 0, and this function is called by many places.
>
> That's exactly what we want actually, when do purge we don't need a flush?
>
> > >
> > > How about we add an api in queue.c to replace the sent_cb callback on queue?
> > >
> > > Thanks.
> >
> > OK I guess. Jason?
>
> Not sure, anything different from adding a check in
> virtio_net_tx_complete()? (assuming bh and timer is cancelled or
> deleted).
>
> Thanks
>
> >
> > >
> > > >
> > > > >
> > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > > > >      }
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > 2.32.0.3.g01195cf9f
> > > > > > > > > >
> > > > > > > >
> > > > > >
> > > >
> >
>
diff mbox series

Patch

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 3ae909041a..fba6451a50 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -2627,7 +2627,8 @@  static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
     VirtQueueElement *elem;
     int32_t num_packets = 0;
     int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
-    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
+    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
+        virtio_queue_reset_state(q->tx_vq)) {
         return num_packets;
     }