Message ID | 1435633611-14023-1-git-send-email-famz@redhat.com |
---|---|
State | New |
Headers | show |
On 06/30/2015 11:06 AM, Fam Zheng wrote: > virtio_net_receive still does the check by calling > virtio_net_can_receive, if the device or driver is not ready, the packet > is dropped. > > This is necessary because returning false from can_receive complicates > things: the peer would disable sending until we explicitly flush the > queue. > > Signed-off-by: Fam Zheng <famz@redhat.com> > --- > hw/net/virtio-net.c | 1 - > 1 file changed, 1 deletion(-) > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c > index d728233..dbef0d0 100644 > --- a/hw/net/virtio-net.c > +++ b/hw/net/virtio-net.c > @@ -1503,7 +1503,6 @@ static int virtio_net_load_device(VirtIODevice *vdev, QEMUFile *f, > static NetClientInfo net_virtio_info = { > .type = NET_CLIENT_OPTIONS_KIND_NIC, > .size = sizeof(NICState), > - .can_receive = virtio_net_can_receive, > .receive = virtio_net_receive, > .link_status_changed = virtio_net_set_link_status, > .query_rx_filter = virtio_net_query_rxfilter, A side effect of this patch is it will read and then drop packet is guest driver is no ok.
On Tue, Jun 30, 2015 at 04:35:24PM +0800, Jason Wang wrote: > On 06/30/2015 11:06 AM, Fam Zheng wrote: > > virtio_net_receive still does the check by calling > > virtio_net_can_receive, if the device or driver is not ready, the packet > > is dropped. > > > > This is necessary because returning false from can_receive complicates > > things: the peer would disable sending until we explicitly flush the > > queue. > > > > Signed-off-by: Fam Zheng <famz@redhat.com> > > --- > > hw/net/virtio-net.c | 1 - > > 1 file changed, 1 deletion(-) > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c > > index d728233..dbef0d0 100644 > > --- a/hw/net/virtio-net.c > > +++ b/hw/net/virtio-net.c > > @@ -1503,7 +1503,6 @@ static int virtio_net_load_device(VirtIODevice *vdev, QEMUFile *f, > > static NetClientInfo net_virtio_info = { > > .type = NET_CLIENT_OPTIONS_KIND_NIC, > > .size = sizeof(NICState), > > - .can_receive = virtio_net_can_receive, > > .receive = virtio_net_receive, > > .link_status_changed = virtio_net_set_link_status, > > .query_rx_filter = virtio_net_query_rxfilter, > > A side effect of this patch is it will read and then drop packet is > guest driver is no ok. I think that the semantics of .can_receive() and .receive() return values are currently incorrect in many NICs. They have .can_receive() functions that return false for conditions where .receive() would discard the packet. So what happens is that packets get queued when they should actually be discarded. The purpose of the flow control (queuing) mechanism is to tell the sender to hold off until the receiver has more rx buffers available. It's a short-term thing that doesn't included link down, rx disable, or NIC reset states. Therefore, I think this patch will not introduce a regression. It is adjusting the code to stop queuing packets when they should actually be dropped. Thoughts? Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
On Thu, Jul 02, 2015 at 01:46:26PM +0100, Stefan Hajnoczi wrote: > On Tue, Jun 30, 2015 at 04:35:24PM +0800, Jason Wang wrote: > > On 06/30/2015 11:06 AM, Fam Zheng wrote: > > > virtio_net_receive still does the check by calling > > > virtio_net_can_receive, if the device or driver is not ready, the packet > > > is dropped. > > > > > > This is necessary because returning false from can_receive complicates > > > things: the peer would disable sending until we explicitly flush the > > > queue. > > > > > > Signed-off-by: Fam Zheng <famz@redhat.com> > > > --- > > > hw/net/virtio-net.c | 1 - > > > 1 file changed, 1 deletion(-) > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c > > > index d728233..dbef0d0 100644 > > > --- a/hw/net/virtio-net.c > > > +++ b/hw/net/virtio-net.c > > > @@ -1503,7 +1503,6 @@ static int virtio_net_load_device(VirtIODevice *vdev, QEMUFile *f, > > > static NetClientInfo net_virtio_info = { > > > .type = NET_CLIENT_OPTIONS_KIND_NIC, > > > .size = sizeof(NICState), > > > - .can_receive = virtio_net_can_receive, > > > .receive = virtio_net_receive, > > > .link_status_changed = virtio_net_set_link_status, > > > .query_rx_filter = virtio_net_query_rxfilter, > > > > A side effect of this patch is it will read and then drop packet is > > guest driver is no ok. > > I think that the semantics of .can_receive() and .receive() return > values are currently incorrect in many NICs. They have .can_receive() > functions that return false for conditions where .receive() would > discard the packet. So what happens is that packets get queued when > they should actually be discarded. > > The purpose of the flow control (queuing) mechanism is to tell the > sender to hold off until the receiver has more rx buffers available. > It's a short-term thing that doesn't included link down, rx disable, or > NIC reset states. > > Therefore, I think this patch will not introduce a regression. It is > adjusting the code to stop queuing packets when they should actually be > dropped. > > Thoughts? > > Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> OK, but we do have transient out of buffers states too, and virtio_net_can_receive checks these as well. To me, it looks like we should change virtio_net_can_receive to avoid checking link down state, etc. But I don't see why we should remove it completely.
On Thu, 07/02 18:46, Michael S. Tsirkin wrote: > On Thu, Jul 02, 2015 at 01:46:26PM +0100, Stefan Hajnoczi wrote: > > On Tue, Jun 30, 2015 at 04:35:24PM +0800, Jason Wang wrote: > > > On 06/30/2015 11:06 AM, Fam Zheng wrote: > > > > virtio_net_receive still does the check by calling > > > > virtio_net_can_receive, if the device or driver is not ready, the packet > > > > is dropped. > > > > > > > > This is necessary because returning false from can_receive complicates > > > > things: the peer would disable sending until we explicitly flush the > > > > queue. > > > > > > > > Signed-off-by: Fam Zheng <famz@redhat.com> > > > > --- > > > > hw/net/virtio-net.c | 1 - > > > > 1 file changed, 1 deletion(-) > > > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c > > > > index d728233..dbef0d0 100644 > > > > --- a/hw/net/virtio-net.c > > > > +++ b/hw/net/virtio-net.c > > > > @@ -1503,7 +1503,6 @@ static int virtio_net_load_device(VirtIODevice *vdev, QEMUFile *f, > > > > static NetClientInfo net_virtio_info = { > > > > .type = NET_CLIENT_OPTIONS_KIND_NIC, > > > > .size = sizeof(NICState), > > > > - .can_receive = virtio_net_can_receive, > > > > .receive = virtio_net_receive, > > > > .link_status_changed = virtio_net_set_link_status, > > > > .query_rx_filter = virtio_net_query_rxfilter, > > > > > > A side effect of this patch is it will read and then drop packet is > > > guest driver is no ok. > > > > I think that the semantics of .can_receive() and .receive() return > > values are currently incorrect in many NICs. They have .can_receive() > > functions that return false for conditions where .receive() would > > discard the packet. So what happens is that packets get queued when > > they should actually be discarded. > > > > The purpose of the flow control (queuing) mechanism is to tell the > > sender to hold off until the receiver has more rx buffers available. > > It's a short-term thing that doesn't included link down, rx disable, or > > NIC reset states. > > > > Therefore, I think this patch will not introduce a regression. It is > > adjusting the code to stop queuing packets when they should actually be > > dropped. > > > > Thoughts? > > > > Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> > > OK, but we do have transient out of buffers states too, and > virtio_net_can_receive checks these as well. > > To me, it looks like we should change virtio_net_can_receive to > avoid checking link down state, etc. > But I don't see why we should remove it completely. OK, it makse sense to check buffer state. Fam
On Fri, 07/03 09:12, Fam Zheng wrote: > On Thu, 07/02 18:46, Michael S. Tsirkin wrote: > > On Thu, Jul 02, 2015 at 01:46:26PM +0100, Stefan Hajnoczi wrote: > > > On Tue, Jun 30, 2015 at 04:35:24PM +0800, Jason Wang wrote: > > > > On 06/30/2015 11:06 AM, Fam Zheng wrote: > > > > > virtio_net_receive still does the check by calling > > > > > virtio_net_can_receive, if the device or driver is not ready, the packet > > > > > is dropped. > > > > > > > > > > This is necessary because returning false from can_receive complicates > > > > > things: the peer would disable sending until we explicitly flush the > > > > > queue. > > > > > > > > > > Signed-off-by: Fam Zheng <famz@redhat.com> > > > > > --- > > > > > hw/net/virtio-net.c | 1 - > > > > > 1 file changed, 1 deletion(-) > > > > > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c > > > > > index d728233..dbef0d0 100644 > > > > > --- a/hw/net/virtio-net.c > > > > > +++ b/hw/net/virtio-net.c > > > > > @@ -1503,7 +1503,6 @@ static int virtio_net_load_device(VirtIODevice *vdev, QEMUFile *f, > > > > > static NetClientInfo net_virtio_info = { > > > > > .type = NET_CLIENT_OPTIONS_KIND_NIC, > > > > > .size = sizeof(NICState), > > > > > - .can_receive = virtio_net_can_receive, > > > > > .receive = virtio_net_receive, > > > > > .link_status_changed = virtio_net_set_link_status, > > > > > .query_rx_filter = virtio_net_query_rxfilter, > > > > > > > > A side effect of this patch is it will read and then drop packet is > > > > guest driver is no ok. > > > > > > I think that the semantics of .can_receive() and .receive() return > > > values are currently incorrect in many NICs. They have .can_receive() > > > functions that return false for conditions where .receive() would > > > discard the packet. So what happens is that packets get queued when > > > they should actually be discarded. > > > > > > The purpose of the flow control (queuing) mechanism is to tell the > > > sender to hold off until the receiver has more rx buffers available. > > > It's a short-term thing that doesn't included link down, rx disable, or > > > NIC reset states. > > > > > > Therefore, I think this patch will not introduce a regression. It is > > > adjusting the code to stop queuing packets when they should actually be > > > dropped. > > > > > > Thoughts? > > > > > > Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> > > > > OK, but we do have transient out of buffers states too, and > > virtio_net_can_receive checks these as well. I'm a bit confused, do you mean virtio_queue_ready()? That is buffer presence check, and the free buffer check virtio_net_has_buffers() is only in virtio_net_receive(). Am I missing things? Fam > > > > To me, it looks like we should change virtio_net_can_receive to > > avoid checking link down state, etc. > > But I don't see why we should remove it completely. > > OK, it makse sense to check buffer state. > > Fam >
On Fri, Jul 03, 2015 at 12:17:38PM +0800, Fam Zheng wrote: > On Fri, 07/03 09:12, Fam Zheng wrote: > > On Thu, 07/02 18:46, Michael S. Tsirkin wrote: > > > On Thu, Jul 02, 2015 at 01:46:26PM +0100, Stefan Hajnoczi wrote: > > > > On Tue, Jun 30, 2015 at 04:35:24PM +0800, Jason Wang wrote: > > > > > On 06/30/2015 11:06 AM, Fam Zheng wrote: > > > > > > virtio_net_receive still does the check by calling > > > > > > virtio_net_can_receive, if the device or driver is not ready, the packet > > > > > > is dropped. > > > > > > > > > > > > This is necessary because returning false from can_receive complicates > > > > > > things: the peer would disable sending until we explicitly flush the > > > > > > queue. > > > > > > > > > > > > Signed-off-by: Fam Zheng <famz@redhat.com> > > > > > > --- > > > > > > hw/net/virtio-net.c | 1 - > > > > > > 1 file changed, 1 deletion(-) > > > > > > > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c > > > > > > index d728233..dbef0d0 100644 > > > > > > --- a/hw/net/virtio-net.c > > > > > > +++ b/hw/net/virtio-net.c > > > > > > @@ -1503,7 +1503,6 @@ static int virtio_net_load_device(VirtIODevice *vdev, QEMUFile *f, > > > > > > static NetClientInfo net_virtio_info = { > > > > > > .type = NET_CLIENT_OPTIONS_KIND_NIC, > > > > > > .size = sizeof(NICState), > > > > > > - .can_receive = virtio_net_can_receive, > > > > > > .receive = virtio_net_receive, > > > > > > .link_status_changed = virtio_net_set_link_status, > > > > > > .query_rx_filter = virtio_net_query_rxfilter, > > > > > > > > > > A side effect of this patch is it will read and then drop packet is > > > > > guest driver is no ok. > > > > > > > > I think that the semantics of .can_receive() and .receive() return > > > > values are currently incorrect in many NICs. They have .can_receive() > > > > functions that return false for conditions where .receive() would > > > > discard the packet. So what happens is that packets get queued when > > > > they should actually be discarded. > > > > > > > > The purpose of the flow control (queuing) mechanism is to tell the > > > > sender to hold off until the receiver has more rx buffers available. > > > > It's a short-term thing that doesn't included link down, rx disable, or > > > > NIC reset states. > > > > > > > > Therefore, I think this patch will not introduce a regression. It is > > > > adjusting the code to stop queuing packets when they should actually be > > > > dropped. > > > > > > > > Thoughts? > > > > > > > > Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> > > > > > > OK, but we do have transient out of buffers states too, and > > > virtio_net_can_receive checks these as well. > > I'm a bit confused, do you mean virtio_queue_ready()? That is buffer presence > check, and the free buffer check virtio_net_has_buffers() is only in > virtio_net_receive(). Am I missing things? > > Fam Interesting. You are right. See also commit cdd5cc12ba8cf0c068da319370bdd3ba45eaf7ac. > > > > > > To me, it looks like we should change virtio_net_can_receive to > > > avoid checking link down state, etc. > > > But I don't see why we should remove it completely. > > > > OK, it makse sense to check buffer state. > > > > Fam > >
On 07/02/2015 08:46 PM, Stefan Hajnoczi wrote: > On Tue, Jun 30, 2015 at 04:35:24PM +0800, Jason Wang wrote: >> On 06/30/2015 11:06 AM, Fam Zheng wrote: >>> virtio_net_receive still does the check by calling >>> virtio_net_can_receive, if the device or driver is not ready, the packet >>> is dropped. >>> >>> This is necessary because returning false from can_receive complicates >>> things: the peer would disable sending until we explicitly flush the >>> queue. >>> >>> Signed-off-by: Fam Zheng <famz@redhat.com> >>> --- >>> hw/net/virtio-net.c | 1 - >>> 1 file changed, 1 deletion(-) >>> >>> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c >>> index d728233..dbef0d0 100644 >>> --- a/hw/net/virtio-net.c >>> +++ b/hw/net/virtio-net.c >>> @@ -1503,7 +1503,6 @@ static int virtio_net_load_device(VirtIODevice *vdev, QEMUFile *f, >>> static NetClientInfo net_virtio_info = { >>> .type = NET_CLIENT_OPTIONS_KIND_NIC, >>> .size = sizeof(NICState), >>> - .can_receive = virtio_net_can_receive, >>> .receive = virtio_net_receive, >>> .link_status_changed = virtio_net_set_link_status, >>> .query_rx_filter = virtio_net_query_rxfilter, >> A side effect of this patch is it will read and then drop packet is >> guest driver is no ok. > I think that the semantics of .can_receive() and .receive() return > values are currently incorrect in many NICs. They have .can_receive() > functions that return false for conditions where .receive() would > discard the packet. So what happens is that packets get queued when > they should actually be discarded. Yes, but they are bugs more or less. > > The purpose of the flow control (queuing) mechanism is to tell the > sender to hold off until the receiver has more rx buffers available. > It's a short-term thing that doesn't included link down, rx disable, or > NIC reset states. > > Therefore, I think this patch will not introduce a regression. It is > adjusting the code to stop queuing packets when they should actually be > dropped. > > Thoughts? I agree there's no functional issue. But it cause wasting of cpu cycles (consider guest is being flooded). Sometime it maybe even dangerous. For tap, we're probably ok since we have 756ae78b but for other backend, we don't. > > Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
On Mon, Jul 06, 2015 at 11:32:25AM +0800, Jason Wang wrote: > > > On 07/02/2015 08:46 PM, Stefan Hajnoczi wrote: > > On Tue, Jun 30, 2015 at 04:35:24PM +0800, Jason Wang wrote: > >> On 06/30/2015 11:06 AM, Fam Zheng wrote: > >>> virtio_net_receive still does the check by calling > >>> virtio_net_can_receive, if the device or driver is not ready, the packet > >>> is dropped. > >>> > >>> This is necessary because returning false from can_receive complicates > >>> things: the peer would disable sending until we explicitly flush the > >>> queue. > >>> > >>> Signed-off-by: Fam Zheng <famz@redhat.com> > >>> --- > >>> hw/net/virtio-net.c | 1 - > >>> 1 file changed, 1 deletion(-) > >>> > >>> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c > >>> index d728233..dbef0d0 100644 > >>> --- a/hw/net/virtio-net.c > >>> +++ b/hw/net/virtio-net.c > >>> @@ -1503,7 +1503,6 @@ static int virtio_net_load_device(VirtIODevice *vdev, QEMUFile *f, > >>> static NetClientInfo net_virtio_info = { > >>> .type = NET_CLIENT_OPTIONS_KIND_NIC, > >>> .size = sizeof(NICState), > >>> - .can_receive = virtio_net_can_receive, > >>> .receive = virtio_net_receive, > >>> .link_status_changed = virtio_net_set_link_status, > >>> .query_rx_filter = virtio_net_query_rxfilter, > >> A side effect of this patch is it will read and then drop packet is > >> guest driver is no ok. > > I think that the semantics of .can_receive() and .receive() return > > values are currently incorrect in many NICs. They have .can_receive() > > functions that return false for conditions where .receive() would > > discard the packet. So what happens is that packets get queued when > > they should actually be discarded. > > Yes, but they are bugs more or less. > > > > > The purpose of the flow control (queuing) mechanism is to tell the > > sender to hold off until the receiver has more rx buffers available. > > It's a short-term thing that doesn't included link down, rx disable, or > > NIC reset states. > > > > Therefore, I think this patch will not introduce a regression. It is > > adjusting the code to stop queuing packets when they should actually be > > dropped. > > > > Thoughts? > > I agree there's no functional issue. But it cause wasting of cpu cycles > (consider guest is being flooded). Sometime it maybe even dangerous. For > tap, we're probably ok since we have 756ae78b but for other backend, we > don't. If the guest uses iptables rules or other mechanisms to drop bogus packets the cost is even higher than discarding them at the QEMU layer. What's more is that if you're using link down as a DoS mitigation strategy then you might as well hot unplug the NIC. Stefan
On Mon, Jul 06, 2015 at 04:21:16PM +0100, Stefan Hajnoczi wrote: > On Mon, Jul 06, 2015 at 11:32:25AM +0800, Jason Wang wrote: > > > > > > On 07/02/2015 08:46 PM, Stefan Hajnoczi wrote: > > > On Tue, Jun 30, 2015 at 04:35:24PM +0800, Jason Wang wrote: > > >> On 06/30/2015 11:06 AM, Fam Zheng wrote: > > >>> virtio_net_receive still does the check by calling > > >>> virtio_net_can_receive, if the device or driver is not ready, the packet > > >>> is dropped. > > >>> > > >>> This is necessary because returning false from can_receive complicates > > >>> things: the peer would disable sending until we explicitly flush the > > >>> queue. > > >>> > > >>> Signed-off-by: Fam Zheng <famz@redhat.com> > > >>> --- > > >>> hw/net/virtio-net.c | 1 - > > >>> 1 file changed, 1 deletion(-) > > >>> > > >>> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c > > >>> index d728233..dbef0d0 100644 > > >>> --- a/hw/net/virtio-net.c > > >>> +++ b/hw/net/virtio-net.c > > >>> @@ -1503,7 +1503,6 @@ static int virtio_net_load_device(VirtIODevice *vdev, QEMUFile *f, > > >>> static NetClientInfo net_virtio_info = { > > >>> .type = NET_CLIENT_OPTIONS_KIND_NIC, > > >>> .size = sizeof(NICState), > > >>> - .can_receive = virtio_net_can_receive, > > >>> .receive = virtio_net_receive, > > >>> .link_status_changed = virtio_net_set_link_status, > > >>> .query_rx_filter = virtio_net_query_rxfilter, > > >> A side effect of this patch is it will read and then drop packet is > > >> guest driver is no ok. > > > I think that the semantics of .can_receive() and .receive() return > > > values are currently incorrect in many NICs. They have .can_receive() > > > functions that return false for conditions where .receive() would > > > discard the packet. So what happens is that packets get queued when > > > they should actually be discarded. > > > > Yes, but they are bugs more or less. > > > > > > > > The purpose of the flow control (queuing) mechanism is to tell the > > > sender to hold off until the receiver has more rx buffers available. > > > It's a short-term thing that doesn't included link down, rx disable, or > > > NIC reset states. > > > > > > Therefore, I think this patch will not introduce a regression. It is > > > adjusting the code to stop queuing packets when they should actually be > > > dropped. > > > > > > Thoughts? > > > > I agree there's no functional issue. But it cause wasting of cpu cycles > > (consider guest is being flooded). Sometime it maybe even dangerous. For > > tap, we're probably ok since we have 756ae78b but for other backend, we > > don't. > > If the guest uses iptables rules or other mechanisms to drop bogus > packets the cost is even higher than discarding them at the QEMU layer. > > What's more is that if you're using link down as a DoS mitigation > strategy then you might as well hot unplug the NIC. > > Stefan Frankly, I don't see the point of the patch. Is this supposed to be a bugfix? If so, there's should be a description about how to trigger the bug. Is this an optimization? If so there should be some numbers showing a gain.
On Mon, 07/06 20:09, Michael S. Tsirkin wrote: > On Mon, Jul 06, 2015 at 04:21:16PM +0100, Stefan Hajnoczi wrote: > > On Mon, Jul 06, 2015 at 11:32:25AM +0800, Jason Wang wrote: > > > > > > > > > On 07/02/2015 08:46 PM, Stefan Hajnoczi wrote: > > > > On Tue, Jun 30, 2015 at 04:35:24PM +0800, Jason Wang wrote: > > > >> On 06/30/2015 11:06 AM, Fam Zheng wrote: > > > >>> virtio_net_receive still does the check by calling > > > >>> virtio_net_can_receive, if the device or driver is not ready, the packet > > > >>> is dropped. > > > >>> > > > >>> This is necessary because returning false from can_receive complicates > > > >>> things: the peer would disable sending until we explicitly flush the > > > >>> queue. > > > >>> > > > >>> Signed-off-by: Fam Zheng <famz@redhat.com> > > > >>> --- > > > >>> hw/net/virtio-net.c | 1 - > > > >>> 1 file changed, 1 deletion(-) > > > >>> > > > >>> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c > > > >>> index d728233..dbef0d0 100644 > > > >>> --- a/hw/net/virtio-net.c > > > >>> +++ b/hw/net/virtio-net.c > > > >>> @@ -1503,7 +1503,6 @@ static int virtio_net_load_device(VirtIODevice *vdev, QEMUFile *f, > > > >>> static NetClientInfo net_virtio_info = { > > > >>> .type = NET_CLIENT_OPTIONS_KIND_NIC, > > > >>> .size = sizeof(NICState), > > > >>> - .can_receive = virtio_net_can_receive, > > > >>> .receive = virtio_net_receive, > > > >>> .link_status_changed = virtio_net_set_link_status, > > > >>> .query_rx_filter = virtio_net_query_rxfilter, > > > >> A side effect of this patch is it will read and then drop packet is > > > >> guest driver is no ok. > > > > I think that the semantics of .can_receive() and .receive() return > > > > values are currently incorrect in many NICs. They have .can_receive() > > > > functions that return false for conditions where .receive() would > > > > discard the packet. So what happens is that packets get queued when > > > > they should actually be discarded. > > > > > > Yes, but they are bugs more or less. > > > > > > > > > > > The purpose of the flow control (queuing) mechanism is to tell the > > > > sender to hold off until the receiver has more rx buffers available. > > > > It's a short-term thing that doesn't included link down, rx disable, or > > > > NIC reset states. > > > > > > > > Therefore, I think this patch will not introduce a regression. It is > > > > adjusting the code to stop queuing packets when they should actually be > > > > dropped. > > > > > > > > Thoughts? > > > > > > I agree there's no functional issue. But it cause wasting of cpu cycles > > > (consider guest is being flooded). Sometime it maybe even dangerous. For > > > tap, we're probably ok since we have 756ae78b but for other backend, we > > > don't. > > > > If the guest uses iptables rules or other mechanisms to drop bogus > > packets the cost is even higher than discarding them at the QEMU layer. > > > > What's more is that if you're using link down as a DoS mitigation > > strategy then you might as well hot unplug the NIC. > > > > Stefan > > > > Frankly, I don't see the point of the patch. Is this supposed to be a > bugfix? If so, there's should be a description about how to trigger the > bug. Is this an optimization? If so there should be some numbers > showing a gain. It's a bug fix, we are not flushing the queue when DIRVER_OK is being set or when buffer is becoming available (the virtio_net_can_receive conditions). Not an issue before a90a7425cf but since that the semantics is enforced. Fam
On Tue, Jul 07, 2015 at 08:53:59AM +0800, Fam Zheng wrote: > On Mon, 07/06 20:09, Michael S. Tsirkin wrote: > > On Mon, Jul 06, 2015 at 04:21:16PM +0100, Stefan Hajnoczi wrote: > > > On Mon, Jul 06, 2015 at 11:32:25AM +0800, Jason Wang wrote: > > > > > > > > > > > > On 07/02/2015 08:46 PM, Stefan Hajnoczi wrote: > > > > > On Tue, Jun 30, 2015 at 04:35:24PM +0800, Jason Wang wrote: > > > > >> On 06/30/2015 11:06 AM, Fam Zheng wrote: > > > > >>> virtio_net_receive still does the check by calling > > > > >>> virtio_net_can_receive, if the device or driver is not ready, the packet > > > > >>> is dropped. > > > > >>> > > > > >>> This is necessary because returning false from can_receive complicates > > > > >>> things: the peer would disable sending until we explicitly flush the > > > > >>> queue. > > > > >>> > > > > >>> Signed-off-by: Fam Zheng <famz@redhat.com> > > > > >>> --- > > > > >>> hw/net/virtio-net.c | 1 - > > > > >>> 1 file changed, 1 deletion(-) > > > > >>> > > > > >>> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c > > > > >>> index d728233..dbef0d0 100644 > > > > >>> --- a/hw/net/virtio-net.c > > > > >>> +++ b/hw/net/virtio-net.c > > > > >>> @@ -1503,7 +1503,6 @@ static int virtio_net_load_device(VirtIODevice *vdev, QEMUFile *f, > > > > >>> static NetClientInfo net_virtio_info = { > > > > >>> .type = NET_CLIENT_OPTIONS_KIND_NIC, > > > > >>> .size = sizeof(NICState), > > > > >>> - .can_receive = virtio_net_can_receive, > > > > >>> .receive = virtio_net_receive, > > > > >>> .link_status_changed = virtio_net_set_link_status, > > > > >>> .query_rx_filter = virtio_net_query_rxfilter, > > > > >> A side effect of this patch is it will read and then drop packet is > > > > >> guest driver is no ok. > > > > > I think that the semantics of .can_receive() and .receive() return > > > > > values are currently incorrect in many NICs. They have .can_receive() > > > > > functions that return false for conditions where .receive() would > > > > > discard the packet. So what happens is that packets get queued when > > > > > they should actually be discarded. > > > > > > > > Yes, but they are bugs more or less. > > > > > > > > > > > > > > The purpose of the flow control (queuing) mechanism is to tell the > > > > > sender to hold off until the receiver has more rx buffers available. > > > > > It's a short-term thing that doesn't included link down, rx disable, or > > > > > NIC reset states. > > > > > > > > > > Therefore, I think this patch will not introduce a regression. It is > > > > > adjusting the code to stop queuing packets when they should actually be > > > > > dropped. > > > > > > > > > > Thoughts? > > > > > > > > I agree there's no functional issue. But it cause wasting of cpu cycles > > > > (consider guest is being flooded). Sometime it maybe even dangerous. For > > > > tap, we're probably ok since we have 756ae78b but for other backend, we > > > > don't. > > > > > > If the guest uses iptables rules or other mechanisms to drop bogus > > > packets the cost is even higher than discarding them at the QEMU layer. > > > > > > What's more is that if you're using link down as a DoS mitigation > > > strategy then you might as well hot unplug the NIC. > > > > > > Stefan > > > > > > > > Frankly, I don't see the point of the patch. Is this supposed to be a > > bugfix? If so, there's should be a description about how to trigger the > > bug. Is this an optimization? If so there should be some numbers > > showing a gain. > > It's a bug fix, we are not flushing the queue when DIRVER_OK is being set or > when buffer is becoming available (the virtio_net_can_receive conditions). Not > an issue before a90a7425cf but since that the semantics is enforced. > > Fam I think the safest and obvious fix is to flush on DRIVER_OK then (unless vhost started). That might be 2.4 material.
On 07/06/2015 11:21 PM, Stefan Hajnoczi wrote: > On Mon, Jul 06, 2015 at 11:32:25AM +0800, Jason Wang wrote: >> >> On 07/02/2015 08:46 PM, Stefan Hajnoczi wrote: >>> On Tue, Jun 30, 2015 at 04:35:24PM +0800, Jason Wang wrote: >>>> On 06/30/2015 11:06 AM, Fam Zheng wrote: >>>>> virtio_net_receive still does the check by calling >>>>> virtio_net_can_receive, if the device or driver is not ready, the packet >>>>> is dropped. >>>>> >>>>> This is necessary because returning false from can_receive complicates >>>>> things: the peer would disable sending until we explicitly flush the >>>>> queue. >>>>> >>>>> Signed-off-by: Fam Zheng <famz@redhat.com> >>>>> --- >>>>> hw/net/virtio-net.c | 1 - >>>>> 1 file changed, 1 deletion(-) >>>>> >>>>> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c >>>>> index d728233..dbef0d0 100644 >>>>> --- a/hw/net/virtio-net.c >>>>> +++ b/hw/net/virtio-net.c >>>>> @@ -1503,7 +1503,6 @@ static int virtio_net_load_device(VirtIODevice *vdev, QEMUFile *f, >>>>> static NetClientInfo net_virtio_info = { >>>>> .type = NET_CLIENT_OPTIONS_KIND_NIC, >>>>> .size = sizeof(NICState), >>>>> - .can_receive = virtio_net_can_receive, >>>>> .receive = virtio_net_receive, >>>>> .link_status_changed = virtio_net_set_link_status, >>>>> .query_rx_filter = virtio_net_query_rxfilter, >>>> A side effect of this patch is it will read and then drop packet is >>>> guest driver is no ok. >>> I think that the semantics of .can_receive() and .receive() return >>> values are currently incorrect in many NICs. They have .can_receive() >>> functions that return false for conditions where .receive() would >>> discard the packet. So what happens is that packets get queued when >>> they should actually be discarded. >> Yes, but they are bugs more or less. >> >>> The purpose of the flow control (queuing) mechanism is to tell the >>> sender to hold off until the receiver has more rx buffers available. >>> It's a short-term thing that doesn't included link down, rx disable, or >>> NIC reset states. >>> >>> Therefore, I think this patch will not introduce a regression. It is >>> adjusting the code to stop queuing packets when they should actually be >>> dropped. >>> >>> Thoughts? >> I agree there's no functional issue. But it cause wasting of cpu cycles >> (consider guest is being flooded). Sometime it maybe even dangerous. For >> tap, we're probably ok since we have 756ae78b but for other backend, we >> don't. > If the guest uses iptables rules or other mechanisms to drop bogus > packets the cost is even higher than discarding them at the QEMU layer. But it was the choice of guest. > > What's more is that if you're using link down as a DoS mitigation > strategy then you might as well hot unplug the NIC. > > Stefan I think there're two problems for virtio-net: 1) mitigation method when guest driver is ok. For tx, we have either timer or bh, for rx and only for tap, we have 756ae78b. We probably need fixes for other backends. 2) when driver is not ok, the point is we should not poll the backend at all (I believe this is one of the main objects of main loop). Something like tap_can_send() and the commit that drops tap_can_send() all follow this rule. But this patch does not, we end up with: - driver is not ok or no driver, qemu keep reading and dropping packets. - driver is ok but not enough rx buffer, qemu will disable tap read poll. Which looks conflicted. We need fix this either in 2.4 or later and also for other NICs.
On Tue, Jul 07, 2015 at 04:45:41PM +0800, Jason Wang wrote: > > > On 07/06/2015 11:21 PM, Stefan Hajnoczi wrote: > > On Mon, Jul 06, 2015 at 11:32:25AM +0800, Jason Wang wrote: > >> > >> On 07/02/2015 08:46 PM, Stefan Hajnoczi wrote: > >>> On Tue, Jun 30, 2015 at 04:35:24PM +0800, Jason Wang wrote: > >>>> On 06/30/2015 11:06 AM, Fam Zheng wrote: > >>>>> virtio_net_receive still does the check by calling > >>>>> virtio_net_can_receive, if the device or driver is not ready, the packet > >>>>> is dropped. > >>>>> > >>>>> This is necessary because returning false from can_receive complicates > >>>>> things: the peer would disable sending until we explicitly flush the > >>>>> queue. > >>>>> > >>>>> Signed-off-by: Fam Zheng <famz@redhat.com> > >>>>> --- > >>>>> hw/net/virtio-net.c | 1 - > >>>>> 1 file changed, 1 deletion(-) > >>>>> > >>>>> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c > >>>>> index d728233..dbef0d0 100644 > >>>>> --- a/hw/net/virtio-net.c > >>>>> +++ b/hw/net/virtio-net.c > >>>>> @@ -1503,7 +1503,6 @@ static int virtio_net_load_device(VirtIODevice *vdev, QEMUFile *f, > >>>>> static NetClientInfo net_virtio_info = { > >>>>> .type = NET_CLIENT_OPTIONS_KIND_NIC, > >>>>> .size = sizeof(NICState), > >>>>> - .can_receive = virtio_net_can_receive, > >>>>> .receive = virtio_net_receive, > >>>>> .link_status_changed = virtio_net_set_link_status, > >>>>> .query_rx_filter = virtio_net_query_rxfilter, > >>>> A side effect of this patch is it will read and then drop packet is > >>>> guest driver is no ok. > >>> I think that the semantics of .can_receive() and .receive() return > >>> values are currently incorrect in many NICs. They have .can_receive() > >>> functions that return false for conditions where .receive() would > >>> discard the packet. So what happens is that packets get queued when > >>> they should actually be discarded. > >> Yes, but they are bugs more or less. > >> > >>> The purpose of the flow control (queuing) mechanism is to tell the > >>> sender to hold off until the receiver has more rx buffers available. > >>> It's a short-term thing that doesn't included link down, rx disable, or > >>> NIC reset states. > >>> > >>> Therefore, I think this patch will not introduce a regression. It is > >>> adjusting the code to stop queuing packets when they should actually be > >>> dropped. > >>> > >>> Thoughts? > >> I agree there's no functional issue. But it cause wasting of cpu cycles > >> (consider guest is being flooded). Sometime it maybe even dangerous. For > >> tap, we're probably ok since we have 756ae78b but for other backend, we > >> don't. > > If the guest uses iptables rules or other mechanisms to drop bogus > > packets the cost is even higher than discarding them at the QEMU layer. > > But it was the choice of guest. > > > > > What's more is that if you're using link down as a DoS mitigation > > strategy then you might as well hot unplug the NIC. > > > > Stefan > > I think there're two problems for virtio-net: > > 1) mitigation method when guest driver is ok. For tx, we have either > timer or bh, for rx and only for tap, we have 756ae78b. We probably need > fixes for other backends. I agree. It's not directly related to the change Fam is proposing though. > 2) when driver is not ok, the point is we should not poll the backend at > all (I believe this is one of the main objects of main loop). Something > like tap_can_send() and the commit that drops tap_can_send() all follow > this rule. But this patch does not, we end up with: > > - driver is not ok or no driver, qemu keep reading and dropping packets. It's correct from a functionality perspective. When rx is disabled or the link is down, packets should be dropped. From a performance perspective I agree it would be better to drop packets earlier in the stack. If you want tap to drop packets efficiently during link down/receive disabled/etc: What's the best way to disable tap rx so the host kernel drops packets instead of queuing them? > - driver is ok but not enough rx buffer, qemu will disable tap read poll. QEMU relies on the kernel's socket receive buffer or netif receive queue to hold incoming packets. When QEMU enables tap read poll again it receives the queued packets. Queuing is necessary so USB net emulation (the rx buffer is only 1 packet!) and maybe slirp work. I think real NICs drop packets when the rx ring is exhausted, but we need to keep the queuing behavior unless someone proposes a way to eliminate it.
On 07/08/2015 06:50 PM, Stefan Hajnoczi wrote: > On Tue, Jul 07, 2015 at 04:45:41PM +0800, Jason Wang wrote: >> >> On 07/06/2015 11:21 PM, Stefan Hajnoczi wrote: >>> On Mon, Jul 06, 2015 at 11:32:25AM +0800, Jason Wang wrote: >>>> On 07/02/2015 08:46 PM, Stefan Hajnoczi wrote: >>>>> On Tue, Jun 30, 2015 at 04:35:24PM +0800, Jason Wang wrote: >>>>>> On 06/30/2015 11:06 AM, Fam Zheng wrote: >>>>>>> virtio_net_receive still does the check by calling >>>>>>> virtio_net_can_receive, if the device or driver is not ready, the packet >>>>>>> is dropped. >>>>>>> >>>>>>> This is necessary because returning false from can_receive complicates >>>>>>> things: the peer would disable sending until we explicitly flush the >>>>>>> queue. >>>>>>> >>>>>>> Signed-off-by: Fam Zheng <famz@redhat.com> >>>>>>> --- >>>>>>> hw/net/virtio-net.c | 1 - >>>>>>> 1 file changed, 1 deletion(-) >>>>>>> >>>>>>> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c >>>>>>> index d728233..dbef0d0 100644 >>>>>>> --- a/hw/net/virtio-net.c >>>>>>> +++ b/hw/net/virtio-net.c >>>>>>> @@ -1503,7 +1503,6 @@ static int virtio_net_load_device(VirtIODevice *vdev, QEMUFile *f, >>>>>>> static NetClientInfo net_virtio_info = { >>>>>>> .type = NET_CLIENT_OPTIONS_KIND_NIC, >>>>>>> .size = sizeof(NICState), >>>>>>> - .can_receive = virtio_net_can_receive, >>>>>>> .receive = virtio_net_receive, >>>>>>> .link_status_changed = virtio_net_set_link_status, >>>>>>> .query_rx_filter = virtio_net_query_rxfilter, >>>>>> A side effect of this patch is it will read and then drop packet is >>>>>> guest driver is no ok. >>>>> I think that the semantics of .can_receive() and .receive() return >>>>> values are currently incorrect in many NICs. They have .can_receive() >>>>> functions that return false for conditions where .receive() would >>>>> discard the packet. So what happens is that packets get queued when >>>>> they should actually be discarded. >>>> Yes, but they are bugs more or less. >>>> >>>>> The purpose of the flow control (queuing) mechanism is to tell the >>>>> sender to hold off until the receiver has more rx buffers available. >>>>> It's a short-term thing that doesn't included link down, rx disable, or >>>>> NIC reset states. >>>>> >>>>> Therefore, I think this patch will not introduce a regression. It is >>>>> adjusting the code to stop queuing packets when they should actually be >>>>> dropped. >>>>> >>>>> Thoughts? >>>> I agree there's no functional issue. But it cause wasting of cpu cycles >>>> (consider guest is being flooded). Sometime it maybe even dangerous. For >>>> tap, we're probably ok since we have 756ae78b but for other backend, we >>>> don't. >>> If the guest uses iptables rules or other mechanisms to drop bogus >>> packets the cost is even higher than discarding them at the QEMU layer. >> But it was the choice of guest. >> >>> What's more is that if you're using link down as a DoS mitigation >>> strategy then you might as well hot unplug the NIC. >>> >>> Stefan >> I think there're two problems for virtio-net: >> >> 1) mitigation method when guest driver is ok. For tx, we have either >> timer or bh, for rx and only for tap, we have 756ae78b. We probably need >> fixes for other backends. > I agree. It's not directly related to the change Fam is proposing > though. > >> 2) when driver is not ok, the point is we should not poll the backend at >> all (I believe this is one of the main objects of main loop). Something >> like tap_can_send() and the commit that drops tap_can_send() all follow >> this rule. But this patch does not, we end up with: >> >> - driver is not ok or no driver, qemu keep reading and dropping packets. > It's correct from a functionality perspective. When rx is disabled or > the link is down, packets should be dropped. > > From a performance perspective I agree it would be better to drop > packets earlier in the stack. > > If you want tap to drop packets efficiently during link down/receive > disabled/etc: > > What's the best way to disable tap rx so the host kernel drops packets > instead of queuing them? There's no such API currently especially for a unprivileged qemu. But tuntap will drop the packet if it exceeds tx_queue_length. This looks still much better than current behaviour. > >> - driver is ok but not enough rx buffer, qemu will disable tap read poll. > QEMU relies on the kernel's socket receive buffer or netif receive > queue to hold incoming packets. When QEMU enables tap read poll again > it receives the queued packets. > > Queuing is necessary so USB net emulation (the rx buffer is only 1 > packet!) and maybe slirp work. Yes. > > I think real NICs drop packets when the rx ring is exhausted, but we > need to keep the queuing behavior unless someone proposes a way to > eliminate it. Yes, queueing seems useless if backend can do it. So I'm ok with the patch, and we can optimize in the future. Thanks
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index d728233..dbef0d0 100644 --- a/hw/net/virtio-net.c +++ b/hw/net/virtio-net.c @@ -1503,7 +1503,6 @@ static int virtio_net_load_device(VirtIODevice *vdev, QEMUFile *f, static NetClientInfo net_virtio_info = { .type = NET_CLIENT_OPTIONS_KIND_NIC, .size = sizeof(NICState), - .can_receive = virtio_net_can_receive, .receive = virtio_net_receive, .link_status_changed = virtio_net_set_link_status, .query_rx_filter = virtio_net_query_rxfilter,
virtio_net_receive still does the check by calling virtio_net_can_receive, if the device or driver is not ready, the packet is dropped. This is necessary because returning false from can_receive complicates things: the peer would disable sending until we explicitly flush the queue. Signed-off-by: Fam Zheng <famz@redhat.com> --- hw/net/virtio-net.c | 1 - 1 file changed, 1 deletion(-)