Message ID | 20100722064157.GA25913@gondor.apana.org.au |
---|---|
State | Superseded, archived |
Delegated to: | David Miller |
Headers | show |
On Thu, 2010-07-22 at 14:41 +0800, Herbert Xu wrote: > { > struct macvtap_queue *q = macvtap_get_queue(dev, skb); > if (!q) > - return -ENOLINK; > + goto drop; > + > + if (skb_queue_len(&q->sk.sk_receive_queue) >= > dev->tx_queue_len) > + goto drop; > Do we need to orphan skb here, just like tun? > skb_queue_tail(&q->sk.sk_receive_queue, skb); > wake_up_interruptible_poll(sk_sleep(&q->sk), POLLIN | > POLLRDNORM | POLLRDBAND); > - return 0; > + return NET_RX_SUCCESS; > + > +drop: Do we need to increase dropped++ counter here to let user know there are packets dropped? > + kfree_skb(skb); > + return NET_RX_DROP; > } > > -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Jul 22, 2010 at 08:59:58AM -0700, Shirley Ma wrote: > On Thu, 2010-07-22 at 14:41 +0800, Herbert Xu wrote: > > { > > struct macvtap_queue *q = macvtap_get_queue(dev, skb); > > if (!q) > > - return -ENOLINK; > > + goto drop; > > + > > + if (skb_queue_len(&q->sk.sk_receive_queue) >= > > dev->tx_queue_len) > > + goto drop; > > > > Do we need to orphan skb here, just like tun? We could, but that is orthogonal to the problem at hand so feel free to do that in another patch. > > skb_queue_tail(&q->sk.sk_receive_queue, skb); > > wake_up_interruptible_poll(sk_sleep(&q->sk), POLLIN | > > POLLRDNORM | POLLRDBAND); > > - return 0; > > + return NET_RX_SUCCESS; > > + > > +drop: > > Do we need to increase dropped++ counter here to let user know there are > packets dropped? The caller is supposed to handle this. Cheers,
From: Herbert Xu <herbert@gondor.apana.org.au> Date: Fri, 23 Jul 2010 00:07:31 +0800 > On Thu, Jul 22, 2010 at 08:59:58AM -0700, Shirley Ma wrote: >> On Thu, 2010-07-22 at 14:41 +0800, Herbert Xu wrote: >> > { >> > struct macvtap_queue *q = macvtap_get_queue(dev, skb); >> > if (!q) >> > - return -ENOLINK; >> > + goto drop; >> > + >> > + if (skb_queue_len(&q->sk.sk_receive_queue) >= >> > dev->tx_queue_len) >> > + goto drop; >> > >> >> Do we need to orphan skb here, just like tun? > > We could, but that is orthogonal to the problem at hand so feel > free to do that in another patch. These days, the stack pre-orphans all packets sent to ->ndo_start_xmit() in dev_hard_start_xmit() as long as socket based TX timestamping is not active for the packet. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thursday 22 July 2010, David Miller wrote: > From: Herbert Xu <herbert@gondor.apana.org.au> > Date: Fri, 23 Jul 2010 00:07:31 +0800 > > > On Thu, Jul 22, 2010 at 08:59:58AM -0700, Shirley Ma wrote: > >> On Thu, 2010-07-22 at 14:41 +0800, Herbert Xu wrote: > >> > { > >> > struct macvtap_queue *q = macvtap_get_queue(dev, skb); > >> > if (!q) > >> > - return -ENOLINK; > >> > + goto drop; > >> > + > >> > + if (skb_queue_len(&q->sk.sk_receive_queue) >= > >> > dev->tx_queue_len) > >> > + goto drop; > >> > > >> > >> Do we need to orphan skb here, just like tun? > > > > We could, but that is orthogonal to the problem at hand so feel > > free to do that in another patch. > > These days, the stack pre-orphans all packets sent to ->ndo_start_xmit() > in dev_hard_start_xmit() as long as socket based TX timestamping is not > active for the packet. But this is the receive path, not transmit, so a packet coming from an external NIC never goes through dev_hard_start_xmit. Arnd -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Jul 23, 2010 at 09:28:08AM +0200, Arnd Bergmann wrote: > > But this is the receive path, not transmit, so a packet coming from an > external NIC never goes through dev_hard_start_xmit. In that case skb->sk is probably coming from tuntap, which means that orphaning the skb is not necessarily a good thing. Cheers,
diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c index a8a94e2..488d3b9 100644 --- a/drivers/net/macvtap.c +++ b/drivers/net/macvtap.c @@ -180,11 +180,18 @@ static int macvtap_forward(struct net_device *dev, struct sk_buff *skb) { struct macvtap_queue *q = macvtap_get_queue(dev, skb); if (!q) - return -ENOLINK; + goto drop; + + if (skb_queue_len(&q->sk.sk_receive_queue) >= dev->tx_queue_len) + goto drop; skb_queue_tail(&q->sk.sk_receive_queue, skb); wake_up_interruptible_poll(sk_sleep(&q->sk), POLLIN | POLLRDNORM | POLLRDBAND); - return 0; + return NET_RX_SUCCESS; + +drop: + kfree_skb(skb); + return NET_RX_DROP; } /*
Hi: macvtap: Limit packet queue length Mark Wagner reported OOM symptoms when sending UDP traffic over a macvtap link to a kvm receiver. This appears to be caused by the fact that macvtap packet queues are unlimited in length. This means that if the receiver can't keep up with the rate of flow, then we will hit OOM. Of course it gets worse if the OOM killer then decides to kill the receiver. This patch imposes a cap on the packet queue length, in the same way as the tuntap driver, using the device TX queue length. Please note that macvtap currently has no way of giving congestion notification, that means the software device TX queue cannot be used and packets will always be dropped once the macvtap driver queue fills up. This shouldn't be a great problem for the scenario where macvtap is used to feed a kvm receiver, as the traffic is most likely external in origin so congestion notification can't be applied anyway. Of course, if anybody decides to complain about guest-to-guest UDP packet loss down the track, then we may have to revisit this. Incidentally, this patch also fixes a real memory leak when macvtap_get_queue fails. Reported-by: Mark Wagner <mwagner@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Cheers,