Message ID | 1332176549-30960-2-git-send-email-haiyangz@microsoft.com |
---|---|
State | Rejected, archived |
Delegated to: | David Miller |
Headers | show |
On Mon, 2012-03-19 at 10:02 -0700, Haiyang Zhang wrote: > Instead of dropping the packet, we keep the skb buffer, and return > NETDEV_TX_BUSY to let upper layer retry send. This will not cause > endless loop, because the host is taking data away from ring buffer. > > Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> > Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> > --- > drivers/net/hyperv/netvsc_drv.c | 5 +---- > 1 files changed, 1 insertions(+), 4 deletions(-) > > diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c > index 2517d20..dd29478 100644 > --- a/drivers/net/hyperv/netvsc_drv.c > +++ b/drivers/net/hyperv/netvsc_drv.c > @@ -223,13 +223,10 @@ static int netvsc_start_xmit(struct sk_buff *skb, struct net_device *net) > net->stats.tx_bytes += skb->len; > net->stats.tx_packets++; > } else { > - /* we are shutting down or bus overloaded, just drop packet */ > - net->stats.tx_dropped++; > kfree(packet); > - dev_kfree_skb_any(skb); > } > > - return NETDEV_TX_OK; > + return ret ? NETDEV_TX_BUSY : NETDEV_TX_OK; > } > > /* Thats simply not true at all. A start_xmit() cannot do that. TX_BUSY should never be returned at all, its a deprecated code, for pretty good reasons. (assuming queue is not stopped) Try this on a machine with one CPU, I am pretty sure this can trigger complete freezes. Once softirq loops in your start_xmit(), how do you think one process can help you now ? -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> -----Original Message----- > From: Eric Dumazet [mailto:eric.dumazet@gmail.com] > Sent: Monday, March 19, 2012 1:12 PM > To: Haiyang Zhang > Cc: KY Srinivasan; davem@davemloft.net; netdev@vger.kernel.org; linux- > kernel@vger.kernel.org; devel@linuxdriverproject.org > Subject: Re: [PATCH 1/1] net/hyperv: Fix the code handling tx busy > > On Mon, 2012-03-19 at 10:02 -0700, Haiyang Zhang wrote: > > Instead of dropping the packet, we keep the skb buffer, and return > > NETDEV_TX_BUSY to let upper layer retry send. This will not cause > > endless loop, because the host is taking data away from ring buffer. > > > > Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> > > Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> > > --- > > drivers/net/hyperv/netvsc_drv.c | 5 +---- > > 1 files changed, 1 insertions(+), 4 deletions(-) > > > > diff --git a/drivers/net/hyperv/netvsc_drv.c > > b/drivers/net/hyperv/netvsc_drv.c index 2517d20..dd29478 100644 > > --- a/drivers/net/hyperv/netvsc_drv.c > > +++ b/drivers/net/hyperv/netvsc_drv.c > > @@ -223,13 +223,10 @@ static int netvsc_start_xmit(struct sk_buff *skb, > struct net_device *net) > > net->stats.tx_bytes += skb->len; > > net->stats.tx_packets++; > > } else { > > - /* we are shutting down or bus overloaded, just drop packet > */ > > - net->stats.tx_dropped++; > > kfree(packet); > > - dev_kfree_skb_any(skb); > > } > > > > - return NETDEV_TX_OK; > > + return ret ? NETDEV_TX_BUSY : NETDEV_TX_OK; > > } > > > > /* > > Thats simply not true at all. > > A start_xmit() cannot do that. > > TX_BUSY should never be returned at all, its a deprecated code, for pretty > good reasons. (assuming queue is not stopped) We actually stop queue when the ring buffer is busy, see the code in netvsc.c > Try this on a machine with one CPU, I am pretty sure this can trigger > complete freezes. I have tested with one CPU. After NETDEV_TX_BUSY is returned, the Linux guest OS continues to respond without any problem. Thanks, - Haiyang
On Mon, 19 Mar 2012 10:11:58 -0700 Eric Dumazet <eric.dumazet@gmail.com> wrote: > On Mon, 2012-03-19 at 10:02 -0700, Haiyang Zhang wrote: > > Instead of dropping the packet, we keep the skb buffer, and return > > NETDEV_TX_BUSY to let upper layer retry send. This will not cause > > endless loop, because the host is taking data away from ring buffer. > > > > Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> > > Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> > > --- > > drivers/net/hyperv/netvsc_drv.c | 5 +---- > > 1 files changed, 1 insertions(+), 4 deletions(-) > > > > diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c > > index 2517d20..dd29478 100644 > > --- a/drivers/net/hyperv/netvsc_drv.c > > +++ b/drivers/net/hyperv/netvsc_drv.c > > @@ -223,13 +223,10 @@ static int netvsc_start_xmit(struct sk_buff *skb, struct net_device *net) > > net->stats.tx_bytes += skb->len; > > net->stats.tx_packets++; > > } else { > > - /* we are shutting down or bus overloaded, just drop packet */ > > - net->stats.tx_dropped++; > > kfree(packet); > > - dev_kfree_skb_any(skb); > > } > > > > - return NETDEV_TX_OK; > > + return ret ? NETDEV_TX_BUSY : NETDEV_TX_OK; > > } > > > > /* > > Thats simply not true at all. > > A start_xmit() cannot do that. > > TX_BUSY should never be returned at all, its a deprecated code, for > pretty good reasons. (assuming queue is not stopped) > > Try this on a machine with one CPU, I am pretty sure this can trigger > complete freezes. > > Once softirq loops in your start_xmit(), how do you think one process > can help you now ? Eric is right, look how devices with real physical rings work. They test for space left at end of start xmit and stop the transmit queue with netif_stop_queue. The transmit done code then re-enables when enough space is netif_wake_queue. Think of it as classic high/low water mark on a FIFO. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> -----Original Message----- > From: Stephen Hemminger [mailto:shemminger@vyatta.com] > Sent: Monday, March 19, 2012 1:49 PM > To: Eric Dumazet > Cc: Haiyang Zhang; KY Srinivasan; davem@davemloft.net; > netdev@vger.kernel.org; linux-kernel@vger.kernel.org; > devel@linuxdriverproject.org > Subject: Re: [PATCH 1/1] net/hyperv: Fix the code handling tx busy > > On Mon, 19 Mar 2012 10:11:58 -0700 > Eric Dumazet <eric.dumazet@gmail.com> wrote: > > > On Mon, 2012-03-19 at 10:02 -0700, Haiyang Zhang wrote: > > > Instead of dropping the packet, we keep the skb buffer, and return > > > NETDEV_TX_BUSY to let upper layer retry send. This will not cause > > > endless loop, because the host is taking data away from ring buffer. > > > > > > Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> > > > Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> > > > --- > > > drivers/net/hyperv/netvsc_drv.c | 5 +---- > > > 1 files changed, 1 insertions(+), 4 deletions(-) > > > > > > diff --git a/drivers/net/hyperv/netvsc_drv.c > > > b/drivers/net/hyperv/netvsc_drv.c index 2517d20..dd29478 100644 > > > --- a/drivers/net/hyperv/netvsc_drv.c > > > +++ b/drivers/net/hyperv/netvsc_drv.c > > > @@ -223,13 +223,10 @@ static int netvsc_start_xmit(struct sk_buff *skb, > struct net_device *net) > > > net->stats.tx_bytes += skb->len; > > > net->stats.tx_packets++; > > > } else { > > > - /* we are shutting down or bus overloaded, just drop packet > */ > > > - net->stats.tx_dropped++; > > > kfree(packet); > > > - dev_kfree_skb_any(skb); > > > } > > > > > > - return NETDEV_TX_OK; > > > + return ret ? NETDEV_TX_BUSY : NETDEV_TX_OK; > > > } > > > > > > /* > > > > Thats simply not true at all. > > > > A start_xmit() cannot do that. > > > > TX_BUSY should never be returned at all, its a deprecated code, for > > pretty good reasons. (assuming queue is not stopped) > > > > Try this on a machine with one CPU, I am pretty sure this can trigger > > complete freezes. > > > > Once softirq loops in your start_xmit(), how do you think one process > > can help you now ? > > Eric is right, look how devices with real physical rings work. > They test for space left at end of start xmit and stop the transmit queue with > netif_stop_queue. The transmit done code then re-enables when enough > space is netif_wake_queue. Think of it as classic high/low water mark on a > FIFO. As in my previous reply to Eric -- We actually stop queue when the ring buffer is busy, see the code in netvsc.c I have tested with one CPU. After NETDEV_TX_BUSY is returned, the Linux guest OS continues to respond without any problem. Thanks, - Haiyang -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 2012-03-19 at 17:46 +0000, Haiyang Zhang wrote: > We actually stop queue when the ring buffer is busy, see the code in netvsc.c Then you dont need NETDEV_TX_BUSY at all. When you used whole tx slots, you stop the queue, so start_xmit() wont be called (and you wont recover from this useless call with NETDEV_TX_BUSY) > > > Try this on a machine with one CPU, I am pretty sure this can trigger > > complete freezes. > > I have tested with one CPU. After NETDEV_TX_BUSY is returned, the Linux > guest OS continues to respond without any problem. Problem is you might have used several billions cycles/instructions without notice. Thats a busy loop and you assume consumer can empty som tx slots while you're busy looping. Thats pretty lazy. This path is actually hard to test. In fact most of the time its probably never hit at all. Some NETDEV_TX_BUSY bugs are in the code since ages and nobody complained. Thats not a reason to add new ones. See recents commits on this subject : Bug never triggered but it was here fir sure. http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=b8fbaef586176f6abe0eb7887ddae66e99898b79 -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 2012-03-19 at 17:50 +0000, Haiyang Zhang wrote: > As in my previous reply to Eric -- > We actually stop queue when the ring buffer is busy, see the code in netvsc.c > > I have tested with one CPU. After NETDEV_TX_BUSY is returned, the Linux guest OS > continues to respond without any problem. Then something is wrong somewhere. Dont hide a bug adding a trick. If you ever return NETDEV_TX_BUSY from start_xmit(), then you MUST call netif_tx_stop_queue() as well right before. I believe I already told this before... -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> -----Original Message----- > From: Eric Dumazet [mailto:eric.dumazet@gmail.com] > Sent: Monday, March 19, 2012 2:31 PM > To: Haiyang Zhang > Cc: Stephen Hemminger; KY Srinivasan; davem@davemloft.net; > netdev@vger.kernel.org; linux-kernel@vger.kernel.org; > devel@linuxdriverproject.org > Subject: RE: [PATCH 1/1] net/hyperv: Fix the code handling tx busy > > On Mon, 2012-03-19 at 17:50 +0000, Haiyang Zhang wrote: > > > As in my previous reply to Eric -- > > We actually stop queue when the ring buffer is busy, see the code in > > netvsc.c > > > > I have tested with one CPU. After NETDEV_TX_BUSY is returned, the > > Linux guest OS continues to respond without any problem. > > Then something is wrong somewhere. > > Dont hide a bug adding a trick. > > If you ever return NETDEV_TX_BUSY from start_xmit(), then you MUST call > netif_tx_stop_queue() as well right before. Yes, we called the stop_queue before returning NETDEV_TX_BUSY. The stop_queue was called in the function netvsc_send() in file netvsc.c, then it returns to rndis_filter_send(), which returns to netvsc_start_xmit() in file netvsc_drv.c. So the NETDEV_TX_BUSY is indeed returned AFTER queue is stopped. Thanks, - Haiyang
On Mon, 2012-03-19 at 19:17 +0000, Haiyang Zhang wrote: > Yes, we called the stop_queue before returning NETDEV_TX_BUSY. > > The stop_queue was called in the function netvsc_send() in file > netvsc.c, then it returns to rndis_filter_send(), which returns to > netvsc_start_xmit() in file netvsc_drv.c. So the NETDEV_TX_BUSY is > indeed returned AFTER queue is stopped. > Thats should be in your changelog, so that next time, reviewers dont have to spend their time to check you did it right, especially when start_xmit() code is not self contained or at least in a single file. Each time we see a NETDEV_TX_BUSY in a patch, this is a sign of a possible problem. Your initial changelog was : Instead of dropping the packet, we keep the skb buffer, and return NETDEV_TX_BUSY to let upper layer retry send. This will not cause endless loop, because the host is taking data away from ring buffer. And this is the typical message that doesnt explain why its safe. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> -----Original Message----- > From: Eric Dumazet [mailto:eric.dumazet@gmail.com] > Sent: Monday, March 19, 2012 4:47 PM > To: Haiyang Zhang > Cc: Stephen Hemminger; KY Srinivasan; davem@davemloft.net; > netdev@vger.kernel.org; linux-kernel@vger.kernel.org; > devel@linuxdriverproject.org > Subject: RE: [PATCH 1/1] net/hyperv: Fix the code handling tx busy > > On Mon, 2012-03-19 at 19:17 +0000, Haiyang Zhang wrote: > > > Yes, we called the stop_queue before returning NETDEV_TX_BUSY. > > > > The stop_queue was called in the function netvsc_send() in file > > netvsc.c, then it returns to rndis_filter_send(), which returns to > > netvsc_start_xmit() in file netvsc_drv.c. So the NETDEV_TX_BUSY is > > indeed returned AFTER queue is stopped. > > > > Thats should be in your changelog, so that next time, reviewers dont have to > spend their time to check you did it right, especially when > start_xmit() code is not self contained or at least in a single file. > > Each time we see a NETDEV_TX_BUSY in a patch, this is a sign of a possible > problem. > > Your initial changelog was : > > Instead of dropping the packet, we keep the skb buffer, and return > NETDEV_TX_BUSY to let upper layer retry send. This will not cause endless > loop, because the host is taking data away from ring buffer. > > And this is the typical message that doesnt explain why its safe. Thanks for your time, I will re-submit the patch with the explanation in the change log. Thanks, - Haiyang
diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c index 2517d20..dd29478 100644 --- a/drivers/net/hyperv/netvsc_drv.c +++ b/drivers/net/hyperv/netvsc_drv.c @@ -223,13 +223,10 @@ static int netvsc_start_xmit(struct sk_buff *skb, struct net_device *net) net->stats.tx_bytes += skb->len; net->stats.tx_packets++; } else { - /* we are shutting down or bus overloaded, just drop packet */ - net->stats.tx_dropped++; kfree(packet); - dev_kfree_skb_any(skb); } - return NETDEV_TX_OK; + return ret ? NETDEV_TX_BUSY : NETDEV_TX_OK; } /*