Message ID | 1267817699.30393.8.camel@c-dwalke-linux.qualcomm.com
---|---
State | Not Applicable, archived
Delegated to | David Miller
On Fri, 05 Mar 2010 11:34:59 -0800 Daniel Walker <dwalker@codeaurora.org> wrote:

> Fixes a race condition on the networking receive path that causes all
> received packets to be dropped after 15-60 minutes of heavy network usage.
> Function process_backlog() empties the receive queue, re-enables
> interrupts, then "completes" the softIRQ. This provides a time window for
> netif_rx() to execute (in IRQ context) and enqueue a received packet
> without re-scheduling the softIRQ. After this, the receive queue is never
> processed and the system eventually begins to drop all received packets.

I wonder why this hasn't shown up before? Where exactly is the window between empty process_backlog and netif_rx? Maybe it is ARM specific behavior of softirq?
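[Editor's note: to make the window concrete, here is a condensed sketch of the 2.6.29 process_backlog() empty-queue path, paraphrased from memory of net/core/dev.c rather than quoted verbatim (GRO details omitted); the comments mark the race the patch describes, and napi_schedule_prep() below shows why the IRQ-context netif_rx() fails to re-raise the softirq.]

/*
 * Condensed sketch (paraphrase, not the verbatim 2.6.29 source) of
 * the process_backlog() ordering that opens the window.
 */
static int process_backlog(struct napi_struct *napi, int quota)
{
	struct softnet_data *queue = &__get_cpu_var(softnet_data);
	struct sk_buff *skb;
	int work = 0;

	do {
		local_irq_disable();
		skb = __skb_dequeue(&queue->input_pkt_queue);
		if (!skb) {
			local_irq_enable();
			/*
			 * Window: an IRQ can fire here and netif_rx() can
			 * enqueue a packet.  NAPI_STATE_SCHED is still set,
			 * so netif_rx()'s napi_schedule() is a no-op and no
			 * softirq is raised for the new packet.
			 */
			napi_complete(napi);
			/*
			 * SCHED is now clear, the queue is non-empty, and
			 * nothing ever polls it again; once it fills to
			 * netdev_max_backlog, every packet is dropped.
			 */
			break;
		}
		local_irq_enable();
		netif_receive_skb(skb);
	} while (++work < quota);

	return work;
}

/*
 * Why netif_rx() does not reschedule inside the window: napi_schedule()
 * only raises the softirq if it wins the SCHED bit, which the
 * still-running poll loop owns until napi_complete() clears it.
 */
static inline int napi_schedule_prep(struct napi_struct *n)
{
	return !napi_disable_pending(n) &&
		!test_and_set_bit(NAPI_STATE_SCHED, &n->state);
}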
It appears that this patch is no longer necessary. It was made against 2.6.29, but I see that more recent kernel versions don't have the problem code. For a more detailed explanation, see below. All code references are to routine process_backlog() in file net/core/dev.c.

In kernel version 2.6.27.45, __napi_complete() is invoked BEFORE interrupts are re-enabled, so the receive queue status is cleaned up before another interrupt (due to a received packet) can occur. This is good design.

In kernel version 2.6.29, git commit ID 303c6a025 inverts this ordering: napi_complete() is invoked AFTER interrupts are re-enabled. We observed interrupts taken after interrupts were re-enabled but before napi_complete() cleaned up the receive queue, which then shuts down the processing of all subsequent received packets.

In kernel versions 2.6.30.10 and later, the sequence of operations is identical to 2.6.27.45, so there is no problem.

Jim Harford
Qualcomm Innovation Center
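[Editor's note: the ordering difference Jim describes, side by side — a sketch under the same caveat as above, not the verbatim sources:]

/* 2.6.27.45 (and 2.6.30.10+): queue state is settled while IRQs are
 * still off, so netif_rx() cannot slip in between. */
local_irq_disable();
skb = __skb_dequeue(&queue->input_pkt_queue);
if (!skb)
	__napi_complete(napi);	/* clear SCHED first...      */
local_irq_enable();		/* ...then allow IRQs again  */

/* 2.6.29 (after commit 303c6a025): the inverted, racy ordering. */
local_irq_disable();
skb = __skb_dequeue(&queue->input_pkt_queue);
if (!skb) {
	local_irq_enable();	/* IRQs on, SCHED still set    */
	napi_complete(napi);	/* too late: see sketch above  */
}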
On Fri, 2010-03-05 at 15:41 -0800, Harford, Jim wrote:
> It appears that this patch is no longer necessary. It was made against 2.6.29, but I see that more recent kernel versions don't have the problem code. [...]
> In kernel versions 2.6.30.10 and later, the sequence of operations is identical to 2.6.27.45, so there is no problem.
>
> Jim Harford
> Qualcomm Innovation Center

Ok, I guess we can ignore this one then.

Daniel
On Fri, 05 Mar 2010 15:44:27 -0800 Daniel Walker <dwalker@codeaurora.org> wrote:
> On Fri, 2010-03-05 at 15:41 -0800, Harford, Jim wrote:
> > It appears that this patch is no longer necessary. [...]
>
> Ok, I guess we can ignore this one then.
>
> Daniel

It was fixed by:

commit 8f1ead2d1a626ed0c85b3d2c2046a49081d5933f
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date:   Thu Mar 26 00:59:10 2009 -0700

    GRO: Disable GRO on legacy netif_rx path

    When I fixed the GRO crash in the legacy receive path I used
    napi_complete to replace __napi_complete.  Unfortunately they're
    not the same when NETPOLL is enabled, which may result in us not
    calling __napi_complete at all.

    What's more, we really do need to keep the __napi_complete call
    within the IRQ-off section since in theory an IRQ can occur in
    between and fill up the backlog to the maximum, causing us to
    lock up.

    Since we can't seem to find a fix that works properly right now,
    this patch reverts all the GRO support from the netif_rx path.

    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    Signed-off-by: David S. Miller <davem@davemloft.net>
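[Editor's note: for reference, the helpers Herbert contrasts read approximately as below in the 2.6.29/2.6.30 era — recalled from memory, not quoted. Per his commit message, a NETPOLL-enabled build can skip __napi_complete() entirely; that netpoll interaction is deliberately not reproduced here.]

/* Approximate era __napi_complete(): must run with IRQs off. */
void __napi_complete(struct napi_struct *n)
{
	BUG_ON(!test_bit(NAPI_STATE_SCHED, &n->state));
	BUG_ON(n->gro_list);

	list_del(&n->poll_list);		/* off this CPU's poll list */
	smp_mb__before_clear_bit();
	clear_bit(NAPI_STATE_SCHED, &n->state);	/* allow rescheduling again */
}

/* Approximate wrapper (netpoll early-out omitted): flushes GRO and
 * re-creates an IRQ-off section around __napi_complete(). */
void napi_complete(struct napi_struct *n)
{
	unsigned long flags;

	napi_gro_flush(n);
	local_irq_save(flags);
	__napi_complete(n);
	local_irq_restore(flags);
}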
diff --git a/net/core/dev.c b/net/core/dev.c
index be9924f..43161c6 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2150,15 +2150,10 @@ int netif_rx(struct sk_buff *skb)
 	__get_cpu_var(netdev_rx_stat).total++;
 
 	if (queue->input_pkt_queue.qlen <= netdev_max_backlog) {
-		if (queue->input_pkt_queue.qlen) {
-enqueue:
-			__skb_queue_tail(&queue->input_pkt_queue, skb);
-			local_irq_restore(flags);
-			return NET_RX_SUCCESS;
-		}
-
+		__skb_queue_tail(&queue->input_pkt_queue, skb);
 		napi_schedule(&queue->backlog);
-		goto enqueue;
+		local_irq_restore(flags);
+		return NET_RX_SUCCESS;
 	}
 
 	__get_cpu_var(netdev_rx_stat).dropped++;
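[Editor's note: applied, the hunk leaves the netif_rx() fast path reading as below: every received packet is enqueued first and the backlog NAPI is scheduled unconditionally, instead of scheduling only when the queue was found empty. As the thread concludes, this workaround was superseded in mainline by Herbert Xu's revert quoted above.]

	if (queue->input_pkt_queue.qlen <= netdev_max_backlog) {
		__skb_queue_tail(&queue->input_pkt_queue, skb);
		napi_schedule(&queue->backlog);
		local_irq_restore(flags);
		return NET_RX_SUCCESS;
	}

	__get_cpu_var(netdev_rx_stat).dropped++;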