Message ID | 20090317102244.GB23396@gondor.apana.org.au |
---|---|
State | Accepted, archived |
Delegated to: | David Miller |
Headers | show |
From: Herbert Xu <herbert@gondor.apana.org.au> Date: Tue, 17 Mar 2009 18:22:44 +0800 > On Tue, Mar 17, 2009 at 09:35:21AM +0100, Frank Blaschka wrote: > > > > What is the intention process_backlog calls __napi_complete() instead of > > napi_complete(), this looks suspicious to me. Can anybody help? > > You're absolutely right. Dave, we need this fix for both net > and net-next. > > gro: Fix legacy path napi_complete crash > > On the legacy netif_rx path, I incorrectly tried to optimise > the napi_complete call by using __napi_complete before we reenable > IRQs. This simply doesn't work since we need to flush the held > GRO packets first. > > This patch fixes it by doing the obvious thing of reenabling > IRQs first and then calling napi_complete. > > Reported-by: Frank Blaschka <blaschka@linux.vnet.ibm.com> > Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Applied, thanks. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> On the legacy netif_rx path, I incorrectly tried to optimise > > the napi_complete call by using __napi_complete before we reenable > IRQs. This simply doesn't work since we need to flush the held > GRO packets first. > > This patch fixes it by doing the obvious thing of reenabling > IRQs first and then calling napi_complete. Does this fix generate a race condition for a non-NAPI device? If netif_rx runs immediately after local_irq_enable it would queue a packet on the backlog queue and try to schedule napi (the latter has no effect because napi has not completed). On return from interrupt, napi_complete is done leaving a packet in the input queue but napi is not scheduled to process it. Thanks, Tom > > Reported-by: Frank Blaschka <blaschka@linux.vnet.ibm.com> > Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> > > diff --git a/net/core/dev.c b/net/core/dev.c > index f112970..2565f6d 100644 > --- a/net/core/dev.c > +++ b/net/core/dev.c > @@ -2588,9 +2588,9 @@ static int process_backlog(struct napi_struct *napi, int quota) > local_irq_disable(); > skb = __skb_dequeue(&queue->input_pkt_queue); > if (!skb) { > - __napi_complete(napi); > local_irq_enable(); > - break; > + napi_complete(napi); > + goto out; > } > local_irq_enable(); > > @@ -2599,6 +2599,7 @@ static int process_backlog(struct napi_struct *napi, int quota) > > napi_gro_flush(napi); > > +out: > return work; > } > > Thanks, > -- > Visit Openswan at http://www.openswan.org/ > Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> > Home Page: http://gondor.apana.org.au/~herbert/ > PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Tom Herbert <therbert@google.com> Date: Fri, 27 Mar 2009 12:05:30 -0700 > > On the legacy netif_rx path, I incorrectly tried to optimise > > > > the napi_complete call by using __napi_complete before we reenable > > IRQs. This simply doesn't work since we need to flush the held > > GRO packets first. > > > > This patch fixes it by doing the obvious thing of reenabling > > IRQs first and then calling napi_complete. > > Does this fix generate a race condition for a non-NAPI device? If > netif_rx runs immediately after local_irq_enable it would queue a > packet on the backlog queue and try to schedule napi (the latter has > no effect because napi has not completed). On return from interrupt, > napi_complete is done leaving a packet in the input queue but napi is > not scheduled to process it. Yes we know this version of Herbert's patch has that problem, read the rest of the thread and subsequent versions of the fix. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/net/core/dev.c b/net/core/dev.c index f112970..2565f6d 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -2588,9 +2588,9 @@ static int process_backlog(struct napi_struct *napi, int quota) local_irq_disable(); skb = __skb_dequeue(&queue->input_pkt_queue); if (!skb) { - __napi_complete(napi); local_irq_enable(); - break; + napi_complete(napi); + goto out; } local_irq_enable(); @@ -2599,6 +2599,7 @@ static int process_backlog(struct napi_struct *napi, int quota) napi_gro_flush(napi); +out: return work; }