diff mbox

net: Detect drivers that reschedule NAPI and exhaust budget

Message ID 20141220065529.GA2033@gondor.apana.org.au
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

Herbert Xu Dec. 20, 2014, 6:55 a.m. UTC
On Fri, Dec 19, 2014 at 09:40:00PM -0500, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Fri, 19 Dec 2014 17:34:48 -0800
> 
> >> @@ -4620,7 +4620,11 @@ static void net_rx_action(struct softirq_action *h)
> >>  					 */
> >>  					napi_gro_flush(n, HZ >= 1000);
> >>  				}
> >> -				list_add_tail(&n->poll_list, &repoll);
> >> +				/* Some drivers may have called napi_schedule
> >> +				 * prior to exhausting their budget.
> >> +				 */
> >> +				if (!WARN_ON_ONCE(!list_empty(&n->poll_list)))
> >> +					list_add_tail(&n->poll_list, &repoll);
> >>  			}
> >>  		}
> >>  
> > 
> > I do not think stack trace will point to the buggy driver ?
> > 
> > IMO it would be better to print a single line with the netdev name ?
> 
> Right, we are already back from the poll routine and will just end
> up seeing the call trace leading to the software interrupt.

Good point Eric.

-- >8 --
The commit d75b1ade567ffab085e8adbbdacf0092d10cd09c (net: less
interrupt masking in NAPI) required drivers to leave poll_list
empty if the entire budget is consumed.

We have already had two broken drivers so let's add a check for
this.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Thanks,

Comments

Eric Dumazet Dec. 20, 2014, 6 p.m. UTC | #1
On Sat, 2014-12-20 at 17:55 +1100, Herbert Xu wrote:

> -- >8 --
> The commit d75b1ade567ffab085e8adbbdacf0092d10cd09c (net: less
> interrupt masking in NAPI) required drivers to leave poll_list
> empty if the entire budget is consumed.
> 
> We have already had two broken drivers so let's add a check for
> this.
> 
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
> 
> diff --git a/net/core/dev.c b/net/core/dev.c
> index f411c28..47fdc5c 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -4620,7 +4620,13 @@ static void net_rx_action(struct softirq_action *h)
>  					 */
>  					napi_gro_flush(n, HZ >= 1000);
>  				}
> -				list_add_tail(&n->poll_list, &repoll);
> +				/* Some drivers may have called napi_schedule
> +				 * prior to exhausting their budget.
> +				 */
> +				if (unlikely(!list_empty(&n->poll_list)))
> +					pr_warn("%s: Budget exhausted after napi rescheduled\n", n->dev ? n->dev->name : "backlog");
> +				else
> +					list_add_tail(&n->poll_list, &repoll);
>  			}
>  		}
>  
> Thanks,

OK, but could you :
1) use pr_warn_once()
2) split the too long line
	pr_warn_once("%s: Budget exhausted after napi rescheduled\n",
		     n->dev ? n->dev->name : "backlog");

Thanks Herbert !


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Herbert Xu Dec. 20, 2014, 8:14 p.m. UTC | #2
On Sat, Dec 20, 2014 at 10:00:12AM -0800, Eric Dumazet wrote:
> 
> OK, but could you :
> 1) use pr_warn_once()
> 2) split the too long line
> 	pr_warn_once("%s: Budget exhausted after napi rescheduled\n",
> 		     n->dev ? n->dev->name : "backlog");

Sure, I'll clean it up a bit too.

Cheers,
diff mbox

Patch

diff --git a/net/core/dev.c b/net/core/dev.c
index f411c28..47fdc5c 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4620,7 +4620,13 @@  static void net_rx_action(struct softirq_action *h)
 					 */
 					napi_gro_flush(n, HZ >= 1000);
 				}
-				list_add_tail(&n->poll_list, &repoll);
+				/* Some drivers may have called napi_schedule
+				 * prior to exhausting their budget.
+				 */
+				if (unlikely(!list_empty(&n->poll_list)))
+					pr_warn("%s: Budget exhausted after napi rescheduled\n", n->dev ? n->dev->name : "backlog");
+				else
+					list_add_tail(&n->poll_list, &repoll);
 			}
 		}