[RFC] net: NETDEV WATCHDOG should print something every time

Message ID: 20100122214333.14389.86017.stgit@jbrandeb-ich9b.jf.intel.com
State: Changes Requested, archived
Delegated to: David Miller

Commit Message

Jesse Brandeburg Jan. 22, 2010, 9:43 p.m. UTC
commit 5337407c changed NETDEV WATCHDOG messages into a message
that will only print once per driver load.  This removed a significant amount
of information from an admin who might be missing that his system was having
NETDEV WATCHDOGs, esp since there is no other global counter available to
count these events.

simply check the __warned flag and print a simple version of the message
without the full stack dump if the (kerneloops related) WARN_ON_ONCE has
already logged the hardware type and one hang.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
CC: Arjan <arjan@linux.intel.com>
---

 include/asm-generic/bug.h |    5 +++++
 net/sched/sch_generic.c   |    9 +++++++--
 2 files changed, 12 insertions(+), 2 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Ben Hutchings Jan. 23, 2010, 1:51 a.m. UTC | #1
On Fri, 2010-01-22 at 13:43 -0800, Jesse Brandeburg wrote:
> commit 5337407c changed NETDEV WATCHDOG messages into a message
> that will only print once per driver load.  This removed a significant amount
> of information from an admin who might be missing that his system was having
> NETDEV WATCHDOGs, esp since there is no other global counter available to
> count these events.
> 
> simply check the __warned flag and print a simple version of the message
> without the full stack dump if the (kerneloops related) WARN_ON_ONCE has
> already logged the hardware type and one hang.
[...]
> diff --git a/include/asm-generic/bug.h b/include/asm-generic/bug.h
> index 18c435d..ad810a0 100644
> --- a/include/asm-generic/bug.h
> +++ b/include/asm-generic/bug.h
> @@ -132,6 +132,11 @@ extern void warn_slowpath_null(const char *file, const int line);
>  	unlikely(__ret_warn_once);				\
>  })
>  
> +#define WARNED_ALREADY() ({					\
> +	static bool __warned;					\
> +	unlikely(__warned);					\
> +})

It is indeed unlikely that __warned will be true, given there is no
statement to set it...

I think this could be a generic macro:

#define first_time() ({						\
	static bool __been_here;				\
	__been_here++;						\
})

>  #define WARN_ON_RATELIMIT(condition, state)			\
>  		WARN_ON((condition) && __ratelimit(state))
>  
> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> index 5173c1e..28fb14f 100644
> --- a/net/sched/sch_generic.c
> +++ b/net/sched/sch_generic.c
> @@ -251,8 +251,13 @@ static void dev_watchdog(unsigned long arg)
>  
>  			if (some_queue_timedout) {
>  				char drivername[64];
> -				WARN_ONCE(1, KERN_INFO "NETDEV WATCHDOG: %s (%s): transmit queue %u timed out\n",
> -				       dev->name, netdev_drivername(dev, drivername, 64), i);
> +				/* FIXME: is there a way to const char string[] = "NETDEV WATCHDOG..." */
[...]

Maybe you could, you know, just write that declaration... though a
'static' in front wouldn't hurt.

Ben.
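
A minimal userspace sketch of the two review points above, assuming GCC or Clang (it relies on GNU C statement expressions, as the macros in include/asm-generic/bug.h do). The names been_here() and report_timeout() are hypothetical and are not kernel APIs. The flag is both read and set, which is the step the posted WARNED_ALREADY() lacks. The macro is named been_here() rather than first_time() because a post-increment yields the old value of the flag, so the expression is false on the first visit and true afterwards. The format string is declared once as a static const array, as suggested for the FIXME.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper, not a kernel API: evaluates to false the first
 * time this call site is reached and to true on every later visit.
 * The flag is static, so there is one flag per expansion of the macro.
 */
#define been_here() ({				\
	static bool __been_here;		\
	bool __ret = __been_here;		\
	__been_here = true;			\
	__ret;					\
})

/*
 * Stand-in for the reporting step in dev_watchdog().
 * Returns 1 when the verbose (first) path ran, 0 for the short path.
 */
static int report_timeout(const char *name, unsigned int queue)
{
	/* One shared format string instead of two copies of the literal. */
	static const char fmt[] =
		"NETDEV WATCHDOG: %s: transmit queue %u timed out\n";

	assert(name != NULL);

	if (!been_here()) {
		/* In the kernel this would be the WARN_ONCE() with stack dump. */
		fprintf(stderr, "(full warning would be printed here)\n");
		fprintf(stderr, fmt, name, queue);
		return 1;
	}

	/* Later hangs: short message only. */
	fprintf(stderr, fmt, name, queue);
	return 0;
}
```

Because the flag belongs to the call site and not to a device, the first hang on any interface consumes the verbose report for all of them.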
David Miller Jan. 23, 2010, 2:45 a.m. UTC | #2
From: Jesse Brandeburg <jesse.brandeburg@intel.com>
Date: Fri, 22 Jan 2010 13:43:33 -0800

> commit 5337407c changed NETDEV WATCHDOG messages into a message
> that will only print once per driver load.  This removed a significant amount
> of information from an admin who might be missing that his system was having
> NETDEV WATCHDOGs, esp since there is no other global counter available to
> count these events.

It's not once per driver load, it's once globally.

Once per driver load would be in fact what I would actually
consider more reasonable, so put the boolean state into
struct netdev, and test it to decide whether to do the
WARN_ON() print.

Doing a message every time is way overboard and is going to
spam some people's systems to the point where they can't
even diagnose the problem, so I'm not accepting a patch
which does that.
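
The per-device variant suggested above could look roughly like the following userspace sketch. It is an illustration only: struct fake_netdev, the watchdog_warned field, and watchdog_warn_once_per_dev() are made-up names, and the real kernel structure and the WARN_ON() call are replaced by a plain struct and fprintf().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's per-device structure. */
struct fake_netdev {
	const char *name;
	bool watchdog_warned;	/* assumed field name, not from the thread */
};

/*
 * Warn at most once per device.
 * Returns true when this call printed the warning, false when it was
 * suppressed because the device had already warned.
 */
static bool watchdog_warn_once_per_dev(struct fake_netdev *dev,
				       unsigned int queue)
{
	assert(dev != NULL);

	if (dev->watchdog_warned)
		return false;

	/* Record the state in the device, not in a function-local static. */
	dev->watchdog_warned = true;
	fprintf(stderr, "NETDEV WATCHDOG: %s: transmit queue %u timed out\n",
		dev->name, queue);
	return true;
}
```

With the state stored in the device, a hang on one interface no longer hides the first hang on another, and the flag starts out clear again whenever the structure is freshly allocated.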
Patch

diff --git a/include/asm-generic/bug.h b/include/asm-generic/bug.h
index 18c435d..ad810a0 100644
--- a/include/asm-generic/bug.h
+++ b/include/asm-generic/bug.h
@@ -132,6 +132,11 @@ extern void warn_slowpath_null(const char *file, const int line);
 	unlikely(__ret_warn_once);				\
 })
 
+#define WARNED_ALREADY() ({					\
+	static bool __warned;					\
+	unlikely(__warned);					\
+})
+
 #define WARN_ON_RATELIMIT(condition, state)			\
 		WARN_ON((condition) && __ratelimit(state))
 
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 5173c1e..28fb14f 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -251,8 +251,13 @@ static void dev_watchdog(unsigned long arg)
 
 			if (some_queue_timedout) {
 				char drivername[64];
-				WARN_ONCE(1, KERN_INFO "NETDEV WATCHDOG: %s (%s): transmit queue %u timed out\n",
-				       dev->name, netdev_drivername(dev, drivername, 64), i);
+				/* FIXME: is there a way to const char string[] = "NETDEV WATCHDOG..." */
+				if (!WARNED_ALREADY())
+					WARN_ONCE(1, KERN_INFO "NETDEV WATCHDOG: %s (%s): transmit queue %u timed out\n",
+						  dev->name, netdev_drivername(dev, drivername, 64), i);
+				else
+					printk(KERN_INFO "NETDEV WATCHDOG: %s (%s): transmit queue %u timed out\n",
+					       dev->name, netdev_drivername(dev, drivername, 64), i);
 				dev->netdev_ops->ndo_tx_timeout(dev);
 			}
 			if (!mod_timer(&dev->watchdog_timer,