[RFC,net-next] bnx2x: avoid printing unnecessary messages during register dump

Message ID 1475001234-25933-1-git-send-email-gpiccoli@linux.vnet.ibm.com
State RFC, archived
Delegated to: David Miller

Commit Message

Guilherme G. Piccoli Sept. 27, 2016, 6:33 p.m. UTC
The bnx2x driver prints multiple error messages during a register dump,
for example with "ethtool -d". The driver even warns that many messages
might be seen during the register dump, but that they are harmless. A
typical kernel log after a register dump looks like this:

  [9.375] bnx2x: [bnx2x_get_regs:987(net0)]Generating register dump. Might trigger harmless GRC timeouts
  [9.439] bnx2x: [bnx2x_attn_int_deasserted3:4342(net0)]LATCHED attention 0x04000000 (masked)
  [9.439] bnx2x: [bnx2x_attn_int_deasserted3:4346(net0)]GRC time-out 0x010580cd
  [...]

The notation [...] means that some messages were suppressed - in our
tests we saw 78 more "LATCHED attention" and "GRC time-out" messages,
omitted here.

This patch prevents these messages from being printed during a register
dump, instead of just warning that they are harmless.

Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
---

  * This was sent as RFC for two main reasons: firstly, I might be ignoring
    some importance in showing these error messages during register dump.
    Also, there are multiple ways to implement this idea - I just did the
    first one that came to my head. We might also add a new flag on struct
    bnx2x or even a new field. Any suggestions regarding the best
    implementation are welcome.
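
One way to read the "new flag on struct bnx2x" idea is to keep the
suppression state per device rather than in a file-scope variable. The
following is only a minimal user-space sketch under that assumption:
struct fake_bnx2x, its reading_regs field, fake_attn_log() and
fake_get_regs() are hypothetical stand-ins, not driver code.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for struct bnx2x; only what this sketch needs. */
struct fake_bnx2x {
	const char *name;
	bool reading_regs;	/* hypothetical per-device flag */
	int logged;		/* messages that would reach the kernel log */
};

/* Stand-in for the logging path of the attention handler. */
static void fake_attn_log(struct fake_bnx2x *bp, unsigned int attn)
{
	if (bp->reading_regs)
		return;		/* dump in progress on this device: stay quiet */

	printf("%s: LATCHED attention 0x%08x (masked)\n", bp->name, attn);
	bp->logged++;
}

/* Stand-in for bnx2x_get_regs(): the flag brackets the register reads. */
static void fake_get_regs(struct fake_bnx2x *bp)
{
	bp->reading_regs = true;
	fake_attn_log(bp, 0x04000000u);	/* timeout provoked by the dump */
	bp->reading_regs = false;
}
```

With the flag stored per device, a dump on net0 would not silence
attentions reported for net1. A real driver would also have to account
for the attention handler running in a different context from the
ethtool call, which this sketch ignores.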

 drivers/net/ethernet/broadcom/bnx2x/bnx2x.h        |  1 +
 .../net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c    | 17 ++++++++++++-----
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c   | 22 ++++++++++++----------
 3 files changed, 25 insertions(+), 15 deletions(-)

Comments

David Miller Sept. 28, 2016, 2:43 a.m. UTC | #1
From: "Guilherme G. Piccoli" <gpiccoli@linux.vnet.ibm.com>
Date: Tue, 27 Sep 2016 15:33:54 -0300

> The bnx2x driver prints multiple error messages during a register dump,
> for example with "ethtool -d". The driver even warns that many messages
> might be seen during the register dump, but that they are harmless. A
> typical kernel log after a register dump looks like this:
> 
>   [9.375] bnx2x: [bnx2x_get_regs:987(net0)]Generating register dump. Might trigger harmless GRC timeouts
>   [9.439] bnx2x: [bnx2x_attn_int_deasserted3:4342(net0)]LATCHED attention 0x04000000 (masked)
>   [9.439] bnx2x: [bnx2x_attn_int_deasserted3:4346(net0)]GRC time-out 0x010580cd
>   [...]
> 
> The notation [...] means that some messages were suppressed - in our
> tests we saw 78 more "LATCHED attention" and "GRC time-out" messages,
> omitted here.
> 
> This patch prevents these messages from being printed during a register
> dump, instead of just warning that they are harmless.
> 
> Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>

Although "ethtool -d" is really a debugging facility, I still think that
serious care should be placed into arranging what gets dumped in such
a way that such access timeouts and errors are minimized.
Guilherme G. Piccoli Sept. 29, 2016, 4:19 p.m. UTC | #2
On 09/27/2016 11:43 PM, David Miller wrote:
> From: "Guilherme G. Piccoli" <gpiccoli@linux.vnet.ibm.com>
> Date: Tue, 27 Sep 2016 15:33:54 -0300
> 
>> The bnx2x driver prints multiple error messages during a register dump,
>> for example with "ethtool -d". The driver even warns that many messages
>> might be seen during the register dump, but that they are harmless. A
>> typical kernel log after a register dump looks like this:
>>
>>   [9.375] bnx2x: [bnx2x_get_regs:987(net0)]Generating register dump. Might trigger harmless GRC timeouts
>>   [9.439] bnx2x: [bnx2x_attn_int_deasserted3:4342(net0)]LATCHED attention 0x04000000 (masked)
>>   [9.439] bnx2x: [bnx2x_attn_int_deasserted3:4346(net0)]GRC time-out 0x010580cd
>>   [...]
>>
>> The notation [...] means that some messages were suppressed - in our
>> tests we saw 78 more "LATCHED attention" and "GRC time-out" messages,
>> omitted here.
>>
>> This patch prevents these messages from being printed during a register
>> dump, instead of just warning that they are harmless.
>>
>> Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
> 
> Although "ethtool -d" is really a debugging facility, I still think that
> serious care should be placed into arranging what gets dumped in such
> a way that such access timeouts and errors are minimized.
> 

David, thanks for your comment. I confess I didn't fully understand your
statement. Are you saying that we shouldn't dump registers that will
cause timeouts?

If so, I guess this is a valid point. We will, however, lose some debug
information (as you mentioned, 'ethtool -d' is a debug facility). Now,
since I'm no expert in QLogic adapter hw/fw, I want to ask Yuval/Ariel
why those timeouts are hit anyway. Are they completely harmless?

In my understanding/opinion, hiding the messages entirely (as this patch
does) or avoiding the timeouts by disabling some registers' dump are both
better alternatives than the current behavior of the driver.

Thanks,
Guilherme
David Miller Sept. 30, 2016, 5:22 a.m. UTC | #3
From: "Guilherme G. Piccoli" <gpiccoli@linux.vnet.ibm.com>
Date: Thu, 29 Sep 2016 13:19:39 -0300

> David, thanks for your comment. I confess I didn't fully understand your
> statement. Are you saying that we shouldn't dump registers that will
> cause timeouts?

Yes, basically.

If this happened infrequently for one or two registers maybe
that would be OK, but this seems to time out on many registers,
especially if the device is up and operational during the
"ethtool -d", and that's a bit much.
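
The approach suggested here, skipping registers that are expected to
time out while the device is up, could look roughly like the sketch
below. It is a user-space illustration only: struct dump_range, its
may_timeout field, dump_regs() and demo_read() are hypothetical names,
and which bnx2x registers actually time out is not stated in this thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one contiguous block of 32-bit registers. */
struct dump_range {
	uint32_t addr;		/* address of the first register */
	uint32_t count;		/* number of 32-bit registers in the block */
	bool may_timeout;	/* assumed: times out while the device is up */
};

typedef uint32_t (*reg_read_fn)(uint32_t addr);

/* Stand-in register reader for experiments: returns the address itself. */
static uint32_t demo_read(uint32_t addr)
{
	return addr;
}

/*
 * Copy register values into out[], skipping blocks flagged may_timeout
 * when dev_up is true. Returns the number of 32-bit words written.
 */
static size_t dump_regs(const struct dump_range *tbl, size_t n, bool dev_up,
			reg_read_fn rd, uint32_t *out)
{
	size_t words = 0;

	for (size_t i = 0; i < n; i++) {
		if (dev_up && tbl[i].may_timeout)
			continue;	/* avoid provoking a timeout */

		for (uint32_t j = 0; j < tbl[i].count; j++)
			out[words++] = rd(tbl[i].addr + 4 * j);
	}

	return words;
}
```

The trade-off is the one raised earlier in the thread: the dump is
quieter, but it carries less debug information while the device is up.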

Patch

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
index 7dd7490..73f2713 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
@@ -2053,6 +2053,7 @@  void bnx2x_update_coalesce(struct bnx2x *bp);
 int bnx2x_get_cur_phy_idx(struct bnx2x *bp);
 
 bool bnx2x_port_after_undi(struct bnx2x *bp);
+bool bnx2x_is_reading_regs(void);
 
 static inline u32 reg_poll(struct bnx2x *bp, u32 reg, u32 expected, int ms,
 			   int wait)
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c
index 85a7800..d7dc867 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c
@@ -29,6 +29,8 @@ 
 #include "bnx2x_dump.h"
 #include "bnx2x_init.h"
 
+static int bnx2x_reading_regs;
+
 /* Note: in the format strings below %s is replaced by the queue-name which is
  * either its index or 'fcoe' for the fcoe queue. Make sure the format string
  * length does not exceed ETH_GSTRING_LEN - MAX_QUEUE_NAME_LEN + 2
@@ -981,19 +983,24 @@  static void bnx2x_get_regs(struct net_device *dev,
 	memcpy(p, &dump_hdr, sizeof(struct dump_header));
 	p += dump_hdr.header_size + 1;
 
-	/* This isn't really an error, but since attention handling is going
-	 * to print the GRC timeouts using this macro, we use the same.
+	/* Actually read the registers - bnx2x_reading_regs is set so that
+	 * unnecessary error messages, like GRC timeouts, are not printed
+	 * to the kernel log while the registers are being read.
 	 */
-	BNX2X_ERR("Generating register dump. Might trigger harmless GRC timeouts\n");
-
-	/* Actually read the registers */
+	bnx2x_reading_regs = 1;
 	__bnx2x_get_regs(bp, p);
+	bnx2x_reading_regs = 0;
 
 	/* Re-enable parity attentions */
 	bnx2x_clear_blocks_parity(bp);
 	bnx2x_enable_blocks_parity(bp);
 }
 
+inline bool bnx2x_is_reading_regs(void)
+{
+	return !!bnx2x_reading_regs;
+}
+
 static int bnx2x_get_preset_regs_len(struct net_device *dev, u32 preset)
 {
 	struct bnx2x *bp = netdev_priv(dev);
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index fa3386b..392d14c 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -4339,16 +4339,18 @@  static void bnx2x_attn_int_deasserted3(struct bnx2x *bp, u32 attn)
 	}
 
 	if (attn & EVEREST_LATCHED_ATTN_IN_USE_MASK) {
-		BNX2X_ERR("LATCHED attention 0x%08x (masked)\n", attn);
-		if (attn & BNX2X_GRC_TIMEOUT) {
-			val = CHIP_IS_E1(bp) ? 0 :
-					REG_RD(bp, MISC_REG_GRC_TIMEOUT_ATTN);
-			BNX2X_ERR("GRC time-out 0x%08x\n", val);
-		}
-		if (attn & BNX2X_GRC_RSV) {
-			val = CHIP_IS_E1(bp) ? 0 :
-					REG_RD(bp, MISC_REG_GRC_RSV_ATTN);
-			BNX2X_ERR("GRC reserved 0x%08x\n", val);
+		if (!bnx2x_is_reading_regs()) {
+			BNX2X_ERR("LATCHED attention 0x%08x (masked)\n", attn);
+			if (attn & BNX2X_GRC_TIMEOUT) {
+				val = CHIP_IS_E1(bp) ? 0 :
+				      REG_RD(bp, MISC_REG_GRC_TIMEOUT_ATTN);
+				BNX2X_ERR("GRC time-out 0x%08x\n", val);
+			}
+			if (attn & BNX2X_GRC_RSV) {
+				val = CHIP_IS_E1(bp) ? 0 :
+				      REG_RD(bp, MISC_REG_GRC_RSV_ATTN);
+				BNX2X_ERR("GRC reserved 0x%08x\n", val);
+			}
 		}
 		REG_WR(bp, MISC_REG_AEU_CLR_LATCH_SIGNAL, 0x7ff);
 	}