Message ID | 1363384577-21287-1-git-send-email-zenczykowski@gmail.com |
---|---|
State | Accepted, archived |
Delegated to: | David Miller |
Headers | show |
> -----Original Message----- > From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On Behalf Of Maciej ?enczykowski > Sent: Friday, March 15, 2013 11:56 PM > To: Maciej Żenczykowski; David S. Miller > Cc: netdev@vger.kernel.org; Yuval Mintz > Subject: [PATCH] bnx2x: fix occasional statistics off-by-4GB error > > From: Maciej Żenczykowski <maze@google.com> > > The UPDATE_QSTAT function introduced on February 15, 2012 > in commit 1355b704b9ba "bnx2x: consistent statistics after > internal driver reload" incorrectly fails to handle overflow > during addition of the lower 32-bit field of a stat. > > This bug is present since 3.4-rc1 and should thus be considered > a candidate for stable 3.4+ releases. > > Google-Bug-Id: 8374428 > Signed-off-by: Maciej Żenczykowski <maze@google.com> > Cc: Mintz Yuval <yuvalmin@broadcom.com> > --- > drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.h | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.h > index 364e37ecbc5c..198f6f1c9ad5 100644 > --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.h > +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.h > @@ -459,8 +459,9 @@ struct bnx2x_fw_port_stats_old { > > #define UPDATE_QSTAT(s, t) \ > do { \ > - qstats->t##_hi = qstats_old->t##_hi + le32_to_cpu(s.hi); \ > qstats->t##_lo = qstats_old->t##_lo + le32_to_cpu(s.lo); \ > + qstats->t##_hi = qstats_old->t##_hi + le32_to_cpu(s.hi) \ > + + ((qstats->t##_lo < qstats_old->t##_lo) ? 1 : 0); \ > } while (0) > > #define UPDATE_QSTAT_OLD(f) \ > -- > 1.8.1.3 > Probably this commit resolved the issue: commit bef05406ac0ea6f468e1e25e9934f3011ea9259b Author: Dmitry Kravkov <dmitry@broadcom.com> Date: Tue Sep 11 04:34:08 2012 +0000 bnx2x: Avoid sending multiple statistics queries Can you try it pls? Thanks
From: "Dmitry Kravkov" <dmitry@broadcom.com> Date: Sun, 17 Mar 2013 13:10:37 +0000 > Probably this commit resolved the issue: > > commit bef05406ac0ea6f468e1e25e9934f3011ea9259b > Author: Dmitry Kravkov <dmitry@broadcom.com> > Date: Tue Sep 11 04:34:08 2012 +0000 > > bnx2x: Avoid sending multiple statistics queries > > Can you try it pls? These are completely seperate bugs. The macro in question does not handle rollover of the lower 32-bits of the statistic properly at all, and therefore Maciej's patch should be applied and queud up for -stable. Please give it your ACK unless you can find a bug in his change. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, 2013-03-17 at 14:24 -0400, David Miller wrote: > From: "Dmitry Kravkov" <dmitry@broadcom.com> > Date: Sun, 17 Mar 2013 13:10:37 +0000 > > > Probably this commit resolved the issue: > > > > commit bef05406ac0ea6f468e1e25e9934f3011ea9259b > > Author: Dmitry Kravkov <dmitry@broadcom.com> > > Date: Tue Sep 11 04:34:08 2012 +0000 > > > > bnx2x: Avoid sending multiple statistics queries > > > > Can you try it pls? > > These are completely seperate bugs. > > The macro in question does not handle rollover of the lower 32-bits of > the statistic properly at all, and therefore Maciej's patch should be > applied and queud up for -stable. > > Please give it your ACK unless you can find a bug in his change. Both the high value and the low value are read from the chip - so this patch will increment the higher 32 bits twice (well, more than once). Not taking the HW/FW high counter at all is also not acceptable since the reading frequency is not high enough without those. So the original patch is nacked, but we are trying to figure out what is causing the statistics to misbehave and it might be related to sending statistics query twice. Thanks, Eilon -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, 2013-03-17 at 21:37 +0200, Eilon Greenstein wrote: > On Sun, 2013-03-17 at 14:24 -0400, David Miller wrote: > > From: "Dmitry Kravkov" <dmitry@broadcom.com> > > Date: Sun, 17 Mar 2013 13:10:37 +0000 > > > > > Probably this commit resolved the issue: > > > > > > commit bef05406ac0ea6f468e1e25e9934f3011ea9259b > > > Author: Dmitry Kravkov <dmitry@broadcom.com> > > > Date: Tue Sep 11 04:34:08 2012 +0000 > > > > > > bnx2x: Avoid sending multiple statistics queries > > > > > > Can you try it pls? > > > > These are completely seperate bugs. > > > > The macro in question does not handle rollover of the lower 32-bits of > > the statistic properly at all, and therefore Maciej's patch should be > > applied and queud up for -stable. > > > > Please give it your ACK unless you can find a bug in his change. > > Both the high value and the low value are read from the chip - so this > patch will increment the higher 32 bits twice (well, more than once). > Not taking the HW/FW high counter at all is also not acceptable since > the reading frequency is not high enough without those. > > So the original patch is nacked, but we are trying to figure out what is > causing the statistics to misbehave and it might be related to sending > statistics query twice. This looks like the typical problem of updating a 64bit value in non atomic way. Its guaranteed to happen on 32bit hosts. We had to introduce include/linux/u64_stats_sync.h to help to solve this without adding extra cost on 64bit arches. In bnx2x case, we perform 32bit operations even on 64bit host, so we probably need to add a seqcount_t, so that a stat consumer can detect it read a stale value. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, 2013-03-17 at 13:25 -0700, Eric Dumazet wrote: > On Sun, 2013-03-17 at 21:37 +0200, Eilon Greenstein wrote: > > On Sun, 2013-03-17 at 14:24 -0400, David Miller wrote: > > > From: "Dmitry Kravkov" <dmitry@broadcom.com> > > > Date: Sun, 17 Mar 2013 13:10:37 +0000 > > > > > > > Probably this commit resolved the issue: > > > > > > > > commit bef05406ac0ea6f468e1e25e9934f3011ea9259b > > > > Author: Dmitry Kravkov <dmitry@broadcom.com> > > > > Date: Tue Sep 11 04:34:08 2012 +0000 > > > > > > > > bnx2x: Avoid sending multiple statistics queries > > > > > > > > Can you try it pls? > > > > > > These are completely seperate bugs. > > > > > > The macro in question does not handle rollover of the lower 32-bits of > > > the statistic properly at all, and therefore Maciej's patch should be > > > applied and queud up for -stable. > > > > > > Please give it your ACK unless you can find a bug in his change. > > > > Both the high value and the low value are read from the chip - so this > > patch will increment the higher 32 bits twice (well, more than once). > > Not taking the HW/FW high counter at all is also not acceptable since > > the reading frequency is not high enough without those. > > > > So the original patch is nacked, but we are trying to figure out what is > > causing the statistics to misbehave and it might be related to sending > > statistics query twice. > > This looks like the typical problem of updating a 64bit value in non > atomic way. > > Its guaranteed to happen on 32bit hosts. > We had to introduce include/linux/u64_stats_sync.h to help to solve this > without adding extra cost on 64bit arches. > > In bnx2x case, we perform 32bit operations even on 64bit host, so we > probably need to add a seqcount_t, so that a stat consumer can detect it > read a stale value. This is not such a trivial issue - the HW/FW is guaranteeing the atomic read and this is why we can always use 32b variables. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, 2013-03-17 at 22:53 +0200, Eilon Greenstein wrote: > This is not such a trivial issue - the HW/FW is guaranteeing the atomic > read and this is why we can always use 32b variables. > I am glad you qualify the following code as 'trivial' /* difference = minuend - subtrahend */ #define DIFF_64(d_hi, m_hi, s_hi, d_lo, m_lo, s_lo) \ do { \ if (m_lo < s_lo) { \ /* underflow */ \ d_hi = m_hi - s_hi; \ if (d_hi > 0) { \ /* we can 'loan' 1 */ \ d_hi--; \ d_lo = m_lo + (UINT_MAX - s_lo) + 1; \ } else { \ /* m_hi <= s_hi */ \ d_hi = 0; \ d_lo = 0; \ } \ } else { \ /* m_lo >= s_lo */ \ if (m_hi < s_hi) { \ d_hi = 0; \ d_lo = 0; \ } else { \ /* m_hi >= s_hi */ \ d_hi = m_hi - s_hi; \ d_lo = m_lo - s_lo; \ } \ } \ } while (0) All the macros and _hi/_ho fields found in drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.h are really badly designed. It hurts so much the common sense its not even funny. All you really needed is to have normal host structures containing u64 fields, and you only needed one single helper to convert from hw u32:u32 to host u64. Because gcc handles u64 operations very well, even on 32bit hosts, and it knows about the carry flag. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
This has nothing to do with atomicity or races or anything. Look at the ADD_64 macro (which implements a += b) and compare it to this one (which implents a = b + c). It simply doesn't know how to correctly add in any situation in which "qstats_old->t..._lo != 0". By this point all the stats are already in software and both firmware and hardware is not relevant. Normally qstats_old starts off at 0 and the bug isn't visible (since with a value of 0, there can't be overflow and carry propagation is not needed). The reason qstats_old was introduced was to allow for a partial nic reset [or whatever the proper term is] which resets fw/hw provided stats to not reset sw exported stats. During the reset old stats are copied into qstats_old, and in the future stats are returned as "qstats_old + current stats from fw/hw" except this addition is buggy and only works if qstats_old happens to have lower 32-bits zero (ie. basically only before first partial nic reset). Imagine: (a) driver initializes, nic gets brought up, qstats_old = 0, stats from device = 0 (b) 2GB of xfer happens (c) qstats_old still = 0, stats from device = 2GB (d) addition happens qstats = qstats_old + stats_from_dev = 0 + 2GB = 2GB --- we report 2GB xfer (e) something partially resets nic (say mtu change), qstats_old = 2GB, stats from device = 0 (f) 2GB-1 of xfer happens (g) qstats_old = 2GB, stats from device = 2GB-1 (h) addition happens qstats = qstats_old + stats_from_dev = 2GB + 2GB-1 = 4GB-1 --- we report 4GB-1 xfer (i) 1 byte of xfer happens (j) qstats_old = 2GB, stats from device = 2GB (k) addition happens qstats = qstats_old + stats_from_dev = 2GB + 2GB = 0 --- we report 0 xfer (l) 2GB of xfer happens (m) qstats_old = 2GB, stats from device = 4GB (n) addition happens qstats = qstats_old + stats_from_dev = 2GB + 4GB = 6GB --- we report 6GB xfer And yes the macro in question when asked to do 2GB+2GB will return 0. (btw. at least on the driver I tested there's still some weirdness with mtu change not-resetting per-queue stats but resetting coalesced-from-all-queues stats) Now, to be fair, I don't yet have confirmation that my patch fixes the problem our SREs saw in deployment, but I expect to have that soon. (and even if it doesn't fix the problem, this code is still buggy and should be fixed) Maciej On Sun, Mar 17, 2013 at 1:53 PM, Eilon Greenstein <eilong@broadcom.com> wrote: > On Sun, 2013-03-17 at 13:25 -0700, Eric Dumazet wrote: >> On Sun, 2013-03-17 at 21:37 +0200, Eilon Greenstein wrote: >> > On Sun, 2013-03-17 at 14:24 -0400, David Miller wrote: >> > > From: "Dmitry Kravkov" <dmitry@broadcom.com> >> > > Date: Sun, 17 Mar 2013 13:10:37 +0000 >> > > >> > > > Probably this commit resolved the issue: >> > > > >> > > > commit bef05406ac0ea6f468e1e25e9934f3011ea9259b >> > > > Author: Dmitry Kravkov <dmitry@broadcom.com> >> > > > Date: Tue Sep 11 04:34:08 2012 +0000 >> > > > >> > > > bnx2x: Avoid sending multiple statistics queries >> > > > >> > > > Can you try it pls? >> > > >> > > These are completely seperate bugs. >> > > >> > > The macro in question does not handle rollover of the lower 32-bits of >> > > the statistic properly at all, and therefore Maciej's patch should be >> > > applied and queud up for -stable. >> > > >> > > Please give it your ACK unless you can find a bug in his change. >> > >> > Both the high value and the low value are read from the chip - so this >> > patch will increment the higher 32 bits twice (well, more than once). >> > Not taking the HW/FW high counter at all is also not acceptable since >> > the reading frequency is not high enough without those. >> > >> > So the original patch is nacked, but we are trying to figure out what is >> > causing the statistics to misbehave and it might be related to sending >> > statistics query twice. >> >> This looks like the typical problem of updating a 64bit value in non >> atomic way. >> >> Its guaranteed to happen on 32bit hosts. >> We had to introduce include/linux/u64_stats_sync.h to help to solve this >> without adding extra cost on 64bit arches. >> >> In bnx2x case, we perform 32bit operations even on 64bit host, so we >> probably need to add a seqcount_t, so that a stat consumer can detect it >> read a stale value. > > This is not such a trivial issue - the HW/FW is guaranteeing the atomic > read and this is why we can always use 32b variables. > > -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, 2013-03-17 at 16:17 -0700, Eric Dumazet wrote: > On Sun, 2013-03-17 at 22:53 +0200, Eilon Greenstein wrote: > > > This is not such a trivial issue - the HW/FW is guaranteeing the atomic > > read and this is why we can always use 32b variables. > > > > I am glad you qualify the following code as 'trivial' > Eric - I did not mean that the solution is trivial, I meant that the problem is trivial. Like the trivial statement that the code should not contain any bugs :) > All you really needed is to have normal host structures containing u64 > fields, and you only needed one single helper to convert from hw u32:u32 > to host u64. > > Because gcc handles u64 operations very well, even on 32bit hosts, and > it knows about the carry flag. I was not clear when I said size-extension: most of the HW counters are 46bits and we are trying to keep 64bits counters in the SW, so simply copying the data from the HW is not enough. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, 2013-03-17 at 23:18 -0700, Maciej Żenczykowski wrote: > This has nothing to do with atomicity or races or anything. > > Look at the ADD_64 macro (which implements a += b) and compare it to > this one (which implents a = b + c). > > It simply doesn't know how to correctly add in any situation in which > "qstats_old->t..._lo != 0". > By this point all the stats are already in software and both firmware > and hardware is not relevant. > > Normally qstats_old starts off at 0 and the bug isn't visible (since > with a value of 0, there can't be overflow and carry propagation is > not needed). > > The reason qstats_old was introduced was to allow for a partial nic > reset [or whatever the proper term is] which resets fw/hw provided > stats to not reset sw exported stats. > During the reset old stats are copied into qstats_old, and in the > future stats are returned as "qstats_old + current stats from fw/hw" > except this addition is buggy and only works if qstats_old happens to > have lower 32-bits zero (ie. basically only before first partial nic > reset). > > Imagine: > (a) driver initializes, nic gets brought up, qstats_old = 0, stats > from device = 0 > (b) 2GB of xfer happens > (c) qstats_old still = 0, stats from device = 2GB > (d) addition happens qstats = qstats_old + stats_from_dev = 0 + 2GB = > 2GB --- we report 2GB xfer > (e) something partially resets nic (say mtu change), qstats_old = 2GB, > stats from device = 0 > (f) 2GB-1 of xfer happens > (g) qstats_old = 2GB, stats from device = 2GB-1 > (h) addition happens qstats = qstats_old + stats_from_dev = 2GB + > 2GB-1 = 4GB-1 --- we report 4GB-1 xfer > (i) 1 byte of xfer happens > (j) qstats_old = 2GB, stats from device = 2GB > (k) addition happens qstats = qstats_old + stats_from_dev = 2GB + 2GB > = 0 --- we report 0 xfer > (l) 2GB of xfer happens > (m) qstats_old = 2GB, stats from device = 4GB > (n) addition happens qstats = qstats_old + stats_from_dev = 2GB + 4GB > = 6GB --- we report 6GB xfer > > And yes the macro in question when asked to do 2GB+2GB will return 0. > > (btw. at least on the driver I tested there's still some weirdness > with mtu change not-resetting per-queue stats but resetting > coalesced-from-all-queues stats) > > Now, to be fair, I don't yet have confirmation that my patch fixes the > problem our SREs saw in deployment, but I expect to have that soon. > (and even if it doesn't fix the problem, this code is still buggy and > should be fixed) > > Maciej Maciej - thanks for the detailed information. You are right - it has nothing to do with the HW/FW and it is simply a bug that needs to be fixed. I withdraw my objections and add my ACK. Acked-by: Eilon Greenstein <eilong@broadcom.com> -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: "Eilon Greenstein" <eilong@broadcom.com> Date: Mon, 18 Mar 2013 12:06:22 +0200 > Maciej - thanks for the detailed information. You are right - it has > nothing to do with the HW/FW and it is simply a bug that needs to be > fixed. I withdraw my objections and add my ACK. > > Acked-by: Eilon Greenstein <eilong@broadcom.com> Applied and queued up for -stable. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
I've reproduced this in the lab. I can confirm this patch fixes per-queue statistics (ie. ethtool -S "[0]: tx_bytes") - which previously were not monotonically increasing. However there is still a bug wrt. global device stats (ie. ethtool -S "tx_bytes" and /proc/net/dev transmit bytes). I'm guessing there's another similar bug later on during aggregation. I expect to find the offending code and send out a patch soon. On Mon, Mar 18, 2013 at 10:13 AM, David Miller <davem@davemloft.net> wrote: > From: "Eilon Greenstein" <eilong@broadcom.com> > Date: Mon, 18 Mar 2013 12:06:22 +0200 > >> Maciej - thanks for the detailed information. You are right - it has >> nothing to do with the HW/FW and it is simply a bug that needs to be >> fixed. I withdraw my objections and add my ACK. >> >> Acked-by: Eilon Greenstein <eilong@broadcom.com> > > Applied and queued up for -stable. > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Maciej Żenczykowski <zenczykowski@gmail.com> Date: Mon, 18 Mar 2013 12:54:49 -0700 > I've reproduced this in the lab. > > I can confirm this patch fixes per-queue statistics (ie. ethtool -S > "[0]: tx_bytes") - which previously were not monotonically increasing. > However there is still a bug wrt. global device stats (ie. ethtool -S > "tx_bytes" and /proc/net/dev transmit bytes). > I'm guessing there's another similar bug later on during aggregation. > > I expect to find the offending code and send out a patch soon. Thanks Maciej. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Current best guess is UPDATE_ESTAT_QSTAT_64 which uses SUB_64 implemented via DIFF_64 which is crazy. On Mon, Mar 18, 2013 at 12:54 PM, Maciej Żenczykowski <zenczykowski@gmail.com> wrote: > I've reproduced this in the lab. > > I can confirm this patch fixes per-queue statistics (ie. ethtool -S > "[0]: tx_bytes") - which previously were not monotonically increasing. > However there is still a bug wrt. global device stats (ie. ethtool -S > "tx_bytes" and /proc/net/dev transmit bytes). > I'm guessing there's another similar bug later on during aggregation. > > I expect to find the offending code and send out a patch soon. > > On Mon, Mar 18, 2013 at 10:13 AM, David Miller <davem@davemloft.net> wrote: >> From: "Eilon Greenstein" <eilong@broadcom.com> >> Date: Mon, 18 Mar 2013 12:06:22 +0200 >> >>> Maciej - thanks for the detailed information. You are right - it has >>> nothing to do with the HW/FW and it is simply a bug that needs to be >>> fixed. I withdraw my objections and add my ACK. >>> >>> Acked-by: Eilon Greenstein <eilong@broadcom.com> >> >> Applied and queued up for -stable. >> -- >> To unsubscribe from this list: send the line "unsubscribe netdev" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 2013-03-18 at 13:06 -0700, Maciej Żenczykowski wrote: > Current best guess is UPDATE_ESTAT_QSTAT_64 which uses SUB_64 > implemented via DIFF_64 which is crazy. All these macros are really ugly, I wish someone could cleanup this code. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Not quite the right call sequence above. UPDATE_FSTAT_QSTAT(total_bytes_received); --> SUB_64 --> DIFF_64 is probably more relevant. Regardless: /* difference = minuend - subtrahend */ #define DIFF_64(d_hi, m_hi, s_hi, d_lo, m_lo, s_lo) \ <------>do { \ <------><------>if (m_lo < s_lo) { \ <------><------><------>/* underflow */ \ <------><------><------>d_hi = m_hi - s_hi; \ <------><------><------>if (d_hi > 0) { \ <------><------><------><------>/* we can 'loan' 1 */ \ <------><------><------><------>d_hi--; \ <------><------><------><------>d_lo = m_lo + (UINT_MAX - s_lo) + 1; \ <------><------><------>} else { \ <------><------><------><------>/* m_hi <= s_hi */ \ <------><------><------><------>d_hi = 0; \ <------><------><------><------>d_lo = 0; \ <------><------><------>} \ <------><------>} else { \ ... I believe this fails. All parameters are most likely going to be u32, since that's used for stats pretty much everywhere. As such after d_hi = m_hi - s_hi; d_hi will be >= 0 since it's u32. As such if "m_hi == s_hi" && "m_lo < s_lo" we will return (0,0) however if "m_hi < s_hi" && "m_lo < s_lo" we will not return (0,0) I'm not sure which behaviour is desired, but either way this is obviously wrong. 0 - 1 returns 0 0 - (4GB+1) returns -4GB-1 -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Just as an update on the state here. Extensive testing both in lab and in deployment confirms the already applied patch fixes the problem. My concerns about aggregate counters being wrong (after taking another long look at the data from the 'bad' run) was simply a matter of me getting mixed up in what was the 'correct' value I should be seeing. ie. I did not actually see any bad behaviour with this patch. That said. I spent a fair bit of time debugging this before I realized I was stupid, and found other problems while doing that. I cannot actually trigger the potential problems, but I'll follow up to this with a couple patches to fix potential problems (and export more stats). (a) The DIFF_64 macro is buggy in the case of underflow due to unsigned-ness (b) Any code which does SUB/ADD should instead do ADD/SUB, because SUB(A,B) is implemented via DIFF(A,B) and thus doesn't actually do subtraction, but instead does A := max(A-B, 0) (c) there's a bunch of extra statistics it is trivial to export via ethtool which make debugging stuff like this much easier -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Maciej Żenczykowski <zenczykowski@gmail.com> Date: Thu, 21 Mar 2013 14:23:47 -0700 > (a) The DIFF_64 macro is buggy in the case of underflow due to unsigned-ness > (b) Any code which does SUB/ADD should instead do ADD/SUB, because > SUB(A,B) is implemented via DIFF(A,B) and thus doesn't actually do > subtraction, but instead does A := max(A-B, 0) > (c) there's a bunch of extra statistics it is trivial to export via > ethtool which make debugging stuff like this much easier Thanks for looking into this and the status update. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.h index 364e37ecbc5c..198f6f1c9ad5 100644 --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.h +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.h @@ -459,8 +459,9 @@ struct bnx2x_fw_port_stats_old { #define UPDATE_QSTAT(s, t) \ do { \ - qstats->t##_hi = qstats_old->t##_hi + le32_to_cpu(s.hi); \ qstats->t##_lo = qstats_old->t##_lo + le32_to_cpu(s.lo); \ + qstats->t##_hi = qstats_old->t##_hi + le32_to_cpu(s.hi) \ + + ((qstats->t##_lo < qstats_old->t##_lo) ? 1 : 0); \ } while (0) #define UPDATE_QSTAT_OLD(f) \