diff mbox

[net-next,3/5] be2net: fix erx->rx_drops_no_frags wrap around

Message ID 1314011613-4519-4-git-send-email-sathya.perla@emulex.com
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

Sathya Perla Aug. 22, 2011, 11:13 a.m. UTC
The rx_drops_no_frags HW counter for RSS rings is 16bits in HW and can
wraparound often. Maintain a 32-bit accumulator in the driver to prevent
frequent wraparound.

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
---
 drivers/net/ethernet/emulex/benet/be_main.c |   21 ++++++++++++++++++---
 1 files changed, 18 insertions(+), 3 deletions(-)

Comments

Eric Dumazet Aug. 22, 2011, 11:43 a.m. UTC | #1
Le lundi 22 août 2011 à 16:43 +0530, Sathya Perla a écrit :
> The rx_drops_no_frags HW counter for RSS rings is 16bits in HW and can
> wraparound often. Maintain a 32-bit accumulator in the driver to prevent
> frequent wraparound.
> 
> Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
> ---
>  drivers/net/ethernet/emulex/benet/be_main.c |   21 ++++++++++++++++++---
>  1 files changed, 18 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
> index 2375c0c..4e588d8 100644
> --- a/drivers/net/ethernet/emulex/benet/be_main.c
> +++ b/f
> @@ -378,6 +378,17 @@ static void populate_lancer_stats(struct be_adapter *adapter)
>  				pport_stats->rx_drops_too_many_frags_lo;
>  }
>  
> +static void accumulate_16bit_val(u32 *acc, u16 val)
> +{
> +#define lo(x)			(x & 0xFFFF)
> +#define hi(x)			(x & 0xFFFF0000)
> +	bool wrapped = val < lo(*acc);
> +
> +	*acc = hi(*acc) + val;
> +	if (wrapped)
> +		*acc += 65536;
> +}
> +
>  

This adds a race for lockless SNMP readers (in be_get_stats64())

They can see the 32bit value going backward.

You have to be very careful to read the *acc once, and write it once.




--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Aug. 22, 2011, 11:47 a.m. UTC | #2
Le lundi 22 août 2011 à 13:43 +0200, Eric Dumazet a écrit :

> This adds a race for lockless SNMP readers (in be_get_stats64())
> 
> They can see the 32bit value going backward.
> 
> You have to be very careful to read the *acc once, and write it once.

The "write once" is the only requirement.

Something like :

u32 newval = hi(*acc) + val;

if (wrapped)
	newval += 65536;
ACCESS_ONCE(*acc) = newval;



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sathya Perla Aug. 22, 2011, 12:15 p.m. UTC | #3
>-----Original Message-----

>From: Eric Dumazet [mailto:eric.dumazet@gmail.com]

>Sent: Monday, August 22, 2011 5:18 PM

>

>Le lundi 22 août 2011 à 13:43 +0200, Eric Dumazet a écrit :

>

>> This adds a race for lockless SNMP readers (in be_get_stats64())

>>

>> They can see the 32bit value going backward.

>>

>> You have to be very careful to read the *acc once, and write it once.

>

>The "write once" is the only requirement.

>

>Something like :

>

>u32 newval = hi(*acc) + val;

>

>if (wrapped)

>	newval += 65536;

>ACCESS_ONCE(*acc) = newval;

>


Eric,

I'm wondering why you'd need ACCESS_ONCE() here? Wouldn't using the temp variable (newval) as you suggested,
ensure that the reader doesn't see a half-baked value?
Eric Dumazet Aug. 22, 2011, 12:44 p.m. UTC | #4
Le lundi 22 août 2011 à 05:15 -0700, Sathya.Perla@Emulex.Com a écrit :
> >-----Original Message-----
> >From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
> >Sent: Monday, August 22, 2011 5:18 PM
> >
> >Le lundi 22 août 2011 à 13:43 +0200, Eric Dumazet a écrit :
> >
> >> This adds a race for lockless SNMP readers (in be_get_stats64())
> >>
> >> They can see the 32bit value going backward.
> >>
> >> You have to be very careful to read the *acc once, and write it once.
> >
> >The "write once" is the only requirement.
> >
> >Something like :
> >
> >u32 newval = hi(*acc) + val;
> >
> >if (wrapped)
> >	newval += 65536;
> >ACCESS_ONCE(*acc) = newval;
> >
> 
> Eric,
> 
> I'm wondering why you'd need ACCESS_ONCE() here? Wouldn't using the temp variable (newval) as you suggested,
> ensure that the reader doesn't see a half-baked value?
> 

If you write :

u32 newval = hi(*acc) + val;
if (wrapped)
	newval += 65536;
*acc = newval;


C compiler (gcc) is free to eliminate the temp variable and to generate
same code than :

*acc = hi(*acc) + val;
if (wrapped)
	*acc += 65536;

ACCESS_ONCE() here is the clean way (and self explanatory/documented) to
keep compiler to write to *acc twice.



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sathya Perla Aug. 22, 2011, 5:22 p.m. UTC | #5
OK, thanks Eric. I'll re-spin this patch series ...

-Sathya
diff mbox

Patch

diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index 2375c0c..4e588d8 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -378,6 +378,17 @@  static void populate_lancer_stats(struct be_adapter *adapter)
 				pport_stats->rx_drops_too_many_frags_lo;
 }
 
+static void accumulate_16bit_val(u32 *acc, u16 val)
+{
+#define lo(x)			(x & 0xFFFF)
+#define hi(x)			(x & 0xFFFF0000)
+	bool wrapped = val < lo(*acc);
+
+	*acc = hi(*acc) + val;
+	if (wrapped)
+		*acc += 65536;
+}
+
 void be_parse_stats(struct be_adapter *adapter)
 {
 	struct be_erx_stats_v1 *erx = be_erx_stats_from_cmd(adapter);
@@ -394,9 +405,13 @@  void be_parse_stats(struct be_adapter *adapter)
 	}
 
 	/* as erx_v1 is longer than v0, ok to use v1 defn for v0 access */
-	for_all_rx_queues(adapter, rxo, i)
-		rx_stats(rxo)->rx_drops_no_frags =
-			erx->rx_drops_no_fragments[rxo->q.id];
+	for_all_rx_queues(adapter, rxo, i) {
+		/* below erx HW counter can actually wrap around after
+		 * 65535. Driver accumulates a 32-bit value
+		 */
+		accumulate_16bit_val(&rx_stats(rxo)->rx_drops_no_frags,
+				(u16)erx->rx_drops_no_fragments[rxo->q.id]);
+	}
 }
 
 static struct rtnl_link_stats64 *be_get_stats64(struct net_device *netdev,