Message ID | 1314011613-4519-4-git-send-email-sathya.perla@emulex.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Headers | show |
Le lundi 22 août 2011 à 16:43 +0530, Sathya Perla a écrit : > The rx_drops_no_frags HW counter for RSS rings is 16bits in HW and can > wraparound often. Maintain a 32-bit accumulator in the driver to prevent > frequent wraparound. > > Signed-off-by: Sathya Perla <sathya.perla@emulex.com> > --- > drivers/net/ethernet/emulex/benet/be_main.c | 21 ++++++++++++++++++--- > 1 files changed, 18 insertions(+), 3 deletions(-) > > diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c > index 2375c0c..4e588d8 100644 > --- a/drivers/net/ethernet/emulex/benet/be_main.c > +++ b/f > @@ -378,6 +378,17 @@ static void populate_lancer_stats(struct be_adapter *adapter) > pport_stats->rx_drops_too_many_frags_lo; > } > > +static void accumulate_16bit_val(u32 *acc, u16 val) > +{ > +#define lo(x) (x & 0xFFFF) > +#define hi(x) (x & 0xFFFF0000) > + bool wrapped = val < lo(*acc); > + > + *acc = hi(*acc) + val; > + if (wrapped) > + *acc += 65536; > +} > + > This adds a race for lockless SNMP readers (in be_get_stats64()) They can see the 32bit value going backward. You have to be very careful to read the *acc once, and write it once. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Le lundi 22 août 2011 à 13:43 +0200, Eric Dumazet a écrit : > This adds a race for lockless SNMP readers (in be_get_stats64()) > > They can see the 32bit value going backward. > > You have to be very careful to read the *acc once, and write it once. The "write once" is the only requirement. Something like : u32 newval = hi(*acc) + val; if (wrapped) newval += 65536; ACCESS_ONCE(*acc) = newval; -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
>-----Original Message----- >From: Eric Dumazet [mailto:eric.dumazet@gmail.com] >Sent: Monday, August 22, 2011 5:18 PM > >Le lundi 22 août 2011 à 13:43 +0200, Eric Dumazet a écrit : > >> This adds a race for lockless SNMP readers (in be_get_stats64()) >> >> They can see the 32bit value going backward. >> >> You have to be very careful to read the *acc once, and write it once. > >The "write once" is the only requirement. > >Something like : > >u32 newval = hi(*acc) + val; > >if (wrapped) > newval += 65536; >ACCESS_ONCE(*acc) = newval; > Eric, I'm wondering why you'd need ACCESS_ONCE() here? Wouldn't using the temp variable (newval) as you suggested, ensure that the reader doesn't see a half-baked value?
Le lundi 22 août 2011 à 05:15 -0700, Sathya.Perla@Emulex.Com a écrit : > >-----Original Message----- > >From: Eric Dumazet [mailto:eric.dumazet@gmail.com] > >Sent: Monday, August 22, 2011 5:18 PM > > > >Le lundi 22 août 2011 à 13:43 +0200, Eric Dumazet a écrit : > > > >> This adds a race for lockless SNMP readers (in be_get_stats64()) > >> > >> They can see the 32bit value going backward. > >> > >> You have to be very careful to read the *acc once, and write it once. > > > >The "write once" is the only requirement. > > > >Something like : > > > >u32 newval = hi(*acc) + val; > > > >if (wrapped) > > newval += 65536; > >ACCESS_ONCE(*acc) = newval; > > > > Eric, > > I'm wondering why you'd need ACCESS_ONCE() here? Wouldn't using the temp variable (newval) as you suggested, > ensure that the reader doesn't see a half-baked value? > If you write : u32 newval = hi(*acc) + val; if (wrapped) newval += 65536; *acc = newval; C compiler (gcc) is free to eliminate the temp variable and to generate same code than : *acc = hi(*acc) + val; if (wrapped) *acc += 65536; ACCESS_ONCE() here is the clean way (and self explanatory/documented) to keep compiler to write to *acc twice. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
OK, thanks Eric. I'll re-spin this patch series ... -Sathya
diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c index 2375c0c..4e588d8 100644 --- a/drivers/net/ethernet/emulex/benet/be_main.c +++ b/drivers/net/ethernet/emulex/benet/be_main.c @@ -378,6 +378,17 @@ static void populate_lancer_stats(struct be_adapter *adapter) pport_stats->rx_drops_too_many_frags_lo; } +static void accumulate_16bit_val(u32 *acc, u16 val) +{ +#define lo(x) (x & 0xFFFF) +#define hi(x) (x & 0xFFFF0000) + bool wrapped = val < lo(*acc); + + *acc = hi(*acc) + val; + if (wrapped) + *acc += 65536; +} + void be_parse_stats(struct be_adapter *adapter) { struct be_erx_stats_v1 *erx = be_erx_stats_from_cmd(adapter); @@ -394,9 +405,13 @@ void be_parse_stats(struct be_adapter *adapter) } /* as erx_v1 is longer than v0, ok to use v1 defn for v0 access */ - for_all_rx_queues(adapter, rxo, i) - rx_stats(rxo)->rx_drops_no_frags = - erx->rx_drops_no_fragments[rxo->q.id]; + for_all_rx_queues(adapter, rxo, i) { + /* below erx HW counter can actually wrap around after + * 65535. Driver accumulates a 32-bit value + */ + accumulate_16bit_val(&rx_stats(rxo)->rx_drops_no_frags, + (u16)erx->rx_drops_no_fragments[rxo->q.id]); + } } static struct rtnl_link_stats64 *be_get_stats64(struct net_device *netdev,
The rx_drops_no_frags HW counter for RSS rings is 16bits in HW and can wraparound often. Maintain a 32-bit accumulator in the driver to prevent frequent wraparound. Signed-off-by: Sathya Perla <sathya.perla@emulex.com> --- drivers/net/ethernet/emulex/benet/be_main.c | 21 ++++++++++++++++++--- 1 files changed, 18 insertions(+), 3 deletions(-)