diff mbox

[net-next] igb: update adapter stats when reading /proc/net/dev.

Message ID 20101005141833.20929.10943.stgit@localhost
State Awaiting Upstream, archived
Delegated to: David Miller
Headers show

Commit Message

Jesper Dangaard Brouer Oct. 5, 2010, 2:18 p.m. UTC
Network driver igb: Improve the accuracy of stats in /proc/net/dev, by
updating the adapter stats when reading /proc/net/dev.  Currently the
stats are updated by the watchdog timer every 2 sec, or when getting
stats via ethtool -S.

A number of userspace apps read these /proc/net/dev stats every second,
e.g. ifstat, which then gives a perceived very bursty traffic pattern,
which is actually false.

Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
---
 drivers/net/igb/igb_main.c |   12 +++++++++---
 1 files changed, 9 insertions(+), 3 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Eric Dumazet Oct. 5, 2010, 2:41 p.m. UTC | #1
Le mardi 05 octobre 2010 à 16:18 +0200, Jesper Dangaard Brouer a écrit :
> Network driver igb: Improve the accuracy of stats in /proc/net/dev, by
> updating the adapter stats when reading /proc/net/dev.  Currently the
> stats are updated by the watchdog timer every 2 sec, or when getting
> stats via ethtool -S.
> 
> A number of userspace apps read these /proc/net/dev stats every second,
> e.g. ifstat, which then gives a perceived very bursty traffic pattern,
> which is actually false.
> 
> Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>

Unfortunately this is racy with igb_watchdog_task()

igb_update_stats() must be called under protection of a lock.



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jesper Dangaard Brouer Oct. 5, 2010, 2:53 p.m. UTC | #2
On Tue, 2010-10-05 at 16:41 +0200, Eric Dumazet wrote:
> Le mardi 05 octobre 2010 à 16:18 +0200, Jesper Dangaard Brouer a écrit :
> > Network driver igb: Improve the accuracy of stats in /proc/net/dev, by
> > updating the adapter stats when reading /proc/net/dev.  Currently the
> > stats are updated by the watchdog timer every 2 sec, or when getting
> > stats via ethtool -S.
> > 
> > A number of userspace apps read these /proc/net/dev stats every second,
> > e.g. ifstat, which then gives a perceived very bursty traffic pattern,
> > which is actually false.
> > 
> > Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
> 
> Unfortunately this is racy with igb_watchdog_task()
> 
> igb_update_stats() must be called under protection of a lock.

Its already racy, because "ethtool -S" reads out the stats immediately,
and thus races with the timer.

See: igb_ethtool.c
 igb_get_ethtool_stats() invoke igb_update_stats(adapter);
Eric Dumazet Oct. 5, 2010, 3:19 p.m. UTC | #3
Le mardi 05 octobre 2010 à 16:53 +0200, Jesper Dangaard Brouer a écrit :

> Its already racy, because "ethtool -S" reads out the stats immediately,
> and thus races with the timer.
> 
> See: igb_ethtool.c
>  igb_get_ethtool_stats() invoke igb_update_stats(adapter);
> 

You would be surprised how many bugs are waiting to be found and
fixed ;)



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Oct. 5, 2010, 9:01 p.m. UTC | #4
Le mardi 05 octobre 2010 à 17:19 +0200, Eric Dumazet a écrit :
> Le mardi 05 octobre 2010 à 16:53 +0200, Jesper Dangaard Brouer a écrit :
> 
> > Its already racy, because "ethtool -S" reads out the stats immediately,
> > and thus races with the timer.
> > 
> > See: igb_ethtool.c
> >  igb_get_ethtool_stats() invoke igb_update_stats(adapter);
> > 
> 
> You would be surprised how many bugs are waiting to be found and
> fixed ;)
> 
> 

I took a look at igb stats, and it appears they also are racy on 32bit
arches. igb uses u64 counters, with no synchronization between
producers(writers) and consumers(readers).

Some fixes are needed ;)



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jesper Dangaard Brouer Oct. 5, 2010, 9:16 p.m. UTC | #5
On Tue, 5 Oct 2010, Eric Dumazet wrote:

> Le mardi 05 octobre 2010 à 17:19 +0200, Eric Dumazet a écrit :
>> Le mardi 05 octobre 2010 à 16:53 +0200, Jesper Dangaard Brouer a écrit :
>>
>>> Its already racy, because "ethtool -S" reads out the stats immediately,
>>> and thus races with the timer.
>>>
>>> See: igb_ethtool.c
>>>  igb_get_ethtool_stats() invoke igb_update_stats(adapter);
>>>
>>
>> You would be surprised how many bugs are waiting to be found and
>> fixed ;)
>
> I took a look at igb stats, and it appears they also are racy on 32bit
> arches. igb uses u64 counters, with no synchronization between
> producers(writers) and consumers(readers).

Are you saying, that we need more than a simple mutex in 
igb_update_stats()?


> Some fixes are needed ;)

The question is then if Intel wants to fix it, or let it be up to you or 
me?

Cheers,
   Jesper Brouer

--
-------------------------------------------------------------------
MSc. Master of Computer Science
Dept. of Computer Science, University of Copenhagen
Author of http://www.adsl-optimizer.dk
-------------------------------------------------------------------
Kirsher, Jeffrey T Oct. 5, 2010, 10:34 p.m. UTC | #6
On Tue, Oct 5, 2010 at 14:16, Jesper Dangaard Brouer <hawk@diku.dk> wrote:
> On Tue, 5 Oct 2010, Eric Dumazet wrote:
>
>> Le mardi 05 octobre 2010 à 17:19 +0200, Eric Dumazet a écrit :
>>>
>>> Le mardi 05 octobre 2010 à 16:53 +0200, Jesper Dangaard Brouer a écrit :
>>>
>>>> Its already racy, because "ethtool -S" reads out the stats immediately,
>>>> and thus races with the timer.
>>>>
>>>> See: igb_ethtool.c
>>>>  igb_get_ethtool_stats() invoke igb_update_stats(adapter);
>>>>
>>>
>>> You would be surprised how many bugs are waiting to be found and
>>> fixed ;)
>>
>> I took a look at igb stats, and it appears they also are racy on 32bit
>> arches. igb uses u64 counters, with no synchronization between
>> producers(writers) and consumers(readers).
>
> Are you saying, that we need more than a simple mutex in igb_update_stats()?
>
>
>> Some fixes are needed ;)
>
> The question is then if Intel wants to fix it, or let it be up to you or me?
>

I will make sure that Carolyn and Alex know about the issue.  But,
feel free to submit a patch if you guys have the time.
Eric Dumazet Oct. 6, 2010, 3:28 a.m. UTC | #7
Le mardi 05 octobre 2010 à 15:34 -0700, Jeff Kirsher a écrit :
> On Tue, Oct 5, 2010 at 14:16, Jesper Dangaard Brouer <hawk@diku.dk> wrote:
> > On Tue, 5 Oct 2010, Eric Dumazet wrote:
> >
> >> Le mardi 05 octobre 2010 à 17:19 +0200, Eric Dumazet a écrit :
> >>>
> >>> Le mardi 05 octobre 2010 à 16:53 +0200, Jesper Dangaard Brouer a écrit :
> >>>
> >>>> Its already racy, because "ethtool -S" reads out the stats immediately,
> >>>> and thus races with the timer.
> >>>>
> >>>> See: igb_ethtool.c
> >>>>  igb_get_ethtool_stats() invoke igb_update_stats(adapter);
> >>>>
> >>>
> >>> You would be surprised how many bugs are waiting to be found and
> >>> fixed ;)
> >>
> >> I took a look at igb stats, and it appears they also are racy on 32bit
> >> arches. igb uses u64 counters, with no synchronization between
> >> producers(writers) and consumers(readers).
> >
> > Are you saying, that we need more than a simple mutex in igb_update_stats()?
> >
> >
> >> Some fixes are needed ;)
> >
> > The question is then if Intel wants to fix it, or let it be up to you or me?
> >
> 
> I will make sure that Carolyn and Alex know about the issue.  But,
> feel free to submit a patch if you guys have the time.
> 

I woke up early this morning, I'll provide patches to fix issues for
net-next-2.6

I'll let Intel guys doing the backporting work, but for old kernels,
you'll probably need to use "unsigned long" instead of "u64"

My plan is :

- Provide 64bit counters even on 32bit arch
- with proper synchro (include/linux/u64_stats_sync.h)
- Add a spinlock so we can apply Jesper patch.



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Kirsher, Jeffrey T Oct. 6, 2010, 5:47 a.m. UTC | #8
On Tue, Oct 5, 2010 at 20:28, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Le mardi 05 octobre 2010 à 15:34 -0700, Jeff Kirsher a écrit :
>> On Tue, Oct 5, 2010 at 14:16, Jesper Dangaard Brouer <hawk@diku.dk> wrote:
>> > On Tue, 5 Oct 2010, Eric Dumazet wrote:
>> >
>> >> Le mardi 05 octobre 2010 à 17:19 +0200, Eric Dumazet a écrit :
>> >>>
>> >>> Le mardi 05 octobre 2010 à 16:53 +0200, Jesper Dangaard Brouer a écrit :
>> >>>
>> >>>> Its already racy, because "ethtool -S" reads out the stats immediately,
>> >>>> and thus races with the timer.
>> >>>>
>> >>>> See: igb_ethtool.c
>> >>>>  igb_get_ethtool_stats() invoke igb_update_stats(adapter);
>> >>>>
>> >>>
>> >>> You would be surprised how many bugs are waiting to be found and
>> >>> fixed ;)
>> >>
>> >> I took a look at igb stats, and it appears they also are racy on 32bit
>> >> arches. igb uses u64 counters, with no synchronization between
>> >> producers(writers) and consumers(readers).
>> >
>> > Are you saying, that we need more than a simple mutex in igb_update_stats()?
>> >
>> >
>> >> Some fixes are needed ;)
>> >
>> > The question is then if Intel wants to fix it, or let it be up to you or me?
>> >
>>
>> I will make sure that Carolyn and Alex know about the issue.  But,
>> feel free to submit a patch if you guys have the time.
>>
>
> I woke up early this morning, I'll provide patches to fix issues for
> net-next-2.6
>
> I'll let Intel guys doing the backporting work, but for old kernels,
> you'll probably need to use "unsigned long" instead of "u64"
>
> My plan is :
>
> - Provide 64bit counters even on 32bit arch
> - with proper synchro (include/linux/u64_stats_sync.h)
> - Add a spinlock so we can apply Jesper patch.
>
>

Thanks Eric, I really appreciate it.
David Miller Oct. 7, 2010, 6:36 a.m. UTC | #9
From: Jesper Dangaard Brouer <hawk@comx.dk>
Date: Tue, 05 Oct 2010 16:18:33 +0200

> Network driver igb: Improve the accuracy of stats in /proc/net/dev, by
> updating the adapter stats when reading /proc/net/dev.  Currently the
> stats are updated by the watchdog timer every 2 sec, or when getting
> stats via ethtool -S.
> 
> A number of userspace apps read these /proc/net/dev stats every second,
> e.g. ifstat, which then gives a perceived very bursty traffic pattern,
> which is actually false.
> 
> Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>

I assume that the Intel folks will take care of integrating this
now that the locking is fixed.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Oct. 7, 2010, 6:44 a.m. UTC | #10
Le mercredi 06 octobre 2010 à 23:36 -0700, David Miller a écrit :
> From: Jesper Dangaard Brouer <hawk@comx.dk>
> Date: Tue, 05 Oct 2010 16:18:33 +0200
> 
> > Network driver igb: Improve the accuracy of stats in /proc/net/dev, by
> > updating the adapter stats when reading /proc/net/dev.  Currently the
> > stats are updated by the watchdog timer every 2 sec, or when getting
> > stats via ethtool -S.
> > 
> > A number of userspace apps read these /proc/net/dev stats every second,
> > e.g. ifstat, which then gives a perceived very bursty traffic pattern,
> > which is actually false.
> > 
> > Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
> 
> I assume that the Intel folks will take care of integrating this
> now that the locking is fixed.
> 

I integrated Jesper patch into my cumulative patch.

BTW, ixgbe has similar locking problem.



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Oct. 7, 2010, 6:46 a.m. UTC | #11
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 07 Oct 2010 08:44:08 +0200

> Le mercredi 06 octobre 2010 à 23:36 -0700, David Miller a écrit :
>> From: Jesper Dangaard Brouer <hawk@comx.dk>
>> Date: Tue, 05 Oct 2010 16:18:33 +0200
>> 
>> > Network driver igb: Improve the accuracy of stats in /proc/net/dev, by
>> > updating the adapter stats when reading /proc/net/dev.  Currently the
>> > stats are updated by the watchdog timer every 2 sec, or when getting
>> > stats via ethtool -S.
>> > 
>> > A number of userspace apps read these /proc/net/dev stats every second,
>> > e.g. ifstat, which then gives a perceived very bursty traffic pattern,
>> > which is actually false.
>> > 
>> > Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
>> 
>> I assume that the Intel folks will take care of integrating this
>> now that the locking is fixed.
>> 
> 
> I integrated Jesper patch into my cumulative patch.

Ok.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jesper Dangaard Brouer Oct. 7, 2010, 10:06 a.m. UTC | #12
On Wed, 2010-10-06 at 23:46 -0700, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Thu, 07 Oct 2010 08:44:08 +0200
> 
> > Le mercredi 06 octobre 2010 à 23:36 -0700, David Miller a écrit :
> >> From: Jesper Dangaard Brouer <hawk@comx.dk>
> >> Date: Tue, 05 Oct 2010 16:18:33 +0200
> >> 
> >> > Network driver igb: Improve the accuracy of stats in /proc/net/dev, by
> >> > updating the adapter stats when reading /proc/net/dev.  Currently the
> >> > stats are updated by the watchdog timer every 2 sec, or when getting
> >> > stats via ethtool -S.
> >> > 
> >> > A number of userspace apps read these /proc/net/dev stats every second,
> >> > e.g. ifstat, which then gives a perceived very bursty traffic pattern,
> >> > which is actually false.
> >> > 
> >> > Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
> >> 
> >> I assume that the Intel folks will take care of integrating this
> >> now that the locking is fixed.
> > 
> > I integrated Jesper patch into my cumulative patch.
> 
> Ok.

I'm fine with this, as long as the commit message describe this change
of accuracy of stats in /proc/net/dev.

Something like:

 This patch also increase the accuracy of stats in /proc/net/dev, by
 updating the adapter stats when reading /proc/net/dev.  Previously
 the stats were only updated by the watchdog timer every 2 sec, which
 resulted in false observations from userspace.
Eric Dumazet Oct. 7, 2010, 10:24 a.m. UTC | #13
Le jeudi 07 octobre 2010 à 12:06 +0200, Jesper Dangaard Brouer a écrit :
> On Wed, 2010-10-06 at 23:46 -0700, David Miller wrote:
> > From: Eric Dumazet <eric.dumazet@gmail.com>
> > Date: Thu, 07 Oct 2010 08:44:08 +0200
> > 
> > > Le mercredi 06 octobre 2010 à 23:36 -0700, David Miller a écrit :
> > >> From: Jesper Dangaard Brouer <hawk@comx.dk>
> > >> Date: Tue, 05 Oct 2010 16:18:33 +0200
> > >> 
> > >> > Network driver igb: Improve the accuracy of stats in /proc/net/dev, by
> > >> > updating the adapter stats when reading /proc/net/dev.  Currently the
> > >> > stats are updated by the watchdog timer every 2 sec, or when getting
> > >> > stats via ethtool -S.
> > >> > 
> > >> > A number of userspace apps read these /proc/net/dev stats every second,
> > >> > e.g. ifstat, which then gives a perceived very bursty traffic pattern,
> > >> > which is actually false.
> > >> > 
> > >> > Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
> > >> 
> > >> I assume that the Intel folks will take care of integrating this
> > >> now that the locking is fixed.
> > > 
> > > I integrated Jesper patch into my cumulative patch.
> > 
> > Ok.
> 
> I'm fine with this, as long as the commit message describe this change
> of accuracy of stats in /proc/net/dev.
> 
> Something like:
> 
>  This patch also increase the accuracy of stats in /proc/net/dev, by
>  updating the adapter stats when reading /proc/net/dev.  Previously
>  the stats were only updated by the watchdog timer every 2 sec, which
>  resulted in false observations from userspace.
> 
> 

Well, its not exactly true :)

Previously, igb stats were updated :
- By watchdog timer, every 2 secs
- Every time an "ethtool -S ethX" was done

There is no general guarantee netdev stats are immediately available to
user.

ndo_get_stats() is not allowed to sleep, (because of bonding...), so
drivers can not always provide accurate stats, if they need to make a
long transaction with hardware.

Other drivers do the same (provide hardware statistics), with about one
second resolution.

So the "resulted in false observations from userspace." is something
that might upset admins, but is not a hard requirement of netdev stats.



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jesper Dangaard Brouer Oct. 7, 2010, 11:36 a.m. UTC | #14
On Thu, 2010-10-07 at 12:24 +0200, Eric Dumazet wrote:
> Le jeudi 07 octobre 2010 à 12:06 +0200, Jesper Dangaard Brouer a écrit :
> > On Wed, 2010-10-06 at 23:46 -0700, David Miller wrote:
> > > From: Eric Dumazet <eric.dumazet@gmail.com>
> > > Date: Thu, 07 Oct 2010 08:44:08 +0200
> > > 
> > > > Le mercredi 06 octobre 2010 à 23:36 -0700, David Miller a écrit :
> > > >> From: Jesper Dangaard Brouer <hawk@comx.dk>
> > > >> Date: Tue, 05 Oct 2010 16:18:33 +0200
> > > >> 
> > > >> > Network driver igb: Improve the accuracy of stats in /proc/net/dev, by
> > > >> > updating the adapter stats when reading /proc/net/dev.  Currently the
> > > >> > stats are updated by the watchdog timer every 2 sec, or when getting
> > > >> > stats via ethtool -S.
> > > >> > 
> > > >> > A number of userspace apps read these /proc/net/dev stats every second,
> > > >> > e.g. ifstat, which then gives a perceived very bursty traffic pattern,
> > > >> > which is actually false.
> > > >> > 
> > > >> > Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
> > > >> 
> > > >> I assume that the Intel folks will take care of integrating this
> > > >> now that the locking is fixed.
> > > > 
> > > > I integrated Jesper patch into my cumulative patch.
> > > 
> > > Ok.
> > 
> > I'm fine with this, as long as the commit message describe this change
> > of accuracy of stats in /proc/net/dev.
> > 
> > Something like:
> > 
> >  This patch also increase the accuracy of stats in /proc/net/dev, by
> >  updating the adapter stats when reading /proc/net/dev.  Previously
> >  the stats were only updated by the watchdog timer every 2 sec, which
> >  resulted in false observations from userspace.
> > 
> > 
> 
> Well, its not exactly true :)
> 
> Previously, igb stats were updated :
> - By watchdog timer, every 2 secs
> - Every time an "ethtool -S ethX" was done
> 
> There is no general guarantee netdev stats are immediately available to
> user.

I'm not talking general, just for this driver.
And I'm not giving any guarantees, that is why I choose the wording
"increase the accuracy" and not "provide accurate stats".

> ndo_get_stats() is not allowed to sleep, (because of bonding...), so
> drivers can not always provide accurate stats, if they need to make a
> long transaction with hardware.
> 
> Other drivers do the same (provide hardware statistics), with about one
> second resolution.

Yes, a lot of drivers don't provide accurate states. And one second
resolution, as most drivers use, will usually be good enough for most
admins.

Our usage pattern is a bit different, as we are very interested in
measuring bursty traffic.  I'm explain you why during the Netfilter
Workshop, if I get my talk about IPTV accepted ;-)

I also use the increased accuracy, when running my pktgen testing
scripts.


> So the "resulted in false observations from userspace." is something
> that might upset admins, but is not a hard requirement of netdev stats.

Its definitly not a hard requirement, but lets fix it where we can.

Then change the text:
 "resulted in false observations from userspace"
to:
 "resulted in inaccurate observations from userspace"
diff mbox

Patch

diff --git a/drivers/net/igb/igb_main.c b/drivers/net/igb/igb_main.c
index 55edcb7..6cec297 100644
--- a/drivers/net/igb/igb_main.c
+++ b/drivers/net/igb/igb_main.c
@@ -4218,11 +4218,17 @@  static void igb_reset_task(struct work_struct *work)
  * @netdev: network interface device structure
  *
  * Returns the address of the device statistics structure.
- * The statistics are actually updated from the timer callback.
+ * The statistics are also updated from the timer callback
+ * igb_watchdog_task().
  **/
 static struct net_device_stats *igb_get_stats(struct net_device *netdev)
 {
-	/* only return the current stats */
+	struct igb_adapter *adapter = netdev_priv(netdev);
+
+	/* update stats */
+	igb_update_stats(adapter);
+
+	/* return the current stats */
 	return &netdev->stats;
 }
 
@@ -4307,7 +4313,7 @@  static int igb_change_mtu(struct net_device *netdev, int new_mtu)
 
 void igb_update_stats(struct igb_adapter *adapter)
 {
-	struct net_device_stats *net_stats = igb_get_stats(adapter->netdev);
+	struct net_device_stats *net_stats = &adapter->netdev->stats;
 	struct e1000_hw *hw = &adapter->hw;
 	struct pci_dev *pdev = adapter->pdev;
 	u32 reg, mpc;