Patchwork ixgbe patch to provide NIC's tx/rx counters via ethtool

login
register
mail settings
Submitter Ben Greear
Date Sept. 23, 2009, 10:36 p.m.
Message ID <4ABAA2D0.4030608@candelatech.com>
Download mbox | patch
Permalink /patch/34198/
State Awaiting Upstream
Delegated to: David Miller
Headers show

Comments

Ben Greear - Sept. 23, 2009, 10:36 p.m.
When LRO is enabled, the received packet and byte counters represent the
LRO'd packets, not the packets/bytes on the wire.  The Intel 82599 NIC has
registers that keep count of the physical packets.  Add these counters to
the ethtool stats.  The byte counters are 36-bit, but the high 4 bits were
being ignored in the 2.6.31 ixgbe driver:  Read those as well to allow
longer time between polling the stats to detect wraps.

Signed-off-by: Ben Greear <greearb@candelatech.com>


Please do not apply this until the ixgbe authors ACK it.  There may
have been reasons for not reading the high 4 bits, or they may dislike
this approach entirely.


Here is ethtool stats output with LRO enabled, with patch applied:

#ethtool -S eth20
NIC statistics:
      rx_packets: 15944000
      tx_packets: 12339293
      rx_bytes: 272306022656
      tx_bytes: 940244184
      rx_pkts_nic: 187747191
      tx_pkts_nic: 12340822
      rx_bytes_nic: 284695533402
      tx_bytes_nic: 989725050
      lsc_int: 3
...

Thanks,
Ben
Rick Jones - Sept. 24, 2009, 12:02 a.m.
Ben Greear wrote:
> When LRO is enabled, the received packet and byte counters represent the
> LRO'd packets, not the packets/bytes on the wire. 

When LRO is enabled, are all the bytes on the wire actually transferred into the 
host?

rick jones
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ben Greear - Sept. 24, 2009, 2:08 a.m.
Rick Jones wrote:
> Ben Greear wrote:
>> When LRO is enabled, the received packet and byte counters represent the
>> LRO'd packets, not the packets/bytes on the wire. 
>
> When LRO is enabled, are all the bytes on the wire actually 
> transferred into the host?
No...the ethernet, IP and TCP headers and such are not, for packets that 
are combined into a single
large SKB.

That is why the driver counts them wrong.  The bytes are off by a few 
percentage points, but the
packet count is off by an order of magnitude.

Thanks,
Ben
Rick Jones - Sept. 24, 2009, 4:30 p.m.
Ben Greear wrote:
> Rick Jones wrote:
>> Ben Greear wrote:
>>
>>> When LRO is enabled, the received packet and byte counters represent the
>>> LRO'd packets, not the packets/bytes on the wire. 
>>
>>
>> When LRO is enabled, are all the bytes on the wire actually 
>> transferred into the host?
> 
> No...the ethernet, IP and TCP headers and such are not, for packets that 
> are combined into a single large SKB.
> 
> That is why the driver counts them wrong.  The bytes are off by a few 
> percentage points, but the packet count is off by an order of magnitude.

An overly philosphical question perhaps, but are ethtool stats supposed to 
represent what was on the wire, or what entered the host?

rick
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ben Greear - Sept. 24, 2009, 5:07 p.m.
On 09/24/2009 09:30 AM, Rick Jones wrote:
> Ben Greear wrote:
>> Rick Jones wrote:
>>> Ben Greear wrote:
>>>
>>>> When LRO is enabled, the received packet and byte counters represent
>>>> the
>>>> LRO'd packets, not the packets/bytes on the wire.
>>>
>>>
>>> When LRO is enabled, are all the bytes on the wire actually
>>> transferred into the host?
>>
>> No...the ethernet, IP and TCP headers and such are not, for packets
>> that are combined into a single large SKB.
>>
>> That is why the driver counts them wrong. The bytes are off by a few
>> percentage points, but the packet count is off by an order of magnitude.
>
> An overly philosphical question perhaps, but are ethtool stats supposed
> to represent what was on the wire, or what entered the host?

They report whatever they report, you get to set custom labels for the values,
and every NIC/driver may be different, so only humans and crazy code like mine that does
specific things based on the driver reported by ethtool should use it.

A more interesting question to me is what netdev-stats tx/rx byte counters should report?

My opinions:
ethernet header (yes)
ethernet CRC  (yes)
ethernet preamble (no)
ethernet frame gap (no)

I think many don't count the CRC, but I haven't looked recently.

Some didn't even report the ethernet header properly a few years ago, but
I think most do now.

When LRO is enabled, it's hard to say if we should report the LRO pkt
stats or the stats on the wire for the netdev-stats.  At least in my case,
I want to report the stats on the wire, but it's also good to see the
LRO stats because you can easily tell that LRO is actually working if you
see low pkts-per-second counters v/s high-bits-per-sec.


Thanks,
Ben
Waskiewicz Jr, Peter P - Sept. 24, 2009, 6:28 p.m.
On Wed, 2009-09-23 at 15:36 -0700, Ben Greear wrote:
> When LRO is enabled, the received packet and byte counters represent the
> LRO'd packets, not the packets/bytes on the wire.  The Intel 82599 NIC has
> registers that keep count of the physical packets.  Add these counters to
> the ethtool stats.  The byte counters are 36-bit, but the high 4 bits were
> being ignored in the 2.6.31 ixgbe driver:  Read those as well to allow
> longer time between polling the stats to detect wraps.
> 
> Signed-off-by: Ben Greear <greearb@candelatech.com>
> 
> 
> Please do not apply this until the ixgbe authors ACK it.  There may
> have been reasons for not reading the high 4 bits, or they may dislike
> this approach entirely.

Aside from the trivial line-wrap on the comments, I'm fine with this
patch.  There is no issue I could find with the hardware that would
limit you from reading the high 4 bits.  And since we're reading it
already to clear the register, we might as well use the value we get
from it.

Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jeff Kirsher - Sept. 24, 2009, 7:10 p.m.
On Thu, Sep 24, 2009 at 11:28, Peter P Waskiewicz Jr
<peter.p.waskiewicz.jr@intel.com> wrote:
> On Wed, 2009-09-23 at 15:36 -0700, Ben Greear wrote:
>> When LRO is enabled, the received packet and byte counters represent the
>> LRO'd packets, not the packets/bytes on the wire.  The Intel 82599 NIC has
>> registers that keep count of the physical packets.  Add these counters to
>> the ethtool stats.  The byte counters are 36-bit, but the high 4 bits were
>> being ignored in the 2.6.31 ixgbe driver:  Read those as well to allow
>> longer time between polling the stats to detect wraps.
>>
>> Signed-off-by: Ben Greear <greearb@candelatech.com>
>>
>>
>> Please do not apply this until the ixgbe authors ACK it.  There may
>> have been reasons for not reading the high 4 bits, or they may dislike
>> this approach entirely.
>
> Aside from the trivial line-wrap on the comments, I'm fine with this
> patch.  There is no issue I could find with the hardware that would
> limit you from reading the high 4 bits.  And since we're reading it
> already to clear the register, we might as well use the value we get
> from it.
>
> Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
>

I have added this patch to my tree and will push along with my other
ixgbe patches to dave/netdev.  Thanks.

Patch

diff --git a/drivers/net/ixgbe/ixgbe_ethtool.c b/drivers/net/ixgbe/ixgbe_ethtool.c
index dff8dfa..da3cba3 100644
--- a/drivers/net/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ixgbe/ixgbe_ethtool.c
@@ -53,6 +53,10 @@  static struct ixgbe_stats ixgbe_gstrings_stats[] = {
 	{"tx_packets", IXGBE_STAT(net_stats.tx_packets)},
 	{"rx_bytes", IXGBE_STAT(net_stats.rx_bytes)},
 	{"tx_bytes", IXGBE_STAT(net_stats.tx_bytes)},
+	{"rx_pkts_nic", IXGBE_STAT(stats.gprc)},
+	{"tx_pkts_nic", IXGBE_STAT(stats.gptc)},
+	{"rx_bytes_nic", IXGBE_STAT(stats.gorc)},
+	{"tx_bytes_nic", IXGBE_STAT(stats.gotc)},
 	{"lsc_int", IXGBE_STAT(lsc_int)},
 	{"tx_busy", IXGBE_STAT(tx_busy)},
 	{"non_eop_descs", IXGBE_STAT(non_eop_descs)},
diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index 77b0381..929a847 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -4377,10 +4377,13 @@  void ixgbe_update_stats(struct ixgbe_adapter *adapter)
 
 	/* 82598 hardware only has a 32 bit counter in the high register */
 	if (hw->mac.type == ixgbe_mac_82599EB) {
+		u64 tmp;
 		adapter->stats.gorc += IXGBE_READ_REG(hw, IXGBE_GORCL);
-		IXGBE_READ_REG(hw, IXGBE_GORCH); /* to clear */
+		tmp = IXGBE_READ_REG(hw, IXGBE_GORCH) & 0xF; /* 4 high bits of GORC */
+		adapter->stats.gorc += (tmp << 32);
 		adapter->stats.gotc += IXGBE_READ_REG(hw, IXGBE_GOTCL);
-		IXGBE_READ_REG(hw, IXGBE_GOTCH); /* to clear */
+		tmp = IXGBE_READ_REG(hw, IXGBE_GOTCH) & 0xF; /* 4 high bits of GOTC */
+		adapter->stats.gotc += (tmp << 32);
 		adapter->stats.tor += IXGBE_READ_REG(hw, IXGBE_TORL);
 		IXGBE_READ_REG(hw, IXGBE_TORH); /* to clear */
 		adapter->stats.lxonrxc += IXGBE_READ_REG(hw, IXGBE_LXONRXCNT);