| Message ID | 20170419142903.GJ4730@C02RW35GFVH8.dhcp.broadcom.net |
|---|---|
| State | RFC, archived |
| Delegated to | David Miller |
On Wed, Apr 19, 2017 at 10:29:03AM -0400, Andy Gospodarek wrote:
>
> I ran this on top of a card that uses the bnxt_en driver on a desktop
> class system with an i7-6700 CPU @ 3.40GHz, sending a single stream of
> UDP traffic with flow control disabled and saw the following (all stats
> in Million PPS).
>
>                 xdp1    xdp2               xdp_tx_tunnel
> Generic XDP     7.8     5.5 (1.3 actual)   4.6 (1.1 actual)
> Optimized XDP   11.7    9.7                4.6

Nice! Thanks for testing.

> One thing to note is that the Generic XDP case shows some different
> results reported by the application vs actual (seen on the wire). I
> did not debug where the drops are happening and what counter needs to be
> incremented to note this -- I'll add that to my TODO list. The
> Optimized XDP case does not have a difference in reported vs actual
> frames on the wire.

The missed packets are probably due to the xmit queue being full.
We need an 'xdp_tx_full' counter in:

+	if (free_skb) {
+		trace_xdp_exception(dev, xdp_prog, XDP_TX);
+		kfree_skb(skb);
+	}

like in-driver XDP does.
It's surprising that tx becomes full so often. Maybe bnxt-specific behavior?

> I agree with all those who have asserted that this is a great tool for
> those that want to get started with XDP but do not have hardware, so I'd
> say it's ready to have the 'RFC' tag dropped. Thanks for pushing this
> forward, Dave! :-)

+1
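For readers following along: below is a minimal sketch of the kind of accounting Alexei is asking for in the generic-XDP TX path. It is written as if it lived in net/core/dev.c (where HARD_TX_LOCK() is available); the `xdp_tx_full_drops` counter and the function name are assumptions for illustration, not code from the patch under review.

```c
/* Illustrative sketch only: count frames the generic-XDP TX path had to
 * drop because the xmit queue was stopped or the driver rejected them.
 * 'xdp_tx_full_drops' is a hypothetical counter, not a real kernel symbol.
 */
static atomic_long_t xdp_tx_full_drops;

static void generic_xdp_tx_sketch(struct sk_buff *skb, struct bpf_prog *xdp_prog)
{
	struct net_device *dev = skb->dev;
	struct netdev_queue *txq;
	bool free_skb = true;
	int cpu, rc;

	txq = netdev_pick_tx(dev, skb, NULL);
	cpu = smp_processor_id();
	HARD_TX_LOCK(dev, txq, cpu);
	if (!netif_xmit_stopped(txq)) {
		rc = netdev_start_xmit(skb, dev, txq, false);
		if (dev_xmit_complete(rc))
			free_skb = false;
	}
	HARD_TX_UNLOCK(dev, txq);

	if (free_skb) {
		/* This is the spot Alexei points at: the frame never made it
		 * to the wire, so count it in addition to the tracepoint. */
		atomic_long_inc(&xdp_tx_full_drops);
		trace_xdp_exception(dev, xdp_prog, XDP_TX);
		kfree_skb(skb);
	}
}
```

Such a counter would let the application-reported rate be reconciled with what is actually seen on the wire, which is exactly the discrepancy Andy observed.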
On 17-04-19 10:17 AM, Alexei Starovoitov wrote:
> On Wed, Apr 19, 2017 at 10:29:03AM -0400, Andy Gospodarek wrote:
>>
>> I ran this on top of a card that uses the bnxt_en driver on a desktop
>> class system with an i7-6700 CPU @ 3.40GHz, sending a single stream of
>> UDP traffic with flow control disabled and saw the following (all stats
>> in Million PPS).
>>
>>                 xdp1    xdp2               xdp_tx_tunnel
>> Generic XDP     7.8     5.5 (1.3 actual)   4.6 (1.1 actual)
>> Optimized XDP   11.7    9.7                4.6
>
> Nice! Thanks for testing.
>
>> One thing to note is that the Generic XDP case shows some different
>> results reported by the application vs actual (seen on the wire). I
>> did not debug where the drops are happening and what counter needs to be
>> incremented to note this -- I'll add that to my TODO list. The
>> Optimized XDP case does not have a difference in reported vs actual
>> frames on the wire.
>
> The missed packets are probably due to the xmit queue being full.
> We need an 'xdp_tx_full' counter in:
> +	if (free_skb) {
> +		trace_xdp_exception(dev, xdp_prog, XDP_TX);
> +		kfree_skb(skb);
> +	}
> like in-driver XDP does.
> It's surprising that tx becomes full so often. Maybe bnxt-specific behavior?

Hmm, as a data point I get better numbers than 1.3Mpps running through the qdisc
layer with pktgen, so it seems like something is wrong with the driver perhaps? If
I get a chance I'll take a look with my setup here, although it likely won't be
until the weekend. I don't think it needs to slow down dropping the RFC tag
and getting the patch applied though.

>
>> I agree with all those who have asserted that this is a great tool for
>> those that want to get started with XDP but do not have hardware, so I'd
>> say it's ready to have the 'RFC' tag dropped. Thanks for pushing this
>> forward, Dave! :-)
>
> +1
On Wed, Apr 19, 2017 at 10:44:59AM -0700, John Fastabend wrote:
> On 17-04-19 10:17 AM, Alexei Starovoitov wrote:
> > On Wed, Apr 19, 2017 at 10:29:03AM -0400, Andy Gospodarek wrote:
> >>
> >> I ran this on top of a card that uses the bnxt_en driver on a desktop
> >> class system with an i7-6700 CPU @ 3.40GHz, sending a single stream of
> >> UDP traffic with flow control disabled and saw the following (all stats
> >> in Million PPS).
> >>
> >>                 xdp1    xdp2               xdp_tx_tunnel
> >> Generic XDP     7.8     5.5 (1.3 actual)   4.6 (1.1 actual)
> >> Optimized XDP   11.7    9.7                4.6
> >
> > Nice! Thanks for testing.
> >
> >> One thing to note is that the Generic XDP case shows some different
> >> results reported by the application vs actual (seen on the wire). I
> >> did not debug where the drops are happening and what counter needs to be
> >> incremented to note this -- I'll add that to my TODO list. The
> >> Optimized XDP case does not have a difference in reported vs actual
> >> frames on the wire.
> >
> > The missed packets are probably due to the xmit queue being full.
> > We need an 'xdp_tx_full' counter in:
> > +	if (free_skb) {
> > +		trace_xdp_exception(dev, xdp_prog, XDP_TX);
> > +		kfree_skb(skb);
> > +	}
> > like in-driver XDP does.
> > It's surprising that tx becomes full so often. Maybe bnxt-specific behavior?
>
> Hmm, as a data point I get better numbers than 1.3Mpps running through the qdisc
> layer with pktgen, so it seems like something is wrong with the driver perhaps? If

I get ~6.5Mpps on a single core with pktgen, so inconclusive for now...

> I get a chance I'll take a look with my setup here, although it likely won't be
> until the weekend. I don't think it needs to slow down dropping the RFC tag
> and getting the patch applied though.
>
> >
> >> I agree with all those who have asserted that this is a great tool for
> >> those that want to get started with XDP but do not have hardware, so I'd
> >> say it's ready to have the 'RFC' tag dropped. Thanks for pushing this
> >> forward, Dave! :-)
> >
> > +1
On Wed, Apr 19, 2017 at 04:25:43PM -0400, Andy Gospodarek wrote:
> On Wed, Apr 19, 2017 at 10:44:59AM -0700, John Fastabend wrote:
> > On 17-04-19 10:17 AM, Alexei Starovoitov wrote:
> > > On Wed, Apr 19, 2017 at 10:29:03AM -0400, Andy Gospodarek wrote:
> > >>
> > >> I ran this on top of a card that uses the bnxt_en driver on a desktop
> > >> class system with an i7-6700 CPU @ 3.40GHz, sending a single stream of
> > >> UDP traffic with flow control disabled and saw the following (all stats
> > >> in Million PPS).
> > >>
> > >>                 xdp1    xdp2               xdp_tx_tunnel
> > >> Generic XDP     7.8     5.5 (1.3 actual)   4.6 (1.1 actual)
> > >> Optimized XDP   11.7    9.7                4.6
> > >
> > > Nice! Thanks for testing.
> > >
> > >> One thing to note is that the Generic XDP case shows some different
> > >> results reported by the application vs actual (seen on the wire). I
> > >> did not debug where the drops are happening and what counter needs to be
> > >> incremented to note this -- I'll add that to my TODO list. The
> > >> Optimized XDP case does not have a difference in reported vs actual
> > >> frames on the wire.
> > >
> > > The missed packets are probably due to the xmit queue being full.
> > > We need an 'xdp_tx_full' counter in:
> > > +	if (free_skb) {
> > > +		trace_xdp_exception(dev, xdp_prog, XDP_TX);
> > > +		kfree_skb(skb);
> > > +	}
> > > like in-driver XDP does.
> > > It's surprising that tx becomes full so often. Maybe bnxt-specific behavior?
> >
> > Hmm, as a data point I get better numbers than 1.3Mpps running through the qdisc
> > layer with pktgen, so it seems like something is wrong with the driver perhaps? If
>
> I get ~6.5Mpps on a single core with pktgen, so inconclusive for now...

Maybe your tx queue is simply smaller than your rx queue?
From: Andy Gospodarek <andy@greyhouse.net>
Date: Wed, 19 Apr 2017 10:29:03 -0400

> So I tried a variety of things, and the simplest change on top of yours that
> works well for xdp1, xdp2, and xdp_tx_iptunnel is:
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index b3d3a6e..1bab3dc 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -4316,11 +4316,11 @@ static u32 netif_receive_generic_xdp(struct sk_buff *skb,
>
> 	off = xdp.data - orig_data;
> 	if (off)
> -		__skb_push(skb, off);
> +		__skb_push(skb, -off);

We have to handle both pushing and popping headers, so could you
please test the snippet I asked you to try?

>	if (off > 0)
>		__skb_pull(skb, off);
>	else if (off < 0)
>		__skb_push(skb, -off);

Thanks.
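To see why both directions matter: an XDP program can move the packet-data pointer either way with bpf_xdp_adjust_head(), so the offset between xdp.data and orig_data can be negative (the head grew) or positive (the head shrank). Below is a minimal, hypothetical XDP program illustrating both cases; the function name and section are made up for this sketch and it is not part of the patch.

```c
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

SEC("xdp")
int xdp_adjust_head_demo(struct xdp_md *ctx)
{
	/* Grow the headroom by 16 bytes: xdp.data moves to a *lower*
	 * address, so off = xdp.data - orig_data becomes negative and the
	 * skb side must __skb_push() the difference.  A real program
	 * (e.g. xdp_tx_iptunnel) would now write an encapsulation header
	 * into the new bytes before letting the packet continue.
	 */
	if (bpf_xdp_adjust_head(ctx, -16))
		return XDP_ABORTED;

	/* The opposite call, bpf_xdp_adjust_head(ctx, 20), would instead
	 * strip 20 bytes from the front (off becomes positive), which is
	 * why the generic path also needs the __skb_pull() branch.
	 */
	return XDP_PASS;
}

char _license[] SEC("license") = "GPL";
```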
On Wed, 19 Apr 2017 10:29:03 -0400
Andy Gospodarek <andy@greyhouse.net> wrote:

> I ran this on top of a card that uses the bnxt_en driver on a desktop
> class system with an i7-6700 CPU @ 3.40GHz, sending a single stream of
> UDP traffic with flow control disabled and saw the following (all stats
> in Million PPS).
>
>                 xdp1    xdp2               xdp_tx_tunnel
> Generic XDP     7.8     5.5 (1.3 actual)   4.6 (1.1 actual)
> Optimized XDP   11.7    9.7                4.6
>
> One thing to note is that the Generic XDP case shows some different
> results reported by the application vs actual (seen on the wire). I
> did not debug where the drops are happening and what counter needs to be
> incremented to note this -- I'll add that to my TODO list. The
> Optimized XDP case does not have a difference in reported vs actual
> frames on the wire.

The reported application vs actual (seen on the wire) numbers sound scary.
How do you evaluate/measure "seen on the wire"?

Perhaps you could use ethtool -S stats to see if anything is fishy?
I recommend using my tool[1] like:

 ~/git/network-testing/bin/ethtool_stats.pl --dev mlx5p2 --sec 2

[1] https://github.com/netoptimizer/network-testing/blob/master/bin/ethtool_stats.pl

I'm evaluating this patch on an mlx5 NIC, and something is not right...
I'm seeing:

 Ethtool(mlx5p2) stat:   349599 (        349,599) <= tx_multicast_phy /sec
 Ethtool(mlx5p2) stat:  4940185 (      4,940,185) <= tx_packets /sec
 Ethtool(mlx5p2) stat:   349596 (        349,596) <= tx_packets_phy /sec
 [...]
 Ethtool(mlx5p2) stat:    36898 (         36,898) <= rx_cache_busy /sec
 Ethtool(mlx5p2) stat:    36898 (         36,898) <= rx_cache_full /sec
 Ethtool(mlx5p2) stat:  4903287 (      4,903,287) <= rx_cache_reuse /sec
 Ethtool(mlx5p2) stat:  4940185 (      4,940,185) <= rx_csum_complete /sec
 Ethtool(mlx5p2) stat:  4940185 (      4,940,185) <= rx_packets /sec

Something is wrong... when I tcpdump on the generator machine, I see
garbled packets with IPv6 multicast addresses.

And it looks like I'm only sending 349,596 tx_packets_phy/sec on the "wire".
On Wed, Apr 19, 2017 at 09:40:49PM -0400, David Miller wrote:
> From: Andy Gospodarek <andy@greyhouse.net>
> Date: Wed, 19 Apr 2017 10:29:03 -0400
>
> > So I tried a variety of things, and the simplest change on top of yours that
> > works well for xdp1, xdp2, and xdp_tx_iptunnel is:
> >
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index b3d3a6e..1bab3dc 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -4316,11 +4316,11 @@ static u32 netif_receive_generic_xdp(struct sk_buff *skb,
> >
> > 	off = xdp.data - orig_data;
> > 	if (off)
> > -		__skb_push(skb, off);
> > +		__skb_push(skb, -off);
>
> We have to handle both pushing and popping headers, so could you
> please test the snippet I asked you to try?

I will tomorrow or by Monday of next week. I'm also going to hack^W write a
quick test app to exercise it as well.

> >	if (off > 0)
> >		__skb_pull(skb, off);
> >	else if (off < 0)
> >		__skb_push(skb, -off);
>
> Thanks.
On Thu, 20 Apr 2017 16:30:34 +0200
Jesper Dangaard Brouer <brouer@redhat.com> wrote:

> On Wed, 19 Apr 2017 10:29:03 -0400
> Andy Gospodarek <andy@greyhouse.net> wrote:
>
> > I ran this on top of a card that uses the bnxt_en driver on a desktop
> > class system with an i7-6700 CPU @ 3.40GHz, sending a single stream of
> > UDP traffic with flow control disabled and saw the following (all stats
> > in Million PPS).
> >
> >                 xdp1    xdp2               xdp_tx_tunnel
> > Generic XDP     7.8     5.5 (1.3 actual)   4.6 (1.1 actual)
> > Optimized XDP   11.7    9.7                4.6
> >
> > One thing to note is that the Generic XDP case shows some different
> > results reported by the application vs actual (seen on the wire). I
> > did not debug where the drops are happening and what counter needs to be
> > incremented to note this -- I'll add that to my TODO list. The
> > Optimized XDP case does not have a difference in reported vs actual
> > frames on the wire.
>
> The reported application vs actual (seen on the wire) numbers sound scary.
> How do you evaluate/measure "seen on the wire"?
>
> Perhaps you could use ethtool -S stats to see if anything is fishy?
> I recommend using my tool[1] like:
>
>  ~/git/network-testing/bin/ethtool_stats.pl --dev mlx5p2 --sec 2
>
> [1] https://github.com/netoptimizer/network-testing/blob/master/bin/ethtool_stats.pl
>
> I'm evaluating this patch on an mlx5 NIC, and something is not right...
> I'm seeing:
>
>  Ethtool(mlx5p2) stat:   349599 (        349,599) <= tx_multicast_phy /sec
>  Ethtool(mlx5p2) stat:  4940185 (      4,940,185) <= tx_packets /sec
>  Ethtool(mlx5p2) stat:   349596 (        349,596) <= tx_packets_phy /sec
>  [...]
>  Ethtool(mlx5p2) stat:    36898 (         36,898) <= rx_cache_busy /sec
>  Ethtool(mlx5p2) stat:    36898 (         36,898) <= rx_cache_full /sec
>  Ethtool(mlx5p2) stat:  4903287 (      4,903,287) <= rx_cache_reuse /sec
>  Ethtool(mlx5p2) stat:  4940185 (      4,940,185) <= rx_csum_complete /sec
>  Ethtool(mlx5p2) stat:  4940185 (      4,940,185) <= rx_packets /sec
>
> Something is wrong... when I tcpdump on the generator machine, I see
> garbled packets with IPv6 multicast addresses.
>
> And it looks like I'm only sending 349,596 tx_packets_phy/sec on the "wire".

Not seeing packets on the TX wire was caused by the NIC HW dropping the
packets, because the Ethernet MAC addresses were not changed/swapped.

Fixed this XDP_TX bug in my test program xdp_bench01_mem_access_cost:
https://github.com/netoptimizer/prototype-kernel/commit/85f7ba2f0ea2

I also added a new --swapmac option as another test case that modifies
the packet:
https://github.com/netoptimizer/prototype-kernel/commit/fe080e6f3ccf

I will shortly publish a full report of testing this patch.
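For readers who hit the same wall: the fix Jesper describes amounts to swapping the Ethernet source and destination MACs before returning XDP_TX, so the NIC/switch accepts the reflected frame instead of silently dropping it. Below is a small illustrative XDP program; it is not the code from xdp_bench01_mem_access_cost, and the function names are made up for this sketch.

```c
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <bpf/bpf_helpers.h>

/* Swap Ethernet source and destination addresses in place. */
static __always_inline void swap_src_dst_mac(struct ethhdr *eth)
{
	__u8 tmp[ETH_ALEN];

	__builtin_memcpy(tmp, eth->h_source, ETH_ALEN);
	__builtin_memcpy(eth->h_source, eth->h_dest, ETH_ALEN);
	__builtin_memcpy(eth->h_dest, tmp, ETH_ALEN);
}

SEC("xdp")
int xdp_tx_swapmac(struct xdp_md *ctx)
{
	void *data = (void *)(long)ctx->data;
	void *data_end = (void *)(long)ctx->data_end;
	struct ethhdr *eth = data;

	/* Bounds check required by the verifier before touching the header. */
	if ((void *)(eth + 1) > data_end)
		return XDP_DROP;

	swap_src_dst_mac(eth);
	return XDP_TX;
}

char _license[] SEC("license") = "GPL";
```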
diff --git a/net/core/dev.c b/net/core/dev.c
index b3d3a6e..1bab3dc 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4316,11 +4316,11 @@ static u32 netif_receive_generic_xdp(struct sk_buff *skb,
 
 	off = xdp.data - orig_data;
 	if (off)
-		__skb_push(skb, off);
+		__skb_push(skb, -off);
 
 	switch (act) {
 	case XDP_TX:
-		__skb_push(skb, skb->mac_len);
+		__skb_push(skb, mac_len);
 		/* fall through */
 	case XDP_PASS:
 		break;
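Putting the thread's two observations together, the head adjustment in netif_receive_generic_xdp() would need to handle both signs of the offset (David's point) and use the mac_len saved before the program ran rather than skb->mac_len (Andy's diff above). The following fragment is a sketch reconstructed from the discussion, not necessarily the code that was ultimately committed.

```c
	/* After the XDP program has run, xdp.data may have moved in either
	 * direction relative to orig_data (see bpf_xdp_adjust_head()), so
	 * handle both cases.  For XDP_TX, push back the mac_len that was
	 * saved before the program ran, since the program may have
	 * rewritten the headers.
	 */
	off = xdp.data - orig_data;
	if (off > 0)
		__skb_pull(skb, off);
	else if (off < 0)
		__skb_push(skb, -off);

	switch (act) {
	case XDP_TX:
		__skb_push(skb, mac_len);
		/* fall through */
	case XDP_PASS:
		break;
	default:
		/* XDP_DROP / XDP_ABORTED handling unchanged */
		break;
	}
```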