Message ID | 1542072871-21208-1-git-send-email-lirongqing@baidu.com |
---|---|
State | Accepted, archived |
Delegated to: | David Miller |
Headers | show |
Series | [net-next] net: slightly optimize eth_type_trans | expand |
From: Li RongQing <lirongqing@baidu.com> Date: Tue, 13 Nov 2018 09:34:31 +0800 > netperf udp stream shows that eth_type_trans takes certain cpu, > so adjust the mac address check order, and firstly check if it > is device address, and only check if it is multicast address > only if not the device address. > > After this change: > To unicast, and skb dst mac is device mac, this is most of time > reduce a comparision > To unicast, and skb dst mac is not device mac, nothing change > To multicast, increase a comparision > > Before: > 1.03% [kernel] [k] eth_type_trans > > After: > 0.78% [kernel] [k] eth_type_trans > > Signed-off-by: Zhang Yu <zhangyu31@baidu.com> > Signed-off-by: Li RongQing <lirongqing@baidu.com> Applied.
On 11/15/2018 03:11 PM, David Miller wrote: > > Applied. > While reviewing this stuff, I found we have a bug. If napi_reuse_skb() is called, we might inherit from prior skb->pkt_type value. It seems that GRO could aggregate packets with pkt_type != PACKET_HOST, right ? David, any objection if I submit the following fix ? diff --git a/net/core/dev.c b/net/core/dev.c index 5927f6a7c301ed90af21a4b82b443f30f00bb483..f2bfd2eda7b2734d29d30f0e82c1a48c1b5b166a 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -5664,6 +5664,10 @@ static void napi_reuse_skb(struct napi_struct *napi, struct sk_buff *skb) __vlan_hwaccel_clear_tag(skb); skb->dev = napi->dev; skb->skb_iif = 0; + + /* eth_type_trans() assumes initial pkt_type is PACKET_HOST */ + skb->pkt_type = PACKET_HOST; + skb->encapsulation = 0; skb_shinfo(skb)->gso_type = 0; skb->truesize = SKB_TRUESIZE(skb_end_offset(skb));
From: Eric Dumazet <eric.dumazet@gmail.com> Date: Sat, 17 Nov 2018 14:37:12 -0800 > > > On 11/15/2018 03:11 PM, David Miller wrote: > >> >> Applied. >> > > While reviewing this stuff, I found we have a bug. > > If napi_reuse_skb() is called, we might inherit from prior skb->pkt_type value. > > It seems that GRO could aggregate packets with pkt_type != PACKET_HOST, right ? > > David, any objection if I submit the following fix ? Oh weird, so we do GRO frags accumulation using SKB which never goes through eth_type_trans()? I don't understand how we can, in this circumstance, assume PACKET_HOST? Because that is what your suggested patch does. Frame could be UDP multicast, and we could legitimately GRO accumulate it. In that situations setting PACKET_HOST doesn't seem correct.
On 11/17/2018 04:51 PM, David Miller wrote: > From: Eric Dumazet <eric.dumazet@gmail.com> > Date: Sat, 17 Nov 2018 14:37:12 -0800 > >> >> >> On 11/15/2018 03:11 PM, David Miller wrote: >> >>> >>> Applied. >>> >> >> While reviewing this stuff, I found we have a bug. >> >> If napi_reuse_skb() is called, we might inherit from prior skb->pkt_type value. >> >> It seems that GRO could aggregate packets with pkt_type != PACKET_HOST, right ? >> >> David, any objection if I submit the following fix ? > > Oh weird, so we do GRO frags accumulation using SKB which never goes through > eth_type_trans()? > > I don't understand how we can, in this circumstance, assume PACKET_HOST? > > Because that is what your suggested patch does. > > Frame could be UDP multicast, and we could legitimately GRO accumulate it. > In that situations setting PACKET_HOST doesn't seem correct. > I might have been not very clear. Issue is that with macvlan and GRO-friend traffic we can receive the following packets : P1-P2 with PACKET_OTHERHOST because eth_type_trans() detected the dst MAC is not the eth0 device mac address. P2 has been aggregated to P1, so the sk_buff has been put to napi->skb via napi_reuse_skb() Then we receive on same NAPI packet P3, for this host, reusing napi->skb that was saved (old P2 sk_buff). skb->pkt_type is PACKET_OTHERHOST. eth_type_trans() does not change skb->pkt_type because ethernet dst mac address is our ethernet mac address. -> We feed the upper stack with P3, with incorrect pkt_type. -> packet is dropped because pkt_type != PACKET_HOST, for example in tcp_v4_rcv()
From: Eric Dumazet <eric.dumazet@gmail.com> Date: Sat, 17 Nov 2018 17:19:34 -0800 > I might have been not very clear. > > Issue is that with macvlan and GRO-friend traffic we can receive the following packets : > > P1-P2 with PACKET_OTHERHOST because eth_type_trans() detected the dst MAC is not the eth0 device mac address. > > P2 has been aggregated to P1, so the sk_buff has been put to napi->skb via napi_reuse_skb() > > Then we receive on same NAPI packet P3, for this host, reusing napi->skb that was saved (old P2 sk_buff). > skb->pkt_type is PACKET_OTHERHOST. > > eth_type_trans() does not change skb->pkt_type because ethernet dst mac address is our ethernet mac address. > > -> We feed the upper stack with P3, with incorrect pkt_type. > > -> packet is dropped because pkt_type != PACKET_HOST, for example in tcp_v4_rcv() Oh I see, it is about defaults when using skb freshly from cache for another new packet. Yes, I completely agree with your suggested patch. Thanks for explaining.
diff --git a/net/ethernet/eth.c b/net/ethernet/eth.c index fd8faa0dfa61..1c88f5c5d5b1 100644 --- a/net/ethernet/eth.c +++ b/net/ethernet/eth.c @@ -165,15 +165,17 @@ __be16 eth_type_trans(struct sk_buff *skb, struct net_device *dev) eth = (struct ethhdr *)skb->data; skb_pull_inline(skb, ETH_HLEN); - if (unlikely(is_multicast_ether_addr_64bits(eth->h_dest))) { - if (ether_addr_equal_64bits(eth->h_dest, dev->broadcast)) - skb->pkt_type = PACKET_BROADCAST; - else - skb->pkt_type = PACKET_MULTICAST; + if (unlikely(!ether_addr_equal_64bits(eth->h_dest, + dev->dev_addr))) { + if (unlikely(is_multicast_ether_addr_64bits(eth->h_dest))) { + if (ether_addr_equal_64bits(eth->h_dest, dev->broadcast)) + skb->pkt_type = PACKET_BROADCAST; + else + skb->pkt_type = PACKET_MULTICAST; + } else { + skb->pkt_type = PACKET_OTHERHOST; + } } - else if (unlikely(!ether_addr_equal_64bits(eth->h_dest, - dev->dev_addr))) - skb->pkt_type = PACKET_OTHERHOST; /* * Some variants of DSA tagging don't have an ethertype field