Message ID | 1319799986.23112.101.camel@edumazet-laptop |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
Headers | show |
On 10/28/2011 4:06 AM, Eric Dumazet wrote: > Le vendredi 28 octobre 2011 à 12:00 +0200, Eric Dumazet a écrit : > >> Oh well, this broke my setup, a very basic one. >> >> eth1 and eth2 on a bonding device, bond0, active-backup >> >> some vlans on top of bond0, say vlan.103 >> >> $ ip link show dev vlan.103 >> 8: vlan.103@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc >> pfifo_fast state UP qlen 100 >> link/ether 00:1e:0b:ec:d3:d2 brd ff:ff:ff:ff:ff:ff >> >> >> arp_rcv() now gets packets with skb->type PACKET_OTHERHOST and drops >> such packets. >> >> [000] 52870.115435: skb_gro_reset_offset <-napi_gro_receive >> [000] 52870.115435: dev_gro_receive <-napi_gro_receive >> [000] 52870.115435: napi_skb_finish <-napi_gro_receive >> [000] 52870.115435: netif_receive_skb <-napi_skb_finish >> [000] 52870.115435: get_rps_cpu <-netif_receive_skb >> [000] 52870.115435: __netif_receive_skb <-netif_receive_skb >> [000] 52870.115436: vlan_do_receive <-__netif_receive_skb >> [000] 52870.115436: bond_handle_frame <-__netif_receive_skb >> [000] 52870.115436: vlan_do_receive <-__netif_receive_skb >> [000] 52870.115436: arp_rcv <-__netif_receive_skb >> [000] 52870.115436: kfree_skb <-arp_rcv >> [000] 52870.115437: __kfree_skb <-kfree_skb >> [000] 52870.115437: skb_release_head_state <-__kfree_skb >> [000] 52870.115437: skb_release_data <-__kfree_skb >> [000] 52870.115437: kfree <-skb_release_data >> [000] 52870.115437: kmem_cache_free <-__kfree_skb >> >> >> By the way, we have no SNMP counter here so I spent some time to track >> this. I'll send a patch for this. >> >> If this host initiates the trafic, all is well. >> >> Please guys, can we get back ARP or revert this patch ? > > Following patch cures the problem, I am not sure its the right fix. > > Problem is we dont know how many times vlan_do_receive() can be called > for a packet. > > Only last call should set/mess pkt_type to PACKET_OTHERHOST. > > So the caller should be responsible for this, not vlan_do_receive() > > > Alternative would be to check skb->dev->rx_handler being NULL, > but its not clean. > > Following patch is a hack because it handles multicast/broadcast trafic > only. Unicast is already handled in lines 26-33, this is why we didnt > catch the problem. > > diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c > index f1f2f7b..6861899 100644 > --- a/net/8021q/vlan_core.c > +++ b/net/8021q/vlan_core.c > @@ -13,7 +13,7 @@ bool vlan_do_receive(struct sk_buff **skbp) > > vlan_dev = vlan_find_dev(skb->dev, vlan_id); > if (!vlan_dev) { > - if (vlan_id) > + if (vlan_id && skb->pkt_type == PACKET_HOST) > skb->pkt_type = PACKET_OTHERHOST; > return false; > } > Thanks Eric! Thought about this some and I haven't come up with anything better yet. Even though this might be a slight hack I would prefer this to reverting the patch. I'll think about this more tomorrow. Would you be against submitting this patch? .John -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c index f1f2f7b..6861899 100644 --- a/net/8021q/vlan_core.c +++ b/net/8021q/vlan_core.c @@ -13,7 +13,7 @@ bool vlan_do_receive(struct sk_buff **skbp) vlan_dev = vlan_find_dev(skb->dev, vlan_id); if (!vlan_dev) { - if (vlan_id) + if (vlan_id && skb->pkt_type == PACKET_HOST) skb->pkt_type = PACKET_OTHERHOST; return false; }