Message ID | 1937058599.2214531.1365659704193.JavaMail.root@redhat.com |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
On 4/10/2013 10:55 PM, Cong Wang wrote:
>
> ----- Original Message -----
>> On 4/10/2013 7:10 PM, Cong Wang wrote:
>>>> - when source and destination endpoints belonging to different vni's
>>>> are on 2 different bridges on the same host. encap bypass is done
>>>> in this scenario by checking if rt_flags has RTCF_LOCAL set. I think
>>>> you must be hitting this path and the following patch should fix
>>>> it by only doing bypass if the source and dest devices belong to
>>>> the same net. Can you try it and see if it fixes your tests?
>>> I just tested it, unfortunately it doesn't work, the bug still exists.
>>>
>>> If you need any other info, please let me know.
>> So does it mean that you are hitting the if condition that does encap bypass
>> even after the net_eq() check? Do the tests pass if you comment out the
>> 'if' block?
> Yes, after adding a printk inside the 'if' block, I got:
>
> [ 71.456329] vxlan: dev: vxlan0, dst: 224.8.8.8, dst dev: veth0
> [ 71.596551] vxlan: dev: vxlan0, dst: 224.8.8.8, dst dev: veth1
> [ 72.028574] vxlan: dev: vxlan0, dst: 224.8.8.8, dst dev: veth0
> [ 72.436384] vxlan: dev: vxlan0, dst: 224.8.8.8, dst dev: veth1
> [ 73.028576] vxlan: dev: vxlan0, dst: 224.8.8.8, dst dev: veth0
> [ 73.185134] vxlan: dev: vxlan0, dst: 224.8.8.8, dst dev: veth0
> [ 73.436582] vxlan: dev: vxlan0, dst: 224.8.8.8, dst dev: veth1
> [ 74.184251] vxlan: dev: vxlan0, dst: 224.8.8.8, dst dev: veth0
>
> It seems the dst dev is the dev which vxlan0 setup on, so
> there is no way to know if the packet is targeted for a different netns
> on the same host, at least I don't find such RTCF_* flag.
>
> I'd propose to revert that commit partially:

I think we should spend some more time to address this issue correctly.
Bypassing encap makes a significant improvement in performance when the
dest. endpoint is on the same host.
So is vxlan_encap_bypass() getting called or are you hitting goto tx_error?
>
> diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
> index 9a64715..0847564 100644
> --- a/drivers/net/vxlan.c
> +++ b/drivers/net/vxlan.c
> @@ -1012,18 +1012,6 @@ static netdev_tx_t vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
>  		goto tx_error;
>  	}
>
> -	/* Bypass encapsulation if the destination is local */
> -	if (rt->rt_flags & RTCF_LOCAL) {
> -		struct vxlan_dev *dst_vxlan;
> -
> -		ip_rt_put(rt);
> -		dst_vxlan = vxlan_find_vni(dev_net(dev), vni);
> -		if (!dst_vxlan)
> -			goto tx_error;
> -		vxlan_encap_bypass(skb, vxlan, dst_vxlan);
> -		return NETDEV_TX_OK;
> -	}
> -
>  	memset(&(IPCB(skb)->opt), 0, sizeof(IPCB(skb)->opt));
>  	IPCB(skb)->flags &= ~(IPSKB_XFRM_TUNNEL_SIZE | IPSKB_XFRM_TRANSFORMED |
>  			      IPSKB_REROUTED);
>
>> Can you share your test config/scripts so that i can try out your setup if
>> it is not too complicated?
>>
>
> Sure, here is what I did:
>
> 1) create a veth pair: veth0 and veth1
> 2) create a new netns
> 3) move veth1 to the new netns
> 4) setup vxlan0 on veth0
> 5) setup vxlan0 on veth1 in the new netns
> 6) ping remote, that is the IP of the vxlan0 in new netns
>

I am not all that familiar with creating netns and veth interfaces. I guess
we can do all this via 'ip' command.
Can you give me a script with the exact commands to do this setup?

Thanks
Sridhar
diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 9a64715..0847564 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -1012,18 +1012,6 @@ static netdev_tx_t vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
 		goto tx_error;
 	}
 
-	/* Bypass encapsulation if the destination is local */
-	if (rt->rt_flags & RTCF_LOCAL) {
-		struct vxlan_dev *dst_vxlan;
-
-		ip_rt_put(rt);
-		dst_vxlan = vxlan_find_vni(dev_net(dev), vni);
-		if (!dst_vxlan)
-			goto tx_error;
-		vxlan_encap_bypass(skb, vxlan, dst_vxlan);
-		return NETDEV_TX_OK;
-	}
-
 	memset(&(IPCB(skb)->opt), 0, sizeof(IPCB(skb)->opt));
 	IPCB(skb)->flags &= ~(IPSKB_XFRM_TUNNEL_SIZE | IPSKB_XFRM_TRANSFORMED |
 			      IPSKB_REROUTED);
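For readers following the thread, the point Cong Wang makes about the net_eq() check can be illustrated with a small user-space model. This is only a sketch and not kernel code: `model_net`, `model_dev`, `model_route`, `MODEL_RTCF_LOCAL`, and the three functions are hypothetical stand-ins, and the flag value is illustrative. It assumes the proposed check compares the namespace of the vxlan device with the namespace of the route's output device, which is how the thread describes it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel types; not the real definitions. */
struct model_net { int id; };

struct model_dev {
	const char *name;
	const struct model_net *net;	/* namespace the device lives in */
};

struct model_route {
	unsigned int flags;			/* stands in for rt->rt_flags */
	const struct model_dev *out_dev;	/* the "dst dev" in the printk output */
};

#define MODEL_RTCF_LOCAL 0x1u	/* illustrative value only */

/* Check as in the removed hunk: bypass whenever the route is flagged local. */
bool bypass_flag_only(const struct model_route *rt)
{
	return (rt->flags & MODEL_RTCF_LOCAL) != 0;
}

/* Assumed refinement: also require both devices to share a namespace. */
bool bypass_flag_and_net(const struct model_dev *dev,
			 const struct model_route *rt)
{
	return bypass_flag_only(rt) && dev->net == rt->out_dev->net;
}

/* Replays the logged case: vxlan0 and its underlying veth0 share a namespace. */
void demo_logged_case(void)
{
	struct model_net ns_a = { 1 };
	struct model_dev vxlan0 = { "vxlan0", &ns_a };
	struct model_dev veth0 = { "veth0", &ns_a };
	struct model_route rt = { MODEL_RTCF_LOCAL, &veth0 };

	/* Both checks still choose the bypass, as reported in the test. */
	assert(bypass_flag_only(&rt));
	assert(bypass_flag_and_net(&vxlan0, &rt));
}
```

In this model the route's output device is always the device vxlan0 was set up on, so the namespace comparison cannot tell that the peer endpoint sits in a different netns. The comparison only changes the outcome when the output device itself is in another namespace.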