Message ID | 20130713101818.0d4fa8e0@nehalam.linuxnetplumber.net |
---|---|
State | Accepted, archived |
Delegated to: | David Miller |
Headers | show |
On Sat, Jul 13, 2013 at 10:18 AM, Stephen Hemminger <stephen@networkplumber.org> wrote: > The socket management is now done in workqueue (outside of RTNL) > and protected by vn->sock_lock. There were two possible bugs, first > the vxlan device was removed from the VNI hash table per socket without > holding lock. And there was a race when device is created and the workqueue > could run after deletion. > > Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> > > --- a/drivers/net/vxlan.c 2013-07-08 16:31:50.080744429 -0700 > +++ b/drivers/net/vxlan.c 2013-07-10 20:15:47.337653899 -0700 > @@ -1767,9 +1767,15 @@ static int vxlan_newlink(struct net *net > > static void vxlan_dellink(struct net_device *dev, struct list_head *head) > { > + struct vxlan_net *vn = net_generic(dev_net(dev), vxlan_net_id); > struct vxlan_dev *vxlan = netdev_priv(dev); > > + flush_workqueue(vxlan_wq); > + Doesn't this create dependency on sock_work thread while holding RTNL. If so it can result in deadlock. > + spin_lock(&vn->sock_lock); > hlist_del_rcu(&vxlan->hlist); > + spin_unlock(&vn->sock_lock); > + > list_del(&vxlan->next); > unregister_netdevice_queue(dev, head); > } -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Pravin Shelar <pshelar@nicira.com> Date: Sat, 13 Jul 2013 12:21:52 -0700 > On Sat, Jul 13, 2013 at 10:18 AM, Stephen Hemminger > <stephen@networkplumber.org> wrote: >> The socket management is now done in workqueue (outside of RTNL) >> and protected by vn->sock_lock. There were two possible bugs, first >> the vxlan device was removed from the VNI hash table per socket without >> holding lock. And there was a race when device is created and the workqueue >> could run after deletion. >> >> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> >> >> --- a/drivers/net/vxlan.c 2013-07-08 16:31:50.080744429 -0700 >> +++ b/drivers/net/vxlan.c 2013-07-10 20:15:47.337653899 -0700 >> @@ -1767,9 +1767,15 @@ static int vxlan_newlink(struct net *net >> >> static void vxlan_dellink(struct net_device *dev, struct list_head *head) >> { >> + struct vxlan_net *vn = net_generic(dev_net(dev), vxlan_net_id); >> struct vxlan_dev *vxlan = netdev_priv(dev); >> >> + flush_workqueue(vxlan_wq); >> + > Doesn't this create dependency on sock_work thread while holding RTNL. > If so it can result in deadlock. What exact deadlock do you perceive? I don't see any code path in the sock_work handler (vxlan_sock_work) which takes the RTNL mutex. So we should be able to safely flush any pending sock_work jobs from vxlan_dellink(). The fact that vxlan_dellink() runs with the RTNL mutex shouldn't cause any issues. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Jul 16, 2013 at 11:28 AM, David Miller <davem@davemloft.net> wrote: > From: Pravin Shelar <pshelar@nicira.com> > Date: Sat, 13 Jul 2013 12:21:52 -0700 > >> On Sat, Jul 13, 2013 at 10:18 AM, Stephen Hemminger >> <stephen@networkplumber.org> wrote: >>> The socket management is now done in workqueue (outside of RTNL) >>> and protected by vn->sock_lock. There were two possible bugs, first >>> the vxlan device was removed from the VNI hash table per socket without >>> holding lock. And there was a race when device is created and the workqueue >>> could run after deletion. >>> >>> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> >>> >>> --- a/drivers/net/vxlan.c 2013-07-08 16:31:50.080744429 -0700 >>> +++ b/drivers/net/vxlan.c 2013-07-10 20:15:47.337653899 -0700 >>> @@ -1767,9 +1767,15 @@ static int vxlan_newlink(struct net *net >>> >>> static void vxlan_dellink(struct net_device *dev, struct list_head *head) >>> { >>> + struct vxlan_net *vn = net_generic(dev_net(dev), vxlan_net_id); >>> struct vxlan_dev *vxlan = netdev_priv(dev); >>> >>> + flush_workqueue(vxlan_wq); >>> + >> Doesn't this create dependency on sock_work thread while holding RTNL. >> If so it can result in deadlock. > > What exact deadlock do you perceive? I don't see any code path in the > sock_work handler (vxlan_sock_work) which takes the RTNL mutex. > > So we should be able to safely flush any pending sock_work jobs from > vxlan_dellink(). The fact that vxlan_dellink() runs with the RTNL > mutex shouldn't cause any issues. commit 1c51a9159ddefa51 (vxlan: fix race caused by dropping rtnl_unlock) moved sock-create to sock_work workq. Thats why I thought sock_work takes RTNL. but if it is not taking RTNL in any case, they why vxlan sock create is deferred to sock_work? -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Pravin Shelar <pshelar@nicira.com> Date: Tue, 16 Jul 2013 13:29:08 -0700 > commit 1c51a9159ddefa51 (vxlan: fix race caused by dropping > rtnl_unlock) moved sock-create to sock_work workq. Thats why I thought > sock_work takes RTNL. but if it is not taking RTNL in any case, they > why vxlan sock create is deferred to sock_work? That commit is handling the fact that the RTNL mutex is NOT held during the create operation in question. It defers to a workqueue so that the new ->sock_lock can be taken in the appropriate context. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Jul 16, 2013 at 11:06 PM, David Miller <davem@davemloft.net> wrote: > From: Pravin Shelar <pshelar@nicira.com> > Date: Tue, 16 Jul 2013 13:29:08 -0700 > >> commit 1c51a9159ddefa51 (vxlan: fix race caused by dropping >> rtnl_unlock) moved sock-create to sock_work workq. Thats why I thought >> sock_work takes RTNL. but if it is not taking RTNL in any case, they >> why vxlan sock create is deferred to sock_work? > > That commit is handling the fact that the RTNL mutex is NOT held > during the create operation in question. > > It defers to a workqueue so that the new ->sock_lock can be taken > in the appropriate context. ok, Thanks for explanation. I do not see any problem with the patch. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Pravin Shelar <pshelar@nicira.com> Date: Wed, 17 Jul 2013 08:41:33 -0700 > On Tue, Jul 16, 2013 at 11:06 PM, David Miller <davem@davemloft.net> wrote: >> From: Pravin Shelar <pshelar@nicira.com> >> Date: Tue, 16 Jul 2013 13:29:08 -0700 >> >>> commit 1c51a9159ddefa51 (vxlan: fix race caused by dropping >>> rtnl_unlock) moved sock-create to sock_work workq. Thats why I thought >>> sock_work takes RTNL. but if it is not taking RTNL in any case, they >>> why vxlan sock create is deferred to sock_work? >> >> That commit is handling the fact that the RTNL mutex is NOT held >> during the create operation in question. >> >> It defers to a workqueue so that the new ->sock_lock can be taken >> in the appropriate context. > > ok, Thanks for explanation. I do not see any problem with the patch. Great, I'm going to apply Stephen's patch then. Thanks. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Stephen Hemminger <stephen@networkplumber.org> Date: Sat, 13 Jul 2013 10:18:18 -0700 > The socket management is now done in workqueue (outside of RTNL) > and protected by vn->sock_lock. There were two possible bugs, first > the vxlan device was removed from the VNI hash table per socket without > holding lock. And there was a race when device is created and the workqueue > could run after deletion. > > Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Applied. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
--- a/drivers/net/vxlan.c 2013-07-08 16:31:50.080744429 -0700 +++ b/drivers/net/vxlan.c 2013-07-10 20:15:47.337653899 -0700 @@ -1767,9 +1767,15 @@ static int vxlan_newlink(struct net *net static void vxlan_dellink(struct net_device *dev, struct list_head *head) { + struct vxlan_net *vn = net_generic(dev_net(dev), vxlan_net_id); struct vxlan_dev *vxlan = netdev_priv(dev); + flush_workqueue(vxlan_wq); + + spin_lock(&vn->sock_lock); hlist_del_rcu(&vxlan->hlist); + spin_unlock(&vn->sock_lock); + list_del(&vxlan->next); unregister_netdevice_queue(dev, head); }
The socket management is now done in workqueue (outside of RTNL) and protected by vn->sock_lock. There were two possible bugs, first the vxlan device was removed from the VNI hash table per socket without holding lock. And there was a race when device is created and the workqueue could run after deletion. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html