Message ID | kqul4h$ib8$1@ger.gmane.org |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
Headers | show |
On Tue, Jul 02, 2013 at 01:38:26PM +0000, Cong Wang wrote: > On Tue, 02 Jul 2013 at 08:28 GMT, Hannes Frederic Sowa <hannes@stressinduktion.org> wrote: > > On Mon, Jul 01, 2013 at 09:54:56AM -0500, Shawn Bohrer wrote: > >> I've managed to hit a deadlock at boot a couple times while testing > >> the 3.10 rc kernels. It seems to always happen when my network > >> devices are initializing. This morning I updated to v3.10 and made a > >> few config tweaks and so far I've hit it 4 out of 5 reboots. It looks > >> like most processes are getting stuck on rtnl_lock. Below is a boot > >> log with the soft lockup prints. Please let know if there is any > >> other information I can provide: > > > > Could you try a build with CONFIG_LOCKDEP enabled? > > > > The problem is clear: ib_register_device() is called with rtnl_lock, > but itself needs device_mutex, however, ib_register_client() first > acquires device_mutex, then indirectly calls register_netdev() which > takes rtnl_lock. Deadlock! > > One possible fix is always taking rtnl_lock before taking > device_mutex, something like below: > > diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c > index 18c1ece..890870b 100644 > --- a/drivers/infiniband/core/device.c > +++ b/drivers/infiniband/core/device.c > @@ -381,6 +381,7 @@ int ib_register_client(struct ib_client *client) > { > struct ib_device *device; > > + rtnl_lock(); > mutex_lock(&device_mutex); > > list_add_tail(&client->list, &client_list); > @@ -389,6 +390,7 @@ int ib_register_client(struct ib_client *client) > client->add(device); > > mutex_unlock(&device_mutex); > + rtnl_unlock(); > > return 0; > } > diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c > index b6e049a..5a7a048 100644 > --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c > +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c > @@ -1609,7 +1609,7 @@ static struct net_device *ipoib_add_port(const char *format, > goto event_failed; > } > > - result = register_netdev(priv->dev); > + result = register_netdevice(priv->dev); > if (result) { > printk(KERN_WARNING "%s: couldn't register ipoib port %d; error %d\n", > hca->name, port, result); Looks good to me. Shawn, could you test this patch? Thanks, Hannes -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jul 03, 2013 at 07:11:52AM +0200, Hannes Frederic Sowa wrote: > On Tue, Jul 02, 2013 at 01:38:26PM +0000, Cong Wang wrote: > > On Tue, 02 Jul 2013 at 08:28 GMT, Hannes Frederic Sowa <hannes@stressinduktion.org> wrote: > > > On Mon, Jul 01, 2013 at 09:54:56AM -0500, Shawn Bohrer wrote: > > >> I've managed to hit a deadlock at boot a couple times while testing > > >> the 3.10 rc kernels. It seems to always happen when my network > > >> devices are initializing. This morning I updated to v3.10 and made a > > >> few config tweaks and so far I've hit it 4 out of 5 reboots. It looks > > >> like most processes are getting stuck on rtnl_lock. Below is a boot > > >> log with the soft lockup prints. Please let know if there is any > > >> other information I can provide: > > > > > > Could you try a build with CONFIG_LOCKDEP enabled? > > > > > > > The problem is clear: ib_register_device() is called with rtnl_lock, > > but itself needs device_mutex, however, ib_register_client() first > > acquires device_mutex, then indirectly calls register_netdev() which > > takes rtnl_lock. Deadlock! > > > > One possible fix is always taking rtnl_lock before taking > > device_mutex, something like below: > > > > diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c > > index 18c1ece..890870b 100644 > > --- a/drivers/infiniband/core/device.c > > +++ b/drivers/infiniband/core/device.c > > @@ -381,6 +381,7 @@ int ib_register_client(struct ib_client *client) > > { > > struct ib_device *device; > > > > + rtnl_lock(); > > mutex_lock(&device_mutex); > > > > list_add_tail(&client->list, &client_list); > > @@ -389,6 +390,7 @@ int ib_register_client(struct ib_client *client) > > client->add(device); > > > > mutex_unlock(&device_mutex); > > + rtnl_unlock(); > > > > return 0; > > } > > diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c > > index b6e049a..5a7a048 100644 > > --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c > > +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c > > @@ -1609,7 +1609,7 @@ static struct net_device *ipoib_add_port(const char *format, > > goto event_failed; > > } > > > > - result = register_netdev(priv->dev); > > + result = register_netdevice(priv->dev); > > if (result) { > > printk(KERN_WARNING "%s: couldn't register ipoib port %d; error %d\n", > > hca->name, port, result); > > Looks good to me. Shawn, could you test this patch? ib_unregister_device/ib_unregister_client would need the same change, too. I have not checked the other ->add() and ->remove() functions. Also cc'ed linux-rdma@vger.kernel.org, Roland Dreier. Thanks, Hannes -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jul 03, 2013 at 07:33:07AM +0200, Hannes Frederic Sowa wrote: > On Wed, Jul 03, 2013 at 07:11:52AM +0200, Hannes Frederic Sowa wrote: > > On Tue, Jul 02, 2013 at 01:38:26PM +0000, Cong Wang wrote: > > > On Tue, 02 Jul 2013 at 08:28 GMT, Hannes Frederic Sowa <hannes@stressinduktion.org> wrote: > > > > On Mon, Jul 01, 2013 at 09:54:56AM -0500, Shawn Bohrer wrote: > > > >> I've managed to hit a deadlock at boot a couple times while testing > > > >> the 3.10 rc kernels. It seems to always happen when my network > > > >> devices are initializing. This morning I updated to v3.10 and made a > > > >> few config tweaks and so far I've hit it 4 out of 5 reboots. It looks > > > >> like most processes are getting stuck on rtnl_lock. Below is a boot > > > >> log with the soft lockup prints. Please let know if there is any > > > >> other information I can provide: > > > > > > > > Could you try a build with CONFIG_LOCKDEP enabled? > > > > > > > > > > The problem is clear: ib_register_device() is called with rtnl_lock, > > > but itself needs device_mutex, however, ib_register_client() first > > > acquires device_mutex, then indirectly calls register_netdev() which > > > takes rtnl_lock. Deadlock! > > > > > > One possible fix is always taking rtnl_lock before taking > > > device_mutex, something like below: > > > > > > diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c > > > index 18c1ece..890870b 100644 > > > --- a/drivers/infiniband/core/device.c > > > +++ b/drivers/infiniband/core/device.c > > > @@ -381,6 +381,7 @@ int ib_register_client(struct ib_client *client) > > > { > > > struct ib_device *device; > > > > > > + rtnl_lock(); > > > mutex_lock(&device_mutex); > > > > > > list_add_tail(&client->list, &client_list); > > > @@ -389,6 +390,7 @@ int ib_register_client(struct ib_client *client) > > > client->add(device); > > > > > > mutex_unlock(&device_mutex); > > > + rtnl_unlock(); > > > > > > return 0; > > > } > > > diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c > > > index b6e049a..5a7a048 100644 > > > --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c > > > +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c > > > @@ -1609,7 +1609,7 @@ static struct net_device *ipoib_add_port(const char *format, > > > goto event_failed; > > > } > > > > > > - result = register_netdev(priv->dev); > > > + result = register_netdevice(priv->dev); > > > if (result) { > > > printk(KERN_WARNING "%s: couldn't register ipoib port %d; error %d\n", > > > hca->name, port, result); > > > > Looks good to me. Shawn, could you test this patch? > > ib_unregister_device/ib_unregister_client would need the same change, > too. I have not checked the other ->add() and ->remove() functions. Also > cc'ed linux-rdma@vger.kernel.org, Roland Dreier. Cong's patch is missing the #include <linux/rtnetlink.h> but otherwise I've had 34 successful reboots with no deadlocks which is a good sign. It sounds like there are more paths that need to be audited and a proper patch submitted. I can do more testing later if needed. Thanks, Shawn -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 03/07/2013 20:22, Shawn Bohrer wrote: > On Wed, Jul 03, 2013 at 07:33:07AM +0200, Hannes Frederic Sowa wrote: >> On Wed, Jul 03, 2013 at 07:11:52AM +0200, Hannes Frederic Sowa wrote: >>> On Tue, Jul 02, 2013 at 01:38:26PM +0000, Cong Wang wrote: >>>> On Tue, 02 Jul 2013 at 08:28 GMT, Hannes Frederic Sowa <hannes@stressinduktion.org> wrote: >>>>> On Mon, Jul 01, 2013 at 09:54:56AM -0500, Shawn Bohrer wrote: >>>>>> I've managed to hit a deadlock at boot a couple times while testing >>>>>> the 3.10 rc kernels. It seems to always happen when my network >>>>>> devices are initializing. This morning I updated to v3.10 and made a >>>>>> few config tweaks and so far I've hit it 4 out of 5 reboots. It looks >>>>>> like most processes are getting stuck on rtnl_lock. Below is a boot >>>>>> log with the soft lockup prints. Please let know if there is any >>>>>> other information I can provide: >>>>> Could you try a build with CONFIG_LOCKDEP enabled? >>>>> >>>> The problem is clear: ib_register_device() is called with rtnl_lock, >>>> but itself needs device_mutex, however, ib_register_client() first >>>> acquires device_mutex, then indirectly calls register_netdev() which >>>> takes rtnl_lock. Deadlock! >>>> >>>> One possible fix is always taking rtnl_lock before taking >>>> device_mutex, something like below: >>>> >>>> diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c >>>> index 18c1ece..890870b 100644 >>>> --- a/drivers/infiniband/core/device.c >>>> +++ b/drivers/infiniband/core/device.c >>>> @@ -381,6 +381,7 @@ int ib_register_client(struct ib_client *client) >>>> { >>>> struct ib_device *device; >>>> >>>> + rtnl_lock(); >>>> mutex_lock(&device_mutex); >>>> >>>> list_add_tail(&client->list, &client_list); >>>> @@ -389,6 +390,7 @@ int ib_register_client(struct ib_client *client) >>>> client->add(device); >>>> >>>> mutex_unlock(&device_mutex); >>>> + rtnl_unlock(); >>>> >>>> return 0; >>>> } >>>> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c >>>> index b6e049a..5a7a048 100644 >>>> --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c >>>> +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c >>>> @@ -1609,7 +1609,7 @@ static struct net_device *ipoib_add_port(const char *format, >>>> goto event_failed; >>>> } >>>> >>>> - result = register_netdev(priv->dev); >>>> + result = register_netdevice(priv->dev); >>>> if (result) { >>>> printk(KERN_WARNING "%s: couldn't register ipoib port %d; error %d\n", >>>> hca->name, port, result); >>> Looks good to me. Shawn, could you test this patch? >> ib_unregister_device/ib_unregister_client would need the same change, >> too. I have not checked the other ->add() and ->remove() functions. Also >> cc'ed linux-rdma@vger.kernel.org, Roland Dreier. > Cong's patch is missing the #include <linux/rtnetlink.h> but otherwise > I've had 34 successful reboots with no deadlocks which is a good sign. > It sounds like there are more paths that need to be audited and a > proper patch submitted. I can do more testing later if needed. > > Thanks, > Shawn > Guys, I was a bit busy today looking into that, but I don't think we want the IB core layer (core/device.c) to use rtnl locking which is something that belongs to the network stack. Or. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jul 03, 2013 at 08:26:11PM +0300, Or Gerlitz wrote: > On 03/07/2013 20:22, Shawn Bohrer wrote: > >On Wed, Jul 03, 2013 at 07:33:07AM +0200, Hannes Frederic Sowa wrote: > >>On Wed, Jul 03, 2013 at 07:11:52AM +0200, Hannes Frederic Sowa wrote: > >>>On Tue, Jul 02, 2013 at 01:38:26PM +0000, Cong Wang wrote: > >>>>On Tue, 02 Jul 2013 at 08:28 GMT, Hannes Frederic Sowa <hannes@stressinduktion.org> wrote: > >>>>>On Mon, Jul 01, 2013 at 09:54:56AM -0500, Shawn Bohrer wrote: > >>>>>>I've managed to hit a deadlock at boot a couple times while testing > >>>>>>the 3.10 rc kernels. It seems to always happen when my network > >>>>>>devices are initializing. This morning I updated to v3.10 and made a > >>>>>>few config tweaks and so far I've hit it 4 out of 5 reboots. It looks > >>>>>>like most processes are getting stuck on rtnl_lock. Below is a boot > >>>>>>log with the soft lockup prints. Please let know if there is any > >>>>>>other information I can provide: > >>>>>Could you try a build with CONFIG_LOCKDEP enabled? > >>>>> > >>>>The problem is clear: ib_register_device() is called with rtnl_lock, > >>>>but itself needs device_mutex, however, ib_register_client() first > >>>>acquires device_mutex, then indirectly calls register_netdev() which > >>>>takes rtnl_lock. Deadlock! > >>>> > >>>>One possible fix is always taking rtnl_lock before taking > >>>>device_mutex, something like below: > >>>> > >>>>diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c > >>>>index 18c1ece..890870b 100644 > >>>>--- a/drivers/infiniband/core/device.c > >>>>+++ b/drivers/infiniband/core/device.c > >>>>@@ -381,6 +381,7 @@ int ib_register_client(struct ib_client *client) > >>>> { > >>>> struct ib_device *device; > >>>>+ rtnl_lock(); > >>>> mutex_lock(&device_mutex); > >>>> list_add_tail(&client->list, &client_list); > >>>>@@ -389,6 +390,7 @@ int ib_register_client(struct ib_client *client) > >>>> client->add(device); > >>>> mutex_unlock(&device_mutex); > >>>>+ rtnl_unlock(); > >>>> return 0; > >>>> } > >>>>diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c > >>>>index b6e049a..5a7a048 100644 > >>>>--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c > >>>>+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c > >>>>@@ -1609,7 +1609,7 @@ static struct net_device *ipoib_add_port(const char *format, > >>>> goto event_failed; > >>>> } > >>>>- result = register_netdev(priv->dev); > >>>>+ result = register_netdevice(priv->dev); > >>>> if (result) { > >>>> printk(KERN_WARNING "%s: couldn't register ipoib port %d; error %d\n", > >>>> hca->name, port, result); > >>>Looks good to me. Shawn, could you test this patch? > >>ib_unregister_device/ib_unregister_client would need the same change, > >>too. I have not checked the other ->add() and ->remove() functions. Also > >>cc'ed linux-rdma@vger.kernel.org, Roland Dreier. > >Cong's patch is missing the #include <linux/rtnetlink.h> but otherwise > >I've had 34 successful reboots with no deadlocks which is a good sign. > >It sounds like there are more paths that need to be audited and a > >proper patch submitted. I can do more testing later if needed. > > > >Thanks, > >Shawn > > > > Guys, I was a bit busy today looking into that, but I don't think we > want the IB core layer (core/device.c) to > use rtnl locking which is something that belongs to the network stack. Has anymore thought been put into a proper fix for this issue? Thanks, Shawn
On Mon, Jul 15, 2013 at 09:38:19AM -0500, Shawn Bohrer wrote: > On Wed, Jul 03, 2013 at 08:26:11PM +0300, Or Gerlitz wrote: > > On 03/07/2013 20:22, Shawn Bohrer wrote: > > >On Wed, Jul 03, 2013 at 07:33:07AM +0200, Hannes Frederic Sowa wrote: > > >>On Wed, Jul 03, 2013 at 07:11:52AM +0200, Hannes Frederic Sowa wrote: > > >>>On Tue, Jul 02, 2013 at 01:38:26PM +0000, Cong Wang wrote: > > >>>>On Tue, 02 Jul 2013 at 08:28 GMT, Hannes Frederic Sowa <hannes@stressinduktion.org> wrote: > > >>>>>On Mon, Jul 01, 2013 at 09:54:56AM -0500, Shawn Bohrer wrote: > > >>>>>>I've managed to hit a deadlock at boot a couple times while testing > > >>>>>>the 3.10 rc kernels. It seems to always happen when my network > > >>>>>>devices are initializing. This morning I updated to v3.10 and made a > > >>>>>>few config tweaks and so far I've hit it 4 out of 5 reboots. It looks > > >>>>>>like most processes are getting stuck on rtnl_lock. Below is a boot > > >>>>>>log with the soft lockup prints. Please let know if there is any > > >>>>>>other information I can provide: > > >>>>>Could you try a build with CONFIG_LOCKDEP enabled? > > >>>>> > > >>>>The problem is clear: ib_register_device() is called with rtnl_lock, > > >>>>but itself needs device_mutex, however, ib_register_client() first > > >>>>acquires device_mutex, then indirectly calls register_netdev() which > > >>>>takes rtnl_lock. Deadlock! > > >>>> > > >>>>One possible fix is always taking rtnl_lock before taking > > >>>>device_mutex, something like below: > > >>>> > > >>>>diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c > > >>>>index 18c1ece..890870b 100644 > > >>>>--- a/drivers/infiniband/core/device.c > > >>>>+++ b/drivers/infiniband/core/device.c > > >>>>@@ -381,6 +381,7 @@ int ib_register_client(struct ib_client *client) > > >>>> { > > >>>> struct ib_device *device; > > >>>>+ rtnl_lock(); > > >>>> mutex_lock(&device_mutex); > > >>>> list_add_tail(&client->list, &client_list); > > >>>>@@ -389,6 +390,7 @@ int ib_register_client(struct ib_client *client) > > >>>> client->add(device); > > >>>> mutex_unlock(&device_mutex); > > >>>>+ rtnl_unlock(); > > >>>> return 0; > > >>>> } > > >>>>diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c > > >>>>index b6e049a..5a7a048 100644 > > >>>>--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c > > >>>>+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c > > >>>>@@ -1609,7 +1609,7 @@ static struct net_device *ipoib_add_port(const char *format, > > >>>> goto event_failed; > > >>>> } > > >>>>- result = register_netdev(priv->dev); > > >>>>+ result = register_netdevice(priv->dev); > > >>>> if (result) { > > >>>> printk(KERN_WARNING "%s: couldn't register ipoib port %d; error %d\n", > > >>>> hca->name, port, result); > > >>>Looks good to me. Shawn, could you test this patch? > > >>ib_unregister_device/ib_unregister_client would need the same change, > > >>too. I have not checked the other ->add() and ->remove() functions. Also > > >>cc'ed linux-rdma@vger.kernel.org, Roland Dreier. > > >Cong's patch is missing the #include <linux/rtnetlink.h> but otherwise > > >I've had 34 successful reboots with no deadlocks which is a good sign. > > >It sounds like there are more paths that need to be audited and a > > >proper patch submitted. I can do more testing later if needed. > > > > > >Thanks, > > >Shawn > > > > > > > Guys, I was a bit busy today looking into that, but I don't think we > > want the IB core layer (core/device.c) to > > use rtnl locking which is something that belongs to the network stack. > > Has anymore thought been put into a proper fix for this issue? I'm no expert in this area but I'm having a hard time seeing a different solution than the one Cong suggested. Just to be clear the deadlock I hit was between cxgb3 and the ipoib module, so I've Cc'd Steve Wise in case he has a better solution from the Chelsio side. Here are those two stacks again as a reminder: [ 243.301380] INFO: task modprobe:2635 blocked for more than 120 seconds. [ 243.321495] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 243.343008] modprobe D ffffffff81607960 0 2635 1917 0x00000000 [ 243.363956] ffff880612c11b98 0000000000000002 ffffffff81a10440 ffff880612c11b90 [ 243.385305] 00003ffffffff000 ffff880612a8adc0 ffff880612c11fd8 ffff880612c11fd8 [ 243.406670] ffff880612c11fd8 ffff880612a8adc0 ffff880612c11fd8 ffffffff81a9c9e0 [ 243.428043] Call Trace: [ 243.444258] [<ffffffff814c4969>] schedule+0x29/0x70 [ 243.463136] [<ffffffff814c4c6e>] schedule_preempt_disabled+0xe/0x10 [ 243.483464] [<ffffffff814c2f62>] __mutex_lock_slowpath+0x112/0x1b0 [ 243.503753] [<ffffffff814c2dda>] mutex_lock+0x2a/0x50 [ 243.530031] [<ffffffff814261e5>] rtnl_lock+0x15/0x20 [ 243.549063] [<ffffffff8141b6f6>] register_netdev+0x16/0x30 [ 243.568635] [<ffffffffa02a71db>] ipoib_add_one+0x32b/0x4c0 [ib_ipoib] [ 243.589253] [<ffffffff8113361d>] ? kmem_cache_alloc_trace+0x12d/0x160 [ 243.609940] [<ffffffffa00f5be7>] ib_register_client+0x87/0xc0 [ib_core] [ 243.630862] [<ffffffffa02b9000>] ? 0xffffffffa02b8fff [ 243.650169] [<ffffffffa02b90e6>] ipoib_init_module+0xe6/0x137 [ib_ipoib] [ 243.671213] [<ffffffff810002ca>] do_one_initcall+0xea/0x190 [ 243.690975] [<ffffffff81090377>] load_module+0x1407/0x1af0 [ 243.710445] [<ffffffff812868c0>] ? ddebug_add_module+0xf0/0xf0 [ 243.730098] [<ffffffff81090b32>] SyS_init_module+0xd2/0x120 [ 243.749357] [<ffffffff814ce2b9>] tracesys+0xd0/0xd5 [ 243.767809] INFO: task ip:2662 blocked for more than 120 seconds. [ 243.787583] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 243.809193] ip D ffffffff81607960 0 2662 2637 0x00000000 [ 243.830165] ffff880c04b5f558 0000000000000002 ffff880c13d844a0 ffffffff8127d463 [ 243.851709] 000080d00030c0d0 ffff880c12e8db80 ffff880c04b5ffd8 ffff880c04b5ffd8 [ 243.873311] ffff880c04b5ffd8 ffff880c12e8db80 0000000000008000 ffffffffa00fd400 [ 243.894929] Call Trace: [ 243.911445] [<ffffffff8127d463>] ? gen_pool_add_virt+0x53/0xb0 [ 243.931762] [<ffffffff814c4969>] schedule+0x29/0x70 [ 243.951054] [<ffffffff814c4c6e>] schedule_preempt_disabled+0xe/0x10 [ 243.971786] [<ffffffff814c2f62>] __mutex_lock_slowpath+0x112/0x1b0 [ 243.992424] [<ffffffff8127d463>] ? gen_pool_add_virt+0x53/0xb0 [ 244.012708] [<ffffffff814c2dda>] mutex_lock+0x2a/0x50 [ 244.032203] [<ffffffffa00f5c5f>] ib_register_device+0x3f/0x4d0 [ib_core] [ 244.053500] [<ffffffff8113361d>] ? kmem_cache_alloc_trace+0x12d/0x160 [ 244.074665] [<ffffffffa01fba3e>] ? iwch_register_device+0x2ae/0x3d0 [iw_cxgb3] [ 244.096749] [<ffffffffa01fbac8>] iwch_register_device+0x338/0x3d0 [iw_cxgb3] [ 244.118752] [<ffffffffa01fc1e7>] open_rnic_dev+0x247/0x340 [iw_cxgb3] [ 244.140170] [<ffffffffa0a192be>] cxgb3_add_clients+0x3e/0x60 [cxgb3] [ 244.161569] [<ffffffffa0a069c0>] cxgb_open+0x320/0x360 [cxgb3] [ 244.182341] [<ffffffff8141aa3f>] __dev_open+0xcf/0x150 [ 244.202210] [<ffffffff8141ad11>] __dev_change_flags+0xa1/0x180 [ 244.222602] [<ffffffff8141aea8>] dev_change_flags+0x28/0x70 [ 244.242568] [<ffffffff81426652>] do_setlink+0x252/0x960 [ 244.262128] [<ffffffffa005500a>] ? inet6_fill_link_af+0x1a/0x30 [ipv6] [ 244.283144] [<ffffffff81286fc0>] ? nla_parse+0x30/0xe0 [ 244.302572] [<ffffffff81429a19>] rtnl_newlink+0x359/0x560 [ 244.322229] [<ffffffff814295dd>] rtnetlink_rcv_msg+0x14d/0x230 [ 244.342348] [<ffffffff8140b22b>] ? __alloc_skb+0x8b/0x2b0 [ 244.361988] [<ffffffff81429490>] ? __rtnl_unlock+0x20/0x20 [ 244.381664] [<ffffffff81444079>] netlink_rcv_skb+0xa9/0xd0 [ 244.401320] [<ffffffff81426215>] rtnetlink_rcv+0x25/0x40 [ 244.420848] [<ffffffff81443a31>] netlink_unicast+0x161/0x1e0 [ 244.440693] [<ffffffff81443d4b>] netlink_sendmsg+0x29b/0x340 [ 244.460443] [<ffffffff81400776>] sock_sendmsg+0x76/0x90 [ 244.479751] [<ffffffff81403564>] ? move_addr_to_kernel+0x44/0x60 [ 244.499905] [<ffffffff8140f426>] ? verify_iovec+0x56/0xd0 [ 244.519493] [<ffffffff8140292c>] ___sys_sendmsg+0x38c/0x3a0 [ 244.539314] [<ffffffff8110fee0>] ? handle_mm_fault+0x210/0x310 [ 244.559424] [<ffffffff814c9714>] ? __do_page_fault+0x274/0x4c0 [ 244.579490] [<ffffffff811135d5>] ? __vma_link_rb+0x105/0x120 [ 244.599250] [<ffffffff811136bf>] ? vma_link+0xcf/0xe0 [ 244.618153] [<ffffffff810ac809>] ? rcu_eqs_exit+0x59/0xb0 [ 244.637209] [<ffffffff81403f19>] __sys_sendmsg+0x49/0x90 [ 244.656010] [<ffffffff81403f79>] SyS_sendmsg+0x19/0x20 [ 244.674518] [<ffffffff814ce2b9>] tracesys+0xd0/0xd5 Looking back at the history these code paths it looks like they've been there since day one, so I suppose I've just been lucky to not his this before. Even so I'd like to see some kind of official fix for this so I don't have to carry a custom patch in my tree forever. -- Shawn
On 7/29/2013 6:02 PM, Shawn Bohrer wrote: > On Mon, Jul 15, 2013 at 09:38:19AM -0500, Shawn Bohrer wrote: >> On Wed, Jul 03, 2013 at 08:26:11PM +0300, Or Gerlitz wrote: >>> On 03/07/2013 20:22, Shawn Bohrer wrote: >>>> On Wed, Jul 03, 2013 at 07:33:07AM +0200, Hannes Frederic Sowa wrote: >>>>> On Wed, Jul 03, 2013 at 07:11:52AM +0200, Hannes Frederic Sowa wrote: >>>>>> On Tue, Jul 02, 2013 at 01:38:26PM +0000, Cong Wang wrote: >>>>>>> On Tue, 02 Jul 2013 at 08:28 GMT, Hannes Frederic Sowa <hannes@stressinduktion.org> wrote: >>>>>>>> On Mon, Jul 01, 2013 at 09:54:56AM -0500, Shawn Bohrer wrote: >>>>>>>>> I've managed to hit a deadlock at boot a couple times while testing >>>>>>>>> the 3.10 rc kernels. It seems to always happen when my network >>>>>>>>> devices are initializing. This morning I updated to v3.10 and made a >>>>>>>>> few config tweaks and so far I've hit it 4 out of 5 reboots. It looks >>>>>>>>> like most processes are getting stuck on rtnl_lock. Below is a boot >>>>>>>>> log with the soft lockup prints. Please let know if there is any >>>>>>>>> other information I can provide: >>>>>>>> Could you try a build with CONFIG_LOCKDEP enabled? >>>>>>>> >>>>>>> The problem is clear: ib_register_device() is called with rtnl_lock, >>>>>>> but itself needs device_mutex, however, ib_register_client() first >>>>>>> acquires device_mutex, then indirectly calls register_netdev() which >>>>>>> takes rtnl_lock. Deadlock! >>>>>>> >>>>>>> One possible fix is always taking rtnl_lock before taking >>>>>>> device_mutex, something like below: >>>>>>> >>>>>>> diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c >>>>>>> index 18c1ece..890870b 100644 >>>>>>> --- a/drivers/infiniband/core/device.c >>>>>>> +++ b/drivers/infiniband/core/device.c >>>>>>> @@ -381,6 +381,7 @@ int ib_register_client(struct ib_client *client) >>>>>>> { >>>>>>> struct ib_device *device; >>>>>>> + rtnl_lock(); >>>>>>> mutex_lock(&device_mutex); >>>>>>> list_add_tail(&client->list, &client_list); >>>>>>> @@ -389,6 +390,7 @@ int ib_register_client(struct ib_client *client) >>>>>>> client->add(device); >>>>>>> mutex_unlock(&device_mutex); >>>>>>> + rtnl_unlock(); >>>>>>> return 0; >>>>>>> } >>>>>>> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c >>>>>>> index b6e049a..5a7a048 100644 >>>>>>> --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c >>>>>>> +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c >>>>>>> @@ -1609,7 +1609,7 @@ static struct net_device *ipoib_add_port(const char *format, >>>>>>> goto event_failed; >>>>>>> } >>>>>>> - result = register_netdev(priv->dev); >>>>>>> + result = register_netdevice(priv->dev); >>>>>>> if (result) { >>>>>>> printk(KERN_WARNING "%s: couldn't register ipoib port %d; error %d\n", >>>>>>> hca->name, port, result); >>>>>> Looks good to me. Shawn, could you test this patch? >>>>> ib_unregister_device/ib_unregister_client would need the same change, >>>>> too. I have not checked the other ->add() and ->remove() functions. Also >>>>> cc'ed linux-rdma@vger.kernel.org, Roland Dreier. >>>> Cong's patch is missing the #include <linux/rtnetlink.h> but otherwise >>>> I've had 34 successful reboots with no deadlocks which is a good sign. >>>> It sounds like there are more paths that need to be audited and a >>>> proper patch submitted. I can do more testing later if needed. >>>> >>>> Thanks, >>>> Shawn >>>> >>> Guys, I was a bit busy today looking into that, but I don't think we >>> want the IB core layer (core/device.c) to >>> use rtnl locking which is something that belongs to the network stack. >> Has anymore thought been put into a proper fix for this issue? > I'm no expert in this area but I'm having a hard time seeing a > different solution than the one Cong suggested. Just to be clear the > deadlock I hit was between cxgb3 and the ipoib module, so I've Cc'd > Steve Wise in case he has a better solution from the Chelsio side. I don't know of another way to resolve this. The rtnl lock is used in ipoib and mlx4 already. I think we should go forward with the proposed patch. Steve. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 07/30/13 14:54, Steve Wise wrote: > On 7/29/2013 6:02 PM, Shawn Bohrer wrote: >> On Mon, Jul 15, 2013 at 09:38:19AM -0500, Shawn Bohrer wrote: >>> On Wed, Jul 03, 2013 at 08:26:11PM +0300, Or Gerlitz wrote: >>>> On 03/07/2013 20:22, Shawn Bohrer wrote: >>>>> On Wed, Jul 03, 2013 at 07:33:07AM +0200, Hannes Frederic Sowa wrote: >>>>>> On Wed, Jul 03, 2013 at 07:11:52AM +0200, Hannes Frederic Sowa wrote: >>>>>>> On Tue, Jul 02, 2013 at 01:38:26PM +0000, Cong Wang wrote: >>>>>>>> On Tue, 02 Jul 2013 at 08:28 GMT, Hannes Frederic Sowa >>>>>>>> <hannes@stressinduktion.org> wrote: >>>>>>>>> On Mon, Jul 01, 2013 at 09:54:56AM -0500, Shawn Bohrer wrote: >>>>>>>>>> I've managed to hit a deadlock at boot a couple times while >>>>>>>>>> testing >>>>>>>>>> the 3.10 rc kernels. It seems to always happen when my network >>>>>>>>>> devices are initializing. This morning I updated to v3.10 and >>>>>>>>>> made a >>>>>>>>>> few config tweaks and so far I've hit it 4 out of 5 reboots. >>>>>>>>>> It looks >>>>>>>>>> like most processes are getting stuck on rtnl_lock. Below is >>>>>>>>>> a boot >>>>>>>>>> log with the soft lockup prints. Please let know if there is any >>>>>>>>>> other information I can provide: >>>>>>>>> Could you try a build with CONFIG_LOCKDEP enabled? >>>>>>>>> >>>>>>>> The problem is clear: ib_register_device() is called with >>>>>>>> rtnl_lock, >>>>>>>> but itself needs device_mutex, however, ib_register_client() first >>>>>>>> acquires device_mutex, then indirectly calls register_netdev() >>>>>>>> which >>>>>>>> takes rtnl_lock. Deadlock! >>>>>>>> >>>>>>>> One possible fix is always taking rtnl_lock before taking >>>>>>>> device_mutex, something like below: >>>>>>>> >>>>>>>> diff --git a/drivers/infiniband/core/device.c >>>>>>>> b/drivers/infiniband/core/device.c >>>>>>>> index 18c1ece..890870b 100644 >>>>>>>> --- a/drivers/infiniband/core/device.c >>>>>>>> +++ b/drivers/infiniband/core/device.c >>>>>>>> @@ -381,6 +381,7 @@ int ib_register_client(struct ib_client >>>>>>>> *client) >>>>>>>> { >>>>>>>> struct ib_device *device; >>>>>>>> + rtnl_lock(); >>>>>>>> mutex_lock(&device_mutex); >>>>>>>> list_add_tail(&client->list, &client_list); >>>>>>>> @@ -389,6 +390,7 @@ int ib_register_client(struct ib_client >>>>>>>> *client) >>>>>>>> client->add(device); >>>>>>>> mutex_unlock(&device_mutex); >>>>>>>> + rtnl_unlock(); >>>>>>>> return 0; >>>>>>>> } >>>>>>>> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c >>>>>>>> b/drivers/infiniband/ulp/ipoib/ipoib_main.c >>>>>>>> index b6e049a..5a7a048 100644 >>>>>>>> --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c >>>>>>>> +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c >>>>>>>> @@ -1609,7 +1609,7 @@ static struct net_device >>>>>>>> *ipoib_add_port(const char *format, >>>>>>>> goto event_failed; >>>>>>>> } >>>>>>>> - result = register_netdev(priv->dev); >>>>>>>> + result = register_netdevice(priv->dev); >>>>>>>> if (result) { >>>>>>>> printk(KERN_WARNING "%s: couldn't register ipoib port >>>>>>>> %d; error %d\n", >>>>>>>> hca->name, port, result); >>>>>>> Looks good to me. Shawn, could you test this patch? >>>>>> ib_unregister_device/ib_unregister_client would need the same change, >>>>>> too. I have not checked the other ->add() and ->remove() >>>>>> functions. Also >>>>>> cc'ed linux-rdma@vger.kernel.org, Roland Dreier. >>>>> Cong's patch is missing the #include <linux/rtnetlink.h> but otherwise >>>>> I've had 34 successful reboots with no deadlocks which is a good sign. >>>>> It sounds like there are more paths that need to be audited and a >>>>> proper patch submitted. I can do more testing later if needed. >>>>> >>>>> Thanks, >>>>> Shawn >>>>> >>>> Guys, I was a bit busy today looking into that, but I don't think we >>>> want the IB core layer (core/device.c) to >>>> use rtnl locking which is something that belongs to the network stack. >>> Has anymore thought been put into a proper fix for this issue? >> I'm no expert in this area but I'm having a hard time seeing a >> different solution than the one Cong suggested. Just to be clear the >> deadlock I hit was between cxgb3 and the ipoib module, so I've Cc'd >> Steve Wise in case he has a better solution from the Chelsio side. > > I don't know of another way to resolve this. The rtnl lock is used in > ipoib and mlx4 already. I think we should go forward with the proposed > patch. (replying to an e-mail of one month ago) Hello, It would be appreciated if anyone could report what the current status of this issue is. I think a deadlock I ran into with kernels 3.10 and 3.11 and PCI pass-through is related to this issue. See also http://bugzilla.kernel.org/show_bug.cgi?id=60856 for the lockdep report. Thanks, Bart. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 9/5/2013 5:02 AM, Bart Van Assche wrote: > On 07/30/13 14:54, Steve Wise wrote: >> On 7/29/2013 6:02 PM, Shawn Bohrer wrote: >>> On Mon, Jul 15, 2013 at 09:38:19AM -0500, Shawn Bohrer wrote: >>>> On Wed, Jul 03, 2013 at 08:26:11PM +0300, Or Gerlitz wrote: >>>>> On 03/07/2013 20:22, Shawn Bohrer wrote: >>>>>> On Wed, Jul 03, 2013 at 07:33:07AM +0200, Hannes Frederic Sowa >>>>>> wrote: >>>>>>> On Wed, Jul 03, 2013 at 07:11:52AM +0200, Hannes Frederic Sowa >>>>>>> wrote: >>>>>>>> On Tue, Jul 02, 2013 at 01:38:26PM +0000, Cong Wang wrote: >>>>>>>>> On Tue, 02 Jul 2013 at 08:28 GMT, Hannes Frederic Sowa >>>>>>>>> <hannes@stressinduktion.org> wrote: >>>>>>>>>> On Mon, Jul 01, 2013 at 09:54:56AM -0500, Shawn Bohrer wrote: >>>>>>>>>>> I've managed to hit a deadlock at boot a couple times while >>>>>>>>>>> testing >>>>>>>>>>> the 3.10 rc kernels. It seems to always happen when my network >>>>>>>>>>> devices are initializing. This morning I updated to v3.10 and >>>>>>>>>>> made a >>>>>>>>>>> few config tweaks and so far I've hit it 4 out of 5 reboots. >>>>>>>>>>> It looks >>>>>>>>>>> like most processes are getting stuck on rtnl_lock. Below is >>>>>>>>>>> a boot >>>>>>>>>>> log with the soft lockup prints. Please let know if there >>>>>>>>>>> is any >>>>>>>>>>> other information I can provide: >>>>>>>>>> Could you try a build with CONFIG_LOCKDEP enabled? >>>>>>>>>> >>>>>>>>> The problem is clear: ib_register_device() is called with >>>>>>>>> rtnl_lock, >>>>>>>>> but itself needs device_mutex, however, ib_register_client() >>>>>>>>> first >>>>>>>>> acquires device_mutex, then indirectly calls register_netdev() >>>>>>>>> which >>>>>>>>> takes rtnl_lock. Deadlock! >>>>>>>>> >>>>>>>>> One possible fix is always taking rtnl_lock before taking >>>>>>>>> device_mutex, something like below: >>>>>>>>> >>>>>>>>> diff --git a/drivers/infiniband/core/device.c >>>>>>>>> b/drivers/infiniband/core/device.c >>>>>>>>> index 18c1ece..890870b 100644 >>>>>>>>> --- a/drivers/infiniband/core/device.c >>>>>>>>> +++ b/drivers/infiniband/core/device.c >>>>>>>>> @@ -381,6 +381,7 @@ int ib_register_client(struct ib_client >>>>>>>>> *client) >>>>>>>>> { >>>>>>>>> struct ib_device *device; >>>>>>>>> + rtnl_lock(); >>>>>>>>> mutex_lock(&device_mutex); >>>>>>>>> list_add_tail(&client->list, &client_list); >>>>>>>>> @@ -389,6 +390,7 @@ int ib_register_client(struct ib_client >>>>>>>>> *client) >>>>>>>>> client->add(device); >>>>>>>>> mutex_unlock(&device_mutex); >>>>>>>>> + rtnl_unlock(); >>>>>>>>> return 0; >>>>>>>>> } >>>>>>>>> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c >>>>>>>>> b/drivers/infiniband/ulp/ipoib/ipoib_main.c >>>>>>>>> index b6e049a..5a7a048 100644 >>>>>>>>> --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c >>>>>>>>> +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c >>>>>>>>> @@ -1609,7 +1609,7 @@ static struct net_device >>>>>>>>> *ipoib_add_port(const char *format, >>>>>>>>> goto event_failed; >>>>>>>>> } >>>>>>>>> - result = register_netdev(priv->dev); >>>>>>>>> + result = register_netdevice(priv->dev); >>>>>>>>> if (result) { >>>>>>>>> printk(KERN_WARNING "%s: couldn't register ipoib port >>>>>>>>> %d; error %d\n", >>>>>>>>> hca->name, port, result); >>>>>>>> Looks good to me. Shawn, could you test this patch? >>>>>>> ib_unregister_device/ib_unregister_client would need the same >>>>>>> change, >>>>>>> too. I have not checked the other ->add() and ->remove() >>>>>>> functions. Also >>>>>>> cc'ed linux-rdma@vger.kernel.org, Roland Dreier. >>>>>> Cong's patch is missing the #include <linux/rtnetlink.h> but >>>>>> otherwise >>>>>> I've had 34 successful reboots with no deadlocks which is a good >>>>>> sign. >>>>>> It sounds like there are more paths that need to be audited and a >>>>>> proper patch submitted. I can do more testing later if needed. >>>>>> >>>>>> Thanks, >>>>>> Shawn >>>>>> >>>>> Guys, I was a bit busy today looking into that, but I don't think we >>>>> want the IB core layer (core/device.c) to >>>>> use rtnl locking which is something that belongs to the network >>>>> stack. >>>> Has anymore thought been put into a proper fix for this issue? >>> I'm no expert in this area but I'm having a hard time seeing a >>> different solution than the one Cong suggested. Just to be clear the >>> deadlock I hit was between cxgb3 and the ipoib module, so I've Cc'd >>> Steve Wise in case he has a better solution from the Chelsio side. >> >> I don't know of another way to resolve this. The rtnl lock is used in >> ipoib and mlx4 already. I think we should go forward with the proposed >> patch. > > (replying to an e-mail of one month ago) > > Hello, > > It would be appreciated if anyone could report what the current status > of this issue is. I think a deadlock I ran into with kernels 3.10 and > 3.11 and PCI pass-through is related to this issue. See also > http://bugzilla.kernel.org/show_bug.cgi?id=60856 for the lockdep report. > > Thanks, > > Bart. Roland, what do you think? As I've said, I think we should go ahead with using the rtnl lock in the core. Is there a complete patch available for review? looks like the original was a partial fix. Steve. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Sep 05, 2013 at 10:14:51AM -0500, Steve Wise wrote: > On 9/5/2013 5:02 AM, Bart Van Assche wrote: > >On 07/30/13 14:54, Steve Wise wrote: > >>On 7/29/2013 6:02 PM, Shawn Bohrer wrote: > >>>On Mon, Jul 15, 2013 at 09:38:19AM -0500, Shawn Bohrer wrote: > >>>>On Wed, Jul 03, 2013 at 08:26:11PM +0300, Or Gerlitz wrote: > >>>>>On 03/07/2013 20:22, Shawn Bohrer wrote: > >>>>>>On Wed, Jul 03, 2013 at 07:33:07AM +0200, Hannes > >>>>>>Frederic Sowa wrote: > >>>>>>>On Wed, Jul 03, 2013 at 07:11:52AM +0200, Hannes > >>>>>>>Frederic Sowa wrote: > >>>>>>>>On Tue, Jul 02, 2013 at 01:38:26PM +0000, Cong Wang wrote: > >>>>>>>>>On Tue, 02 Jul 2013 at 08:28 GMT, Hannes Frederic Sowa > >>>>>>>>><hannes@stressinduktion.org> wrote: > >>>>>>>>>>On Mon, Jul 01, 2013 at 09:54:56AM -0500, Shawn Bohrer wrote: > >>>>>>>>>>>I've managed to hit a deadlock at boot a couple times while > >>>>>>>>>>>testing > >>>>>>>>>>>the 3.10 rc kernels. It seems to always happen when my network > >>>>>>>>>>>devices are initializing. This morning I updated to v3.10 and > >>>>>>>>>>>made a > >>>>>>>>>>>few config tweaks and so far I've hit it 4 out of 5 reboots. > >>>>>>>>>>>It looks > >>>>>>>>>>>like most processes are getting stuck on rtnl_lock. Below is > >>>>>>>>>>>a boot > >>>>>>>>>>>log with the soft lockup prints. Please let > >>>>>>>>>>>know if there is any > >>>>>>>>>>>other information I can provide: > >>>>>>>>>>Could you try a build with CONFIG_LOCKDEP enabled? > >>>>>>>>>> > >>>>>>>>>The problem is clear: ib_register_device() is called with > >>>>>>>>>rtnl_lock, > >>>>>>>>>but itself needs device_mutex, however, > >>>>>>>>>ib_register_client() first > >>>>>>>>>acquires device_mutex, then indirectly calls register_netdev() > >>>>>>>>>which > >>>>>>>>>takes rtnl_lock. Deadlock! > >>>>>>>>> > >>>>>>>>>One possible fix is always taking rtnl_lock before taking > >>>>>>>>>device_mutex, something like below: > >>>>>>>>> > >>>>>>>>>diff --git a/drivers/infiniband/core/device.c > >>>>>>>>>b/drivers/infiniband/core/device.c > >>>>>>>>>index 18c1ece..890870b 100644 > >>>>>>>>>--- a/drivers/infiniband/core/device.c > >>>>>>>>>+++ b/drivers/infiniband/core/device.c > >>>>>>>>>@@ -381,6 +381,7 @@ int ib_register_client(struct ib_client > >>>>>>>>>*client) > >>>>>>>>> { > >>>>>>>>> struct ib_device *device; > >>>>>>>>>+ rtnl_lock(); > >>>>>>>>> mutex_lock(&device_mutex); > >>>>>>>>> list_add_tail(&client->list, &client_list); > >>>>>>>>>@@ -389,6 +390,7 @@ int ib_register_client(struct ib_client > >>>>>>>>>*client) > >>>>>>>>> client->add(device); > >>>>>>>>> mutex_unlock(&device_mutex); > >>>>>>>>>+ rtnl_unlock(); > >>>>>>>>> return 0; > >>>>>>>>> } > >>>>>>>>>diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c > >>>>>>>>>b/drivers/infiniband/ulp/ipoib/ipoib_main.c > >>>>>>>>>index b6e049a..5a7a048 100644 > >>>>>>>>>--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c > >>>>>>>>>+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c > >>>>>>>>>@@ -1609,7 +1609,7 @@ static struct net_device > >>>>>>>>>*ipoib_add_port(const char *format, > >>>>>>>>> goto event_failed; > >>>>>>>>> } > >>>>>>>>>- result = register_netdev(priv->dev); > >>>>>>>>>+ result = register_netdevice(priv->dev); > >>>>>>>>> if (result) { > >>>>>>>>> printk(KERN_WARNING "%s: couldn't register ipoib port > >>>>>>>>>%d; error %d\n", > >>>>>>>>> hca->name, port, result); > >>>>>>>>Looks good to me. Shawn, could you test this patch? > >>>>>>>ib_unregister_device/ib_unregister_client would need > >>>>>>>the same change, > >>>>>>>too. I have not checked the other ->add() and ->remove() > >>>>>>>functions. Also > >>>>>>>cc'ed linux-rdma@vger.kernel.org, Roland Dreier. > >>>>>>Cong's patch is missing the #include <linux/rtnetlink.h> > >>>>>>but otherwise > >>>>>>I've had 34 successful reboots with no deadlocks which > >>>>>>is a good sign. > >>>>>>It sounds like there are more paths that need to be audited and a > >>>>>>proper patch submitted. I can do more testing later if needed. > >>>>>> > >>>>>>Thanks, > >>>>>>Shawn > >>>>>> > >>>>>Guys, I was a bit busy today looking into that, but I don't think we > >>>>>want the IB core layer (core/device.c) to > >>>>>use rtnl locking which is something that belongs to the > >>>>>network stack. > >>>>Has anymore thought been put into a proper fix for this issue? > >>>I'm no expert in this area but I'm having a hard time seeing a > >>>different solution than the one Cong suggested. Just to be clear the > >>>deadlock I hit was between cxgb3 and the ipoib module, so I've Cc'd > >>>Steve Wise in case he has a better solution from the Chelsio side. > >> > >>I don't know of another way to resolve this. The rtnl lock is used in > >>ipoib and mlx4 already. I think we should go forward with the proposed > >>patch. > > > >(replying to an e-mail of one month ago) > > > >Hello, > > > >It would be appreciated if anyone could report what the current > >status of this issue is. I think a deadlock I ran into with > >kernels 3.10 and 3.11 and PCI pass-through is related to this > >issue. See also http://bugzilla.kernel.org/show_bug.cgi?id=60856 > >for the lockdep report. > > > >Thanks, > > > >Bart. > > > Roland, what do you think? > > As I've said, I think we should go ahead with using the rtnl lock in > the core. Is there a complete patch available for review? looks > like the original was a partial fix. I've been running with Cong's partial fix for the past couple of months, and I'm pretty sure no complete patch has been posted. I may be able to look at the missing pieces tomorrow and see if I can put together a patch but if someone else wants to run with this feel free. -- Shawn -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Sep 05, 2013 at 12:02:12PM +0200, Bart Van Assche wrote: > On 07/30/13 14:54, Steve Wise wrote: > >On 7/29/2013 6:02 PM, Shawn Bohrer wrote: > >>On Mon, Jul 15, 2013 at 09:38:19AM -0500, Shawn Bohrer wrote: > >>>On Wed, Jul 03, 2013 at 08:26:11PM +0300, Or Gerlitz wrote: > >>>>On 03/07/2013 20:22, Shawn Bohrer wrote: > >>>>>On Wed, Jul 03, 2013 at 07:33:07AM +0200, Hannes Frederic Sowa wrote: > >>>>>>On Wed, Jul 03, 2013 at 07:11:52AM +0200, Hannes Frederic Sowa wrote: > >>>>>>>On Tue, Jul 02, 2013 at 01:38:26PM +0000, Cong Wang wrote: > >>>>>>>>On Tue, 02 Jul 2013 at 08:28 GMT, Hannes Frederic Sowa > >>>>>>>><hannes@stressinduktion.org> wrote: > >>>>>>>>>On Mon, Jul 01, 2013 at 09:54:56AM -0500, Shawn Bohrer wrote: > >>>>>>>>>>I've managed to hit a deadlock at boot a couple times while > >>>>>>>>>>testing > >>>>>>>>>>the 3.10 rc kernels. It seems to always happen when my network > >>>>>>>>>>devices are initializing. This morning I updated to v3.10 and > >>>>>>>>>>made a > >>>>>>>>>>few config tweaks and so far I've hit it 4 out of 5 reboots. > >>>>>>>>>>It looks > >>>>>>>>>>like most processes are getting stuck on rtnl_lock. Below is > >>>>>>>>>>a boot > >>>>>>>>>>log with the soft lockup prints. Please let know if there is any > >>>>>>>>>>other information I can provide: > >>>>>>>>>Could you try a build with CONFIG_LOCKDEP enabled? > >>>>>>>>> > >>>>>>>>The problem is clear: ib_register_device() is called with > >>>>>>>>rtnl_lock, > >>>>>>>>but itself needs device_mutex, however, ib_register_client() first > >>>>>>>>acquires device_mutex, then indirectly calls register_netdev() > >>>>>>>>which > >>>>>>>>takes rtnl_lock. Deadlock! > >>>>>>>> > >>>>>>>>One possible fix is always taking rtnl_lock before taking > >>>>>>>>device_mutex, something like below: > >>>>>>>> > >>>>>>>>diff --git a/drivers/infiniband/core/device.c > >>>>>>>>b/drivers/infiniband/core/device.c > >>>>>>>>index 18c1ece..890870b 100644 > >>>>>>>>--- a/drivers/infiniband/core/device.c > >>>>>>>>+++ b/drivers/infiniband/core/device.c > >>>>>>>>@@ -381,6 +381,7 @@ int ib_register_client(struct ib_client > >>>>>>>>*client) > >>>>>>>> { > >>>>>>>> struct ib_device *device; > >>>>>>>>+ rtnl_lock(); > >>>>>>>> mutex_lock(&device_mutex); > >>>>>>>> list_add_tail(&client->list, &client_list); > >>>>>>>>@@ -389,6 +390,7 @@ int ib_register_client(struct ib_client > >>>>>>>>*client) > >>>>>>>> client->add(device); > >>>>>>>> mutex_unlock(&device_mutex); > >>>>>>>>+ rtnl_unlock(); > >>>>>>>> return 0; > >>>>>>>> } > >>>>>>>>diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c > >>>>>>>>b/drivers/infiniband/ulp/ipoib/ipoib_main.c > >>>>>>>>index b6e049a..5a7a048 100644 > >>>>>>>>--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c > >>>>>>>>+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c > >>>>>>>>@@ -1609,7 +1609,7 @@ static struct net_device > >>>>>>>>*ipoib_add_port(const char *format, > >>>>>>>> goto event_failed; > >>>>>>>> } > >>>>>>>>- result = register_netdev(priv->dev); > >>>>>>>>+ result = register_netdevice(priv->dev); > >>>>>>>> if (result) { > >>>>>>>> printk(KERN_WARNING "%s: couldn't register ipoib port > >>>>>>>>%d; error %d\n", > >>>>>>>> hca->name, port, result); > >>>>>>>Looks good to me. Shawn, could you test this patch? > >>>>>>ib_unregister_device/ib_unregister_client would need the same change, > >>>>>>too. I have not checked the other ->add() and ->remove() > >>>>>>functions. Also > >>>>>>cc'ed linux-rdma@vger.kernel.org, Roland Dreier. > >>>>>Cong's patch is missing the #include <linux/rtnetlink.h> but otherwise > >>>>>I've had 34 successful reboots with no deadlocks which is a good sign. > >>>>>It sounds like there are more paths that need to be audited and a > >>>>>proper patch submitted. I can do more testing later if needed. > >>>>> > >>>>>Thanks, > >>>>>Shawn > >>>>> > >>>>Guys, I was a bit busy today looking into that, but I don't think we > >>>>want the IB core layer (core/device.c) to > >>>>use rtnl locking which is something that belongs to the network stack. > >>>Has anymore thought been put into a proper fix for this issue? > >>I'm no expert in this area but I'm having a hard time seeing a > >>different solution than the one Cong suggested. Just to be clear the > >>deadlock I hit was between cxgb3 and the ipoib module, so I've Cc'd > >>Steve Wise in case he has a better solution from the Chelsio side. > > > >I don't know of another way to resolve this. The rtnl lock is used in > >ipoib and mlx4 already. I think we should go forward with the proposed > >patch. > > (replying to an e-mail of one month ago) > > Hello, > > It would be appreciated if anyone could report what the current > status of this issue is. I think a deadlock I ran into with kernels > 3.10 and 3.11 and PCI pass-through is related to this issue. See > also http://bugzilla.kernel.org/show_bug.cgi?id=60856 for the > lockdep report. Hey Bart, It looks like you hit a different issue. Yours is a deadlock between the s_active refcount on the sysfs dir and the rtnl_lock. My issue is a deadlock between the infiniband device_mutex and the rtnl_lock. Sadly I don't have a solution for your issue either but I haven't looked too hard. -- Shawn -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Sep 05, 2013 at 10:14:51AM -0500, Steve Wise wrote: > Roland, what do you think? > > As I've said, I think we should go ahead with using the rtnl lock in > the core. Is there a complete patch available for review? looks > like the original was a partial fix. I guess I should realize that when no one jumps at fixing my issues for me that they probably aren't simple to fix. The solution that Cong proposed was to acquire rtnl_lock() before acquiring the infiniband device_mutex, and his partial patch did that in ib_register_client(). The problem is that you would also need to do that in ib_unregister_client(), ib_register_device(), and ib_unregister_device(), and that brings us back to the original problem which was that cxgb3 was holding the rtnl_lock() when it called ib_register_device(). Thus with the proposed fix I believe cxgb3 would already be holding the rtnl_lock() and then call ib_register_device() which would try to acquire the rtnl_lock() again and deadlock for a different reason. Actually how does this currently work? ib_register_device() calls client->add() for each client in the list which should call ipoib_add_one() which calls register_netdev(). Shouldn't that also deadlock in the cxgb3 case? Also while digging through this I think I see another bug which is that ipoib_dev_cleanup() can be called from ipoib_add_port() but in the current code ipoib_add_port() is not holding the rtnl_lock() which appears to be a requirement of ipoib_dev_cleanup(). Sigh... I'm going to stop looking at this for now and hopefully someone can propose a better solution to this issue. Thanks, Shawn -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 9/6/2013 6:19 PM, Shawn Bohrer wrote: > On Thu, Sep 05, 2013 at 10:14:51AM -0500, Steve Wise wrote: >> Roland, what do you think? >> >> As I've said, I think we should go ahead with using the rtnl lock in >> the core. Is there a complete patch available for review? looks >> like the original was a partial fix. > I guess I should realize that when no one jumps at fixing my issues > for me that they probably aren't simple to fix. The solution that > Cong proposed was to acquire rtnl_lock() before acquiring the > infiniband device_mutex, and his partial patch did that in > ib_register_client(). The problem is that you would also need to do > that in ib_unregister_client(), ib_register_device(), and > ib_unregister_device(), and that brings us back to the original > problem which was that cxgb3 was holding the rtnl_lock() when it > called ib_register_device(). Thus with the proposed fix I believe > cxgb3 would already be holding the rtnl_lock() and then call > ib_register_device() which would try to acquire the rtnl_lock() again > and deadlock for a different reason. > > Actually how does this currently work? ib_register_device() calls > client->add() for each client in the list which should call > ipoib_add_one() which calls register_netdev(). Shouldn't that also > deadlock in the cxgb3 case? cxgb3 is an iWARP device and doesn't support IPoIB. > > Also while digging through this I think I see another bug which is > that ipoib_dev_cleanup() can be called from ipoib_add_port() but in > the current code ipoib_add_port() is not holding the rtnl_lock() which > appears to be a requirement of ipoib_dev_cleanup(). > > Sigh... I'm going to stop looking at this for now and hopefully > someone can propose a better solution to this issue. I can help with this, but I'm waiting for Roland to chime in. Steve. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c index 18c1ece..890870b 100644 --- a/drivers/infiniband/core/device.c +++ b/drivers/infiniband/core/device.c @@ -381,6 +381,7 @@ int ib_register_client(struct ib_client *client) { struct ib_device *device; + rtnl_lock(); mutex_lock(&device_mutex); list_add_tail(&client->list, &client_list); @@ -389,6 +390,7 @@ int ib_register_client(struct ib_client *client) client->add(device); mutex_unlock(&device_mutex); + rtnl_unlock(); return 0; } diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c index b6e049a..5a7a048 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c @@ -1609,7 +1609,7 @@ static struct net_device *ipoib_add_port(const char *format, goto event_failed; } - result = register_netdev(priv->dev); + result = register_netdevice(priv->dev); if (result) { printk(KERN_WARNING "%s: couldn't register ipoib port %d; error %d\n", hca->name, port, result);