Message ID | 1309206092-23064-1-git-send-email-shyam_iyer@dell.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Headers | show |
From: Shyam Iyer <shyam.iyer.t@gmail.com> Date: Mon, 27 Jun 2011 16:21:32 -0400 > The kzalloc sleeps and disabling interrupts(spin_lock_irqsave) > causes oops like the one. What if ->cfg_flags changes while you have dropped the lock? If the lock doesn't protect those flags, what was it being taken for in the first place? -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
>From: Shyam Iyer [mailto:shyam.iyer.t@gmail.com] >Sent: Monday, June 27, 2011 1:22 PM >Subject: [PATCH] [net][bna] Fix call trace when interrupts are disabled >while sleeping function kzalloc is called > >The kzalloc sleeps and disabling interrupts(spin_lock_irqsave) causes >oops like the one. Hi Shyam, We are not calling any sleeping function while holding the lock in bnad_mbox_irq_alloc(). How would your patch fix the call trace? Also can you tell which conditions led to this trace. Thanks, --Rasesh -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
(My replies bounced of netdev for some reason..resending) > -----Original Message----- > From: David Miller [mailto:davem@davemloft.net] > Sent: Monday, June 27, 2011 7:07 PM > To: shyam.iyer.t@gmail.com > Cc: netdev@vger.kernel.org; rmody@brocade.com; ddutt@brocadel.com; > Iyer, Shyam > Subject: Re: [PATCH] [net][bna] Fix call trace when interrupts are > disabled while sleeping function kzalloc is called > > From: Shyam Iyer <shyam.iyer.t@gmail.com> > Date: Mon, 27 Jun 2011 16:21:32 -0400 > > > The kzalloc sleeps and disabling interrupts(spin_lock_irqsave) > > causes oops like the one. > > What if ->cfg_flags changes while you have dropped the lock? > > If the lock doesn't protect those flags, what was it being > taken for in the first place? Oops yes! Here is try 2. I found that I had to initialize the flags variable outside the spin_lock_irqsave in the following way to fix the problem. The sleeping kzalloc function is in fact in the request_threaded_irq call which allocates the irq handler function and its flags arguments. Attached patch fixes the issue.
(Resending the reply for this as well.) > -----Original Message----- > From: Rasesh Mody [mailto:rmody@brocade.com] > Sent: Monday, June 27, 2011 8:51 PM > To: Shyam Iyer; netdev@vger.kernel.org > Cc: ddutt@brocadel.com; Iyer, Shyam; Jing Huang > Subject: RE: [PATCH] [net][bna] Fix call trace when interrupts are > disabled while sleeping function kzalloc is called > > > >From: Shyam Iyer [mailto:shyam.iyer.t@gmail.com] > >Sent: Monday, June 27, 2011 1:22 PM > >Subject: [PATCH] [net][bna] Fix call trace when interrupts are > disabled > >while sleeping function kzalloc is called > > > >The kzalloc sleeps and disabling interrupts(spin_lock_irqsave) causes > >oops like the one. > > Hi Shyam, > > We are not calling any sleeping function while holding the lock in > bnad_mbox_irq_alloc(). How would your patch fix the call trace? Also > can you tell which conditions led to this trace. > > Thanks, > --Rasesh Hi Rasesh, Please review my reply/patch2 to Dave. I guess the condition to reproduce is to simply insmod/modprobe the driver on the Brocade 1020 CNA . I don't have cables attached to the card if that helps. Thanks, Shyam -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
>From: Shyam_Iyer@Dell.com [mailto:Shyam_Iyer@Dell.com] >Sent: Tuesday, June 28, 2011 9:30 AM > >Here is try 2. > >I found that I had to initialize the flags variable outside the >spin_lock_irqsave in the following way to fix the problem. The sleeping >kzalloc function is in fact in the request_threaded_irq call which >allocates the irq handler function and its flags arguments. > >Attached patch fixes the issue. The attached patch may fix the issue in MSIX mode, however we'll continue to see the issue in INTX mode. Using a separate irq_flags may be a good idea. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> -----Original Message----- > From: Rasesh Mody [mailto:rmody@brocade.com] > Sent: Tuesday, June 28, 2011 1:28 PM > To: Iyer, Shyam; davem@davemloft.net; shyam.iyer.t@gmail.com > Cc: netdev@vger.kernel.org; Debashis Dutt; Jing Huang > Subject: RE: [PATCH] [net][bna] Fix call trace when interrupts are > disabled while sleeping function kzalloc is called > > >From: Shyam_Iyer@Dell.com [mailto:Shyam_Iyer@Dell.com] > >Sent: Tuesday, June 28, 2011 9:30 AM > > > >Here is try 2. > > > >I found that I had to initialize the flags variable outside the > >spin_lock_irqsave in the following way to fix the problem. The > sleeping > >kzalloc function is in fact in the request_threaded_irq call which > >allocates the irq handler function and its flags arguments. > > > >Attached patch fixes the issue. > > The attached patch may fix the issue in MSIX mode, however we'll > continue to see the issue in INTX mode. Using a separate irq_flags may > be a good idea. You are right.. Resending another version. -Shyam -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/net/bna/bnad.c b/drivers/net/bna/bnad.c index 7d25a97..04da3c8 100644 --- a/drivers/net/bna/bnad.c +++ b/drivers/net/bna/bnad.c @@ -1114,6 +1114,11 @@ bnad_mbox_irq_alloc(struct bnad *bnad, unsigned long flags; u32 irq; irq_handler_t irq_handler; + u32 cfg_flags; + + spin_lock_irqsave(&bnad->bna_lock, flags); + cfg_flags = bnad->cfg_flags; + spin_unlock_irqrestore(&bnad->bna_lock, flags); /* Mbox should use only 1 vector */ @@ -1121,8 +1126,7 @@ bnad_mbox_irq_alloc(struct bnad *bnad, if (!intr_info->idl) return -ENOMEM; - spin_lock_irqsave(&bnad->bna_lock, flags); - if (bnad->cfg_flags & BNAD_CF_MSIX) { + if (cfg_flags & BNAD_CF_MSIX) { irq_handler = (irq_handler_t)bnad_msix_mbox_handler; irq = bnad->msix_table[bnad->msix_num - 1].vector; flags = 0; @@ -1135,7 +1139,6 @@ bnad_mbox_irq_alloc(struct bnad *bnad, intr_info->intr_type = BNA_INTR_T_INTX; /* intr_info->idl.vector = 0 ? */ } - spin_unlock_irqrestore(&bnad->bna_lock, flags); sprintf(bnad->mbox_irq_name, "%s", BNAD_NAME);
The kzalloc sleeps and disabling interrupts(spin_lock_irqsave) causes oops like the one. Jun 27 08:15:24 home-t710 kernel: [11735.634550] Brocade 10G Ethernet driver Jun 27 08:15:24 home-t710 kernel: [11735.634590] bnad_pci_probe : (0xffff880427f3d000, 0xffffffffa020f3e0) PCI Func : (2) Jun 27 08:15:24 home-t710 kernel: [11735.637677] bna 0000:82:00.2: PCI INT A -> GSI 66 (level, low) -> IRQ 66 Jun 27 08:15:24 home-t710 kernel: [11735.638290] bar0 mapped to ffffc90014980000, len 262144 Jun 27 08:15:24 home-t710 kernel: [11735.638732] BUG: sleeping function called from invalid context at mm/slub.c:847 Jun 27 08:15:24 home-t710 kernel: [11735.638736] in_atomic(): 0, irqs_disabled(): 1, pid: 11243, name: insmod Jun 27 08:15:24 home-t710 kernel: [11735.638740] Pid: 11243, comm: insmod Not tainted 3.0.0-rc4+ #6 Jun 27 08:15:24 home-t710 kernel: [11735.638743] Call Trace: Jun 27 08:15:24 home-t710 kernel: [11735.638755] [<ffffffff81046427>] __might_sleep+0xeb/0xf0 Jun 27 08:15:24 home-t710 kernel: [11735.638766] [<ffffffffa01fe469>] ? netif_wake_queue+0x3d/0x3d [bna] Jun 27 08:15:24 home-t710 kernel: [11735.638773] [<ffffffff8111201c>] kmem_cache_alloc_trace+0x43/0xd8 Jun 27 08:15:24 home-t710 kernel: [11735.638782] [<ffffffffa01fe469>] ? netif_wake_queue+0x3d/0x3d [bna] Jun 27 08:15:24 home-t710 kernel: [11735.638787] [<ffffffff810ab791>] request_threaded_irq+0xa1/0x113 Jun 27 08:15:24 home-t710 kernel: [11735.638798] [<ffffffffa020f0c0>] bnad_pci_probe+0x612/0x8e5 [bna] Jun 27 08:15:24 home-t710 kernel: [11735.638807] [<ffffffffa01fe469>] ? netif_wake_queue+0x3d/0x3d [bna] Jun 27 08:15:24 home-t710 kernel: [11735.638816] [<ffffffff81482ef4>] ? _raw_spin_unlock_irqrestore+0x17/0x19 Jun 27 08:15:24 home-t710 kernel: [11735.638822] [<ffffffff8124d17a>] local_pci_probe+0x44/0x75 Jun 27 08:15:24 home-t710 kernel: [11735.638826] [<ffffffff8124dc06>] pci_device_probe+0xd0/0xff Jun 27 08:15:24 home-t710 kernel: [11735.638832] [<ffffffff812ef8ab>] driver_probe_device+0x131/0x213 Jun 27 08:15:24 home-t710 kernel: [11735.638836] [<ffffffff812ef9e7>] __driver_attach+0x5a/0x7e Jun 27 08:15:24 home-t710 kernel: [11735.638840] [<ffffffff812ef98d>] ? driver_probe_device+0x213/0x213 Jun 27 08:15:24 home-t710 kernel: [11735.638844] [<ffffffff812ee933>] bus_for_each_dev+0x53/0x89 Jun 27 08:15:24 home-t710 kernel: [11735.638848] [<ffffffff812ef48a>] driver_attach+0x1e/0x20 Jun 27 08:15:24 home-t710 kernel: [11735.638852] [<ffffffff812ef0ae>] bus_add_driver+0xd1/0x224 Jun 27 08:15:24 home-t710 kernel: [11735.638858] [<ffffffffa01b8000>] ? 0xffffffffa01b7fff Jun 27 08:15:24 home-t710 kernel: [11735.638862] [<ffffffff812efe57>] driver_register+0x98/0x105 Jun 27 08:15:24 home-t710 kernel: [11735.638866] [<ffffffffa01b8000>] ? 0xffffffffa01b7fff Jun 27 08:15:24 home-t710 kernel: [11735.638871] [<ffffffff8124e4c9>] __pci_register_driver+0x56/0xc1 Jun 27 08:15:24 home-t710 kernel: [11735.638875] [<ffffffffa01b8000>] ? 0xffffffffa01b7fff Jun 27 08:15:24 home-t710 kernel: [11735.638884] [<ffffffffa01b8040>] bnad_module_init+0x40/0x60 [bna] Jun 27 08:15:24 home-t710 kernel: [11735.638892] [<ffffffff81002099>] do_one_initcall+0x7f/0x136 Jun 27 08:15:24 home-t710 kernel: [11735.638899] [<ffffffff8108608b>] sys_init_module+0x88/0x1d0 Jun 27 08:15:24 home-t710 kernel: [11735.638906] [<ffffffff81489682>] system_call_fastpath+0x16/0x1b Jun 27 08:15:24 home-t710 kernel: [11735.639642] bnad_pci_probe : (0xffff880427f3e000, 0xffffffffa020f3e0) PCI Func : (3) Jun 27 08:15:24 home-t710 kernel: [11735.639665] bna 0000:82:00.3: PCI INT A -> GSI 66 (level, low) -> IRQ 66 Jun 27 08:15:24 home-t710 kernel: [11735.639735] bar0 mapped to ffffc90014400000, len 262144 Signed-off-by: Shyam Iyer <shyam_iyer@dell.com> --- drivers/net/bna/bnad.c | 9 ++++++--- 1 files changed, 6 insertions(+), 3 deletions(-)