[e1000_netpoll] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110

Message ID 20160726093224.GA10339@wfg-t540p.sh.intel.com
State Not Applicable
Commit Message

kbuild test robot July 26, 2016, 9:32 a.m. UTC
Hi Eric,

On Tue, Jul 26, 2016 at 11:14:52AM +0200, Eric Dumazet wrote:
>On Tue, 2016-07-26 at 11:50 +0800, Fengguang Wu wrote:
>> Greetings,
>>
>> This BUG message can be found in recent kernels as well as v4.4 and
>> linux-stable. It happens when running
>>
>>         modprobe netconsole netconsole=@/,$port@$server/
>>
>> [   39.937534] 22 Jul 13:30:40 ntpdate[440]: step time server 192.168.1.1 offset -673.833841 sec
>> [   39.943285] netpoll: netconsole: local port 6665
>> [   39.943436] netpoll: netconsole: local IPv4 address 0.0.0.0
>> [   39.943609] netpoll: netconsole: interface 'eth0'
>> [   39.943756] netpoll: netconsole: remote port 6672
>> [   39.943913] netpoll: netconsole: remote IPv4 address 192.168.1.1
>> [   39.944099] netpoll: netconsole: remote ethernet address ff:ff:ff:ff:ff:ff
>> [   39.944311] netpoll: netconsole: local IP 192.168.1.193
>> [   39.944514] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110
>> [   39.944515] in_atomic(): 1, irqs_disabled(): 1, pid: 448, name: modprobe
>> [   39.944517] CPU: 6 PID: 448 Comm: modprobe Not tainted 4.7.0-rc7-wt-ath-10122-gf9b5ec2 #102
>> [   39.944518] Hardware name:                  /DZ77BH-55K, BIOS BHZ7710H.86A.0097.2012.1228.1346 12/28/2012
>> [   39.944522]  0000000000000000 ffffc90001f2f9e8 ffffffff813417d9 ffff88007faba5c0
>> [   39.944524]  000000000000006e ffffc90001f2fa00 ffffffff810aec03 ffffffff81a25948
>> [   39.944525]  ffffc90001f2fa28 ffffffff810aec9a ffff8803e5bd9400 ffff8803e50fbd68
>> [   39.944526] Call Trace:
>> [   39.944533]  [<ffffffff813417d9>] dump_stack+0x63/0x8a
>> [   39.944536]  [<ffffffff810aec03>] ___might_sleep+0xd3/0x120
>> [   39.944537]  [<ffffffff810aec9a>] __might_sleep+0x4a/0x80
>> [   39.944541]  [<ffffffff810e4638>] synchronize_irq+0x38/0xa0
>> [   39.944543]  [<ffffffff810e3c8e>] ? __irq_put_desc_unlock+0x1e/0x40
>> [   39.944545]  [<ffffffff810e48e3>] ? __disable_irq_nosync+0x43/0x60
>> [   39.944547]  [<ffffffff810e492c>] disable_irq+0x1c/0x20
>> [   39.944559]  [<ffffffffa0220932>] e1000_netpoll+0xf2/0x120 [e1000e]
>> [   39.944563]  [<ffffffff815f2bdc>] netpoll_poll_dev+0x5c/0x1a0
>> [   39.944567]  [<ffffffff815bb361>] ? __kmalloc_reserve+0x31/0x90
>> [   39.944569]  [<ffffffff815f2e8b>] netpoll_send_skb_on_dev+0x16b/0x250
>> [   39.944572]  [<ffffffff815f325c>] netpoll_send_udp+0x2ec/0x450
>> [   39.944576]  [<ffffffffa003cb62>] write_msg+0xb2/0xf0 [netconsole]
>> [   39.944578]  [<ffffffff810e04e5>] call_console_drivers+0x115/0x120
>> [   39.944580]  [<ffffffff810e1f13>] console_unlock+0x333/0x5c0
>> [   39.944583]  [<ffffffff810e2c74>] register_console+0x1c4/0x380
>> [   39.944586]  [<ffffffffa004f1c5>] init_netconsole+0x1c5/0x1000 [netconsole]
>> [   39.944588]  [<ffffffffa004f000>] ? 0xffffffffa004f000
>> [   39.944591]  [<ffffffff8100216d>] do_one_initcall+0x3d/0x150
>> [   39.944592]  [<ffffffff810aec9a>] ? __might_sleep+0x4a/0x80
>> [   39.944596]  [<ffffffff811f5098>] ? kmem_cache_alloc_trace+0x188/0x1e0
>> [   39.944598]  [<ffffffff8118f871>] do_init_module+0x5f/0x1d8
>> [   39.944602]  [<ffffffff81114009>] load_module+0x1429/0x1b40
>> [   39.944604]  [<ffffffff81110cd0>] ? __symbol_put+0x40/0x40
>> [   39.944607]  [<ffffffff8121f348>] ? kernel_read_file+0x178/0x1a0
>> [   39.944608]  [<ffffffff8121f429>] ? kernel_read_file_from_fd+0x49/0x80
>> [   39.944611]  [<ffffffff81114973>] SYSC_finit_module+0xc3/0xf0
>> [   39.944614]  [<ffffffff811149be>] SyS_finit_module+0xe/0x10
>> [   39.944617]  [<ffffffff816e5877>] entry_SYSCALL_64_fastpath+0x1a/0xa9
>> [   39.946384] console [netcon0] enabled
>> [   39.946514] netconsole: network logging started
>>
>> Can this possibly be fixed?
>
>Could you try this?

Sure, thank you!

>diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c
>index f42129d09e2c23ba9fdb5cde890d50ecb7166a42..a53c41c4c4f7d1fe52f95a2cab8784a938b3820b 100644
>--- a/drivers/net/ethernet/intel/e1000/e1000_main.c
>+++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
>@@ -5257,9 +5257,13 @@ static void e1000_netpoll(struct net_device *netdev)
> {
> 	struct e1000_adapter *adapter = netdev_priv(netdev);
>
>-	disable_irq(adapter->pdev->irq);
>-	e1000_intr(adapter->pdev->irq, netdev);
>-	enable_irq(adapter->pdev->irq);
>+	if (napi_schedule_prep(&adapter->napi)) {
>+		adapter->total_tx_bytes = 0;
>+		adapter->total_tx_packets = 0;
>+		adapter->total_rx_bytes = 0;
>+		adapter->total_rx_packets = 0;
>+		__napi_schedule(&adapter->napi);
>+	}
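
For readers following the trace: disable_irq() ends in synchronize_irq(), which may sleep, while the netconsole write path reaches e1000_netpoll() with interrupts disabled (in_atomic(): 1, irqs_disabled(): 1 in the report). The user-space C sketch below only models that check. Every model_* name is hypothetical and none of this is kernel code; it assumes the proposed change avoids the warning simply because it no longer calls disable_irq().

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the calling context; not a kernel type. */
struct model_ctx {
	bool in_atomic;
	bool irqs_disabled;
};

/*
 * Models the might_sleep() debug check: returns 1 and prints a report
 * when the caller is in a context where sleeping is not allowed.
 */
static int model_might_sleep(const struct model_ctx *c,
			     const char *file, int line)
{
	if (!c->in_atomic && !c->irqs_disabled)
		return 0;

	printf("BUG: sleeping function called from invalid context at %s:%d\n",
	       file, line);
	printf("in_atomic(): %d, irqs_disabled(): %d\n",
	       c->in_atomic, c->irqs_disabled);
	return 1;
}

/* Models disable_irq(): mask the line, then wait for running handlers. */
static int model_disable_irq(const struct model_ctx *c)
{
	/* The wait is the part that may sleep (synchronize_irq in the trace). */
	return model_might_sleep(c, "kernel/irq/manage.c", 110);
}

/* Models the old poll path: disable_irq(); handler; enable_irq(). */
static int model_netpoll_disable_irq(const struct model_ctx *c)
{
	return model_disable_irq(c);
}

/* Models the proposed poll path: only schedule NAPI, never wait. */
static int model_netpoll_napi(const struct model_ctx *c)
{
	(void)c; /* nothing here depends on the calling context */
	return 0;
}
```

With both flags set, model_netpoll_disable_irq() prints the report and returns 1, while model_netpoll_napi() returns 0 in the same context.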

By the way, Nicholas also sent us a patch for evaluation:

See if the patch below fixes the issue you're reporting. It seems
that we are calling a schedule function incorrectly in atomic
interrupt context.
Regards,
Nicholas

Subject: [PATCH] schedule function called for e1000 driver interrupt
 incorrectly

Signed-off-by: Nicholas Krause <xerofoify@gmail.com>
---
 drivers/net/ethernet/intel/e1000/e1000_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

kernel test robot July 26, 2016, 9:45 a.m. UTC | #1
Hi,

[auto build test ERROR on jkirsher-next-queue/dev-queue]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Fengguang-Wu/schedule-function-called-for-e1000-driver-interrupt/20160726-173521
base:   https://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue.git dev-queue
config: x86_64-randconfig-x011-201630 (attached as .config)
compiler: gcc-6 (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All errors (new ones prefixed by >>):

   drivers/net/ethernet/intel/e1000/e1000_main.c: In function 'e1000_intr':
>> drivers/net/ethernet/intel/e1000/e1000_main.c:3800:22: error: 'struct e1000_adapter' has no member named 'watchdog_timer'; did you mean 'watchdog_task'?
       mod_timer(&adapter->watchdog_timer, jiffies + 1);
                         ^~

vim +3800 drivers/net/ethernet/intel/e1000/e1000_main.c

  3794			return IRQ_HANDLED;
  3795	
  3796		if (unlikely(icr & (E1000_ICR_RXSEQ | E1000_ICR_LSC))) {
  3797			hw->get_link_status = 1;
  3798			/* guard against interrupt when we're going down */
  3799			if (!test_bit(__E1000_DOWN, &adapter->flags))
> 3800				mod_timer(&adapter->watchdog_timer, jiffies + 1);
  3801		}
  3802	
  3803		/* disable interrupts, without the synchronize_irq bit */

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

Thomas Gleixner July 26, 2016, 9:50 a.m. UTC | #2
On Tue, 26 Jul 2016, Fengguang Wu wrote:
> --- a/drivers/net/ethernet/intel/e1000/e1000_main.c
> +++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
> @@ -3797,7 +3797,7 @@ static irqreturn_t e1000_intr(int irq, void *data)
> 		hw->get_link_status = 1;
> 		/* guard against interrupt when we're going down */
> 		if (!test_bit(__E1000_DOWN, &adapter->flags))
> -			schedule_delayed_work(&adapter->watchdog_task, 1);
> +			mod_timer(&adapter->watchdog_timer, jiffies + 1);

ROTFL ....

kernel test robot July 26, 2016, 9:50 a.m. UTC | #3
Hi,

[auto build test ERROR on jkirsher-next-queue/dev-queue]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Fengguang-Wu/schedule-function-called-for-e1000-driver-interrupt/20160726-173521
base:   https://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue.git dev-queue
config: sparc64-allyesconfig (attached as .config)
compiler: sparc64-linux-gnu-gcc (Debian 5.4.0-6) 5.4.0 20160609
reproduce:
        wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=sparc64 

All errors (new ones prefixed by >>):

   drivers/net/ethernet/intel/e1000/e1000_main.c: In function 'e1000_intr':
>> drivers/net/ethernet/intel/e1000/e1000_main.c:3800:22: error: 'struct e1000_adapter' has no member named 'watchdog_timer'
       mod_timer(&adapter->watchdog_timer, jiffies + 1);
                         ^

vim +3800 drivers/net/ethernet/intel/e1000/e1000_main.c

  3794			return IRQ_HANDLED;
  3795	
  3796		if (unlikely(icr & (E1000_ICR_RXSEQ | E1000_ICR_LSC))) {
  3797			hw->get_link_status = 1;
  3798			/* guard against interrupt when we're going down */
  3799			if (!test_bit(__E1000_DOWN, &adapter->flags))
> 3800				mod_timer(&adapter->watchdog_timer, jiffies + 1);
  3801		}
  3802	
  3803		/* disable interrupts, without the synchronize_irq bit */

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
Patch

diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c
index f42129d..ae38513 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_main.c
+++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
@@ -3797,7 +3797,7 @@  static irqreturn_t e1000_intr(int irq, void *data)
 		hw->get_link_status = 1;
 		/* guard against interrupt when we're going down */
 		if (!test_bit(__E1000_DOWN, &adapter->flags))
-			schedule_delayed_work(&adapter->watchdog_task, 1);
+			mod_timer(&adapter->watchdog_timer, jiffies + 1);
 	}
 
 	/* disable interrupts, without the synchronize_irq bit */
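
For context on the one-line change above: schedule_delayed_work() queues the work item and returns without blocking, so it is allowed in an interrupt handler, and the build reports show that struct e1000_adapter in this tree has watchdog_task but no watchdog_timer. A user-space C sketch of that queue-now, run-later behaviour follows; the model_* names are hypothetical stand-ins, not the workqueue API.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a delayed work item; not a kernel type. */
struct model_work {
	void (*fn)(struct model_work *w);
	unsigned long delay;	/* recorded only, the model has no clock */
	bool pending;
};

static int watchdog_runs;

/* Stand-in for the watchdog body that later runs in process context. */
static void model_watchdog(struct model_work *w)
{
	(void)w;
	watchdog_runs++;
}

/*
 * Models schedule_delayed_work(): marks the item pending and returns at
 * once. Returns false if the item was already pending, true otherwise.
 */
static bool model_schedule_delayed_work(struct model_work *w,
					unsigned long delay)
{
	if (w->pending)
		return false;
	w->delay = delay;
	w->pending = true;
	return true;
}

/* Models the worker that runs the item later, outside the handler. */
static void model_worker_run(struct model_work *w)
{
	if (!w->pending)
		return;
	w->pending = false;
	w->fn(w);
}
```

In this model the handler's call only flips a flag; model_watchdog() runs when model_worker_run() is called later.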