Kernel oops on setting sky2 interfaces down

Message ID 20090727153548.7c0d9f85@nehalam
State RFC, archived
Delegated to: David Miller

Commit Message

stephen hemminger July 27, 2009, 10:35 p.m. UTC
Does this help?

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Rene Mayrhofer July 28, 2009, 7:25 a.m. UTC | #1
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Stephen Hemminger wrote:
> Does this help?
Trying right now, will report results as soon as I have them.

Rene
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkpup84ACgkQq7SPDcPCS96z8wCfZmWQMt1f5DHdOtsI1oCouqGU
dXwAoMXHAXKJNmZaWLiM6WjoIxEQWNlg
=+qLh
-----END PGP SIGNATURE-----
Rene Mayrhofer July 28, 2009, 9:48 a.m. UTC | #2
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Stephen Hemminger wrote:
> Does this help?
> 
> --- a/drivers/net/sky2.c	2009-07-27 15:28:27.653757064 -0700
> +++ b/drivers/net/sky2.c	2009-07-27 15:34:24.358730966 -0700
> @@ -2763,6 +2763,11 @@ static int sky2_poll(struct napi_struct 
>  	int work_done = 0;
>  	u16 idx;
>  
> +	if (unlikely(status == ~0)) {
> +		dev_info(&hw->pdev->dev, "device status error\n");
> +		goto clear_napi;
> +	}
> +
>  	if (unlikely(status & Y2_IS_ERROR))
>  		sky2_err_intr(hw, status);
>  
> @@ -2779,6 +2784,7 @@ static int sky2_poll(struct napi_struct 
>  			goto done;
>  	}
>  
> +clear_napi:
>  	napi_complete(napi);
>  	sky2_read32(hw, B0_Y2_SP_LISR);
>  done:

With this applied, the behaviour is certainly different: on the first
networking restart, the kernel keeps running, although there are some
phy errors. On the second restart, there is no Oops, but a BUG.
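
For readers following the patch: the new check treats a status word of
all ones as "the device did not answer", which is what a read from an
unresponsive or powered-down PCI device typically returns. Below is a
minimal user-space sketch of the order of the checks, not the driver
code itself. DEMO_IS_ERROR, enum poll_action and classify_status() are
hypothetical names made up for illustration; the real Y2_IS_ERROR mask
is defined in the driver headers and is not reproduced here.

```c
#include <stdint.h>

/* Hypothetical stand-in for the driver's error mask (assumption). */
#define DEMO_IS_ERROR 0x80000000u

enum poll_action {
	POLL_DEVICE_GONE,   /* status is all ones: skip straight to napi_complete */
	POLL_HANDLE_ERROR,  /* a genuine error bit is set */
	POLL_NORMAL         /* nothing unusual, continue with normal processing */
};

/*
 * Classify a 32-bit interrupt status word in the same order as the
 * patched poll routine: the all-ones test must come first, because an
 * all-ones word also has every error bit set.
 */
static enum poll_action classify_status(uint32_t status)
{
	if (status == UINT32_MAX)
		return POLL_DEVICE_GONE;
	if (status & DEMO_IS_ERROR)
		return POLL_HANDLE_ERROR;
	return POLL_NORMAL;
}
```

The ordering is the point of the sketch: without the first test, an
all-ones read would be handed to the error handler as if every error
bit had really been raised by the hardware.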


[~]# /etc/init.d/networking restart
Reconfiguring network interfaces...Removed VLAN -:quara.6:-
[  269.681295] sky2 0000:03:00.0: dmz: phy I/O error
[  269.686079] sky2 0000:03:00.0: dmz: phy I/O error
[  269.691000] sky2 0000:03:00.0: dmz: phy I/O error
[  269.695880] sky2 0000:03:00.0: dmz: phy I/O error
[  269.700751] sky2 0000:03:00.0: dmz: phy I/O error
[  269.705613] sky2 0000:03:00.0: dmz: phy I/O error
[  269.710519] sky2 0000:03:00.0: dmz: phy I/O error
[  269.715420] sky2 0000:03:00.0: dmz: phy I/O error
[  269.720290] sky2 0000:03:00.0: dmz: phy I/O error
[  269.725203] sky2 0000:03:00.0: dmz: phy I/O error
Set name-type for VLAN subsystem. Should be visible in /proc/net/vlan/config
Added VLAN with VID == 6 to IF -:testnet:-


[~]# /etc/init.d/networking restart
Reconfiguring network interfaces...[  298.296616] ICMPv6 NA: someone
advertises our address on lan!

[  299.360719] lan: hw csum failure.

[  299.364169] Pid: 11563, comm: sh Not tainted 2.6.28.10 #3

[  299.369763] Call Trace:

[  299.372317]  [<c09eb603>] __skb_checksum_complete_head+0x3e/0x4f

[  299.378536]  [<f827f1f7>] udp_error+0x124/0x198 [nf_conntrack]

[  299.384571]  [<f827f0d3>] udp_error+0x0/0x198 [nf_conntrack]

[  299.390429]  [<f827b7bd>] nf_conntrack_in+0x117/0x72a [nf_conntrack]

[  299.397138]  [<c08718e1>] handle_mm_fault+0x54a/0xbfa

[  299.402464]  [<c0a04da4>] nf_iterate+0x30/0x61

[  299.407148]  [<c0a0a124>] ip_rcv_finish+0x0/0x2af

[  299.412108]  [<c0a04f17>] nf_hook_slow+0x49/0xbd

[  299.416893]  [<c0a0a124>] ip_rcv_finish+0x0/0x2af

[  299.421760]  [<c0a0a5a9>] ip_rcv+0x1d6/0x20e

[  299.426185]  [<c0a0a124>] ip_rcv_finish+0x0/0x2af

[  299.431067]  [<c09ef01d>] netif_receive_skb+0x3f7/0x435

[  299.436479]  [<f8085143>] sky2_poll+0x844/0xc21 [sky2]

[  299.441780]  [<f817f52e>] au_reval_and_lock_fdi+0x7e/0x540 [aufs]

[  299.448088]  [<f817f3aa>] au_reopen_nondir+0x0/0x106 [aufs]

[  299.453825]  [<c09eda45>] net_rx_action+0xb8/0x1f6

[  299.458886]  [<c082f954>] __do_softirq+0x95/0x142

[  299.463850]  [<c082fa49>] do_softirq+0x48/0x57

[  299.468547]  [<c082fbc9>] irq_exit+0x3b/0x78

[  299.473053]  [<c0806642>] do_IRQ+0x7a/0x8c

[  299.477306]  [<c0804e23>] common_interrupt+0x23/0x30

[  299.482543]  [<c089f0f1>] fsstack_copy_inode_size+0x1d/0x3f

[  299.488314]  [<f817b3a4>] au_cpup_attr_timesizes+0x4e/0x58 [aufs]

[  299.494615]  [<f8180b06>] aufs_flush+0x93/0xc9 [aufs]

[  299.499837]  [<c0884a24>] filp_close+0x2e/0x53

[  299.504559]  [<c0884ab4>] sys_close+0x6b/0xa4

[  299.509178]  [<c0803cb2>] syscall_call+0x7/0xb

[  299.513893]  [<c0a60000>] rwsem_down_failed_common+0xa4/0x175

[  299.621662] ------------[ cut here ]------------

[  299.625591] kernel BUG at drivers/net/sky2.c:1781!

[  299.625591] invalid opcode: 0000 [#1] PREEMPT SMP

[  299.625591] last sysfs file:
/sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed

[  299.625591] Modules linked in: xt_multiport cpufreq_userspace xt_DSCP
xt_length xt_mark xt_dscp xt_MARK xt_CONNMARK xt_comment xt_policy
ipt_REDIRECT ip6t_LOG xt_tcpudp ip6table_mangle iptable_mangle
ip6table_filter ip6_tables sit tunnel4 8021q garp stp llc ipt_LOG
xt_limit xt_state iptable_nat iptable_filter ip_tables x_tables dm_mod
p4_clockmod speedstep_lib freq_table tun imq nf_nat_ftp nf_nat
nf_conntrack_ftp nf_conntrack_ipv6 nf_conntrack_ipv4 nf_conntrack
nf_defrag_ipv4 ipv6 evdev parport_pc parport pcspkr serio_raw i2c_i801
i2c_core iTCO_wdt rng_core intel_agp agpgart squashfs sqlzma unlzma loop
aufs exportfs nls_utf8 nls_cp437 ide_generic sd_mod ide_gd_mod
ata_generic pata_acpi ata_piix piix ide_pci_generic ide_core skge sky2
thermal_sys
[  299.625591]

[  299.625591] Pid: 11626, comm: ip Not tainted (2.6.28.10 #3)

[  299.625591] EIP: 0060:[<f80836a5>] EFLAGS: 00010256 CPU: 0

[  299.625591] EIP is at sky2_down+0x84/0x5c3 [sky2]

[  299.625591] EAX: f8010000 EBX: 00000280 ECX: 000006b4 EDX: f7060000

[  299.625591] ESI: 00000000 EDI: f684e980 EBP: 00000001 ESP: f5e35b78

[  299.625591]  DS: 0068 ES: 0068 FS: 00d8 GS: 0033 SS: 0068

[  299.625591] Process ip (pid: 11626, ti=f5e34000 task=f4a7e140
task.ti=f5e34000)

[  299.625591] Stack:

[  299.625591]  f7060000 00000303 00000004 f7060500 00000004 c09fc8c5
00000000 f7060290

[  299.625591]  f7060000 00001002 00001003 00000001 c09f00bd f7060000
c09efe15 00000000

[  299.625591]  ffffffef 00000000 00000000 f5e35ce4 c09f66b8 f5e35c28
f5e8f410 f7060000

[  299.625591] Call Trace:

[  299.625591]  [<c09fc8c5>] dev_deactivate+0x116/0x13b

[  299.625591]  [<c09f00bd>] dev_close+0x5f/0x7b

[  299.625591]  [<c09efe15>] dev_change_flags+0x9e/0x14f

[  299.625591]  [<c09f66b8>] do_setlink+0x28b/0x349

[  299.625591]  [<f80438b2>] skge_get_stats+0x2f/0x7b [skge]

[  299.625591]  [<c09f77a2>] rtnl_newlink+0x292/0x3f7

[  299.625591]  [<c09f756a>] rtnl_newlink+0x5a/0x3f7

[  299.625591]  [<c09f75aa>] rtnl_newlink+0x9a/0x3f7

[  299.625591]  [<c0862c3d>] find_get_page+0x87/0xaa

[  299.625591]  [<c09f7510>] rtnl_newlink+0x0/0x3f7

[  299.625591]  [<c09f74f6>] rtnetlink_rcv_msg+0x188/0x1a2

[  299.625591]  [<c09f736e>] rtnetlink_rcv_msg+0x0/0x1a2

[  299.625591]  [<c0a0390e>] netlink_rcv_skb+0x2d/0x73

[  299.625591]  [<c09f7368>] rtnetlink_rcv+0x19/0x1f

[  299.625591]  [<c0a03477>] netlink_unicast+0x1c7/0x229

[  299.625591]  [<c0a03729>] netlink_sendmsg+0x250/0x25d

[  299.625591]  [<c09e497a>] sock_sendmsg+0xc7/0xe1

[  299.625591]  [<c083b5fc>] autoremove_wake_function+0x0/0x2d

[  299.625591]  [<c083b5fc>] autoremove_wake_function+0x0/0x2d

[  299.625591]  [<c083b5fc>] autoremove_wake_function+0x0/0x2d

[  299.625591]  [<c09eb21b>] verify_iovec+0x3e/0x6b

[  299.625591]  [<c09e4b18>] sys_sendmsg+0x184/0x1e3

[  299.625591]  [<c09e574f>] sys_recvmsg+0x147/0x1e5

[  299.625591]  [<c09e5413>] sys_sendto+0xf9/0x124

[  299.625591]  [<c0a0298c>] netlink_insert+0xd2/0xef

[  299.625591]  [<c08757bb>] vma_merge+0x1d7/0x3ac

[  299.625591]  [<c09e5dd8>] sys_socketcall+0x177/0x1a9

[  299.625591]  [<c0875f3b>] sys_brk+0xd1/0xd9

[  299.625591]  [<c0803cb2>] syscall_call+0x7/0xb

[  299.625591]  [<c0a60000>] rwsem_down_failed_common+0xa4/0x175

[  299.625591] Code: b5 00 66 08 f8 b8 00 02 00 00 8d 8b 34 04 00 00 89
ca 03 17 89 02 8b 14 24 8b 82 00 05 00 00 8b 00 83 c0 04 66 8b 00 66 40
75 04 <0f> 0b eb fe 89 c8 03 07 8b 00 ba 05 00 00 00 8d 83 28 08 00 00

[  299.625591] EIP: [<f80836a5>] sky2_down+0x84/0x5c3 [sky2] SS:ESP
0068:f5e35b78

[  299.969972] sky2 0000:03:00.0: device status error


[  299.982399] ---[ end trace c27f023d76d6060f ]---

[  300.125238] lan: hw csum failure.
[  300.128944] Pid: 0, comm: swapper Tainted: G      D    2.6.28.10 #3
[  300.136882] Call Trace:
[  300.140827]  [<c09eb603>] __skb_checksum_complete_head+0x3e/0x4f
[  300.148422]  [<f827f1f7>] udp_error+0x124/0x198 [nf_conntrack]
[  300.155814]  [<f827f0d3>] udp_error+0x0/0x198 [nf_conntrack]
[  300.163067]  [<f827b7bd>] nf_conntrack_in+0x117/0x72a [nf_conntrack]
[  300.170988]  [<c08620a9>] cpupri_set+0xcf/0xeb
[  300.177061]  [<c08232cf>] enqueue_task_rt+0xfd/0x1ba
[  300.183686]  [<c081f9e3>] enqueue_task+0x52/0x5d
[  300.189847]  [<c0a604e8>] _spin_unlock_irqrestore+0x22/0x39
[  300.196991]  [<c0827013>] try_to_wake_up+0x158/0x162
[  300.203501]  [<c0a04da4>] nf_iterate+0x30/0x61
[  300.209498]  [<c0a0a124>] ip_rcv_finish+0x0/0x2af
[  300.215661]  [<c0a04f17>] nf_hook_slow+0x49/0xbd
[  300.221842]  [<c0a0a124>] ip_rcv_finish+0x0/0x2af
[  300.228078]  [<c0a0a5a9>] ip_rcv+0x1d6/0x20e
[  300.233875]  [<c0a0a124>] ip_rcv_finish+0x0/0x2af
[  300.240132]  [<c09ef01d>] netif_receive_skb+0x3f7/0x435
[  300.246911]  [<f8085143>] sky2_poll+0x844/0xc21 [sky2]
[  300.253594]  [<c09eda45>] net_rx_action+0xb8/0x1f6
[  300.259939]  [<c082f954>] __do_softirq+0x95/0x142
[  300.266166]  [<c082fa49>] do_softirq+0x48/0x57
[  300.272137]  [<c082fbc9>] irq_exit+0x3b/0x78
[  300.277945]  [<c0806642>] do_IRQ+0x7a/0x8c
[  300.283559]  [<c0804e23>] common_interrupt+0x23/0x30
[  300.290055]  [<c080a2c6>] mwait_idle+0x2f/0x3b
[  300.296043]  [<c0802ac9>] cpu_idle+0x7a/0xad

/etc/network/if-down.d/60address: line 25: 11626 Segmentation fault      ip link set dev $DEV down

run-parts: /etc/network/if-down.d/60address exited with return code 139

[  300.968019] sky2 0000:03:00.0: device status error

^C

[~]# [  302.000024] sky2 0000:03:00.0: device status error
[  302.746708] lan: hw csum failure.
[  302.750282] Pid: 0, comm: swapper Tainted: G      D    2.6.28.10 #3
[  302.757023] Call Trace:
[  302.759602]  [<c09eb603>] __skb_checksum_complete_head+0x3e/0x4f
[  302.765863]  [<f827f1f7>] udp_error+0x124/0x198 [nf_conntrack]
[  302.771934]  [<f827f0d3>] udp_error+0x0/0x198 [nf_conntrack]
[  302.777799]  [<f827b7bd>] nf_conntrack_in+0x117/0x72a [nf_conntrack]
[  302.784380]  [<c08210dd>] __wake_up_sync+0x2a/0x3e
[  302.789356]  [<c0a604e8>] _spin_unlock_irqrestore+0x22/0x39
[  302.795129]  [<c09e6ca5>] sock_def_readable+0x32/0x5b
[  302.800379]  [<c0a6050d>] _read_unlock+0xe/0x21
[  302.805080]  [<c09e8064>] sock_queue_rcv_skb+0xb5/0xbd
[  302.810421]  [<c0a25e9f>] __udp_queue_rcv_skb+0x12/0x86
[  302.815849]  [<c0a04da4>] nf_iterate+0x30/0x61
[  302.820442]  [<c0a0a124>] ip_rcv_finish+0x0/0x2af
[  302.825356]  [<c0a04f17>] nf_hook_slow+0x49/0xbd
[  302.830416]  [<c0a0a124>] ip_rcv_finish+0x0/0x2af
[  302.835510]  [<c0a0a5a9>] ip_rcv+0x1d6/0x20e
[  302.840117]  [<c0a0a124>] ip_rcv_finish+0x0/0x2af
[  302.845167]  [<c09ef01d>] netif_receive_skb+0x3f7/0x435
[  302.850791]  [<f8085143>] sky2_poll+0x844/0xc21 [sky2]
[  302.856337]  [<c0811a18>] lapic_next_event+0x10/0x13
[  302.861672]  [<c09eda45>] net_rx_action+0xb8/0x1f6
[  302.866654]  [<c082f954>] __do_softirq+0x95/0x142
[  302.871590]  [<c082fa49>] do_softirq+0x48/0x57
[  302.876335]  [<c082fbc9>] irq_exit+0x3b/0x78
[  302.880861]  [<c0806642>] do_IRQ+0x7a/0x8c
[  302.885184]  [<c0804e23>] common_interrupt+0x23/0x30
[  302.890422]  [<c080a2c6>] mwait_idle+0x2f/0x3b
[  302.896051]  [<c0802ac9>] cpu_idle+0x7a/0xad
[  303.000027] sky2 0000:03:00.0: device status error
[  303.058758] lan: hw csum failure.
[  303.062367] Pid: 0, comm: swapper Tainted: G      D    2.6.28.10 #3
[  303.068861] Call Trace:
[  303.071406]  [<c09eb603>] __skb_checksum_complete_head+0x3e/0x4f
[  303.077789]  [<f827f1f7>] udp_error+0x124/0x198 [nf_conntrack]
[  303.083871]  [<c0811a18>] lapic_next_event+0x10/0x13
[  303.089068]  [<f827f0d3>] udp_error+0x0/0x198 [nf_conntrack]
[  303.094929]  [<f827b7bd>] nf_conntrack_in+0x117/0x72a [nf_conntrack]
[  303.101514]  [<c0844533>] tick_program_event+0x2c/0x32
[  303.106851]  [<c083e621>] hrtimer_interrupt+0x146/0x16e
[  303.112313]  [<c082fb99>] irq_exit+0xb/0x78
[  303.116669]  [<c081218f>] smp_apic_timer_interrupt+0x75/0x7f
[  303.122585]  [<c0a04da4>] nf_iterate+0x30/0x61
[  303.127184]  [<c0a0a124>] ip_rcv_finish+0x0/0x2af
[  303.132067]  [<c0a04f17>] nf_hook_slow+0x49/0xbd
[  303.136911]  [<c0a0a124>] ip_rcv_finish+0x0/0x2af
[  303.141785]  [<c0a0a5a9>] ip_rcv+0x1d6/0x20e
[  303.146239]  [<c0a0a124>] ip_rcv_finish+0x0/0x2af
[  303.151112]  [<c09ef01d>] netif_receive_skb+0x3f7/0x435
[  303.156552]  [<f8085143>] sky2_poll+0x844/0xc21 [sky2]
[  303.161923]  [<c08417f0>] getnstimeofday+0x4f/0xd5
[  303.166879]  [<c09eda45>] net_rx_action+0xb8/0x1f6
[  303.171889]  [<c082f954>] __do_softirq+0x95/0x142
[  303.176758]  [<c082fa49>] do_softirq+0x48/0x57
[  303.181403]  [<c082fbc9>] irq_exit+0x3b/0x78
[  303.185850]  [<c0806642>] do_IRQ+0x7a/0x8c
[  303.190128]  [<c0804e23>] common_interrupt+0x23/0x30
[  303.195241]  [<c080a2c6>] mwait_idle+0x2f/0x3b
[  303.199879]  [<c0802ac9>] cpu_idle+0x7a/0xad
[  304.000039] sky2 0000:03:00.0: device status error
[  304.150934] lan: hw csum failure.
[  304.154546] Pid: 0, comm: swapper Tainted: G      D    2.6.28.10 #3
[  304.161288] Call Trace:
[  304.163876]  [<c09eb603>] __skb_checksum_complete_head+0x3e/0x4f
[  304.170102]  [<f827f1f7>] udp_error+0x124/0x198 [nf_conntrack]
[  304.176113]  [<c0811a18>] lapic_next_event+0x10/0x13
[  304.181260]  [<f827f0d3>] udp_error+0x0/0x198 [nf_conntrack]
[  304.187188]  [<f827b7bd>] nf_conntrack_in+0x117/0x72a [nf_conntrack]
[  304.193794]  [<c0844533>] tick_program_event+0x2c/0x32
[  304.199092]  [<c083e621>] hrtimer_interrupt+0x146/0x16e
[  304.204545]  [<c082fb99>] irq_exit+0xb/0x78
[  304.208887]  [<c081218f>] smp_apic_timer_interrupt+0x75/0x7f
[  304.214774]  [<c0a04da4>] nf_iterate+0x30/0x61
[  304.219390]  [<c0a0a124>] ip_rcv_finish+0x0/0x2af
[  304.224251]  [<c0a04f17>] nf_hook_slow+0x49/0xbd
[  304.229045]  [<c0a0a124>] ip_rcv_finish+0x0/0x2af
[  304.233922]  [<c0a0a5a9>] ip_rcv+0x1d6/0x20e
[  304.238348]  [<c0a0a124>] ip_rcv_finish+0x0/0x2af
[  304.243199]  [<c09ef01d>] netif_receive_skb+0x3f7/0x435
[  304.248634]  [<f8085143>] sky2_poll+0x844/0xc21 [sky2]
[  304.253961]  [<c0811a18>] lapic_next_event+0x10/0x13
[  304.259076]  [<c09eda45>] net_rx_action+0xb8/0x1f6
[  304.264063]  [<c082f954>] __do_softirq+0x95/0x142
[  304.268988]  [<c082fa49>] do_softirq+0x48/0x57
[  304.273595]  [<c082fbc9>] irq_exit+0x3b/0x78
[  304.278053]  [<c0806642>] do_IRQ+0x7a/0x8c
[  304.282326]  [<c0804e23>] common_interrupt+0x23/0x30
[  304.287467]  [<c080a2c6>] mwait_idle+0x2f/0x3b
[  304.292095]  [<c0802ac9>] cpu_idle+0x7a/0xad
[  305.000039] sky2 0000:03:00.0: device status error
[  306.000049] sky2 0000:03:00.0: device status error
[  307.000058] sky2 0000:03:00.0: device status error
[  308.000090] sky2 0000:03:00.0: device status error
[  309.000145] sky2 0000:03:00.0: device status error

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkpuyU0ACgkQq7SPDcPCS95M+gCg/Sc8kO3FmHrQTdIqeKIzq1XI
gEwAoJez4jZCId+81exvRH6jF4Lzj922
=ryPP
-----END PGP SIGNATURE-----
Rene Mayrhofer Aug. 3, 2009, 11:55 a.m. UTC | #3
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I have now tried again with the newest stable kernel (2.6.30.4), without
PaX and squashfs-lzma support. Still the same problem:

[~]# uname -a
Linux gibraltar3-esys-master 2.6.30.4 #9 SMP PREEMPT Fri Jul 31 15:32:55
UTC 2009 i686 GNU/Linux
[~]# /etc/init.d/networking restart
Reconfiguring network interfaces...[  277.816049] sky2 0000:01:00.0:
error interrupt status=0xffffffff
[  277.822124] sky2 0000:01:00.0: PCI hardware error (0xffff)
[  277.827656] sky2 0000:01:00.0: PCI Express error (0xffffffff)
[  277.833449] sky2 wan: ram data read parity error
[  277.838107] sky2 wan: ram data write parity error
[  277.842852] sky2 wan: MAC parity error
[  277.846643] sky2 wan: RX parity error
[  277.850345] sky2 wan: TCP segmentation error
[  277.854688] BUG: unable to handle kernel NULL pointer dereference at
0000038d
[  277.858653] IP: [<f8050ca5>] sky2_mac_intr+0x30/0xc1 [sky2]
[  277.858653] *pde = 00000000
[  277.858653] Oops: 0000 [#1] PREEMPT SMP
[  277.858653] last sysfs file:
/sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed
[  277.858653] Modules linked in: xt_multiport cpufreq_userspace xt_DSCP
xt_length xt_mark xt_dscp xt_MARK xt_CONNMARK xt_comment xt_policy
ipt_REDIRECT ip6t_LOG xt_tcpudp ip6table_mangle iptable_mangle
ip6table_filter ip6_tables sit tunnel4 8021q garp stp llc ipt_LOG
xt_limit xt_state iptable_nat iptable_filter ip_tables x_tables dm_mod
p4_clockmod speedstep_lib freq_table tun imq nf_nat_ftp nf_nat
nf_conntrack_ftp nf_conntrack_ipv6 nf_conntrack_ipv4 nf_conntrack
nf_defrag_ipv4 ipv6 evdev parport_pc parport serio_raw i2c_i801 pcspkr
i2c_core iTCO_wdt rng_core intel_agp loop aufs exportfs nls_utf8
nls_cp437 ide_generic sd_mod ide_gd_mod ata_generic pata_acpi skge
ata_piix piix ide_pci_generic ide_core sky2 thermal_sys
[  277.858653]
[  277.858653] Pid: 9423, comm: tlsmgr Not tainted (2.6.30.4 #9)
[  277.858653] EIP: 0060:[<f8050ca5>] EFLAGS: 00010286 CPU: 0
[  277.858653] EIP is at sky2_mac_intr+0x30/0xc1 [sky2]
[  277.858653] EAX: f8068f88 EBX: 00000001 ECX: 00000008 EDX: 000000ff
[  277.858653] ESI: 00000000 EDI: f6901b80 EBP: f6acfce4 ESP: f6acfccc
[  277.858653]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[  277.858653] Process tlsmgr (pid: 9423, ti=f6ace000 task=f7176e70
task.ti=f6ace000)
[  277.858653] Stack:
[  277.858653]  00000080 ff901b80 968c5f08 f71ed840 ffffffff ffffffff
f6acfd6c f80542d8
[  277.858653]  00000000 c181d260 00000040 f6901b88 f6acfd08 c04ee2b5
f6901b80 ffffffff
[  277.858653]  c022ded2 f71ef000 00000000 00000000 0000000f c181d260
00000000 00000246
[  277.858653] Call Trace:
[  277.858653]  [<f80542d8>] ? sky2_poll+0x1d2/0xb1e [sky2]
[  277.858653]  [<c04ee2b5>] ? _spin_unlock_irqrestore+0x31/0x44
[  277.858653]  [<c022ded2>] ? try_to_wake_up+0x291/0x2ac
[  277.858653]  [<c022df62>] ? wake_up_process+0x1b/0x2e
[  277.858653]  [<c04772f4>] ? __qdisc_run+0x73/0x1ca
[  277.858653]  [<c0463cc2>] ? net_rx_action+0x9e/0x1a2
[  277.858653]  [<c0237b5e>] ? __do_softirq+0xb2/0x188
[  277.858653]  [<c0237c73>] ? do_softirq+0x3f/0x5c
[  277.858653]  [<c0237dfd>] ? irq_exit+0x37/0x80
[  277.858653]  [<c0213cfd>] ? smp_apic_timer_interrupt+0x7c/0x9b
[  277.858653]  [<c02037dd>] ? apic_timer_interrupt+0x31/0x38
[  277.858653]  [<c0371524>] ? radix_tree_lookup_slot+0x34/0x79
[  277.858653]  [<c0284852>] ? find_get_page+0x34/0xc6
[  277.858653]  [<c0284c9e>] ? find_lock_page+0x21/0x67
[  277.858653]  [<c0285214>] ? filemap_fault+0x97/0x366
[  277.858653]  [<c0297054>] ? __do_fault+0x56/0x3b0
[  277.858653]  [<c02503a2>] ? getnstimeofday+0x5f/0xf3
[  277.858653]  [<c0252d85>] ? clockevents_program_event+0xe8/0x108
[  277.858653]  [<c0298f33>] ? handle_mm_fault+0x2b9/0x668
[  277.858653]  [<c024b121>] ? hrtimer_interrupt+0x13e/0x15f
[  277.858653]  [<c021d3f6>] ? do_page_fault+0x1fb/0x21b
[  277.858653]  [<c021d1fb>] ? do_page_fault+0x0/0x21b
[  277.858653]  [<c04ee72a>] ? error_code+0x7a/0x80
[  277.858653] Code: c7 56 53 89 d3 83 ec 0c 65 a1 14 00 00 00 89 45 f0
31 c0 8b 74 97 3c c1 e2 07 89 d0 05 08 0f 00 00 89 55 e8 03 07 8a 10 88
55 ef <f6> 86 8d 03 00 00 02 74 12 0f b6 c2 50 56 68 30 64 05 f8 e8 74
[  277.858653] EIP: [<f8050ca5>] sky2_mac_intr+0x30/0xc1 [sky2] SS:ESP
0068:f6acfccc
[  277.858653] CR2: 000000000000038d
[  278.173200] ---[ end trace bec12ce036036cbf ]---
[  278.177861] Kernel panic - not syncing: Fatal exception in interrupt
[  278.184259] Pid: 9423, comm: tlsmgr Tainted: G      D    2.6.30.4 #9
[  278.190654] Call Trace:
[  278.193140]  [<c04eb04e>] ? printk+0x1d/0x30
[  278.197452]  [<c04eaf8c>] panic+0x53/0xf8
[  278.201506]  [<c0206368>] oops_end+0x9f/0xbf
[  278.205817]  [<c021ceb4>] no_context+0x11a/0x135
[  278.210480]  [<c021d005>] __bad_area_nosemaphore+0x136/0x14f
[  278.216177]  [<c0374e68>] ? vsnprintf+0x91/0x332
[  278.220840]  [<c04ee2b5>] ? _spin_unlock_irqrestore+0x31/0x44
[  278.226622]  [<c04ee2b5>] ? _spin_unlock_irqrestore+0x31/0x44
[  278.232404]  [<c0232f3f>] ? release_console_sem+0x18b/0x1c9
[  278.238015]  [<c021d03b>] bad_area_nosemaphore+0x1d/0x34
[  278.243370]  [<c021d30b>] do_page_fault+0x110/0x21b
[  278.248287]  [<c021d1fb>] ? do_page_fault+0x0/0x21b
[  278.253209]  [<c04ee72a>] error_code+0x7a/0x80
[  278.257693]  [<c037007b>] ? kobject_uevent_env+0x42/0x387
[  278.263141]  [<f8050ca5>] ? sky2_mac_intr+0x30/0xc1 [sky2]
[  278.268673]  [<f80542d8>] sky2_poll+0x1d2/0xb1e [sky2]
[  278.273850]  [<c04ee2b5>] ? _spin_unlock_irqrestore+0x31/0x44
[  278.279632]  [<c022ded2>] ? try_to_wake_up+0x291/0x2ac
[  278.284818]  [<c022df62>] ? wake_up_process+0x1b/0x2e
[  278.289914]  [<c04772f4>] ? __qdisc_run+0x73/0x1ca
[  278.294750]  [<c0463cc2>] net_rx_action+0x9e/0x1a2
[  278.299578]  [<c0237b5e>] __do_softirq+0xb2/0x188
[  278.304321]  [<c0237c73>] do_softirq+0x3f/0x5c
[  278.308801]  [<c0237dfd>] irq_exit+0x37/0x80
[  278.313111]  [<c0213cfd>] smp_apic_timer_interrupt+0x7c/0x9b
[  278.318807]  [<c02037dd>] apic_timer_interrupt+0x31/0x38
[  278.324165]  [<c0371524>] ? radix_tree_lookup_slot+0x34/0x79
[  278.329869]  [<c0284852>] find_get_page+0x34/0xc6
[  278.334619]  [<c0284c9e>] find_lock_page+0x21/0x67
[  278.339447]  [<c0285214>] filemap_fault+0x97/0x366
[  278.344276]  [<c0297054>] __do_fault+0x56/0x3b0
[  278.348842]  [<c02503a2>] ? getnstimeofday+0x5f/0xf3
[  278.353847]  [<c0252d85>] ? clockevents_program_event+0xe8/0x108
[  278.359899]  [<c0298f33>] handle_mm_fault+0x2b9/0x668
[  278.364997]  [<c024b121>] ? hrtimer_interrupt+0x13e/0x15f
[  278.370445]  [<c021d3f6>] do_page_fault+0x1fb/0x21b
[  278.375364]  [<c021d1fb>] ? do_page_fault+0x0/0x21b
[  278.380287]  [<c04ee72a>] error_code+0x7a/0x80
[  278.384779] Rebooting in 30 seconds..
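
A note on the burst of error lines at the top of this log: the status
word is reported as 0xffffffff, and a word with every bit set passes
every "is this error bit set?" test at once. That would explain why all
five parity/segmentation messages appear together. The sketch below
illustrates this in user space. The message strings are taken from the
log above; the bit positions, the demo_errs table and decode_errors()
are assumptions made up for illustration and do not match the real
register layout.

```c
#include <stddef.h>
#include <stdint.h>

struct err_bit {
	uint32_t mask;
	const char *msg;
};

/* Messages as seen in the log; the bit positions are hypothetical. */
static const struct err_bit demo_errs[] = {
	{ 1u << 0, "ram data read parity error" },
	{ 1u << 1, "ram data write parity error" },
	{ 1u << 2, "MAC parity error" },
	{ 1u << 3, "RX parity error" },
	{ 1u << 4, "TCP segmentation error" },
};

#define DEMO_NERRS (sizeof(demo_errs) / sizeof(demo_errs[0]))

/*
 * Collect a pointer to the message of every error bit set in status.
 * Writes at most cap entries to out[] and returns how many it wrote.
 */
static size_t decode_errors(uint32_t status, const char *out[], size_t cap)
{
	size_t n = 0;

	for (size_t i = 0; i < DEMO_NERRS && n < cap; i++) {
		if (status & demo_errs[i].mask)
			out[n++] = demo_errs[i].msg;
	}
	return n;
}
```

With status 0xffffffff the decoder reports all five errors, while a
status of 0 reports none, so the error list by itself says little about
what actually went wrong when the register read returns all ones.
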

To allow easier debugging, I have now put our whole kernel tree up in a
public (read-only) git repository at
https://www.gibraltar.at/git/linux-2.6-gibraltar.git. The branch for
this kernel is origin/gibraltar-3.0, although the above dump was
produced by a version slightly "older" than HEAD, which did not yet have
the latest PaX patch applied (no PaX and no lzma-squashfs in this kernel).

Any hints/pointers/patches/etc. would be highly appreciated.

best regards,
Rene

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkp20DYACgkQq7SPDcPCS96R3QCdGTJsPiJGLfiWUZk67f6wms9Y
rVgAoPMO2hnT3jwRtY0Qz40NRp0DpKxT
=8NsP
-----END PGP SIGNATURE-----
Rene Mayrhofer Aug. 3, 2009, 6:19 p.m. UTC | #4
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Rene Mayrhofer wrote:
> I have now tried again with the newest stable kernel (2.6.30.4), without
> PaX and squashfs-lzma support. Still the same problem:
Sorry for replying to myself, but I tried a few more things to do with MSI:

Neither
  find /sys -name "msi_bus" | while read f; do echo 0 > $f; done
nor booting with
  pci=nomsi
changed anything. The oops still happens when setting the last sky2
interface down.

> To allow easier debugging, I have now put our whole kernel tree up in a
> public (read-only) git repository at
> https://www.gibraltar.at/git/linux-2.6-gibraltar.git. The branch for
> this kernel is origin/gibraltar-3.0, although the above dump was
> produced by a version slightly "older" than HEAD, which did not yet have
> the latest PaX patch applied (no PaX and no lzma-squashfs in this kernel).
I have now updated the branch with both patches (the one from Stephen
and the other one from Mike). I am still testing whether they change
anything with 2.6.30.4 (they didn't help with 2.6.28.10, though...).

best regards,
Rene

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)

iEYEARECAAYFAkp3Kh0ACgkQq7SPDcPCS97QHgCgwdpi7RBPZNV1Of85/8qg5DsE
DWoAnjlT8U5wqN9ywxUyUpLyivH/Ex1h
=DCdB
-----END PGP SIGNATURE-----
Rene Mayrhofer Aug. 4, 2009, 7:38 a.m. UTC | #5
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Rene Mayrhofer wrote:
>> To allow easier debugging, I have now put our whole kernel tree up in a
>> public (read-only) git repository at
>> https://www.gibraltar.at/git/linux-2.6-gibraltar.git. The branch for
>> this kernel is origin/gibraltar-3.0, although the above dump was
>> produced by a version slightly "older" than HEAD, which did not yet have
>> the latest PaX patch applied (no PaX and no lzma-squashfs in this kernel).

> I have now updated the branch with both patches (the one from Stephen
> and the other one from Mike). I am still testing whether they change
> anything with 2.6.30.4 (they didn't help with 2.6.28.10, though...).

Result with both patches: there is no immediate crash when setting all
sky2 interfaces down, but I get the following messages repeated roughly
every second:

2009-08-04T09:35:31.030812+02:00 gibraltar3-esys-master kernel: [
592.000071] sky2 0000:01:00.0: device status error
2009-08-04T09:35:32.030908+02:00 gibraltar3-esys-master kernel: [
593.000058] sky2 0000:01:00.0: device status error
2009-08-04T09:35:33.030839+02:00 gibraltar3-esys-master kernel: [
594.000082] sky2 0000:01:00.0: device status error
2009-08-04T09:35:34.030864+02:00 gibraltar3-esys-master kernel: [
595.000118] sky2 0000:01:00.0: device status error
2009-08-04T09:35:35.030975+02:00 gibraltar3-esys-master kernel: [
596.000259] sky2 0000:01:00.0: device status error
2009-08-04T09:35:36.030974+02:00 gibraltar3-esys-master kernel: [
597.000198] sky2 0000:01:00.0: device status error
2009-08-04T09:35:37.030980+02:00 gibraltar3-esys-master kernel: [
598.000203] sky2 0000:01:00.0: device status error

and the network interface fails to work (no ping, nothing with tcpdump,
etc.).

Does anybody have an idea on what might be wrong in sky2_down?

best regards,
Rene
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkp35WsACgkQq7SPDcPCS97wHQCcCYWO2qgg+LdW+BFUmeOXjGVT
B68AniD3Ur2NugPGhuvz3Fxy68Zl+3f4
=5MhE
-----END PGP SIGNATURE-----
Rene Mayrhofer Aug. 4, 2009, 10:55 p.m. UTC | #6
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Mike McCormack wrote:
> 2009/8/4 Rene Mayrhofer <rene.mayrhofer@gibraltar.at>:
> 
>> Does anybody have an idea on what might be wrong in sky2_down?
> 
> btw. for 2.6.30, I found I could copy sky2.c from the netdev git into
> my 2.6.30 tree if I added the following line at the end of
> sky2_xmit_frame() :
> 
>         dev->trans_start = jiffies;     /* prevent tx timeout */

This seems to be already included in the current netdev git.
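
For context on why that one line matters: the transmit watchdog decides
that a device is stuck by comparing the current tick count against the
time of the last transmit plus a timeout. The sketch below is a
simplified user-space model of that comparison, not the kernel code.
struct demo_netdev, demo_xmit(), demo_tx_timed_out() and tick_after()
are hypothetical names; tick_after() is written in the spirit of the
kernel's time_after() so that it survives counter wraparound, and the
real watchdog is assumed to check more conditions than shown here.

```c
#include <stdbool.h>

/* Wrap-safe "a is later than b" for a free-running tick counter. */
static bool tick_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

struct demo_netdev {
	unsigned long trans_start;     /* tick of the last transmit */
	unsigned long watchdog_timeo;  /* ticks allowed without a transmit */
};

/* Models the added line: dev->trans_start = jiffies; */
static void demo_xmit(struct demo_netdev *dev, unsigned long now)
{
	dev->trans_start = now;
}

/* True once more than watchdog_timeo ticks passed since the last xmit. */
static bool demo_tx_timed_out(const struct demo_netdev *dev, unsigned long now)
{
	return tick_after(now, dev->trans_start + dev->watchdog_timeo);
}
```

If the transmit path never refreshes trans_start, the deadline stays in
the past and the timeout check fires even though frames are being sent,
which is the spurious tx timeout the comment in the quoted line refers
to.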

Nonetheless, the current unmodified version from netdev git solves the
oops in sky2. I have not diffed my old vs. this version, but whoever is
interested in which change fixed the oops, it should be somewhere in
commit 0a1449c in our Gibraltar kernel git repository.

Thanks a lot for that hint!
Rene
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkp4vEQACgkQq7SPDcPCS97BtgCfZy1QTeQOL340hD0HIgTC1c3O
Gy0An1u8zdh4wyU4DchLfxNWzqlJExV+
=0+E4
-----END PGP SIGNATURE-----
Rene Mayrhofer Aug. 4, 2009, 10:59 p.m. UTC | #7
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Rene Mayrhofer wrote:
> Nonetheless, the current unmodified version from netdev git solves the
> oops in sky2. 
Actually, it doesn't. I managed to run networking restart twice without
an oops (with the netdev git version of sky2.c), but after generating
some minor traffic and trying to restart again, I still get this oops:

[~]# /etc/init.d/networking restart
Reconfiguring network interfaces...[  844.000236] sky2 0000:01:00.0:
error interrupt status=0xffffffff

[  844.007309] sky2 0000:01:00.0: PCI hardware error (0xffff)

[  844.013657] sky2 0000:01:00.0: PCI Express error (0xffffffff)

[  844.020290] sky2 wan: ram data read parity error

[  844.025697] sky2 wan: ram data write parity error

[  844.031148] sky2 wan: MAC parity error

[  844.035522] sky2 wan: RX parity error

[  844.039812] sky2 wan: TCP segmentation error

[  844.044966] BUG: unable to handle kernel NULL pointer dereference at
0000038d
[  844.048782] IP: [<f8050d2d>] sky2_mac_intr+0x30/0xc1 [sky2]

[  844.048782] *pde = 00000000

[  844.048782] Oops: 0000 [#1] PREEMPT SMP

[  844.048782] last sysfs file:
/sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed

[  844.048782] Modules linked in: xt_multiport cpufreq_userspace xt_DSCP
xt_length xt_mark xt_dscp xt_MARK xt_CONNMARK xt_comment xt_policy
ipt_REDIRECT ip6t_LOG xt_tcpudp ip6table_mangle iptable_mangle
ip6table_filter ip6_tables sit tunnel4 8021q garp stp llc ipt_LOG
xt_limit xt_state iptable_nat iptable_filter ip_tables x_tables dm_mod
p4_clockmod speedstep_lib freq_table tun imq nf_nat_ftp nf_nat
nf_conntrack_ftp nf_conntrack_ipv6 nf_conntrack_ipv4 nf_conntrack
nf_defrag_ipv4 ipv6 evdev parport_pc parport serio_raw i2c_i801 i2c_core
iTCO_wdt rng_core pcspkr intel_agp loop aufs exportfs nls_utf8 nls_cp437
ide_generic sd_mod ide_gd_mod ata_generic pata_acpi ata_piix skge piix
ide_pci_generic ide_core sky2 thermal_sys

[  844.048782]

[  844.048782] Pid: 13285, comm: postfix Not tainted (2.6.30.4 #2)

[  844.048782] EIP: 0060:[<f8050d2d>] EFLAGS: 00010286 CPU: 0

[  844.048782] EIP is at sky2_mac_intr+0x30/0xc1 [sky2]

[  844.048782] EAX: f8068f88 EBX: 00000001 ECX: 00000008 EDX: 000000ff

[  844.048782] ESI: 00000000 EDI: f6901b80 EBP: e1c83e9c ESP: e1c83e84

[  844.048782]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068

[  844.048782] Process postfix (pid: 13285, ti=e1c82000 task=e1d105b0
task.ti=e1c82000)

[  844.048782] Stack:

[  844.048782]  00000080 ff901b80 eda21a93 f71ed840 ffffffff ffffffff
e1c83f28 f8054181

[  844.048782]  c022594e 00000000 00000040 f6901b88 00000003 eda21a93
f6901b80 ffffffff

[  844.048782]  c181d7a4 f71ef000 c0243594 00000000 c181d7a0 f702e130
eda21a93 e1c83eec

[  844.048782] Call Trace:

[  844.048782]  [<f8054181>] ? sky2_poll+0x1d2/0xb65 [sky2]

[  844.048782]  [<c022594e>] ? __wake_up+0x41/0x5c

[  844.048782]  [<c0243594>] ? insert_work+0xa5/0xbf

[  844.048782]  [<c04ee2a5>] ? _spin_unlock_irqrestore+0x31/0x44

[  844.048782]  [<c0243e4b>] ? __queue_work+0x36/0x4d

[  844.048782]  [<c047731c>] ? __qdisc_run+0x73/0x1ca

[  844.048782]  [<c0463ce6>] ? net_rx_action+0x9e/0x1a2

[  844.048782]  [<c0237b6e>] ? __do_softirq+0xb2/0x188

[  844.048782]  [<c0237c83>] ? do_softirq+0x3f/0x5c

[  844.048782]  [<c0237e0d>] ? irq_exit+0x37/0x80

[  844.048782]  [<c0213cfd>] ? smp_apic_timer_interrupt+0x7c/0x9b

[  844.048782]  [<c02037dd>] ? apic_timer_interrupt+0x31/0x38

[  844.048782] Code: c7 56 53 89 d3 83 ec 0c 65 a1 14 00 00 00 89 45 f0
31 c0 8b 74 97 3c c1 e2 07 89 d0 05 08 0f 00 00 89 55 e8 03 07 8a 10 88
55 ef <f6> 86 8d 03 00 00 02 74 12 0f b6 c2 50 56 68 d0 64 05 f8 e8 df

[  844.048782] EIP: [<f8050d2d>] sky2_mac_intr+0x30/0xc1 [sky2] SS:ESP
0068:e1c83e84

[  844.048782] CR2: 000000000000038d

[  844.345647] ---[ end trace d7398807329498ac ]---

[  844.351055] Kernel panic - not syncing: Fatal exception in interrupt

[  844.358606] Pid: 13285, comm: postfix Tainted: G      D    2.6.30.4
#2
[  844.366298] Call Trace:

[  844.369278]  [<c04eb041>] ? printk+0x1d/0x30

[  844.374388]  [<c04eaf7f>] panic+0x53/0xf8

[  844.379197]  [<c0206368>] oops_end+0x9f/0xbf

[  844.384303]  [<c021ceb4>] no_context+0x11a/0x135

[  844.389791]  [<c021d005>] __bad_area_nosemaphore+0x136/0x14f

[  844.396489]  [<c0374f60>] ? vsnprintf+0x91/0x332

[  844.401994]  [<c04ee2a5>] ? _spin_unlock_irqrestore+0x31/0x44

[  844.408787]  [<c04ee2a5>] ? _spin_unlock_irqrestore+0x31/0x44

[  844.415546]  [<c0232f4f>] ? release_console_sem+0x18b/0x1c9

[  844.422152]  [<c021d03b>] bad_area_nosemaphore+0x1d/0x34

[  844.428464]  [<c021d30b>] do_page_fault+0x110/0x21b

[  844.434271]  [<c021d1fb>] ? do_page_fault+0x0/0x21b

[  844.440026]  [<c04ee71a>] error_code+0x7a/0x80

[  844.445442]  [<c037007b>] ? add_uevent_var+0x17/0xb9

[  844.451413]  [<f8050d2d>] ? sky2_mac_intr+0x30/0xc1 [sky2]

[  844.457981]  [<f8054181>] sky2_poll+0x1d2/0xb65 [sky2]

[  844.464050]  [<c022594e>] ? __wake_up+0x41/0x5c

[  844.469437]  [<c0243594>] ? insert_work+0xa5/0xbf

[  844.475055]  [<c04ee2a5>] ? _spin_unlock_irqrestore+0x31/0x44

[  844.481817]  [<c0243e4b>] ? __queue_work+0x36/0x4d

[  844.487516]  [<c047731c>] ? __qdisc_run+0x73/0x1ca

[  844.493201]  [<c0463ce6>] net_rx_action+0x9e/0x1a2

[  844.498883]  [<c0237b6e>] __do_softirq+0xb2/0x188

[  844.504446]  [<c0237c83>] do_softirq+0x3f/0x5c

[  844.509720]  [<c0237e0d>] irq_exit+0x37/0x80

[  844.514791]  [<c0213cfd>] smp_apic_timer_interrupt+0x7c/0x9b

[  844.521488]  [<c02037dd>] apic_timer_interrupt+0x31/0x38

[  844.527811] Rebooting in 30 seconds..

This is with the newest version of sky2 as of today. Is this any
indication that traffic is needed to reproduce it? E.g. that a certain
number of interrupts must have already been handled to trigger the bug?

Again, any hints would be greatly appreciated (and sorry for being
persistent about this annoying little bug...).

best regards,
Rene
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkp4vV8ACgkQq7SPDcPCS95UvgCfTNzwXKGxXi1SUfrMyLglF5Hf
mCkAnRZqfuA5KYkKCz53leWgxHBOLWMo
=Shq7
-----END PGP SIGNATURE-----
stephen hemminger Aug. 4, 2009, 11:08 p.m. UTC | #8
On Wed, 05 Aug 2009 00:59:43 +0200
Rene Mayrhofer <rene@mayrhofer.eu.org> wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Rene Mayrhofer wrote:
> > Nonetheless, the current unmodified version from netdev git solves the
> > oops in sky2. 
> Actually, it doesn't. I managed to run networking restart twice without
> an oops (with the netdev git version of sky2.c), but after generating
> some minor traffic and trying to restart again, I still get this oops:
> 
> [~]# /etc/init.d/networking restart
> Reconfiguring network interfaces...[  844.000236] sky2 0000:01:00.0:
> error interrupt status=0xffffffff
> 
> [  844.007309] sky2 0000:01:00.0: PCI hardware error (0xffff)
> 
> [  844.013657] sky2 0000:01:00.0: PCI Express error (0xffffffff)

There is something about the hardware on your system that causes
the Marvell chip to not be present on the bus after the steps taken
in sky2_down.  Is there something unique about how it is wired to
the PCI express bus?

The sky2 driver has to handle the rare case of a dual-port board, so
sky2_down only shuts off part of the chip: the driver turns off the PHY
and stops the receiver/transmitter.  It could be that the power control
bits on your hardware turn off more than just the PHY. Or perhaps
most systems have a low-power input to keep the chip alive for Wake On
LAN, and that isn't present on your system.

Maybe an option to not power down phy would be the simplest fix.
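
The all-ones values in the log above (status=0xffffffff, 0xffff) are what a
PCI read commonly returns when no device answers, which would fit the theory
that the chip is no longer present on the bus. A minimal userspace sketch of
that check follows. The function names are hypothetical and this is not
driver code; it only mirrors the "status == ~0" test from the proposed patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* A read from a device that no longer responds on the bus usually
 * comes back as all ones, so no individual bit can be trusted.
 * (Hypothetical helper, not a sky2 function.) */
bool status_is_bogus(uint32_t status)
{
	return status == UINT32_MAX;
}

/* Returns true and stores the status if it may be decoded; returns
 * false and leaves *out untouched if the caller should skip interrupt
 * handling for this pass. */
bool usable_status(uint32_t raw, uint32_t *out)
{
	if (status_is_bogus(raw))
		return false;
	*out = raw;
	return true;
}
```

A caller would bail out of its poll routine early when usable_status()
returns false, instead of treating every set bit as a real error source.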
Mike McCormack Aug. 4, 2009, 11:53 p.m. UTC | #9
Rene Mayrhofer wrote:

> Again, any hints would be greatly appreciated (and sorry for being
> persistent about this annoying little bug...).

Hi Rene,

Thanks for being persistent in testing :-)  Looks like you've got a 
fairly unusual piece of hardware, as Stephen indicated.

Would you mind adding the phy_lock fix on top of the latest net-2.6
git version of sky2 and testing that?

thanks,

Mike
Rene Mayrhofer Aug. 5, 2009, 12:14 p.m. UTC | #10
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Mike McCormack wrote:
>> Again, any hints would be greatly appreciated (and sorry for being
>> persistent about this annoying little bug...).
>
> Thanks for being persistent in testing :-)  Looks like you've got a 
> fairly unusual piece of hardware, as Stephen indicated.
Indeed, although I didn't think it _that_ unusual. It's just a 19" rack
appliance with 2 expansion slots for 4x LAN ports each. And those are
based around sky2.

But we have had problems before with kernel 2.4.34/.36 as well with that
hardware. They just weren't as easily reproducible but manifested
themselves in occasional malfunctions of the network devices that could
be solved by an ifdown/ifup cycle.
We still have one spare box and will try that one in case the hardware
is really flaky (which would be strange, given how reproducible it is
right now).

> Would you mind adding the phy_lock fix on top of the latest net-2.6
> git version of sky2 and testing that?
Tried it, doesn't fix the issue.

What would be the simplest change to stop disabling phy when the last
device goes down?

best regards,
Rene
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkp5d6AACgkQq7SPDcPCS95OuACggTuTHsZd7m6IqHt0mrqUZbju
G4wAoPfPGr5G05E6HdO9kcKflGaSx7f5
=78yk
-----END PGP SIGNATURE-----
Mike McCormack Aug. 5, 2009, 10:50 p.m. UTC | #11
Rene Mayrhofer wrote:

> What would be the simplest change to stop disabling phy when the last
> device goes down?

Commenting out the following line should stop all the phys from powering off:

sky2_phy_power_down(hw, port);

If you have a chance, please test "sky2: Add a mutex around ethtools operations" also.
It probably won't fix the problem you're seeing, but you never know...

thanks,

Mike
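
Rather than commenting the call out, the "option to not power down phy" idea
mentioned earlier in the thread could look roughly like the toy model below.
Everything here (struct toy_hw, toy_port_down, the keep_phy_powered flag) is
hypothetical and lives in userspace; it is a sketch of the control flow under
those assumptions, not a patch against sky2.c.

```c
#include <stdbool.h>

#define TOY_MAX_PORTS 2

/* Toy model of the adapter; field names are illustrative only. */
struct toy_hw {
	bool phy_powered[TOY_MAX_PORTS];
	bool rx_tx_running[TOY_MAX_PORTS];
};

/* Models the "down" path: always stop the port's receiver and
 * transmitter, but power the PHY off only when allowed to. */
void toy_port_down(struct toy_hw *hw, unsigned int port, bool keep_phy_powered)
{
	if (port >= TOY_MAX_PORTS)
		return;
	hw->rx_tx_running[port] = false;
	if (!keep_phy_powered)
		hw->phy_powered[port] = false;
}
```

With keep_phy_powered set, taking a port down leaves its PHY state alone,
which is the behaviour the commented-out line is meant to test.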

Rene Mayrhofer Aug. 10, 2009, 10:28 a.m. UTC | #12
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Mike McCormack wrote:
> Rene Mayrhofer wrote:
> 
>> What would be the simplest change to stop disabling phy when the last
>> device goes down?
> 
> Commenting out the following line should stop all the phys from powering off:
> 
> sky2_phy_power_down(hw, port);
> 
> If you have a chance, please test "sky2: Add a mutex around ethtools operations" also.
> it probably won't fix the problem you're seeing, but you never know...

It seems that hardware is faulty, although in a very "interesting" way.
We tried changing the "slot" modules with 4 NICs each, which did not
change matters. However, another similar hardware appliance works. I am
thus not sure which component is at fault here, as (parts of) the NICs
were changed. Maybe the interrupt controller is weird on the "faulty"
box? ACPI issues? If anybody wants to track this any further, I am still
willing to test patches.

best regards,
Rene

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkp/9moACgkQq7SPDcPCS979XACfRD6e5ixtX3oPiQCpC78nowO4
TH4Anivuo53VZsRO9LAIDIg7zYurW8UI
=MwmU
-----END PGP SIGNATURE-----
Rene Mayrhofer Aug. 11, 2009, 8:54 a.m. UTC | #13
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Rene Mayrhofer wrote:
> Mike McCormack wrote:
>> Rene Mayrhofer wrote:
> 
>>> What would be the simplest change to stop disabling phy when the last
>>> device goes down?
>> Commenting out the following line should stop all the phys from powering off:
> 
>> sky2_phy_power_down(hw, port);
> 
>> If you have a chance, please test "sky2: Add a mutex around ethtools operations" also.
>> it probably won't fix the problem you're seeing, but you never know...
> 
> It seems that hardware is faulty, although in a very "interesting" way.
> We tried changing the "slot" modules with 4 NICs each, which did not
> change matters. However, another similar hardware appliance works. 

Actually, it doesn't. After producing a bit of traffic, we see the same
issue on the other hardware as well. It is therefore unlikely to be a
real hardware fault in the sense that one specific appliance is broken.

Even after disabling the sky2_phy_power_down call in sky2_down, I get
the oops on restarting the interfaces:



[~]# /etc/init.d/networking restart
Reconfiguring network interfaces...Removed VLAN -:quara.6:-
RTNETLINK answers: Cannot assign requested address
run-parts: /etc/network/if-up.d/40address exited with return code 2
SIOCSIFFLAGS: Cannot assign requested address
Failed to bring up dmz.
Set name-type for VLAN subsystem. Should be visible in /proc/net/vlan/config
Added VLAN with VID == 6 to IF -:testnet:-
Starting radvd: radvd.
done.
[~]#
[~]#
[~]#
[~]# /etc/init.d/networking restart
Reconfiguring network interfaces...[  707.000123] sky2 0000:01:00.0:
error interrupt status=0xffffffff

[  707.006858] sky2 0000:01:00.0: PCI hardware error (0xffff)

[  707.012977] sky2 0000:01:00.0: PCI Express error (0xffffffff)

[  707.019381] sky2 wan: ram data read parity error

[  707.024531] sky2 wan: ram data write parity error

[  707.029775] sky2 wan: MAC parity error

[  707.033969] sky2 wan: RX parity error

[  707.038060] sky2 wan: TCP segmentation error

[  707.042904] BUG: unable to handle kernel NULL pointer dereference at
0000038d
[  707.046812] IP: [<f8068d2d>] sky2_mac_intr+0x30/0xc1 [sky2]

[  707.046812] *pde = 00000000

[  707.046812] Oops: 0000 [#1] PREEMPT SMP

[  707.046812] last sysfs file:
/sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed

[  707.046812] Modules linked in: xt_multiport cpufreq_userspace
ip6t_REJECT xt_DSCP xt_length xt_mark xt_dscp xt_MARK xt_IMQ xt_CONNMARK
xt_comment xt_policy ip6t_LOG xt_tcpudp ip6table_mangle iptable_mangle
ip6table_filter ip6_tables sit tunnel4 8021q garp stp llc ipt_LOG
xt_limit xt_state iptable_nat iptable_filter ip_tables x_tables dm_mod
p4_clockmod speedstep_lib freq_table tun imq nf_nat_ftp nf_nat
nf_conntrack_ftp nf_conntrack_ipv6 nf_conntrack_ipv4 nf_conntrack
nf_defrag_ipv4 ipv6 evdev parport_pc parport i2c_i801 button i2c_core
iTCO_wdt processor serio_raw rng_core intel_agp pcspkr loop aufs
exportfs nls_utf8 nls_cp437 ide_generic sd_mod ata_generic pata_acpi
ata_piix ide_pci_generic skge ide_core sky2 thermal fan thermal_sys

[  707.145223]

[  707.145223] Pid: 11650, comm: 60address Not tainted (2.6.30.4 #3)

[  707.145223] EIP: 0060:[<f8068d2d>] EFLAGS: 00010286 CPU: 0

[  707.145223] EIP is at sky2_mac_intr+0x30/0xc1 [sky2]

[  707.145223] EAX: f8080f88 EBX: 00000001 ECX: 00000008 EDX: 000000ff

[  707.169707] ESI: 00000000 EDI: f68c8e80 EBP: e1983c08 ESP: e1983bf0

[  707.169707]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068

[  707.169707] Process 60address (pid: 11650, ti=e1982000 task=dc0ce030
task.ti=e1982000)

[  707.195323] Stack:

[  707.195323]  00000080 ff8c8e80 6f11c339 f71cef60 ffffffff ffffffff
e1983c94 f806c064

[  707.195323]  c04ee377 6f11c339 00000040 f68c8e88 f70c4bcc 00000000
f68c8e80 ffffffff

[  707.212226]  e1983ca4 f71d5800 c0243594 00000000 c06b7134 f707c230
00000001 00000000

[  707.212226] Call Trace:

[  707.212226]  [<f806c064>] ? sky2_poll+0x1d2/0xb66 [sky2]

[  707.232409]  [<c04ee377>] ? _spin_unlock+0x29/0x3c

[  707.232409]  [<c0243594>] ? insert_work+0xa5/0xbf

[  707.232409]  [<c047732c>] ? __qdisc_run+0x73/0x1ca

[  707.245403]  [<c0463cf6>] ? net_rx_action+0x9e/0x1a2

[  707.245403]  [<c0237b6e>] ? __do_softirq+0xb2/0x188

[  707.245403]  [<c0237c83>] ? do_softirq+0x3f/0x5c

[  707.245403]  [<c0237e0d>] ? irq_exit+0x37/0x80

[  707.245403]  [<c0213cfd>] ? smp_apic_timer_interrupt+0x7c/0x9b

[  707.245403]  [<c02037dd>] ? apic_timer_interrupt+0x31/0x38

[  707.245403]  [<c029804c>] ? unmap_vmas+0x1df/0x655

[  707.245403]  [<c028d170>] ? ____pagevec_lru_add+0x10b/0x12a

[  707.245403]  [<c029c293>] ? exit_mmap+0xb8/0x158

[  707.295480]  [<c02305e1>] ? mmput+0x2f/0xa5

[  707.295480]  [<c02b43b1>] ? flush_old_exec+0x3a0/0x630

[  707.295480]  [<c02b46da>] ? kernel_read+0x40/0x63

[  707.295480]  [<c02e25e9>] ? load_elf_binary+0x355/0x11e4

[  707.295480]  [<c0299591>] ? __get_user_pages+0x28f/0x310

[  707.295480]  [<c029964a>] ? get_user_pages+0x38/0x50

[  707.295480]  [<c02b3825>] ? get_arg_page+0x38/0x9c

[  707.295480]  [<c02b3b80>] ? search_binary_handler+0xed/0x273

[  707.295480]  [<c02e2294>] ? load_elf_binary+0x0/0x11e4

[  707.345549]  [<c02b4ed8>] ? do_execve+0x24d/0x35c

[  707.345549]  [<c02016f0>] ? sys_execve+0x34/0x6d

[  707.345549]  [<c0202df3>] ? sysenter_do_call+0x12/0x28

[  707.345549] Code: c7 56 53 89 d3 83 ec 0c 65 a1 14 00 00 00 89 45 f0
31 c0 8b 74 97 3c c1 e2 07 89 d0 05 08 0f 00 00 89 55 e8 03 07 8a 10 88
55 ef <f6> 86 8d 03 00 00 02 74 12 0f b6 c2 50 56 68 b4 e3 06 f8 e8 f3

[  707.345549] EIP: [<f8068d2d>] sky2_mac_intr+0x30/0xc1 [sky2] SS:ESP
0068:e1983bf0

[  707.395629] CR2: 000000000000038d

[  707.401711] ---[ end trace 78f2d616187daf45 ]---

[  707.406932] Kernel panic - not syncing: Fatal exception in interrupt


[  707.414147] Pid: 11650, comm: 60address Tainted: G      D    2.6.30.4 #3
[  707.423018] Call Trace:
[  707.427230]  [<c04eb055>] ? printk+0x1d/0x30
[  707.433435]  [<c04eaf93>] panic+0x53/0xf8
[  707.439358]  [<c0206368>] oops_end+0x9f/0xbf
[  707.445562]  [<c021ceb4>] no_context+0x11a/0x135
[  707.452146]  [<c021d005>] __bad_area_nosemaphore+0x136/0x14f
[  707.459910]  [<c0374f70>] ? vsnprintf+0x91/0x332
[  707.466510]  [<c04ee2bd>] ? _spin_unlock_irqrestore+0x31/0x44
[  707.474345]  [<c04ee2bd>] ? _spin_unlock_irqrestore+0x31/0x44
[  707.482190]  [<c0232f4f>] ? release_console_sem+0x18b/0x1c9
[  707.489813]  [<c021d03b>] bad_area_nosemaphore+0x1d/0x34
[  707.497163]  [<c021d30b>] do_page_fault+0x110/0x21b
[  707.504052]  [<c021d1fb>] ? do_page_fault+0x0/0x21b
[  707.510906]  [<c04ee732>] error_code+0x7a/0x80
[  707.517321]  [<c037007b>] ? add_uevent_var+0x7/0xb9
[  707.524189]  [<f8068d2d>] ? sky2_mac_intr+0x30/0xc1 [sky2]
[  707.531735]  [<f806c064>] sky2_poll+0x1d2/0xb66 [sky2]
[  707.538873]  [<c04ee377>] ? _spin_unlock+0x29/0x3c
[  707.545648]  [<c0243594>] ? insert_work+0xa5/0xbf
[  707.552333]  [<c047732c>] ? __qdisc_run+0x73/0x1ca
[  707.559115]  [<c0463cf6>] net_rx_action+0x9e/0x1a2
[  707.565893]  [<c0237b6e>] __do_softirq+0xb2/0x188
[  707.572571]  [<c0237c83>] do_softirq+0x3f/0x5c
[  707.578968]  [<c0237e0d>] irq_exit+0x37/0x80
[  707.585194]  [<c0213cfd>] smp_apic_timer_interrupt+0x7c/0x9b
[  707.592938]  [<c02037dd>] apic_timer_interrupt+0x31/0x38
[  707.600296]  [<c029804c>] ? unmap_vmas+0x1df/0x655
[  707.607074]  [<c028d170>] ? ____pagevec_lru_add+0x10b/0x12a
[  707.614707]  [<c029c293>] exit_mmap+0xb8/0x158
[  707.621097]  [<c02305e1>] mmput+0x2f/0xa5
[  707.627024]  [<c02b43b1>] flush_old_exec+0x3a0/0x630
[  707.633988]  [<c02b46da>] ? kernel_read+0x40/0x63
[  707.640669]  [<c02e25e9>] load_elf_binary+0x355/0x11e4
[  707.647821]  [<c0299591>] ? __get_user_pages+0x28f/0x310
[  707.655179]  [<c029964a>] ? get_user_pages+0x38/0x50
[  707.662148]  [<c02b3825>] ? get_arg_page+0x38/0x9c
[  707.668929]  [<c02b3b80>] search_binary_handler+0xed/0x273
[  707.676471]  [<c02e2294>] ? load_elf_binary+0x0/0x11e4
[  707.683677]  [<c02b4ed8>] do_execve+0x24d/0x35c
[  707.690143]  [<c02016f0>] sys_execve+0x34/0x6d
[  707.696519]  [<c0202df3>] sysenter_do_call+0x12/0x28
[  707.703480] Rebooting in 30 seconds..


Thus, there really seems to be an unhandled case in sky2.c. When
sky2_phy_power_down is not called, the chip should not go down, right?
Yet sky2_poll still seems to be called (maybe by an interrupt belonging
to another network interface on the same chip)?

Any other hints?
Rene
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkqBMdoACgkQq7SPDcPCS94SugCguCfe45JB+nNi+jE28JynRWtX
2M4Ani/SHmCaslHWy9gf0UT2Egp6Ql1+
=K4Qh
-----END PGP SIGNATURE-----
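
One possible reading of the oops above, offered as an assumption rather than
a confirmed diagnosis: at the fault ESI is 0 and EBX is 1, and the faulting
address 0x38d is a small offset from NULL. That would be consistent with the
per-port handler being entered for a second port that has no net device
behind it, because an all-ones status word has every per-port interrupt bit
set. The userspace sketch below models such a dispatch with a guard. The bit
values and all names (TOY_IRQ_MAC1, toy_hw, toy_mac_intr, ...) are
hypothetical and do not come from sky2.c.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_PORTS 2
/* Hypothetical per-port interrupt bits; not the real sky2 values. */
#define TOY_IRQ_MAC1 (1u << 0)
#define TOY_IRQ_MAC2 (1u << 1)

struct toy_dev {
	unsigned int mac_events;
};

struct toy_hw {
	struct toy_dev *dev[TOY_PORTS]; /* NULL where no port exists */
};

/* Handle a MAC interrupt for one port; returns 0 on success and -1
 * when the port has no device behind it, instead of dereferencing NULL. */
int toy_mac_intr(struct toy_hw *hw, unsigned int port)
{
	if (port >= TOY_PORTS || hw->dev[port] == NULL)
		return -1;
	hw->dev[port]->mac_events++;
	return 0;
}

/* Dispatch per-port handlers from a status word; returns the number of
 * interrupt bits that pointed at a port without a device. */
int toy_dispatch(struct toy_hw *hw, uint32_t status)
{
	int skipped = 0;

	if ((status & TOY_IRQ_MAC1) && toy_mac_intr(hw, 0) != 0)
		skipped++;
	if ((status & TOY_IRQ_MAC2) && toy_mac_intr(hw, 1) != 0)
		skipped++;
	return skipped;
}
```

On a single-port toy_hw, an all-ones status then yields one skipped bit
rather than a NULL dereference in the second port's handler.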
Rene Mayrhofer Aug. 19, 2009, 7:01 a.m. UTC | #14
Hi everybody,

On Tuesday 11 August 2009 10:54:53 am Rene Mayrhofer wrote:
> Thus, there really seems to be an uncaught case in sky2.c. When
> sky2_phy_power_down is not called, chip should not go down, right? But
> still sky2_poll seems to be called (maybe by an interrupt belonging to
> another network interface but the same chip)?

Is there anything else I could try? We still have this issue, making one range 
of hardware appliances unusable with 2.6 kernels...

best regards,
Rene

Patch

--- a/drivers/net/sky2.c	2009-07-27 15:28:27.653757064 -0700
+++ b/drivers/net/sky2.c	2009-07-27 15:34:24.358730966 -0700
@@ -2763,6 +2763,11 @@  static int sky2_poll(struct napi_struct 
 	int work_done = 0;
 	u16 idx;
 
+	if (unlikely(status == ~0)) {
+		dev_info(&hw->pdev->dev, "device status error\n");
+		goto clear_napi;
+	}
+
 	if (unlikely(status & Y2_IS_ERROR))
 		sky2_err_intr(hw, status);
 
@@ -2779,6 +2784,7 @@  static int sky2_poll(struct napi_struct 
 			goto done;
 	}
 
+clear_napi:
 	napi_complete(napi);
 	sky2_read32(hw, B0_Y2_SP_LISR);
 done: