diff mbox

[BUG] latest net-next-2.6 doesnt fly

Message ID 1270202304.1989.14.camel@edumazet-laptop
State Accepted, archived
Delegated to: David Miller
Headers show

Commit Message

Eric Dumazet April 2, 2010, 9:58 a.m. UTC
Le vendredi 02 avril 2010 à 11:40 +0200, Eric Dumazet a écrit :

> 
> [  206.020316] BUG: unable to handle kernel NULL pointer dereference at 000000b4
> [  206.020451] IP: [<c12d76b4>] illegal_highdma+0x44/0x170
> [  206.020543] *pde = 00000000 
> [  206.020627] Oops: 0000 [#2] PREEMPT SMP DEBUG_PAGEALLOC
> [  206.020808] last sysfs file: /sys/devices/system/cpu/cpu3/cpufreq/stats/time_in_state
> [  206.020882] Modules linked in: xt_hashlimit ipmi_si ipmi_msghandler hpilo bonding
> [  206.021148] 
> [  206.021198] Pid: 4632, comm: emonitor Tainted: G      D W  2.6.34-rc1-01558-gba0ad27-dirty #599 /ProLiant BL460c G1
> [  206.021276] EIP: 0060:[<c12d76b4>] EFLAGS: 00010202 CPU: 4
> [  206.021332] EIP is at illegal_highdma+0x44/0x170
> [  206.021386] EAX: c23a7e80 EBX: 00000000 ECX: f1f75cb0 EDX: f292af30
> [  206.021443] ESI: 00000001 EDI: 00000001 EBP: ee83ab68 ESP: ee83ab58
> [  206.021500]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
> [  206.021556] Process emonitor (pid: 4632, ti=ee83a000 task=ee9726e0 task.ti=ee83a000)
> [  206.021629] Stack:
> [  206.021678]  00000000 f292af30 00010000 f2bdc800 ee83aba8 c12dcfb9 c1046b27 f2976f24
> [  206.021958] <0> ee83ab88 c1073810 c12e4275 f2976f00 ee83ab90 c107398b ee83ab9c c1046b27
> [  206.022316] <0> f2976f24 f292af30 f2976f00 f2976f24 ee83abec c12e428b f2976f48 00000000
> [  206.022717] Call Trace:
> [  206.022770]  [<c12dcfb9>] ? dev_queue_xmit+0x229/0x550
> [  206.022828]  [<c1046b27>] ? local_bh_enable_ip+0x67/0xd0
> [  206.022885]  [<c1073810>] ? trace_hardirqs_on_caller+0x20/0x190
> [  206.022943]  [<c12e4275>] ? neigh_resolve_output+0xd5/0x350
> [  206.023000]  [<c107398b>] ? trace_hardirqs_on+0xb/0x10
> [  206.023055]  [<c1046b27>] ? local_bh_enable_ip+0x67/0xd0
> [  206.023111]  [<c12e428b>] ? neigh_resolve_output+0xeb/0x350
> [  206.023169]  [<c12f0008>] ? qdisc_create+0x1f8/0x340
> [  206.023225]  [<c12ed8f0>] ? eth_header+0x0/0xb0
> [  206.023282]  [<c130dc64>] ? ip_finish_output2+0xc4/0x280
> [  206.023339]  [<c12fe4b8>] ? nf_hook_slow+0x108/0x140
> [  206.023394]  [<c130de20>] ? ip_finish_output+0x0/0x70
> [  206.023450]  [<c130de6c>] ? ip_finish_output+0x4c/0x70
> [  206.023506]  [<c130df42>] ? ip_output+0xb2/0xd0
> [  206.023560]  [<c130de20>] ? ip_finish_output+0x0/0x70
> [  206.023616]  [<c130d31d>] ? ip_local_out+0x1d/0x30
> [  206.023671]  [<c130d7cd>] ? ip_queue_xmit+0x13d/0x380
> [  206.023728]  [<c10b5434>] ? get_page_from_freelist+0x254/0x510
> [  206.023785]  [<c12d0517>] ? __skb_clone+0x27/0xe0
> [  206.023841]  [<c132120d>] ? tcp_transmit_skb+0x35d/0x7a0
> [  206.023898]  [<c13231e1>] ? tcp_write_xmit+0x1e1/0x980
> [  206.023955]  [<c10c6de2>] ? might_fault+0x62/0xb0
> [  206.024010]  [<c13239b5>] ? tcp_push_one+0x35/0x40
> [  206.024066]  [<c1317cc8>] ? tcp_sendmsg+0x898/0x910
> [  206.024123]  [<c12ca08b>] ? sock_aio_write+0xfb/0x110
> [  206.024180]  [<c10e370d>] ? do_sync_readv_writev+0x9d/0xe0
> [  206.024237]  [<c10e35b0>] ? rw_copy_check_uvector+0x80/0xf0
> [  206.024257]  [<c10e4431>] ? do_readv_writev+0xa1/0x1b0
> [  206.024257]  [<c12c9f90>] ? sock_aio_write+0x0/0x110
> [  206.024257]  [<c10e4950>] ? rcu_read_unlock+0x0/0x50
> [  206.024257]  [<c10e4976>] ? rcu_read_unlock+0x26/0x50
> [  206.024257]  [<c10e4a6b>] ? fget_light+0xcb/0xe0
> [  206.024257]  [<c10e4585>] ? vfs_writev+0x45/0x60
> [  206.024257]  [<c10e4676>] ? sys_writev+0x46/0x70
> [  206.024257]  [<c1002e50>] ? sysenter_do_call+0x12/0x36
> [  206.024257] Code: 0d 80 34 53 c1 8b 49 3c 85 c9 0f 84 37 01 00 00 8b 8a a0 00 00 00 8b 98 34 03 00 00 0f b7 71 04 85 f6 0f 84 1f 01 00 00 8b 41 2c <8b> 9b b4 00 00 00 8b 10 c1 ea 1a 85 db 8b 14 d5 c0 04 cb c1 74 
> [  206.024257] EIP: [<c12d76b4>] illegal_highdma+0x44/0x170 SS:ESP 0068:ee83ab58
> [  206.024257] CR2: 00000000000000b4
> [  206.027098] ---[ end trace 2b194fa03b7756a0 ]---

Here is the patch I did to solve this problem

[PATCH net-next-2.6] net: illegal_highdma() fix

Followup to commit 5acbbd428db47b12f137a8a2aa96b3c0a96b744e
(net: change illegal_highdma to use dma_mask)

If dev->dev.parent is NULL, we should not try to dereference it.

Dont force inline illegal_highdma() as its pretty big now.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 net/core/dev.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

David Miller April 2, 2010, 8:35 p.m. UTC | #1
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 02 Apr 2010 11:58:24 +0200

> [PATCH net-next-2.6] net: illegal_highdma() fix
> 
> Followup to commit 5acbbd428db47b12f137a8a2aa96b3c0a96b744e
> (net: change illegal_highdma to use dma_mask)
> 
> If dev->dev.parent is NULL, we should not try to dereference it.
> 
> Dont force inline illegal_highdma() as its pretty big now.
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Applied, thanks for tracking this down.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
FUJITA Tomonori April 4, 2010, 9:16 a.m. UTC | #2
On Fri, 02 Apr 2010 11:58:24 +0200
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> diff --git a/net/core/dev.c b/net/core/dev.c
> index e19cdae..c6b5206 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -1801,7 +1801,7 @@ EXPORT_SYMBOL(netdev_rx_csum_fault);
>   * 2. No high memory really exists on this machine.
>   */
>  
> -static inline int illegal_highdma(struct net_device *dev, struct sk_buff *skb)
> +static int illegal_highdma(struct net_device *dev, struct sk_buff *skb)
>  {
>  #ifdef CONFIG_HIGHMEM
>  	int i;
> @@ -1814,6 +1814,8 @@ static inline int illegal_highdma(struct net_device *dev, struct sk_buff *skb)
>  	if (PCI_DMA_BUS_IS_PHYS) {
>  		struct device *pdev = dev->dev.parent;
>  
> +		if (!pdev)
> +			return 0;

Sorry about that and thanks for the fix.

I think, if pdev is null, returning 1 here is safer since the device
doesn't set up dma info properly.

Do you know what device hits this bug? You said that you use bnx2 and
tg3. Both call SET_NETDEV_DEV with pdev->dev. I tested bnx2 and seems
that netdev->dev.parent is set up correctly.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet April 4, 2010, 9:29 a.m. UTC | #3
Le dimanche 04 avril 2010 à 18:16 +0900, FUJITA Tomonori a écrit :
> > +			return 0;
> 
> Sorry about that and thanks for the fix.
> 
> I think, if pdev is null, returning 1 here is safer since the device
> doesn't set up dma info properly.
> 
> Do you know what device hits this bug? You said that you use bnx2 and
> tg3. Both call SET_NETDEV_DEV with pdev->dev. I tested bnx2 and seems
> that netdev->dev.parent is set up correctly.
> --

Might be because of my setup, I suspect I had two reasons to hit the
bug :

A bonding of eth2 (bnx2) and eth3 (tg3)

Then vlans on top of this bond0

When first dev_queue_xmit() was called, it was for a virtual device :)

# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP
qlen 1000
    link/ether 00:1e:0b:ec:d3:dc brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq
master bond0 state UP qlen 1000
    link/ether 00:1e:0b:ec:d3:d2 brd ff:ff:ff:ff:ff:ff
4: eth2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc hfsc
master bond0 state UP qlen 1000
    link/ether 00:1e:0b:ec:d3:d2 brd ff:ff:ff:ff:ff:ff
5: eth3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether 00:1e:0b:92:78:51 brd ff:ff:ff:ff:ff:ff
6: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc
noqueue state UP 
    link/ether 00:1e:0b:ec:d3:d2 brd ff:ff:ff:ff:ff:ff
7: vlan.103@bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500
qdisc pfifo_fast state UP qlen 100
    link/ether 00:1e:0b:ec:d3:d2 brd ff:ff:ff:ff:ff:ff
8: vlan.825@bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500
qdisc pfifo_fast state UP qlen 1000
    link/ether 00:1e:0b:ec:d3:d2 brd ff:ff:ff:ff:ff:ff


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
FUJITA Tomonori April 4, 2010, 10:19 a.m. UTC | #4
On Sun, 04 Apr 2010 11:29:55 +0200
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> > I think, if pdev is null, returning 1 here is safer since the device
> > doesn't set up dma info properly.
> > 
> > Do you know what device hits this bug? You said that you use bnx2 and
> > tg3. Both call SET_NETDEV_DEV with pdev->dev. I tested bnx2 and seems
> > that netdev->dev.parent is set up correctly.
> > --
> 
> Might be because of my setup, I suspect I had two reasons to hit the
> bug :
> 
> A bonding of eth2 (bnx2) and eth3 (tg3)
> 
> Then vlans on top of this bond0
> 
> When first dev_queue_xmit() was called, it was for a virtual device :)

Thanks! So it's due to bond or vlan (or both).

I guess that returning zero here with a null pdev is fine. If we
return 1, probably some people would complain about performance
regression. Like the block layer does, coping the dma restriction info
from the lower devices can solve this problem but I guess that it's
over engineering. My original patch doesn't loosen the DMA restriction
checking so returning zero shouldn't break anything. If when we fix
the usage of NETIF_F_HIGHDMA in each driver and also check the usage
of netdev->dev.parent, everything should be fine.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/core/dev.c b/net/core/dev.c
index e19cdae..c6b5206 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1801,7 +1801,7 @@  EXPORT_SYMBOL(netdev_rx_csum_fault);
  * 2. No high memory really exists on this machine.
  */
 
-static inline int illegal_highdma(struct net_device *dev, struct sk_buff *skb)
+static int illegal_highdma(struct net_device *dev, struct sk_buff *skb)
 {
 #ifdef CONFIG_HIGHMEM
 	int i;
@@ -1814,6 +1814,8 @@  static inline int illegal_highdma(struct net_device *dev, struct sk_buff *skb)
 	if (PCI_DMA_BUS_IS_PHYS) {
 		struct device *pdev = dev->dev.parent;
 
+		if (!pdev)
+			return 0;
 		for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
 			dma_addr_t addr = page_to_phys(skb_shinfo(skb)->frags[i].page);
 			if (!pdev->dma_mask || addr + PAGE_SIZE - 1 > *pdev->dma_mask)