NULL pointer dereference when loading the gre module (3.10.0-rc4)

Message ID 1370618100.9844.73.camel@gandalf.local.home
State RFC, archived
Delegated to: David Miller

Commit Message

Steven Rostedt June 7, 2013, 3:15 p.m. UTC
On Fri, 2013-06-07 at 06:40 -0700, Eric Dumazet wrote:
> On Fri, 2013-06-07 at 10:54 +0200, Steinar H. Gunderson wrote:
> > On Thu, Jun 06, 2013 at 11:06:48PM -0400, Steven Rostedt wrote:
> > > Note the faulting address is 0xffffffffa0e52001, which is around the
> > > above address, be interesting to know what was at that location.
> > 
> > Doh, I looked at the wrong place in kallsyms:
> > 
> > ffffffffa0e52000 u ip_tunnel_init_net   [ip_gre]
> > ffffffffa0e55000 t gre_err      [gre]
> > ffffffffa0e5503d t gre_gso_send_check   [gre]
> > ffffffffa0e55053 t gre_rcv      [gre]
> > 
> > So it's really ip_tunnel_init_net+1.
> > 
> > /* Steinar */
> 
> " u " for ip_tunnel_init_net ?
> 
> Looks like someone forgot to take refcounts on a module ...
> 
> CC Pravin B Shelar, as this probably comes from commit
> c54419321455631079c7d6e60bc732dd0c5914c5
> ("GRE: Refactor GRE tunneling code.")

int __net_init ip_tunnel_init_net(struct net *net, int ip_tnl_net_id,
				  struct rtnl_link_ops *ops, char *devname)
{

[...]

}
EXPORT_SYMBOL_GPL(ip_tunnel_init_net);

Really, you exported a symbol that can go away if CONFIG_NET_NS is not
set?

----
net: Remove __net_init/exit from exported functions

If CONFIG_NET_NS is not set then __net_init is the same as __init and
__net_exit is the same as __exit. These functions will be removed from
memory after the module loads or is removed. Functions that are exported
for use by other functions should never be labeled for removal.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
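
To illustrate the macro behaviour the commit message describes, here is a minimal userspace sketch. DEMO_INIT, DEMO_NET_INIT and net_init_is_discardable() are hypothetical stand-ins, not the kernel's real definitions, and ".init.text" is used purely as a label for "discardable after init".

```c
#include <string.h>

/* Leave undefined to mimic a .config without CONFIG_NET_NS. */
/* #define CONFIG_NET_NS 1 */

/* Hypothetical stand-in for an "init-only" marker; not the kernel macro. */
#define DEMO_INIT __attribute__((section(".init.text")))

#ifdef CONFIG_NET_NS
#define DEMO_NET_INIT            /* expands to nothing: code stays resident */
#else
#define DEMO_NET_INIT DEMO_INIT  /* same as the init marker: code may be freed */
#endif

/* Two-level stringification so the macro is expanded before being quoted. */
#define STR_(x) #x
#define STR(x) STR_(x)

/* Returns 1 if DEMO_NET_INIT expands to the init-section marker, else 0. */
int net_init_is_discardable(void)
{
    return strstr(STR(DEMO_NET_INIT), ".init.text") != NULL;
}
```

With CONFIG_NET_NS undefined, net_init_is_discardable() returns 1, which is the situation where an exported function carrying the marker could disappear underneath its callers.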



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Steinar H. Gunderson June 7, 2013, 3:46 p.m. UTC | #1
On Fri, Jun 07, 2013 at 11:15:00AM -0400, Steven Rostedt wrote:
> net: Remove __net_init/exit from exported functions
> 
> If CONFIG_NET_NS is not set then __net_init is the same as __init and
> __net_exit is the same as __exit. These functions will be removed from
> memory after the module loads or is removed. Functions that are exported
> for use by other functions should never be labeled for removal.

That didn't help much, I'm afraid:

[   18.005451] BUG: unable to handle kernel NULL pointer dereference at 0000000000000003
[   18.013853] IP: [<ffffffffa0e76002>] 0xffffffffa0e76001
[   18.019380] PGD 0 
[   18.021695] Oops: 0000 [#1] SMP 
[   18.025285] Modules linked in: ip_gre(+) gre ip_tunnel psmouse ide_generic ide_gd_mod ide_cd_mod cdrom acpi_cpufreq mperf coretemp kvm_intel kvm iTCO_wdt iTCO_vendor_support lpc_ich microcode mfd_core i2c_i801 pcspkr i2c_core ehci_pci evbug evdev ext4 crc16 jbd2 mbcache dm_mod raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq raid1 md_mod sg sd_mod usbhid ide_pci_generic ide_core crc32c_intel e1000e ata_piix ptp pps_core uhci_hcd ehci_hcd mpt2sas raid_class unix
[   18.073543] CPU: 0 PID: 3263 Comm: modprobe Not tainted 3.10.0-rc4 #2
[   18.080237] Hardware name: Supermicro X8DTL/X8DTL, BIOS 2.1a       12/30/2011
[   18.087634] task: ffff88061ecfad60 ti: ffff8806212f0000 task.ti: ffff8806212f0000
[   18.095571] RIP: 0010:[<ffffffffa0e76002>]  [<ffffffffa0e76002>] 0xffffffffa0e76001
[   18.103745] RSP: 0018:ffff8806212f1ca8  EFLAGS: 00010246
[   18.109301] RAX: ffffffffa0e81000 RBX: ffff880623ebe280 RCX: 0000000000000000
[   18.116682] RDX: ffffffffa0e7ea40 RSI: 0000000000000003 RDI: ffffffffa0e81018
[   18.124063] RBP: ffff8806212f1ca8 R08: 0000000000000cf8 R09: ffffffff812bae96
[   18.131441] R10: ffffea0018852c00 R11: 0000000000000000 R12: ffff880621678290
[   18.138829] R13: ffffffffa0e7e9c0 R14: ffff8806212f1ef8 R15: 0000000000000002
[   18.146210] FS:  00007f2e37fd1700(0000) GS:ffff880627200000(0000) knlGS:0000000000000000
[   18.154747] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[   18.160742] CR2: 0000000000000003 CR3: 0000000622a5e000 CR4: 00000000000007f0
[   18.168131] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   18.175510] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[   18.182890] Stack:
[   18.185143]  ffff8806212f1cf8 ffffffff812baf26 2222222222222222 2222222222222222
[   18.193235]  2222222222222222 ffffffffa0e7e9c0 0000000000000000 0000000000000000
[   18.201313]  ffff8806212f1ef8 ffffffffa0e7eb60 ffff8806212f1d28 ffffffff812bafb6
[   18.209389] Call Trace:
[   18.212084]  [<ffffffff812baf26>] ops_init.constprop.7+0xc6/0xf5
[   18.218339]  [<ffffffff812bafb6>] register_pernet_operations.isra.4+0x61/0x91
[   18.225720]  [<ffffffff8138486f>] ? mutex_lock+0xf/0x20
[   18.231189]  [<ffffffff812bb006>] register_pernet_device+0x20/0x51
[   18.237621]  [<ffffffffa0e81034>] ? ipgre_tap_init_net+0x1a/0x1a [ip_gre]
[   18.244661]  [<ffffffffa0e81055>] ipgre_init+0x21/0xc9 [ip_gre]
[   18.250831]  [<ffffffffa0e81034>] ? ipgre_tap_init_net+0x1a/0x1a [ip_gre]
[   18.257866]  [<ffffffff81000263>] do_one_initcall+0x7b/0x10c
[   18.263780]  [<ffffffff8107e5db>] load_module+0x1b1f/0x1e19
[   18.269594]  [<ffffffff8107a4f8>] ? sys_getegid16+0x44/0x44
[   18.275416]  [<ffffffff81386cf2>] ? page_fault+0x22/0x30
[   18.280972]  [<ffffffff8107e969>] SyS_init_module+0x94/0xa1
[   18.286795]  [<ffffffff8138cf12>] system_call_fastpath+0x16/0x1b
[   18.293051] Code: <6e> 65 77 6c 69 6e 6b 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
[   18.302807] RIP  [<ffffffffa0e76002>] 0xffffffffa0e76001
[   18.308429]  RSP <ffff8806212f1ca8>
[   18.312163] CR2: 0000000000000003
[   18.316021] ---[ end trace 839c6b43b00f02f5 ]---

and still:

ffffffffa0e76000 u ip_tunnel_init_net   [ip_gre]

I've checked that ip_tunnel.ko and ip_gre.ko were indeed rebuilt (new timestamps),
and that my patching (I had to resolve manually due to fuzz) really removed __net_init.

/* Steinar */
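
A side observation on the trace above, offered as a hint rather than a diagnosis: the bytes on the "Code:" line look like printable ASCII, which may suggest that RIP was pointing at data rather than at instructions. A minimal sketch for decoding them follows; decode_ascii() and oops_code are hypothetical names, and only the leading bytes of the "Code:" line are copied.

```c
#include <stddef.h>

/*
 * Copies leading printable ASCII bytes into out, stopping at the first
 * NUL or non-printable byte, and NUL-terminates the result.
 * Returns the number of characters copied.
 */
size_t decode_ascii(const unsigned char *bytes, size_t n,
                    char *out, size_t outsz)
{
    size_t i = 0;

    if (outsz == 0)
        return 0;
    while (i < n && i + 1 < outsz && bytes[i] >= 0x20 && bytes[i] < 0x7f) {
        out[i] = (char)bytes[i];
        i++;
    }
    out[i] = '\0';
    return i;
}

/* Leading bytes of the "Code:" line in the oops quoted above. */
const unsigned char oops_code[] = {
    0x6e, 0x65, 0x77, 0x6c, 0x69, 0x6e, 0x6b, 0x00
};
```

Called on oops_code, this yields the seven-character string "newlink".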
Steven Rostedt June 7, 2013, 4:12 p.m. UTC | #2
On Fri, 2013-06-07 at 17:46 +0200, Steinar H. Gunderson wrote:
> On Fri, Jun 07, 2013 at 11:15:00AM -0400, Steven Rostedt wrote:
> > net: Remove __net_init/exit from exported functions
> > 
> > If CONFIG_NET_NS is not set then __net_init is the same as __init and
> > __net_exit is the same as __exit. These functions will be removed from
> > memory after the module loads or is removed. Functions that are exported
> > for use by other functions should never be labeled for removal.
> 
> That didn't help much, I'm afraid:

Ouch :-/

> 
> [...]
> 
> and still:
> 
> ffffffffa0e76000 u ip_tunnel_init_net   [ip_gre]

What do you get if you do an objdump -Dr ip_gre.ko

And then look for ipgre_init, and then subtract 0xb053 (45139) from its
address. As that is: ffffffffa0e81055 - ffffffffa0e76002, then see if
that object file has anything in that location.
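
The offset arithmetic above can be checked with a small sketch. addr_delta() is a hypothetical helper; the two addresses are the ones from the trace (the ipgre_init entry in the call trace and the faulting RIP).

```c
#include <assert.h>
#include <stdint.h>

/* Distance in bytes from the faulting address up to a known address. */
uint64_t addr_delta(uint64_t known_addr, uint64_t fault_addr)
{
    /* The known address is expected to be the higher of the two. */
    assert(known_addr >= fault_addr);
    return known_addr - fault_addr;
}
```

For example, addr_delta(0xffffffffa0e81055ULL, 0xffffffffa0e76002ULL) gives 0xb053, i.e. 45139.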


> 
> I've checked that ip_tunnel.ko and ip_gre.ko were indeed rebuilt (new timestamps),
> and that my patching (I had to resolve manually due to fuzz) really removed __net_init.
> 
> /* Steinar */

You could also try reverting c54419321455631079c7d6e60bc732dd0c5914c5 to
see if that fixes things, just to confirm whether it is the culprit.

Thanks,

-- Steve


Steinar H. Gunderson June 7, 2013, 5:52 p.m. UTC | #3
On Fri, Jun 07, 2013 at 12:12:23PM -0400, Steven Rostedt wrote:
>> ffffffffa0e76000 u ip_tunnel_init_net   [ip_gre]
> What do you get if you do an objdump -Dr ip_gre.ko
> 
> And then look for ipgre_init, and then subtract 0xb053 (45139) from its
> address. As that is: ffffffffa0e81055 - ffffffffa0e76002, then see if
> that object file has anything in that location.

pannekake:~> objdump -Dr /lib/modules/3.10.0-rc4/kernel/net/ipv4/ip_gre.ko | grep ipgre_init        
0000000000000000 <ipgre_init_net>:                                                          
   0:	8b 35 00 00 00 00    	mov    0x0(%rip),%esi        # 6 <ipgre_init_net+0x6>
  13:	e8 00 00 00 00       	callq  18 <ipgre_init_net+0x18>

I.e., the symbol doesn't show up in the disassembly (for whatever reason).

/* Steinar */
Steven Rostedt June 7, 2013, 6:26 p.m. UTC | #4
On Fri, 2013-06-07 at 19:52 +0200, Steinar H. Gunderson wrote:
> On Fri, Jun 07, 2013 at 12:12:23PM -0400, Steven Rostedt wrote:
> >> ffffffffa0e76000 u ip_tunnel_init_net   [ip_gre]
> > What do you get if you do an objdump -Dr ip_gre.ko
> > 
> > And then look for ipgre_init, and then subtract 0xb053 (45139) from its
> > address. As that is: ffffffffa0e81055 - ffffffffa0e76002, then see if
> > that object file has anything in that location.
> 
> pannekake:~> objdump -Dr /lib/modules/3.10.0-rc4/kernel/net/ipv4/ip_gre.ko | grep ipgre_init        
> 0000000000000000 <ipgre_init_net>:                                                          
>    0:	8b 35 00 00 00 00    	mov    0x0(%rip),%esi        # 6 <ipgre_init_net+0x6>
>   13:	e8 00 00 00 00       	callq  18 <ipgre_init_net+0x18>
> 
> I.e., the symbol doesn't show up in the disassembly (for whatever reason).

Ah, that's because of this: module_init(ipgre_init);  Where it makes it
into:

00000000 <init_module>:
   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   53                      push   %ebx
   4:   83 ec 08                sub    $0x8,%esp
   7:   c7 04 24 00 00 00 00    movl   $0x0,(%esp)
                        a: R_386_32     .rodata.str1.4

We can use ipgre_tap_init_net, and the offset of 0xb032 (45106) as that
was 0xffffffffa0e5d034 - 0xffffffffa0e52002. Do you have CONFIG_NET_NS
set?


You can also cat /proc/modules. It gives you where the modules are
located.
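
As a sketch of how the /proc/modules output could be used here: each line carries the module name, its size and its load address, so one can test whether a faulting address falls inside a given module. module_contains() is a hypothetical helper, and the field layout assumed is "name size refcount dependencies state address"; on some systems the address column may read as zero for unprivileged users.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Parses one /proc/modules line and stores the module name in name.
 * Returns 1 if addr lies in [base, base + size), 0 if it does not,
 * and -1 if the line could not be parsed.
 */
int module_contains(const char *line, uint64_t addr, char name[64])
{
    unsigned long size;
    uint64_t base;

    /* Skip refcount, dependency list and state; keep name, size, address. */
    if (sscanf(line, "%63s %lu %*s %*s %*s %" SCNx64,
               name, &size, &base) != 3)
        return -1;
    return addr >= base && addr - base < size;
}
```

For a made-up line such as "example_mod 20480 0 - Live 0xffffffffa0e76000", the address 0xffffffffa0e76002 is reported as inside the module.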

-- Steve

Steinar H. Gunderson June 7, 2013, 6:34 p.m. UTC | #5
On Fri, Jun 07, 2013 at 02:26:08PM -0400, Steven Rostedt wrote:
> On Fri, 2013-06-07 at 19:52 +0200, Steinar H. Gunderson wrote:
> Ah, that's because of this: module_init(ipgre_init);  Where it makes it
> into:
> 
> 00000000 <init_module>:
>    0:   55                      push   %ebp
>    1:   89 e5                   mov    %esp,%ebp
>    3:   53                      push   %ebx
>    4:   83 ec 08                sub    $0x8,%esp
>    7:   c7 04 24 00 00 00 00    movl   $0x0,(%esp)
>                         a: R_386_32     .rodata.str1.4
> 
> We can use ipgre_tap_init_net, and the offset of 0xb032 (45106) as that
> was 0xffffffffa0e5d034 - 0xffffffffa0e52002. Do you have CONFIG_NET_NS
> set?

ipgre_tap_init_net is 000000000000001a, but there's no way I can subtract
0xb053 from that? Sorry, I'm confused. :-)

> You can also cat /proc/modules. It gives you where the modules are
> located.

I've booted back to 3.9.x already; I couldn't live with a crashing kernel like
that. Unfortunately it's not that easy for me to reboot this machine all the
time either. :-/

/* Steinar */
Steven Rostedt June 7, 2013, 6:44 p.m. UTC | #6
On Fri, 2013-06-07 at 20:34 +0200, Steinar H. Gunderson wrote:
> On Fri, Jun 07, 2013 at 02:26:08PM -0400, Steven Rostedt wrote:
> > On Fri, 2013-06-07 at 19:52 +0200, Steinar H. Gunderson wrote:
> > Ah, that's because of this: module_init(ipgre_init);  Where it makes it
> > into:
> > 
> > 00000000 <init_module>:
> >    0:   55                      push   %ebp
> >    1:   89 e5                   mov    %esp,%ebp
> >    3:   53                      push   %ebx
> >    4:   83 ec 08                sub    $0x8,%esp
> >    7:   c7 04 24 00 00 00 00    movl   $0x0,(%esp)
> >                         a: R_386_32     .rodata.str1.4
> > 
> > We can use ipgre_tap_init_net, and the offset of 0xb032 (45106) as that
> > was 0xffffffffa0e5d034 - 0xffffffffa0e52002. Do you have CONFIG_NET_NS
> > set?
> 
> ipgre_tap_init_net is 000000000000001a, but there's no way I can subtract
> 0xb053 from that? Sorry, I'm confused. :-)

OK, then it's most likely in another module (probably ip_tunnel.ko).

Do you know if you have CONFIG_NET_NS set in your .config?

> 
> > You can also cat /proc/modules. It gives you where the modules are
> > located.
> 
> I've booted back to 3.9.x already; I couldn't live with a crashing kernel like
> that. Unfortunately it's not that easy for me to reboot this machine all the
> time either. :-/

OK, if you get time, this may be a candidate to do a git bisect with.

Thanks,

-- Steve


Steinar H. Gunderson June 7, 2013, 6:46 p.m. UTC | #7
On Fri, Jun 07, 2013 at 02:44:19PM -0400, Steven Rostedt wrote:
> Do you know if you have CONFIG_NET_NS set in your .config?

Sorry, I forgot to answer this: No, it is not set.

/* Steinar */
Patch

diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
index e4147ec..850b5b5 100644
--- a/net/ipv4/ip_tunnel.c
+++ b/net/ipv4/ip_tunnel.c
@@ -853,8 +853,8 @@  void ip_tunnel_dellink(struct net_device *dev, struct list_head *head)
 }
 EXPORT_SYMBOL_GPL(ip_tunnel_dellink);
 
-int __net_init ip_tunnel_init_net(struct net *net, int ip_tnl_net_id,
-				  struct rtnl_link_ops *ops, char *devname)
+int ip_tunnel_init_net(struct net *net, int ip_tnl_net_id,
+		       struct rtnl_link_ops *ops, char *devname)
 {
 	struct ip_tunnel_net *itn = net_generic(net, ip_tnl_net_id);
 	struct ip_tunnel_parm parms;
@@ -899,7 +899,7 @@  static void ip_tunnel_destroy(struct ip_tunnel_net *itn, struct list_head *head)
 		unregister_netdevice_queue(itn->fb_tunnel_dev, head);
 }
 
-void __net_exit ip_tunnel_delete_net(struct ip_tunnel_net *itn)
+void ip_tunnel_delete_net(struct ip_tunnel_net *itn)
 {
 	LIST_HEAD(list);