Message ID | 1d301fb86becf863283ee1be107f17e837ef4255.1521122595.git.dcaratti@redhat.com |
---|---|
State | Superseded, archived |
Delegated to: | David Miller |
Headers | show |
Series | [net] net/sched: fix NULL dereference in the error path of tcf_vlan_init() | expand |
Thu, Mar 15, 2018 at 03:06:30PM CET, dcaratti@redhat.com wrote: >when the following command > > # tc actions replace action vlan pop index 100 > >is run for the first time, and tcf_vlan_init() fails allocating struct >tcf_vlan_params, tcf_vlan_cleanup() calls kfree_rcu(NULL, ...). This causes >the following error: > > BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 > IP: __call_rcu+0x23/0x2b0 > PGD 80000000760a2067 P4D 80000000760a2067 PUD 742c1067 PMD 0 > Oops: 0002 [#1] SMP PTI > Modules linked in: act_vlan(E) ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 snd_hda_codec_generic snd_hda_intel mbcache snd_hda_codec jbd2 snd_hda_core crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc snd_hwdep snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd snd_timer glue_helper snd cryptd joydev soundcore virtio_balloon pcspkr i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm virtio_console virtio_blk virtio_net ata_piix crc32c_intel libata virtio_pci i2c_core virtio_ring serio_raw virtio floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: act_vlan] > CPU: 3 PID: 3119 Comm: tc Tainted: G E 4.16.0-rc4.act_vlan.orig+ #403 > Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 > RIP: 0010:__call_rcu+0x23/0x2b0 > RSP: 0018:ffffaac3005fb798 EFLAGS: 00010246 > RAX: ffffffffc0704080 RBX: ffff97f2b4bbe900 RCX: 00000000ffffffff > RDX: ffffffffabca5f00 RSI: 0000000000000010 RDI: 0000000000000010 > RBP: 0000000000000010 R08: 0000000000000001 R09: 0000000000000044 > R10: 00000000fd003000 R11: ffff97f2faab5b91 R12: 0000000000000000 > R13: ffffffffabca5f00 R14: ffff97f2fb80202c R15: 00000000fffffff4 > FS: 00007f68f75b4740(0000) GS:ffff97f2ffd80000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 0000000000000018 CR3: 0000000072b52001 CR4: 00000000001606e0 > Call Trace: > __tcf_idr_release+0x79/0xf0 > tcf_vlan_init+0x168/0x270 [act_vlan] > tcf_action_init_1+0x2cc/0x430 > tcf_action_init+0xd3/0x1b0 > tc_ctl_action+0x18b/0x240 > rtnetlink_rcv_msg+0x29c/0x310 > ? _cond_resched+0x15/0x30 > ? __kmalloc_node_track_caller+0x1b9/0x270 > ? rtnl_calcit.isra.28+0x100/0x100 > netlink_rcv_skb+0xd2/0x110 > netlink_unicast+0x17c/0x230 > netlink_sendmsg+0x2cd/0x3c0 > sock_sendmsg+0x30/0x40 > ___sys_sendmsg+0x27a/0x290 > ? filemap_map_pages+0x34a/0x3a0 > ? __handle_mm_fault+0xbfd/0xe20 > __sys_sendmsg+0x51/0x90 > do_syscall_64+0x6e/0x1a0 > entry_SYSCALL_64_after_hwframe+0x3d/0xa2 > RIP: 0033:0x7f68f69c5ba0 > RSP: 002b:00007fffd79c1118 EFLAGS: 00000246 ORIG_RAX: 000000000000002e > RAX: ffffffffffffffda RBX: 00007fffd79c1240 RCX: 00007f68f69c5ba0 > RDX: 0000000000000000 RSI: 00007fffd79c1190 RDI: 0000000000000003 > RBP: 000000005aaa708e R08: 0000000000000002 R09: 0000000000000000 > R10: 00007fffd79c0ba0 R11: 0000000000000246 R12: 0000000000000000 > R13: 00007fffd79c1254 R14: 0000000000000001 R15: 0000000000669f60 > Code: 5d e9 42 da ff ff 66 90 0f 1f 44 00 00 41 57 41 56 41 55 49 89 d5 41 54 55 48 89 fd 53 48 83 ec 08 40 f6 c7 07 0f 85 19 02 00 00 <48> 89 75 08 48 c7 45 00 00 00 00 00 9c 58 0f 1f 44 00 00 49 89 > RIP: __call_rcu+0x23/0x2b0 RSP: ffffaac3005fb798 > CR2: 0000000000000018 > >fix this in tcf_vlan_cleanup(), ensuring that kfree_rcu(p, ...) is called >only when p is not NULL. > >Fixes: 4c5b9d9642c8 ("act_vlan: VLAN action rewrite to use RCU lock/unlock and update") >Signed-off-by: Davide Caratti <dcaratti@redhat.com> Acked-by: Jiri Pirko <jiri@mellanox.com>
Davide Caratti <dcaratti@redhat.com> writes: > when the following command > > # tc actions replace action vlan pop index 100 > > is run for the first time, and tcf_vlan_init() fails allocating struct > tcf_vlan_params, tcf_vlan_cleanup() calls kfree_rcu(NULL, ...). This causes > the following error: > [...] > fix this in tcf_vlan_cleanup(), ensuring that kfree_rcu(p, ...) is called > only when p is not NULL. > > Fixes: 4c5b9d9642c8 ("act_vlan: VLAN action rewrite to use RCU lock/unlock and update") > Signed-off-by: Davide Caratti <dcaratti@redhat.com> > --- > net/sched/act_vlan.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/net/sched/act_vlan.c b/net/sched/act_vlan.c > index e1a1b3f3983a..c2914e9a4a6f 100644 > --- a/net/sched/act_vlan.c > +++ b/net/sched/act_vlan.c > @@ -225,7 +225,8 @@ static void tcf_vlan_cleanup(struct tc_action *a) > struct tcf_vlan_params *p; > > p = rcu_dereference_protected(v->vlan_p, 1); > - kfree_rcu(p, rcu); > + if (p) > + kfree_rcu(p, rcu); > } > > static int tcf_vlan_dump(struct sk_buff *skb, struct tc_action *a, Good catch. I think you can propagate the fix on the other actions ->cleanup(), where private parameters structure may not be present at cleanup time, e.g. csum, ife.
On Thu, 2018-03-15 at 15:21 +0100, Jiri Pirko wrote:
...
> Acked-by: Jiri Pirko <jiri@mellanox.com>
thank you for reviewing!
apparently, also act_tunnel_key seem and act_csum have a similar problem.
I will check and eventually do a followup series this afternoon.
thank you,
regards
On Thu, 2018-03-15 at 15:29 +0100, Davide Caratti wrote: > On Thu, 2018-03-15 at 15:21 +0100, Jiri Pirko wrote: > ... > > > Acked-by: Jiri Pirko <jiri@mellanox.com> > > thank you for reviewing! > > apparently, also act_tunnel_key seem and act_csum have a similar problem. > I will check and eventually do a followup series this afternoon. > > thank you, > regards hello David, please drop this patch: after some tests, the following TC actions are affected by the same problem: act_vlan act_csum act_tunnel_key act_skbmod act_sample so, I'm posting right now a series that fixes all of them. In act_ife and act_bpf, the problem is potentially there, but we don't see it crashing yet because we don't call tcf_idr_release() on the error path. This is causing the leak of 'index', and will be fixed in another series tomorrow. thank you in advance, regards
diff --git a/net/sched/act_vlan.c b/net/sched/act_vlan.c index e1a1b3f3983a..c2914e9a4a6f 100644 --- a/net/sched/act_vlan.c +++ b/net/sched/act_vlan.c @@ -225,7 +225,8 @@ static void tcf_vlan_cleanup(struct tc_action *a) struct tcf_vlan_params *p; p = rcu_dereference_protected(v->vlan_p, 1); - kfree_rcu(p, rcu); + if (p) + kfree_rcu(p, rcu); } static int tcf_vlan_dump(struct sk_buff *skb, struct tc_action *a,
when the following command # tc actions replace action vlan pop index 100 is run for the first time, and tcf_vlan_init() fails allocating struct tcf_vlan_params, tcf_vlan_cleanup() calls kfree_rcu(NULL, ...). This causes the following error: BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 IP: __call_rcu+0x23/0x2b0 PGD 80000000760a2067 P4D 80000000760a2067 PUD 742c1067 PMD 0 Oops: 0002 [#1] SMP PTI Modules linked in: act_vlan(E) ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 snd_hda_codec_generic snd_hda_intel mbcache snd_hda_codec jbd2 snd_hda_core crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc snd_hwdep snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd snd_timer glue_helper snd cryptd joydev soundcore virtio_balloon pcspkr i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm virtio_console virtio_blk virtio_net ata_piix crc32c_intel libata virtio_pci i2c_core virtio_ring serio_raw virtio floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: act_vlan] CPU: 3 PID: 3119 Comm: tc Tainted: G E 4.16.0-rc4.act_vlan.orig+ #403 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 RIP: 0010:__call_rcu+0x23/0x2b0 RSP: 0018:ffffaac3005fb798 EFLAGS: 00010246 RAX: ffffffffc0704080 RBX: ffff97f2b4bbe900 RCX: 00000000ffffffff RDX: ffffffffabca5f00 RSI: 0000000000000010 RDI: 0000000000000010 RBP: 0000000000000010 R08: 0000000000000001 R09: 0000000000000044 R10: 00000000fd003000 R11: ffff97f2faab5b91 R12: 0000000000000000 R13: ffffffffabca5f00 R14: ffff97f2fb80202c R15: 00000000fffffff4 FS: 00007f68f75b4740(0000) GS:ffff97f2ffd80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000018 CR3: 0000000072b52001 CR4: 00000000001606e0 Call Trace: __tcf_idr_release+0x79/0xf0 tcf_vlan_init+0x168/0x270 [act_vlan] tcf_action_init_1+0x2cc/0x430 tcf_action_init+0xd3/0x1b0 tc_ctl_action+0x18b/0x240 rtnetlink_rcv_msg+0x29c/0x310 ? _cond_resched+0x15/0x30 ? __kmalloc_node_track_caller+0x1b9/0x270 ? rtnl_calcit.isra.28+0x100/0x100 netlink_rcv_skb+0xd2/0x110 netlink_unicast+0x17c/0x230 netlink_sendmsg+0x2cd/0x3c0 sock_sendmsg+0x30/0x40 ___sys_sendmsg+0x27a/0x290 ? filemap_map_pages+0x34a/0x3a0 ? __handle_mm_fault+0xbfd/0xe20 __sys_sendmsg+0x51/0x90 do_syscall_64+0x6e/0x1a0 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 RIP: 0033:0x7f68f69c5ba0 RSP: 002b:00007fffd79c1118 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007fffd79c1240 RCX: 00007f68f69c5ba0 RDX: 0000000000000000 RSI: 00007fffd79c1190 RDI: 0000000000000003 RBP: 000000005aaa708e R08: 0000000000000002 R09: 0000000000000000 R10: 00007fffd79c0ba0 R11: 0000000000000246 R12: 0000000000000000 R13: 00007fffd79c1254 R14: 0000000000000001 R15: 0000000000669f60 Code: 5d e9 42 da ff ff 66 90 0f 1f 44 00 00 41 57 41 56 41 55 49 89 d5 41 54 55 48 89 fd 53 48 83 ec 08 40 f6 c7 07 0f 85 19 02 00 00 <48> 89 75 08 48 c7 45 00 00 00 00 00 9c 58 0f 1f 44 00 00 49 89 RIP: __call_rcu+0x23/0x2b0 RSP: ffffaac3005fb798 CR2: 0000000000000018 fix this in tcf_vlan_cleanup(), ensuring that kfree_rcu(p, ...) is called only when p is not NULL. Fixes: 4c5b9d9642c8 ("act_vlan: VLAN action rewrite to use RCU lock/unlock and update") Signed-off-by: Davide Caratti <dcaratti@redhat.com> --- net/sched/act_vlan.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)