diff mbox series

fq_codel: fix NULL pointer deref in fq_codel_reset

Message ID 20180608000636.4338-1-jacob.e.keller@intel.com
State Changes Requested
Delegated to: Jeff Kirsher
Headers show
Series fq_codel: fix NULL pointer deref in fq_codel_reset | expand

Commit Message

Jacob Keller June 8, 2018, 12:06 a.m. UTC
The function qdisc_create_dftl attempts to create a default qdisc. If
this fails, it calls qdisc_destroy when cleaning up. The qdisc_destroy
function calls the ->reset op on the qdisc.

In the case of sch_fq_codel.c, this function will panic when the qdisc
wasn't properly initialized:

   kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
   kernel: IP: fq_codel_reset+0x58/0xd0 [sch_fq_codel]
   kernel: PGD 0 P4D 0
   kernel: Oops: 0000 [#1] SMP PTI
   kernel: Modules linked in: i40iw i40e(OE) xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack tun bridge stp llc devlink ebtable_filter ebtables ip6table_filter ip6_tables rpcrdma ib_isert iscsi_target_mod sunrpc ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_cstate iTCO_wdt iTCO_vendor_support intel_uncore ib_core intel_rapl_perf mei_me mei joydev i2c_i801 lpc_ich ioatdma shpchp wmi sch_fq_codel xfs libcrc32c mgag200 ixgbe drm_kms_helper isci ttm firewire_ohci
   kernel:  mdio drm igb libsas crc32c_intel firewire_core ptp pps_core scsi_transport_sas crc_itu_t dca i2c_algo_bit ipmi_si ipmi_devintf ipmi_msghandler [last unloaded: i40e]
   kernel: CPU: 10 PID: 4219 Comm: ip Tainted: G           OE    4.16.13custom-fq-codel-test+ #3
   kernel: Hardware name: Intel Corporation S2600CO/S2600CO, BIOS SE5C600.86B.02.05.0004.051120151007 05/11/2015
   kernel: RIP: 0010:fq_codel_reset+0x58/0xd0 [sch_fq_codel]
   kernel: RSP: 0018:ffffbfbf4c1fb620 EFLAGS: 00010246
   kernel: RAX: 0000000000000400 RBX: 0000000000000000 RCX: 00000000000005b9
   kernel: RDX: 0000000000000000 RSI: ffff9d03264a60c0 RDI: ffff9cfd17b31c00
   kernel: RBP: 0000000000000001 R08: 00000000000260c0 R09: ffffffffb679c3e9
   kernel: R10: fffff1dab06a0e80 R11: ffff9cfd163af800 R12: ffff9cfd17b31c00
   kernel: R13: 0000000000000001 R14: ffff9cfd153de600 R15: 0000000000000001
   kernel: FS:  00007fdec2f92800(0000) GS:ffff9d0326480000(0000) knlGS:0000000000000000
   kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
   kernel: CR2: 0000000000000008 CR3: 0000000c1956a006 CR4: 00000000000606e0
   kernel: Call Trace:
   kernel:  qdisc_destroy+0x56/0x140
   kernel:  qdisc_create_dflt+0x8b/0xb0
   kernel:  mq_init+0xc1/0xf0
   kernel:  qdisc_create_dflt+0x5a/0xb0
   kernel:  dev_activate+0x205/0x230
   kernel:  __dev_open+0xf5/0x160
   kernel:  __dev_change_flags+0x1a3/0x210
   kernel:  dev_change_flags+0x21/0x60
   kernel:  do_setlink+0x660/0xdf0
   kernel:  ? down_trylock+0x25/0x30
   kernel:  ? xfs_buf_trylock+0x1a/0xd0 [xfs]
   kernel:  ? rtnl_newlink+0x816/0x990
   kernel:  ? _xfs_buf_find+0x327/0x580 [xfs]
   kernel:  ? _cond_resched+0x15/0x30
   kernel:  ? kmem_cache_alloc+0x20/0x1b0
   kernel:  ? rtnetlink_rcv_msg+0x200/0x2f0
   kernel:  ? rtnl_calcit.isra.30+0x100/0x100
   kernel:  ? netlink_rcv_skb+0x4c/0x120
   kernel:  ? netlink_unicast+0x19e/0x260
   kernel:  ? netlink_sendmsg+0x1ff/0x3c0
   kernel:  ? sock_sendmsg+0x36/0x40
   kernel:  ? ___sys_sendmsg+0x295/0x2f0
   kernel:  ? ebitmap_cmp+0x6d/0x90
   kernel:  ? dev_get_by_name_rcu+0x73/0x90
   kernel:  ? skb_dequeue+0x52/0x60
   kernel:  ? __inode_wait_for_writeback+0x7f/0xf0
   kernel:  ? bit_waitqueue+0x30/0x30
   kernel:  ? fsnotify_grab_connector+0x3c/0x60
   kernel:  ? __sys_sendmsg+0x51/0x90
   kernel:  ? do_syscall_64+0x74/0x180
   kernel:  ? entry_SYSCALL_64_after_hwframe+0x3d/0xa2
   kernel: Code: 00 00 48 89 87 00 02 00 00 8b 87 a0 01 00 00 85 c0 0f 84 84 00 00 00 31 ed 48 63 dd 83 c5 01 48 c1 e3 06 49 03 9c 24 90 01 00 00 <48> 8b 73 08 48 8b 3b e8 6c 9a 4f f6 48 8d 43 10 48 c7 03 00 00
   kernel: RIP: fq_codel_reset+0x58/0xd0 [sch_fq_codel] RSP: ffffbfbf4c1fb620
   kernel: CR2: 0000000000000008
   kernel: ---[ end trace e81a62bede66274e ]---

This occurs because if fq_codel_init fails, it has left the private data
in an incomplete state. For example, if tcf_block_get fails, (as in the
above panic), then q->flows and q->backlogs will be NULL. Thus they will
cause NULL pointer access when attempting to reset them in
fq_codel_reset.

We could mitigate some of these issues by changing fq_codel_init to more
explicitly cleanup after itself when failing. For example, we could
ensure that q->flowcnt was set to 0 so that the loop over each flow in
fq_codel_reset would not trigger. However, this would not prevent a NULL
pointer dereference when attempting to memset the q->backlogs.

Instead, just add a NULL check prior to attempting to reset these
fields.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Cc: Eric Dumazet <edumazet@google.com>
---
 net/sched/sch_fq_codel.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

Comments

Jacob Keller June 8, 2018, 12:08 a.m. UTC | #1
> -----Original Message-----
> From: Keller, Jacob E
> Sent: Thursday, June 07, 2018 5:07 PM
> To: Intel Wired LAN <intel-wired-lan@lists.osuosl.org>
> Cc: Kirsher, Jeffrey T <jeffrey.t.kirsher@intel.com>; Keller, Jacob E
> <jacob.e.keller@intel.com>; Eric Dumazet <edumazet@google.com>
> Subject: [PATCH] fq_codel: fix NULL pointer deref in fq_codel_reset
> 

This should be targeting [net] as that's where I built it, as it is a bug fix. I forgot to set the subject prefix when sending the mail.

Thanks,
Jake

> The function qdisc_create_dftl attempts to create a default qdisc. If
> this fails, it calls qdisc_destroy when cleaning up. The qdisc_destroy
> function calls the ->reset op on the qdisc.
> 
> In the case of sch_fq_codel.c, this function will panic when the qdisc
> wasn't properly initialized:
> 
>    kernel: BUG: unable to handle kernel NULL pointer dereference at
> 0000000000000008
>    kernel: IP: fq_codel_reset+0x58/0xd0 [sch_fq_codel]
>    kernel: PGD 0 P4D 0
>    kernel: Oops: 0000 [#1] SMP PTI
>    kernel: Modules linked in: i40iw i40e(OE) xt_CHECKSUM iptable_mangle
> ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat
> nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack tun bridge stp llc
> devlink ebtable_filter ebtables ip6table_filter ip6_tables rpcrdma ib_isert
> iscsi_target_mod sunrpc ib_iser libiscsi scsi_transport_iscsi ib_srpt
> target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm
> ib_uverbs ib_umad rdma_cm ib_cm iw_cm intel_rapl sb_edac
> x86_pkg_temp_thermal intel_powerclamp coretemp kvm irqbypass
> crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_cstate iTCO_wdt
> iTCO_vendor_support intel_uncore ib_core intel_rapl_perf mei_me mei joydev
> i2c_i801 lpc_ich ioatdma shpchp wmi sch_fq_codel xfs libcrc32c mgag200 ixgbe
> drm_kms_helper isci ttm firewire_ohci
>    kernel:  mdio drm igb libsas crc32c_intel firewire_core ptp pps_core
> scsi_transport_sas crc_itu_t dca i2c_algo_bit ipmi_si ipmi_devintf
> ipmi_msghandler [last unloaded: i40e]
>    kernel: CPU: 10 PID: 4219 Comm: ip Tainted: G           OE    4.16.13custom-fq-
> codel-test+ #3
>    kernel: Hardware name: Intel Corporation S2600CO/S2600CO, BIOS
> SE5C600.86B.02.05.0004.051120151007 05/11/2015
>    kernel: RIP: 0010:fq_codel_reset+0x58/0xd0 [sch_fq_codel]
>    kernel: RSP: 0018:ffffbfbf4c1fb620 EFLAGS: 00010246
>    kernel: RAX: 0000000000000400 RBX: 0000000000000000 RCX: 00000000000005b9
>    kernel: RDX: 0000000000000000 RSI: ffff9d03264a60c0 RDI: ffff9cfd17b31c00
>    kernel: RBP: 0000000000000001 R08: 00000000000260c0 R09: ffffffffb679c3e9
>    kernel: R10: fffff1dab06a0e80 R11: ffff9cfd163af800 R12: ffff9cfd17b31c00
>    kernel: R13: 0000000000000001 R14: ffff9cfd153de600 R15: 0000000000000001
>    kernel: FS:  00007fdec2f92800(0000) GS:ffff9d0326480000(0000)
> knlGS:0000000000000000
>    kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>    kernel: CR2: 0000000000000008 CR3: 0000000c1956a006 CR4: 00000000000606e0
>    kernel: Call Trace:
>    kernel:  qdisc_destroy+0x56/0x140
>    kernel:  qdisc_create_dflt+0x8b/0xb0
>    kernel:  mq_init+0xc1/0xf0
>    kernel:  qdisc_create_dflt+0x5a/0xb0
>    kernel:  dev_activate+0x205/0x230
>    kernel:  __dev_open+0xf5/0x160
>    kernel:  __dev_change_flags+0x1a3/0x210
>    kernel:  dev_change_flags+0x21/0x60
>    kernel:  do_setlink+0x660/0xdf0
>    kernel:  ? down_trylock+0x25/0x30
>    kernel:  ? xfs_buf_trylock+0x1a/0xd0 [xfs]
>    kernel:  ? rtnl_newlink+0x816/0x990
>    kernel:  ? _xfs_buf_find+0x327/0x580 [xfs]
>    kernel:  ? _cond_resched+0x15/0x30
>    kernel:  ? kmem_cache_alloc+0x20/0x1b0
>    kernel:  ? rtnetlink_rcv_msg+0x200/0x2f0
>    kernel:  ? rtnl_calcit.isra.30+0x100/0x100
>    kernel:  ? netlink_rcv_skb+0x4c/0x120
>    kernel:  ? netlink_unicast+0x19e/0x260
>    kernel:  ? netlink_sendmsg+0x1ff/0x3c0
>    kernel:  ? sock_sendmsg+0x36/0x40
>    kernel:  ? ___sys_sendmsg+0x295/0x2f0
>    kernel:  ? ebitmap_cmp+0x6d/0x90
>    kernel:  ? dev_get_by_name_rcu+0x73/0x90
>    kernel:  ? skb_dequeue+0x52/0x60
>    kernel:  ? __inode_wait_for_writeback+0x7f/0xf0
>    kernel:  ? bit_waitqueue+0x30/0x30
>    kernel:  ? fsnotify_grab_connector+0x3c/0x60
>    kernel:  ? __sys_sendmsg+0x51/0x90
>    kernel:  ? do_syscall_64+0x74/0x180
>    kernel:  ? entry_SYSCALL_64_after_hwframe+0x3d/0xa2
>    kernel: Code: 00 00 48 89 87 00 02 00 00 8b 87 a0 01 00 00 85 c0 0f 84 84 00 00 00
> 31 ed 48 63 dd 83 c5 01 48 c1 e3 06 49 03 9c 24 90 01 00 00 <48> 8b 73 08 48 8b 3b e8
> 6c 9a 4f f6 48 8d 43 10 48 c7 03 00 00
>    kernel: RIP: fq_codel_reset+0x58/0xd0 [sch_fq_codel] RSP: ffffbfbf4c1fb620
>    kernel: CR2: 0000000000000008
>    kernel: ---[ end trace e81a62bede66274e ]---
> 
> This occurs because if fq_codel_init fails, it has left the private data
> in an incomplete state. For example, if tcf_block_get fails, (as in the
> above panic), then q->flows and q->backlogs will be NULL. Thus they will
> cause NULL pointer access when attempting to reset them in
> fq_codel_reset.
> 
> We could mitigate some of these issues by changing fq_codel_init to more
> explicitly cleanup after itself when failing. For example, we could
> ensure that q->flowcnt was set to 0 so that the loop over each flow in
> fq_codel_reset would not trigger. However, this would not prevent a NULL
> pointer dereference when attempting to memset the q->backlogs.
> 
> Instead, just add a NULL check prior to attempting to reset these
> fields.
> 
> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> Cc: Eric Dumazet <edumazet@google.com>
> ---
>  net/sched/sch_fq_codel.c | 15 +++++++++------
>  1 file changed, 9 insertions(+), 6 deletions(-)
> 
> diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c
> index 22fa13cf5d8b..1658c314ee40 100644
> --- a/net/sched/sch_fq_codel.c
> +++ b/net/sched/sch_fq_codel.c
> @@ -352,14 +352,17 @@ static void fq_codel_reset(struct Qdisc *sch)
> 
>  	INIT_LIST_HEAD(&q->new_flows);
>  	INIT_LIST_HEAD(&q->old_flows);
> -	for (i = 0; i < q->flows_cnt; i++) {
> -		struct fq_codel_flow *flow = q->flows + i;
> +	if (q->flows) {
> +		for (i = 0; i < q->flows_cnt; i++) {
> +			struct fq_codel_flow *flow = q->flows + i;
> 
> -		fq_codel_flow_purge(flow);
> -		INIT_LIST_HEAD(&flow->flowchain);
> -		codel_vars_init(&flow->cvars);
> +			fq_codel_flow_purge(flow);
> +			INIT_LIST_HEAD(&flow->flowchain);
> +			codel_vars_init(&flow->cvars);
> +		}
>  	}
> -	memset(q->backlogs, 0, q->flows_cnt * sizeof(u32));
> +	if (q->backlogs)
> +		memset(q->backlogs, 0, q->flows_cnt * sizeof(u32));
>  	sch->q.qlen = 0;
>  	sch->qstats.backlog = 0;
>  	q->memory_usage = 0;
> --
> 2.18.0.rc1.134.g5f29118f3507
Eric Dumazet June 8, 2018, 3:01 p.m. UTC | #2
Not sure this is the right fix.

It is very strange that

->init()   (failure) -> reset()  ,  then ->destroy()

Why reset() is called before a destroy, I have no idea.

Anyway you must send patches to netdev@

Thanks.


On Thu, Jun 7, 2018 at 5:08 PM Keller, Jacob E <jacob.e.keller@intel.com>
wrote:

> > -----Original Message-----
> > From: Keller, Jacob E
> > Sent: Thursday, June 07, 2018 5:07 PM
> > To: Intel Wired LAN <intel-wired-lan@lists.osuosl.org>
> > Cc: Kirsher, Jeffrey T <jeffrey.t.kirsher@intel.com>; Keller, Jacob E
> > <jacob.e.keller@intel.com>; Eric Dumazet <edumazet@google.com>
> > Subject: [PATCH] fq_codel: fix NULL pointer deref in fq_codel_reset
> >
>
> This should be targeting [net] as that's where I built it, as it is a bug
> fix. I forgot to set the subject prefix when sending the mail.
>
> Thanks,
> Jake
>
> > The function qdisc_create_dftl attempts to create a default qdisc. If
> > this fails, it calls qdisc_destroy when cleaning up. The qdisc_destroy
> > function calls the ->reset op on the qdisc.
> >
> > In the case of sch_fq_codel.c, this function will panic when the qdisc
> > wasn't properly initialized:
> >
> >    kernel: BUG: unable to handle kernel NULL pointer dereference at
> > 0000000000000008
> >    kernel: IP: fq_codel_reset+0x58/0xd0 [sch_fq_codel]
> >    kernel: PGD 0 P4D 0
> >    kernel: Oops: 0000 [#1] SMP PTI
> >    kernel: Modules linked in: i40iw i40e(OE) xt_CHECKSUM iptable_mangle
> > ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat
> > nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack tun bridge
> stp llc
> > devlink ebtable_filter ebtables ip6table_filter ip6_tables rpcrdma
> ib_isert
> > iscsi_target_mod sunrpc ib_iser libiscsi scsi_transport_iscsi ib_srpt
> > target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm
> > ib_uverbs ib_umad rdma_cm ib_cm iw_cm intel_rapl sb_edac
> > x86_pkg_temp_thermal intel_powerclamp coretemp kvm irqbypass
> > crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_cstate iTCO_wdt
> > iTCO_vendor_support intel_uncore ib_core intel_rapl_perf mei_me mei
> joydev
> > i2c_i801 lpc_ich ioatdma shpchp wmi sch_fq_codel xfs libcrc32c mgag200
> ixgbe
> > drm_kms_helper isci ttm firewire_ohci
> >    kernel:  mdio drm igb libsas crc32c_intel firewire_core ptp pps_core
> > scsi_transport_sas crc_itu_t dca i2c_algo_bit ipmi_si ipmi_devintf
> > ipmi_msghandler [last unloaded: i40e]
> >    kernel: CPU: 10 PID: 4219 Comm: ip Tainted: G           OE
> 4.16.13custom-fq-
> > codel-test+ #3
> >    kernel: Hardware name: Intel Corporation S2600CO/S2600CO, BIOS
> > SE5C600.86B.02.05.0004.051120151007 05/11/2015
> >    kernel: RIP: 0010:fq_codel_reset+0x58/0xd0 [sch_fq_codel]
> >    kernel: RSP: 0018:ffffbfbf4c1fb620 EFLAGS: 00010246
> >    kernel: RAX: 0000000000000400 RBX: 0000000000000000 RCX:
> 00000000000005b9
> >    kernel: RDX: 0000000000000000 RSI: ffff9d03264a60c0 RDI:
> ffff9cfd17b31c00
> >    kernel: RBP: 0000000000000001 R08: 00000000000260c0 R09:
> ffffffffb679c3e9
> >    kernel: R10: fffff1dab06a0e80 R11: ffff9cfd163af800 R12:
> ffff9cfd17b31c00
> >    kernel: R13: 0000000000000001 R14: ffff9cfd153de600 R15:
> 0000000000000001
> >    kernel: FS:  00007fdec2f92800(0000) GS:ffff9d0326480000(0000)
> > knlGS:0000000000000000
> >    kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >    kernel: CR2: 0000000000000008 CR3: 0000000c1956a006 CR4:
> 00000000000606e0
> >    kernel: Call Trace:
> >    kernel:  qdisc_destroy+0x56/0x140
> >    kernel:  qdisc_create_dflt+0x8b/0xb0
> >    kernel:  mq_init+0xc1/0xf0
> >    kernel:  qdisc_create_dflt+0x5a/0xb0
> >    kernel:  dev_activate+0x205/0x230
> >    kernel:  __dev_open+0xf5/0x160
> >    kernel:  __dev_change_flags+0x1a3/0x210
> >    kernel:  dev_change_flags+0x21/0x60
> >    kernel:  do_setlink+0x660/0xdf0
> >    kernel:  ? down_trylock+0x25/0x30
> >    kernel:  ? xfs_buf_trylock+0x1a/0xd0 [xfs]
> >    kernel:  ? rtnl_newlink+0x816/0x990
> >    kernel:  ? _xfs_buf_find+0x327/0x580 [xfs]
> >    kernel:  ? _cond_resched+0x15/0x30
> >    kernel:  ? kmem_cache_alloc+0x20/0x1b0
> >    kernel:  ? rtnetlink_rcv_msg+0x200/0x2f0
> >    kernel:  ? rtnl_calcit.isra.30+0x100/0x100
> >    kernel:  ? netlink_rcv_skb+0x4c/0x120
> >    kernel:  ? netlink_unicast+0x19e/0x260
> >    kernel:  ? netlink_sendmsg+0x1ff/0x3c0
> >    kernel:  ? sock_sendmsg+0x36/0x40
> >    kernel:  ? ___sys_sendmsg+0x295/0x2f0
> >    kernel:  ? ebitmap_cmp+0x6d/0x90
> >    kernel:  ? dev_get_by_name_rcu+0x73/0x90
> >    kernel:  ? skb_dequeue+0x52/0x60
> >    kernel:  ? __inode_wait_for_writeback+0x7f/0xf0
> >    kernel:  ? bit_waitqueue+0x30/0x30
> >    kernel:  ? fsnotify_grab_connector+0x3c/0x60
> >    kernel:  ? __sys_sendmsg+0x51/0x90
> >    kernel:  ? do_syscall_64+0x74/0x180
> >    kernel:  ? entry_SYSCALL_64_after_hwframe+0x3d/0xa2
> >    kernel: Code: 00 00 48 89 87 00 02 00 00 8b 87 a0 01 00 00 85 c0 0f
> 84 84 00 00 00
> > 31 ed 48 63 dd 83 c5 01 48 c1 e3 06 49 03 9c 24 90 01 00 00 <48> 8b 73
> 08 48 8b 3b e8
> > 6c 9a 4f f6 48 8d 43 10 48 c7 03 00 00
> >    kernel: RIP: fq_codel_reset+0x58/0xd0 [sch_fq_codel] RSP:
> ffffbfbf4c1fb620
> >    kernel: CR2: 0000000000000008
> >    kernel: ---[ end trace e81a62bede66274e ]---
> >
> > This occurs because if fq_codel_init fails, it has left the private data
> > in an incomplete state. For example, if tcf_block_get fails, (as in the
> > above panic), then q->flows and q->backlogs will be NULL. Thus they will
> > cause NULL pointer access when attempting to reset them in
> > fq_codel_reset.
> >
> > We could mitigate some of these issues by changing fq_codel_init to more
> > explicitly cleanup after itself when failing. For example, we could
> > ensure that q->flowcnt was set to 0 so that the loop over each flow in
> > fq_codel_reset would not trigger. However, this would not prevent a NULL
> > pointer dereference when attempting to memset the q->backlogs.
> >
> > Instead, just add a NULL check prior to attempting to reset these
> > fields.
> >
> > Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> > Cc: Eric Dumazet <edumazet@google.com>
> > ---
> >  net/sched/sch_fq_codel.c | 15 +++++++++------
> >  1 file changed, 9 insertions(+), 6 deletions(-)
> >
> > diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c
> > index 22fa13cf5d8b..1658c314ee40 100644
> > --- a/net/sched/sch_fq_codel.c
> > +++ b/net/sched/sch_fq_codel.c
> > @@ -352,14 +352,17 @@ static void fq_codel_reset(struct Qdisc *sch)
> >
> >       INIT_LIST_HEAD(&q->new_flows);
> >       INIT_LIST_HEAD(&q->old_flows);
> > -     for (i = 0; i < q->flows_cnt; i++) {
> > -             struct fq_codel_flow *flow = q->flows + i;
> > +     if (q->flows) {
> > +             for (i = 0; i < q->flows_cnt; i++) {
> > +                     struct fq_codel_flow *flow = q->flows + i;
> >
> > -             fq_codel_flow_purge(flow);
> > -             INIT_LIST_HEAD(&flow->flowchain);
> > -             codel_vars_init(&flow->cvars);
> > +                     fq_codel_flow_purge(flow);
> > +                     INIT_LIST_HEAD(&flow->flowchain);
> > +                     codel_vars_init(&flow->cvars);
> > +             }
> >       }
> > -     memset(q->backlogs, 0, q->flows_cnt * sizeof(u32));
> > +     if (q->backlogs)
> > +             memset(q->backlogs, 0, q->flows_cnt * sizeof(u32));
> >       sch->q.qlen = 0;
> >       sch->qstats.backlog = 0;
> >       q->memory_usage = 0;
> > --
> > 2.18.0.rc1.134.g5f29118f3507
>
>
<div dir="ltr">Not sure this is the right fix.<div><br></div><div>It is very strange that </div><div><br></div><div>-&gt;init()   (failure) -&gt; reset()  ,  then -&gt;destroy()</div><div><br></div><div>Why reset() is called before a destroy, I have no idea.</div><div><br></div><div>Anyway you must send patches to netdev@</div><div><br></div><div>Thanks.</div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr">On Thu, Jun 7, 2018 at 5:08 PM Keller, Jacob E &lt;<a href="mailto:jacob.e.keller@intel.com">jacob.e.keller@intel.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">&gt; -----Original Message-----<br>
&gt; From: Keller, Jacob E<br>
&gt; Sent: Thursday, June 07, 2018 5:07 PM<br>
&gt; To: Intel Wired LAN &lt;<a href="mailto:intel-wired-lan@lists.osuosl.org" target="_blank">intel-wired-lan@lists.osuosl.org</a>&gt;<br>
&gt; Cc: Kirsher, Jeffrey T &lt;<a href="mailto:jeffrey.t.kirsher@intel.com" target="_blank">jeffrey.t.kirsher@intel.com</a>&gt;; Keller, Jacob E<br>
&gt; &lt;<a href="mailto:jacob.e.keller@intel.com" target="_blank">jacob.e.keller@intel.com</a>&gt;; Eric Dumazet &lt;<a href="mailto:edumazet@google.com" target="_blank">edumazet@google.com</a>&gt;<br>
&gt; Subject: [PATCH] fq_codel: fix NULL pointer deref in fq_codel_reset<br>
&gt; <br>
<br>
This should be targeting [net] as that&#39;s where I built it, as it is a bug fix. I forgot to set the subject prefix when sending the mail.<br>
<br>
Thanks,<br>
Jake<br>
<br>
&gt; The function qdisc_create_dftl attempts to create a default qdisc. If<br>
&gt; this fails, it calls qdisc_destroy when cleaning up. The qdisc_destroy<br>
&gt; function calls the -&gt;reset op on the qdisc.<br>
&gt; <br>
&gt; In the case of sch_fq_codel.c, this function will panic when the qdisc<br>
&gt; wasn&#39;t properly initialized:<br>
&gt; <br>
&gt;    kernel: BUG: unable to handle kernel NULL pointer dereference at<br>
&gt; 0000000000000008<br>
&gt;    kernel: IP: fq_codel_reset+0x58/0xd0 [sch_fq_codel]<br>
&gt;    kernel: PGD 0 P4D 0<br>
&gt;    kernel: Oops: 0000 [#1] SMP PTI<br>
&gt;    kernel: Modules linked in: i40iw i40e(OE) xt_CHECKSUM iptable_mangle<br>
&gt; ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat<br>
&gt; nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack tun bridge stp llc<br>
&gt; devlink ebtable_filter ebtables ip6table_filter ip6_tables rpcrdma ib_isert<br>
&gt; iscsi_target_mod sunrpc ib_iser libiscsi scsi_transport_iscsi ib_srpt<br>
&gt; target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm<br>
&gt; ib_uverbs ib_umad rdma_cm ib_cm iw_cm intel_rapl sb_edac<br>
&gt; x86_pkg_temp_thermal intel_powerclamp coretemp kvm irqbypass<br>
&gt; crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_cstate iTCO_wdt<br>
&gt; iTCO_vendor_support intel_uncore ib_core intel_rapl_perf mei_me mei joydev<br>
&gt; i2c_i801 lpc_ich ioatdma shpchp wmi sch_fq_codel xfs libcrc32c mgag200 ixgbe<br>
&gt; drm_kms_helper isci ttm firewire_ohci<br>
&gt;    kernel:  mdio drm igb libsas crc32c_intel firewire_core ptp pps_core<br>
&gt; scsi_transport_sas crc_itu_t dca i2c_algo_bit ipmi_si ipmi_devintf<br>
&gt; ipmi_msghandler [last unloaded: i40e]<br>
&gt;    kernel: CPU: 10 PID: 4219 Comm: ip Tainted: G           OE    4.16.13custom-fq-<br>
&gt; codel-test+ #3<br>
&gt;    kernel: Hardware name: Intel Corporation S2600CO/S2600CO, BIOS<br>
&gt; SE5C600.86B.02.05.0004.051120151007 05/11/2015<br>
&gt;    kernel: RIP: 0010:fq_codel_reset+0x58/0xd0 [sch_fq_codel]<br>
&gt;    kernel: RSP: 0018:ffffbfbf4c1fb620 EFLAGS: 00010246<br>
&gt;    kernel: RAX: 0000000000000400 RBX: 0000000000000000 RCX: 00000000000005b9<br>
&gt;    kernel: RDX: 0000000000000000 RSI: ffff9d03264a60c0 RDI: ffff9cfd17b31c00<br>
&gt;    kernel: RBP: 0000000000000001 R08: 00000000000260c0 R09: ffffffffb679c3e9<br>
&gt;    kernel: R10: fffff1dab06a0e80 R11: ffff9cfd163af800 R12: ffff9cfd17b31c00<br>
&gt;    kernel: R13: 0000000000000001 R14: ffff9cfd153de600 R15: 0000000000000001<br>
&gt;    kernel: FS:  00007fdec2f92800(0000) GS:ffff9d0326480000(0000)<br>
&gt; knlGS:0000000000000000<br>
&gt;    kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033<br>
&gt;    kernel: CR2: 0000000000000008 CR3: 0000000c1956a006 CR4: 00000000000606e0<br>
&gt;    kernel: Call Trace:<br>
&gt;    kernel:  qdisc_destroy+0x56/0x140<br>
&gt;    kernel:  qdisc_create_dflt+0x8b/0xb0<br>
&gt;    kernel:  mq_init+0xc1/0xf0<br>
&gt;    kernel:  qdisc_create_dflt+0x5a/0xb0<br>
&gt;    kernel:  dev_activate+0x205/0x230<br>
&gt;    kernel:  __dev_open+0xf5/0x160<br>
&gt;    kernel:  __dev_change_flags+0x1a3/0x210<br>
&gt;    kernel:  dev_change_flags+0x21/0x60<br>
&gt;    kernel:  do_setlink+0x660/0xdf0<br>
&gt;    kernel:  ? down_trylock+0x25/0x30<br>
&gt;    kernel:  ? xfs_buf_trylock+0x1a/0xd0 [xfs]<br>
&gt;    kernel:  ? rtnl_newlink+0x816/0x990<br>
&gt;    kernel:  ? _xfs_buf_find+0x327/0x580 [xfs]<br>
&gt;    kernel:  ? _cond_resched+0x15/0x30<br>
&gt;    kernel:  ? kmem_cache_alloc+0x20/0x1b0<br>
&gt;    kernel:  ? rtnetlink_rcv_msg+0x200/0x2f0<br>
&gt;    kernel:  ? rtnl_calcit.isra.30+0x100/0x100<br>
&gt;    kernel:  ? netlink_rcv_skb+0x4c/0x120<br>
&gt;    kernel:  ? netlink_unicast+0x19e/0x260<br>
&gt;    kernel:  ? netlink_sendmsg+0x1ff/0x3c0<br>
&gt;    kernel:  ? sock_sendmsg+0x36/0x40<br>
&gt;    kernel:  ? ___sys_sendmsg+0x295/0x2f0<br>
&gt;    kernel:  ? ebitmap_cmp+0x6d/0x90<br>
&gt;    kernel:  ? dev_get_by_name_rcu+0x73/0x90<br>
&gt;    kernel:  ? skb_dequeue+0x52/0x60<br>
&gt;    kernel:  ? __inode_wait_for_writeback+0x7f/0xf0<br>
&gt;    kernel:  ? bit_waitqueue+0x30/0x30<br>
&gt;    kernel:  ? fsnotify_grab_connector+0x3c/0x60<br>
&gt;    kernel:  ? __sys_sendmsg+0x51/0x90<br>
&gt;    kernel:  ? do_syscall_64+0x74/0x180<br>
&gt;    kernel:  ? entry_SYSCALL_64_after_hwframe+0x3d/0xa2<br>
&gt;    kernel: Code: 00 00 48 89 87 00 02 00 00 8b 87 a0 01 00 00 85 c0 0f 84 84 00 00 00<br>
&gt; 31 ed 48 63 dd 83 c5 01 48 c1 e3 06 49 03 9c 24 90 01 00 00 &lt;48&gt; 8b 73 08 48 8b 3b e8<br>
&gt; 6c 9a 4f f6 48 8d 43 10 48 c7 03 00 00<br>
&gt;    kernel: RIP: fq_codel_reset+0x58/0xd0 [sch_fq_codel] RSP: ffffbfbf4c1fb620<br>
&gt;    kernel: CR2: 0000000000000008<br>
&gt;    kernel: ---[ end trace e81a62bede66274e ]---<br>
&gt; <br>
&gt; This occurs because if fq_codel_init fails, it has left the private data<br>
&gt; in an incomplete state. For example, if tcf_block_get fails, (as in the<br>
&gt; above panic), then q-&gt;flows and q-&gt;backlogs will be NULL. Thus they will<br>
&gt; cause NULL pointer access when attempting to reset them in<br>
&gt; fq_codel_reset.<br>
&gt; <br>
&gt; We could mitigate some of these issues by changing fq_codel_init to more<br>
&gt; explicitly cleanup after itself when failing. For example, we could<br>
&gt; ensure that q-&gt;flowcnt was set to 0 so that the loop over each flow in<br>
&gt; fq_codel_reset would not trigger. However, this would not prevent a NULL<br>
&gt; pointer dereference when attempting to memset the q-&gt;backlogs.<br>
&gt; <br>
&gt; Instead, just add a NULL check prior to attempting to reset these<br>
&gt; fields.<br>
&gt; <br>
&gt; Signed-off-by: Jacob Keller &lt;<a href="mailto:jacob.e.keller@intel.com" target="_blank">jacob.e.keller@intel.com</a>&gt;<br>
&gt; Cc: Eric Dumazet &lt;<a href="mailto:edumazet@google.com" target="_blank">edumazet@google.com</a>&gt;<br>
&gt; ---<br>
&gt;  net/sched/sch_fq_codel.c | 15 +++++++++------<br>
&gt;  1 file changed, 9 insertions(+), 6 deletions(-)<br>
&gt; <br>
&gt; diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c<br>
&gt; index 22fa13cf5d8b..1658c314ee40 100644<br>
&gt; --- a/net/sched/sch_fq_codel.c<br>
&gt; +++ b/net/sched/sch_fq_codel.c<br>
&gt; @@ -352,14 +352,17 @@ static void fq_codel_reset(struct Qdisc *sch)<br>
&gt; <br>
&gt;       INIT_LIST_HEAD(&amp;q-&gt;new_flows);<br>
&gt;       INIT_LIST_HEAD(&amp;q-&gt;old_flows);<br>
&gt; -     for (i = 0; i &lt; q-&gt;flows_cnt; i++) {<br>
&gt; -             struct fq_codel_flow *flow = q-&gt;flows + i;<br>
&gt; +     if (q-&gt;flows) {<br>
&gt; +             for (i = 0; i &lt; q-&gt;flows_cnt; i++) {<br>
&gt; +                     struct fq_codel_flow *flow = q-&gt;flows + i;<br>
&gt; <br>
&gt; -             fq_codel_flow_purge(flow);<br>
&gt; -             INIT_LIST_HEAD(&amp;flow-&gt;flowchain);<br>
&gt; -             codel_vars_init(&amp;flow-&gt;cvars);<br>
&gt; +                     fq_codel_flow_purge(flow);<br>
&gt; +                     INIT_LIST_HEAD(&amp;flow-&gt;flowchain);<br>
&gt; +                     codel_vars_init(&amp;flow-&gt;cvars);<br>
&gt; +             }<br>
&gt;       }<br>
&gt; -     memset(q-&gt;backlogs, 0, q-&gt;flows_cnt * sizeof(u32));<br>
&gt; +     if (q-&gt;backlogs)<br>
&gt; +             memset(q-&gt;backlogs, 0, q-&gt;flows_cnt * sizeof(u32));<br>
&gt;       sch-&gt;q.qlen = 0;<br>
&gt;       sch-&gt;qstats.backlog = 0;<br>
&gt;       q-&gt;memory_usage = 0;<br>
&gt; --<br>
&gt; 2.18.0.rc1.134.g5f29118f3507<br>
<br>
</blockquote></div>
Jacob Keller June 8, 2018, 3:19 p.m. UTC | #3
> -----Original Message-----
> From: Eric Dumazet [mailto:edumazet@google.com]
> Sent: Friday, June 08, 2018 8:01 AM
> To: Keller, Jacob E <jacob.e.keller@intel.com>
> Cc: intel-wired-lan@lists.osuosl.org; Kirsher, Jeffrey T
> <jeffrey.t.kirsher@intel.com>
> Subject: Re: [PATCH] fq_codel: fix NULL pointer deref in fq_codel_reset
> 
> Not sure this is the right fix.
> 
> It is very strange that
> 
> ->init()   (failure) -> reset()  ,  then ->destroy()
> 
> Why reset() is called before a destroy, I have no idea.
> 

I don't understand either, and I'm not sure that this is the correct fix, but it seemed like the most sure way to avoid panics.

> Anyway you must send patches to netdev@
> 

Jeff forwards them to netdev after it's passed testing here.

Thanks,
Jake

> Thanks.
> 
> 
> On Thu, Jun 7, 2018 at 5:08 PM Keller, Jacob E <jacob.e.keller@intel.com
> <mailto:jacob.e.keller@intel.com> > wrote:
> 
> 
> 	> -----Original Message-----
> 	> From: Keller, Jacob E
> 	> Sent: Thursday, June 07, 2018 5:07 PM
> 	> To: Intel Wired LAN <intel-wired-lan@lists.osuosl.org <mailto:intel-
> wired-lan@lists.osuosl.org> >
> 	> Cc: Kirsher, Jeffrey T <jeffrey.t.kirsher@intel.com
> <mailto:jeffrey.t.kirsher@intel.com> >; Keller, Jacob E
> 	> <jacob.e.keller@intel.com <mailto:jacob.e.keller@intel.com> >; Eric
> Dumazet <edumazet@google.com <mailto:edumazet@google.com> >
> 	> Subject: [PATCH] fq_codel: fix NULL pointer deref in fq_codel_reset
> 	>
> 
> 	This should be targeting [net] as that's where I built it, as it is a bug fix. I
> forgot to set the subject prefix when sending the mail.
> 
> 	Thanks,
> 	Jake
> 
> 	> The function qdisc_create_dftl attempts to create a default qdisc. If
> 	> this fails, it calls qdisc_destroy when cleaning up. The qdisc_destroy
> 	> function calls the ->reset op on the qdisc.
> 	>
> 	> In the case of sch_fq_codel.c, this function will panic when the qdisc
> 	> wasn't properly initialized:
> 	>
> 	>    kernel: BUG: unable to handle kernel NULL pointer dereference at
> 	> 0000000000000008
> 	>    kernel: IP: fq_codel_reset+0x58/0xd0 [sch_fq_codel]
> 	>    kernel: PGD 0 P4D 0
> 	>    kernel: Oops: 0000 [#1] SMP PTI
> 	>    kernel: Modules linked in: i40iw i40e(OE) xt_CHECKSUM
> iptable_mangle
> 	> ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4
> nf_nat
> 	> nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack tun
> bridge stp llc
> 	> devlink ebtable_filter ebtables ip6table_filter ip6_tables rpcrdma
> ib_isert
> 	> iscsi_target_mod sunrpc ib_iser libiscsi scsi_transport_iscsi ib_srpt
> 	> target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm
> 	> ib_uverbs ib_umad rdma_cm ib_cm iw_cm intel_rapl sb_edac
> 	> x86_pkg_temp_thermal intel_powerclamp coretemp kvm irqbypass
> 	> crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_cstate
> iTCO_wdt
> 	> iTCO_vendor_support intel_uncore ib_core intel_rapl_perf mei_me
> mei joydev
> 	> i2c_i801 lpc_ich ioatdma shpchp wmi sch_fq_codel xfs libcrc32c
> mgag200 ixgbe
> 	> drm_kms_helper isci ttm firewire_ohci
> 	>    kernel:  mdio drm igb libsas crc32c_intel firewire_core ptp pps_core
> 	> scsi_transport_sas crc_itu_t dca i2c_algo_bit ipmi_si ipmi_devintf
> 	> ipmi_msghandler [last unloaded: i40e]
> 	>    kernel: CPU: 10 PID: 4219 Comm: ip Tainted: G           OE
> 4.16.13custom-fq-
> 	> codel-test+ #3
> 	>    kernel: Hardware name: Intel Corporation S2600CO/S2600CO, BIOS
> 	> SE5C600.86B.02.05.0004.051120151007 05/11/2015
> 	>    kernel: RIP: 0010:fq_codel_reset+0x58/0xd0 [sch_fq_codel]
> 	>    kernel: RSP: 0018:ffffbfbf4c1fb620 EFLAGS: 00010246
> 	>    kernel: RAX: 0000000000000400 RBX: 0000000000000000 RCX:
> 00000000000005b9
> 	>    kernel: RDX: 0000000000000000 RSI: ffff9d03264a60c0 RDI:
> ffff9cfd17b31c00
> 	>    kernel: RBP: 0000000000000001 R08: 00000000000260c0 R09:
> ffffffffb679c3e9
> 	>    kernel: R10: fffff1dab06a0e80 R11: ffff9cfd163af800 R12:
> ffff9cfd17b31c00
> 	>    kernel: R13: 0000000000000001 R14: ffff9cfd153de600 R15:
> 0000000000000001
> 	>    kernel: FS:  00007fdec2f92800(0000) GS:ffff9d0326480000(0000)
> 	> knlGS:0000000000000000
> 	>    kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> 	>    kernel: CR2: 0000000000000008 CR3: 0000000c1956a006 CR4:
> 00000000000606e0
> 	>    kernel: Call Trace:
> 	>    kernel:  qdisc_destroy+0x56/0x140
> 	>    kernel:  qdisc_create_dflt+0x8b/0xb0
> 	>    kernel:  mq_init+0xc1/0xf0
> 	>    kernel:  qdisc_create_dflt+0x5a/0xb0
> 	>    kernel:  dev_activate+0x205/0x230
> 	>    kernel:  __dev_open+0xf5/0x160
> 	>    kernel:  __dev_change_flags+0x1a3/0x210
> 	>    kernel:  dev_change_flags+0x21/0x60
> 	>    kernel:  do_setlink+0x660/0xdf0
> 	>    kernel:  ? down_trylock+0x25/0x30
> 	>    kernel:  ? xfs_buf_trylock+0x1a/0xd0 [xfs]
> 	>    kernel:  ? rtnl_newlink+0x816/0x990
> 	>    kernel:  ? _xfs_buf_find+0x327/0x580 [xfs]
> 	>    kernel:  ? _cond_resched+0x15/0x30
> 	>    kernel:  ? kmem_cache_alloc+0x20/0x1b0
> 	>    kernel:  ? rtnetlink_rcv_msg+0x200/0x2f0
> 	>    kernel:  ? rtnl_calcit.isra.30+0x100/0x100
> 	>    kernel:  ? netlink_rcv_skb+0x4c/0x120
> 	>    kernel:  ? netlink_unicast+0x19e/0x260
> 	>    kernel:  ? netlink_sendmsg+0x1ff/0x3c0
> 	>    kernel:  ? sock_sendmsg+0x36/0x40
> 	>    kernel:  ? ___sys_sendmsg+0x295/0x2f0
> 	>    kernel:  ? ebitmap_cmp+0x6d/0x90
> 	>    kernel:  ? dev_get_by_name_rcu+0x73/0x90
> 	>    kernel:  ? skb_dequeue+0x52/0x60
> 	>    kernel:  ? __inode_wait_for_writeback+0x7f/0xf0
> 	>    kernel:  ? bit_waitqueue+0x30/0x30
> 	>    kernel:  ? fsnotify_grab_connector+0x3c/0x60
> 	>    kernel:  ? __sys_sendmsg+0x51/0x90
> 	>    kernel:  ? do_syscall_64+0x74/0x180
> 	>    kernel:  ? entry_SYSCALL_64_after_hwframe+0x3d/0xa2
> 	>    kernel: Code: 00 00 48 89 87 00 02 00 00 8b 87 a0 01 00 00 85 c0 0f 84 84
> 00 00 00
> 	> 31 ed 48 63 dd 83 c5 01 48 c1 e3 06 49 03 9c 24 90 01 00 00 <48> 8b 73 08
> 48 8b 3b e8
> 	> 6c 9a 4f f6 48 8d 43 10 48 c7 03 00 00
> 	>    kernel: RIP: fq_codel_reset+0x58/0xd0 [sch_fq_codel] RSP:
> ffffbfbf4c1fb620
> 	>    kernel: CR2: 0000000000000008
> 	>    kernel: ---[ end trace e81a62bede66274e ]---
> 	>
> 	> This occurs because if fq_codel_init fails, it has left the private data
> 	> in an incomplete state. For example, if tcf_block_get fails, (as in the
> 	> above panic), then q->flows and q->backlogs will be NULL. Thus they
> will
> 	> cause NULL pointer access when attempting to reset them in
> 	> fq_codel_reset.
> 	>
> 	> We could mitigate some of these issues by changing fq_codel_init to
> more
> 	> explicitly cleanup after itself when failing. For example, we could
> 	> ensure that q->flowcnt was set to 0 so that the loop over each flow in
> 	> fq_codel_reset would not trigger. However, this would not prevent a
> NULL
> 	> pointer dereference when attempting to memset the q->backlogs.
> 	>
> 	> Instead, just add a NULL check prior to attempting to reset these
> 	> fields.
> 	>
> 	> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com
> <mailto:jacob.e.keller@intel.com> >
> 	> Cc: Eric Dumazet <edumazet@google.com
> <mailto:edumazet@google.com> >
> 	> ---
> 	>  net/sched/sch_fq_codel.c | 15 +++++++++------
> 	>  1 file changed, 9 insertions(+), 6 deletions(-)
> 	>
> 	> diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c
> 	> index 22fa13cf5d8b..1658c314ee40 100644
> 	> --- a/net/sched/sch_fq_codel.c
> 	> +++ b/net/sched/sch_fq_codel.c
> 	> @@ -352,14 +352,17 @@ static void fq_codel_reset(struct Qdisc *sch)
> 	>
> 	>       INIT_LIST_HEAD(&q->new_flows);
> 	>       INIT_LIST_HEAD(&q->old_flows);
> 	> -     for (i = 0; i < q->flows_cnt; i++) {
> 	> -             struct fq_codel_flow *flow = q->flows + i;
> 	> +     if (q->flows) {
> 	> +             for (i = 0; i < q->flows_cnt; i++) {
> 	> +                     struct fq_codel_flow *flow = q->flows + i;
> 	>
> 	> -             fq_codel_flow_purge(flow);
> 	> -             INIT_LIST_HEAD(&flow->flowchain);
> 	> -             codel_vars_init(&flow->cvars);
> 	> +                     fq_codel_flow_purge(flow);
> 	> +                     INIT_LIST_HEAD(&flow->flowchain);
> 	> +                     codel_vars_init(&flow->cvars);
> 	> +             }
> 	>       }
> 	> -     memset(q->backlogs, 0, q->flows_cnt * sizeof(u32));
> 	> +     if (q->backlogs)
> 	> +             memset(q->backlogs, 0, q->flows_cnt * sizeof(u32));
> 	>       sch->q.qlen = 0;
> 	>       sch->qstats.backlog = 0;
> 	>       q->memory_usage = 0;
> 	> --
> 	> 2.18.0.rc1.134.g5f29118f3507
> 
>
Bowers, AndrewX June 8, 2018, 5:54 p.m. UTC | #4
> -----Original Message-----
> From: Intel-wired-lan [mailto:intel-wired-lan-bounces@osuosl.org] On
> Behalf Of Jacob Keller
> Sent: Thursday, June 7, 2018 5:07 PM
> To: Intel Wired LAN <intel-wired-lan@lists.osuosl.org>
> Cc: Eric Dumazet <edumazet@google.com>
> Subject: [Intel-wired-lan] [PATCH] fq_codel: fix NULL pointer deref in
> fq_codel_reset
> 
> The function qdisc_create_dftl attempts to create a default qdisc. If this fails,
> it calls qdisc_destroy when cleaning up. The qdisc_destroy function calls the -
> >reset op on the qdisc.
> 
> In the case of sch_fq_codel.c, this function will panic when the qdisc wasn't
> properly initialized:
> 
>    kernel: BUG: unable to handle kernel NULL pointer dereference at
> 0000000000000008
>    kernel: IP: fq_codel_reset+0x58/0xd0 [sch_fq_codel]
>    kernel: PGD 0 P4D 0
>    kernel: Oops: 0000 [#1] SMP PTI
>    kernel: Modules linked in: i40iw i40e(OE) xt_CHECKSUM iptable_mangle
> ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat
> nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack tun bridge stp
> llc devlink ebtable_filter ebtables ip6table_filter ip6_tables rpcrdma ib_isert
> iscsi_target_mod sunrpc ib_iser libiscsi scsi_transport_iscsi ib_srpt
> target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm
> ib_uverbs ib_umad rdma_cm ib_cm iw_cm intel_rapl sb_edac
> x86_pkg_temp_thermal intel_powerclamp coretemp kvm irqbypass
> crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_cstate iTCO_wdt
> iTCO_vendor_support intel_uncore ib_core intel_rapl_perf mei_me mei
> joydev i2c_i801 lpc_ich ioatdma shpchp wmi sch_fq_codel xfs libcrc32c
> mgag200 ixgbe drm_kms_helper isci ttm firewire_ohci
>    kernel:  mdio drm igb libsas crc32c_intel firewire_core ptp pps_core
> scsi_transport_sas crc_itu_t dca i2c_algo_bit ipmi_si ipmi_devintf
> ipmi_msghandler [last unloaded: i40e]
>    kernel: CPU: 10 PID: 4219 Comm: ip Tainted: G           OE    4.16.13custom-fq-
> codel-test+ #3
>    kernel: Hardware name: Intel Corporation S2600CO/S2600CO, BIOS
> SE5C600.86B.02.05.0004.051120151007 05/11/2015
>    kernel: RIP: 0010:fq_codel_reset+0x58/0xd0 [sch_fq_codel]
>    kernel: RSP: 0018:ffffbfbf4c1fb620 EFLAGS: 00010246
>    kernel: RAX: 0000000000000400 RBX: 0000000000000000 RCX:
> 00000000000005b9
>    kernel: RDX: 0000000000000000 RSI: ffff9d03264a60c0 RDI: ffff9cfd17b31c00
>    kernel: RBP: 0000000000000001 R08: 00000000000260c0 R09: ffffffffb679c3e9
>    kernel: R10: fffff1dab06a0e80 R11: ffff9cfd163af800 R12: ffff9cfd17b31c00
>    kernel: R13: 0000000000000001 R14: ffff9cfd153de600 R15:
> 0000000000000001
>    kernel: FS:  00007fdec2f92800(0000) GS:ffff9d0326480000(0000)
> knlGS:0000000000000000
>    kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>    kernel: CR2: 0000000000000008 CR3: 0000000c1956a006 CR4:
> 00000000000606e0
>    kernel: Call Trace:
>    kernel:  qdisc_destroy+0x56/0x140
>    kernel:  qdisc_create_dflt+0x8b/0xb0
>    kernel:  mq_init+0xc1/0xf0
>    kernel:  qdisc_create_dflt+0x5a/0xb0
>    kernel:  dev_activate+0x205/0x230
>    kernel:  __dev_open+0xf5/0x160
>    kernel:  __dev_change_flags+0x1a3/0x210
>    kernel:  dev_change_flags+0x21/0x60
>    kernel:  do_setlink+0x660/0xdf0
>    kernel:  ? down_trylock+0x25/0x30
>    kernel:  ? xfs_buf_trylock+0x1a/0xd0 [xfs]
>    kernel:  ? rtnl_newlink+0x816/0x990
>    kernel:  ? _xfs_buf_find+0x327/0x580 [xfs]
>    kernel:  ? _cond_resched+0x15/0x30
>    kernel:  ? kmem_cache_alloc+0x20/0x1b0
>    kernel:  ? rtnetlink_rcv_msg+0x200/0x2f0
>    kernel:  ? rtnl_calcit.isra.30+0x100/0x100
>    kernel:  ? netlink_rcv_skb+0x4c/0x120
>    kernel:  ? netlink_unicast+0x19e/0x260
>    kernel:  ? netlink_sendmsg+0x1ff/0x3c0
>    kernel:  ? sock_sendmsg+0x36/0x40
>    kernel:  ? ___sys_sendmsg+0x295/0x2f0
>    kernel:  ? ebitmap_cmp+0x6d/0x90
>    kernel:  ? dev_get_by_name_rcu+0x73/0x90
>    kernel:  ? skb_dequeue+0x52/0x60
>    kernel:  ? __inode_wait_for_writeback+0x7f/0xf0
>    kernel:  ? bit_waitqueue+0x30/0x30
>    kernel:  ? fsnotify_grab_connector+0x3c/0x60
>    kernel:  ? __sys_sendmsg+0x51/0x90
>    kernel:  ? do_syscall_64+0x74/0x180
>    kernel:  ? entry_SYSCALL_64_after_hwframe+0x3d/0xa2
>    kernel: Code: 00 00 48 89 87 00 02 00 00 8b 87 a0 01 00 00 85 c0 0f 84 84 00 00
> 00 31 ed 48 63 dd 83 c5 01 48 c1 e3 06 49 03 9c 24 90 01 00 00 <48> 8b 73 08 48
> 8b 3b e8 6c 9a 4f f6 48 8d 43 10 48 c7 03 00 00
>    kernel: RIP: fq_codel_reset+0x58/0xd0 [sch_fq_codel] RSP:
> ffffbfbf4c1fb620
>    kernel: CR2: 0000000000000008
>    kernel: ---[ end trace e81a62bede66274e ]---
> 
> This occurs because if fq_codel_init fails, it has left the private data in an
> incomplete state. For example, if tcf_block_get fails, (as in the above panic),
> then q->flows and q->backlogs will be NULL. Thus they will cause NULL
> pointer access when attempting to reset them in fq_codel_reset.
> 
> We could mitigate some of these issues by changing fq_codel_init to more
> explicitly cleanup after itself when failing. For example, we could ensure that
> q->flowcnt was set to 0 so that the loop over each flow in fq_codel_reset
> would not trigger. However, this would not prevent a NULL pointer
> dereference when attempting to memset the q->backlogs.
> 
> Instead, just add a NULL check prior to attempting to reset these fields.
> 
> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> Cc: Eric Dumazet <edumazet@google.com>
> ---
>  net/sched/sch_fq_codel.c | 15 +++++++++------
>  1 file changed, 9 insertions(+), 6 deletions(-)

Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Jacob Keller July 2, 2018, 11:27 p.m. UTC | #5
> -----Original Message-----
> From: Eric Dumazet [mailto:edumazet@google.com]
> Sent: Friday, June 08, 2018 8:01 AM
> To: Keller, Jacob E <jacob.e.keller@intel.com>
> Cc: intel-wired-lan@lists.osuosl.org; Kirsher, Jeffrey T
> <jeffrey.t.kirsher@intel.com>
> Subject: Re: [PATCH] fq_codel: fix NULL pointer deref in fq_codel_reset
> 
> Not sure this is the right fix.
> 
> It is very strange that
> 
> ->init()   (failure) -> reset()  ,  then ->destroy()
> 
> Why reset() is called before a destroy, I have no idea.
> 
> Anyway you must send patches to netdev@
> 
> Thanks.

If you have a better suggestion, I'm all ears.

I've heard that we should set qflows_cnt to 0 instead of checking the null pointer directly. I'm fine with that solution too.

I'd be fine removing the reset() call also, if you think that's better. I don't know if that would adversely impact the other schedulers though.

Thanks,
Jake
diff mbox series

Patch

diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c
index 22fa13cf5d8b..1658c314ee40 100644
--- a/net/sched/sch_fq_codel.c
+++ b/net/sched/sch_fq_codel.c
@@ -352,14 +352,17 @@  static void fq_codel_reset(struct Qdisc *sch)
 
 	INIT_LIST_HEAD(&q->new_flows);
 	INIT_LIST_HEAD(&q->old_flows);
-	for (i = 0; i < q->flows_cnt; i++) {
-		struct fq_codel_flow *flow = q->flows + i;
+	if (q->flows) {
+		for (i = 0; i < q->flows_cnt; i++) {
+			struct fq_codel_flow *flow = q->flows + i;
 
-		fq_codel_flow_purge(flow);
-		INIT_LIST_HEAD(&flow->flowchain);
-		codel_vars_init(&flow->cvars);
+			fq_codel_flow_purge(flow);
+			INIT_LIST_HEAD(&flow->flowchain);
+			codel_vars_init(&flow->cvars);
+		}
 	}
-	memset(q->backlogs, 0, q->flows_cnt * sizeof(u32));
+	if (q->backlogs)
+		memset(q->backlogs, 0, q->flows_cnt * sizeof(u32));
 	sch->q.qlen = 0;
 	sch->qstats.backlog = 0;
 	q->memory_usage = 0;