diff mbox series

tipc: call tipc_rcv() only if bearer is up in tipc_udp_recv()

Message ID 20171128125315.25334-1-tommi.t.rantala@nokia.com
State Changes Requested, archived
Delegated to: David Miller
Headers show
Series tipc: call tipc_rcv() only if bearer is up in tipc_udp_recv() | expand

Commit Message

Tommi Rantala Nov. 28, 2017, 12:53 p.m. UTC
Call tipc_rcv() only if bearer is up in tipc_udp_recv().
Fixes a rare TIPC div-by-zero crash in tipc_node_calculate_timer():

We're enabling a bearer, but it's not yet up and fully initialized.
At the same time we receive a discovery packet, and in tipc_udp_recv()
we end up calling tipc_rcv() with the not-yet-initialized bearer,
causing later a div-by-zero crash in tipc_node_calculate_timer().

[   12.590450] Own node address <1.1.1>, network identity 1
[   12.668088] divide error: 0000 [#1] SMP
[   12.676952] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.14.2-dirty #1
[   12.679225] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014
[   12.682095] task: ffff8c2a761edb80 task.stack: ffffa41cc0cac000
[   12.684087] RIP: 0010:tipc_node_calculate_timer.isra.12+0x45/0x60 [tipc]
[   12.686486] RSP: 0018:ffff8c2a7fc838a0 EFLAGS: 00010246
[   12.688451] RAX: 0000000000000000 RBX: ffff8c2a5b382600 RCX: 0000000000000000
[   12.691197] RDX: 0000000000000000 RSI: ffff8c2a5b382600 RDI: ffff8c2a5b382600
[   12.693945] RBP: ffff8c2a7fc838b0 R08: 0000000000000001 R09: 0000000000000001
[   12.696632] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8c2a5d8949d8
[   12.699491] R13: ffffffff95ede400 R14: 0000000000000000 R15: ffff8c2a5d894800
[   12.702338] FS:  0000000000000000(0000) GS:ffff8c2a7fc80000(0000) knlGS:0000000000000000
[   12.705099] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   12.706776] CR2: 0000000001bb9440 CR3: 00000000bd009001 CR4: 00000000003606e0
[   12.708847] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   12.711016] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   12.712627] Call Trace:
[   12.713390]  <IRQ>
[   12.714011]  tipc_node_check_dest+0x2e8/0x350 [tipc]
[   12.715286]  tipc_disc_rcv+0x14d/0x1d0 [tipc]
[   12.716370]  tipc_rcv+0x8b0/0xd40 [tipc]
[   12.717396]  ? minmax_running_min+0x2f/0x60
[   12.718248]  ? dst_alloc+0x4c/0xa0
[   12.718964]  ? tcp_ack+0xaf1/0x10b0
[   12.719658]  ? tipc_udp_is_known_peer+0xa0/0xa0 [tipc]
[   12.720634]  tipc_udp_recv+0x71/0x1d0 [tipc]
[   12.721459]  ? dst_alloc+0x4c/0xa0
[   12.722130]  udp_queue_rcv_skb+0x264/0x490
[   12.722924]  __udp4_lib_rcv+0x21e/0x990
[   12.723670]  ? ip_route_input_rcu+0x2dd/0xbf0
[   12.724442]  ? tcp_v4_rcv+0x958/0xa40
[   12.725039]  udp_rcv+0x1a/0x20
[   12.725587]  ip_local_deliver_finish+0x97/0x1d0
[   12.726323]  ip_local_deliver+0xaf/0xc0
[   12.726959]  ? ip_route_input_noref+0x19/0x20
[   12.727689]  ip_rcv_finish+0xdd/0x3b0
[   12.728307]  ip_rcv+0x2ac/0x360
[   12.728839]  __netif_receive_skb_core+0x6fb/0xa90
[   12.729580]  ? udp4_gro_receive+0x1a7/0x2c0
[   12.730274]  __netif_receive_skb+0x1d/0x60
[   12.730953]  ? __netif_receive_skb+0x1d/0x60
[   12.731637]  netif_receive_skb_internal+0x37/0xd0
[   12.732371]  napi_gro_receive+0xc7/0xf0
[   12.732920]  receive_buf+0x3c3/0xd40
[   12.733441]  virtnet_poll+0xb1/0x250
[   12.733944]  net_rx_action+0x23e/0x370
[   12.734476]  __do_softirq+0xc5/0x2f8
[   12.734922]  irq_exit+0xfa/0x100
[   12.735315]  do_IRQ+0x4f/0xd0
[   12.735680]  common_interrupt+0xa2/0xa2
[   12.736126]  </IRQ>
[   12.736416] RIP: 0010:native_safe_halt+0x6/0x10
[   12.736925] RSP: 0018:ffffa41cc0cafe90 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff4d
[   12.737756] RAX: 0000000000000000 RBX: ffff8c2a761edb80 RCX: 0000000000000000
[   12.738504] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[   12.739258] RBP: ffffa41cc0cafe90 R08: 0000014b5b9795e5 R09: ffffa41cc12c7e88
[   12.740118] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002
[   12.740964] R13: ffff8c2a761edb80 R14: 0000000000000000 R15: 0000000000000000
[   12.741831]  default_idle+0x2a/0x100
[   12.742323]  arch_cpu_idle+0xf/0x20
[   12.742796]  default_idle_call+0x28/0x40
[   12.743312]  do_idle+0x179/0x1f0
[   12.743761]  cpu_startup_entry+0x1d/0x20
[   12.744291]  start_secondary+0x112/0x120
[   12.744816]  secondary_startup_64+0xa5/0xa5
[   12.745367] Code: b9 f4 01 00 00 48 89 c2 48 c1 ea 02 48 3d d3 07 00
00 48 0f 47 d1 49 8b 0c 24 48 39 d1 76 07 49 89 14 24 48 89 d1 31 d2 48
89 df <48> f7 f1 89 c6 e8 81 6e ff ff 5b 41 5c 5d c3 66 90 66 2e 0f 1f
[   12.747527] RIP: tipc_node_calculate_timer.isra.12+0x45/0x60 [tipc] RSP: ffff8c2a7fc838a0
[   12.748555] ---[ end trace 1399ab83390650fd ]---
[   12.749296] Kernel panic - not syncing: Fatal exception in interrupt
[   12.750123] Kernel Offset: 0x13200000 from 0xffffffff82000000
(relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[   12.751215] Rebooting in 60 seconds..

Fixes: c9b64d492b1f ("tipc: add replicast peer discovery")
Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
---
 net/tipc/udp_media.c | 29 +++++++----------------------
 1 file changed, 7 insertions(+), 22 deletions(-)

Comments

Jon Maloy Nov. 28, 2017, 1:15 p.m. UTC | #1
Acked.

///jon

> -----Original Message-----
> From: Tommi Rantala [mailto:tommi.t.rantala@nokia.com]
> Sent: Tuesday, November 28, 2017 07:53
> To: Jon Maloy <jon.maloy@ericsson.com>; Ying Xue
> <ying.xue@windriver.com>; David S. Miller <davem@davemloft.net>;
> netdev@vger.kernel.org; tipc-discussion@lists.sourceforge.net; linux-
> kernel@vger.kernel.org
> Cc: Tommi Rantala <tommi.t.rantala@nokia.com>
> Subject: [PATCH] tipc: call tipc_rcv() only if bearer is up in tipc_udp_recv()
> 
> Call tipc_rcv() only if bearer is up in tipc_udp_recv().
> Fixes a rare TIPC div-by-zero crash in tipc_node_calculate_timer():
> 
> We're enabling a bearer, but it's not yet up and fully initialized.
> At the same time we receive a discovery packet, and in tipc_udp_recv() we
> end up calling tipc_rcv() with the not-yet-initialized bearer, causing later a
> div-by-zero crash in tipc_node_calculate_timer().
> 
> [   12.590450] Own node address <1.1.1>, network identity 1
> [   12.668088] divide error: 0000 [#1] SMP
> [   12.676952] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.14.2-dirty #1
> [   12.679225] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> 1.10.2-2.fc27 04/01/2014
> [   12.682095] task: ffff8c2a761edb80 task.stack: ffffa41cc0cac000
> [   12.684087] RIP: 0010:tipc_node_calculate_timer.isra.12+0x45/0x60 [tipc]
> [   12.686486] RSP: 0018:ffff8c2a7fc838a0 EFLAGS: 00010246
> [   12.688451] RAX: 0000000000000000 RBX: ffff8c2a5b382600 RCX:
> 0000000000000000
> [   12.691197] RDX: 0000000000000000 RSI: ffff8c2a5b382600 RDI:
> ffff8c2a5b382600
> [   12.693945] RBP: ffff8c2a7fc838b0 R08: 0000000000000001 R09:
> 0000000000000001
> [   12.696632] R10: 0000000000000000 R11: 0000000000000000 R12:
> ffff8c2a5d8949d8
> [   12.699491] R13: ffffffff95ede400 R14: 0000000000000000 R15:
> ffff8c2a5d894800
> [   12.702338] FS:  0000000000000000(0000) GS:ffff8c2a7fc80000(0000)
> knlGS:0000000000000000
> [   12.705099] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   12.706776] CR2: 0000000001bb9440 CR3: 00000000bd009001 CR4:
> 00000000003606e0
> [   12.708847] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [   12.711016] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
> 0000000000000400
> [   12.712627] Call Trace:
> [   12.713390]  <IRQ>
> [   12.714011]  tipc_node_check_dest+0x2e8/0x350 [tipc]
> [   12.715286]  tipc_disc_rcv+0x14d/0x1d0 [tipc]
> [   12.716370]  tipc_rcv+0x8b0/0xd40 [tipc]
> [   12.717396]  ? minmax_running_min+0x2f/0x60
> [   12.718248]  ? dst_alloc+0x4c/0xa0
> [   12.718964]  ? tcp_ack+0xaf1/0x10b0
> [   12.719658]  ? tipc_udp_is_known_peer+0xa0/0xa0 [tipc]
> [   12.720634]  tipc_udp_recv+0x71/0x1d0 [tipc]
> [   12.721459]  ? dst_alloc+0x4c/0xa0
> [   12.722130]  udp_queue_rcv_skb+0x264/0x490
> [   12.722924]  __udp4_lib_rcv+0x21e/0x990
> [   12.723670]  ? ip_route_input_rcu+0x2dd/0xbf0
> [   12.724442]  ? tcp_v4_rcv+0x958/0xa40
> [   12.725039]  udp_rcv+0x1a/0x20
> [   12.725587]  ip_local_deliver_finish+0x97/0x1d0
> [   12.726323]  ip_local_deliver+0xaf/0xc0
> [   12.726959]  ? ip_route_input_noref+0x19/0x20
> [   12.727689]  ip_rcv_finish+0xdd/0x3b0
> [   12.728307]  ip_rcv+0x2ac/0x360
> [   12.728839]  __netif_receive_skb_core+0x6fb/0xa90
> [   12.729580]  ? udp4_gro_receive+0x1a7/0x2c0
> [   12.730274]  __netif_receive_skb+0x1d/0x60
> [   12.730953]  ? __netif_receive_skb+0x1d/0x60
> [   12.731637]  netif_receive_skb_internal+0x37/0xd0
> [   12.732371]  napi_gro_receive+0xc7/0xf0
> [   12.732920]  receive_buf+0x3c3/0xd40
> [   12.733441]  virtnet_poll+0xb1/0x250
> [   12.733944]  net_rx_action+0x23e/0x370
> [   12.734476]  __do_softirq+0xc5/0x2f8
> [   12.734922]  irq_exit+0xfa/0x100
> [   12.735315]  do_IRQ+0x4f/0xd0
> [   12.735680]  common_interrupt+0xa2/0xa2
> [   12.736126]  </IRQ>
> [   12.736416] RIP: 0010:native_safe_halt+0x6/0x10
> [   12.736925] RSP: 0018:ffffa41cc0cafe90 EFLAGS: 00000246 ORIG_RAX:
> ffffffffffffff4d
> [   12.737756] RAX: 0000000000000000 RBX: ffff8c2a761edb80 RCX:
> 0000000000000000
> [   12.738504] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
> 0000000000000000
> [   12.739258] RBP: ffffa41cc0cafe90 R08: 0000014b5b9795e5 R09:
> ffffa41cc12c7e88
> [   12.740118] R10: 0000000000000000 R11: 0000000000000000 R12:
> 0000000000000002
> [   12.740964] R13: ffff8c2a761edb80 R14: 0000000000000000 R15:
> 0000000000000000
> [   12.741831]  default_idle+0x2a/0x100
> [   12.742323]  arch_cpu_idle+0xf/0x20
> [   12.742796]  default_idle_call+0x28/0x40
> [   12.743312]  do_idle+0x179/0x1f0
> [   12.743761]  cpu_startup_entry+0x1d/0x20
> [   12.744291]  start_secondary+0x112/0x120
> [   12.744816]  secondary_startup_64+0xa5/0xa5
> [   12.745367] Code: b9 f4 01 00 00 48 89 c2 48 c1 ea 02 48 3d d3 07 00
> 00 48 0f 47 d1 49 8b 0c 24 48 39 d1 76 07 49 89 14 24 48 89 d1 31 d2 48
> 89 df <48> f7 f1 89 c6 e8 81 6e ff ff 5b 41 5c 5d c3 66 90 66 2e 0f 1f
> [   12.747527] RIP: tipc_node_calculate_timer.isra.12+0x45/0x60 [tipc] RSP:
> ffff8c2a7fc838a0
> [   12.748555] ---[ end trace 1399ab83390650fd ]---
> [   12.749296] Kernel panic - not syncing: Fatal exception in interrupt
> [   12.750123] Kernel Offset: 0x13200000 from 0xffffffff82000000
> (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> [   12.751215] Rebooting in 60 seconds..
> 
> Fixes: c9b64d492b1f ("tipc: add replicast peer discovery")
> Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
> ---
>  net/tipc/udp_media.c | 29 +++++++----------------------
>  1 file changed, 7 insertions(+), 22 deletions(-)
> 
> diff --git a/net/tipc/udp_media.c b/net/tipc/udp_media.c index
> ecca64fc6a6f..599e7be92024 100644
> --- a/net/tipc/udp_media.c
> +++ b/net/tipc/udp_media.c
> @@ -344,42 +344,27 @@ static int tipc_udp_recv(struct sock *sk, struct
> sk_buff *skb)
>  	struct udp_bearer *ub;
>  	struct tipc_bearer *b;
>  	struct tipc_msg *hdr;
> -	int err;
> 
>  	ub = rcu_dereference_sk_user_data(sk);
>  	if (!ub) {
>  		pr_err_ratelimited("Failed to get UDP bearer reference");
> -		goto out;
> +		kfree_skb(skb);
> +		return 0;
>  	}
>  	skb_pull(skb, sizeof(struct udphdr));
>  	hdr = buf_msg(skb);
> 
>  	rcu_read_lock();
>  	b = rcu_dereference_rtnl(ub->bearer);
> -	if (!b)
> -		goto rcu_out;
> -
> -	if (b && test_bit(0, &b->up)) {
> +	if (likely(b && test_bit(0, &b->up))) {
>  		tipc_rcv(sock_net(sk), skb, b);
> -		rcu_read_unlock();
> -		return 0;
> -	}
> -
> -	if (unlikely(msg_user(hdr) == LINK_CONFIG)) {
> -		err = tipc_udp_rcast_disc(b, skb);
> -		if (err)
> -			goto rcu_out;
> +	} else {
> +		if (unlikely(b && msg_user(hdr) == LINK_CONFIG))
> +			tipc_udp_rcast_disc(b, skb);
> +		kfree_skb(skb);
>  	}
> -
> -	tipc_rcv(sock_net(sk), skb, b);
>  	rcu_read_unlock();
>  	return 0;
> -
> -rcu_out:
> -	rcu_read_unlock();
> -out:
> -	kfree_skb(skb);
> -	return 0;
>  }
> 
>  static int enable_mcast(struct udp_bearer *ub, struct udp_media_addr
> *remote)
> --
> 2.14.3
David Miller Nov. 28, 2017, 2:58 p.m. UTC | #2
From: Tommi Rantala <tommi.t.rantala@nokia.com>
Date: Tue, 28 Nov 2017 14:53:15 +0200

> Call tipc_rcv() only if bearer is up in tipc_udp_recv().
> Fixes a rare TIPC div-by-zero crash in tipc_node_calculate_timer():
> 
> We're enabling a bearer, but it's not yet up and fully initialized.
> At the same time we receive a discovery packet, and in tipc_udp_recv()
> we end up calling tipc_rcv() with the not-yet-initialized bearer,
> causing later a div-by-zero crash in tipc_node_calculate_timer().

You're also now ignoring any error being returned by tipc_udp_rcast_disc().

> -
> -	if (unlikely(msg_user(hdr) == LINK_CONFIG)) {
> -		err = tipc_udp_rcast_disc(b, skb);
> -		if (err)
> -			goto rcu_out;
> +	} else {
> +		if (unlikely(b && msg_user(hdr) == LINK_CONFIG))
> +			tipc_udp_rcast_disc(b, skb);
> +		kfree_skb(skb);
>  	}

Either put the 'err' propagation back or clearly explain in your
commit log message why this part of the change if absolutely essential
for this bug fix.

Thank you.
Tommi Rantala Nov. 28, 2017, 6:16 p.m. UTC | #3
On 28.11.2017 16:58, David Miller wrote:
> From: Tommi Rantala <tommi.t.rantala@nokia.com>
> Date: Tue, 28 Nov 2017 14:53:15 +0200
> 
>> -
>> -	if (unlikely(msg_user(hdr) == LINK_CONFIG)) {
>> -		err = tipc_udp_rcast_disc(b, skb);
>> -		if (err)
>> -			goto rcu_out;
>> +	} else {
>> +		if (unlikely(b && msg_user(hdr) == LINK_CONFIG))
>> +			tipc_udp_rcast_disc(b, skb);
>> +		kfree_skb(skb);
>>   	}
> 
> Either put the 'err' propagation back or clearly explain in your
> commit log message why this part of the change if absolutely essential
> for this bug fix.
> 
> Thank you.
> 

Thanks for the feedback. I'll post patch v2 soon.

-Tommi
diff mbox series

Patch

diff --git a/net/tipc/udp_media.c b/net/tipc/udp_media.c
index ecca64fc6a6f..599e7be92024 100644
--- a/net/tipc/udp_media.c
+++ b/net/tipc/udp_media.c
@@ -344,42 +344,27 @@  static int tipc_udp_recv(struct sock *sk, struct sk_buff *skb)
 	struct udp_bearer *ub;
 	struct tipc_bearer *b;
 	struct tipc_msg *hdr;
-	int err;
 
 	ub = rcu_dereference_sk_user_data(sk);
 	if (!ub) {
 		pr_err_ratelimited("Failed to get UDP bearer reference");
-		goto out;
+		kfree_skb(skb);
+		return 0;
 	}
 	skb_pull(skb, sizeof(struct udphdr));
 	hdr = buf_msg(skb);
 
 	rcu_read_lock();
 	b = rcu_dereference_rtnl(ub->bearer);
-	if (!b)
-		goto rcu_out;
-
-	if (b && test_bit(0, &b->up)) {
+	if (likely(b && test_bit(0, &b->up))) {
 		tipc_rcv(sock_net(sk), skb, b);
-		rcu_read_unlock();
-		return 0;
-	}
-
-	if (unlikely(msg_user(hdr) == LINK_CONFIG)) {
-		err = tipc_udp_rcast_disc(b, skb);
-		if (err)
-			goto rcu_out;
+	} else {
+		if (unlikely(b && msg_user(hdr) == LINK_CONFIG))
+			tipc_udp_rcast_disc(b, skb);
+		kfree_skb(skb);
 	}
-
-	tipc_rcv(sock_net(sk), skb, b);
 	rcu_read_unlock();
 	return 0;
-
-rcu_out:
-	rcu_read_unlock();
-out:
-	kfree_skb(skb);
-	return 0;
 }
 
 static int enable_mcast(struct udp_bearer *ub, struct udp_media_addr *remote)