[net-next] tipc: don't call sock_release() in atomic context

Message ID 8857e1324da5cd7b9ebd79641eab1444418bcdcb.1519063117.git.pabeni@redhat.com
State Accepted, archived
Delegated to: David Miller
Series [net-next] tipc: don't call sock_release() in atomic context

Commit Message

Paolo Abeni Feb. 19, 2018, 6:02 p.m. UTC
syzbot reported a scheduling while atomic issue at netns
destruction time:

BUG: sleeping function called from invalid context at net/core/sock.c:2769
in_atomic(): 1, irqs_disabled(): 0, pid: 85, name: kworker/u4:3
5 locks held by kworker/u4:3/85:
  #0:  ((wq_completion)"%s""netns"){+.+.}, at: [<00000000c9792deb>]
process_one_work+0xaaf/0x1af0 kernel/workqueue.c:2084
  #1:  (net_cleanup_work){+.+.}, at: [<00000000adc12e2a>]
process_one_work+0xb01/0x1af0 kernel/workqueue.c:2088
  #2:  (net_sem){++++}, at: [<000000009ccb5669>] cleanup_net+0x23f/0xd20
net/core/net_namespace.c:494
  #3:  (net_mutex){+.+.}, at: [<00000000a92767d9>] cleanup_net+0xa7d/0xd20
net/core/net_namespace.c:496
  #4:  (&(&srv->idr_lock)->rlock){+...}, at: [<000000001343e568>]
spin_lock_bh include/linux/spinlock.h:315 [inline]
  #4:  (&(&srv->idr_lock)->rlock){+...}, at: [<000000001343e568>]
tipc_topsrv_stop+0x231/0x610 net/tipc/topsrv.c:685
CPU: 0 PID: 85 Comm: kworker/u4:3 Not tainted 4.16.0-rc1+ #230
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Workqueue: netns cleanup_net
Call Trace:
  __dump_stack lib/dump_stack.c:17 [inline]
  dump_stack+0x194/0x257 lib/dump_stack.c:53
  ___might_sleep+0x2b2/0x470 kernel/sched/core.c:6128
  __might_sleep+0x95/0x190 kernel/sched/core.c:6081
  lock_sock_nested+0x37/0x110 net/core/sock.c:2769
  lock_sock include/net/sock.h:1463 [inline]
  tipc_release+0x103/0xff0 net/tipc/socket.c:572
  sock_release+0x8d/0x1e0 net/socket.c:594
  tipc_topsrv_stop+0x3c0/0x610 net/tipc/topsrv.c:696
  tipc_exit_net+0x15/0x40 net/tipc/core.c:96
  ops_exit_list.isra.6+0xae/0x150 net/core/net_namespace.c:148
  cleanup_net+0x6ba/0xd20 net/core/net_namespace.c:529
  process_one_work+0xbbf/0x1af0 kernel/workqueue.c:2113
  worker_thread+0x223/0x1990 kernel/workqueue.c:2247
  kthread+0x33c/0x400 kernel/kthread.c:238
  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:429

This is caused by tipc_topsrv_stop() releasing the listener socket
with the idr lock held. This changeset addresses the issue by moving
the release operation outside that lock.

Reported-and-tested-by: syzbot+749d9d87c294c00ca856@syzkaller.appspotmail.com
Fixes: 0ef897be12b8 ("tipc: separate topology server listener socket from subcsriber sockets")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 net/tipc/topsrv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Jon Maloy Feb. 19, 2018, 6:53 p.m. UTC | #1
> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On Behalf Of Paolo Abeni
> Sent: Monday, February 19, 2018 19:02
> To: netdev@vger.kernel.org
> Cc: Jon Maloy <jon.maloy@ericsson.com>; Ying Xue <ying.xue@windriver.com>; David S. Miller <davem@davemloft.net>
> Subject: [PATCH net-next] tipc: don't call sock_release() in atomic context
>
> syzbot reported a scheduling while atomic issue at netns destruction time:
>
> BUG: sleeping function called from invalid context at net/core/sock.c:2769
> in_atomic(): 1, irqs_disabled(): 0, pid: 85, name: kworker/u4:3
> 5 locks held by kworker/u4:3/85:
>   #0:  ((wq_completion)"%s""netns"){+.+.}, at: [<00000000c9792deb>]
> process_one_work+0xaaf/0x1af0 kernel/workqueue.c:2084
>   #1:  (net_cleanup_work){+.+.}, at: [<00000000adc12e2a>]
> process_one_work+0xb01/0x1af0 kernel/workqueue.c:2088
>   #2:  (net_sem){++++}, at: [<000000009ccb5669>] cleanup_net+0x23f/0xd20
> net/core/net_namespace.c:494
>   #3:  (net_mutex){+.+.}, at: [<00000000a92767d9>] cleanup_net+0xa7d/0xd20
> net/core/net_namespace.c:496
>   #4:  (&(&srv->idr_lock)->rlock){+...}, at: [<000000001343e568>]
> spin_lock_bh include/linux/spinlock.h:315 [inline]
>   #4:  (&(&srv->idr_lock)->rlock){+...}, at: [<000000001343e568>]
> tipc_topsrv_stop+0x231/0x610 net/tipc/topsrv.c:685
> CPU: 0 PID: 85 Comm: kworker/u4:3 Not tainted 4.16.0-rc1+ #230
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Workqueue: netns cleanup_net
> Call Trace:
>   __dump_stack lib/dump_stack.c:17 [inline]
>   dump_stack+0x194/0x257 lib/dump_stack.c:53
>   ___might_sleep+0x2b2/0x470 kernel/sched/core.c:6128
>   __might_sleep+0x95/0x190 kernel/sched/core.c:6081
>   lock_sock_nested+0x37/0x110 net/core/sock.c:2769
>   lock_sock include/net/sock.h:1463 [inline]
>   tipc_release+0x103/0xff0 net/tipc/socket.c:572
>   sock_release+0x8d/0x1e0 net/socket.c:594
>   tipc_topsrv_stop+0x3c0/0x610 net/tipc/topsrv.c:696
>   tipc_exit_net+0x15/0x40 net/tipc/core.c:96
>   ops_exit_list.isra.6+0xae/0x150 net/core/net_namespace.c:148
>   cleanup_net+0x6ba/0xd20 net/core/net_namespace.c:529
>   process_one_work+0xbbf/0x1af0 kernel/workqueue.c:2113
>   worker_thread+0x223/0x1990 kernel/workqueue.c:2247
>   kthread+0x33c/0x400 kernel/kthread.c:238
>   ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:429
>
> This is caused by tipc_topsrv_stop() releasing the listener socket with the idr
> lock held. This changeset addresses the issue by moving the release operation
> outside that lock.


Thank you Paolo. This was too obvious for me to catch ☹
Acked-by:  ///jon


>
> Reported-and-tested-by: syzbot+749d9d87c294c00ca856@syzkaller.appspotmail.com
> Fixes: 0ef897be12b8 ("tipc: separate topology server listener socket from subcsriber sockets")
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
>  net/tipc/topsrv.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/tipc/topsrv.c b/net/tipc/topsrv.c
> index 02013e00f287..63f35eae7236 100644
> --- a/net/tipc/topsrv.c
> +++ b/net/tipc/topsrv.c
> @@ -693,9 +693,9 @@ void tipc_topsrv_stop(struct net *net)
>  	}
>  	__module_get(lsock->ops->owner);
>  	__module_get(lsock->sk->sk_prot_creator->owner);
> -	sock_release(lsock);
>  	srv->listener = NULL;
>  	spin_unlock_bh(&srv->idr_lock);
> +	sock_release(lsock);
>  	tipc_topsrv_work_stop(srv);
>  	idr_destroy(&srv->conn_idr);
>  	kfree(srv);
> --
> 2.14.3

David Miller Feb. 19, 2018, 7:38 p.m. UTC | #2
From: Paolo Abeni <pabeni@redhat.com>
Date: Mon, 19 Feb 2018 19:02:24 +0100

> syzbot reported a scheduling while atomic issue at netns
> destruction time:
 ...
> This is caused by tipc_topsrv_stop() releasing the listener socket
> with the idr lock held. This changeset addresses the issue by moving
> the release operation outside that lock.
> 
> Reported-and-tested-by: syzbot+749d9d87c294c00ca856@syzkaller.appspotmail.com
> Fixes: 0ef897be12b8 ("tipc: separate topology server listener socket from subcsriber sockets")
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Applied, thanks Paolo.

Patch

diff --git a/net/tipc/topsrv.c b/net/tipc/topsrv.c
index 02013e00f287..63f35eae7236 100644
--- a/net/tipc/topsrv.c
+++ b/net/tipc/topsrv.c
@@ -693,9 +693,9 @@  void tipc_topsrv_stop(struct net *net)
 	}
 	__module_get(lsock->ops->owner);
 	__module_get(lsock->sk->sk_prot_creator->owner);
-	sock_release(lsock);
 	srv->listener = NULL;
 	spin_unlock_bh(&srv->idr_lock);
+	sock_release(lsock);
 	tipc_topsrv_work_stop(srv);
 	idr_destroy(&srv->conn_idr);
 	kfree(srv);
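
For readers who want to see the ordering issue in isolation, here is a minimal userspace sketch of the pattern the patch applies: detach the pointer while the lock is held, then call the possibly-sleeping release function only after unlocking. This is not kernel code. Every name in it (struct server, struct listener, server_lock(), release_listener(), server_stop_buggy(), server_stop_fixed()) is hypothetical, the spinlock is modelled by a plain counter in a single thread, and the -1 return merely stands in for the "sleeping function called from invalid context" report.

```c
/* Userspace analogy only; all names here are hypothetical, not kernel APIs. */
#include <assert.h>
#include <stddef.h>

struct listener {
	int released;              /* set once release_listener() succeeded */
};

struct server {
	int atomic_depth;          /* models "spinlock held": > 0 means atomic */
	struct listener *listener; /* plays the role of srv->listener */
};

/* Model of taking and dropping a spinlock such as srv->idr_lock. */
static void server_lock(struct server *srv)   { srv->atomic_depth++; }
static void server_unlock(struct server *srv) { srv->atomic_depth--; }

/*
 * Stand-in for a release function that may sleep. Returns -1 if it is
 * called while the modelled lock is held, 0 otherwise.
 */
static int release_listener(struct server *srv, struct listener *l)
{
	if (srv->atomic_depth > 0)
		return -1;         /* sleeping here would be a bug */
	l->released = 1;
	return 0;
}

static void server_init(struct server *srv, struct listener *l)
{
	l->released = 0;
	srv->atomic_depth = 0;
	srv->listener = l;
}

/* Ordering before the patch: release while the lock is still held. */
static int server_stop_buggy(struct server *srv)
{
	struct listener *l = srv->listener;
	int ret;

	server_lock(srv);
	ret = release_listener(srv, l);   /* called in atomic context */
	srv->listener = NULL;
	server_unlock(srv);
	return ret;
}

/* Ordering after the patch: clear the pointer under the lock, release later. */
static int server_stop_fixed(struct server *srv)
{
	struct listener *l = srv->listener;

	server_lock(srv);
	srv->listener = NULL;             /* nobody can find the listener now */
	server_unlock(srv);
	return release_listener(srv, l);  /* safe: lock already dropped */
}
```

Calling server_stop_buggy() on a freshly initialised server returns -1 and leaves the listener unreleased, while server_stop_fixed() returns 0 with srv->listener cleared. The sketch only illustrates the ordering; it says nothing about the module reference handling or the work-queue teardown in the real function.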