Message ID | 20150917034134.GA19327@gondor.apana.org.au |
---|---|
State | Superseded, archived |
Delegated to: | David Miller |
Headers | show |
On Wed, Sep 16, 2015 at 8:41 PM, Herbert Xu <herbert@gondor.apana.org.au> wrote: > On Thu, Sep 17, 2015 at 11:08:45AM +0800, Herbert Xu wrote: >> >> Good catch! I think your explanation makes perfect sense. Linus >> ran into this previously too after suspend-and-resume. > > Unfortunately you can't just postpone the setting of portid because > once you pass it onto rhashtable the portid must never change while > it's in custody. > > So what I've done is essentially revert my previous fix and instead > add a new boolean "bound" to indicate whether the socket has been > bound. > > ---8<--- > netlink: Fix autobind race condition that leads to zero port ID > > The commit c0bb07df7d981e4091432754e30c9c720e2c0c78 ("netlink: > Reset portid after netlink_insert failure") introduced a race > condition where if two threads tried to autobind the same socket > one of them may end up with a zero port ID. > > This patch reverts that commit and instead fixes it by introducing > a separte "bound" variable to indicate whether a socket has been > bound. > > Fixes: c0bb07df7d98 ("netlink: Reset portid after netlink_insert failure") > Reported-by: Tejun Heo <tj@kernel.org> > Reported-by: Linus Torvalds <torvalds@linux-foundation.org> We saw similar soft lockup with the one Tejun reported, in our data center. > Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Just one comment below. [...] > @@ -1285,7 +1287,7 @@ static int netlink_release(struct socket *sock) > > skb_queue_purge(&sk->sk_write_queue); > > - if (nlk->portid) { > + if (nlk->bound) { > struct netlink_notify n = { > .net = sock_net(sk), > .protocol = sk->sk_protocol, This part doesn't look correct, seems it is checking if this is a kernel netlink socket rather than if it is bound. But I am not sure... Other than this, looks good to me: Reviewed-by: Cong Wang <cwang@twopensource.com> -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c index 7f86d3b..abcbca5 100644 --- a/net/netlink/af_netlink.c +++ b/net/netlink/af_netlink.c @@ -1095,7 +1095,7 @@ static int netlink_insert(struct sock *sk, u32 portid) lock_sock(sk); err = -EBUSY; - if (nlk_sk(sk)->portid) + if (nlk_sk(sk)->bound) goto err; err = -ENOMEM; @@ -1115,10 +1115,12 @@ static int netlink_insert(struct sock *sk, u32 portid) err = -EOVERFLOW; if (err == -EEXIST) err = -EADDRINUSE; - nlk_sk(sk)->portid = 0; sock_put(sk); + goto err; } + nlk_sk(sk)->bound = true; + err: release_sock(sk); return err; @@ -1285,7 +1287,7 @@ static int netlink_release(struct socket *sock) skb_queue_purge(&sk->sk_write_queue); - if (nlk->portid) { + if (nlk->bound) { struct netlink_notify n = { .net = sock_net(sk), .protocol = sk->sk_protocol, @@ -1519,7 +1521,7 @@ static int netlink_bind(struct socket *sock, struct sockaddr *addr, return err; } - if (nlk->portid) + if (nlk->bound) if (nladdr->nl_pid != nlk->portid) return -EINVAL; @@ -1537,7 +1539,7 @@ static int netlink_bind(struct socket *sock, struct sockaddr *addr, } } - if (!nlk->portid) { + if (!nlk->bound) { err = nladdr->nl_pid ? netlink_insert(sk, nladdr->nl_pid) : netlink_autobind(sock); @@ -1585,7 +1587,7 @@ static int netlink_connect(struct socket *sock, struct sockaddr *addr, !netlink_allowed(sock, NL_CFG_F_NONROOT_SEND)) return -EPERM; - if (!nlk->portid) + if (!nlk->bound) err = netlink_autobind(sock); if (err == 0) { @@ -2426,7 +2428,7 @@ static int netlink_sendmsg(struct socket *sock, struct msghdr *msg, size_t len) dst_group = nlk->dst_group; } - if (!nlk->portid) { + if (!nlk->bound) { err = netlink_autobind(sock); if (err) goto out; diff --git a/net/netlink/af_netlink.h b/net/netlink/af_netlink.h index 8900840..e6aae40 100644 --- a/net/netlink/af_netlink.h +++ b/net/netlink/af_netlink.h @@ -35,6 +35,7 @@ struct netlink_sock { unsigned long state; size_t max_recvmsg_len; wait_queue_head_t wait; + bool bound; bool cb_running; struct netlink_callback cb; struct mutex *cb_mutex;