Message ID | 20150918111650.GA7508@gondor.apana.org.au |
---|---|
State | Accepted, archived |
Delegated to: | David Miller |
Headers | show |
From: Herbert Xu <herbert@gondor.apana.org.au> Date: Fri, 18 Sep 2015 19:16:50 +0800 > The commit c0bb07df7d981e4091432754e30c9c720e2c0c78 ("netlink: > Reset portid after netlink_insert failure") introduced a race > condition where if two threads try to autobind the same socket > one of them may end up with a zero port ID. This led to kernel > deadlocks that were observed by multiple people. > > This patch reverts that commit and instead fixes it by introducing > a separte rhash_portid variable so that the real portid is only set > after the socket has been successfully hashed. > > Fixes: c0bb07df7d98 ("netlink: Reset portid after netlink_insert failure") > Reported-by: Tejun Heo <tj@kernel.org> > Reported-by: Linus Torvalds <torvalds@linux-foundation.org> > Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Applied and queued up for -stable, thanks Herbert. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, Sep 20, 2015 at 10:55:21PM -0700, David Miller wrote: > From: Herbert Xu <herbert@gondor.apana.org.au> > Date: Fri, 18 Sep 2015 19:16:50 +0800 > > > The commit c0bb07df7d981e4091432754e30c9c720e2c0c78 ("netlink: > > Reset portid after netlink_insert failure") introduced a race > > condition where if two threads try to autobind the same socket > > one of them may end up with a zero port ID. This led to kernel > > deadlocks that were observed by multiple people. > > > > This patch reverts that commit and instead fixes it by introducing > > a separte rhash_portid variable so that the real portid is only set > > after the socket has been successfully hashed. > > > > Fixes: c0bb07df7d98 ("netlink: Reset portid after netlink_insert failure") > > Reported-by: Tejun Heo <tj@kernel.org> > > Reported-by: Linus Torvalds <torvalds@linux-foundation.org> > > Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> > > Applied and queued up for -stable, thanks Herbert. Sorry but Dave but there are still races with v4 as Tejun pointed out. I'm still working on it and I could post them as incremental patches if that's the easiest. Thanks,
From: Herbert Xu <herbert@gondor.apana.org.au> Date: Mon, 21 Sep 2015 14:06:36 +0800 > On Sun, Sep 20, 2015 at 10:55:21PM -0700, David Miller wrote: >> From: Herbert Xu <herbert@gondor.apana.org.au> >> Date: Fri, 18 Sep 2015 19:16:50 +0800 >> >> > The commit c0bb07df7d981e4091432754e30c9c720e2c0c78 ("netlink: >> > Reset portid after netlink_insert failure") introduced a race >> > condition where if two threads try to autobind the same socket >> > one of them may end up with a zero port ID. This led to kernel >> > deadlocks that were observed by multiple people. >> > >> > This patch reverts that commit and instead fixes it by introducing >> > a separte rhash_portid variable so that the real portid is only set >> > after the socket has been successfully hashed. >> > >> > Fixes: c0bb07df7d98 ("netlink: Reset portid after netlink_insert failure") >> > Reported-by: Tejun Heo <tj@kernel.org> >> > Reported-by: Linus Torvalds <torvalds@linux-foundation.org> >> > Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> >> >> Applied and queued up for -stable, thanks Herbert. > > Sorry but Dave but there are still races with v4 as Tejun pointed > out. I'm still working on it and I could post them as incremental > patches if that's the easiest. Oops, sorry about that. Yeah at this point incremental patches work the best. Thanks. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c index 7f86d3b..303efb7 100644 --- a/net/netlink/af_netlink.c +++ b/net/netlink/af_netlink.c @@ -1015,7 +1015,7 @@ static inline int netlink_compare(struct rhashtable_compare_arg *arg, const struct netlink_compare_arg *x = arg->key; const struct netlink_sock *nlk = ptr; - return nlk->portid != x->portid || + return nlk->rhash_portid != x->portid || !net_eq(sock_net(&nlk->sk), read_pnet(&x->pnet)); } @@ -1041,7 +1041,7 @@ static int __netlink_insert(struct netlink_table *table, struct sock *sk) { struct netlink_compare_arg arg; - netlink_compare_arg_init(&arg, sock_net(sk), nlk_sk(sk)->portid); + netlink_compare_arg_init(&arg, sock_net(sk), nlk_sk(sk)->rhash_portid); return rhashtable_lookup_insert_key(&table->hash, &arg, &nlk_sk(sk)->node, netlink_rhashtable_params); @@ -1103,7 +1103,7 @@ static int netlink_insert(struct sock *sk, u32 portid) unlikely(atomic_read(&table->hash.nelems) >= UINT_MAX)) goto err; - nlk_sk(sk)->portid = portid; + nlk_sk(sk)->rhash_portid = portid; sock_hold(sk); err = __netlink_insert(table, sk); @@ -1115,10 +1115,12 @@ static int netlink_insert(struct sock *sk, u32 portid) err = -EOVERFLOW; if (err == -EEXIST) err = -EADDRINUSE; - nlk_sk(sk)->portid = 0; sock_put(sk); + goto err; } + nlk_sk(sk)->portid = portid; + err: release_sock(sk); return err; @@ -3255,7 +3257,7 @@ static inline u32 netlink_hash(const void *data, u32 len, u32 seed) const struct netlink_sock *nlk = data; struct netlink_compare_arg arg; - netlink_compare_arg_init(&arg, sock_net(&nlk->sk), nlk->portid); + netlink_compare_arg_init(&arg, sock_net(&nlk->sk), nlk->rhash_portid); return jhash2((u32 *)&arg, netlink_compare_arg_len / sizeof(u32), seed); } diff --git a/net/netlink/af_netlink.h b/net/netlink/af_netlink.h index 8900840..c96dfa3 100644 --- a/net/netlink/af_netlink.h +++ b/net/netlink/af_netlink.h @@ -25,6 +25,7 @@ struct netlink_ring { struct netlink_sock { /* struct sock has to be the first member of netlink_sock */ struct sock sk; + u32 rhash_portid; u32 portid; u32 dst_portid; u32 dst_group;