Message ID | 20170624130827.GD6901@oracle.com |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
Headers | show |
From: Sowmini Varadhan <sowmini.varadhan@oracle.com> Date: Sat, 24 Jun 2017 09:08:27 -0400 > We're seeing a memleak when we run an infinite loop that > loads/unloads rds-tcp, and runs some traffic between each > load/unload. > > Analysis shows that this is happening for the following reason: > > inet_accept -> sock_graft does > parent->sk = sk > but if the parent->sk was previously pointing at some other > struct sock "old_sk" (happens in the case of rds_tcp_accept_one() > which has historically called sock_create_kern() to set up > the new_sock), we need to sock_put(old_sk), else we'd leak it. > > In general, sock_graft() is cutting loose the parent->sk, > so it looks like it needs to release its refcnt on it? > > Patch below takes care of the leak in our case, but I could use > some input on other locking considerations, and if this is ok > with other modules that use sock_graft() It could simply be the case that rds-tcp is the first setup that created that situation where there is a parent->sk already. In all of the normal accept*() code paths, a plain struct socket is allocated and nothing sets newsock->sk to anything before that sock_graft() call. Why does rds-tcp need to call sock_graft() without those invariants met? Thanks.
On (06/27/17 15:38), David Miller wrote: > > It could simply be the case that rds-tcp is the first setup that > created that situation where there is a parent->sk already. Possibly, I noticed that other callers call sock_create_lite() and I dont know the history here - this seems to have been the case from day-1 of rds-tcp. (and I dread changing rds_tcp_accept_kern() to do this, because then every module unload would need to go and check if sock->sk is non-null first, before cleaning it up > Why does rds-tcp need to call sock_graft() without those invariants > met? It would certainly help to declare "dont use sock_creeate_kern() if you are going to accept on this socket"- I dont see that being mandated anywhere. It would also help to have a BUG_ON(parent->sk) or at least a WARN_ON(parent->sk) in sock_graft, before unilaterally assigning it to the new sk. --Sowmini
On (06/27/17 15:59), Sowmini Varadhan wrote: > > Why does rds-tcp need to call sock_graft() without those invariants > > met? > > It would certainly help to declare "dont use sock_creeate_kern() > if you are going to accept on this socket"- I dont see that being > mandated anywhere. I can look into getting rds_tcp_accept_one also calling sock_create_lite like every other caller, (though I may not get to this for another week, due to travel), but the code in sock_graft() doesnt look right either. At the very least, there needs to be a WARN_ON(parent->sk) there, to provide a gentle dope-slap for the next slob that stumbles on this type of leak. --Sowmini
From: Sowmini Varadhan <sowmini.varadhan@oracle.com> Date: Tue, 27 Jun 2017 16:45:29 -0400 > On (06/27/17 15:59), Sowmini Varadhan wrote: >> > Why does rds-tcp need to call sock_graft() without those invariants >> > met? >> >> It would certainly help to declare "dont use sock_creeate_kern() >> if you are going to accept on this socket"- I dont see that being >> mandated anywhere. > > I can look into getting rds_tcp_accept_one also calling sock_create_lite > like every other caller, (though I may not get to this for another week, > due to travel), but the code in sock_graft() doesnt look right either. > > At the very least, there needs to be a WARN_ON(parent->sk) there, > to provide a gentle dope-slap for the next slob that stumbles on this > type of leak. In the accept case it is, if anything, very wasteful. This is because you allocate a sock and then immediately free it up. I would say mimick the logic of sys_accept4() and just use sock_alloc() to allocate the struct socket without a struct sock. Or, as you say, use sock_create_lite() or a similar helper.
diff --git a/include/net/sock.h b/include/net/sock.h index 5374c0d..014ad56 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -1686,12 +1686,19 @@ static inline void sock_orphan(struct sock *sk) static inline void sock_graft(struct sock *sk, struct socket *parent) { + struct sock *old_sk; + write_lock_bh(&sk->sk_callback_lock); sk->sk_wq = parent->wq; + old_sk = parent->sk; parent->sk = sk; sk_set_socket(sk, parent); security_sock_graft(sk, parent); write_unlock_bh(&sk->sk_callback_lock); + if (old_sk) { + sock_orphan(old_sk); + sock_put(old_sk); + } }