| Message ID | 1533761833-106379-1-git-send-email-sowmini.varadhan@oracle.com |
|---|---|
| State | Deferred, archived |
| Delegated to | David Miller |
| Series | [net-next] rds: avoid lock hierarchy violation between m_rs_lock and rs_recv_lock |
On 8/8/2018 1:57 PM, Sowmini Varadhan wrote:
> The following deadlock, reported by syzbot, can occur if CPU0 is in
> rds_send_remove_from_sock() while CPU1 is in rds_clear_recv_queue()
>
>        CPU0                          CPU1
>        ----                          ----
>   lock(&(&rm->m_rs_lock)->rlock);
>                                 lock(&rs->rs_recv_lock);
>                                 lock(&(&rm->m_rs_lock)->rlock);
>   lock(&rs->rs_recv_lock);
>
> The deadlock should be avoided by moving the messages from the
> rs_recv_queue into a tmp_list in rds_clear_recv_queue() under
> the rs_recv_lock, and then dropping the refcnt on the messages
> in the tmp_list (potentially resulting in rds_message_purge())
> after dropping the rs_recv_lock.
>
> The same lock hierarchy violation also exists in rds_still_queued()
> and should be avoided in a similar manner
>
> Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
> Reported-by: syzbot+52140d69ac6dc6b927a9@syzkaller.appspotmail.com
> ---

This bug doesn't make sense, since two different transports (loop and rds_tcp) are using the same socket and running together. For the same transport, such a race can't happen with the MSG_ON_SOCK flag.

CPU1 -> rds_loop_inc_free
CPU0 -> rds_tcp_cork ...

I need to understand this test better.

Regards,
Santosh
On (08/08/18 14:51), Santosh Shilimkar wrote:
> This bug doesn't make sense since two different transports are using
> same socket (Loop and rds_tcp) and running together.
> For same transport, such race can't happen with MSG_ON_SOCK flag.
> CPU1-> rds_loop_inc_free
> CPU0 -> rds_tcp_cork ...

The test is just reporting a lock hierarchy violation.

As far as I can tell, this wasn't an actual deadlock, because, as you point out, a socket has either the rds_tcp transport or the rds_loop transport, so this particular pair of stack traces would not happen with the code as it exists today.

But there is a valid lock hierarchy violation here, and imho it's a good idea to get that cleaned up. It also avoids needlessly holding the rs_recv_lock while doing an rds_inc_put.

--Sowmini
On 8/8/2018 3:18 PM, Sowmini Varadhan wrote:
> On (08/08/18 14:51), Santosh Shilimkar wrote:
>> This bug doesn't make sense since two different transports are using
>> same socket (Loop and rds_tcp) and running together.
>> For same transport, such race can't happen with MSG_ON_SOCK flag.
>> CPU1-> rds_loop_inc_free
>> CPU0 -> rds_tcp_cork ...
>
> The test is just reporting a lock hierarchy violation
>
> As far as I can tell, this wasn't an actual deadlock that happened
> because as you point out, either a socket has the rds_tcp transport
> or the rds_loop transport, so this particular pair of stack traces
> would not happen with the code as it exists today.

Exactly.

> but there is a valid lock hierarchy violation here, and
> imho it's a good idea to get that cleaned up.

The lock hierarchy violation is protected against for the same transport. I don't see this violation being possible in legitimate use, hence the comment. If we start supporting two different transports on the same socket, then we have many more cases to fix, and this lock violation would be just one of them.

The loop transport seems to keep throwing surprises. I need to confirm, but it looks like it can co-exist with another transport on the same socket, if those traces are to be believed. If that's the case, then that definitely needs to be plugged.

Regards,
Santosh
From: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Date: Wed, 8 Aug 2018 13:57:13 -0700

> The following deadlock, reported by syzbot, can occur if CPU0 is in
> rds_send_remove_from_sock() while CPU1 is in rds_clear_recv_queue()
>
>        CPU0                          CPU1
>        ----                          ----
>   lock(&(&rm->m_rs_lock)->rlock);
>                                 lock(&rs->rs_recv_lock);
>                                 lock(&(&rm->m_rs_lock)->rlock);
>   lock(&rs->rs_recv_lock);
>
> The deadlock should be avoided by moving the messages from the
> rs_recv_queue into a tmp_list in rds_clear_recv_queue() under
> the rs_recv_lock, and then dropping the refcnt on the messages
> in the tmp_list (potentially resulting in rds_message_purge())
> after dropping the rs_recv_lock.
>
> The same lock hierarchy violation also exists in rds_still_queued()
> and should be avoided in a similar manner
>
> Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
> Reported-by: syzbot+52140d69ac6dc6b927a9@syzkaller.appspotmail.com

I'm putting this in the deferred state for now.

Sowmini, once you and Santosh agree on what exactly to do, please resubmit.

Thank you.
diff --git a/net/rds/recv.c b/net/rds/recv.c
index 504cd6b..1cf7072 100644
--- a/net/rds/recv.c
+++ b/net/rds/recv.c
@@ -429,6 +429,7 @@ static int rds_still_queued(struct rds_sock *rs, struct rds_incoming *inc,
 	struct sock *sk = rds_rs_to_sk(rs);
 	int ret = 0;
 	unsigned long flags;
+	bool drop_ref = false;
 
 	write_lock_irqsave(&rs->rs_recv_lock, flags);
 	if (!list_empty(&inc->i_item)) {
@@ -439,11 +440,13 @@ static int rds_still_queued(struct rds_sock *rs, struct rds_incoming *inc,
 					      -be32_to_cpu(inc->i_hdr.h_len),
 					      inc->i_hdr.h_dport);
 			list_del_init(&inc->i_item);
-			rds_inc_put(inc);
+			drop_ref = true;
 		}
 	}
 	write_unlock_irqrestore(&rs->rs_recv_lock, flags);
 
+	if (drop_ref)
+		rds_inc_put(inc);
 	rdsdebug("inc %p rs %p still %d dropped %d\n", inc, rs, ret, drop);
 	return ret;
 }
@@ -751,16 +754,20 @@ void rds_clear_recv_queue(struct rds_sock *rs)
 	struct sock *sk = rds_rs_to_sk(rs);
 	struct rds_incoming *inc, *tmp;
 	unsigned long flags;
+	LIST_HEAD(tmp_list);
 
 	write_lock_irqsave(&rs->rs_recv_lock, flags);
 	list_for_each_entry_safe(inc, tmp, &rs->rs_recv_queue, i_item) {
 		rds_recv_rcvbuf_delta(rs, sk, inc->i_conn->c_lcong,
 				      -be32_to_cpu(inc->i_hdr.h_len),
 				      inc->i_hdr.h_dport);
+		list_move_tail(&inc->i_item, &tmp_list);
+	}
+	write_unlock_irqrestore(&rs->rs_recv_lock, flags);
+	list_for_each_entry_safe(inc, tmp, &tmp_list, i_item) {
 		list_del_init(&inc->i_item);
 		rds_inc_put(inc);
 	}
-	write_unlock_irqrestore(&rs->rs_recv_lock, flags);
 }
 
 /*
The following deadlock, reported by syzbot, can occur if CPU0 is in
rds_send_remove_from_sock() while CPU1 is in rds_clear_recv_queue()

       CPU0                          CPU1
       ----                          ----
  lock(&(&rm->m_rs_lock)->rlock);
                                lock(&rs->rs_recv_lock);
                                lock(&(&rm->m_rs_lock)->rlock);
  lock(&rs->rs_recv_lock);

The deadlock should be avoided by moving the messages from the
rs_recv_queue into a tmp_list in rds_clear_recv_queue() under
the rs_recv_lock, and then dropping the refcnt on the messages
in the tmp_list (potentially resulting in rds_message_purge())
after dropping the rs_recv_lock.

The same lock hierarchy violation also exists in rds_still_queued()
and should be avoided in a similar manner

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Reported-by: syzbot+52140d69ac6dc6b927a9@syzkaller.appspotmail.com
---
 net/rds/recv.c | 11 +++++++++--
 1 files changed, 9 insertions(+), 2 deletions(-)