From patchwork Wed Aug 8 20:57:13 2018
X-Patchwork-Submitter: Sowmini Varadhan
X-Patchwork-Id: 955241
X-Patchwork-Delegate: davem@davemloft.net
From: Sowmini Varadhan
To: netdev@vger.kernel.org
Cc: davem@davemloft.net, rds-devel@oss.oracle.com,
    sowmini.varadhan@oracle.com, santosh.shilimkar@oracle.com
Subject: [PATCH net-next] rds: avoid lock hierarchy violation between m_rs_lock and rs_recv_lock
Date: Wed, 8 Aug 2018 13:57:13 -0700
Message-Id: <1533761833-106379-1-git-send-email-sowmini.varadhan@oracle.com>

The following deadlock, reported by syzbot, can occur if CPU0 is in
rds_send_remove_from_sock() while CPU1 is in rds_clear_recv_queue():

       CPU0                    CPU1
       ----                    ----
  lock(&(&rm->m_rs_lock)->rlock);
                               lock(&rs->rs_recv_lock);
                               lock(&(&rm->m_rs_lock)->rlock);
  lock(&rs->rs_recv_lock);

The deadlock should be avoided by moving
the messages from the rs_recv_queue into a tmp_list in
rds_clear_recv_queue() under the rs_recv_lock, and then dropping the
refcnt on the messages in the tmp_list (potentially resulting in
rds_message_purge()) after dropping the rs_recv_lock. The same lock
hierarchy violation also exists in rds_still_queued() and should be
avoided in a similar manner.

Signed-off-by: Sowmini Varadhan
Reported-by: syzbot+52140d69ac6dc6b927a9@syzkaller.appspotmail.com
---
 net/rds/recv.c | 11 +++++++++--
 1 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/net/rds/recv.c b/net/rds/recv.c
index 504cd6b..1cf7072 100644
--- a/net/rds/recv.c
+++ b/net/rds/recv.c
@@ -429,6 +429,7 @@ static int rds_still_queued(struct rds_sock *rs, struct rds_incoming *inc,
 	struct sock *sk = rds_rs_to_sk(rs);
 	int ret = 0;
 	unsigned long flags;
+	bool drop_ref = false;
 
 	write_lock_irqsave(&rs->rs_recv_lock, flags);
 	if (!list_empty(&inc->i_item)) {
@@ -439,11 +440,13 @@ static int rds_still_queued(struct rds_sock *rs, struct rds_incoming *inc,
 					      -be32_to_cpu(inc->i_hdr.h_len),
 					      inc->i_hdr.h_dport);
 			list_del_init(&inc->i_item);
-			rds_inc_put(inc);
+			drop_ref = true;
 		}
 	}
 	write_unlock_irqrestore(&rs->rs_recv_lock, flags);
+	if (drop_ref)
+		rds_inc_put(inc);
 
 	rdsdebug("inc %p rs %p still %d dropped %d\n", inc, rs, ret, drop);
 	return ret;
 }
@@ -751,16 +754,20 @@ void rds_clear_recv_queue(struct rds_sock *rs)
 	struct sock *sk = rds_rs_to_sk(rs);
 	struct rds_incoming *inc, *tmp;
 	unsigned long flags;
+	LIST_HEAD(tmp_list);
 
 	write_lock_irqsave(&rs->rs_recv_lock, flags);
 	list_for_each_entry_safe(inc, tmp, &rs->rs_recv_queue, i_item) {
 		rds_recv_rcvbuf_delta(rs, sk, inc->i_conn->c_lcong,
 				      -be32_to_cpu(inc->i_hdr.h_len),
 				      inc->i_hdr.h_dport);
+		list_move_tail(&inc->i_item, &tmp_list);
+	}
+	write_unlock_irqrestore(&rs->rs_recv_lock, flags);
+
+	list_for_each_entry_safe(inc, tmp, &tmp_list, i_item) {
 		list_del_init(&inc->i_item);
 		rds_inc_put(inc);
 	}
-	write_unlock_irqrestore(&rs->rs_recv_lock, flags);
 }

 /*
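The pattern the patch applies — unlink everything while holding the queue lock, then drop the references only after the lock is released — can be sketched in plain userspace C. Everything below (`enqueue_inc`, `clear_recv_queue`, the pthread mutex, `free()` as the "put") is a hypothetical stand-in for illustration, not the kernel API:

```c
/*
 * Userspace sketch of the locking pattern, NOT the kernel code:
 * a pthread mutex stands in for rs_recv_lock, and free() stands in
 * for rds_inc_put(), which may take further locks (m_rs_lock via
 * rds_message_purge()) when the last reference is dropped.
 */
#include <pthread.h>
#include <stdlib.h>

struct inc {
	struct inc *next;
};

static pthread_mutex_t recv_lock = PTHREAD_MUTEX_INITIALIZER;
static struct inc *recv_queue;

/* Push one entry onto the receive queue under the lock. */
static void enqueue_inc(struct inc *i)
{
	pthread_mutex_lock(&recv_lock);
	i->next = recv_queue;
	recv_queue = i;
	pthread_mutex_unlock(&recv_lock);
}

/*
 * Mirror of the reworked rds_clear_recv_queue(): detach the whole
 * queue onto a private list while holding the lock, then release
 * the entries only after the lock is dropped, so the "put" path can
 * never nest inside recv_lock. Returns the number released.
 */
static int clear_recv_queue(void)
{
	struct inc *tmp_list, *i, *next;
	int count = 0;

	pthread_mutex_lock(&recv_lock);
	tmp_list = recv_queue;	/* plays the role of list_move_tail() */
	recv_queue = NULL;
	pthread_mutex_unlock(&recv_lock);

	/* "_safe" style walk: i is freed inside the loop body. */
	for (i = tmp_list; i; i = next) {
		next = i->next;
		free(i);	/* stand-in for rds_inc_put() */
		count++;
	}
	return count;
}
```

Because the final put now runs outside rs_recv_lock, the CPU1 ordering rs_recv_lock → m_rs_lock can no longer occur, which breaks the cycle with CPU0's m_rs_lock → rs_recv_lock ordering in rds_send_remove_from_sock().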