From patchwork Mon Jul 27 14:21:46 2015
X-Patchwork-Submitter: Sowmini Varadhan
X-Patchwork-Id: 500446
X-Patchwork-Delegate: davem@davemloft.net
Date: Mon, 27 Jul 2015 16:21:46 +0200
From: Sowmini Varadhan
To: netdev@vger.kernel.org
Cc: ebiederm@xmission.com, davem@davemloft.net, sowmini.varadhan@oracle.com
Subject: netns refcnt leak for kernel accept sock
Message-ID: <20150727142146.GC16447@oracle.com>
X-Mailing-List: netdev@vger.kernel.org

I'm running into a netns refcnt issue, and I suspect that eeb1bd5c
has something to do with it (perhaps we need an additional change in
sk_clone_lock() after eeb1bd5c).

Here's the problem: when we create a syn_recv sock based on a kernel
listen sock, we take a get_net() ref with a stack similar to the one
shown below. Note that the parent (kernel, listen) sock itself has not
taken a get_net() ref, because it was created explicitly via
sock_create_kern().

    get_net   /* for the newsk */
    sk_clone_lock
    inet_csk_clone_lock
    tcp_create_openreq_child
    tcp_v4_syn_recv_sock
    tcp_check_req
    tcp_v4_do_rcv
    tcp_v4_rcv
      :

But it's not clear to me where this refcnt will be released. In my
case, I expect to create and clean up kernel sockets as part of
->init/->exit for my module. Because the accept socket holds a netns
refcnt, it blocks cleanup_net(), so my ->exit pernet_subsys op cannot
run to clean it up, and we have a leak.

I think that sk_clone_lock() should only do a get_net() if the parent
is not a kernel socket (making this similar to sk_alloc()), i.e., the
patch below.

Does this sound right?

--Sowmini

---
diff --git a/net/core/sock.c b/net/core/sock.c
index 08f16db..371d1b7 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1497,7 +1497,8 @@ struct sock *sk_clone_lock(const struct sock *sk, const gf
 		sock_copy(newsk, sk);
 
 		/* SANITY */
-		get_net(sock_net(newsk));
+		if (likely(newsk->sk_net_refcnt))
+			get_net(sock_net(newsk));
 		sk_node_init(&newsk->sk_node);
 		sock_lock_init(newsk);
 		bh_lock_sock(newsk);
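
To make the imbalance concrete, here is a small user-space model of the
accounting described above. It is only a sketch and not kernel code: the
model_* helpers and leaked_refs() are hypothetical names invented for
illustration, and the model assumes that socket teardown drops the netns
ref only when sk_net_refcnt is set, mirroring what sk_alloc() does on
the allocation side.

```c
/* User-space model of the netns accounting; NOT kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct net { int count; };          /* stand-in for the netns refcount */

struct sock {
	struct net *sk_net;
	bool sk_net_refcnt;         /* true if this sock pins the netns */
};

static void get_net(struct net *net) { net->count++; }
static void put_net(struct net *net) { assert(net->count > 0); net->count--; }

/* Hypothetical helper: allocate a model sock or abort. */
static struct sock *model_alloc(void)
{
	struct sock *sk = malloc(sizeof(*sk));
	if (!sk)
		abort();
	return sk;
}

/* Models sk_alloc(): kernel sockets do not take a netns ref. */
static struct sock *model_sk_alloc(struct net *net, bool kern)
{
	struct sock *sk = model_alloc();
	sk->sk_net = net;
	sk->sk_net_refcnt = !kern;
	if (sk->sk_net_refcnt)
		get_net(net);
	return sk;
}

/* Models sk_clone_lock(); 'patched' selects the proposed check. */
static struct sock *model_sk_clone(const struct sock *sk, bool patched)
{
	struct sock *newsk = model_alloc();
	*newsk = *sk;               /* like sock_copy(): flag is inherited */
	if (!patched || newsk->sk_net_refcnt)
		get_net(newsk->sk_net);
	return newsk;
}

/* ASSUMPTION: teardown drops the ref only when the flag is set. */
static void model_sk_free(struct sock *sk)
{
	if (sk->sk_net_refcnt)
		put_net(sk->sk_net);
	free(sk);
}

/* Refs left on the netns after a kernel listener and one accepted
 * child have both been created and torn down. */
int leaked_refs(bool patched)
{
	struct net net = { 0 };
	struct sock *listener = model_sk_alloc(&net, true);
	struct sock *child = model_sk_clone(listener, patched);

	model_sk_free(child);
	model_sk_free(listener);
	return net.count;
}
```

Under these assumptions, leaked_refs(false) leaves one ref behind (the
unconditional get_net() in the clone path is never paired with a
put_net()), while leaked_refs(true) returns to zero.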