From patchwork Tue Jan 20 21:07:17 2009
X-Patchwork-Submitter: "Eric W. Biederman"
X-Patchwork-Id: 19574
X-Patchwork-Delegate: davem@davemloft.net
From: ebiederm@xmission.com (Eric W. Biederman)
To: David Miller
Cc: Max Krasnyansky, Pavel Emelyanov
Date: Tue, 20 Jan 2009 13:07:17 -0800
In-Reply-To: (Eric W. Biederman's message of "Tue\, 20 Jan 2009 13\:03\:21 -0800")
Subject: [PATCH 08/10] tun: Fix races between tun_net_close and free_netdev.
X-Mailing-List: netdev@vger.kernel.org

The tun code does not cope gracefully if the network device goes away before
the tun file descriptor is closed.  It looks like we can trigger this with
rmmod, and moving tun devices between network namespaces will allow this
to be triggered when network namespaces exit.

To fix this I introduce an intermediate data structure tun_file which
holds a count of users and a pointer to the struct tun_struct.  tun_get
increments that reference count if it is greater than 0.  tun_put decrements
that reference count and detaches from the network device if the count is 0.

While we have a file attached to the network device I hold a reference
to the network device keeping it from going away completely.

When a network device is unregistered I decrement the count of the
attached tun_file and if that was the last user I detach the tun_file,
and all processes on read_wait are woken up to ensure they do not
sleep indefinitely. As some of those sleeps happen with the count on the
tun device elevated waking up the read waiters ensures that tun_file
will be detached in a timely manner.

Signed-off-by: Eric W. Biederman
---
 drivers/net/tun.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++--
 1 files changed, 48 insertions(+), 2 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 030d985..51dba61 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -88,6 +88,7 @@ struct tap_filter {
 };
 
 struct tun_file {
+	atomic_t count;
 	struct tun_struct *tun;
 	struct net *net;
 	wait_queue_head_t read_wait;
@@ -138,6 +139,8 @@ static int tun_attach(struct tun_struct *tun, struct file *file)
 	err = 0;
 	tfile->tun = tun;
 	tun->tfile = tfile;
+	dev_hold(tun->dev);
+	atomic_inc(&tfile->count);
 
 out:
 	netif_tx_unlock_bh(tun->dev);
@@ -156,11 +159,26 @@ static void __tun_detach(struct tun_struct *tun)
 
 	/* Drop read queue */
 	skb_queue_purge(&tun->readq);
+
+	/* Drop the extra count on the net device */
+	dev_put(tun->dev);
+}
+
+static void tun_detach(struct tun_struct *tun)
+{
+	rtnl_lock();
+	__tun_detach(tun);
+	rtnl_unlock();
 }
 
 static struct tun_struct *__tun_get(struct tun_file *tfile)
 {
-	return tfile->tun;
+	struct tun_struct *tun = NULL;
+
+	if (atomic_inc_not_zero(&tfile->count))
+		tun = tfile->tun;
+
+	return tun;
 }
 
 static struct tun_struct *tun_get(struct file *file)
@@ -170,7 +188,10 @@ static struct tun_struct *tun_get(struct file *file)
 
 static void tun_put(struct tun_struct *tun)
 {
-	/* Noop for now */
+	struct tun_file *tfile = tun->tfile;
+
+	if (atomic_dec_and_test(&tfile->count))
+		tun_detach(tfile->tun);
 }
 
 /* TAP filterting */
@@ -281,6 +302,21 @@ static int check_filter(struct tap_filter *filter, const struct sk_buff *skb)
 
 static const struct ethtool_ops tun_ethtool_ops;
 
+/* Net device detach from fd. */
+static void tun_net_uninit(struct net_device *dev)
+{
+	struct tun_struct *tun = netdev_priv(dev);
+	struct tun_file *tfile = tun->tfile;
+
+	/* Inform the methods they need to stop using the dev.
+	 */
+	if (tfile) {
+		wake_up_all(&tfile->read_wait);
+		if (atomic_dec_and_test(&tfile->count))
+			__tun_detach(tun);
+	}
+}
+
 /* Net device open. */
 static int tun_net_open(struct net_device *dev)
 {
@@ -367,6 +403,7 @@ tun_net_change_mtu(struct net_device *dev, int new_mtu)
 }
 
 static const struct net_device_ops tun_netdev_ops = {
+	.ndo_uninit = tun_net_uninit,
 	.ndo_open = tun_net_open,
 	.ndo_stop = tun_net_close,
 	.ndo_start_xmit = tun_net_xmit,
@@ -374,6 +411,7 @@ static const struct net_device_ops tun_netdev_ops = {
 };
 
 static const struct net_device_ops tap_netdev_ops = {
+	.ndo_uninit = tun_net_uninit,
 	.ndo_open = tun_net_open,
 	.ndo_stop = tun_net_close,
 	.ndo_start_xmit = tun_net_xmit,
@@ -434,6 +472,9 @@ static unsigned int tun_chr_poll(struct file *file, poll_table * wait)
 	if (!skb_queue_empty(&tun->readq))
 		mask |= POLLIN | POLLRDNORM;
 
+	if (tun->dev->reg_state != NETREG_REGISTERED)
+		mask = POLLERR;
+
 	tun_put(tun);
 	return mask;
 }
@@ -734,6 +775,10 @@ static ssize_t tun_chr_aio_read(struct kiocb *iocb, const struct iovec *iv,
 				ret = -ERESTARTSYS;
 				break;
 			}
+			if (tun->dev->reg_state != NETREG_REGISTERED) {
+				ret = -EIO;
+				break;
+			}
 
 			/* Nothing to read, let's sleep */
 			schedule();
@@ -1135,6 +1180,7 @@ static int tun_chr_open(struct inode *inode, struct file * file)
 	tfile = kmalloc(sizeof(*tfile), GFP_KERNEL);
 	if (!tfile)
 		return -ENOMEM;
+	atomic_set(&tfile->count, 0);
 	tfile->tun = NULL;
 	tfile->net = get_net(current->nsproxy->net_ns);
 	init_waitqueue_head(&tfile->read_wait);
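
The counting rule from the commit message (take a reference only while the
count is above zero, detach when it drops back to zero) can be tried outside
the kernel. What follows is only a rough userspace sketch, assuming C11
<stdatomic.h> in place of the kernel's atomic_t helpers. struct dev,
struct holder, holder_get() and holder_put() are hypothetical stand-ins for
tun_struct, tun_file, __tun_get() and tun_put(); rtnl locking, dev_hold()/
dev_put() and the read_wait queue are deliberately left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct tun_struct. */
struct dev {
	int detached;
};

/* Hypothetical stand-in for struct tun_file: a user count plus a pointer. */
struct holder {
	atomic_int count;
	struct dev *dev;
};

/* Userspace analogue of atomic_inc_not_zero(): succeed only if count > 0. */
static bool inc_not_zero(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 0) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return true;
	}
	return false;
}

/* Userspace analogue of atomic_dec_and_test(): true when count hits 0. */
static bool dec_and_test(atomic_int *v)
{
	return atomic_fetch_sub(v, 1) == 1;
}

/* Mirrors __tun_get(): hand out the device only while it is attached. */
static struct dev *holder_get(struct holder *h)
{
	return inc_not_zero(&h->count) ? h->dev : NULL;
}

/* Mirrors tun_put(): the last user to drop its reference detaches. */
static void holder_put(struct holder *h)
{
	if (dec_and_test(&h->count))
		h->dev->detached = 1;
}
```

With the count at 0 (the state right after tun_chr_open), holder_get()
returns NULL. Once an attach has raised the count to 1, each holder_get()
must be paired with a holder_put(), and the put that brings the count back
to 0 is the one that detaches.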