From patchwork Thu Oct 20 14:29:24 2011
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Eric W. Biederman"
X-Patchwork-Id: 120824
X-Patchwork-Delegate: davem@davemloft.net
Return-Path:
X-Original-To: patchwork-incoming@ozlabs.org
Delivered-To: patchwork-incoming@ozlabs.org
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
 by ozlabs.org (Postfix) with ESMTP id E8C4CB6F69
 for ; Fri, 21 Oct 2011 01:28:59 +1100 (EST)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1755767Ab1JTO2y (ORCPT ); Thu, 20 Oct 2011 10:28:54 -0400
Received: from out01.mta.xmission.com ([166.70.13.231]:52000 "EHLO
 out01.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1755734Ab1JTO2x (ORCPT );
 Thu, 20 Oct 2011 10:28:53 -0400
Received: from in01.mta.xmission.com ([166.70.13.51])
 by out01.mta.xmission.com with esmtps (TLSv1:AES256-SHA:256) (Exim 4.69)
 (envelope-from ) id 1RGtc5-0003Mz-4N; Thu, 20 Oct 2011 08:28:53 -0600
Received: from c-98-207-153-68.hsd1.ca.comcast.net ([98.207.153.68]
 helo=x61.ebiederm.org) by in01.mta.xmission.com with esmtpsa
 (TLSv1:AES256-SHA:256) (Exim 4.69)
 (envelope-from ) id 1RGtc3-00070K-U3; Thu, 20 Oct 2011 08:28:53 -0600
Received: from fess.ebiederm.org (fess.int.ebiederm.org [192.168.4.7])
 by x61.ebiederm.org (Postfix) with ESMTP id 8148C39D7A;
 Thu, 20 Oct 2011 07:27:36 -0700 (PDT)
Received: by fess.ebiederm.org (Postfix, from userid 502)
 id 57591C04A1; Thu, 20 Oct 2011 07:29:24 -0700 (PDT)
From: ebiederm@xmission.com (Eric W. Biederman)
To: David Miller
Cc: , Arnd Bergmann , Jason Wang , "Michael S. Tsirkin" , Ian Campbell , Shirly Ma
References:
Date: Thu, 20 Oct 2011 07:29:24 -0700
In-Reply-To: (Eric W.
 Biederman's message of "Thu, 20 Oct 2011 07:28:46 -0700")
Message-ID:
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.2 (gnu/linux)
MIME-Version: 1.0
X-XM-SPF: eid=; ; ; mid=; ; ; hst=in01.mta.xmission.com; ; ; ip=98.207.153.68; ; ; frm=ebiederm@xmission.com; ; ; spf=neutral
X-XM-AID: U2FsdGVkX1/ky6EQEjh8ymcNeVJqZgiWTYjxhxYB8Hc=
X-SA-Exim-Connect-IP: 98.207.153.68
X-SA-Exim-Mail-From: ebiederm@xmission.com
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on sa04.xmission.com
X-Spam-Level:
X-Spam-Status: No, score=-1.1 required=8.0 tests=BAYES_00,
 DCC_CHECK_NEGATIVE, T_XMDrugObfuBody_04,T_XMDrugObfuBody_14,UNTRUSTED_Relay,XMNoVowels
 autolearn=disabled version=3.3.1
X-Spam-Report: * 1.5 XMNoVowels Alpha-numberic number with no vowels
 * -3.0 BAYES_00 BODY: Bayes spam probability is 0 to 1%
 * [score: 0.0000]
 * -0.0 DCC_CHECK_NEGATIVE Not listed in DCC
 * [sa04 1397; Body=1 Fuz1=1 Fuz2=1]
 * 0.0 T_XMDrugObfuBody_14 obfuscated drug references
 * 0.0 T_XMDrugObfuBody_04 obfuscated drug references
 * 0.4 UNTRUSTED_Relay Comes from a non-trusted relay
X-Spam-DCC: XMission; sa04 1397; Body=1 Fuz1=1 Fuz2=1
X-Spam-Combo: ;David Miller
X-Spam-Relay-Country: **
Subject: [PATCH 5/5] macvtap: Fix the minor device number allocation
X-Spam-Flag: No
X-SA-Exim-Version: 4.2.1 (built Fri, 06 Aug 2010 16:31:04 -0600)
X-SA-Exim-Scanned: Yes (on in01.mta.xmission.com)
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

On systems that create and delete lots of dynamic devices the
31-bit Linux ifindex fails to fit in the 16-bit macvtap minor,
resulting in unusable macvtap devices. I have systems running
automated tests that hit this condition in just a few days.

Use a Linux idr allocator to track which macvtap minor numbers
are available and to track the association between macvtap
minor numbers and macvtap network devices.

Remove the unnecessary check to see if the network device we
have found is indeed a macvtap device.
With macvtap-specific data structures it is impossible to find
any other kind of networking device.

Increase the macvtap minor range from 65536 to the full 20 bits
that are supported by Linux device numbers. It doesn't solve the
original problem but there is no penalty for a larger minor
device range.

Signed-off-by: Eric W. Biederman
---
 drivers/net/macvtap.c      |   87 ++++++++++++++++++++++++++++++++++++--------
 include/linux/if_macvlan.h |    1 +
 2 files changed, 72 insertions(+), 16 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index 25689e9..1b7082d 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -51,15 +51,13 @@ static struct proto macvtap_proto = {
 };
 
 /*
- * Minor number matches netdev->ifindex, so need a potentially
- * large value. This also makes it possible to split the
- * tap functionality out again in the future by offering it
- * from other drivers besides macvtap. As long as every device
- * only has one tap, the interface numbers assure that the
- * device nodes are unique.
+ * Variables for dealing with macvtaps device numbers.
  */
 static dev_t macvtap_major;
-#define MACVTAP_NUM_DEVS 65536
+#define MACVTAP_NUM_DEVS (1U << MINORBITS)
+static DEFINE_MUTEX(minor_lock);
+static DEFINE_IDR(minor_idr);
+
 #define GOODCOPY_LEN 128
 static struct class *macvtap_class;
 static struct cdev macvtap_cdev;
@@ -275,6 +273,58 @@ static int macvtap_receive(struct sk_buff *skb)
 	return macvtap_forward(skb->dev, skb);
 }
 
+static int macvtap_get_minor(struct macvlan_dev *vlan)
+{
+	int retval = -ENOMEM;
+	int id;
+
+	mutex_lock(&minor_lock);
+	if (idr_pre_get(&minor_idr, GFP_KERNEL) == 0)
+		goto exit;
+
+	retval = idr_get_new_above(&minor_idr, vlan, 1, &id);
+	if (retval < 0) {
+		if (retval == -EAGAIN)
+			retval = -ENOMEM;
+		goto exit;
+	}
+	if (id < MACVTAP_NUM_DEVS) {
+		vlan->minor = id;
+	} else {
+		printk(KERN_ERR "too many macvtap devices\n");
+		retval = -EINVAL;
+		idr_remove(&minor_idr, id);
+	}
+exit:
+	mutex_unlock(&minor_lock);
+	return retval;
+}
+
+static void macvtap_free_minor(struct macvlan_dev *vlan)
+{
+	mutex_lock(&minor_lock);
+	if (vlan->minor) {
+		idr_remove(&minor_idr, vlan->minor);
+		vlan->minor = 0;
+	}
+	mutex_unlock(&minor_lock);
+}
+
+static struct net_device *dev_get_by_macvtap_minor(int minor)
+{
+	struct net_device *dev = NULL;
+	struct macvlan_dev *vlan;
+
+	mutex_lock(&minor_lock);
+	vlan = idr_find(&minor_idr, minor);
+	if (vlan) {
+		dev = vlan->dev;
+		dev_hold(dev);
+	}
+	mutex_unlock(&minor_lock);
+	return dev;
+}
+
 static int macvtap_newlink(struct net *src_net,
 			   struct net_device *dev,
 			   struct nlattr *tb[],
@@ -329,7 +379,7 @@ static void macvtap_sock_destruct(struct sock *sk)
 static int macvtap_open(struct inode *inode, struct file *file)
 {
 	struct net *net = current->nsproxy->net_ns;
-	struct net_device *dev = dev_get_by_index(net, iminor(inode));
+	struct net_device *dev = dev_get_by_macvtap_minor(iminor(inode));
 	struct macvtap_queue *q;
 	int err;
 
@@ -337,11 +387,6 @@ static int macvtap_open(struct inode *inode, struct file *file)
 	if (!dev)
 		goto out;
 
-	/* check if this is a macvtap device */
-	err = -EINVAL;
-	if (dev->rtnl_link_ops != &macvtap_link_ops)
-		goto out;
-
 	err = -ENOMEM;
 	q = (struct macvtap_queue *)sk_alloc(net, AF_UNSPEC, GFP_KERNEL,
 					     &macvtap_proto);
@@ -961,12 +1006,15 @@ static int macvtap_device_event(struct notifier_block *unused,
 				unsigned long event, void *ptr)
 {
 	struct net_device *dev = ptr;
+	struct macvlan_dev *vlan;
 	struct device *classdev;
 	dev_t devt;
+	int err;
 
 	if (dev->rtnl_link_ops != &macvtap_link_ops)
 		return NOTIFY_DONE;
 
+	vlan = netdev_priv(dev);
 
 	switch (event) {
 	case NETDEV_REGISTER:
@@ -974,15 +1022,22 @@ static int macvtap_device_event(struct notifier_block *unused,
 		 * been registered but before register_netdevice has
 		 * finished running.
 		 */
-		devt = MKDEV(MAJOR(macvtap_major), dev->ifindex);
+		err = macvtap_get_minor(vlan);
+		if (err)
+			return notifier_from_errno(err);
+
+		devt = MKDEV(MAJOR(macvtap_major), vlan->minor);
 		classdev = device_create(macvtap_class, &dev->dev, devt,
 					 dev, "tap%d", dev->ifindex);
-		if (IS_ERR(classdev))
+		if (IS_ERR(classdev)) {
+			macvtap_free_minor(vlan);
 			return notifier_from_errno(PTR_ERR(classdev));
+		}
 		break;
 	case NETDEV_UNREGISTER:
-		devt = MKDEV(MAJOR(macvtap_major), dev->ifindex);
+		devt = MKDEV(MAJOR(macvtap_major), vlan->minor);
 		device_destroy(macvtap_class, devt);
+		macvtap_free_minor(vlan);
 		break;
 	}
 
diff --git a/include/linux/if_macvlan.h b/include/linux/if_macvlan.h
index e28b2e4..d103dca 100644
--- a/include/linux/if_macvlan.h
+++ b/include/linux/if_macvlan.h
@@ -64,6 +64,7 @@ struct macvlan_dev {
 	int (*forward)(struct net_device *dev, struct sk_buff *skb);
 	struct macvtap_queue *taps[MAX_MACVTAP_QUEUES];
 	int numvtaps;
+	int minor;
 };
 
 static inline void macvlan_count_rx(const struct macvlan_dev *vlan,
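
For readers who want to see the allocation scheme from the changelog in
isolation, here is a minimal userspace sketch of the same idea: hand out
the lowest free minor starting at 1, remember which device owns it, and
map a minor back to its owner on open. This is not the kernel idr API.
The names fake_vlan, minor_table, get_minor, free_minor and
find_by_minor, and the tiny NUM_DEVS limit, are hypothetical and chosen
only for illustration. Locking and reference counting are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, tiny limit for illustration; the patch uses 1U << MINORBITS. */
#define NUM_DEVS 8

/* Stand-in for struct macvlan_dev: only the field the patch adds. */
struct fake_vlan {
	int minor;	/* 0 means "no minor assigned", as in the patch */
};

/* minor -> owner; slot 0 stays unused because minors start at 1. */
static struct fake_vlan *minor_table[NUM_DEVS];

/* Assign the lowest free minor >= 1, or return -EINVAL when the range is full. */
int get_minor(struct fake_vlan *vlan)
{
	int id;

	for (id = 1; id < NUM_DEVS; id++) {
		if (!minor_table[id]) {
			minor_table[id] = vlan;
			vlan->minor = id;
			return 0;
		}
	}
	return -EINVAL;	/* corresponds to "too many macvtap devices" */
}

/* Release the minor so that a later device can reuse it. */
void free_minor(struct fake_vlan *vlan)
{
	if (vlan->minor) {
		assert(vlan->minor < NUM_DEVS);
		minor_table[vlan->minor] = NULL;
		vlan->minor = 0;
	}
}

/* Map a minor back to its owner; NULL if out of range or unassigned. */
struct fake_vlan *find_by_minor(int minor)
{
	if (minor < 1 || minor >= NUM_DEVS)
		return NULL;
	return minor_table[minor];
}
```

In this sketch a freed minor is handed out again to the next device, so
the minor stays small no matter how large ifindex has grown on a system
that creates and deletes many devices.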