diff mbox

[RFC] ipv4: release dev refcnt early when destroying inetdev

Message ID 4C8A3430.2070105@6wind.com
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Nicolas Dichtel Sept. 10, 2010, 1:35 p.m. UTC
Hi all,

We got a scalability problem when we try to remove a lot of virtual interfaces. 
After analysis, we found that a refcnt on a device was released too late.
Here is a proposal patch. If we are not missing something, the refcnt can be 
release before call_rcu(). In IPv6, this is already the case.

Comments are welcome.


Regards,
Nicolas

Comments

Eric Dumazet Sept. 10, 2010, 2:24 p.m. UTC | #1
Le vendredi 10 septembre 2010 à 15:35 +0200, Nicolas Dichtel a écrit :
> Hi all,
> 
> We got a scalability problem when we try to remove a lot of virtual interfaces. 
> After analysis, we found that a refcnt on a device was released too late.
> Here is a proposal patch. If we are not missing something, the refcnt can be 
> release before call_rcu(). In IPv6, this is already the case.
> 
> Comments are welcome.
> 
> 
> Regards,
> Nicolas
> pièce jointe différences entre fichiers
> (0001-ipv4-release-dev-refcnt-early-when-destroying-inetd.patch)
> From 6fe291ff56b1f94599dfaa57dfb0ed4c168b603f Mon Sep 17 00:00:00 2001
> From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
> Date: Fri, 10 Sep 2010 14:52:15 +0200
> Subject: [PATCH] ipv4: release dev refcnt early when destroying inetdev
> 
> When a virtual device is removed, refcnt on dev is released
> after rcu barrier, hence we fall always in the msleep(250)
> of netdev_wait_allrefs(). This causes a long delay when
> a lot of interfaces are removed.
> Refcnt can be released before this rcu barrier, this allows
> to accelerate the removing of virtual interfaces.
> 
> Test of removing 50 ipip tunnel interfaces:
>  Before the patch:
>   real    0m12.804s
>   user    0m0.020s
>   sys     0m0.000s
> 
>  After the patch:
>   real    0m0.988s
>   user    0m0.004s
>   sys     0m0.016s
> 
> Signed-off-by: Wang Xuefu <xuefu.wang@6wind.com>
> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
> ---

This is a well known problem, (many patches were sent some months ago)
but your patch is not the right solution.

As long as the idev is not yet freed, it can be used and we need to
access idev->dev



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Nicolas Dichtel Sept. 10, 2010, 2:57 p.m. UTC | #2
Le 10.09.2010 16:24, Eric Dumazet a écrit :
> Le vendredi 10 septembre 2010 à 15:35 +0200, Nicolas Dichtel a écrit :
>> Hi all,
>>
>> We got a scalability problem when we try to remove a lot of virtual interfaces. 
>> After analysis, we found that a refcnt on a device was released too late.
>> Here is a proposal patch. If we are not missing something, the refcnt can be 
>> release before call_rcu(). In IPv6, this is already the case.
>>
>> Comments are welcome.
>>
>>
>> Regards,
>> Nicolas
>> pièce jointe différences entre fichiers
>> (0001-ipv4-release-dev-refcnt-early-when-destroying-inetd.patch)
>> From 6fe291ff56b1f94599dfaa57dfb0ed4c168b603f Mon Sep 17 00:00:00 2001
>> From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
>> Date: Fri, 10 Sep 2010 14:52:15 +0200
>> Subject: [PATCH] ipv4: release dev refcnt early when destroying inetdev
>>
>> When a virtual device is removed, refcnt on dev is released
>> after rcu barrier, hence we fall always in the msleep(250)
>> of netdev_wait_allrefs(). This causes a long delay when
>> a lot of interfaces are removed.
>> Refcnt can be released before this rcu barrier, this allows
>> to accelerate the removing of virtual interfaces.
>>
>> Test of removing 50 ipip tunnel interfaces:
>>  Before the patch:
>>   real    0m12.804s
>>   user    0m0.020s
>>   sys     0m0.000s
>>
>>  After the patch:
>>   real    0m0.988s
>>   user    0m0.004s
>>   sys     0m0.016s
>>
>> Signed-off-by: Wang Xuefu <xuefu.wang@6wind.com>
>> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
>> ---
> 
> This is a well known problem, (many patches were sent some months ago)
> but your patch is not the right solution.
> 
> As long as the idev is not yet freed, it can be used and we need to
> access idev->dev

Is this not true in IPv6? What is the difference?


Regards,
Nicolas
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

From 6fe291ff56b1f94599dfaa57dfb0ed4c168b603f Mon Sep 17 00:00:00 2001
From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Date: Fri, 10 Sep 2010 14:52:15 +0200
Subject: [PATCH] ipv4: release dev refcnt early when destroying inetdev

When a virtual device is removed, refcnt on dev is released
after rcu barrier, hence we fall always in the msleep(250)
of netdev_wait_allrefs(). This causes a long delay when
a lot of interfaces are removed.
Refcnt can be released before this rcu barrier, this allows
to accelerate the removing of virtual interfaces.

Test of removing 50 ipip tunnel interfaces:
 Before the patch:
  real    0m12.804s
  user    0m0.020s
  sys     0m0.000s

 After the patch:
  real    0m0.988s
  user    0m0.004s
  sys     0m0.016s

Signed-off-by: Wang Xuefu <xuefu.wang@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
---
 net/ipv4/devinet.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index da14c49..dd59e79 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -131,7 +131,9 @@  static inline void inet_free_ifa(struct in_ifaddr *ifa)
 
 void in_dev_finish_destroy(struct in_device *idev)
 {
+#ifdef NET_REFCNT_DEBUG
 	struct net_device *dev = idev->dev;
+#endif
 
 	WARN_ON(idev->ifa_list);
 	WARN_ON(idev->mc_list);
@@ -139,7 +141,6 @@  void in_dev_finish_destroy(struct in_device *idev)
 	printk(KERN_DEBUG "in_dev_finish_destroy: %p=%s\n",
 	       idev, dev ? dev->name : "NIL");
 #endif
-	dev_put(dev);
 	if (!idev->dead)
 		pr_err("Freeing alive in_device %p\n", idev);
 	else
@@ -215,6 +216,7 @@  static void inetdev_destroy(struct in_device *in_dev)
 	neigh_parms_release(&arp_tbl, in_dev->arp_parms);
 	arp_ifdown(dev);
 
+	dev_put(dev);
 	call_rcu(&in_dev->rcu_head, in_dev_rcu_put);
 }
 
-- 
1.5.4.5