diff mbox

netns: Coexist with the sysfs limitations

Message ID m17i80nzff.fsf@frodo.ebiederm.org
State Superseded, archived
Delegated to: David Miller
Headers show

Commit Message

Eric W. Biederman Oct. 22, 2008, 8:32 p.m. UTC
To make testing of the network namespace simpler allow
the network namespace code and the sysfs code to be
compiled and run at the same time.  To do this only
virtual devices are allowed in the additional network
namespaces and those virtual devices are not placed
in the kobject tree.

Since virtual devices don't actually do anything interesting
hardware wise that needs device management there should
be no loss in keeping them out of the kobject tree and
by implication sysfs.  The gain in ease of testing
and code coverage should be significant.

I.e. people running distributions that make it next to
impossible to boot without sysfs should at be able to
boot a test kernel now.

Plus no ABIs are harmed with this patch.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
 net/Kconfig          |    2 +-
 net/core/dev.c       |   12 +++++++++++-
 net/core/net-sysfs.c |    7 +++++++
 3 files changed, 19 insertions(+), 2 deletions(-)

Comments

Daniel Lezcano Oct. 22, 2008, 8:40 p.m. UTC | #1
Eric W. Biederman wrote:
> To make testing of the network namespace simpler allow
> the network namespace code and the sysfs code to be
> compiled and run at the same time.  To do this only
> virtual devices are allowed in the additional network
> namespaces and those virtual devices are not placed
> in the kobject tree.
> 
> Since virtual devices don't actually do anything interesting
> hardware wise that needs device management there should
> be no loss in keeping them out of the kobject tree and
> by implication sysfs.  The gain in ease of testing
> and code coverage should be significant.
> 
> I.e. people running distributions that make it next to
> impossible to boot without sysfs should at be able to
> boot a test kernel now.
> 
> Plus no ABIs are harmed with this patch.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>

> ---
>  net/Kconfig          |    2 +-
>  net/core/dev.c       |   12 +++++++++++-
>  net/core/net-sysfs.c |    7 +++++++
>  3 files changed, 19 insertions(+), 2 deletions(-)
> 
> diff --git a/net/Kconfig b/net/Kconfig
> index d789d79..8c3d97c 100644
> --- a/net/Kconfig
> +++ b/net/Kconfig
> @@ -27,7 +27,7 @@ menu "Networking options"
>  config NET_NS
>  	bool "Network namespace support"
>  	default n
> -	depends on EXPERIMENTAL && !SYSFS && NAMESPACES
> +	depends on EXPERIMENTAL && NAMESPACES
>  	help
>  	  Allow user space to create what appear to be multiple instances
>  	  of the network stack.
> diff --git a/net/core/dev.c b/net/core/dev.c
> index b8a4fd0..a7f0461 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -4449,6 +4449,15 @@ int dev_change_net_namespace(struct net_device *dev, struct net *net, const char
>  	if (dev->features & NETIF_F_NETNS_LOCAL)
>  		goto out;
> 
> +#ifdef CONFIG_SYSFS
> +	/* Don't allow real devices to be moved when sysfs
> +	 * is enabled.
> +	 */
> +	err = -EINVAL;
> +	if (dev->dev.parent)
> +		goto out;
> +#endif
> +
>  	/* Ensure the device has been registrered */
>  	err = -EINVAL;
>  	if (dev->reg_state != NETREG_REGISTERED)
> @@ -4506,6 +4515,8 @@ int dev_change_net_namespace(struct net_device *dev, struct net *net, const char
>  	 */
>  	dev_addr_discard(dev);
> 
> +	netdev_unregister_kobject(dev);
> +
>  	/* Actually switch the network namespace */
>  	dev_net_set(dev, net);
> 
> @@ -4522,7 +4533,6 @@ int dev_change_net_namespace(struct net_device *dev, struct net *net, const char
>  	}
> 
>  	/* Fixup kobjects */
> -	netdev_unregister_kobject(dev);
>  	err = netdev_register_kobject(dev);
>  	WARN_ON(err);
> 
> diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
> index 92d6b94..85cb8bd 100644
> --- a/net/core/net-sysfs.c
> +++ b/net/core/net-sysfs.c
> @@ -476,6 +476,10 @@ void netdev_unregister_kobject(struct net_device * net)
>  	struct device *dev = &(net->dev);
> 
>  	kobject_get(&dev->kobj);
> +
> +	if (dev_net(net) != &init_net)
> +		return;
> +
>  	device_del(dev);
>  }
> 
> @@ -501,6 +505,9 @@ int netdev_register_kobject(struct net_device *net)
>  #endif
>  #endif /* CONFIG_SYSFS */
> 
> +	if (dev_net(net) != &init_net)
> +		return 0;
> +
>  	return device_add(dev);
>  }
>
Serge E. Hallyn Oct. 22, 2008, 9:21 p.m. UTC | #2
Quoting Eric W. Biederman (ebiederm@xmission.com):
> 
> To make testing of the network namespace simpler allow
> the network namespace code and the sysfs code to be
> compiled and run at the same time.  To do this only
> virtual devices are allowed in the additional network
> namespaces and those virtual devices are not placed
> in the kobject tree.
> 
> Since virtual devices don't actually do anything interesting
> hardware wise that needs device management there should
> be no loss in keeping them out of the kobject tree and
> by implication sysfs.  The gain in ease of testing
> and code coverage should be significant.
> 
> I.e. people running distributions that make it next to
> impossible to boot without sysfs should at be able to
> boot a test kernel now.
> 
> Plus no ABIs are harmed with this patch.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

Duh.

Tested-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>

Thanks, Eric!  Thanks, Benjamin!

-serge
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Benjamin Thery Oct. 23, 2008, 8:04 a.m. UTC | #3
Serge E. Hallyn wrote:
> Quoting Eric W. Biederman (ebiederm@xmission.com):
>> To make testing of the network namespace simpler allow
>> the network namespace code and the sysfs code to be
>> compiled and run at the same time.  To do this only
>> virtual devices are allowed in the additional network
>> namespaces and those virtual devices are not placed
>> in the kobject tree.
>>
>> Since virtual devices don't actually do anything interesting
>> hardware wise that needs device management there should
>> be no loss in keeping them out of the kobject tree and
>> by implication sysfs.  The gain in ease of testing
>> and code coverage should be significant.
>>
>> I.e. people running distributions that make it next to
>> impossible to boot without sysfs should at be able to
>> boot a test kernel now.
>>
>> Plus no ABIs are harmed with this patch.

>> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
> 
> Duh.
> 
> Tested-by: Serge Hallyn <serue@us.ibm.com>
> Acked-by: Serge Hallyn <serue@us.ibm.com>

Oh, this patch is short, clean, and the limitation introduced isn't too
annoying for testing netns right now.

At least, my proposal provoked some reactions :)

BTW, there's a second limitation with your patch:
we can't rename the net devices in the additional network namespaces.

In net/core/dev.c, dev_change_name() fails: call to device_rename() 
return an (expected) -EINVAL error.
Maybe we should add a test on the net to only call it in init_net?

  rollback:
-       err = device_rename(&dev->dev, dev->name);
-       if (err) {
-               memcpy(dev->name, oldname, IFNAMSIZ);
-               return err;
+       if (net == &init_net) {
+               err = device_rename(&dev->dev, dev->name);
+               if (err) {
+                       memcpy(dev->name, oldname, IFNAMSIZ);
+                       return err;
+               }
         }

Otherwise,

Acked-by: Benjamin Thery <benjamin.thery@bull.net>

Thanks,
Benjamin

> 
> Thanks, Eric!  Thanks, Benjamin!
> 
> -serge
> 
>
Eric W. Biederman Oct. 23, 2008, 3:40 p.m. UTC | #4
Benjamin Thery <benjamin.thery@bull.net> writes:

> Serge E. Hallyn wrote:
>> Quoting Eric W. Biederman (ebiederm@xmission.com):
>>> To make testing of the network namespace simpler allow
>>> the network namespace code and the sysfs code to be
>>> compiled and run at the same time.  To do this only
>>> virtual devices are allowed in the additional network
>>> namespaces and those virtual devices are not placed
>>> in the kobject tree.
>>>
>>> Since virtual devices don't actually do anything interesting
>>> hardware wise that needs device management there should
>>> be no loss in keeping them out of the kobject tree and
>>> by implication sysfs.  The gain in ease of testing
>>> and code coverage should be significant.
>>>
>>> I.e. people running distributions that make it next to
>>> impossible to boot without sysfs should at be able to
>>> boot a test kernel now.
>>>
>>> Plus no ABIs are harmed with this patch.
>
>>> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
>>
>> Duh.
>>
>> Tested-by: Serge Hallyn <serue@us.ibm.com>
>> Acked-by: Serge Hallyn <serue@us.ibm.com>
>
> Oh, this patch is short, clean, and the limitation introduced isn't too
> annoying for testing netns right now.
>
> At least, my proposal provoked some reactions :)

Yes.

> BTW, there's a second limitation with your patch:
> we can't rename the net devices in the additional network namespaces.
>
> In net/core/dev.c, dev_change_name() fails: call to device_rename() return an
> (expected) -EINVAL error.
> Maybe we should add a test on the net to only call it in init_net?

Yes.  Good catch.

Eric
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/Kconfig b/net/Kconfig
index d789d79..8c3d97c 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -27,7 +27,7 @@  menu "Networking options"
 config NET_NS
 	bool "Network namespace support"
 	default n
-	depends on EXPERIMENTAL && !SYSFS && NAMESPACES
+	depends on EXPERIMENTAL && NAMESPACES
 	help
 	  Allow user space to create what appear to be multiple instances
 	  of the network stack.
diff --git a/net/core/dev.c b/net/core/dev.c
index b8a4fd0..a7f0461 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4449,6 +4449,15 @@  int dev_change_net_namespace(struct net_device *dev, struct net *net, const char
 	if (dev->features & NETIF_F_NETNS_LOCAL)
 		goto out;
 
+#ifdef CONFIG_SYSFS
+	/* Don't allow real devices to be moved when sysfs
+	 * is enabled.
+	 */
+	err = -EINVAL;
+	if (dev->dev.parent)
+		goto out;
+#endif
+
 	/* Ensure the device has been registrered */
 	err = -EINVAL;
 	if (dev->reg_state != NETREG_REGISTERED)
@@ -4506,6 +4515,8 @@  int dev_change_net_namespace(struct net_device *dev, struct net *net, const char
 	 */
 	dev_addr_discard(dev);
 
+	netdev_unregister_kobject(dev);
+
 	/* Actually switch the network namespace */
 	dev_net_set(dev, net);
 
@@ -4522,7 +4533,6 @@  int dev_change_net_namespace(struct net_device *dev, struct net *net, const char
 	}
 
 	/* Fixup kobjects */
-	netdev_unregister_kobject(dev);
 	err = netdev_register_kobject(dev);
 	WARN_ON(err);
 
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 92d6b94..85cb8bd 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -476,6 +476,10 @@  void netdev_unregister_kobject(struct net_device * net)
 	struct device *dev = &(net->dev);
 
 	kobject_get(&dev->kobj);
+
+	if (dev_net(net) != &init_net)
+		return;
+
 	device_del(dev);
 }
 
@@ -501,6 +505,9 @@  int netdev_register_kobject(struct net_device *net)
 #endif
 #endif /* CONFIG_SYSFS */
 
+	if (dev_net(net) != &init_net)
+		return 0;
+
 	return device_add(dev);
 }