[3/7] ns proc: Add support for the network namespace.

Message ID: 1304735101-1824-3-git-send-email-ebiederm@xmission.com
State: Changes Requested, archived
Delegated to: David Miller

Commit Message

Eric W. Biederman May 7, 2011, 2:24 a.m. UTC
Implementing file descriptors for the network namespace
is simple and straightforward.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
 fs/proc/namespaces.c     |    3 +++
 include/linux/proc_fs.h  |    1 +
 net/core/net_namespace.c |   31 +++++++++++++++++++++++++++++++
 3 files changed, 35 insertions(+), 0 deletions(-)

Comments

Daniel Lezcano May 7, 2011, 10:41 p.m. UTC | #1
On 05/07/2011 04:24 AM, Eric W. Biederman wrote:
> Implementing file descriptors for the network namespace
> is simple and straightforward.
>
> Signed-off-by: Eric W. Biederman<ebiederm@xmission.com>
> ---

Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Nathan Lynch May 11, 2011, 7:21 p.m. UTC | #2
On Fri, 2011-05-06 at 19:24 -0700, Eric W. Biederman wrote:
> diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
> index 3f86026..bf7707e 100644
> --- a/net/core/net_namespace.c
> +++ b/net/core/net_namespace.c
> @@ -573,3 +573,34 @@ void unregister_pernet_device(struct pernet_operations *ops)
>  	mutex_unlock(&net_mutex);
>  }
>  EXPORT_SYMBOL_GPL(unregister_pernet_device);
> +
> +#ifdef CONFIG_NET_NS
> +static void *netns_get(struct task_struct *task)
> +{
> +	struct net *net;
> +	rcu_read_lock();
> +	net = get_net(task->nsproxy->net_ns);

This should use task_nsproxy() and check the result before grabbing the
net_ns, but I think you fix that in a later patch.

Regardless, it looks as if all the proc_ns_ops->get() implementations
really just want the nsproxy, so maybe the get() methods should take
that instead of the task_struct, and proc_ns_instantiate() should do
something like:

struct nsproxy *nsproxy;
...

ei->ns_ops = ns_ops;
error = -ESRCH;
rcu_read_lock();
nsproxy = task_nsproxy(task);
rcu_read_unlock();
if (!nsproxy)
	goto out;
ei->ns = ns_ops->get(nsproxy);


So then the zombie check is consolidated in one place instead of having
to do it in every get() method.
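
The consolidation proposed above can be modelled in plain userspace C. This is only a sketch under stated assumptions: the struct definitions are simplified stand-ins for the kernel types, model_ns_instantiate() and model_netns_get() are hypothetical names, a NULL nsproxy pointer stands in for an exited task, and RCU locking is not modelled at all.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; none of this is kernel code. */
struct net { int count; };
struct nsproxy { struct net *net_ns; };
struct task_struct { struct nsproxy *nsproxy; /* NULL once the task has exited */ };

/* Hypothetical get() signature that takes the nsproxy instead of the task. */
typedef void *(*ns_get_fn)(struct nsproxy *nsproxy);

/* Per-namespace get(): no zombie check needed here any more. */
void *model_netns_get(struct nsproxy *nsproxy)
{
	nsproxy->net_ns->count++;	/* models get_net() */
	return nsproxy->net_ns;
}

/*
 * Caller-side check: look up the nsproxy once, bail out for a zombie,
 * and only then call the namespace-specific get(). Returns NULL where
 * the kernel code would fail with -ESRCH.
 */
void *model_ns_instantiate(struct task_struct *task, ns_get_fn get)
{
	struct nsproxy *nsproxy = task->nsproxy;	/* models task_nsproxy() */

	if (!nsproxy)
		return NULL;
	return get(nsproxy);
}
```

With this shape, a live task yields a referenced namespace and a zombie yields NULL without any get() method being called.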



> +	rcu_read_unlock();
> +	return net;
> +}
> +
> +static void netns_put(void *ns)
> +{
> +	put_net(ns);
> +}
> +
> +static int netns_install(struct nsproxy *nsproxy, void *ns)
> +{
> +	put_net(nsproxy->net_ns);
> +	nsproxy->net_ns = get_net(ns);
> +	return 0;
> +}

This introduces a window where, potentially, nsproxy->net_ns is stale
before it is updated with the namespace which is being attached, no? 
(Same concern applies to other install methods in the patch set).  It
seems possible to oops the kernel in this window by looking up
/proc/$PID/ns/net while $PID is in the midst of setns().


Eric W. Biederman May 11, 2011, 9:34 p.m. UTC | #3
Nathan Lynch <ntl@pobox.com> writes:

> On Fri, 2011-05-06 at 19:24 -0700, Eric W. Biederman wrote:
>> diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
>> index 3f86026..bf7707e 100644
>> --- a/net/core/net_namespace.c
>> +++ b/net/core/net_namespace.c
>> @@ -573,3 +573,34 @@ void unregister_pernet_device(struct pernet_operations *ops)
>>  	mutex_unlock(&net_mutex);
>>  }
>>  EXPORT_SYMBOL_GPL(unregister_pernet_device);
>> +
>> +#ifdef CONFIG_NET_NS
>> +static void *netns_get(struct task_struct *task)
>> +{
>> +	struct net *net;
>> +	rcu_read_lock();
>> +	net = get_net(task->nsproxy->net_ns);
>
> This should use task_nsproxy() and check the result before grabbing the
> net_ns, but I think you fix that in a later patch.
>
> Regardless, it looks as if all the proc_ns_ops->get() implementations
> really just want the nsproxy, so maybe the get() methods should take
> that instead of the task_struct, and proc_ns_instantiate() should do
> something like:
>
> struct nsproxy *nsproxy;
> ...
>
> ei->ns_ops = ns_ops;
> error = -ESRCH;
> rcu_read_lock();
> nsproxy = task_nsproxy(task);
> rcu_read_unlock();
> if (!nsproxy)
> 	goto out;
> ei->ns = ns_ops->get(nsproxy);
>
>
> So then the zombie check is consolidated in one place instead of having
> to do it in every get() method.

For the pid namespace at least I want the task not the nsproxy,
so I can use task_active_pid_namespace().

I admit that is a little asymmetrical with the install, but at
least until the details of getting the pid namespace working in
this context are worked out I don't want to reconsider the
current design.

There is also the user namespace that does not even exist in
nsproxy to consider.  I will worry about that namespace when
it happens.

Ultimately nsproxy is a space/time optimization that not all
namespaces use so forcing it in the design is probably not
what we want.

>> +	rcu_read_unlock();
>> +	return net;
>> +}
>> +
>> +static void netns_put(void *ns)
>> +{
>> +	put_net(ns);
>> +}
>> +
>> +static int netns_install(struct nsproxy *nsproxy, void *ns)
>> +{
>> +	put_net(nsproxy->net_ns);
>> +	nsproxy->net_ns = get_net(ns);
>> +	return 0;
>> +}
>
> This introduces a window where, potentially, nsproxy->net_ns is stale
> before it is updated with the namespace which is being attached, no? 
> (Same concern applies to other install methods in the patch set).  It
> seems possible to oops the kernel in this window by looking up
> /proc/$PID/ns/net while $PID is in the midst of setns().

Except the nsproxy being referred to is a brand new nsproxy, with an
extra reference count on every namespace.  current->nsproxy still
contains the reference counts of the current process.
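
A userspace model of that argument, as a sketch only: the types are simplified stand-ins, all model_* names are hypothetical, and the real kernel code additionally relies on locking and RCU that are not shown. The point it illustrates is that install() drops a reference held by the private copy, so the old namespace stays pinned by the task's current nsproxy until the final switch.

```c
#include <assert.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel types; none of this is kernel code. */
struct net { int count; };
struct nsproxy { struct net *net_ns; };

struct net *model_get_net(struct net *net) { net->count++; return net; }
void model_put_net(struct net *net) { net->count--; }

/* Private copy: takes its own reference on the namespace. */
struct nsproxy *model_copy_nsproxy(const struct nsproxy *old)
{
	struct nsproxy *copy = malloc(sizeof(*copy));

	if (!copy)
		return NULL;
	copy->net_ns = model_get_net(old->net_ns);
	return copy;
}

/* Same shape as netns_install() in the patch, applied to the copy. */
int model_netns_install(struct nsproxy *nsproxy, struct net *ns)
{
	model_put_net(nsproxy->net_ns);
	nsproxy->net_ns = model_get_net(ns);
	return 0;
}

/* Final step: publish the new nsproxy, then release the old one. */
void model_switch_nsproxy(struct nsproxy **current_slot, struct nsproxy *fresh)
{
	struct nsproxy *old = *current_slot;

	*current_slot = fresh;
	model_put_net(old->net_ns);
	free(old);
}
```

Between model_netns_install() and model_switch_nsproxy(), the old namespace's count in this model never drops below the one reference held through the task's current nsproxy.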

Eric

Nathan Lynch May 11, 2011, 9:42 p.m. UTC | #4
On Wed, 2011-05-11 at 14:34 -0700, Eric W. Biederman wrote:
> Nathan Lynch <ntl@pobox.com> writes:
> 
> > On Fri, 2011-05-06 at 19:24 -0700, Eric W. Biederman wrote:
> >> diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
> >> index 3f86026..bf7707e 100644
> >> --- a/net/core/net_namespace.c
> >> +++ b/net/core/net_namespace.c
> >> @@ -573,3 +573,34 @@ void unregister_pernet_device(struct pernet_operations *ops)
> >>  	mutex_unlock(&net_mutex);
> >>  }
> >>  EXPORT_SYMBOL_GPL(unregister_pernet_device);
> >> +
> >> +#ifdef CONFIG_NET_NS
> >> +static void *netns_get(struct task_struct *task)
> >> +{
> >> +	struct net *net;
> >> +	rcu_read_lock();
> >> +	net = get_net(task->nsproxy->net_ns);
> >
> > This should use task_nsproxy() and check the result before grabbing the
> > net_ns, but I think you fix that in a later patch.
> >
> > Regardless, it looks as if all the proc_ns_ops->get() implementations
> > really just want the nsproxy, so maybe the get() methods should take
> > that instead of the task_struct, and proc_ns_instantiate() should do
> > something like:
> >
> > struct nsproxy *nsproxy;
> > ...
> >
> > ei->ns_ops = ns_ops;
> > error = -ESRCH;
> > rcu_read_lock();
> > nsproxy = task_nsproxy(task);
> > rcu_read_unlock();
> > if (!nsproxy)
> > 	goto out;
> > ei->ns = ns_ops->get(nsproxy);
> >
> >
> > So then the zombie check is consolidated in one place instead of having
> > to do it in every get() method.
> 
> For the pid namespace at least I want the task not the nsproxy,
> so I can use task_active_pid_namespace().
> 
> I admit that is a little asymmetrical with the install, but at
> least until the details of getting the pid namespace working in
> this context are worked out I don't want to reconsider the
> current design.
> 
> There is also the user namespace that does not even exist in
> nsproxy to consider.  I will worry about that namespace when
> it happens.
> 
> Ultimately nsproxy is a space/time optimization that not all
> namespaces use so forcing it in the design is probably not
> what we want.

Okay.


> >> +	rcu_read_unlock();
> >> +	return net;
> >> +}
> >> +
> >> +static void netns_put(void *ns)
> >> +{
> >> +	put_net(ns);
> >> +}
> >> +
> >> +static int netns_install(struct nsproxy *nsproxy, void *ns)
> >> +{
> >> +	put_net(nsproxy->net_ns);
> >> +	nsproxy->net_ns = get_net(ns);
> >> +	return 0;
> >> +}
> >
> > This introduces a window where, potentially, nsproxy->net_ns is stale
> > before it is updated with the namespace which is being attached, no? 
> > (Same concern applies to other install methods in the patch set).  It
> > seems possible to oops the kernel in this window by looking up
> > /proc/$PID/ns/net while $PID is in the midst of setns().
> 
> Except the nsproxy being referred to is a brand new nsproxy, with an
> extra reference count on every namespace.  current->nsproxy still
> contains the reference counts of the current process.

Ahh, yeah.  Got it.  Thanks.



Patch

diff --git a/fs/proc/namespaces.c b/fs/proc/namespaces.c
index 6ae9f07..dcbd483 100644
--- a/fs/proc/namespaces.c
+++ b/fs/proc/namespaces.c
@@ -16,6 +16,9 @@ 
 
 
 static const struct proc_ns_operations *ns_entries[] = {
+#ifdef CONFIG_NET_NS
+	&netns_operations,
+#endif
 };
 
 static const struct file_operations ns_file_operations = {
diff --git a/include/linux/proc_fs.h b/include/linux/proc_fs.h
index a6d2c6d..62126ec 100644
--- a/include/linux/proc_fs.h
+++ b/include/linux/proc_fs.h
@@ -265,6 +265,7 @@  struct proc_ns_operations {
 	void (*put)(void *ns);
 	int (*install)(struct nsproxy *nsproxy, void *ns);
 };
+extern const struct proc_ns_operations netns_operations;
 
 union proc_op {
 	int (*proc_get_link)(struct inode *, struct path *);
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index 3f86026..bf7707e 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -573,3 +573,34 @@  void unregister_pernet_device(struct pernet_operations *ops)
 	mutex_unlock(&net_mutex);
 }
 EXPORT_SYMBOL_GPL(unregister_pernet_device);
+
+#ifdef CONFIG_NET_NS
+static void *netns_get(struct task_struct *task)
+{
+	struct net *net;
+	rcu_read_lock();
+	net = get_net(task->nsproxy->net_ns);
+	rcu_read_unlock();
+	return net;
+}
+
+static void netns_put(void *ns)
+{
+	put_net(ns);
+}
+
+static int netns_install(struct nsproxy *nsproxy, void *ns)
+{
+	put_net(nsproxy->net_ns);
+	nsproxy->net_ns = get_net(ns);
+	return 0;
+}
+
+const struct proc_ns_operations netns_operations = {
+	.name		= "net",
+	.type		= CLONE_NEWNET,
+	.get		= netns_get,
+	.put		= netns_put,
+	.install	= netns_install,
+};
+#endif
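
From userspace, the file this patch exposes would typically be consumed by opening /proc/<pid>/ns/net and passing the descriptor to setns(2). The following is a minimal sketch, assuming a glibc that provides the setns() wrapper; join_netns() is a hypothetical helper name, and the call is expected to fail without sufficient privilege (CAP_SYS_ADMIN).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for setns() and CLONE_NEWNET */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <unistd.h>

/*
 * Hypothetical helper: join the network namespace referred to by `path`,
 * for example "/proc/1234/ns/net". Returns 0 on success, or -1 with
 * errno set on failure.
 */
int join_netns(const char *path)
{
	int fd = open(path, O_RDONLY | O_CLOEXEC);
	int ret, saved_errno;

	if (fd < 0)
		return -1;

	/* CLONE_NEWNET makes setns() reject any fd that is not a net namespace. */
	ret = setns(fd, CLONE_NEWNET);
	saved_errno = errno;
	close(fd);		/* the fd is not needed once the call returns */
	errno = saved_errno;
	return ret;
}
```

A caller would invoke join_netns("/proc/1234/ns/net") with the PID of a process already in the target namespace, then create sockets as usual; those sockets belong to the joined namespace.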