Message ID | 1304735101-1824-3-git-send-email-ebiederm@xmission.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Headers | show |
On 05/07/2011 04:24 AM, Eric W. Biederman wrote: > Implementing file descriptors for the network namespace > is simple and straight forward. > > Signed-off-by: Eric W. Biederman<ebiederm@xmission.com> > --- Acked-by: Daniel Lezcano <daniel.lezcano@free.fr> -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, 2011-05-06 at 19:24 -0700, Eric W. Biederman wrote: > diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c > index 3f86026..bf7707e 100644 > --- a/net/core/net_namespace.c > +++ b/net/core/net_namespace.c > @@ -573,3 +573,34 @@ void unregister_pernet_device(struct pernet_operations *ops) > mutex_unlock(&net_mutex); > } > EXPORT_SYMBOL_GPL(unregister_pernet_device); > + > +#ifdef CONFIG_NET_NS > +static void *netns_get(struct task_struct *task) > +{ > + struct net *net; > + rcu_read_lock(); > + net = get_net(task->nsproxy->net_ns); This should use task_nsproxy() and check the result before grabbing the net_ns, but I think you fix that in a later patch. Regardless, it looks as if all the proc_ns_ops->get() implementations really just want the nsproxy, so maybe the get() methods should take that instead of the task_struct, and proc_ns_instantiate() should do something like: struct nsproxy *nsproxy; ... ei->ns_ops = ns_ops; error = -ESRCH; rcu_read_lock(); nsproxy = task_nsproxy(task); rcu_read_unlock(); if (!nsproxy) got out; ei->ns = ns_ops->get(nsproxy); So then the zombie check is consolidated in one place instead of having to do it in every get() method. > + rcu_read_unlock(); > + return net; > +} > + > +static void netns_put(void *ns) > +{ > + put_net(ns); > +} > + > +static int netns_install(struct nsproxy *nsproxy, void *ns) > +{ > + put_net(nsproxy->net_ns); > + nsproxy->net_ns = get_net(ns); > + return 0; > +} This introduces a window where, potentially, nsproxy->net_ns is stale before it is updated with the namespace which is being attached, no? (Same concern applies to other install methods in the patch set). It seems possible to oops the kernel in this window by looking up /proc/$PID/ns/net while $PID is in the midst of setns(). -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Nathan Lynch <ntl@pobox.com> writes: > On Fri, 2011-05-06 at 19:24 -0700, Eric W. Biederman wrote: >> diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c >> index 3f86026..bf7707e 100644 >> --- a/net/core/net_namespace.c >> +++ b/net/core/net_namespace.c >> @@ -573,3 +573,34 @@ void unregister_pernet_device(struct pernet_operations *ops) >> mutex_unlock(&net_mutex); >> } >> EXPORT_SYMBOL_GPL(unregister_pernet_device); >> + >> +#ifdef CONFIG_NET_NS >> +static void *netns_get(struct task_struct *task) >> +{ >> + struct net *net; >> + rcu_read_lock(); >> + net = get_net(task->nsproxy->net_ns); > > This should use task_nsproxy() and check the result before grabbing the > net_ns, but I think you fix that in a later patch. > > Regardless, it looks as if all the proc_ns_ops->get() implementations > really just want the nsproxy, so maybe the get() methods should take > that instead of the task_struct, and proc_ns_instantiate() should do > something like: > > struct nsproxy *nsproxy; > ... > > ei->ns_ops = ns_ops; > error = -ESRCH; > rcu_read_lock(); > nsproxy = task_nsproxy(task); > rcu_read_unlock(); > if (!nsproxy) > got out; > ei->ns = ns_ops->get(nsproxy); > > > So then the zombie check is consolidated in one place instead of having > to do it in every get() method. For the pid namespace at least I want the task not the nsproxy, so I can use task_active_pid_namespace(). I admit that is a little asymmetrical with the install, but at least until the details of getting the pid namespace working in this context are worked out I don't want to reconsider the current design. There is also the user namespace that does not even exist in nsproxy to consider. I will worry about that namespace when it happens. Ultimately nsproxy is an space/time optimization that not all namespaces use so forcing it in the design is probably not what we want. >> + rcu_read_unlock(); >> + return net; >> +} >> + >> +static void netns_put(void *ns) >> +{ >> + put_net(ns); >> +} >> + >> +static int netns_install(struct nsproxy *nsproxy, void *ns) >> +{ >> + put_net(nsproxy->net_ns); >> + nsproxy->net_ns = get_net(ns); >> + return 0; >> +} > > This introduces a window where, potentially, nsproxy->net_ns is stale > before it is updated with the namespace which is being attached, no? > (Same concern applies to other install methods in the patch set). It > seems possible to oops the kernel in this window by looking up > /proc/$PID/ns/net while $PID is in the midst of setns(). Except the nsproxy being referred to is a brand new nsproxy, with an extra reference count on every namespace. current->nsproxy still contains the reference counts of the current process. Eric -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2011-05-11 at 14:34 -0700, Eric W. Biederman wrote: > Nathan Lynch <ntl@pobox.com> writes: > > > On Fri, 2011-05-06 at 19:24 -0700, Eric W. Biederman wrote: > >> diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c > >> index 3f86026..bf7707e 100644 > >> --- a/net/core/net_namespace.c > >> +++ b/net/core/net_namespace.c > >> @@ -573,3 +573,34 @@ void unregister_pernet_device(struct pernet_operations *ops) > >> mutex_unlock(&net_mutex); > >> } > >> EXPORT_SYMBOL_GPL(unregister_pernet_device); > >> + > >> +#ifdef CONFIG_NET_NS > >> +static void *netns_get(struct task_struct *task) > >> +{ > >> + struct net *net; > >> + rcu_read_lock(); > >> + net = get_net(task->nsproxy->net_ns); > > > > This should use task_nsproxy() and check the result before grabbing the > > net_ns, but I think you fix that in a later patch. > > > > Regardless, it looks as if all the proc_ns_ops->get() implementations > > really just want the nsproxy, so maybe the get() methods should take > > that instead of the task_struct, and proc_ns_instantiate() should do > > something like: > > > > struct nsproxy *nsproxy; > > ... > > > > ei->ns_ops = ns_ops; > > error = -ESRCH; > > rcu_read_lock(); > > nsproxy = task_nsproxy(task); > > rcu_read_unlock(); > > if (!nsproxy) > > got out; > > ei->ns = ns_ops->get(nsproxy); > > > > > > So then the zombie check is consolidated in one place instead of having > > to do it in every get() method. > > For the pid namespace at least I want the task not the nsproxy, > so I can use task_active_pid_namespace(). > > I admit that is a little asymmetrical with the install, but at > least until the details of getting the pid namespace working in > this context are worked out I don't want to reconsider the > current design. > > There is also the user namespace that does not even exist in > nsproxy to consider. I will worry about that namespace when > it happens. > > Ultimately nsproxy is an space/time optimization that not all > namespaces use so forcing it in the design is probably not > what we want. Okay. > >> + rcu_read_unlock(); > >> + return net; > >> +} > >> + > >> +static void netns_put(void *ns) > >> +{ > >> + put_net(ns); > >> +} > >> + > >> +static int netns_install(struct nsproxy *nsproxy, void *ns) > >> +{ > >> + put_net(nsproxy->net_ns); > >> + nsproxy->net_ns = get_net(ns); > >> + return 0; > >> +} > > > > This introduces a window where, potentially, nsproxy->net_ns is stale > > before it is updated with the namespace which is being attached, no? > > (Same concern applies to other install methods in the patch set). It > > seems possible to oops the kernel in this window by looking up > > /proc/$PID/ns/net while $PID is in the midst of setns(). > > Except the nsproxy being referred to is a brand new nsproxy, with an > extra reference count on every namespace. current->nsproxy still > contains the reference counts of the current process. Ahh, yeah. Got it. Thanks. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/fs/proc/namespaces.c b/fs/proc/namespaces.c index 6ae9f07..dcbd483 100644 --- a/fs/proc/namespaces.c +++ b/fs/proc/namespaces.c @@ -16,6 +16,9 @@ static const struct proc_ns_operations *ns_entries[] = { +#ifdef CONFIG_NET_NS + &netns_operations, +#endif }; static const struct file_operations ns_file_operations = { diff --git a/include/linux/proc_fs.h b/include/linux/proc_fs.h index a6d2c6d..62126ec 100644 --- a/include/linux/proc_fs.h +++ b/include/linux/proc_fs.h @@ -265,6 +265,7 @@ struct proc_ns_operations { void (*put)(void *ns); int (*install)(struct nsproxy *nsproxy, void *ns); }; +extern const struct proc_ns_operations netns_operations; union proc_op { int (*proc_get_link)(struct inode *, struct path *); diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c index 3f86026..bf7707e 100644 --- a/net/core/net_namespace.c +++ b/net/core/net_namespace.c @@ -573,3 +573,34 @@ void unregister_pernet_device(struct pernet_operations *ops) mutex_unlock(&net_mutex); } EXPORT_SYMBOL_GPL(unregister_pernet_device); + +#ifdef CONFIG_NET_NS +static void *netns_get(struct task_struct *task) +{ + struct net *net; + rcu_read_lock(); + net = get_net(task->nsproxy->net_ns); + rcu_read_unlock(); + return net; +} + +static void netns_put(void *ns) +{ + put_net(ns); +} + +static int netns_install(struct nsproxy *nsproxy, void *ns) +{ + put_net(nsproxy->net_ns); + nsproxy->net_ns = get_net(ns); + return 0; +} + +const struct proc_ns_operations netns_operations = { + .name = "net", + .type = CLONE_NEWNET, + .get = netns_get, + .put = netns_put, + .install = netns_install, +}; +#endif
Implementing file descriptors for the network namespace is simple and straight forward. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> --- fs/proc/namespaces.c | 3 +++ include/linux/proc_fs.h | 1 + net/core/net_namespace.c | 31 +++++++++++++++++++++++++++++++ 3 files changed, 35 insertions(+), 0 deletions(-)