[v2] Root in namespace owns x_tables /proc entries

Message ID 20151114091214.GA486@compaq.slightly-cracked.com
State Changes Requested
Delegated to: Pablo Neira

Commit Message

Philip Whineray Nov. 14, 2015, 9:12 a.m. UTC
Reading these files is impossible in an unprivileged user namespace,
interfering with various firewall tools. For instance, iptables-save
relies on reading /proc/net/ip_tables_names to dump only loaded tables.
---

Please don't apply in current form - it doesn't work. The namespace is
only set up after the /proc entry is created so it keeps the original
owner (an unshare within an unshare can work... mapping root to root).

Since it's in danger of getting quite complicated, would one or more of
the following be acceptable?

- Choose permission in a module parameter

- Allow setting with sysctl e.g. net.netfilter.conf.xtable_proc_perms

- Match permissions of /proc/modules (grsec restricts these so we will
  gain the same policy).

I also worked on a capabilities patch but that made userspace much more
complicated in a namespace than outside one. It would be simpler
to patch programs to read /proc/modules or assume the contents if they
can't read /proc/net/ip_tables_names.
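
A minimal userspace sketch of the "assume the contents" fallback described
above. It is not taken from iptables: the function name list_tables(), the
buffer sizes and the fallback table list are all hypothetical. The only
assumption about the kernel is that the file holds one table name per line.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define MAX_TABLES     16
#define TABLE_NAME_LEN 64

/* Hypothetical fallback list, used only when the file exists but is unreadable. */
static const char *const fallback_tables[] = {
	"filter", "nat", "mangle", "raw", "security"
};

/*
 * Fill names[] from path, one table name per line.
 * On EACCES, fill names[] from the assumed list and set *used_fallback to 1.
 * Any other open failure (e.g. ENOENT) is reported as "no tables".
 * Returns the number of names stored.
 */
static int list_tables(const char *path, char names[][TABLE_NAME_LEN],
		       int max, int *used_fallback)
{
	size_t i;
	int n = 0;
	FILE *fp;

	assert(path != NULL && names != NULL && used_fallback != NULL && max > 0);
	*used_fallback = 0;

	fp = fopen(path, "r");
	if (fp) {
		while (n < max && fgets(names[n], TABLE_NAME_LEN, fp)) {
			names[n][strcspn(names[n], "\n")] = '\0'; /* strip newline */
			if (names[n][0] != '\0')
				n++;
		}
		fclose(fp);
		return n;
	}

	if (errno != EACCES)
		return 0;

	*used_fallback = 1;
	for (i = 0; i < sizeof(fallback_tables) / sizeof(fallback_tables[0]) && n < max; i++)
		snprintf(names[n++], TABLE_NAME_LEN, "%s", fallback_tables[i]);
	return n;
}
```

A caller would pass "/proc/net/ip_tables_names" as path and treat a set
used_fallback flag as "the list is a guess, not what is actually loaded".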

 net/netfilter/x_tables.c | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Jozsef Kadlecsik Nov. 15, 2015, 6:53 p.m. UTC | #1
Hi Philip,

On Sat, 14 Nov 2015, Philip Whineray wrote:

> Since it's in danger of getting quite complicate, would one or more of
> the following be acceptable?
> 
> - Choose permission in a module parameter
> 
> - Allow setting with sysctl e.g. net.netfilter.conf.xtable_proc_perms
> 
> - Match permissions of /proc/modules (grsec restricts these so we will
>   gain the same policy).

In my opinion either one is good and I'd pick the sysctl setting. That way 
the permissions could be changed without reloading the module and 
independently of the permissions of /proc/modules.

Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlecsik.jozsef@wigner.mta.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : Wigner Research Centre for Physics, Hungarian Academy of Sciences
          H-1525 Budapest 114, POB. 49, Hungary
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Pablo Neira Ayuso Nov. 16, 2015, 11:56 a.m. UTC | #2
On Sun, Nov 15, 2015 at 07:53:53PM +0100, Jozsef Kadlecsik wrote:
> Hi Philip,
> 
> On Sat, 14 Nov 2015, Philip Whineray wrote:
> 
> > Since it's in danger of getting quite complicate, would one or more of
> > the following be acceptable?
> > 
> > - Choose permission in a module parameter
> > 
> > - Allow setting with sysctl e.g. net.netfilter.conf.xtable_proc_perms
> > 
> > - Match permissions of /proc/modules (grsec restricts these so we will
> >   gain the same policy).
> 
> In my opinion either one is good and I'd pick the sysctl setting. That way 
> the permissions could be changed without reloading the module and 
> independently of the permissions of /proc/modules.

I'd rather not have a sysctl for this thing.

I suspect it will not take long until someone else follows up with
a similar patch for /proc/net/nf_conntrack.

What is the plan of namespace people for unprivileged namespaces with
non-world readable /proc entries?
Philip Whineray Nov. 16, 2015, 12:57 p.m. UTC | #3
On Mon, Nov 16, 2015 at 12:56:59PM +0100, Pablo Neira Ayuso wrote:
> On Sun, Nov 15, 2015 at 07:53:53PM +0100, Jozsef Kadlecsik wrote:
> > Hi Philip,
> > 
> > On Sat, 14 Nov 2015, Philip Whineray wrote:
> > 
> > > Since it's in danger of getting quite complicate, would one or more of
> > > the following be acceptable?
> > > 
> > > - Choose permission in a module parameter
> > > 
> > > - Allow setting with sysctl e.g. net.netfilter.conf.xtable_proc_perms
> > > 
> > > - Match permissions of /proc/modules (grsec restricts these so we will
> > >   gain the same policy).
> > 
> > In my opinion either one is good and I'd pick the sysctl setting. That way 
> > the permissions could be changed without reloading the module and 
> > independently of the permissions of /proc/modules.
> 
> I'd rather not to have a sysctl for this thing.
> 
> I suspect it will not take long until someone else will follow up with
> a similar patch /proc/net/nf_conntrack.

That may be true. The two are not equivalent though: the nf_conntrack
information is per-namespace (so setting owner to root in the current
namespace would certainly be sensible), whereas the information in
ip_tables_names is global and directly relates to modules loaded.

> What is the plan of namespace people for unprivileged namespaces with
> non-world readable /proc entries?

It may not have come up as an issue: a quick survey on my systems suggests
that pretty much only netfilter creates files which are group readable but
not world readable:

$ sudo find /proc/ -perm /g=r -a \! -perm /o=r | sed \
     's:proc/[0-9]*/:proc/0/:' | sort -u | less
/proc/0/net/ip6_tables_matches
/proc/0/net/ip6_tables_names
/proc/0/net/ip6_tables_targets
/proc/0/net/ip_tables_matches
/proc/0/net/ip_tables_names
/proc/0/net/ip_tables_targets
/proc/0/net/netfilter/nfnetlink_log
/proc/0/net/nf_conntrack
/proc/0/net/nf_conntrack_expect
Eric W. Biederman Nov. 16, 2015, 9:56 p.m. UTC | #4
Philip Whineray <phil@firehol.org> writes:

> Reading these files is impossible in an unprivileged user namespace,
> interfering with various firewall tools. For instance, iptables-save
> relies on reading /proc/net/ip_tables_names to dump only loaded tables.
> ---
>
> Please don't apply in current form - it doesn't work. The namespace is
> only set up after the /proc entry is created so it keeps the original
> owner (an unshare within an unshare can work... mapping root to root).
>
> Since it's in danger of getting quite complicate, would one or more of
> the following be acceptable?
>
> - Choose permission in a module parameter
>
> - Allow setting with sysctl e.g. net.netfilter.conf.xtable_proc_perms
>
> - Match permissions of /proc/modules (grsec restricts these so we will
>   gain the same policy).
>
> I also worked on a capabilities patch but that made userspace much more
> complicated in a namespace than outside one. It would be simpler
> to patch programs to read /proc/modules or assume the contents if they
> can't read /proc/net/ip_tables_names.
>
>  net/netfilter/x_tables.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c
> index 9b42b5e..671654d 100644
> --- a/net/netfilter/x_tables.c
> +++ b/net/netfilter/x_tables.c
> @@ -1227,6 +1227,8 @@ int xt_proto_init(struct net *net, u_int8_t af)
>  #ifdef CONFIG_PROC_FS
>  	char buf[XT_FUNCTION_MAXNAMELEN];
>  	struct proc_dir_entry *proc;
> +	kuid_t root_uid;
> +	kgid_t root_gid;
>  #endif
>  
>  	if (af >= ARRAY_SIZE(xt_prefix))
> @@ -1234,12 +1236,16 @@ int xt_proto_init(struct net *net, u_int8_t af)
>  
>  
>  #ifdef CONFIG_PROC_FS
> +	root_uid = make_kuid(current_user_ns(), 1000);
> +	root_gid = make_kgid(current_user_ns(), 1000);

These lines are wrong.  They should be:

	root_uid = make_kuid(net->user_ns, 0);
	root_gid = make_kgid(net->user_ns, 0);
	if (!uid_valid(root_uid) || !gid_valid(root_gid))
		goto out;

>  	strlcpy(buf, xt_prefix[af], sizeof(buf));
>  	strlcat(buf, FORMAT_TABLES, sizeof(buf));
>  	proc = proc_create_data(buf, 0440, net->proc_net, &xt_table_ops,
>  				(void *)(unsigned long)af);
>  	if (!proc)
>  		goto out;
> +	proc_set_user(proc, root_uid, root_gid);
>  
>  	strlcpy(buf, xt_prefix[af], sizeof(buf));
>  	strlcat(buf, FORMAT_MATCHES, sizeof(buf));
Eric W. Biederman Nov. 16, 2015, 10:03 p.m. UTC | #5
Pablo Neira Ayuso <pablo@netfilter.org> writes:

> On Sun, Nov 15, 2015 at 07:53:53PM +0100, Jozsef Kadlecsik wrote:
>> Hi Philip,
>> 
>> On Sat, 14 Nov 2015, Philip Whineray wrote:
>> 
>> > Since it's in danger of getting quite complicate, would one or more of
>> > the following be acceptable?
>> > 
>> > - Choose permission in a module parameter
>> > 
>> > - Allow setting with sysctl e.g. net.netfilter.conf.xtable_proc_perms
>> > 
>> > - Match permissions of /proc/modules (grsec restricts these so we will
>> >   gain the same policy).
>> 
>> In my opinion either one is good and I'd pick the sysctl setting. That way 
>> the permissions could be changed without reloading the module and 
>> independently of the permissions of /proc/modules.
>
> I'd rather not to have a sysctl for this thing.
>
> I suspect it will not take long until someone else will follow up with
> a similar patch /proc/net/nf_conntrack.
>
> What is the plan of namespace people for unprivileged namespaces with
> non-world readable /proc entries?

If it makes sense to have them per namespace and allow manipulation by
the user namespace root, the plan is to make the files owned by the
root user in the user namespace that created the network namespace.

That preserves all of the existing logic.

If we don't trust the user namespace root to read/write the file we
probably want to just avoid creating the file altogether if
(net->user_ns != &init_user_ns).

Eric
Philip Whineray Nov. 18, 2015, 7:37 a.m. UTC | #6
On Mon, Nov 16, 2015 at 03:56:13PM -0600, Eric W. Biederman wrote:
> Philip Whineray <phil@firehol.org> writes:
> 
> > Reading these files is impossible in an unprivileged user namespace,
> > interfering with various firewall tools. For instance, iptables-save
> > relies on reading /proc/net/ip_tables_names to dump only loaded tables.
> 
> These lines are wrong.  They should be:
> 
>         root_uid = make_kuid(net->user_ns, 0);
>         root_gid = make_kgid(net->user_ns, 0);
>         if (!uid_valid(root_uid) || !gid_valid(root_gid))
>         	goto out;
> 
> >  	strlcpy(buf, xt_prefix[af], sizeof(buf));
> >  	strlcat(buf, FORMAT_TABLES, sizeof(buf));
> >  	proc = proc_create_data(buf, 0440, net->proc_net, &xt_table_ops,
> >  				(void *)(unsigned long)af);
> >  	if (!proc)
> >  		goto out;
> > +	proc_set_user(proc, root_uid, root_gid);

Thanks for the pointer Eric. As written it doesn't quite work because
out is an error path. unshare(CLONE_NEWUSER|CLONE_NEWNET) always fails
due to there not being a mapping for the user yet. Instead:

       root_uid = make_kuid(net->user_ns, 0);
       root_gid = make_kgid(net->user_ns, 0);

followed by:

       if (!proc)
              goto out;
       if (uid_valid(root_uid) && gid_valid(root_gid))
              proc_set_user(proc, root_uid, root_gid);

would preserve the current behaviour but allow the files to be
correctly mapped by first unsharing the user namespace, then setting
the uid and gid maps, and finally unsharing the network namespace.
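
To make the intended behaviour concrete, here is a small userspace model of
it. This is not kernel code: struct id_map, model_make_kid() and
model_set_owner_if_mapped() are hypothetical stand-ins for the namespace
ID map, make_kuid()/make_kgid() and the conditional proc_set_user() call.
It only assumes that an unmapped ID translates to an invalid value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define INVALID_ID ((uint32_t)-1)

/* One mapping extent: IDs [first, first + count) inside the namespace
 * correspond to [lower_first, lower_first + count) outside it. */
struct id_extent { uint32_t first, lower_first, count; };
struct id_map    { size_t nr; struct id_extent ext[4]; };

/* Model of make_kuid()/make_kgid(): translate a namespace ID, or
 * return INVALID_ID when no extent covers it. */
static uint32_t model_make_kid(const struct id_map *map, uint32_t id)
{
	size_t i;

	assert(map != NULL);
	for (i = 0; i < map->nr; i++) {
		const struct id_extent *e = &map->ext[i];

		if (id >= e->first && id - e->first < e->count)
			return e->lower_first + (id - e->first);
	}
	return INVALID_ID;
}

struct model_entry { uint32_t uid, gid; };

/* Change the owner only if namespace root (ID 0) is mapped for both
 * uid and gid; otherwise leave the entry as it was created. */
static bool model_set_owner_if_mapped(struct model_entry *entry,
				      const struct id_map *uid_map,
				      const struct id_map *gid_map)
{
	uint32_t uid = model_make_kid(uid_map, 0);
	uint32_t gid = model_make_kid(gid_map, 0);

	if (uid == INVALID_ID || gid == INVALID_ID)
		return false;
	entry->uid = uid;
	entry->gid = gid;
	return true;
}
```

With an empty map the entry keeps its original owner, which corresponds
to today's behaviour; with root mapped to, say, 1000 outside the
namespace, the entry ends up owned by 1000.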

Or, is it sane to bypass all the above and jump straight to:

        proc_set_user(proc, net->user_ns->owner, net->user_ns->group);

Cheers
Phil
Eric W. Biederman Nov. 18, 2015, 9:13 a.m. UTC | #7
Phil Whineray <phil@firehol.org> writes:

> On Mon, Nov 16, 2015 at 03:56:13PM -0600, Eric W. Biederman wrote:
>> Philip Whineray <phil@firehol.org> writes:
>> 
>> > Reading these files is impossible in an unprivileged user namespace,
>> > interfering with various firewall tools. For instance, iptables-save
>> > relies on reading /proc/net/ip_tables_names to dump only loaded tables.
>> 
>> These lines are wrong.  They should be:
>> 
>>         root_uid = make_kuid(net->user_ns, 0);
>>         root_gid = make_kgid(net->user_ns, 0);
>>         if (!uid_valid(root_uid) || !gid_valid(root_gid))
>>         	goto out;
>> 
>> >  	strlcpy(buf, xt_prefix[af], sizeof(buf));
>> >  	strlcat(buf, FORMAT_TABLES, sizeof(buf));
>> >  	proc = proc_create_data(buf, 0440, net->proc_net, &xt_table_ops,
>> >  				(void *)(unsigned long)af);
>> >  	if (!proc)
>> >  		goto out;
>> > +	proc_set_user(proc, root_uid, root_gid);
>
> Thanks for the pointer Eric. As written it doesn't quite work because
> out is an error path. unshare(CLONE_NEWUSER|CLONE_NEWNET) always fails
> due to there not being a mapping for the user yet. Instead:

Point.

Although CLONE_NEWUSER|CLONE_NEWNET are not required to happen
simultaneously.

So you can make what I was suggesting work with:
unshare(CLONE_NEWUSER);
/* Setup the mapping */
unshare(CLONE_NEWNET);

The simplest version of this I can think of is to just not change the
user if the mapping is not set up at the time the proc files are created.

Certainly that would not be worse than what we have today.
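
A fuller userspace sketch of that three-step sequence, for illustration
only. The helper names are invented here; the raw syscall(SYS_unshare, ...)
form is used so the sketch does not depend on feature-test macros, and the
write to /proc/self/setgroups is an assumption about what an unprivileged
caller needs on current kernels before it may write gid_map.

```c
#include <assert.h>
#include <fcntl.h>
#include <linux/sched.h>   /* CLONE_NEWUSER, CLONE_NEWNET */
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Build a one-line map "0 <outside_id> 1": ID 0 inside the namespace
 * maps to outside_id outside it. Returns the length, or -1 if buf is
 * too small. */
static int format_root_map(char *buf, size_t len, unsigned int outside_id)
{
	int n;

	assert(buf != NULL && len > 0);
	n = snprintf(buf, len, "0 %u 1\n", outside_id);
	return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/* Write text to an existing file in a single write(). */
static int write_file(const char *path, const char *text)
{
	ssize_t n;
	int fd = open(path, O_WRONLY);

	if (fd < 0)
		return -1;
	n = write(fd, text, strlen(text));
	close(fd);
	return n == (ssize_t)strlen(text) ? 0 : -1;
}

/* unshare(CLONE_NEWUSER); set up the mapping; unshare(CLONE_NEWNET). */
static int unshare_user_then_net(void)
{
	char map[32];
	/* Read the IDs first: they are unmapped right after the unshare. */
	unsigned int uid = geteuid();
	unsigned int gid = getegid();

	if (syscall(SYS_unshare, CLONE_NEWUSER) != 0)
		return -1;
	if (write_file("/proc/self/setgroups", "deny") != 0)
		return -1;
	if (format_root_map(map, sizeof(map), uid) < 0 ||
	    write_file("/proc/self/uid_map", map) != 0)
		return -1;
	if (format_root_map(map, sizeof(map), gid) < 0 ||
	    write_file("/proc/self/gid_map", map) != 0)
		return -1;
	/* The network namespace is created only now, with root mapped. */
	return syscall(SYS_unshare, CLONE_NEWNET) == 0 ? 0 : -1;
}
```

After unshare_user_then_net() returns 0, the caller is root inside both
namespaces, and the mapping already exists when the per-netns proc files
are created.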

>        root_uid = make_kuid(net->user_ns, 0);
>        root_gid = make_kgid(net->user_ns, 0);
>
> followed by:
>
>        if (!proc)
>               goto out;
>        if (uid_valid(root_uid) && gid_valid(root_gid))
>               proc_set_user(proc, root_uid, root_gid);
>
> would preserve the current behaviour but allow the files to be
> correctly mapped by first unsharing the user namespace, then setting
> the gid map and finally unsharing the namespace.

I am not quite certain what you are suggesting above.  It looks and
sounds like my suggestion to only call proc_set_user if a mapping exists.
Which is fine.

> Or, is it sane to bypass all the above and jump straight to:
>
>         proc_set_user(proc, net->user_ns->owner, net->user_ns->group);

It is not.  There are many reasons why user_ns->owner and
user_ns->group are typically not mapped in a user namespace.

Eric

Philip Whineray Nov. 18, 2015, 6:39 p.m. UTC | #8
On Wed, Nov 18, 2015 at 03:13:51AM -0600, Eric W. Biederman wrote:
> Phil Whineray <phil@firehol.org> writes:
> 
> > On Mon, Nov 16, 2015 at 03:56:13PM -0600, Eric W. Biederman wrote:
> >> Philip Whineray <phil@firehol.org> writes:
> >> 
> >> > Reading these files is impossible in an unprivileged user namespace,
> >> > interfering with various firewall tools. For instance, iptables-save
> >> > relies on reading /proc/net/ip_tables_names to dump only loaded tables.
> >> 
> >> These lines are wrong.  They should be:
> >> 
> >>         root_uid = make_kuid(net->user_ns, 0);
> >>         root_gid = make_kgid(net->user_ns, 0);
> >>         if (!uid_valid(root_uid) || !gid_valid(root_gid))
> >>         	goto out;
> >> 
> >> >  	strlcpy(buf, xt_prefix[af], sizeof(buf));
> >> >  	strlcat(buf, FORMAT_TABLES, sizeof(buf));
> >> >  	proc = proc_create_data(buf, 0440, net->proc_net, &xt_table_ops,
> >> >  				(void *)(unsigned long)af);
> >> >  	if (!proc)
> >> >  		goto out;
> >> > +	proc_set_user(proc, root_uid, root_gid);
> >
> > Thanks for the pointer Eric. As written it doesn't quite work because
> > out is an error path. unshare(CLONE_NEWUSER|CLONE_NEWNET) always fails
> > due to there not being a mapping for the user yet. Instead:
> 
> Point.
> 
> Although CLONE_NEWUSER|CLONE_NEWNET are not required to happen
> simultaneously.
> 
> So you can make what I was suggesting work with:
> unshare(CLONE_NEWUSER);
> /* Setup the mapping */
> unshare(CLONE_NEWNET);
> 
> The simplest version of this I can think of is to just not change the
> user if the mapping is not setup at the time the proc files are created.
> 
> Certainly that would not be worse than what we have today.
> 
> >        root_uid = make_kuid(net->user_ns, 0);
> >        root_gid = make_kgid(net->user_ns, 0);
> >
> > followed by:
> >
> >        if (!proc)
> >               goto out;
> >        if (uid_valid(root_uid) && gid_valid(root_gid))
> >               proc_set_user(proc, root_uid, root_gid);
> >
> > would preserve the current behaviour but allow the files to be
> > correctly mapped by first unsharing the user namespace, then setting
> > the gid map and finally unsharing the namespace.
> 
> I am not quite certain what you are suggesting above.  It looks and
> sounds like my suggest to only call proc_set_user if a mapping exists.
> Which is fine. 

I was indeed trying to get at precisely the same as your suggestion.
Thanks for the much more succinct description.

Patch in due course...

Phil
Patch

diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c
index 9b42b5e..671654d 100644
--- a/net/netfilter/x_tables.c
+++ b/net/netfilter/x_tables.c
@@ -1227,6 +1227,8 @@  int xt_proto_init(struct net *net, u_int8_t af)
 #ifdef CONFIG_PROC_FS
 	char buf[XT_FUNCTION_MAXNAMELEN];
 	struct proc_dir_entry *proc;
+	kuid_t root_uid;
+	kgid_t root_gid;
 #endif
 
 	if (af >= ARRAY_SIZE(xt_prefix))
@@ -1234,12 +1236,16 @@  int xt_proto_init(struct net *net, u_int8_t af)
 
 
 #ifdef CONFIG_PROC_FS
+	root_uid = make_kuid(current_user_ns(), 1000);
+	root_gid = make_kgid(current_user_ns(), 1000);
+
 	strlcpy(buf, xt_prefix[af], sizeof(buf));
 	strlcat(buf, FORMAT_TABLES, sizeof(buf));
 	proc = proc_create_data(buf, 0440, net->proc_net, &xt_table_ops,
 				(void *)(unsigned long)af);
 	if (!proc)
 		goto out;
+	proc_set_user(proc, root_uid, root_gid);
 
 	strlcpy(buf, xt_prefix[af], sizeof(buf));
 	strlcat(buf, FORMAT_MATCHES, sizeof(buf));