Message ID | 1316734189-26668-1-git-send-email-zenczykowski@gmail.com |
---|---|
State | Superseded, archived |
Delegated to: | David Miller |
Quoting Maciej Żenczykowski (zenczykowski@gmail.com):
> From: Maciej Żenczykowski <maze@google.com>
>
> Up till now the IP{,V6}_TRANSPARENT socket options (which actually set
> the same bit in the socket struct) have required CAP_NET_ADMIN
> privileges to set or clear the option.
>
> - we make clearing the bit not require any privileges.
> - we deprecate using CAP_NET_ADMIN for this purpose.
> - we introduce a new capability CAP_NET_TRANSPARENT,
>   which is tailored to allow setting just this bit.
> - we allow either one of CAP_NET_TRANSPARENT or CAP_NET_RAW
>   to set this bit, because raw sockets already effectively
>   allow you to emulate socket transparency, and make the
>   transition easier for apps not desiring to use a brand
>   new capability (because of header file or glibc support)
> - we print a warning (but allow it) if you try to set
>   the socket option with CAP_NET_ADMIN privs, but without
>   either one of CAP_NET_TRANSPARENT or CAP_NET_RAW.
>
> The reason for introducing a new capability is that while
> transparent sockets are potentially dangerous (and can let you
> spoof your source IP on traffic), they don't normally give you
> the full 'freedom' of eavesdropping and/or spoofing that raw sockets
> give you.
>
> Signed-off-by: Maciej Żenczykowski <maze@google.com>
> Acked-by: Balazs Scheidler <bazsi@balabit.hu>
> Acked-by: David Miller <davem@redhat.com>

Looks good to me. Please do make sure to also send the required patch
for libcap2.

Should the comments in capability.h reference each other to make
clear that it's not a mistake, either one offers the privilege? I know
it's clear from the comment in the code itself, but something like

> +/*
> + * Allow binding to any address for transparent proxying - either
> + * this or CAP_NET_TRANSPARENT can be used
> + */

In any case,

Acked-by: Serge Hallyn <serge.hallyn@canonical.com>

thanks,
-serge

> ---
>  include/linux/capability.h | 13 +++++++++----
>  net/ipv4/ip_sockglue.c | 26 ++++++++++++++++++++++----
>  net/ipv6/ipv6_sockglue.c | 29 ++++++++++++++++++++++++-----
>  3 files changed, 55 insertions(+), 13 deletions(-)
>
> diff --git a/include/linux/capability.h b/include/linux/capability.h
> index c421123..a115ed4 100644
> --- a/include/linux/capability.h
> +++ b/include/linux/capability.h
> @@ -198,7 +198,7 @@ struct cpu_vfs_cap_data {
>  /* Allow modification of routing tables */
>  /* Allow setting arbitrary process / process group ownership on
>     sockets */
> -/* Allow binding to any address for transparent proxying */
> +/* Allow binding to any address for transparent proxying (deprecated) */
>  /* Allow setting TOS (type of service) */
>  /* Allow setting promiscuous mode */
>  /* Allow clearing driver statistics */
> @@ -210,6 +210,7 @@ struct cpu_vfs_cap_data {
>
>  /* Allow use of RAW sockets */
>  /* Allow use of PACKET sockets */
> +/* Allow binding to any address for transparent proxying */
>
>  #define CAP_NET_RAW 13
>
> @@ -332,7 +333,7 @@ struct cpu_vfs_cap_data {
>
>  #define CAP_AUDIT_CONTROL 30
>
> -#define CAP_SETFCAP 31
> +#define CAP_SETFCAP 31
>
>  /* Override MAC access.
>     The base kernel enforces no MAC policy.
> @@ -357,10 +358,14 @@ struct cpu_vfs_cap_data {
>
>  /* Allow triggering something that will wake the system */
>
> -#define CAP_WAKE_ALARM 35
> +#define CAP_WAKE_ALARM 35
> +
> +/* Allow binding to any address for transparent proxying */
> +
> +#define CAP_NET_TRANSPARENT 36
>
>
> -#define CAP_LAST_CAP CAP_WAKE_ALARM
> +#define CAP_LAST_CAP CAP_NET_TRANSPARENT
>
>  #define cap_valid(x) ((x) >= 0 && (x) <= CAP_LAST_CAP)
>
> diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
> index 8905e92..44efa39 100644
> --- a/net/ipv4/ip_sockglue.c
> +++ b/net/ipv4/ip_sockglue.c
> @@ -961,12 +961,30 @@ mc_msf_out:
>  		break;
>
>  	case IP_TRANSPARENT:
> -		if (!capable(CAP_NET_ADMIN)) {
> -			err = -EPERM;
> -			break;
> -		}
>  		if (optlen < 1)
>  			goto e_inval;
> +		/* Always allow clearing the transparent proxy socket option.
> +		 * The pre-3.2 permission for setting this was CAP_NET_ADMIN,
> +		 * and this is still supported - but deprecated. As of Linux
> +		 * 3.2 the proper permission is one of CAP_NET_TRANSPARENT
> +		 * (preferred, a new capability) or CAP_NET_RAW. The latter
> +		 * is supported to make the transition easier (and because
> +		 * raw sockets already effectively allow one to emulate
> +		 * socket transparency).
> +		 */
> +		if (!!val && !capable(CAP_NET_TRANSPARENT)
> +			  && !capable(CAP_NET_RAW)) {
> +			if (!capable(CAP_NET_ADMIN)) {
> +				err = -EPERM;
> +				break;
> +			}
> +			printk_once(KERN_WARNING "%s (%d): "
> +				    "deprecated: attempt to set socket option "
> +				    "IP_TRANSPARENT with CAP_NET_ADMIN but "
> +				    "without either one of CAP_NET_TRANSPARENT "
> +				    "or CAP_NET_RAW.\n",
> +				    current->comm, task_pid_nr(current));
> +		}
>  		inet->transparent = !!val;
>  		break;
>
> diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
> index 2fbda5f..b8315c8 100644
> --- a/net/ipv6/ipv6_sockglue.c
> +++ b/net/ipv6/ipv6_sockglue.c
> @@ -343,13 +343,32 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
>  		break;
>
>  	case IPV6_TRANSPARENT:
> -		if (!capable(CAP_NET_ADMIN)) {
> -			retv = -EPERM;
> -			break;
> -		}
>  		if (optlen < sizeof(int))
>  			goto e_inval;
> -		/* we don't have a separate transparent bit for IPV6 we use the one in the IPv4 socket */
> +		/* Always allow clearing the transparent proxy socket option.
> +		 * The pre-3.2 permission for setting this was CAP_NET_ADMIN,
> +		 * and this is still supported - but deprecated. As of Linux
> +		 * 3.2 the proper permission is one of CAP_NET_TRANSPARENT
> +		 * (preferred, a new capability) or CAP_NET_RAW. The latter
> +		 * is supported to make the transition easier (and because
> +		 * raw sockets already effectively allow one to emulate
> +		 * socket transparency).
> +		 */
> +		if (valbool && !capable(CAP_NET_TRANSPARENT)
> +			    && !capable(CAP_NET_RAW)) {
> +			if (!capable(CAP_NET_ADMIN)) {
> +				retv = -EPERM;
> +				break;
> +			}
> +			printk_once(KERN_WARNING "%s (%d): "
> +				    "deprecated: attempt to set socket option "
> +				    "IPV6_TRANSPARENT with CAP_NET_ADMIN but "
> +				    "without either one of CAP_NET_TRANSPARENT "
> +				    "or CAP_NET_RAW.\n",
> +				    current->comm, task_pid_nr(current));
> +		}
> +		/* we don't have a separate transparent bit for IPV6 we use the
> +		 * one in the IPv4 socket */
>  		inet_sk(sk)->transparent = valbool;
>  		retv = 0;
>  		break;
> --
> 1.7.3.1
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

On 9/22/2011 4:29 PM, Maciej Żenczykowski wrote:
> From: Maciej Żenczykowski <maze@google.com>
>
> Up till now the IP{,V6}_TRANSPARENT socket options (which actually set
> the same bit in the socket struct) have required CAP_NET_ADMIN
> privileges to set or clear the option.
>
> - we make clearing the bit not require any privileges.
> - we deprecate using CAP_NET_ADMIN for this purpose.
> - we introduce a new capability CAP_NET_TRANSPARENT,
>   which is tailored to allow setting just this bit.

Under what circumstances would a process that requires the
new capability not require CAP_NET_ADMIN? Is there a real
case where a process would be expected to require only this
new capability? Adding new capability values is somewhat
perilous and the granularity you are proposing, that of
controlling a single bit, would explode the list of
capabilities into the hundreds if it were applied throughout
the kernel.

> - we allow either one of CAP_NET_TRANSPARENT or CAP_NET_RAW
>   to set this bit, because raw sockets already effectively
>   allow you to emulate socket transparency, and make the
>   transition easier for apps not desiring to use a brand
>   new capability (because of header file or glibc support)
> - we print a warning (but allow it) if you try to set
>   the socket option with CAP_NET_ADMIN privs, but without
>   either one of CAP_NET_TRANSPARENT or CAP_NET_RAW.
>
> The reason for introducing a new capability is that while
> transparent sockets are potentially dangerous (and can let you
> spoof your source IP on traffic), they don't normally give you
> the full 'freedom' of eavesdropping and/or spoofing that raw sockets
> give you.

> Under what circumstances would a process that requires the
> new capability not require CAP_NET_ADMIN? Is there a real
> case where a process would be expected to require only this
> new capability? Adding new capability values is somewhat
> perilous and the granularity you are proposing, that of
> controlling a single bit, would explode the list of
> capabilities into the hundreds if it were applied throughout
> the kernel.

CAP_NET_ADMIN is a huge hammer: it allows one to totally
reconfigure the networking subsystem.

In a containerized multi-user/job environment, you do not want
something like an instance of a load-balanced web server, proxy
or DNS server being able to do that - policy/configuration decisions
should be left up to the administrator and/or machine management
daemon(s). Each of these can make use of transparent sockets
(in various ways, mostly in coordination with large scale load balancing).

You also do not want one user running in one container being able
to sniff (CAP_NET_RAW) traffic from another user (hence CAP_NET_RAW
isn't an acceptable substitute).

One could conceivably use network namespaces for separation, but
in this particular case they are _way_ overkill (and also add too
much overhead).

This might be *just* a single bit in the socket, but this bit effectively
controls whether you can do certain types of privileged operations
on the socket in question - and it gets tested in various places throughout
the networking stack.

Hopefully, this answers your question.

- Maciej
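
For context, a userspace program enables the option under discussion
with setsockopt(). The following is a minimal sketch, not code from this
thread: set_ip_transparent is a hypothetical helper name, and the
fallback value 19 for IP_TRANSPARENT is taken from <linux/in.h> for the
case where the libc headers do not define it. Which capability the call
requires depends on the kernel in use.

```c
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>

#ifndef IP_TRANSPARENT
#define IP_TRANSPARENT 19	/* value from <linux/in.h>; fallback for older libc headers */
#endif

/*
 * Hypothetical helper: set (on != 0) or clear (on == 0) IP_TRANSPARENT
 * on an IPv4 socket.
 * Returns 0 on success or a negative errno value; -EPERM means the
 * caller lacks whichever capability the running kernel requires.
 */
int set_ip_transparent(int fd, int on)
{
	int val = on ? 1 : 0;

	if (setsockopt(fd, IPPROTO_IP, IP_TRANSPARENT, &val, sizeof(val)) < 0)
		return -errno;
	return 0;
}
```

A transparent proxy would typically call set_ip_transparent(fd, 1)
before bind()ing to a non-local address, and treat -EPERM as a missing
capability rather than a fatal socket error.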

On 9/23/2011 12:33 PM, Maciej Żenczykowski wrote:
>> Under what circumstances would a process that requires the
>> new capability not require CAP_NET_ADMIN? Is there a real
>> case where a process would be expected to require only this
>> new capability? Adding new capability values is somewhat
>> perilous and the granularity you are proposing, that of
>> controlling a single bit, would explode the list of
>> capabilities into the hundreds if it were applied throughout
>> the kernel.
> CAP_NET_ADMIN is a huge hammer: it allows one to totally
> reconfigure the networking subsystem.
>
> In a containerized multi-user/job environment, you do not want
> something like an instance of a load-balanced web server, proxy
> or DNS server being able to do that - policy/configuration decisions
> should be left up to the administrator and/or machine management
> daemon(s). Each of these can make use of transparent sockets
> (in various ways, mostly in coordination with large scale load balancing).
>
> You also do not want one user running in one container being able
> to sniff (CAP_NET_RAW) traffic from another user (hence CAP_NET_RAW
> isn't an acceptable substitute).
>
> One could conceivably use network namespaces for separation, but
> in this particular case they are _way_ overkill (and also add too
> much overhead).
>
> This might be *just* a single bit in the socket, but this bit effectively
> controls whether you can do certain types of privileged operations
> on the socket in question - and it gets tested in various places throughout
> the networking stack.
>
> Hopefully, this answers your question.

It is an important argument, but no, it does not address my issue.
The problem is that you can make that same argument for breaking
up just about every capability. CAP_SYS_ADMIN could easily be broken
into a hundred separate capabilities and CAP_NET_ADMIN, as you point
out, into dozens.

My point is that with the potential to create so many capabilities,
what makes this particular action so much more important than the
other things already covered by CAP_NET_ADMIN?

If we introduce dozens of new capabilities we will end up with
the exact same problem that has been so clearly demonstrated by the
SELinux reference policy. Excessive granularity will kill any
security facility. Capabilities are already more granular than most
people are comfortable with.

For a facility to warrant a new capability it must be completely
unreasonable to fit it into an existing capability, sufficiently
general in its use that the bit won't show up unused when networking
fashions change in a year or two, and it needs to protect something
that is obviously very important. You have a variant of a somewhat
obscure facility that will only be used in edge cases in support of
an unproven (if promising) technology that is targeted to a
disappearing use case.

>
> - Maciej

diff --git a/include/linux/capability.h b/include/linux/capability.h
index c421123..a115ed4 100644
--- a/include/linux/capability.h
+++ b/include/linux/capability.h
@@ -198,7 +198,7 @@ struct cpu_vfs_cap_data {
 /* Allow modification of routing tables */
 /* Allow setting arbitrary process / process group ownership on
    sockets */
-/* Allow binding to any address for transparent proxying */
+/* Allow binding to any address for transparent proxying (deprecated) */
 /* Allow setting TOS (type of service) */
 /* Allow setting promiscuous mode */
 /* Allow clearing driver statistics */
@@ -210,6 +210,7 @@ struct cpu_vfs_cap_data {
 
 /* Allow use of RAW sockets */
 /* Allow use of PACKET sockets */
+/* Allow binding to any address for transparent proxying */
 
 #define CAP_NET_RAW 13
 
@@ -332,7 +333,7 @@ struct cpu_vfs_cap_data {
 
 #define CAP_AUDIT_CONTROL 30
 
-#define CAP_SETFCAP 31
+#define CAP_SETFCAP 31
 
 /* Override MAC access.
    The base kernel enforces no MAC policy.
@@ -357,10 +358,14 @@ struct cpu_vfs_cap_data {
 
 /* Allow triggering something that will wake the system */
 
-#define CAP_WAKE_ALARM 35
+#define CAP_WAKE_ALARM 35
+
+/* Allow binding to any address for transparent proxying */
+
+#define CAP_NET_TRANSPARENT 36
 
 
-#define CAP_LAST_CAP CAP_WAKE_ALARM
+#define CAP_LAST_CAP CAP_NET_TRANSPARENT
 
 #define cap_valid(x) ((x) >= 0 && (x) <= CAP_LAST_CAP)
 
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index 8905e92..44efa39 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -961,12 +961,30 @@ mc_msf_out:
 		break;
 
 	case IP_TRANSPARENT:
-		if (!capable(CAP_NET_ADMIN)) {
-			err = -EPERM;
-			break;
-		}
 		if (optlen < 1)
 			goto e_inval;
+		/* Always allow clearing the transparent proxy socket option.
+		 * The pre-3.2 permission for setting this was CAP_NET_ADMIN,
+		 * and this is still supported - but deprecated. As of Linux
+		 * 3.2 the proper permission is one of CAP_NET_TRANSPARENT
+		 * (preferred, a new capability) or CAP_NET_RAW. The latter
+		 * is supported to make the transition easier (and because
+		 * raw sockets already effectively allow one to emulate
+		 * socket transparency).
+		 */
+		if (!!val && !capable(CAP_NET_TRANSPARENT)
+			  && !capable(CAP_NET_RAW)) {
+			if (!capable(CAP_NET_ADMIN)) {
+				err = -EPERM;
+				break;
+			}
+			printk_once(KERN_WARNING "%s (%d): "
+				    "deprecated: attempt to set socket option "
+				    "IP_TRANSPARENT with CAP_NET_ADMIN but "
+				    "without either one of CAP_NET_TRANSPARENT "
+				    "or CAP_NET_RAW.\n",
+				    current->comm, task_pid_nr(current));
+		}
 		inet->transparent = !!val;
 		break;
 
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 2fbda5f..b8315c8 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -343,13 +343,32 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 		break;
 
 	case IPV6_TRANSPARENT:
-		if (!capable(CAP_NET_ADMIN)) {
-			retv = -EPERM;
-			break;
-		}
 		if (optlen < sizeof(int))
 			goto e_inval;
-		/* we don't have a separate transparent bit for IPV6 we use the one in the IPv4 socket */
+		/* Always allow clearing the transparent proxy socket option.
+		 * The pre-3.2 permission for setting this was CAP_NET_ADMIN,
+		 * and this is still supported - but deprecated. As of Linux
+		 * 3.2 the proper permission is one of CAP_NET_TRANSPARENT
+		 * (preferred, a new capability) or CAP_NET_RAW. The latter
+		 * is supported to make the transition easier (and because
+		 * raw sockets already effectively allow one to emulate
+		 * socket transparency).
+		 */
+		if (valbool && !capable(CAP_NET_TRANSPARENT)
+			    && !capable(CAP_NET_RAW)) {
+			if (!capable(CAP_NET_ADMIN)) {
+				retv = -EPERM;
+				break;
+			}
+			printk_once(KERN_WARNING "%s (%d): "
+				    "deprecated: attempt to set socket option "
+				    "IPV6_TRANSPARENT with CAP_NET_ADMIN but "
+				    "without either one of CAP_NET_TRANSPARENT "
+				    "or CAP_NET_RAW.\n",
+				    current->comm, task_pid_nr(current));
+		}
+		/* we don't have a separate transparent bit for IPV6 we use the
+		 * one in the IPv4 socket */
 		inet_sk(sk)->transparent = valbool;
 		retv = 0;
 		break;
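
The capability check that both setsockopt hunks add can be summarized
by a small userspace model. This is only a sketch of the logic in the
diff, not kernel code: struct caps and transparent_perm_check are
hypothetical names, plain booleans stand in for the kernel's capable()
calls, and the warn flag marks the spot where the patch calls
printk_once().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the results of the kernel's capable() calls. */
struct caps {
	bool net_transparent;	/* proposed CAP_NET_TRANSPARENT */
	bool net_raw;		/* CAP_NET_RAW */
	bool net_admin;		/* CAP_NET_ADMIN (deprecated for this use) */
};

/*
 * Model of the check added to the IP_TRANSPARENT and IPV6_TRANSPARENT
 * cases. Returns 0 if the option may be changed, -EPERM otherwise.
 * *warn is set when only the deprecated CAP_NET_ADMIN path applies.
 */
int transparent_perm_check(int val, const struct caps *c, bool *warn)
{
	assert(c != NULL && warn != NULL);
	*warn = false;

	/* Clearing the bit never needs a capability. */
	if (!val)
		return 0;

	/* Either the new capability or CAP_NET_RAW is sufficient. */
	if (c->net_transparent || c->net_raw)
		return 0;

	/* Neither is held: fall back to the old CAP_NET_ADMIN rule. */
	if (!c->net_admin)
		return -EPERM;

	*warn = true;	/* allowed, but the patch logs a deprecation warning */
	return 0;
}
```

The ordering mirrors the diff: the CAP_NET_ADMIN test is only reached
when neither of the preferred capabilities is held, so a process that
already has CAP_NET_RAW never triggers the warning.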