Message ID | 1353070992-5552-9-git-send-email-ebiederm@xmission.com |
---|---|
State | Accepted, archived |
Delegated to: | David Miller |
On 11/16/2012 05:03 PM, Eric W. Biederman wrote:
> +	if (!capable(CAP_NET_ADMIN))
> +		return -EPERM;
> +
>  	return netdev_store(dev, attr, buf, len, change_tx_queue_len);

You mean ns_capable here?

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Glauber Costa <glommer@parallels.com> writes:

> On 11/16/2012 05:03 PM, Eric W. Biederman wrote:
>> +	if (!capable(CAP_NET_ADMIN))
>> +		return -EPERM;
>> +
>>  	return netdev_store(dev, attr, buf, len, change_tx_queue_len);
>
> You mean ns_capable here?

No. There I meant capable.

I deliberately call capable here because I don't understand
tx_queue_len well enough to be certain it is safe to relax
that check to be just ns_capable.

My gut feel is that allowing an unprivileged user to be able to
arbitrarily change the tx_queue_len on a networking device would be a
nice way to allow queuing as many network packets as you would like with
kernel memory and DoSing the machine.

So since with a quick read of the code I could not convince myself it
was safe to allow unprivileged users to change tx_queue_len, I left it
protected by capable, while at the same time relaxing the check in
netdev_store to be ns_capable.

Eric
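To make the capable versus ns_capable distinction above concrete, here is a small user-space model in C. It is only a sketch, not kernel code: `struct user_ns`, `struct task`, `model_ns_capable` and `model_capable` are hypothetical names invented for illustration, and the model assumes the only rule is "a capability held in a namespace also applies to that namespace's descendants" (it ignores the special treatment the kernel gives to a namespace's owner).

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model, NOT kernel code: a user namespace with a link to its parent. */
struct user_ns {
	const struct user_ns *parent;	/* NULL for the initial namespace */
};

/* Toy task: the namespace it lives in, and whether it holds
 * CAP_NET_ADMIN in that namespace. */
struct task {
	const struct user_ns *ns;
	bool has_net_admin;
};

/* Stand-in for the initial user namespace. */
static const struct user_ns init_ns = { NULL };

/*
 * Models ns_capable(target, CAP_NET_ADMIN): true when the task holds the
 * capability in 'target' itself or in one of the ancestors of 'target'.
 */
static bool model_ns_capable(const struct task *t, const struct user_ns *target)
{
	const struct user_ns *ns;

	if (!t->has_net_admin)
		return false;
	for (ns = target; ns != NULL; ns = ns->parent)
		if (ns == t->ns)
			return true;
	return false;
}

/* Models capable(CAP_NET_ADMIN): the same check against the initial namespace. */
static bool model_capable(const struct task *t)
{
	return model_ns_capable(t, &init_ns);
}
```

In this model, a task that is "root" only inside a child namespace passes `model_ns_capable` for that child but fails `model_capable`, which is the behaviour the patch keeps for tx_queue_len.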
On Fri, 2012-11-16 at 06:32 -0800, Eric W. Biederman wrote:
> Glauber Costa <glommer@parallels.com> writes:
>
> > On 11/16/2012 05:03 PM, Eric W. Biederman wrote:
> >> +	if (!capable(CAP_NET_ADMIN))
> >> +		return -EPERM;
> >> +
> >>  	return netdev_store(dev, attr, buf, len, change_tx_queue_len);
> >
> > You mean ns_capable here?
>
> No. There I meant capable.
>
> I deliberately call capable here because I don't understand
> tx_queue_len well enough to be certain it is safe to relax
> that check to be just ns_capable.
>
> My gut feel is that allowing an unprivileged user to be able to
> arbitrarily change the tx_queue_len on a networking device would be a
> nice way to allow queuing as many network packets as you would like with
> kernel memory and DOSing the machine.
>
> So since with a quick read of the code I could not convince myself it
> was safe to allow unprivileged users to change tx_queue_len I left it
> protected by capable. While at the same time I relaxed the check in
> netdev_store to be ns_capable.

For the same reason you had better be very selective about which ethtool
commands are allowed based on per-user_ns CAP_NET_ADMIN. Consider for a
start:

ETHTOOL_SMSGLVL => fill up the system log
ETHTOOL_SEEPROM => brick the NIC
ETHTOOL_FLASHDEV => brick the NIC; own the system if it's not using an IOMMU

Ben.
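One way to picture the selectivity Ben asks for is a per-command lookup that decides which level of privilege an ethtool command needs. The following is a hedged sketch and not part of the posted patch: `enum required_priv` and `ethtool_required_priv` are hypothetical names, only the three commands Ben lists are treated as hardware-affecting, and the command constants are assumed to come from the `<linux/ethtool.h>` UAPI header.

```c
#include <linux/ethtool.h>

/* Hypothetical privilege tiers, for illustration only. */
enum required_priv {
	PRIV_NS_NET_ADMIN,	/* CAP_NET_ADMIN in the netns' user namespace suffices */
	PRIV_INIT_NET_ADMIN,	/* CAP_NET_ADMIN in the initial user namespace needed */
};

/*
 * Sketch of a selective policy: the commands named in the mail above keep
 * the stricter requirement, everything else falls through to the relaxed
 * per-namespace check. A real policy would need a reviewed list.
 */
static enum required_priv ethtool_required_priv(unsigned int ethcmd)
{
	switch (ethcmd) {
	case ETHTOOL_SMSGLVL:	/* can fill up the system log */
	case ETHTOOL_SEEPROM:	/* can brick the NIC */
	case ETHTOOL_FLASHDEV:	/* can brick the NIC or worse */
		return PRIV_INIT_NET_ADMIN;
	default:
		return PRIV_NS_NET_ADMIN;
	}
}
```

A caller would then pick between a capable-style and an ns_capable-style check based on the returned tier; the sketch deliberately leaves that wiring out.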
Ben Hutchings <bhutchings@solarflare.com> writes:

> On Fri, 2012-11-16 at 06:32 -0800, Eric W. Biederman wrote:
>> Glauber Costa <glommer@parallels.com> writes:
>>
>> > On 11/16/2012 05:03 PM, Eric W. Biederman wrote:
>> >> +	if (!capable(CAP_NET_ADMIN))
>> >> +		return -EPERM;
>> >> +
>> >>  	return netdev_store(dev, attr, buf, len, change_tx_queue_len);
>> >
>> > You mean ns_capable here?
>>
>> No. There I meant capable.
>>
>> I deliberately call capable here because I don't understand
>> tx_queue_len well enough to be certain it is safe to relax
>> that check to be just ns_capable.
>>
>> My gut feel is that allowing an unprivileged user to be able to
>> arbitrarily change the tx_queue_len on a networking device would be a
>> nice way to allow queuing as many network packets as you would like with
>> kernel memory and DOSing the machine.
>>
>> So since with a quick read of the code I could not convince myself it
>> was safe to allow unprivileged users to change tx_queue_len I left it
>> protected by capable. While at the same time I relaxed the check in
>> netdev_store to be ns_capable.
>
> For the same reason you had better be very selective about which ethtool
> commands are allowed based on per-user_ns CAP_NET_ADMIN. Consider for a
> start:
>
> ETHTOOL_SEEPROM => brick the NIC
> ETHTOOL_FLASHDEV => brick the NIC; own the system if it's not using an IOMMU

These are prevented by not having access to real hardware by default. A
physical network interface must be moved into a network namespace for
you to have access to it.

There are a handful of software network devices that are generally safe:
macvlan, veth, tun, ipip tunnels, etc. Using those network devices is
very interesting and about as performant as you can get while still
being safe.

A buffer overflow in an ethtool command looks as likely to me as being
able to own the system by reflashing the NIC.

Access to a real physical NIC is an act of trust. Given the general
Linux policy that drivers are merged when they mostly work, I don't
currently know of any trust models between "I trust you with full access
to this device" and "I don't trust you with direct access to this
device" that I would feel confident giving to an untrusted user.

Which is a convoluted way of saying "ip link set eth0 netns bob" is the
moral equivalent of "chown bob.bob /dev/eth0; chmod u+rwx /dev/eth0"

> ETHTOOL_SMSGLVL => fill up the system log

That one might be worth doing something about, as there is a non-local
effect. Still, I don't believe ETHTOOL_SMSGLVL has any effect for any of
the software-based "safe" networking devices, and being able to tweak
the debug level could be important for debugging if you do have direct
access to the NIC.

Eric
On Fri, 2012-11-16 at 18:46 -0800, Eric W. Biederman wrote:
> Ben Hutchings <bhutchings@solarflare.com> writes:
>
> > On Fri, 2012-11-16 at 06:32 -0800, Eric W. Biederman wrote:
> >> Glauber Costa <glommer@parallels.com> writes:
> >>
> >> > On 11/16/2012 05:03 PM, Eric W. Biederman wrote:
> >> >> +	if (!capable(CAP_NET_ADMIN))
> >> >> +		return -EPERM;
> >> >> +
> >> >>  	return netdev_store(dev, attr, buf, len, change_tx_queue_len);
> >> >
> >> > You mean ns_capable here?
> >>
> >> No. There I meant capable.
> >>
> >> I deliberately call capable here because I don't understand
> >> tx_queue_len well enough to be certain it is safe to relax
> >> that check to be just ns_capable.
> >>
> >> My gut feel is that allowing an unprivileged user to be able to
> >> arbitrarily change the tx_queue_len on a networking device would be a
> >> nice way to allow queuing as many network packets as you would like with
> >> kernel memory and DOSing the machine.
> >>
> >> So since with a quick read of the code I could not convince myself it
> >> was safe to allow unprivileged users to change tx_queue_len I left it
> >> protected by capable. While at the same time I relaxed the check in
> >> netdev_store to be ns_capable.
> >
> > For the same reason you had better be very selective about which ethtool
> > commands are allowed based on per-user_ns CAP_NET_ADMIN. Consider for a
> > start:
> >
> > ETHTOOL_SEEPROM => brick the NIC
> > ETHTOOL_FLASHDEV => brick the NIC; own the system if it's not using an IOMMU
>
> These are prevented by not having access to real hardware by default. A
> physical network interface must be moved into a network namespace for
> you to have access to it.

Yes, I realise that. The question is whether you would expect anything
in a container to be able to do those things, even with a physical net
device assigned to it.

Actually we have the same issue without considering containers - should
CAP_NET_ADMIN really give you low-level control over hardware just
because it's networking hardware? I think some of these ethtool
operations, and access to non-standard MDIO registers, should perhaps
require an additional capability (CAP_SYS_ADMIN or CAP_SYS_RAWIO?).

> There are a handful of software network devices that are generally safe
> macvlan, veth, tun, ipip tunnels, etc. Using those network devices is
> very interesting and about as performant as you can get while still
> being safe.
>
> A buffer overflow in an ethtool command looks as likely to me as being
> able to own the system by reflashing the NIC.

Sure, if you can find one. But on many NICs the firmware can perform
more or less arbitrary DMA *by design* (one reason for using IOMMUs),
and the ability to update the firmware is not a bug to be fixed!

> Access to a real physical NIC is an act of trust. Given the general
> linux policy that drivers are merged when they mostly work I don't
> currently know of any trust models between "I trust you with full access
> to this device" and "I don't trust you with direct access to this
> device" that I would feel confident giving to an untrusted user.

At the moment it's 'I trust you with full access to *all* network
devices' (init ns CAP_NET_ADMIN), 'I trust you with some reconfiguration
of these network devices' (other ns CAP_NET_ADMIN) and 'I don't trust
you...' You're expanding what other-ns-CAP_NET_ADMIN means, to 'I trust
you with full access to these network devices'.

> Which is a convoluted way of saying "ip link set eth0 netns bob" is the
> moral equivalent of "chown bob.bob /dev/eth0; chmod u+rwx /dev/eth0"
[...]

And it's previously been decided that ownership of a block device still
should *not* mean full control over it (see responses to CVE-2011-4127).

Ben.
diff --git a/net/core/dev.c b/net/core/dev.c
index 09cb3f6..7150ea9 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5200,7 +5200,7 @@ int dev_ioctl(struct net *net, unsigned int cmd, void __user *arg)
 	case SIOCGMIIPHY:
 	case SIOCGMIIREG:
 	case SIOCSIFNAME:
-		if (!capable(CAP_NET_ADMIN))
+		if (!ns_capable(net->user_ns, CAP_NET_ADMIN))
 			return -EPERM;
 		dev_load(net, ifr.ifr_name);
 		rtnl_lock();
@@ -5221,16 +5221,25 @@ int dev_ioctl(struct net *net, unsigned int cmd, void __user *arg)
 	 *	- require strict serialization.
 	 *	- do not return a value
 	 */
+	case SIOCSIFMAP:
+	case SIOCSIFTXQLEN:
+		if (!capable(CAP_NET_ADMIN))
+			return -EPERM;
+		/* fall through */
+	/*
+	 *	These ioctl calls:
+	 *	- require local superuser power.
+	 *	- require strict serialization.
+	 *	- do not return a value
+	 */
 	case SIOCSIFFLAGS:
 	case SIOCSIFMETRIC:
 	case SIOCSIFMTU:
-	case SIOCSIFMAP:
 	case SIOCSIFHWADDR:
 	case SIOCSIFSLAVE:
 	case SIOCADDMULTI:
 	case SIOCDELMULTI:
 	case SIOCSIFHWBROADCAST:
-	case SIOCSIFTXQLEN:
 	case SIOCSMIIREG:
 	case SIOCBONDENSLAVE:
 	case SIOCBONDRELEASE:
@@ -5239,7 +5248,7 @@ int dev_ioctl(struct net *net, unsigned int cmd, void __user *arg)
 	case SIOCBRADDIF:
 	case SIOCBRDELIF:
 	case SIOCSHWTSTAMP:
-		if (!capable(CAP_NET_ADMIN))
+		if (!ns_capable(net->user_ns, CAP_NET_ADMIN))
 			return -EPERM;
 		/* fall through */
 	case SIOCBONDSLAVEINFOQUERY:
diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index 4d64cc2..a870543 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -1460,7 +1460,7 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
 	case ETHTOOL_GEEE:
 		break;
 	default:
-		if (!capable(CAP_NET_ADMIN))
+		if (!ns_capable(net->user_ns, CAP_NET_ADMIN))
 			return -EPERM;
 	}
 
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index bcf02f6..c66b8c2 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -73,11 +73,12 @@ static ssize_t netdev_store(struct device *dev, struct device_attribute *attr,
 			    const char *buf, size_t len,
 			    int (*set)(struct net_device *, unsigned long))
 {
-	struct net_device *net = to_net_dev(dev);
+	struct net_device *netdev = to_net_dev(dev);
+	struct net *net = dev_net(netdev);
 	unsigned long new;
 	int ret = -EINVAL;
 
-	if (!capable(CAP_NET_ADMIN))
+	if (!ns_capable(net->user_ns, CAP_NET_ADMIN))
 		return -EPERM;
 
 	ret = kstrtoul(buf, 0, &new);
@@ -87,8 +88,8 @@ static ssize_t netdev_store(struct device *dev, struct device_attribute *attr,
 	if (!rtnl_trylock())
 		return restart_syscall();
 
-	if (dev_isalive(net)) {
-		if ((ret = (*set)(net, new)) == 0)
+	if (dev_isalive(netdev)) {
+		if ((ret = (*set)(netdev, new)) == 0)
 			ret = len;
 	}
 	rtnl_unlock();
@@ -264,6 +265,9 @@ static ssize_t store_tx_queue_len(struct device *dev,
 				  struct device_attribute *attr,
 				  const char *buf, size_t len)
 {
+	if (!capable(CAP_NET_ADMIN))
+		return -EPERM;
+
 	return netdev_store(dev, attr, buf, len, change_tx_queue_len);
 }
 
@@ -271,10 +275,11 @@ static ssize_t store_ifalias(struct device *dev, struct device_attribute *attr,
 			     const char *buf, size_t len)
 {
 	struct net_device *netdev = to_net_dev(dev);
+	struct net *net = dev_net(netdev);
 	size_t count = len;
 	ssize_t ret;
 
-	if (!capable(CAP_NET_ADMIN))
+	if (!ns_capable(net->user_ns, CAP_NET_ADMIN))
 		return -EPERM;
 
 	/* ignore trailing newline */
diff --git a/net/core/sock.c b/net/core/sock.c
index 8a146cf..85d75cb 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -515,7 +515,7 @@ static int sock_bindtodevice(struct sock *sk, char __user *optval, int optlen)
 
 	/* Sorry... */
 	ret = -EPERM;
-	if (!capable(CAP_NET_RAW))
+	if (!ns_capable(net->user_ns, CAP_NET_RAW))
 		goto out;
 
 	ret = -EINVAL;
@@ -696,7 +696,8 @@ set_rcvbuf:
 		break;
 
 	case SO_PRIORITY:
-		if ((val >= 0 && val <= 6) || capable(CAP_NET_ADMIN))
+		if ((val >= 0 && val <= 6) ||
+		    ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN))
 			sk->sk_priority = val;
 		else
 			ret = -EPERM;
@@ -813,7 +814,7 @@ set_rcvbuf:
 			clear_bit(SOCK_PASSSEC, &sock->flags);
 		break;
 	case SO_MARK:
-		if (!capable(CAP_NET_ADMIN))
+		if (!ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN))
 			ret = -EPERM;
 		else
 			sk->sk_mark = val;