diff mbox series

[net-next,v2,08/10] net-sysfs: add netdev_change_owner()

Message ID 20200217161436.1748598-9-christian.brauner@ubuntu.com
State Changes Requested
Delegated to: David Miller
Headers show
Series net: fix sysfs permssions when device changes network | expand

Commit Message

Christian Brauner Feb. 17, 2020, 4:14 p.m. UTC
Add a function to change the owner of a network device when it is moved
between network namespaces.

Currently, when moving network devices between network namespaces the
ownership of the corresponding sysfs entries is not changed. This leads
to problems when tools try to operate on the corresponding sysfs files.
This leads to a bug whereby a network device that is created in a
network namespaces owned by a user namespace will have its corresponding
sysfs entry owned by the root user of the corresponding user namespace.
If such a network device has to be moved back to the host network
namespace the permissions will still be set to the user namespaces. This
means unprivileged users can e.g. trigger uevents for such incorrectly
owned devices. They can also modify the settings of the device itself.
Both of these things are unwanted.

For example, workloads will create network devices in the host network
namespace. Other tools will then proceed to move such devices between
network namespaces owner by other user namespaces. While the ownership
of the device itself is updated in
net/core/net-sysfs.c:dev_change_net_namespace() the corresponding sysfs
entry for the device is not:

drwxr-xr-x 5 nobody nobody    0 Jan 25 18:08 .
drwxr-xr-x 9 nobody nobody    0 Jan 25 18:08 ..
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 addr_assign_type
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 addr_len
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 address
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 broadcast
-rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 carrier
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 carrier_changes
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 carrier_down_count
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 carrier_up_count
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 dev_id
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 dev_port
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 dormant
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 duplex
-rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 flags
-rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 gro_flush_timeout
-rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 ifalias
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 ifindex
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 iflink
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 link_mode
-rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 mtu
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 name_assign_type
-rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 netdev_group
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 operstate
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 phys_port_id
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 phys_port_name
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 phys_switch_id
drwxr-xr-x 2 nobody nobody    0 Jan 25 18:09 power
-rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 proto_down
drwxr-xr-x 4 nobody nobody    0 Jan 25 18:09 queues
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 speed
drwxr-xr-x 2 nobody nobody    0 Jan 25 18:09 statistics
lrwxrwxrwx 1 nobody nobody    0 Jan 25 18:08 subsystem -> ../../../../class/net
-rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 tx_queue_len
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 type
-rw-r--r-- 1 nobody nobody 4096 Jan 25 18:08 uevent

However, if a device is created directly in the network namespace then
the device's sysfs permissions will be correctly updated:

drwxr-xr-x 5 root   root      0 Jan 25 18:12 .
drwxr-xr-x 9 nobody nobody    0 Jan 25 18:08 ..
-r--r--r-- 1 root   root   4096 Jan 25 18:12 addr_assign_type
-r--r--r-- 1 root   root   4096 Jan 25 18:12 addr_len
-r--r--r-- 1 root   root   4096 Jan 25 18:12 address
-r--r--r-- 1 root   root   4096 Jan 25 18:12 broadcast
-rw-r--r-- 1 root   root   4096 Jan 25 18:12 carrier
-r--r--r-- 1 root   root   4096 Jan 25 18:12 carrier_changes
-r--r--r-- 1 root   root   4096 Jan 25 18:12 carrier_down_count
-r--r--r-- 1 root   root   4096 Jan 25 18:12 carrier_up_count
-r--r--r-- 1 root   root   4096 Jan 25 18:12 dev_id
-r--r--r-- 1 root   root   4096 Jan 25 18:12 dev_port
-r--r--r-- 1 root   root   4096 Jan 25 18:12 dormant
-r--r--r-- 1 root   root   4096 Jan 25 18:12 duplex
-rw-r--r-- 1 root   root   4096 Jan 25 18:12 flags
-rw-r--r-- 1 root   root   4096 Jan 25 18:12 gro_flush_timeout
-rw-r--r-- 1 root   root   4096 Jan 25 18:12 ifalias
-r--r--r-- 1 root   root   4096 Jan 25 18:12 ifindex
-r--r--r-- 1 root   root   4096 Jan 25 18:12 iflink
-r--r--r-- 1 root   root   4096 Jan 25 18:12 link_mode
-rw-r--r-- 1 root   root   4096 Jan 25 18:12 mtu
-r--r--r-- 1 root   root   4096 Jan 25 18:12 name_assign_type
-rw-r--r-- 1 root   root   4096 Jan 25 18:12 netdev_group
-r--r--r-- 1 root   root   4096 Jan 25 18:12 operstate
-r--r--r-- 1 root   root   4096 Jan 25 18:12 phys_port_id
-r--r--r-- 1 root   root   4096 Jan 25 18:12 phys_port_name
-r--r--r-- 1 root   root   4096 Jan 25 18:12 phys_switch_id
drwxr-xr-x 2 root   root      0 Jan 25 18:12 power
-rw-r--r-- 1 root   root   4096 Jan 25 18:12 proto_down
drwxr-xr-x 4 root   root      0 Jan 25 18:12 queues
-r--r--r-- 1 root   root   4096 Jan 25 18:12 speed
drwxr-xr-x 2 root   root      0 Jan 25 18:12 statistics
lrwxrwxrwx 1 nobody nobody    0 Jan 25 18:12 subsystem -> ../../../../class/net
-rw-r--r-- 1 root   root   4096 Jan 25 18:12 tx_queue_len
-r--r--r-- 1 root   root   4096 Jan 25 18:12 type
-rw-r--r-- 1 root   root   4096 Jan 25 18:12 uevent

Now, when creating a network device in a network namespace owned by a
user namespace and moving it to the host the permissions will be set to
the id that the user namespace root user has been mapped to on the host
leading to all sorts of permission issues:

458752
drwxr-xr-x 5 458752 458752      0 Jan 25 18:12 .
drwxr-xr-x 9 root   root        0 Jan 25 18:08 ..
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 addr_assign_type
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 addr_len
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 address
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 broadcast
-rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 carrier
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 carrier_changes
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 carrier_down_count
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 carrier_up_count
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 dev_id
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 dev_port
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 dormant
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 duplex
-rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 flags
-rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 gro_flush_timeout
-rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 ifalias
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 ifindex
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 iflink
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 link_mode
-rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 mtu
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 name_assign_type
-rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 netdev_group
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 operstate
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 phys_port_id
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 phys_port_name
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 phys_switch_id
drwxr-xr-x 2 458752 458752      0 Jan 25 18:12 power
-rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 proto_down
drwxr-xr-x 4 458752 458752      0 Jan 25 18:12 queues
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 speed
drwxr-xr-x 2 458752 458752      0 Jan 25 18:12 statistics
lrwxrwxrwx 1 root   root        0 Jan 25 18:12 subsystem -> ../../../../class/net
-rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 tx_queue_len
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 type
-rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 uevent

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
/* v2 */
unchanged
---
 net/core/net-sysfs.c | 27 +++++++++++++++++++++++++++
 net/core/net-sysfs.h |  2 ++
 2 files changed, 29 insertions(+)

Comments

gregkh@linuxfoundation.org Feb. 17, 2020, 4:28 p.m. UTC | #1
On Mon, Feb 17, 2020 at 05:14:34PM +0100, Christian Brauner wrote:
> Add a function to change the owner of a network device when it is moved
> between network namespaces.
> 
> Currently, when moving network devices between network namespaces the
> ownership of the corresponding sysfs entries is not changed. This leads
> to problems when tools try to operate on the corresponding sysfs files.
> This leads to a bug whereby a network device that is created in a
> network namespaces owned by a user namespace will have its corresponding
> sysfs entry owned by the root user of the corresponding user namespace.
> If such a network device has to be moved back to the host network
> namespace the permissions will still be set to the user namespaces. This
> means unprivileged users can e.g. trigger uevents for such incorrectly
> owned devices. They can also modify the settings of the device itself.
> Both of these things are unwanted.
> 
> For example, workloads will create network devices in the host network
> namespace. Other tools will then proceed to move such devices between
> network namespaces owner by other user namespaces. While the ownership
> of the device itself is updated in
> net/core/net-sysfs.c:dev_change_net_namespace() the corresponding sysfs
> entry for the device is not:
> 
> drwxr-xr-x 5 nobody nobody    0 Jan 25 18:08 .
> drwxr-xr-x 9 nobody nobody    0 Jan 25 18:08 ..
> -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 addr_assign_type
> -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 addr_len
> -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 address
> -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 broadcast
> -rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 carrier
> -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 carrier_changes
> -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 carrier_down_count
> -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 carrier_up_count
> -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 dev_id
> -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 dev_port
> -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 dormant
> -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 duplex
> -rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 flags
> -rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 gro_flush_timeout
> -rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 ifalias
> -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 ifindex
> -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 iflink
> -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 link_mode
> -rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 mtu
> -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 name_assign_type
> -rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 netdev_group
> -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 operstate
> -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 phys_port_id
> -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 phys_port_name
> -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 phys_switch_id
> drwxr-xr-x 2 nobody nobody    0 Jan 25 18:09 power
> -rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 proto_down
> drwxr-xr-x 4 nobody nobody    0 Jan 25 18:09 queues
> -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 speed
> drwxr-xr-x 2 nobody nobody    0 Jan 25 18:09 statistics
> lrwxrwxrwx 1 nobody nobody    0 Jan 25 18:08 subsystem -> ../../../../class/net
> -rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 tx_queue_len
> -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 type
> -rw-r--r-- 1 nobody nobody 4096 Jan 25 18:08 uevent
> 
> However, if a device is created directly in the network namespace then
> the device's sysfs permissions will be correctly updated:
> 
> drwxr-xr-x 5 root   root      0 Jan 25 18:12 .
> drwxr-xr-x 9 nobody nobody    0 Jan 25 18:08 ..
> -r--r--r-- 1 root   root   4096 Jan 25 18:12 addr_assign_type
> -r--r--r-- 1 root   root   4096 Jan 25 18:12 addr_len
> -r--r--r-- 1 root   root   4096 Jan 25 18:12 address
> -r--r--r-- 1 root   root   4096 Jan 25 18:12 broadcast
> -rw-r--r-- 1 root   root   4096 Jan 25 18:12 carrier
> -r--r--r-- 1 root   root   4096 Jan 25 18:12 carrier_changes
> -r--r--r-- 1 root   root   4096 Jan 25 18:12 carrier_down_count
> -r--r--r-- 1 root   root   4096 Jan 25 18:12 carrier_up_count
> -r--r--r-- 1 root   root   4096 Jan 25 18:12 dev_id
> -r--r--r-- 1 root   root   4096 Jan 25 18:12 dev_port
> -r--r--r-- 1 root   root   4096 Jan 25 18:12 dormant
> -r--r--r-- 1 root   root   4096 Jan 25 18:12 duplex
> -rw-r--r-- 1 root   root   4096 Jan 25 18:12 flags
> -rw-r--r-- 1 root   root   4096 Jan 25 18:12 gro_flush_timeout
> -rw-r--r-- 1 root   root   4096 Jan 25 18:12 ifalias
> -r--r--r-- 1 root   root   4096 Jan 25 18:12 ifindex
> -r--r--r-- 1 root   root   4096 Jan 25 18:12 iflink
> -r--r--r-- 1 root   root   4096 Jan 25 18:12 link_mode
> -rw-r--r-- 1 root   root   4096 Jan 25 18:12 mtu
> -r--r--r-- 1 root   root   4096 Jan 25 18:12 name_assign_type
> -rw-r--r-- 1 root   root   4096 Jan 25 18:12 netdev_group
> -r--r--r-- 1 root   root   4096 Jan 25 18:12 operstate
> -r--r--r-- 1 root   root   4096 Jan 25 18:12 phys_port_id
> -r--r--r-- 1 root   root   4096 Jan 25 18:12 phys_port_name
> -r--r--r-- 1 root   root   4096 Jan 25 18:12 phys_switch_id
> drwxr-xr-x 2 root   root      0 Jan 25 18:12 power
> -rw-r--r-- 1 root   root   4096 Jan 25 18:12 proto_down
> drwxr-xr-x 4 root   root      0 Jan 25 18:12 queues
> -r--r--r-- 1 root   root   4096 Jan 25 18:12 speed
> drwxr-xr-x 2 root   root      0 Jan 25 18:12 statistics
> lrwxrwxrwx 1 nobody nobody    0 Jan 25 18:12 subsystem -> ../../../../class/net
> -rw-r--r-- 1 root   root   4096 Jan 25 18:12 tx_queue_len
> -r--r--r-- 1 root   root   4096 Jan 25 18:12 type
> -rw-r--r-- 1 root   root   4096 Jan 25 18:12 uevent
> 
> Now, when creating a network device in a network namespace owned by a
> user namespace and moving it to the host the permissions will be set to
> the id that the user namespace root user has been mapped to on the host
> leading to all sorts of permission issues:
> 
> 458752
> drwxr-xr-x 5 458752 458752      0 Jan 25 18:12 .
> drwxr-xr-x 9 root   root        0 Jan 25 18:08 ..
> -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 addr_assign_type
> -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 addr_len
> -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 address
> -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 broadcast
> -rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 carrier
> -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 carrier_changes
> -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 carrier_down_count
> -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 carrier_up_count
> -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 dev_id
> -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 dev_port
> -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 dormant
> -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 duplex
> -rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 flags
> -rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 gro_flush_timeout
> -rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 ifalias
> -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 ifindex
> -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 iflink
> -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 link_mode
> -rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 mtu
> -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 name_assign_type
> -rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 netdev_group
> -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 operstate
> -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 phys_port_id
> -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 phys_port_name
> -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 phys_switch_id
> drwxr-xr-x 2 458752 458752      0 Jan 25 18:12 power
> -rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 proto_down
> drwxr-xr-x 4 458752 458752      0 Jan 25 18:12 queues
> -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 speed
> drwxr-xr-x 2 458752 458752      0 Jan 25 18:12 statistics
> lrwxrwxrwx 1 root   root        0 Jan 25 18:12 subsystem -> ../../../../class/net
> -rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 tx_queue_len
> -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 type
> -rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 uevent
> 
> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
> ---
> /* v2 */
> unchanged
> ---
>  net/core/net-sysfs.c | 27 +++++++++++++++++++++++++++
>  net/core/net-sysfs.h |  2 ++
>  2 files changed, 29 insertions(+)
> 
> diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
> index 4c826b8bf9b1..4fda021edf6d 100644
> --- a/net/core/net-sysfs.c
> +++ b/net/core/net-sysfs.c
> @@ -1767,6 +1767,33 @@ int netdev_register_kobject(struct net_device *ndev)
>  	return error;
>  }
>  
> +/* Change owner for sysfs entries when moving network devices across network
> + * namespaces owned by different user namespaces.
> + */
> +int netdev_change_owner(struct net_device *ndev, const struct net *net_old,
> +			const struct net *net_new)
> +{
> +	struct device *dev = &ndev->dev;
> +	kuid_t old_uid, new_uid;
> +	kgid_t old_gid, new_gid;
> +	int error;
> +
> +	net_ns_get_ownership(net_old, &old_uid, &old_gid);
> +	net_ns_get_ownership(net_new, &new_uid, &new_gid);
> +
> +	/* The network namespace was changed but the owning user namespace is
> +	 * identical so there's no need to change the owner of sysfs entries.
> +	 */
> +	if (uid_eq(old_uid, new_uid) && gid_eq(old_gid, new_gid))
> +		return 0;
> +
> +	error = device_change_owner(dev);

Ok, maybe I'm slow here, but what actually changed the gid/uid here?
How did it change?  All you did was look up the old uid/gid and the new
uid/gid.  But what set the device to the new one?

All of these functions are just really odd to me, one would think that a
"change owner" function would have the new owner in the paramter to know
what to change it to?  Your documentation says "owner must be changed
before calling this function", but how did that get changed and who
changed it?

Why not just pass it as part of the function itself?

Otherwise it looks really odd, like the above call.  As I can't see how
anything changes here at all by reading this code.  And that's a huge
sign of a bad API, when the maintainer of the subsystem can not even
understand how someone is using it with a single function call :)

thanks,

greg k-h
Christian Brauner Feb. 17, 2020, 4:58 p.m. UTC | #2
On Mon, Feb 17, 2020 at 05:28:03PM +0100, Greg Kroah-Hartman wrote:
> On Mon, Feb 17, 2020 at 05:14:34PM +0100, Christian Brauner wrote:
> > Add a function to change the owner of a network device when it is moved
> > between network namespaces.
> > 
> > Currently, when moving network devices between network namespaces the
> > ownership of the corresponding sysfs entries is not changed. This leads
> > to problems when tools try to operate on the corresponding sysfs files.
> > This leads to a bug whereby a network device that is created in a
> > network namespaces owned by a user namespace will have its corresponding
> > sysfs entry owned by the root user of the corresponding user namespace.
> > If such a network device has to be moved back to the host network
> > namespace the permissions will still be set to the user namespaces. This
> > means unprivileged users can e.g. trigger uevents for such incorrectly
> > owned devices. They can also modify the settings of the device itself.
> > Both of these things are unwanted.
> > 
> > For example, workloads will create network devices in the host network
> > namespace. Other tools will then proceed to move such devices between
> > network namespaces owner by other user namespaces. While the ownership
> > of the device itself is updated in
> > net/core/net-sysfs.c:dev_change_net_namespace() the corresponding sysfs
> > entry for the device is not:
> > 
> > drwxr-xr-x 5 nobody nobody    0 Jan 25 18:08 .
> > drwxr-xr-x 9 nobody nobody    0 Jan 25 18:08 ..
> > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 addr_assign_type
> > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 addr_len
> > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 address
> > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 broadcast
> > -rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 carrier
> > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 carrier_changes
> > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 carrier_down_count
> > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 carrier_up_count
> > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 dev_id
> > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 dev_port
> > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 dormant
> > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 duplex
> > -rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 flags
> > -rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 gro_flush_timeout
> > -rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 ifalias
> > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 ifindex
> > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 iflink
> > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 link_mode
> > -rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 mtu
> > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 name_assign_type
> > -rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 netdev_group
> > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 operstate
> > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 phys_port_id
> > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 phys_port_name
> > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 phys_switch_id
> > drwxr-xr-x 2 nobody nobody    0 Jan 25 18:09 power
> > -rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 proto_down
> > drwxr-xr-x 4 nobody nobody    0 Jan 25 18:09 queues
> > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 speed
> > drwxr-xr-x 2 nobody nobody    0 Jan 25 18:09 statistics
> > lrwxrwxrwx 1 nobody nobody    0 Jan 25 18:08 subsystem -> ../../../../class/net
> > -rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 tx_queue_len
> > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 type
> > -rw-r--r-- 1 nobody nobody 4096 Jan 25 18:08 uevent
> > 
> > However, if a device is created directly in the network namespace then
> > the device's sysfs permissions will be correctly updated:
> > 
> > drwxr-xr-x 5 root   root      0 Jan 25 18:12 .
> > drwxr-xr-x 9 nobody nobody    0 Jan 25 18:08 ..
> > -r--r--r-- 1 root   root   4096 Jan 25 18:12 addr_assign_type
> > -r--r--r-- 1 root   root   4096 Jan 25 18:12 addr_len
> > -r--r--r-- 1 root   root   4096 Jan 25 18:12 address
> > -r--r--r-- 1 root   root   4096 Jan 25 18:12 broadcast
> > -rw-r--r-- 1 root   root   4096 Jan 25 18:12 carrier
> > -r--r--r-- 1 root   root   4096 Jan 25 18:12 carrier_changes
> > -r--r--r-- 1 root   root   4096 Jan 25 18:12 carrier_down_count
> > -r--r--r-- 1 root   root   4096 Jan 25 18:12 carrier_up_count
> > -r--r--r-- 1 root   root   4096 Jan 25 18:12 dev_id
> > -r--r--r-- 1 root   root   4096 Jan 25 18:12 dev_port
> > -r--r--r-- 1 root   root   4096 Jan 25 18:12 dormant
> > -r--r--r-- 1 root   root   4096 Jan 25 18:12 duplex
> > -rw-r--r-- 1 root   root   4096 Jan 25 18:12 flags
> > -rw-r--r-- 1 root   root   4096 Jan 25 18:12 gro_flush_timeout
> > -rw-r--r-- 1 root   root   4096 Jan 25 18:12 ifalias
> > -r--r--r-- 1 root   root   4096 Jan 25 18:12 ifindex
> > -r--r--r-- 1 root   root   4096 Jan 25 18:12 iflink
> > -r--r--r-- 1 root   root   4096 Jan 25 18:12 link_mode
> > -rw-r--r-- 1 root   root   4096 Jan 25 18:12 mtu
> > -r--r--r-- 1 root   root   4096 Jan 25 18:12 name_assign_type
> > -rw-r--r-- 1 root   root   4096 Jan 25 18:12 netdev_group
> > -r--r--r-- 1 root   root   4096 Jan 25 18:12 operstate
> > -r--r--r-- 1 root   root   4096 Jan 25 18:12 phys_port_id
> > -r--r--r-- 1 root   root   4096 Jan 25 18:12 phys_port_name
> > -r--r--r-- 1 root   root   4096 Jan 25 18:12 phys_switch_id
> > drwxr-xr-x 2 root   root      0 Jan 25 18:12 power
> > -rw-r--r-- 1 root   root   4096 Jan 25 18:12 proto_down
> > drwxr-xr-x 4 root   root      0 Jan 25 18:12 queues
> > -r--r--r-- 1 root   root   4096 Jan 25 18:12 speed
> > drwxr-xr-x 2 root   root      0 Jan 25 18:12 statistics
> > lrwxrwxrwx 1 nobody nobody    0 Jan 25 18:12 subsystem -> ../../../../class/net
> > -rw-r--r-- 1 root   root   4096 Jan 25 18:12 tx_queue_len
> > -r--r--r-- 1 root   root   4096 Jan 25 18:12 type
> > -rw-r--r-- 1 root   root   4096 Jan 25 18:12 uevent
> > 
> > Now, when creating a network device in a network namespace owned by a
> > user namespace and moving it to the host the permissions will be set to
> > the id that the user namespace root user has been mapped to on the host
> > leading to all sorts of permission issues:
> > 
> > 458752
> > drwxr-xr-x 5 458752 458752      0 Jan 25 18:12 .
> > drwxr-xr-x 9 root   root        0 Jan 25 18:08 ..
> > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 addr_assign_type
> > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 addr_len
> > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 address
> > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 broadcast
> > -rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 carrier
> > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 carrier_changes
> > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 carrier_down_count
> > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 carrier_up_count
> > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 dev_id
> > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 dev_port
> > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 dormant
> > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 duplex
> > -rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 flags
> > -rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 gro_flush_timeout
> > -rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 ifalias
> > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 ifindex
> > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 iflink
> > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 link_mode
> > -rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 mtu
> > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 name_assign_type
> > -rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 netdev_group
> > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 operstate
> > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 phys_port_id
> > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 phys_port_name
> > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 phys_switch_id
> > drwxr-xr-x 2 458752 458752      0 Jan 25 18:12 power
> > -rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 proto_down
> > drwxr-xr-x 4 458752 458752      0 Jan 25 18:12 queues
> > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 speed
> > drwxr-xr-x 2 458752 458752      0 Jan 25 18:12 statistics
> > lrwxrwxrwx 1 root   root        0 Jan 25 18:12 subsystem -> ../../../../class/net
> > -rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 tx_queue_len
> > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 type
> > -rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 uevent
> > 
> > Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
> > ---
> > /* v2 */
> > unchanged
> > ---
> >  net/core/net-sysfs.c | 27 +++++++++++++++++++++++++++
> >  net/core/net-sysfs.h |  2 ++
> >  2 files changed, 29 insertions(+)
> > 
> > diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
> > index 4c826b8bf9b1..4fda021edf6d 100644
> > --- a/net/core/net-sysfs.c
> > +++ b/net/core/net-sysfs.c
> > @@ -1767,6 +1767,33 @@ int netdev_register_kobject(struct net_device *ndev)
> >  	return error;
> >  }
> >  
> > +/* Change owner for sysfs entries when moving network devices across network
> > + * namespaces owned by different user namespaces.
> > + */
> > +int netdev_change_owner(struct net_device *ndev, const struct net *net_old,
> > +			const struct net *net_new)
> > +{
> > +	struct device *dev = &ndev->dev;
> > +	kuid_t old_uid, new_uid;
> > +	kgid_t old_gid, new_gid;
> > +	int error;
> > +
> > +	net_ns_get_ownership(net_old, &old_uid, &old_gid);
> > +	net_ns_get_ownership(net_new, &new_uid, &new_gid);
> > +
> > +	/* The network namespace was changed but the owning user namespace is
> > +	 * identical so there's no need to change the owner of sysfs entries.
> > +	 */
> > +	if (uid_eq(old_uid, new_uid) && gid_eq(old_gid, new_gid))
> > +		return 0;
> > +
> > +	error = device_change_owner(dev);
> 
> Ok, maybe I'm slow here, but what actually changed the gid/uid here?
> How did it change?  All you did was look up the old uid/gid and the new
> uid/gid.  But what set the device to the new one?
> 
> All of these functions are just really odd to me, one would think that a
> "change owner" function would have the new owner in the paramter to know
> what to change it to?  Your documentation says "owner must be changed
> before calling this function", but how did that get changed and who
> changed it?
> 
> Why not just pass it as part of the function itself?
> 
> Otherwise it looks really odd, like the above call.  As I can't see how
> anything changes here at all by reading this code.  And that's a huge
> sign of a bad API, when the maintainer of the subsystem can not even
> understand how someone is using it with a single function call :)

So we've been briefly discussing this in the first iteration of this
series where I also explained why I in the first iteration I didn't add
explicit uid_t and gid_t parameters. I sugggested adding uid_t/gid_t
parameters in that thread but from your response in
https://lore.kernel.org/lkml/20200212160402.GA1799124@kroah.com/:

> > So ownership only changes if the kobject's uid/gid have been changed.
> > So when to stick with the networking example, when a network device is
> > moved into a new network namespace, the uid/gid of the kobject will be
> > changed to the root user of the owning user namespace of that network
> > namespace. So when the move of the network device has completed and
> > kobject_get_ownership() is called it will now return a different
> > uid/gid.
> 
> Ok, then this needs to say "change the uid/gid of the kobject to..." in
> order to explain what it is now being set to.  Otherwise this is really
> confusing if you only read the kerneldoc, right?

From this answer I took it that you did prefer not adding uid/gid
parameters. If that has changed or I misunderstood you then I can change
all these functions.

For clarity, for network namespaces: when the move of a network device
has completed dev_change_net_namespace() is called which changes the
ownership of the kobject associated with the net device. Now
kobject_get_ownership() will return the updated uids/gids. In any case,
asking explicitly: do you want to have those sysfs and device functions
take explicit uid and gid parameters? That sounds nicer to me as well. I
just didn't want to expose it because we alreay had this dynamic way of
setting ownership via kobject_get_ownership(). 

Thanks!
Christian
gregkh@linuxfoundation.org Feb. 17, 2020, 7:02 p.m. UTC | #3
On Mon, Feb 17, 2020 at 05:58:54PM +0100, Christian Brauner wrote:
> On Mon, Feb 17, 2020 at 05:28:03PM +0100, Greg Kroah-Hartman wrote:
> > On Mon, Feb 17, 2020 at 05:14:34PM +0100, Christian Brauner wrote:
> > > Add a function to change the owner of a network device when it is moved
> > > between network namespaces.
> > > 
> > > Currently, when moving network devices between network namespaces the
> > > ownership of the corresponding sysfs entries is not changed. This leads
> > > to problems when tools try to operate on the corresponding sysfs files.
> > > This leads to a bug whereby a network device that is created in a
> > > network namespaces owned by a user namespace will have its corresponding
> > > sysfs entry owned by the root user of the corresponding user namespace.
> > > If such a network device has to be moved back to the host network
> > > namespace the permissions will still be set to the user namespaces. This
> > > means unprivileged users can e.g. trigger uevents for such incorrectly
> > > owned devices. They can also modify the settings of the device itself.
> > > Both of these things are unwanted.
> > > 
> > > For example, workloads will create network devices in the host network
> > > namespace. Other tools will then proceed to move such devices between
> > > network namespaces owner by other user namespaces. While the ownership
> > > of the device itself is updated in
> > > net/core/net-sysfs.c:dev_change_net_namespace() the corresponding sysfs
> > > entry for the device is not:
> > > 
> > > drwxr-xr-x 5 nobody nobody    0 Jan 25 18:08 .
> > > drwxr-xr-x 9 nobody nobody    0 Jan 25 18:08 ..
> > > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 addr_assign_type
> > > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 addr_len
> > > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 address
> > > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 broadcast
> > > -rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 carrier
> > > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 carrier_changes
> > > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 carrier_down_count
> > > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 carrier_up_count
> > > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 dev_id
> > > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 dev_port
> > > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 dormant
> > > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 duplex
> > > -rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 flags
> > > -rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 gro_flush_timeout
> > > -rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 ifalias
> > > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 ifindex
> > > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 iflink
> > > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 link_mode
> > > -rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 mtu
> > > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 name_assign_type
> > > -rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 netdev_group
> > > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 operstate
> > > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 phys_port_id
> > > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 phys_port_name
> > > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 phys_switch_id
> > > drwxr-xr-x 2 nobody nobody    0 Jan 25 18:09 power
> > > -rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 proto_down
> > > drwxr-xr-x 4 nobody nobody    0 Jan 25 18:09 queues
> > > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 speed
> > > drwxr-xr-x 2 nobody nobody    0 Jan 25 18:09 statistics
> > > lrwxrwxrwx 1 nobody nobody    0 Jan 25 18:08 subsystem -> ../../../../class/net
> > > -rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 tx_queue_len
> > > -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 type
> > > -rw-r--r-- 1 nobody nobody 4096 Jan 25 18:08 uevent
> > > 
> > > However, if a device is created directly in the network namespace then
> > > the device's sysfs permissions will be correctly updated:
> > > 
> > > drwxr-xr-x 5 root   root      0 Jan 25 18:12 .
> > > drwxr-xr-x 9 nobody nobody    0 Jan 25 18:08 ..
> > > -r--r--r-- 1 root   root   4096 Jan 25 18:12 addr_assign_type
> > > -r--r--r-- 1 root   root   4096 Jan 25 18:12 addr_len
> > > -r--r--r-- 1 root   root   4096 Jan 25 18:12 address
> > > -r--r--r-- 1 root   root   4096 Jan 25 18:12 broadcast
> > > -rw-r--r-- 1 root   root   4096 Jan 25 18:12 carrier
> > > -r--r--r-- 1 root   root   4096 Jan 25 18:12 carrier_changes
> > > -r--r--r-- 1 root   root   4096 Jan 25 18:12 carrier_down_count
> > > -r--r--r-- 1 root   root   4096 Jan 25 18:12 carrier_up_count
> > > -r--r--r-- 1 root   root   4096 Jan 25 18:12 dev_id
> > > -r--r--r-- 1 root   root   4096 Jan 25 18:12 dev_port
> > > -r--r--r-- 1 root   root   4096 Jan 25 18:12 dormant
> > > -r--r--r-- 1 root   root   4096 Jan 25 18:12 duplex
> > > -rw-r--r-- 1 root   root   4096 Jan 25 18:12 flags
> > > -rw-r--r-- 1 root   root   4096 Jan 25 18:12 gro_flush_timeout
> > > -rw-r--r-- 1 root   root   4096 Jan 25 18:12 ifalias
> > > -r--r--r-- 1 root   root   4096 Jan 25 18:12 ifindex
> > > -r--r--r-- 1 root   root   4096 Jan 25 18:12 iflink
> > > -r--r--r-- 1 root   root   4096 Jan 25 18:12 link_mode
> > > -rw-r--r-- 1 root   root   4096 Jan 25 18:12 mtu
> > > -r--r--r-- 1 root   root   4096 Jan 25 18:12 name_assign_type
> > > -rw-r--r-- 1 root   root   4096 Jan 25 18:12 netdev_group
> > > -r--r--r-- 1 root   root   4096 Jan 25 18:12 operstate
> > > -r--r--r-- 1 root   root   4096 Jan 25 18:12 phys_port_id
> > > -r--r--r-- 1 root   root   4096 Jan 25 18:12 phys_port_name
> > > -r--r--r-- 1 root   root   4096 Jan 25 18:12 phys_switch_id
> > > drwxr-xr-x 2 root   root      0 Jan 25 18:12 power
> > > -rw-r--r-- 1 root   root   4096 Jan 25 18:12 proto_down
> > > drwxr-xr-x 4 root   root      0 Jan 25 18:12 queues
> > > -r--r--r-- 1 root   root   4096 Jan 25 18:12 speed
> > > drwxr-xr-x 2 root   root      0 Jan 25 18:12 statistics
> > > lrwxrwxrwx 1 nobody nobody    0 Jan 25 18:12 subsystem -> ../../../../class/net
> > > -rw-r--r-- 1 root   root   4096 Jan 25 18:12 tx_queue_len
> > > -r--r--r-- 1 root   root   4096 Jan 25 18:12 type
> > > -rw-r--r-- 1 root   root   4096 Jan 25 18:12 uevent
> > > 
> > > Now, when creating a network device in a network namespace owned by a
> > > user namespace and moving it to the host the permissions will be set to
> > > the id that the user namespace root user has been mapped to on the host
> > > leading to all sorts of permission issues:
> > > 
> > > 458752
> > > drwxr-xr-x 5 458752 458752      0 Jan 25 18:12 .
> > > drwxr-xr-x 9 root   root        0 Jan 25 18:08 ..
> > > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 addr_assign_type
> > > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 addr_len
> > > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 address
> > > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 broadcast
> > > -rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 carrier
> > > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 carrier_changes
> > > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 carrier_down_count
> > > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 carrier_up_count
> > > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 dev_id
> > > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 dev_port
> > > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 dormant
> > > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 duplex
> > > -rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 flags
> > > -rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 gro_flush_timeout
> > > -rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 ifalias
> > > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 ifindex
> > > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 iflink
> > > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 link_mode
> > > -rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 mtu
> > > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 name_assign_type
> > > -rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 netdev_group
> > > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 operstate
> > > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 phys_port_id
> > > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 phys_port_name
> > > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 phys_switch_id
> > > drwxr-xr-x 2 458752 458752      0 Jan 25 18:12 power
> > > -rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 proto_down
> > > drwxr-xr-x 4 458752 458752      0 Jan 25 18:12 queues
> > > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 speed
> > > drwxr-xr-x 2 458752 458752      0 Jan 25 18:12 statistics
> > > lrwxrwxrwx 1 root   root        0 Jan 25 18:12 subsystem -> ../../../../class/net
> > > -rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 tx_queue_len
> > > -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 type
> > > -rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 uevent
> > > 
> > > Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
> > > ---
> > > /* v2 */
> > > unchanged
> > > ---
> > >  net/core/net-sysfs.c | 27 +++++++++++++++++++++++++++
> > >  net/core/net-sysfs.h |  2 ++
> > >  2 files changed, 29 insertions(+)
> > > 
> > > diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
> > > index 4c826b8bf9b1..4fda021edf6d 100644
> > > --- a/net/core/net-sysfs.c
> > > +++ b/net/core/net-sysfs.c
> > > @@ -1767,6 +1767,33 @@ int netdev_register_kobject(struct net_device *ndev)
> > >  	return error;
> > >  }
> > >  
> > > +/* Change owner for sysfs entries when moving network devices across network
> > > + * namespaces owned by different user namespaces.
> > > + */
> > > +int netdev_change_owner(struct net_device *ndev, const struct net *net_old,
> > > +			const struct net *net_new)
> > > +{
> > > +	struct device *dev = &ndev->dev;
> > > +	kuid_t old_uid, new_uid;
> > > +	kgid_t old_gid, new_gid;
> > > +	int error;
> > > +
> > > +	net_ns_get_ownership(net_old, &old_uid, &old_gid);
> > > +	net_ns_get_ownership(net_new, &new_uid, &new_gid);
> > > +
> > > +	/* The network namespace was changed but the owning user namespace is
> > > +	 * identical so there's no need to change the owner of sysfs entries.
> > > +	 */
> > > +	if (uid_eq(old_uid, new_uid) && gid_eq(old_gid, new_gid))
> > > +		return 0;
> > > +
> > > +	error = device_change_owner(dev);
> > 
> > Ok, maybe I'm slow here, but what actually changed the gid/uid here?
> > How did it change?  All you did was look up the old uid/gid and the new
> > uid/gid.  But what set the device to the new one?
> > 
> > All of these functions are just really odd to me, one would think that a
> > "change owner" function would have the new owner in the paramter to know
> > what to change it to?  Your documentation says "owner must be changed
> > before calling this function", but how did that get changed and who
> > changed it?
> > 
> > Why not just pass it as part of the function itself?
> > 
> > Otherwise it looks really odd, like the above call.  As I can't see how
> > anything changes here at all by reading this code.  And that's a huge
> > sign of a bad API, when the maintainer of the subsystem can not even
> > understand how someone is using it with a single function call :)
> 
> So we've been briefly discussing this in the first iteration of this
> series where I also explained why I in the first iteration I didn't add
> explicit uid_t and gid_t parameters. I sugggested adding uid_t/gid_t
> parameters in that thread but from your response in
> https://lore.kernel.org/lkml/20200212160402.GA1799124@kroah.com/:
> 
> > > So ownership only changes if the kobject's uid/gid have been changed.
> > > So when to stick with the networking example, when a network device is
> > > moved into a new network namespace, the uid/gid of the kobject will be
> > > changed to the root user of the owning user namespace of that network
> > > namespace. So when the move of the network device has completed and
> > > kobject_get_ownership() is called it will now return a different
> > > uid/gid.
> > 
> > Ok, then this needs to say "change the uid/gid of the kobject to..." in
> > order to explain what it is now being set to.  Otherwise this is really
> > confusing if you only read the kerneldoc, right?
> 
> From this answer I took it that you did prefer not adding uid/gid
> parameters. If that has changed or I misunderstood you then I can change
> all these functions.

Sorry for the confusion.  Yes, the documentation is better now, but the
function parameters are still just as bad.  I think you need to add
uid/gid to the parameters to make it obvious what is happening for when
you just read the code, without seeing the function comments as no one
is going to dig them up somewhere else in another file.

> For clarity, for network namespaces: when the move of a network device
> has completed dev_change_net_namespace() is called which changes the
> ownership of the kobject associated with the net device. Now
> kobject_get_ownership() will return the updated uids/gids. In any case,
> asking explicitly: do you want to have those sysfs and device functions
> take explicit uid and gid parameters?

Yes please.

> That sounds nicer to me as well. I just didn't want to expose it
> because we alreay had this dynamic way of setting ownership via
> kobject_get_ownership(). 

I think that's totally not intutive :)

thanks,

greg k-h
diff mbox series

Patch

diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 4c826b8bf9b1..4fda021edf6d 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -1767,6 +1767,33 @@  int netdev_register_kobject(struct net_device *ndev)
 	return error;
 }
 
+/* Change owner for sysfs entries when moving network devices across network
+ * namespaces owned by different user namespaces.
+ */
+int netdev_change_owner(struct net_device *ndev, const struct net *net_old,
+			const struct net *net_new)
+{
+	struct device *dev = &ndev->dev;
+	kuid_t old_uid, new_uid;
+	kgid_t old_gid, new_gid;
+	int error;
+
+	net_ns_get_ownership(net_old, &old_uid, &old_gid);
+	net_ns_get_ownership(net_new, &new_uid, &new_gid);
+
+	/* The network namespace was changed but the owning user namespace is
+	 * identical so there's no need to change the owner of sysfs entries.
+	 */
+	if (uid_eq(old_uid, new_uid) && gid_eq(old_gid, new_gid))
+		return 0;
+
+	error = device_change_owner(dev);
+	if (error)
+		return error;
+
+	return 0;
+}
+
 int netdev_class_create_file_ns(const struct class_attribute *class_attr,
 				const void *ns)
 {
diff --git a/net/core/net-sysfs.h b/net/core/net-sysfs.h
index 006876c7b78d..8a5b04c2699a 100644
--- a/net/core/net-sysfs.h
+++ b/net/core/net-sysfs.h
@@ -8,5 +8,7 @@  void netdev_unregister_kobject(struct net_device *);
 int net_rx_queue_update_kobjects(struct net_device *, int old_num, int new_num);
 int netdev_queue_update_kobjects(struct net_device *net,
 				 int old_num, int new_num);
+int netdev_change_owner(struct net_device *, const struct net *net_old,
+			const struct net *net_new);
 
 #endif