[net] bpf: Use mount_nodev not mount_ns to mount the bpf filesystem

Message ID 874m9sfjyf.fsf_-_@x220.int.ebiederm.org
State Accepted, archived
Delegated to: David Miller

Commit Message

Eric W. Biederman May 20, 2016, 10:22 p.m. UTC
While reviewing the filesystems that set FS_USERNS_MOUNT I spotted the
bpf filesystem.  Looking at the code I saw a broken usage of mount_ns
with current->nsproxy->mnt_ns.  As the code does not acquire a
reference to the mount namespace, it cannot possibly be correct to
store the mount namespace on the superblock as it does.

Replace mount_ns with mount_nodev so that each mount of the bpf
filesystem returns a distinct instance, and the code is not buggy.

In discussion with Hannes Frederic Sowa it was reported that the use
of mount_ns was an attempt to have one bpf instance per mount
namespace, in order to keep resources that pin other resources from
being hidden.  That intent simply does not work; the vfs is not built
to allow that kind of behavior.  This means that the bpf filesystem
really is buggy both semantically and in its implementation, as it does
not and cannot implement the original intent.

This change is userspace visible, but my experience with similar
filesystems leads me to believe nothing will break with a model in
which each mount of the bpf filesystem is distinct from all others.

Fixes: b2197755b263 ("bpf: add support for persistent maps/progs")
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 kernel/bpf/inode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Hannes Frederic Sowa May 20, 2016, 11:27 p.m. UTC | #1
On 21.05.2016 00:22, Eric W. Biederman wrote:
> [...]

Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>

Thanks Eric!

David Miller May 20, 2016, 11:46 p.m. UTC | #2
From: ebiederm@xmission.com (Eric W. Biederman)
Date: Fri, 20 May 2016 17:22:48 -0500

> [...]

Applied and queued up for -stable, thanks everyone.

Patch

diff --git a/kernel/bpf/inode.c b/kernel/bpf/inode.c
index 8f94ca1860cf..55d923688f85 100644
--- a/kernel/bpf/inode.c
+++ b/kernel/bpf/inode.c
@@ -378,7 +378,7 @@  static int bpf_fill_super(struct super_block *sb, void *data, int silent)
 static struct dentry *bpf_mount(struct file_system_type *type, int flags,
 				const char *dev_name, void *data)
 {
-	return mount_ns(type, flags, current->nsproxy->mnt_ns, bpf_fill_super);
+	return mount_nodev(type, flags, data, bpf_fill_super);
 }
 
 static struct file_system_type bpf_fs_type = {