diff mbox

[ebiederm@xmission.com:,[REVIEW,3/3] vfs: Fix a regression in mounting proc]

Message ID 20131129220752.GB16889@sergelap
State New
Headers show

Commit Message

Serge E. Hallyn Nov. 29, 2013, 10:07 p.m. UTC
Hi,

This is the second patch needed in trusty's kernel for containers.  It
is in
https://git.kernel.org/cgit/linux/kernel/git/ebiederm/user-namespace.git/log/?h=for-next
(and I assume linux-next).  This fixes a regression recently introduced
which prevents using user namespaces for containers.

----- Forwarded message from "Eric W. Biederman" <ebiederm@xmission.com> -----

Date: Tue, 26 Nov 2013 16:17:36 -0800
From: "Eric W. Biederman" <ebiederm@xmission.com>
To: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Gao feng <gaofeng@cn.fujitsu.com>, Containers <containers@lists.linux-foundation.org>,
	linux-fsdevel@vger.kernel.org, Aditya Kali <adityakali@google.com>, Oleg Nesterov
	<oleg@redhat.com>, Andy Lutomirski <luto@amacapital.net>
Subject: [REVIEW][PATCH 3/3] vfs: Fix a regression in mounting proc


Gao feng <gaofeng@cn.fujitsu.com> reported that commit
e51db73532955dc5eaba4235e62b74b460709d5b
userns: Better restrictions on when proc and sysfs can be mounted
caused a regression on mounting a new instance of proc in a mount
namespace created with user namespace privileges, when binfmt_misc
is mounted on /proc/sys/fs/binfmt_misc.

This is an unintended regression caused by the absolutely bogus empty
directory check in fs_fully_visible.  The check fs_fully_visible replaced
didn't even bother to attempt to verify proc was fully visible and
hiding proc files with any kind of mount is rare.  So for now fix
the userspace regression by allowing directory with nlink == 1
as /proc/sys/fs/binfmt_misc has.

I will have a better patch but it is not stable material, or
last minute kernel material.  So it will have to wait.

Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 fs/namespace.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

Comments

Andy Whitcroft Dec. 2, 2013, 11:16 a.m. UTC | #1
On Fri, Nov 29, 2013 at 04:07:52PM -0600, Serge Hallyn wrote:
> Hi,
> 
> This is the second patch needed in trusty's kernel for containers.  It
> is in
> https://git.kernel.org/cgit/linux/kernel/git/ebiederm/user-namespace.git/log/?h=for-next
> (and I assume linux-next).  This fixes a regression recently introduced
> which prevents using user namespaces for containers.
> 
> ----- Forwarded message from "Eric W. Biederman" <ebiederm@xmission.com> -----
> 
> Date: Tue, 26 Nov 2013 16:17:36 -0800
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> To: "Serge E. Hallyn" <serge@hallyn.com>
> Cc: Gao feng <gaofeng@cn.fujitsu.com>, Containers <containers@lists.linux-foundation.org>,
> 	linux-fsdevel@vger.kernel.org, Aditya Kali <adityakali@google.com>, Oleg Nesterov
> 	<oleg@redhat.com>, Andy Lutomirski <luto@amacapital.net>
> Subject: [REVIEW][PATCH 3/3] vfs: Fix a regression in mounting proc
> 
> 
> Gao feng <gaofeng@cn.fujitsu.com> reported that commit
> e51db73532955dc5eaba4235e62b74b460709d5b
> userns: Better restrictions on when proc and sysfs can be mounted
> caused a regression on mounting a new instance of proc in a mount
> namespace created with user namespace privileges, when binfmt_misc
> is mounted on /proc/sys/fs/binfmt_misc.
> 
> This is an unintended regression caused by the absolutely bogus empty
> directory check in fs_fully_visible.  The check fs_fully_visible replaced
> didn't even bother to attempt to verify proc was fully visible and
> hiding proc files with any kind of mount is rare.  So for now fix
> the userspace regression by allowing directory with nlink == 1
> as /proc/sys/fs/binfmt_misc has.
> 
> I will have a better patch but it is not stable material, or
> last minute kernel material.  So it will have to wait.
> 
> Cc: stable@vger.kernel.org
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  fs/namespace.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/fs/namespace.c b/fs/namespace.c
> index ac2ce8a766e1..be32ebccdeb1 100644
> --- a/fs/namespace.c
> +++ b/fs/namespace.c
> @@ -2886,7 +2886,7 @@ bool fs_fully_visible(struct file_system_type *type)
>  			struct inode *inode = child->mnt_mountpoint->d_inode;
>  			if (!S_ISDIR(inode->i_mode))
>  				goto next;
> -			if (inode->i_nlink != 2)
> +			if (inode->i_nlink > 2)
>  				goto next;
>  		}
>  		visible = true;

Looks reasonable.  Applied to trusty unstable.

-apw
diff mbox

Patch

diff --git a/fs/namespace.c b/fs/namespace.c
index ac2ce8a766e1..be32ebccdeb1 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -2886,7 +2886,7 @@  bool fs_fully_visible(struct file_system_type *type)
 			struct inode *inode = child->mnt_mountpoint->d_inode;
 			if (!S_ISDIR(inode->i_mode))
 				goto next;
-			if (inode->i_nlink != 2)
+			if (inode->i_nlink > 2)
 				goto next;
 		}
 		visible = true;