diff mbox series

[SRU,K/L/Unstable,1/1] UBUNTU: SAUCE: overlayfs: handle idmapped mounts in ovl_do_(set|remove)xattr

Message ID 20230302212355.643110-2-aleksandr.mikhalitsyn@canonical.com
State New
Headers show
Series UBUNTU: SAUCE: overlayfs: handle idmapped mounts in ovl_do_(set|remove)xattr (LP: 2009065) | expand

Commit Message

Aleksandr Mikhalitsyn March 2, 2023, 9:23 p.m. UTC
BugLink: http://bugs.launchpad.net/bugs/2009065

We have to use ovl_upper_mnt_userns(ofs) helper to get proper user namespace
for idmapped layer. Otherwise we'll get -EPERM.

Right now, overlayfs on top of idmapped layer always mounted as read-only.
This is serious blocker for LXD/LXC unprivileged containers users who run
Docker containers inside.

Reproducer:
$ cd /idmapped/mount/path
$ mkdir {work,upper,lower,ovl}
$ mount -t overlay overlay -o lowerdir=lower,upperdir=upper,workdir=work ovl
$ touch ovl/test
touch: cannot touch 'ovl/test': Read-only file system

Error from dmesg:
overlayfs: failed to create directory work/work (errno: 1); mounting read-only

Reproducible on all Ubuntu kernels with the base >= 5.19

Fixes: eea996a46f ("UBUNTU: SAUCE: overlayfs: Skip permission checking for trusted.overlayfs.* xattrs")
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
---
 fs/overlayfs/overlayfs.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Andrea Righi March 3, 2023, 7:08 a.m. UTC | #1
On Thu, Mar 02, 2023 at 10:23:55PM +0100, Alexander Mikhalitsyn wrote:
> BugLink: http://bugs.launchpad.net/bugs/2009065
> 
> We have to use ovl_upper_mnt_userns(ofs) helper to get proper user namespace
> for idmapped layer. Otherwise we'll get -EPERM.
> 
> Right now, overlayfs on top of idmapped layer always mounted as read-only.
> This is serious blocker for LXD/LXC unprivileged containers users who run
> Docker containers inside.
> 
> Reproducer:
> $ cd /idmapped/mount/path
> $ mkdir {work,upper,lower,ovl}
> $ mount -t overlay overlay -o lowerdir=lower,upperdir=upper,workdir=work ovl
> $ touch ovl/test
> touch: cannot touch 'ovl/test': Read-only file system
> 
> Error from dmesg:
> overlayfs: failed to create directory work/work (errno: 1); mounting read-only
> 
> Reproducible on all Ubuntu kernels with the base >= 5.19
> 
> Fixes: eea996a46f ("UBUNTU: SAUCE: overlayfs: Skip permission checking for trusted.overlayfs.* xattrs")
> Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>

It all makes sense to me, we also have a clear reproducer of the
problem, so: applied to lunar/linux-unstable (no need to apply this in
lunar/linux, because it'll be replaced by linux-unstable in the next
respin). Also:

Acked-by: Andrea Righi <andrea.righi@canonical.com>

BTW, I'm wondering if we have similar cases with our custom overlayfs
changes where we just use init_user_ns, instead of the proper namespace.
However, if there are similar cases we should be able to catch them
running the typical LXD/LXC test cases I guess.

Do you think we need to add other specific overlayfs test cases to our
regression test suite?

Thanks,
-Andrea

> ---
>  fs/overlayfs/overlayfs.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index e523d600da4e..3a85be75d64a 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -255,7 +255,7 @@ static inline int ovl_do_setxattr(struct ovl_fs *ofs, struct dentry *dentry,
>  	int err;
>  
>  	inode_lock(inode);
> -	err = __vfs_setxattr_noperm(&init_user_ns, dentry, name, value, size, flags);
> +	err = __vfs_setxattr_noperm(ovl_upper_mnt_userns(ofs), dentry, name, value, size, flags);
>  	inode_unlock(inode);
>  
>  	pr_debug("setxattr(%pd2, \"%s\", \"%*pE\", %zu, %d) = %i\n",
> @@ -277,7 +277,7 @@ static inline int ovl_do_removexattr(struct ovl_fs *ofs, struct dentry *dentry,
>  	int err;
>  
>  	inode_lock(inode);
> -	err = __vfs_removexattr_noperm(&init_user_ns, dentry, name);
> +	err = __vfs_removexattr_noperm(ovl_upper_mnt_userns(ofs), dentry, name);
>  	inode_unlock(inode);
>  
>  	pr_debug("removexattr(%pd2, \"%s\") = %i\n", dentry, name, err);
> -- 
> 2.34.1
> 
> 
> -- 
> kernel-team mailing list
> kernel-team@lists.ubuntu.com
> https://lists.ubuntu.com/mailman/listinfo/kernel-team
Aleksandr Mikhalitsyn March 3, 2023, 8:19 a.m. UTC | #2
On Fri, Mar 3, 2023 at 8:08 AM Andrea Righi <andrea.righi@canonical.com> wrote:
>
> On Thu, Mar 02, 2023 at 10:23:55PM +0100, Alexander Mikhalitsyn wrote:
> > BugLink: http://bugs.launchpad.net/bugs/2009065
> >
> > We have to use ovl_upper_mnt_userns(ofs) helper to get proper user namespace
> > for idmapped layer. Otherwise we'll get -EPERM.
> >
> > Right now, overlayfs on top of idmapped layer always mounted as read-only.
> > This is serious blocker for LXD/LXC unprivileged containers users who run
> > Docker containers inside.
> >
> > Reproducer:
> > $ cd /idmapped/mount/path
> > $ mkdir {work,upper,lower,ovl}
> > $ mount -t overlay overlay -o lowerdir=lower,upperdir=upper,workdir=work ovl
> > $ touch ovl/test
> > touch: cannot touch 'ovl/test': Read-only file system
> >
> > Error from dmesg:
> > overlayfs: failed to create directory work/work (errno: 1); mounting read-only
> >
> > Reproducible on all Ubuntu kernels with the base >= 5.19
> >
> > Fixes: eea996a46f ("UBUNTU: SAUCE: overlayfs: Skip permission checking for trusted.overlayfs.* xattrs")
> > Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
>
> It all makes sense to me, we also have a clear reproducer of the
> problem, so: applied to lunar/linux-unstable (no need to apply this in
> lunar/linux, because it'll be replaced by linux-unstable in the next
> respin). Also:

Great, thanks! What about kinetic/master-next and jammy/hwe-5.19,
those branches inherit this change somehow?

>
> Acked-by: Andrea Righi <andrea.righi@canonical.com>
>
> BTW, I'm wondering if we have similar cases with our custom overlayfs
> changes where we just use init_user_ns, instead of the proper namespace.

Yep, that's really complex stuff. If you check the actual
torvalds/linux tree fs/overlayfs
you find like 6 places where init_user_ns is used (and user properly),
sometimes it is used implicitly through "nop_mnt_idmap"
[ https://github.com/torvalds/linux/commit/256c8aed2b420a7c57ed6469fbb0f8310f5aeec9
]
So we need to check each case very carefully.

> However, if there are similar cases we should be able to catch them
> running the typical LXD/LXC test cases I guess.

Yep, probably we need to add overlayfs to LXC tests somehow to catch
issues like this.

>
> Do you think we need to add other specific overlayfs test cases to our
> regression test suite?

Right now I have no extra tests suggestions.

Kind regards,
Alex

>
> Thanks,
> -Andrea
>
> > ---
> >  fs/overlayfs/overlayfs.h | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> > index e523d600da4e..3a85be75d64a 100644
> > --- a/fs/overlayfs/overlayfs.h
> > +++ b/fs/overlayfs/overlayfs.h
> > @@ -255,7 +255,7 @@ static inline int ovl_do_setxattr(struct ovl_fs *ofs, struct dentry *dentry,
> >       int err;
> >
> >       inode_lock(inode);
> > -     err = __vfs_setxattr_noperm(&init_user_ns, dentry, name, value, size, flags);
> > +     err = __vfs_setxattr_noperm(ovl_upper_mnt_userns(ofs), dentry, name, value, size, flags);
> >       inode_unlock(inode);
> >
> >       pr_debug("setxattr(%pd2, \"%s\", \"%*pE\", %zu, %d) = %i\n",
> > @@ -277,7 +277,7 @@ static inline int ovl_do_removexattr(struct ovl_fs *ofs, struct dentry *dentry,
> >       int err;
> >
> >       inode_lock(inode);
> > -     err = __vfs_removexattr_noperm(&init_user_ns, dentry, name);
> > +     err = __vfs_removexattr_noperm(ovl_upper_mnt_userns(ofs), dentry, name);
> >       inode_unlock(inode);
> >
> >       pr_debug("removexattr(%pd2, \"%s\") = %i\n", dentry, name, err);
> > --
> > 2.34.1
> >
> >
> > --
> > kernel-team mailing list
> > kernel-team@lists.ubuntu.com
> > https://lists.ubuntu.com/mailman/listinfo/kernel-team
Andrea Righi March 3, 2023, 8:40 a.m. UTC | #3
On Fri, Mar 03, 2023 at 09:19:55AM +0100, Aleksandr Mikhalitsyn wrote:
> On Fri, Mar 3, 2023 at 8:08 AM Andrea Righi <andrea.righi@canonical.com> wrote:
> >
> > On Thu, Mar 02, 2023 at 10:23:55PM +0100, Alexander Mikhalitsyn wrote:
> > > BugLink: http://bugs.launchpad.net/bugs/2009065
> > >
> > > We have to use ovl_upper_mnt_userns(ofs) helper to get proper user namespace
> > > for idmapped layer. Otherwise we'll get -EPERM.
> > >
> > > Right now, overlayfs on top of idmapped layer always mounted as read-only.
> > > This is serious blocker for LXD/LXC unprivileged containers users who run
> > > Docker containers inside.
> > >
> > > Reproducer:
> > > $ cd /idmapped/mount/path
> > > $ mkdir {work,upper,lower,ovl}
> > > $ mount -t overlay overlay -o lowerdir=lower,upperdir=upper,workdir=work ovl
> > > $ touch ovl/test
> > > touch: cannot touch 'ovl/test': Read-only file system
> > >
> > > Error from dmesg:
> > > overlayfs: failed to create directory work/work (errno: 1); mounting read-only
> > >
> > > Reproducible on all Ubuntu kernels with the base >= 5.19
> > >
> > > Fixes: eea996a46f ("UBUNTU: SAUCE: overlayfs: Skip permission checking for trusted.overlayfs.* xattrs")
> > > Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
> >
> > It all makes sense to me, we also have a clear reproducer of the
> > problem, so: applied to lunar/linux-unstable (no need to apply this in
> > lunar/linux, because it'll be replaced by linux-unstable in the next
> > respin). Also:
> 
> Great, thanks! What about kinetic/master-next and jammy/hwe-5.19,
> those branches inherit this change somehow?

No, this patch needs to be applied also to kinetic, then hwe-5.19 will
automatically inherit the change.

I already sent my Ack, if we get another one we can apply it to kinetic
as well.

> 
> >
> > Acked-by: Andrea Righi <andrea.righi@canonical.com>
> >
> > BTW, I'm wondering if we have similar cases with our custom overlayfs
> > changes where we just use init_user_ns, instead of the proper namespace.
> 
> Yep, that's really complex stuff. If you check the actual
> torvalds/linux tree fs/overlayfs
> you find like 6 places where init_user_ns is used (and user properly),
> sometimes it is used implicitly through "nop_mnt_idmap"
> [ https://github.com/torvalds/linux/commit/256c8aed2b420a7c57ed6469fbb0f8310f5aeec9
> ]
> So we need to check each case very carefully.

Makes sense, thanks for the clarification.

> 
> > However, if there are similar cases we should be able to catch them
> > running the typical LXD/LXC test cases I guess.
> 
> Yep, probably we need to add overlayfs to LXC tests somehow to catch
> issues like this.

That'd be great.

> 
> >
> > Do you think we need to add other specific overlayfs test cases to our
> > regression test suite?
> 
> Right now I have no extra tests suggestions.
> 
> Kind regards,
> Alex

Thanks Alex.

-Andrea
Aleksandr Mikhalitsyn March 3, 2023, 8:43 a.m. UTC | #4
On Fri, Mar 3, 2023 at 9:41 AM Andrea Righi <andrea.righi@canonical.com> wrote:
>
> On Fri, Mar 03, 2023 at 09:19:55AM +0100, Aleksandr Mikhalitsyn wrote:
> > On Fri, Mar 3, 2023 at 8:08 AM Andrea Righi <andrea.righi@canonical.com> wrote:
> > >
> > > On Thu, Mar 02, 2023 at 10:23:55PM +0100, Alexander Mikhalitsyn wrote:
> > > > BugLink: http://bugs.launchpad.net/bugs/2009065
> > > >
> > > > We have to use ovl_upper_mnt_userns(ofs) helper to get proper user namespace
> > > > for idmapped layer. Otherwise we'll get -EPERM.
> > > >
> > > > Right now, overlayfs on top of idmapped layer always mounted as read-only.
> > > > This is serious blocker for LXD/LXC unprivileged containers users who run
> > > > Docker containers inside.
> > > >
> > > > Reproducer:
> > > > $ cd /idmapped/mount/path
> > > > $ mkdir {work,upper,lower,ovl}
> > > > $ mount -t overlay overlay -o lowerdir=lower,upperdir=upper,workdir=work ovl
> > > > $ touch ovl/test
> > > > touch: cannot touch 'ovl/test': Read-only file system
> > > >
> > > > Error from dmesg:
> > > > overlayfs: failed to create directory work/work (errno: 1); mounting read-only
> > > >
> > > > Reproducible on all Ubuntu kernels with the base >= 5.19
> > > >
> > > > Fixes: eea996a46f ("UBUNTU: SAUCE: overlayfs: Skip permission checking for trusted.overlayfs.* xattrs")
> > > > Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
> > >
> > > It all makes sense to me, we also have a clear reproducer of the
> > > problem, so: applied to lunar/linux-unstable (no need to apply this in
> > > lunar/linux, because it'll be replaced by linux-unstable in the next
> > > respin). Also:
> >
> > Great, thanks! What about kinetic/master-next and jammy/hwe-5.19,
> > those branches inherit this change somehow?
>
> No, this patch needs to be applied also to kinetic, then hwe-5.19 will
> automatically inherit the change.
>
> I already sent my Ack, if we get another one we can apply it to kinetic
> as well.

Ah, got it! Thanks! ;-)

>
> >
> > >
> > > Acked-by: Andrea Righi <andrea.righi@canonical.com>
> > >
> > > BTW, I'm wondering if we have similar cases with our custom overlayfs
> > > changes where we just use init_user_ns, instead of the proper namespace.
> >
> > Yep, that's really complex stuff. If you check the actual
> > torvalds/linux tree fs/overlayfs
> > you find like 6 places where init_user_ns is used (and user properly),
> > sometimes it is used implicitly through "nop_mnt_idmap"
> > [ https://github.com/torvalds/linux/commit/256c8aed2b420a7c57ed6469fbb0f8310f5aeec9
> > ]
> > So we need to check each case very carefully.
>
> Makes sense, thanks for the clarification.
>
> >
> > > However, if there are similar cases we should be able to catch them
> > > running the typical LXD/LXC test cases I guess.
> >
> > Yep, probably we need to add overlayfs to LXC tests somehow to catch
> > issues like this.
>
> That'd be great.
>
> >
> > >
> > > Do you think we need to add other specific overlayfs test cases to our
> > > regression test suite?
> >
> > Right now I have no extra tests suggestions.
> >
> > Kind regards,
> > Alex
>
> Thanks Alex.
>
> -Andrea
diff mbox series

Patch

diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index e523d600da4e..3a85be75d64a 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -255,7 +255,7 @@  static inline int ovl_do_setxattr(struct ovl_fs *ofs, struct dentry *dentry,
 	int err;
 
 	inode_lock(inode);
-	err = __vfs_setxattr_noperm(&init_user_ns, dentry, name, value, size, flags);
+	err = __vfs_setxattr_noperm(ovl_upper_mnt_userns(ofs), dentry, name, value, size, flags);
 	inode_unlock(inode);
 
 	pr_debug("setxattr(%pd2, \"%s\", \"%*pE\", %zu, %d) = %i\n",
@@ -277,7 +277,7 @@  static inline int ovl_do_removexattr(struct ovl_fs *ofs, struct dentry *dentry,
 	int err;
 
 	inode_lock(inode);
-	err = __vfs_removexattr_noperm(&init_user_ns, dentry, name);
+	err = __vfs_removexattr_noperm(ovl_upper_mnt_userns(ofs), dentry, name);
 	inode_unlock(inode);
 
 	pr_debug("removexattr(%pd2, \"%s\") = %i\n", dentry, name, err);