Message ID | 20191105090553.6350-4-cyphar@cyphar.com |
---|---|
State | New |
Headers | show |
Series | open: introduce openat2(2) syscall | expand |
On 2019-11-13, Al Viro <viro@zeniv.linux.org.uk> wrote: > On Tue, Nov 05, 2019 at 08:05:47PM +1100, Aleksa Sarai wrote: > > > @@ -862,6 +870,8 @@ static int nd_jump_root(struct nameidata *nd) > > void nd_jump_link(struct path *path) > > { > > struct nameidata *nd = current->nameidata; > > + > > + nd->last_magiclink.same_mnt = (nd->path.mnt == path->mnt); > > path_put(&nd->path); > > > > nd->path = *path; > > @@ -1082,6 +1092,10 @@ const char *get_link(struct nameidata *nd) > > if (nd->flags & LOOKUP_MAGICLINK_JUMPED) { > > if (unlikely(nd->flags & LOOKUP_NO_MAGICLINKS)) > > return ERR_PTR(-ELOOP); > > + if (unlikely(nd->flags & LOOKUP_NO_XDEV)) { > > + if (!nd->last_magiclink.same_mnt) > > + return ERR_PTR(-EXDEV); > > + } > > } > > Ugh... Wouldn't it be better to take that logics (some equivalent thereof) > into nd_jump_link()? Or just have nd_jump_link() return an error... This could be done, but the reason for stashing it away in last_magiclink is because of the future magic-link re-opening patches which can't be implemented like that without putting the open_flags inside nameidata (which was decided to be too ugly a while ago). My point being that I could implement it this way for this series, but I'd have to implement something like last_magiclink when I end up re-posting the magic-link stuff in a few weeks. Looking at all the nd_jump_link() users, the other option is to just disallow magic-link crossings entirely for LOOKUP_NO_XDEV. The only thing allowing them permits is to resolve file descriptors that are pointing to the same procfs mount -- and it's unclear to me how useful that really is (apparmorfs and nsfs will always give -EXDEV because aafs_mnt and nsfs_mnt are internal kernel vfsmounts).
On 2019-11-14, Al Viro <viro@zeniv.linux.org.uk> wrote: > On Thu, Nov 14, 2019 at 03:49:45PM +1100, Aleksa Sarai wrote: > > On 2019-11-13, Al Viro <viro@zeniv.linux.org.uk> wrote: > > > On Tue, Nov 05, 2019 at 08:05:47PM +1100, Aleksa Sarai wrote: > > > > > > > @@ -862,6 +870,8 @@ static int nd_jump_root(struct nameidata *nd) > > > > void nd_jump_link(struct path *path) > > > > { > > > > struct nameidata *nd = current->nameidata; > > > > + > > > > + nd->last_magiclink.same_mnt = (nd->path.mnt == path->mnt); > > > > path_put(&nd->path); > > > > > > > > nd->path = *path; > > > > @@ -1082,6 +1092,10 @@ const char *get_link(struct nameidata *nd) > > > > if (nd->flags & LOOKUP_MAGICLINK_JUMPED) { > > > > if (unlikely(nd->flags & LOOKUP_NO_MAGICLINKS)) > > > > return ERR_PTR(-ELOOP); > > > > + if (unlikely(nd->flags & LOOKUP_NO_XDEV)) { > > > > + if (!nd->last_magiclink.same_mnt) > > > > + return ERR_PTR(-EXDEV); > > > > + } > > > > } > > > > > > Ugh... Wouldn't it be better to take that logics (some equivalent thereof) > > > into nd_jump_link()? Or just have nd_jump_link() return an error... > > > > This could be done, but the reason for stashing it away in > > last_magiclink is because of the future magic-link re-opening patches > > which can't be implemented like that without putting the open_flags > > inside nameidata (which was decided to be too ugly a while ago). > > > > My point being that I could implement it this way for this series, but > > I'd have to implement something like last_magiclink when I end up > > re-posting the magic-link stuff in a few weeks. > > > > Looking at all the nd_jump_link() users, the other option is to just > > disallow magic-link crossings entirely for LOOKUP_NO_XDEV. The only > > thing allowing them permits is to resolve file descriptors that are > > pointing to the same procfs mount -- and it's unclear to me how useful > > that really is (apparmorfs and nsfs will always give -EXDEV because > > aafs_mnt and nsfs_mnt are internal kernel vfsmounts). > > I would rather keep the entire if (nd->flags & LOOKUP_MAGICLINK_JUMPED) > out of the get_link(). If you want to generate some error if > nd_jump_link() has been called, just do it right there. The fewer > pieces of state need to be carried around, the better... Sure, I can make nd_jump_link() give -ELOOP and drop the current need for LOOKUP_MAGICLINK_JUMPED -- if necessary we can re-add it for the magic-link reopening patches. > And as for opening them... Why would you need full open_flags in there? > Details, please... I was referring to [1] which has been dropped from this series. I misspoke -- you don't need the full open_flags, you just need acc_mode in nameidata -- but from memory you (understandably) weren't in favour of that either because it further muddled the open semantics with namei. So the solution I went with was to stash away the i_mode of the magiclink in nd->last_magiclink.mode (though to avoid a race which Jann found, you actually need to recalculate it when you call nd_jump_link() but that's a different topic) and then check it in trailing_magiclink(). However, I've since figured out that we need to restrict things like bind-mounts and truncate() because they can be used to get around the restrictions. I dropped that patch from this series so that I could work on implementing the restrictions for the other relevant VFS syscalls separately from openat2 (upgrade_mask will be re-added to open_how with those patches). My point was that AFAICS we will either have to have nd->acc_mode (or something similar) or have nd->last_magiclink in order to implement the magic-link reopening hardening. [1]: https://lore.kernel.org/lkml/20190930183316.10190-2-cyphar@cyphar.com/
diff --git a/fs/namei.c b/fs/namei.c index 1f0d871199e5..b73ee1601bd4 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -504,6 +504,9 @@ struct nameidata { struct filename *name; struct nameidata *saved; struct inode *link_inode; + struct { + bool same_mnt; + } last_magiclink; unsigned root_seq; int dfd; } __randomize_layout; @@ -837,6 +840,11 @@ static inline void path_to_nameidata(const struct path *path, static int nd_jump_root(struct nameidata *nd) { + if (unlikely(nd->flags & LOOKUP_NO_XDEV)) { + /* Absolute path arguments to path_init() are allowed. */ + if (nd->path.mnt != NULL && nd->path.mnt != nd->root.mnt) + return -EXDEV; + } if (nd->flags & LOOKUP_RCU) { struct dentry *d; nd->path = nd->root; @@ -862,6 +870,8 @@ static int nd_jump_root(struct nameidata *nd) void nd_jump_link(struct path *path) { struct nameidata *nd = current->nameidata; + + nd->last_magiclink.same_mnt = (nd->path.mnt == path->mnt); path_put(&nd->path); nd->path = *path; @@ -1082,6 +1092,10 @@ const char *get_link(struct nameidata *nd) if (nd->flags & LOOKUP_MAGICLINK_JUMPED) { if (unlikely(nd->flags & LOOKUP_NO_MAGICLINKS)) return ERR_PTR(-ELOOP); + if (unlikely(nd->flags & LOOKUP_NO_XDEV)) { + if (!nd->last_magiclink.same_mnt) + return ERR_PTR(-EXDEV); + } } if (IS_ERR_OR_NULL(res)) return res; @@ -1271,12 +1285,16 @@ static int follow_managed(struct path *path, struct nameidata *nd) break; } - if (need_mntput && path->mnt == mnt) - mntput(path->mnt); + if (need_mntput) { + if (path->mnt == mnt) + mntput(path->mnt); + if (unlikely(nd->flags & LOOKUP_NO_XDEV)) + ret = -EXDEV; + else + nd->flags |= LOOKUP_JUMPED; + } if (ret == -EISDIR || !ret) ret = 1; - if (need_mntput) - nd->flags |= LOOKUP_JUMPED; if (unlikely(ret < 0)) path_put_conditional(path, nd); return ret; @@ -1333,6 +1351,8 @@ static bool __follow_mount_rcu(struct nameidata *nd, struct path *path, mounted = __lookup_mnt(path->mnt, path->dentry); if (!mounted) break; + if (unlikely(nd->flags & LOOKUP_NO_XDEV)) + return false; path->mnt = &mounted->mnt; path->dentry = mounted->mnt.mnt_root; nd->flags |= LOOKUP_JUMPED; @@ -1379,6 +1399,8 @@ static int follow_dotdot_rcu(struct nameidata *nd) return -ECHILD; if (&mparent->mnt == nd->path.mnt) break; + if (unlikely(nd->flags & LOOKUP_NO_XDEV)) + return -EXDEV; /* we know that mountpoint was pinned */ nd->path.dentry = mountpoint; nd->path.mnt = &mparent->mnt; @@ -1393,6 +1415,8 @@ static int follow_dotdot_rcu(struct nameidata *nd) return -ECHILD; if (!mounted) break; + if (unlikely(nd->flags & LOOKUP_NO_XDEV)) + return -EXDEV; nd->path.mnt = &mounted->mnt; nd->path.dentry = mounted->mnt.mnt_root; inode = nd->path.dentry->d_inode; @@ -1491,6 +1515,8 @@ static int follow_dotdot(struct nameidata *nd) } if (!follow_up(&nd->path)) break; + if (unlikely(nd->flags & LOOKUP_NO_XDEV)) + return -EXDEV; } follow_mount(&nd->path); nd->inode = nd->path.dentry->d_inode; diff --git a/include/linux/namei.h b/include/linux/namei.h index a8b3f93338da..6105c8a59fc8 100644 --- a/include/linux/namei.h +++ b/include/linux/namei.h @@ -43,6 +43,7 @@ enum {LAST_NORM, LAST_ROOT, LAST_DOT, LAST_DOTDOT, LAST_BIND}; /* Scoping flags for lookup. */ #define LOOKUP_NO_SYMLINKS 0x020000 /* No symlink crossing. */ #define LOOKUP_NO_MAGICLINKS 0x040000 /* No /proc/$pid/fd/ "symlink" crossing. */ +#define LOOKUP_NO_XDEV 0x080000 /* No mountpoint crossing. */ extern int path_pts(struct path *path);
/* Background. */ The need to contain path operations within a mountpoint has been a long-standing usecase that userspace has historically implemented manually with liberal usage of stat(). find, rsync, tar and many other programs implement these semantics -- but it'd be much simpler to have a fool-proof way of refusing to open a path if it crosses a mountpoint. This is part of a refresh of Al's AT_NO_JUMPS patchset[1] (which was a variation on David Drysdale's O_BENEATH patchset[2], which in turn was based on the Capsicum project[3]). /* Userspace API. */ LOOKUP_NO_XDEV will be exposed to userspace through openat2(2). /* Semantics. */ Unlike most other LOOKUP flags (most notably LOOKUP_FOLLOW), LOOKUP_NO_XDEV applies to all components of the path. With LOOKUP_NO_XDEV, any path component which crosses a mount-point during path resolution (including "..") will yield an -EXDEV. Absolute paths, absolute symlinks, and magic-links will only yield an -EXDEV if the jump involved changing mount-points. /* Testing. */ LOOKUP_NO_XDEV is tested as part of the openat2(2) selftests. [1]: https://lore.kernel.org/lkml/20170429220414.GT29622@ZenIV.linux.org.uk/ [2]: https://lore.kernel.org/lkml/1415094884-18349-1-git-send-email-drysdale@google.com/ [3]: https://lore.kernel.org/lkml/1404124096-21445-1-git-send-email-drysdale@google.com/ Cc: Christian Brauner <christian.brauner@ubuntu.com> Suggested-by: David Drysdale <drysdale@google.com> Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Suggested-by: Andy Lutomirski <luto@kernel.org> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Aleksa Sarai <cyphar@cyphar.com> --- fs/namei.c | 34 ++++++++++++++++++++++++++++++---- include/linux/namei.h | 1 + 2 files changed, 31 insertions(+), 4 deletions(-)