Message ID | xntugl53vx.fsf@greed.delorie.com |
---|---|
State | New |
Headers | show |
Series | [v1] Allow for unpriviledged nested containers | expand |
* DJ Delorie via Libc-alpha: > When running a "make check" in an untrusted podman container, > we do not have priviledges to mount /proc. Previously, we just > failed to initialize the container and thus all test-container > tests were "unsupported". With this change, we set up as much > of the container as we're allowed, so tests that run in > test-container but do not need /proc will run correctly, > and those that require /proc will go from "unsupported" to (likely) > "fail" (but should give diagnostics that make it obvious that > a missing /proc is responsible). Have you tried a bind mount of the existing /proc into the chroot (from the outside of that chroot)? Thanks, Florian
Florian Weimer <fweimer@redhat.com> writes: > Have you tried a bind mount of the existing /proc into the chroot (from > the outside of that chroot)? That's an interesting idea, but the directory it (and /sys, /dev, etc, eventually, I suppose) needs to be mounted on doesn't exist until we're late into "make check" and rsync'ing the pristine test container to the working test container. And we delete and rebuild that container as needed. It would be a lot of messy logic to pre-mount that. I talked with Carlos about this a bit and he suggested we could add a support_need_special_mounts() that just exits UNSUPPORTED if mounts like /proc are missing, for tests that rely on such. I'd rather not wait for that for this patch, though, as this patch at least enables more PASSing tests in the CICD stuff.
* DJ Delorie: > Florian Weimer <fweimer@redhat.com> writes: >> Have you tried a bind mount of the existing /proc into the chroot (from >> the outside of that chroot)? > > That's an interesting idea, but the directory it (and /sys, /dev, etc, > eventually, I suppose) needs to be mounted on doesn't exist until we're > late into "make check" and rsync'ing the pristine test container to the > working test container. And we delete and rebuild that container as > needed. It would be a lot of messy logic to pre-mount that. Huh. We already do this for various parts of /dev. I had something like this in mind (untested): diff --git a/support/test-container.c b/support/test-container.c index 94498d3901..ff91a12860 100644 --- a/support/test-container.c +++ b/support/test-container.c @@ -1094,6 +1094,13 @@ main (int argc, char **argv) trymount (support_srcdir_root, new_srcdir_path); trymount (support_objdir_root, new_objdir_path); + /* It may not be possible to mount /proc directly. */ + { + char *new_proc = concat (new_root_path, "/proc", NULL); + xmkdirp (new_proc); + trymount ("/proc", new_proc); + } + xmkdirp (concat (new_root_path, "/dev", NULL), 0755); devmount (new_root_path, "null"); devmount (new_root_path, "zero"); @@ -1163,11 +1170,6 @@ main (int argc, char **argv) maybe_xmkdir ("/tmp", 0755); - /* Now that we're pid 1 (effectively "root") we can mount /proc */ - maybe_xmkdir ("/proc", 0777); - if (mount ("proc", "/proc", "proc", 0, NULL) < 0) - FAIL_EXIT1 ("Unable to mount /proc: "); - /* We map our original UID to the same UID in the container so we can own our own files normally. */ UMAP = open ("/proc/self/uid_map", O_WRONLY); Thanks, Florian
Florian Weimer <fweimer@redhat.com> writes: > * DJ Delorie: >> Florian Weimer <fweimer@redhat.com> writes: >>> Have you tried a bind mount of the existing /proc into the chroot (from >>> the outside of that chroot)? >> >> That's an interesting idea, but the directory it (and /sys, /dev, etc, >> eventually, I suppose) needs to be mounted on doesn't exist until we're >> late into "make check" and rsync'ing the pristine test container to the >> working test container. And we delete and rebuild that container as >> needed. It would be a lot of messy logic to pre-mount that. > > Huh. We already do this for various parts of /dev. Ah, I thought you meant "outside of the podman container". Testing...
+ char *new_proc = concat (new_root_path, "/proc", NULL); + xmkdirp (new_proc, 0755); + trymount ("/proc", new_proc); + free (new_proc); Note to self: don't free anything returned from concat()
diff --git a/support/test-container.c b/support/test-container.c index 94498d39019..53fd7b2b5b6 100644 --- a/support/test-container.c +++ b/support/test-container.c @@ -1165,40 +1165,52 @@ main (int argc, char **argv) /* Now that we're pid 1 (effectively "root") we can mount /proc */ maybe_xmkdir ("/proc", 0777); - if (mount ("proc", "/proc", "proc", 0, NULL) < 0) - FAIL_EXIT1 ("Unable to mount /proc: "); - - /* We map our original UID to the same UID in the container so we - can own our own files normally. */ - UMAP = open ("/proc/self/uid_map", O_WRONLY); - if (UMAP < 0) - FAIL_EXIT1 ("can't write to /proc/self/uid_map\n"); - - sprintf (tmp, "%lld %lld 1\n", - (long long) (be_su ? 0 : original_uid), (long long) original_uid); - write (UMAP, tmp, strlen (tmp)); - xclose (UMAP); - - /* We must disable setgroups () before we can map our groups, else we - get EPERM. */ - GMAP = open ("/proc/self/setgroups", O_WRONLY); - if (GMAP >= 0) + if (mount ("proc", "/proc", "proc", 0, NULL) != 0) { - /* We support kernels old enough to not have this. */ - write (GMAP, "deny\n", 5); - xclose (GMAP); + // This happens if we're trying to create a nested container, + // like if the build is running under podman, and we lack + // priviledges. + + // Ideally we would WARN here, but that would just add noise to + // *every* test-container test, and the ones that care should + // have their own relevent diagnostics. + + // FAIL_EXIT1 ("Unable to mount /proc: "); } + else + { + /* We map our original UID to the same UID in the container so we + can own our own files normally. */ + UMAP = open ("/proc/self/uid_map", O_WRONLY); + if (UMAP < 0) + FAIL_EXIT1 ("can't write to /proc/self/uid_map\n"); + + sprintf (tmp, "%lld %lld 1\n", + (long long) (be_su ? 0 : original_uid), (long long) original_uid); + write (UMAP, tmp, strlen (tmp)); + xclose (UMAP); + + /* We must disable setgroups () before we can map our groups, else we + get EPERM. */ + GMAP = open ("/proc/self/setgroups", O_WRONLY); + if (GMAP >= 0) + { + /* We support kernels old enough to not have this. */ + write (GMAP, "deny\n", 5); + xclose (GMAP); + } - /* We map our original GID to the same GID in the container so we - can own our own files normally. */ - GMAP = open ("/proc/self/gid_map", O_WRONLY); - if (GMAP < 0) - FAIL_EXIT1 ("can't write to /proc/self/gid_map\n"); + /* We map our original GID to the same GID in the container so we + can own our own files normally. */ + GMAP = open ("/proc/self/gid_map", O_WRONLY); + if (GMAP < 0) + FAIL_EXIT1 ("can't write to /proc/self/gid_map\n"); - sprintf (tmp, "%lld %lld 1\n", - (long long) (be_su ? 0 : original_gid), (long long) original_gid); - write (GMAP, tmp, strlen (tmp)); - xclose (GMAP); + sprintf (tmp, "%lld %lld 1\n", + (long long) (be_su ? 0 : original_gid), (long long) original_gid); + write (GMAP, tmp, strlen (tmp)); + xclose (GMAP); + } if (change_cwd) {