diff mbox series

[v1] Allow for unpriviledged nested containers

Message ID xntugl53vx.fsf@greed.delorie.com
State New
Headers show
Series [v1] Allow for unpriviledged nested containers | expand

Commit Message

DJ Delorie Nov. 9, 2021, 11:02 p.m. UTC
When running a "make check" in an untrusted podman container,
we do not have priviledges to mount /proc.  Previously, we just
failed to initialize the container and thus all test-container
tests were "unsupported".  With this change, we set up as much
of the container as we're allowed, so tests that run in
test-container but do not need /proc will run correctly,
and those that require /proc will go from "unsupported" to (likely)
"fail" (but should give diagnostics that make it obvious that
a missing /proc is responsible).

Comments

Florian Weimer Nov. 10, 2021, 8:23 a.m. UTC | #1
* DJ Delorie via Libc-alpha:

> When running a "make check" in an untrusted podman container,
> we do not have priviledges to mount /proc.  Previously, we just
> failed to initialize the container and thus all test-container
> tests were "unsupported".  With this change, we set up as much
> of the container as we're allowed, so tests that run in
> test-container but do not need /proc will run correctly,
> and those that require /proc will go from "unsupported" to (likely)
> "fail" (but should give diagnostics that make it obvious that
> a missing /proc is responsible).

Have you tried a bind mount of the existing /proc into the chroot (from
the outside of that chroot)?

Thanks,
Florian
DJ Delorie Nov. 10, 2021, 6:30 p.m. UTC | #2
Florian Weimer <fweimer@redhat.com> writes:
> Have you tried a bind mount of the existing /proc into the chroot (from
> the outside of that chroot)?

That's an interesting idea, but the directory it (and /sys, /dev, etc,
eventually, I suppose) needs to be mounted on doesn't exist until we're
late into "make check" and rsync'ing the pristine test container to the
working test container.  And we delete and rebuild that container as
needed.  It would be a lot of messy logic to pre-mount that.

I talked with Carlos about this a bit and he suggested we could add a
support_need_special_mounts() that just exits UNSUPPORTED if mounts like
/proc are missing, for tests that rely on such.  I'd rather not wait for
that for this patch, though, as this patch at least enables more PASSing
tests in the CICD stuff.
Florian Weimer Nov. 12, 2021, 1:31 p.m. UTC | #3
* DJ Delorie:

> Florian Weimer <fweimer@redhat.com> writes:
>> Have you tried a bind mount of the existing /proc into the chroot (from
>> the outside of that chroot)?
>
> That's an interesting idea, but the directory it (and /sys, /dev, etc,
> eventually, I suppose) needs to be mounted on doesn't exist until we're
> late into "make check" and rsync'ing the pristine test container to the
> working test container.  And we delete and rebuild that container as
> needed.  It would be a lot of messy logic to pre-mount that.

Huh.  We already do this for various parts of /dev.  I had something
like this in mind (untested):

diff --git a/support/test-container.c b/support/test-container.c
index 94498d3901..ff91a12860 100644
--- a/support/test-container.c
+++ b/support/test-container.c
@@ -1094,6 +1094,13 @@ main (int argc, char **argv)
   trymount (support_srcdir_root, new_srcdir_path);
   trymount (support_objdir_root, new_objdir_path);
 
+  /* It may not be possible to mount /proc directly.  */
+  {
+    char *new_proc = concat (new_root_path, "/proc", NULL);
+    xmkdirp (new_proc);
+    trymount ("/proc", new_proc);
+  }
+
   xmkdirp (concat (new_root_path, "/dev", NULL), 0755);
   devmount (new_root_path, "null");
   devmount (new_root_path, "zero");
@@ -1163,11 +1170,6 @@ main (int argc, char **argv)
 
   maybe_xmkdir ("/tmp", 0755);
 
-  /* Now that we're pid 1 (effectively "root") we can mount /proc  */
-  maybe_xmkdir ("/proc", 0777);
-  if (mount ("proc", "/proc", "proc", 0, NULL) < 0)
-    FAIL_EXIT1 ("Unable to mount /proc: ");
-
   /* We map our original UID to the same UID in the container so we
      can own our own files normally.  */
   UMAP = open ("/proc/self/uid_map", O_WRONLY);

Thanks,
Florian
DJ Delorie Nov. 15, 2021, 8:58 p.m. UTC | #4
Florian Weimer <fweimer@redhat.com> writes:
> * DJ Delorie:
>> Florian Weimer <fweimer@redhat.com> writes:
>>> Have you tried a bind mount of the existing /proc into the chroot (from
>>> the outside of that chroot)?
>>
>> That's an interesting idea, but the directory it (and /sys, /dev, etc,
>> eventually, I suppose) needs to be mounted on doesn't exist until we're
>> late into "make check" and rsync'ing the pristine test container to the
>> working test container.  And we delete and rebuild that container as
>> needed.  It would be a lot of messy logic to pre-mount that.
>
> Huh.  We already do this for various parts of /dev.

Ah, I thought you meant "outside of the podman container".  Testing...
DJ Delorie Nov. 15, 2021, 10:43 p.m. UTC | #5
+    char *new_proc = concat (new_root_path, "/proc", NULL);
+    xmkdirp (new_proc, 0755);
+    trymount ("/proc", new_proc);
+    free (new_proc);

Note to self: don't free anything returned from concat()
diff mbox series

Patch

diff --git a/support/test-container.c b/support/test-container.c
index 94498d39019..53fd7b2b5b6 100644
--- a/support/test-container.c
+++ b/support/test-container.c
@@ -1165,40 +1165,52 @@  main (int argc, char **argv)
 
   /* Now that we're pid 1 (effectively "root") we can mount /proc  */
   maybe_xmkdir ("/proc", 0777);
-  if (mount ("proc", "/proc", "proc", 0, NULL) < 0)
-    FAIL_EXIT1 ("Unable to mount /proc: ");
-
-  /* We map our original UID to the same UID in the container so we
-     can own our own files normally.  */
-  UMAP = open ("/proc/self/uid_map", O_WRONLY);
-  if (UMAP < 0)
-    FAIL_EXIT1 ("can't write to /proc/self/uid_map\n");
-
-  sprintf (tmp, "%lld %lld 1\n",
-	   (long long) (be_su ? 0 : original_uid), (long long) original_uid);
-  write (UMAP, tmp, strlen (tmp));
-  xclose (UMAP);
-
-  /* We must disable setgroups () before we can map our groups, else we
-     get EPERM.  */
-  GMAP = open ("/proc/self/setgroups", O_WRONLY);
-  if (GMAP >= 0)
+  if (mount ("proc", "/proc", "proc", 0, NULL) != 0)
     {
-      /* We support kernels old enough to not have this.  */
-      write (GMAP, "deny\n", 5);
-      xclose (GMAP);
+      // This happens if we're trying to create a nested container,
+      // like if the build is running under podman, and we lack
+      // priviledges.
+
+      // Ideally we would WARN here, but that would just add noise to
+      // *every* test-container test, and the ones that care should
+      // have their own relevent diagnostics.
+
+      // FAIL_EXIT1 ("Unable to mount /proc: ");
     }
+  else
+    {
+      /* We map our original UID to the same UID in the container so we
+	 can own our own files normally.  */
+      UMAP = open ("/proc/self/uid_map", O_WRONLY);
+      if (UMAP < 0)
+	FAIL_EXIT1 ("can't write to /proc/self/uid_map\n");
+
+      sprintf (tmp, "%lld %lld 1\n",
+	       (long long) (be_su ? 0 : original_uid), (long long) original_uid);
+      write (UMAP, tmp, strlen (tmp));
+      xclose (UMAP);
+
+      /* We must disable setgroups () before we can map our groups, else we
+	 get EPERM.  */
+      GMAP = open ("/proc/self/setgroups", O_WRONLY);
+      if (GMAP >= 0)
+	{
+	  /* We support kernels old enough to not have this.  */
+	  write (GMAP, "deny\n", 5);
+	  xclose (GMAP);
+	}
 
-  /* We map our original GID to the same GID in the container so we
-     can own our own files normally.  */
-  GMAP = open ("/proc/self/gid_map", O_WRONLY);
-  if (GMAP < 0)
-    FAIL_EXIT1 ("can't write to /proc/self/gid_map\n");
+      /* We map our original GID to the same GID in the container so we
+	 can own our own files normally.  */
+      GMAP = open ("/proc/self/gid_map", O_WRONLY);
+      if (GMAP < 0)
+	FAIL_EXIT1 ("can't write to /proc/self/gid_map\n");
 
-  sprintf (tmp, "%lld %lld 1\n",
-	   (long long) (be_su ? 0 : original_gid), (long long) original_gid);
-  write (GMAP, tmp, strlen (tmp));
-  xclose (GMAP);
+      sprintf (tmp, "%lld %lld 1\n",
+	       (long long) (be_su ? 0 : original_gid), (long long) original_gid);
+      write (GMAP, tmp, strlen (tmp));
+      xclose (GMAP);
+    }
 
   if (change_cwd)
     {