Patchwork [V4,2/8] Provide chroot environment server side interfaces

Submitter Mohan Kumar M
Date Feb. 1, 2011, 5:25 a.m.
Message ID <1296537939-16649-1-git-send-email-mohan@in.ibm.com>
Permalink /patch/81273/
State New

Comments

Mohan Kumar M - Feb. 1, 2011, 5:25 a.m.
Implement chroot server side interfaces like sending the file
descriptor to qemu process, reading the object request from socket etc.
Also add chroot main function and other helper routines.

Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
---
 Makefile.objs              |    1 +
 hw/9pfs/virtio-9p-chroot.c |  212 ++++++++++++++++++++++++++++++++++++++++++++
 hw/9pfs/virtio-9p-chroot.h |   41 +++++++++
 hw/9pfs/virtio-9p.c        |   33 +++++++
 hw/file-op-9p.h            |    3 +
 5 files changed, 290 insertions(+), 0 deletions(-)
 create mode 100644 hw/9pfs/virtio-9p-chroot.c
 create mode 100644 hw/9pfs/virtio-9p-chroot.h
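
For readers unfamiliar with the mechanism the description refers to, passing an open file descriptor between processes over a Unix socket is done with SCM_RIGHTS ancillary data. The following is a minimal standalone sketch of that technique, not the patch's code: send_fd() and recv_fd() are hypothetical names chosen for illustration, and the single dummy payload byte is an assumption (the patch sends an FdInfo structure as the payload instead).

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Buffer for exactly one descriptor; the union keeps it aligned. */
union fd_control {
    struct cmsghdr align;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Hypothetical helper: send descriptor 'fd' over Unix socket 'sock'. */
int send_fd(int sock, int fd)
{
    char byte = 0;                     /* carry at least one data byte */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_control control;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&msg, 0, sizeof(msg));
    memset(&control, 0, sizeof(control));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof(control.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    /* sendmsg() returns the number of data bytes sent, or -1 with errno set */
    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one descriptor; returns it, or -1 on failure. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_control control;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof(control.buf);

    if (recvmsg(sock, &msg, 0) != 1) {
        return -1;
    }
    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int))) {
        return -1;
    }
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The receiver gets a new descriptor number that refers to the same open file, which is why the sender may close its own copy once the message has been sent.
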
Daniel P. Berrange - Feb. 1, 2011, 10:32 a.m.
On Tue, Feb 01, 2011 at 10:55:39AM +0530, M. Mohan Kumar wrote:
> Implement chroot server side interfaces like sending the file
> descriptor to qemu process, reading the object request from socket etc.
> Also add chroot main function and other helper routines.
> 
> Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
> ---
>  Makefile.objs              |    1 +
>  hw/9pfs/virtio-9p-chroot.c |  212 ++++++++++++++++++++++++++++++++++++++++++++
>  hw/9pfs/virtio-9p-chroot.h |   41 +++++++++
>  hw/9pfs/virtio-9p.c        |   33 +++++++
>  hw/file-op-9p.h            |    3 +
>  5 files changed, 290 insertions(+), 0 deletions(-)
>  create mode 100644 hw/9pfs/virtio-9p-chroot.c
>  create mode 100644 hw/9pfs/virtio-9p-chroot.h
> 
> diff --git a/Makefile.objs b/Makefile.objs
> index bc0344c..3007b6d 100644
> --- a/Makefile.objs
> +++ b/Makefile.objs
> +/*
> + * Fork a process and chroot into the share path. Communication
> + * between qemu process and chroot process happens via socket
> + * All file descriptors (including stdout and stderr) are closed
> + * except one socket descriptor (which is used for communicating
> + * between qemu process and chroot process)
> + */
> +int v9fs_chroot(FsContext *fs_ctx)
> +{
> +    int fd_pair[2], chroot_sock, error;
> +    V9fsFileObjectRequest request;
> +    pid_t pid;
> +    uint64_t code;
> +    FdInfo fd_info;
> +
> +    if (socketpair(PF_UNIX, SOCK_STREAM, 0, fd_pair) < 0) {
> +        error_report("socketpair %s", strerror(errno));
> +        return -1;
> +    }
> +
> +    pid = fork();
> +    if (pid < 0) {
> +        error_report("fork %s", strerror(errno));
> +        return -1;
> +    }
> +    if (pid != 0) {
> +        fs_ctx->chroot_socket = fd_pair[0];
> +        close(fd_pair[1]);
> +        return 0;
> +    }
> +
> +    close(fd_pair[0]);
> +    chroot_sock = fd_pair[1];
> +    if (chroot(fs_ctx->fs_root) < 0) {
> +        code = CHROOT_ERROR << 32 | errno;
> +        error = qemu_write_full(chroot_sock, &code, sizeof(code));
> +        _exit(1);
> +    }
> +
> +    error = chroot_daemonize(chroot_sock);
> +    if (error) {
> +        code = SETSID_ERROR << 32 | error;
> +        error = qemu_write_full(chroot_sock, &code, sizeof(code));
> +        _exit(1);
> +    }
> +
> +    /*
> +     * Write 0 to chroot socket to indicate chroot process creation is
> +     * successful
> +     */
> +    code = 0;
> +    if (qemu_write_full(chroot_sock, &code, sizeof(code))
> +                    != sizeof(code)) {
> +        _exit(1);
> +    }
> +    /* get the request from the socket */
> +    while (1) {
> +        memset(&fd_info, 0, sizeof(fd_info));
> +        if (chroot_read_request(chroot_sock, &request) == EIO) {
> +            fd_info.fi_fd = 0;
> +            fd_info.fi_error = EIO;
> +            fd_info.fi_flags = FI_SOCKERR;
> +            chroot_sendfd(chroot_sock, &fd_info);
> +            continue;
> +        }
> +        qemu_free((void *)request.path.path);
> +        if (request.data.oldpath_len) {
> +            qemu_free((void *)request.path.old_path);
> +        }
> +    }
> +}

There is a subtle problem with using fork() in a multi-threaded
program that I was recently made aware of in libvirt. In short
if you have a multi-threaded program that calls fork(), then
the child process must only use POSIX functions that are
declared 'async signal safe', until the child calls exec() or
exit().  In particular any malloc()/free() related functions
are *not* async signal safe.

  http://pubs.opengroup.org/onlinepubs/9699919799/functions/fork.html

  "If a multi-threaded process calls fork(), the new process shall contain
  a replica of the calling thread and its entire address space, possibly
  including the states of mutexes and other resources. Consequently, to
  avoid errors, the child process may only execute async-signal-safe
  operations until such time as one of the exec functions is called."

One example problem scenario. Thread 1 is currently doing a
malloc() and the malloc() impl is holding a mutex. Thread 2
now does a fork(), and in the child process calls malloc().
The child process will deadlock / hang forever because there
is nothing which will ever release the malloc() mutex that
was originally held by Thread 1. See also this thread which
brought the problem to my attention:

  http://lists.gnu.org/archive/html/coreutils/2011-01/msg00085.html

Regards,
Daniel
Stefan Hajnoczi - Feb. 1, 2011, 12:02 p.m.
On Tue, Feb 1, 2011 at 10:32 AM, Daniel P. Berrange <berrange@redhat.com> wrote:
> There is a subtle problem with using fork() in a multi-threaded
> program that I was recently made aware of in libvirt. In short
> if you have a multi-threaded program that calls fork(), then
> the child process must only use POSIX functions that are
> declared 'async signal safe', until the child calls exec() or
> exit().  In particular any malloc()/free() related functions
> are *not* async signal safe.

In this particular patch the fork() call happens quite early so the
risk should be low but it would be nice to investigate this issue
fully.

Stefan
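
The constraint discussed in the comments above can be illustrated with a small standalone sketch. It is not taken from the patch: fork_and_report() is a hypothetical name, and the example assumes that everything the child needs (here a 64-bit status code) is prepared before fork(), so that the child calls only async-signal-safe functions (close(), write(), _exit()) and never touches malloc()/free().

```c
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that reports 'code' to the parent
 * over a socketpair. The value is computed before fork(), so the child
 * needs no allocation and no stdio.
 * Returns 0 and stores the received value in *out on success, else -1.
 */
int fork_and_report(uint64_t code, uint64_t *out)
{
    int sv[2], wstatus;
    pid_t pid;
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        return -1;
    }

    pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: async-signal-safe calls only, then _exit(). */
        close(sv[0]);
        if (write(sv[1], &code, sizeof(code)) != (ssize_t)sizeof(code)) {
            _exit(1);
        }
        _exit(0);
    }

    /* Parent: read the child's status code, then reap the child. */
    close(sv[1]);
    n = read(sv[0], out, sizeof(*out));
    close(sv[0]);
    if (waitpid(pid, &wstatus, 0) < 0) {
        return -1;
    }
    if (n != (ssize_t)sizeof(*out)) {
        return -1;
    }
    return (WIFEXITED(wstatus) && WEXITSTATUS(wstatus) == 0) ? 0 : -1;
}
```

A long-running child such as the chroot process cannot stay within this restricted set forever, which is why the alternatives are to fork before any other thread exists or to exec() a separate helper binary.
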

Patch

diff --git a/Makefile.objs b/Makefile.objs
index bc0344c..3007b6d 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -273,6 +273,7 @@  hw-obj-$(CONFIG_SOUND) += $(sound-obj-y)
 9pfs-nested-$(CONFIG_VIRTFS) = virtio-9p-debug.o
 9pfs-nested-$(CONFIG_VIRTFS) +=  virtio-9p-local.o virtio-9p-xattr.o
 9pfs-nested-$(CONFIG_VIRTFS) +=   virtio-9p-xattr-user.o virtio-9p-posix-acl.o
+9pfs-nested-$(CONFIG_VIRTFS) +=   virtio-9p-chroot.o
 
 hw-obj-$(CONFIG_REALLY_VIRTFS) += $(addprefix 9pfs/, $(9pfs-nested-y))
 $(addprefix 9pfs/, $(9pfs-nested-y)): CFLAGS +=  -I$(SRC_PATH)/hw/
diff --git a/hw/9pfs/virtio-9p-chroot.c b/hw/9pfs/virtio-9p-chroot.c
new file mode 100644
index 0000000..5150ff0
--- /dev/null
+++ b/hw/9pfs/virtio-9p-chroot.c
@@ -0,0 +1,212 @@ 
+/*
+ * Virtio 9p chroot environment for contained access to exported path
+ *
+ * Copyright IBM, Corp. 2011
+ *
+ * Authors:
+ * M. Mohan Kumar <mohan@in.ibm.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the copying file in the top-level directory
+ *
+ */
+
+#include <sys/fsuid.h>
+#include <sys/resource.h>
+#include <signal.h>
+#include "virtio.h"
+#include "qemu_socket.h"
+#include "qemu-thread.h"
+#include "qerror.h"
+#include "virtio-9p.h"
+#include "virtio-9p-chroot.h"
+
+/*
+ * Structure used by chroot functions to transmit file descriptor and
+ * error info
+ */
+typedef struct {
+    int fi_fd;
+#define FI_FDVALID 1
+#define FI_SOCKERR 2
+    int fi_flags;
+    int fi_error;
+} FdInfo;
+
+union MsgControl {
+    struct cmsghdr cmsg;
+    char control[CMSG_SPACE(sizeof(int))];
+};
+
+/* Send file descriptor and error status to qemu process */
+static void chroot_sendfd(int sockfd, const FdInfo *fd_info)
+{
+    struct msghdr msg = { };
+    struct iovec iov;
+    struct cmsghdr *cmsg;
+    int retval;
+    union MsgControl msg_control;
+
+    iov.iov_base = (void *)fd_info;
+    iov.iov_len = sizeof(*fd_info);
+
+    memset(&msg, 0, sizeof(msg));
+    msg.msg_iov = &iov;
+    msg.msg_iovlen = 1;
+    /* No ancillary data on error/fd invalid flag */
+    if (fd_info->fi_flags & FI_FDVALID) {
+        msg.msg_control = &msg_control;
+        msg.msg_controllen = sizeof(msg_control);
+
+        cmsg = &msg_control.cmsg;
+        cmsg->cmsg_len = CMSG_LEN(sizeof(fd_info->fi_fd));
+        cmsg->cmsg_level = SOL_SOCKET;
+        cmsg->cmsg_type = SCM_RIGHTS;
+        memcpy(CMSG_DATA(cmsg), &fd_info->fi_fd, sizeof(fd_info->fi_fd));
+    }
+    retval = sendmsg(sockfd, &msg, 0);
+    if (retval < 0 && (errno == EPIPE || errno == EBADF)) {
+        _exit(1);
+    }
+    if (fd_info->fi_flags & FI_FDVALID) {
+        close(fd_info->fi_fd);
+    }
+}
+
+/* Read V9fsFileObjectRequest written by QEMU process */
+static int chroot_read_request(int sockfd, V9fsFileObjectRequest *request)
+{
+    int retval;
+    retval = qemu_read_full(sockfd, request, sizeof(request->data));
+    if (retval != sizeof(request->data)) {
+        if (errno == EBADF) {
+            _exit(1);
+        }
+        return EIO;
+    }
+    request->path.path = qemu_mallocz(request->data.path_len + 1);
+    retval = qemu_read_full(sockfd, (void *)request->path.path,
+                        request->data.path_len);
+    if (retval != request->data.path_len) {
+        qemu_free((void *)request->path.path);
+        if (errno == EBADF) {
+            _exit(1);
+        }
+        return EIO;
+    }
+    if (request->data.oldpath_len > 0) {
+        request->path.old_path =
+                qemu_mallocz(request->data.oldpath_len + 1);
+        retval = qemu_read_full(sockfd, (void *)request->path.old_path,
+                            request->data.oldpath_len);
+        if (retval != request->data.oldpath_len) {
+            qemu_free((void *)request->path.path);
+            qemu_free((void *)request->path.old_path);
+            if (errno == EBADF) {
+                _exit(1);
+            }
+            return EIO;
+        }
+    }
+    return 0;
+}
+
+static int chroot_daemonize(int chroot_sock)
+{
+    sigset_t sigset;
+    struct rlimit nr_fd;
+    int fd;
+    /* Block all signals for this process */
+    sigfillset(&sigset);
+    sigprocmask(SIG_SETMASK, &sigset, NULL);
+
+    /* Daemonize */
+    if (setsid() < 0) {
+        return errno;
+    }
+
+    /* Close other file descriptors */
+    getrlimit(RLIMIT_NOFILE, &nr_fd);
+    for (fd = 0; fd < nr_fd.rlim_cur; fd++) {
+        if (fd != chroot_sock) {
+            close(fd);
+        }
+    }
+    chdir("/");
+
+    /* Create files with mode as requested by client */
+    umask(0);
+    return 0;
+}
+
+/*
+ * Fork a process and chroot into the share path. Communication
+ * between qemu process and chroot process happens via socket
+ * All file descriptors (including stdout and stderr) are closed
+ * except one socket descriptor (which is used for communicating
+ * between qemu process and chroot process)
+ */
+int v9fs_chroot(FsContext *fs_ctx)
+{
+    int fd_pair[2], chroot_sock, error;
+    V9fsFileObjectRequest request;
+    pid_t pid;
+    uint64_t code;
+    FdInfo fd_info;
+
+    if (socketpair(PF_UNIX, SOCK_STREAM, 0, fd_pair) < 0) {
+        error_report("socketpair %s", strerror(errno));
+        return -1;
+    }
+
+    pid = fork();
+    if (pid < 0) {
+        error_report("fork %s", strerror(errno));
+        return -1;
+    }
+    if (pid != 0) {
+        fs_ctx->chroot_socket = fd_pair[0];
+        close(fd_pair[1]);
+        return 0;
+    }
+
+    close(fd_pair[0]);
+    chroot_sock = fd_pair[1];
+    if (chroot(fs_ctx->fs_root) < 0) {
+        code = CHROOT_ERROR << 32 | errno;
+        error = qemu_write_full(chroot_sock, &code, sizeof(code));
+        _exit(1);
+    }
+
+    error = chroot_daemonize(chroot_sock);
+    if (error) {
+        code = SETSID_ERROR << 32 | error;
+        error = qemu_write_full(chroot_sock, &code, sizeof(code));
+        _exit(1);
+    }
+
+    /*
+     * Write 0 to chroot socket to indicate chroot process creation is
+     * successful
+     */
+    code = 0;
+    if (qemu_write_full(chroot_sock, &code, sizeof(code))
+                    != sizeof(code)) {
+        _exit(1);
+    }
+    /* get the request from the socket */
+    while (1) {
+        memset(&fd_info, 0, sizeof(fd_info));
+        if (chroot_read_request(chroot_sock, &request) == EIO) {
+            fd_info.fi_fd = 0;
+            fd_info.fi_error = EIO;
+            fd_info.fi_flags = FI_SOCKERR;
+            chroot_sendfd(chroot_sock, &fd_info);
+            continue;
+        }
+        qemu_free((void *)request.path.path);
+        if (request.data.oldpath_len) {
+            qemu_free((void *)request.path.old_path);
+        }
+    }
+}
diff --git a/hw/9pfs/virtio-9p-chroot.h b/hw/9pfs/virtio-9p-chroot.h
new file mode 100644
index 0000000..6f3fd14
--- /dev/null
+++ b/hw/9pfs/virtio-9p-chroot.h
@@ -0,0 +1,41 @@ 
+#ifndef _QEMU_VIRTIO_9P_CHROOT_H
+#define _QEMU_VIRTIO_9P_CHROOT_H
+
+/* types for V9fsFileObjectRequest */
+#define T_OPEN      1
+#define T_CREATE    2
+#define T_MKDIR     3
+#define T_MKNOD     4
+#define T_SYMLINK   5
+#define T_LINK      6
+#define CHROOT_ERROR 1ULL
+#define SETSID_ERROR 2ULL
+
+struct V9fsFileObjectData
+{
+    int flags;
+    int mode;
+    uid_t uid;
+    gid_t gid;
+    dev_t dev;
+    int path_len;
+    int oldpath_len;
+    int type;
+};
+
+struct V9fsFileObjectPath
+{
+    const char *path;
+    const char *old_path;
+};
+
+typedef struct V9fsFileObjectRequest
+{
+    struct V9fsFileObjectData data;
+    struct V9fsFileObjectPath path;
+} V9fsFileObjectRequest;
+
+
+int v9fs_chroot(FsContext *fs_ctx);
+
+#endif /* _QEMU_VIRTIO_9P_CHROOT_H */
diff --git a/hw/9pfs/virtio-9p.c b/hw/9pfs/virtio-9p.c
index 27e7750..fc006a9 100644
--- a/hw/9pfs/virtio-9p.c
+++ b/hw/9pfs/virtio-9p.c
@@ -14,10 +14,13 @@ 
 #include "virtio.h"
 #include "pc.h"
 #include "qemu_socket.h"
+#include "qerror.h"
 #include "virtio-9p.h"
 #include "fsdev/qemu-fsdev.h"
 #include "virtio-9p-debug.h"
 #include "virtio-9p-xattr.h"
+#include "virtio-9p-chroot.h"
+#include <pthread.h>
 
 int debug_9p_pdu;
 
@@ -3743,5 +3746,35 @@  VirtIODevice *virtio_9p_init(DeviceState *dev, V9fsConf *conf)
                         s->tag_len;
     s->vdev.get_config = virtio_9p_get_config;
 
+    if (s->ctx.fs_sm == SM_PASSTHROUGH) {
+        uint64_t code;
+        pthread_mutex_init(&s->ctx.chroot_mutex, 0);
+        s->ctx.chroot_ioerror = 0;
+        if (v9fs_chroot(&s->ctx) < 0) {
+            exit(1);
+        }
+
+        /*
+         * Chroot process sends 0 to indicate chroot process creation is
+         * successful
+         */
+        if (read(s->ctx.chroot_socket, &code, sizeof(code)) != sizeof(code)) {
+            error_report("chroot process creation failed");
+            exit(1);
+        }
+        if (code != 0) {
+            switch (code >> 32) {
+            case CHROOT_ERROR:
+                error_report("chroot system call failed: %s",
+                                strerror(code & 0xFFFFFFFF));
+                break;
+            case SETSID_ERROR:
+                error_report("setsid failed: %s", strerror(code & 0xFFFFFFFF));
+                break;
+            }
+            exit(1);
+        }
+    }
+
     return &s->vdev;
 }
diff --git a/hw/file-op-9p.h b/hw/file-op-9p.h
index c7731c2..0cd7728 100644
--- a/hw/file-op-9p.h
+++ b/hw/file-op-9p.h
@@ -55,6 +55,9 @@  typedef struct FsContext
     SecModel fs_sm;
     uid_t uid;
     struct xattr_operations **xops;
+    pthread_mutex_t chroot_mutex;
+    int chroot_socket;
+    int chroot_ioerror;
 } FsContext;
 
 extern void cred_init(FsCred *);