diff mbox series

[23/24] vhost-user-fs: Implement drop CAP_FSETID functionality

Message ID 20210209190224.62827-24-dgilbert@redhat.com
State New
Headers show
Series virtiofs dax patches | expand

Commit Message

Dr. David Alan Gilbert Feb. 9, 2021, 7:02 p.m. UTC
From: Vivek Goyal <vgoyal@redhat.com>

As part of slave_io message, slave can ask to do I/O on an fd. Additionally
slave can ask for dropping CAP_FSETID (if master has it) before doing I/O.
Implement functionality to drop CAP_FSETID and gain it back after the
operation.

This also creates a dependency on libcap-ng.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 hw/virtio/meson.build     |  1 +
 hw/virtio/vhost-user-fs.c | 92 ++++++++++++++++++++++++++++++++++++++-
 meson.build               |  6 +++
 3 files changed, 97 insertions(+), 2 deletions(-)

Comments

Stefan Hajnoczi Feb. 11, 2021, 2:35 p.m. UTC | #1
On Tue, Feb 09, 2021 at 07:02:23PM +0000, Dr. David Alan Gilbert (git) wrote:
> From: Vivek Goyal <vgoyal@redhat.com>
> 
> As part of slave_io message, slave can ask to do I/O on an fd. Additionally
> slave can ask for dropping CAP_FSETID (if master has it) before doing I/O.
> Implement functionality to drop CAP_FSETID and gain it back after the
> operation.
> 
> This also creates a dependency on libcap-ng.

Is this patch only for the case where QEMU is running as root?

I'm not sure it will have any effect on a regular QEMU (e.g. launched by
libvirt).
Vivek Goyal Feb. 11, 2021, 2:40 p.m. UTC | #2
On Thu, Feb 11, 2021 at 02:35:42PM +0000, Stefan Hajnoczi wrote:
> On Tue, Feb 09, 2021 at 07:02:23PM +0000, Dr. David Alan Gilbert (git) wrote:
> > From: Vivek Goyal <vgoyal@redhat.com>
> > 
> > As part of slave_io message, slave can ask to do I/O on an fd. Additionally
> > slave can ask for dropping CAP_FSETID (if master has it) before doing I/O.
> > Implement functionality to drop CAP_FSETID and gain it back after the
> > operation.
> > 
> > This also creates a dependency on libcap-ng.
> 
> Is this patch only for the case where QEMU is running as root?
> 

Yes, it primarily is for the case where qemu is running as root, or
somebody managed to launch it non-root but with still having capability
CAP_FSETID.

Vivek

> I'm not sure it will have any effect on a regular QEMU (e.g. launched by
> libvirt).
Stefan Hajnoczi Feb. 15, 2021, 3:57 p.m. UTC | #3
On Thu, Feb 11, 2021 at 09:40:31AM -0500, Vivek Goyal wrote:
> On Thu, Feb 11, 2021 at 02:35:42PM +0000, Stefan Hajnoczi wrote:
> > On Tue, Feb 09, 2021 at 07:02:23PM +0000, Dr. David Alan Gilbert (git) wrote:
> > > From: Vivek Goyal <vgoyal@redhat.com>
> > > 
> > > As part of slave_io message, slave can ask to do I/O on an fd. Additionally
> > > slave can ask for dropping CAP_FSETID (if master has it) before doing I/O.
> > > Implement functionality to drop CAP_FSETID and gain it back after the
> > > operation.
> > > 
> > > This also creates a dependency on libcap-ng.
> > 
> > Is this patch only for the case where QEMU is running as root?
> > 
> 
> Yes, it primarily is for the case where qemu is running as root, or
> somebody managed to launch it non-root but with still having capability
> CAP_FSETID.

Running QEMU as root is not encouraged because the security model is
designed around the principle of least privilege (only give QEMU access
to resources that belong to the guest).

What happens in the case where QEMU is not root? Does that mean QEMU
will drop suid/guid bits even if the FUSE client wanted them to be
preserved?

Stefan
Vivek Goyal Feb. 16, 2021, 3:57 p.m. UTC | #4
On Mon, Feb 15, 2021 at 03:57:11PM +0000, Stefan Hajnoczi wrote:
> On Thu, Feb 11, 2021 at 09:40:31AM -0500, Vivek Goyal wrote:
> > On Thu, Feb 11, 2021 at 02:35:42PM +0000, Stefan Hajnoczi wrote:
> > > On Tue, Feb 09, 2021 at 07:02:23PM +0000, Dr. David Alan Gilbert (git) wrote:
> > > > From: Vivek Goyal <vgoyal@redhat.com>
> > > > 
> > > > As part of slave_io message, slave can ask to do I/O on an fd. Additionally
> > > > slave can ask for dropping CAP_FSETID (if master has it) before doing I/O.
> > > > Implement functionality to drop CAP_FSETID and gain it back after the
> > > > operation.
> > > > 
> > > > This also creates a dependency on libcap-ng.
> > > 
> > > Is this patch only for the case where QEMU is running as root?
> > > 
> > 
> > Yes, it primarily is for the case where qemu is running as root, or
> > somebody managed to launch it non-root but with still having capability
> > CAP_FSETID.
> 
> Running QEMU as root is not encouraged because the security model is
> designed around the principle of least privilege (only give QEMU access
> to resources that belong to the guest).
> 
> What happens in the case where QEMU is not root? Does that mean QEMU
> will drop suid/guid bits even if the FUSE client wanted them to be
> preserved?

QEMU will drop CAP_FSETID only if vhost-user slave asked for it. There
is no notion of gaining CAP_FSETID.

IOW, yes, if qemu is running unpriviliged and does not have CAP_FSETID,
then we will end up clearing setuid bit on host. Not sure how that
problem can be fixed.

Vivek
Stefan Hajnoczi Feb. 22, 2021, 4:53 p.m. UTC | #5
On Tue, Feb 16, 2021 at 10:57:10AM -0500, Vivek Goyal wrote:
> On Mon, Feb 15, 2021 at 03:57:11PM +0000, Stefan Hajnoczi wrote:
> > On Thu, Feb 11, 2021 at 09:40:31AM -0500, Vivek Goyal wrote:
> > > On Thu, Feb 11, 2021 at 02:35:42PM +0000, Stefan Hajnoczi wrote:
> > > > On Tue, Feb 09, 2021 at 07:02:23PM +0000, Dr. David Alan Gilbert (git) wrote:
> > > > > From: Vivek Goyal <vgoyal@redhat.com>
> > > > > 
> > > > > As part of slave_io message, slave can ask to do I/O on an fd. Additionally
> > > > > slave can ask for dropping CAP_FSETID (if master has it) before doing I/O.
> > > > > Implement functionality to drop CAP_FSETID and gain it back after the
> > > > > operation.
> > > > > 
> > > > > This also creates a dependency on libcap-ng.
> > > > 
> > > > Is this patch only for the case where QEMU is running as root?
> > > > 
> > > 
> > > Yes, it primarily is for the case where qemu is running as root, or
> > > somebody managed to launch it non-root but with still having capability
> > > CAP_FSETID.
> > 
> > Running QEMU as root is not encouraged because the security model is
> > designed around the principle of least privilege (only give QEMU access
> > to resources that belong to the guest).
> > 
> > What happens in the case where QEMU is not root? Does that mean QEMU
> > will drop suid/guid bits even if the FUSE client wanted them to be
> > preserved?
> 
> QEMU will drop CAP_FSETID only if vhost-user slave asked for it. There
> is no notion of gaining CAP_FSETID.
> 
> IOW, yes, if qemu is running unpriviliged and does not have CAP_FSETID,
> then we will end up clearing setuid bit on host. Not sure how that
> problem can be fixed.

Yeah, that seems problematic since the suid bit should stay set in that
case. The host cannot set the bit again (even if it has privileges)
because that would create a race condition where the guest expects the
bit set but it's cleared.

Stefan
diff mbox series

Patch

diff --git a/hw/virtio/meson.build b/hw/virtio/meson.build
index fbff9bc9d4..bdcdc82e13 100644
--- a/hw/virtio/meson.build
+++ b/hw/virtio/meson.build
@@ -18,6 +18,7 @@  virtio_ss.add(when: 'CONFIG_VIRTIO_BALLOON', if_true: files('virtio-balloon.c'))
 virtio_ss.add(when: 'CONFIG_VIRTIO_CRYPTO', if_true: files('virtio-crypto.c'))
 virtio_ss.add(when: ['CONFIG_VIRTIO_CRYPTO', 'CONFIG_VIRTIO_PCI'], if_true: files('virtio-crypto-pci.c'))
 virtio_ss.add(when: 'CONFIG_VHOST_USER_FS', if_true: files('vhost-user-fs.c'))
+virtio_ss.add(when: 'CONFIG_VHOST_USER_FS', if_true: libcap_ng)
 virtio_ss.add(when: ['CONFIG_VHOST_USER_FS', 'CONFIG_VIRTIO_PCI'], if_true: files('vhost-user-fs-pci.c'))
 virtio_ss.add(when: 'CONFIG_VIRTIO_PMEM', if_true: files('virtio-pmem.c'))
 virtio_ss.add(when: 'CONFIG_VHOST_VSOCK', if_true: files('vhost-vsock.c', 'vhost-vsock-common.c'))
diff --git a/hw/virtio/vhost-user-fs.c b/hw/virtio/vhost-user-fs.c
index 61e891c82d..0d6ec27edd 100644
--- a/hw/virtio/vhost-user-fs.c
+++ b/hw/virtio/vhost-user-fs.c
@@ -13,6 +13,8 @@ 
 
 #include "qemu/osdep.h"
 #include <sys/ioctl.h>
+#include <cap-ng.h>
+#include <sys/syscall.h>
 #include "standard-headers/linux/virtio_fs.h"
 #include "qapi/error.h"
 #include "hw/qdev-properties.h"
@@ -36,6 +38,84 @@ 
 #define DAX_WINDOW_PROT PROT_NONE
 #endif
 
+/*
+ * Helpers for dropping and regaining effective capabilities. Returns 0
+ * on success, error otherwise
+ */
+static int drop_effective_cap(const char *cap_name, bool *cap_dropped)
+{
+    int cap, ret;
+
+    cap = capng_name_to_capability(cap_name);
+    if (cap < 0) {
+        ret = -errno;
+        error_report("capng_name_to_capability(%s) failed:%s", cap_name,
+                     strerror(errno));
+        goto out;
+    }
+
+    if (capng_get_caps_process()) {
+        ret = -errno;
+        error_report("capng_get_caps_process() failed:%s", strerror(errno));
+        goto out;
+    }
+
+    /* We dont have this capability in effective set already. */
+    if (!capng_have_capability(CAPNG_EFFECTIVE, cap)) {
+        ret = 0;
+        goto out;
+    }
+
+    if (capng_update(CAPNG_DROP, CAPNG_EFFECTIVE, cap)) {
+        ret = -errno;
+        error_report("capng_update(DROP,) failed");
+        goto out;
+    }
+    if (capng_apply(CAPNG_SELECT_CAPS)) {
+        ret = -errno;
+        error_report("drop:capng_apply() failed");
+        goto out;
+    }
+
+    ret = 0;
+    if (cap_dropped) {
+        *cap_dropped = true;
+    }
+
+out:
+    return ret;
+}
+
+static int gain_effective_cap(const char *cap_name)
+{
+    int cap;
+    int ret = 0;
+
+    cap = capng_name_to_capability(cap_name);
+    if (cap < 0) {
+        ret = -errno;
+        error_report("capng_name_to_capability(%s) failed:%s", cap_name,
+                     strerror(errno));
+        goto out;
+    }
+
+    if (capng_update(CAPNG_ADD, CAPNG_EFFECTIVE, cap)) {
+        ret = -errno;
+        error_report("capng_update(ADD,) failed");
+        goto out;
+    }
+
+    if (capng_apply(CAPNG_SELECT_CAPS)) {
+        ret = -errno;
+        error_report("gain:capng_apply() failed");
+        goto out;
+    }
+    ret = 0;
+
+out:
+    return ret;
+}
+
 uint64_t vhost_user_fs_slave_map(struct vhost_dev *dev, VhostUserFSSlaveMsg *sm,
                                  int fd)
 {
@@ -170,6 +250,7 @@  uint64_t vhost_user_fs_slave_io(struct vhost_dev *dev, VhostUserFSSlaveMsg *sm,
     unsigned int i;
     int res = 0;
     size_t done = 0;
+    bool cap_fsetid_dropped = false;
 
     if (fd < 0) {
         error_report("Bad fd for map");
@@ -177,8 +258,10 @@  uint64_t vhost_user_fs_slave_io(struct vhost_dev *dev, VhostUserFSSlaveMsg *sm,
     }
 
     if (sm->gen_flags & VHOST_USER_FS_GENFLAG_DROP_FSETID) {
-        error_report("Dropping CAP_FSETID is not supported");
-        return (uint64_t)-ENOTSUP;
+        res = drop_effective_cap("FSETID", &cap_fsetid_dropped);
+        if (res != 0) {
+            return (uint64_t)res;
+        }
     }
 
     for (i = 0; i < VHOST_USER_FS_SLAVE_ENTRIES && !res; i++) {
@@ -237,6 +320,11 @@  uint64_t vhost_user_fs_slave_io(struct vhost_dev *dev, VhostUserFSSlaveMsg *sm,
     }
     close(fd);
 
+    if (cap_fsetid_dropped) {
+        if (gain_effective_cap("FSETID")) {
+            error_report("Failed to gain CAP_FSETID");
+        }
+    }
     trace_vhost_user_fs_slave_io_exit(res, done);
     if (res < 0) {
         return (uint64_t)res;
diff --git a/meson.build b/meson.build
index 2d8b433ff0..99a7fbacc1 100644
--- a/meson.build
+++ b/meson.build
@@ -1060,6 +1060,12 @@  elif get_option('virtfs').disabled()
   have_virtfs = false
 endif
 
+if config_host.has_key('CONFIG_VHOST_USER_FS')
+  if not libcap_ng.found()
+    error('vhost-user-fs requires libcap-ng-devel')
+  endif
+endif
+
 config_host_data.set_quoted('CONFIG_BINDIR', get_option('prefix') / get_option('bindir'))
 config_host_data.set_quoted('CONFIG_PREFIX', get_option('prefix'))
 config_host_data.set_quoted('CONFIG_QEMU_CONFDIR', get_option('prefix') / qemu_confdir)