Patchwork [RFC,3/4] block: Enable QEMU to retrieve passed fd before attempting open

login
register
mail settings
Submitter Corey Bryant
Date May 21, 2012, 8:19 p.m.
Message ID <1337631598-30639-4-git-send-email-coreyb@linux.vnet.ibm.com>
Download mbox | patch
Permalink /patch/160470/
State New
Headers show

Comments

Corey Bryant - May 21, 2012, 8:19 p.m.
With this patch, when QEMU needs to "open" a file, it will first
check to see if a matching filename/fd pair were passed via the
-filefd command line option or the getfd_file monitor command.
If a match is found, QEMU will use the passed fd and will not
attempt to open the file.  Otherwise, if a match is not found,
QEMU will attempt to open the file on it's own.

Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
---
 block.c           |   31 +++++++++++++++++++++++++++++++
 block/raw-posix.c |   20 ++++++++++----------
 block/raw-win32.c |    4 ++--
 block/vdi.c       |    4 ++--
 block/vmdk.c      |   21 +++++++++------------
 block/vpc.c       |    2 +-
 block/vvfat.c     |    4 ++--
 block_int.h       |   12 ++++++++++++
 vl.c              |    6 ++++++
 9 files changed, 75 insertions(+), 29 deletions(-)
Eric Blake - May 21, 2012, 9:50 p.m.
On 05/21/2012 02:19 PM, Corey Bryant wrote:
> With this patch, when QEMU needs to "open" a file, it will first
> check to see if a matching filename/fd pair were passed via the
> -filefd command line option or the getfd_file monitor command.
> If a match is found, QEMU will use the passed fd and will not
> attempt to open the file.  Otherwise, if a match is not found,
> QEMU will attempt to open the file on it's own.
> 
> Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com>

> +int file_open(const char *filename, int flags, mode_t mode)
> +{
> +    int fd;
> +
> +#ifdef _WIN32
> +    return qemu_open(filename, flags, mode);
> +#else

Would it be any easier to write:

#ifndef _WIN32
  qemu_get_fd_file() stuff
#endif
  return qemu_open()

so that you aren't repeating the return line?


> +    fd = qemu_get_fd_file(filename, false);
> +    if (fd != -1) {
> +        return dup(fd);

Why are you dup'ing the fd?  That just sounds like a way to leak fds.
Remember, the existing 'getfd' monitor command doesn't dup things - it
either gets consumed by a successful use of the named fd, or it remains
open on failure and the user can close it by calling 'closefd'.

Or, if you are intentionally allowing the user to reuse the fd for more
than one qemu open instance, you need to document this point.

What happens if qemu wants O_WRONLY or O_RDWR access, but the user
passed in an fd with only O_RDONLY access?
Corey Bryant - May 22, 2012, 2:06 p.m.
On 05/21/2012 05:50 PM, Eric Blake wrote:
> On 05/21/2012 02:19 PM, Corey Bryant wrote:
>> With this patch, when QEMU needs to "open" a file, it will first
>> check to see if a matching filename/fd pair were passed via the
>> -filefd command line option or the getfd_file monitor command.
>> If a match is found, QEMU will use the passed fd and will not
>> attempt to open the file.  Otherwise, if a match is not found,
>> QEMU will attempt to open the file on it's own.
>>
>> Signed-off-by: Corey Bryant<coreyb@linux.vnet.ibm.com>
>
>> +int file_open(const char *filename, int flags, mode_t mode)
>> +{
>> +    int fd;
>> +
>> +#ifdef _WIN32
>> +    return qemu_open(filename, flags, mode);
>> +#else
>
> Would it be any easier to write:
>
> #ifndef _WIN32
>    qemu_get_fd_file() stuff
> #endif
>    return qemu_open()
>
> so that you aren't repeating the return line?
>

Yes that's easier.  Thanks for the suggestion!

>
>> +    fd = qemu_get_fd_file(filename, false);
>> +    if (fd != -1) {
>> +        return dup(fd);
>
> Why are you dup'ing the fd?  That just sounds like a way to leak fds.
> Remember, the existing 'getfd' monitor command doesn't dup things - it
> either gets consumed by a successful use of the named fd, or it remains
> open on failure and the user can close it by calling 'closefd'.

The way it works now is that the fd/filename pairs that are passed in 
(either through -filefd or getfd_file) will persist on the option and 
monitor structures.  In other words, when we find a match for a filename 
on the monitor structure, we don't delete it from the struct.  We leave 
it there in case we need to open the file again.

So we dup the fd and let QEMU use the new fd as if it opened it itself. 
  This allows QEMU to close the fd on its own, and if it needs to 
re-open the fd, it can dup it again.

>
> Or, if you are intentionally allowing the user to reuse the fd for more
> than one qemu open instance, you need to document this point.

Ok, yes.  I'll document this.

>
> What happens if qemu wants O_WRONLY or O_RDWR access, but the user
> passed in an fd with only O_RDONLY access?

This is an area of concern for me.  And it's an area where Anthony's 
call-back approach was much simpler because it passed the open flags 
directly from QEMU to libvirt.

I don't think these flags can be set through fcntl(F_SETFL).  So I think 
they have to be set on the open() by the managing application.  The 
problem is that, today, QEMU will open a single file several different 
times on initialization alone (reading cow headers and what not), and 
the open flags vary on these open calls.  The difference with the new 
approach in this patch series is that the fd from a single open call is 
re-used for each of the "opens".

Patch

diff --git a/block.c b/block.c
index af2ab4f..6472114 100644
--- a/block.c
+++ b/block.c
@@ -102,6 +102,37 @@  static BlockDriverState *bs_snapshots;
 /* If non-zero, use only whitelisted block drivers */
 static int use_bdrv_whitelist;
 
+static int filename_match(QemuOpts *opts, void *opaque)
+{
+    const char *file = qemu_opt_get(opts, "file");
+    int fd = qemu_opt_get_number(opts, "fd", -1);
+
+    return (strcmp((char *)opaque, file) == 0) ? fd : 0;
+}
+
+int file_open(const char *filename, int flags, mode_t mode)
+{
+    int fd;
+
+#ifdef _WIN32
+    return qemu_open(filename, flags, mode);
+#else
+
+    fd = qemu_opts_foreach(qemu_find_opts("filefd"), filename_match,
+                           (void *)filename, 1);
+    if (fd != 0) {
+        return dup(fd);
+    }
+
+    fd = qemu_get_fd_file(filename, false);
+    if (fd != -1) {
+        return dup(fd);
+    }
+
+    return qemu_open(filename, flags, mode);
+#endif
+}
+
 #ifdef _WIN32
 static int is_windows_drive_prefix(const char *filename)
 {
diff --git a/block/raw-posix.c b/block/raw-posix.c
index 03fcfcc..4f7b40f 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -208,7 +208,7 @@  static int raw_open_common(BlockDriverState *bs, const char *filename,
         s->open_flags |= O_DSYNC;
 
     s->fd = -1;
-    fd = qemu_open(filename, s->open_flags, 0644);
+    fd = file_open(filename, s->open_flags, 0644);
     if (fd < 0) {
         ret = -errno;
         if (ret == -EROFS)
@@ -568,8 +568,8 @@  static int raw_create(const char *filename, QEMUOptionParameter *options)
         options++;
     }
 
-    fd = open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
-              0644);
+    fd = file_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
+                   0644);
     if (fd < 0) {
         result = -errno;
     } else {
@@ -741,7 +741,7 @@  static int hdev_open(BlockDriverState *bs, const char *filename, int flags)
         if ( bsdPath[ 0 ] != '\0' ) {
             strcat(bsdPath,"s0");
             /* some CDs don't have a partition 0 */
-            fd = open(bsdPath, O_RDONLY | O_BINARY | O_LARGEFILE);
+            fd = file_open(bsdPath, O_RDONLY | O_BINARY | O_LARGEFILE, 0);
             if (fd < 0) {
                 bsdPath[strlen(bsdPath)-1] = '1';
             } else {
@@ -798,7 +798,7 @@  static int fd_open(BlockDriverState *bs)
 #endif
             return -EIO;
         }
-        s->fd = open(bs->filename, s->open_flags & ~O_NONBLOCK);
+        s->fd = file_open(bs->filename, s->open_flags & ~O_NONBLOCK, 0);
         if (s->fd < 0) {
             s->fd_error_time = get_clock();
             s->fd_got_error = 1;
@@ -872,7 +872,7 @@  static int hdev_create(const char *filename, QEMUOptionParameter *options)
         options++;
     }
 
-    fd = open(filename, O_WRONLY | O_BINARY);
+    fd = file_open(filename, O_WRONLY | O_BINARY, 0);
     if (fd < 0)
         return -errno;
 
@@ -950,7 +950,7 @@  static int floppy_probe_device(const char *filename)
     if (strstart(filename, "/dev/fd", NULL))
         prio = 50;
 
-    fd = open(filename, O_RDONLY | O_NONBLOCK);
+    fd = file_open(filename, O_RDONLY | O_NONBLOCK, 0);
     if (fd < 0) {
         goto out;
     }
@@ -1003,7 +1003,7 @@  static void floppy_eject(BlockDriverState *bs, bool eject_flag)
         close(s->fd);
         s->fd = -1;
     }
-    fd = open(bs->filename, s->open_flags | O_NONBLOCK);
+    fd = file_open(bs->filename, s->open_flags | O_NONBLOCK, 0);
     if (fd >= 0) {
         if (ioctl(fd, FDEJECT, 0) < 0)
             perror("FDEJECT");
@@ -1053,7 +1053,7 @@  static int cdrom_probe_device(const char *filename)
     int prio = 0;
     struct stat st;
 
-    fd = open(filename, O_RDONLY | O_NONBLOCK);
+    fd = file_open(filename, O_RDONLY | O_NONBLOCK, 0);
     if (fd < 0) {
         goto out;
     }
@@ -1177,7 +1177,7 @@  static int cdrom_reopen(BlockDriverState *bs)
      */
     if (s->fd >= 0)
         close(s->fd);
-    fd = open(bs->filename, s->open_flags, 0644);
+    fd = file_open(bs->filename, s->open_flags, 0644);
     if (fd < 0) {
         s->fd = -1;
         return -EIO;
diff --git a/block/raw-win32.c b/block/raw-win32.c
index e4b0b75..ec4ac96 100644
--- a/block/raw-win32.c
+++ b/block/raw-win32.c
@@ -255,8 +255,8 @@  static int raw_create(const char *filename, QEMUOptionParameter *options)
         options++;
     }
 
-    fd = open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
-              0644);
+    fd = file_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
+                   0644);
     if (fd < 0)
         return -EIO;
     set_sparse(fd);
diff --git a/block/vdi.c b/block/vdi.c
index 119d3c7..b3ec9b2 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -648,8 +648,8 @@  static int vdi_create(const char *filename, QEMUOptionParameter *options)
         options++;
     }
 
-    fd = open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY | O_LARGEFILE,
-              0644);
+    fd = file_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY | O_LARGEFILE,
+                   0644);
     if (fd < 0) {
         return -errno;
     }
diff --git a/block/vmdk.c b/block/vmdk.c
index 18e9b4c..bda4c1a 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1161,10 +1161,9 @@  static int vmdk_create_extent(const char *filename, int64_t filesize,
     VMDK4Header header;
     uint32_t tmp, magic, grains, gd_size, gt_size, gt_count;
 
-    fd = open(
-        filename,
-        O_WRONLY | O_CREAT | O_TRUNC | O_BINARY | O_LARGEFILE,
-        0644);
+    fd = file_open(filename,
+                   O_WRONLY | O_CREAT | O_TRUNC | O_BINARY | O_LARGEFILE,
+                   0644);
     if (fd < 0) {
         return -errno;
     }
@@ -1484,15 +1483,13 @@  static int vmdk_create(const char *filename, QEMUOptionParameter *options)
             (flags & BLOCK_FLAG_COMPAT6 ? 6 : 4),
             total_size / (int64_t)(63 * 16 * 512));
     if (split || flat) {
-        fd = open(
-                filename,
-                O_WRONLY | O_CREAT | O_TRUNC | O_BINARY | O_LARGEFILE,
-                0644);
+        fd = file_open(filename,
+                       O_WRONLY | O_CREAT | O_TRUNC | O_BINARY | O_LARGEFILE,
+                       0644);
     } else {
-        fd = open(
-                filename,
-                O_WRONLY | O_BINARY | O_LARGEFILE,
-                0644);
+        fd = file_open(filename,
+                       O_WRONLY | O_BINARY | O_LARGEFILE,
+                       0644);
     }
     if (fd < 0) {
         return -errno;
diff --git a/block/vpc.c b/block/vpc.c
index 5cd13d1..c1efb10 100644
--- a/block/vpc.c
+++ b/block/vpc.c
@@ -678,7 +678,7 @@  static int vpc_create(const char *filename, QEMUOptionParameter *options)
     }
 
     /* Create the file */
-    fd = open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, 0644);
+    fd = file_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, 0644);
     if (fd < 0) {
         return -EIO;
     }
diff --git a/block/vvfat.c b/block/vvfat.c
index 2dc9d50..1573d8f 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -1156,7 +1156,7 @@  static int open_file(BDRVVVFATState* s,mapping_t* mapping)
     if(!s->current_mapping ||
 	    strcmp(s->current_mapping->path,mapping->path)) {
 	/* open file */
-	int fd = open(mapping->path, O_RDONLY | O_BINARY | O_LARGEFILE);
+	int fd = file_open(mapping->path, O_RDONLY | O_BINARY | O_LARGEFILE, 0);
 	if(fd<0)
 	    return -1;
 	vvfat_close_current_file(s);
@@ -2215,7 +2215,7 @@  static int commit_one_file(BDRVVVFATState* s,
     for (i = s->cluster_size; i < offset; i += s->cluster_size)
 	c = modified_fat_get(s, c);
 
-    fd = open(mapping->path, O_RDWR | O_CREAT | O_BINARY, 0666);
+    fd = file_open(mapping->path, O_RDWR | O_CREAT | O_BINARY, 0666);
     if (fd < 0) {
 	fprintf(stderr, "Could not open %s... (%s, %d)\n", mapping->path,
 		strerror(errno), errno);
diff --git a/block_int.h b/block_int.h
index b80e66d..f3b6144 100644
--- a/block_int.h
+++ b/block_int.h
@@ -453,4 +453,16 @@  void stream_start(BlockDriverState *bs, BlockDriverState *base,
                   BlockDriverCompletionFunc *cb,
                   void *opaque, Error **errp);
 
+/**
+ * file_open:
+ * @filename: the filename to attempt to open
+ * @flags: O_ flags that determine how the file is open
+ * @mode: the mode to create the file with if #O_CREAT is included in @flags
+ *
+ * This function behaves just like the @open libc function.  It may, however,
+ * get the file descriptor from the QEMU command line or monitor if QEMU is
+ * being run with fewer privileges.
+ */
+int file_open(const char *filename, int flags, mode_t mode);
+
 #endif /* BLOCK_INT_H */
diff --git a/vl.c b/vl.c
index 23ab3a3..6a55166 100644
--- a/vl.c
+++ b/vl.c
@@ -3201,6 +3201,12 @@  int main(int argc, char **argv, char **envp)
             case QEMU_OPTION_qtest_log:
                 qtest_log = optarg;
                 break;
+            case QEMU_OPTION_filefd:
+                opts = qemu_opts_parse(qemu_find_opts("filefd"), optarg, 0);
+                if (!opts) {
+                    exit(1);
+                }
+                break;
             default:
                 os_parse_cmd_args(popt->index, optarg);
             }