From patchwork Tue Aug 7 15:58:28 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Corey Bryant
X-Patchwork-Id: 175692
From: Corey Bryant
To: qemu-devel@nongnu.org
Date: Tue, 7 Aug 2012 11:58:28 -0400
Message-Id: <1344355108-14786-7-git-send-email-coreyb@linux.vnet.ibm.com>
X-Mailer: git-send-email 1.7.10.4
In-Reply-To: <1344355108-14786-1-git-send-email-coreyb@linux.vnet.ibm.com>
References: <1344355108-14786-1-git-send-email-coreyb@linux.vnet.ibm.com>
Cc: kwolf@redhat.com, aliguori@us.ibm.com, stefanha@linux.vnet.ibm.com, libvir-list@redhat.com, Corey Bryant, lcapitulino@redhat.com, eblake@redhat.com
Subject: [Qemu-devel] [PATCH v7 6/6] block: Enable qemu_open/close to work with fd sets

When qemu_open is passed a filename of the "/dev/fdset/nnn" format
(where nnn is the fdset ID), it searches the specified monitor fd
set for an fd with matching access mode flags.  If such an fd is
found, qemu_open returns a dup of it.

Each fd set has a reference count.  The purpose of the reference
count is to determine whether an fd set contains file descriptors
with open dup() references that have not yet been closed.  It is
incremented on qemu_open and decremented on qemu_close.  File
descriptors in an fd set can be closed only once the refcount
reaches zero.  If an fd set has dup() references open, then we
must keep the other fds in the fd set open in case a reopen of the
file occurs that requires an fd with a different access mode.

Signed-off-by: Corey Bryant
---
v2:
 -Get rid of file_open and move dup code to qemu_open
  (kwolf@redhat.com)
 -Use strtol wrapper instead of atoi (kwolf@redhat.com)

v3:
 -Add note about fd leakage (eblake@redhat.com)

v4:
 -Moved patch to be later in series (lcapitulino@redhat.com)
 -Update qemu_open to check access mode flags and set flags that
  can be set (eblake@redhat.com, kwolf@redhat.com)

v5:
 -This patch was overhauled quite a bit in this version, with
  the addition of fd set and refcount support.
 -Use qemu_set_cloexec() on dup'd fd (eblake@redhat.com)
 -Modify flags set by fcntl on dup'd fd (eblake@redhat.com)
 -Reduce syscalls when setting flags for dup'd fd (eblake@redhat.com)
 -Fix O_RDWR, O_RDONLY, O_WRONLY checks (eblake@redhat.com)

v6:
 -Pass only the fd to qemu_close() and keep track of dup fds per fd
  set. (kwolf@redhat.com, eblake@redhat.com)
 -Handle refcount incr/decr in new dup_fd_add/remove fd functions.
 -Use qemu_set_cloexec() appropriately in qemu_dup() (kwolf@redhat.com)
 -Simplify setting of setfl_flags in qemu_dup() (kwolf@redhat.com)
 -Add preprocessor checks for F_DUPFD_CLOEXEC (eblake@redhat.com)
 -Simplify flag checking in monitor_fdset_get_fd() (kwolf@redhat.com)

v7:
 -Minor updates to reference global mon_fdsets, and to remove
  default_mon usage in osdep.c. (kwolf@redhat.com)

 cutils.c      |    5 +++
 monitor.c     |   87 ++++++++++++++++++++++++++++++++++++++++++++
 monitor.h     |    5 +++
 osdep.c       |  112 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 qemu-common.h |    1 +
 qemu-tool.c   |   20 +++++++++++
 6 files changed, 230 insertions(+)

diff --git a/cutils.c b/cutils.c
index 9d4c570..8b0d2bb 100644
--- a/cutils.c
+++ b/cutils.c
@@ -382,3 +382,8 @@ int qemu_parse_fd(const char *param)
     }
     return fd;
 }
+
+int qemu_parse_fdset(const char *param)
+{
+    return qemu_parse_fd(param);
+}
diff --git a/monitor.c b/monitor.c
index 84eade8..a16d48b 100644
--- a/monitor.c
+++ b/monitor.c
@@ -154,6 +154,7 @@ struct mon_fdset_t {
     int64_t id;
     int refcount;
     QLIST_HEAD(, mon_fdset_fd_t) fds;
+    QLIST_HEAD(, mon_fdset_fd_t) dup_fds;
     QLIST_ENTRY(mon_fdset_t) next;
 };
 
@@ -2566,6 +2567,92 @@ FdsetInfoList *qmp_query_fdsets(Error **errp)
     return fdset_list;
 }
 
+int monitor_fdset_get_fd(int64_t fdset_id, int flags)
+{
+    mon_fdset_t *mon_fdset;
+    mon_fdset_fd_t *mon_fdset_fd;
+    int mon_fd_flags;
+
+    QLIST_FOREACH(mon_fdset, &mon_fdsets, next) {
+        if (mon_fdset->id != fdset_id) {
+            continue;
+        }
+        QLIST_FOREACH(mon_fdset_fd, &mon_fdset->fds, next) {
+            if (mon_fdset_fd->removed) {
+                continue;
+            }
+
+            mon_fd_flags = fcntl(mon_fdset_fd->fd, F_GETFL);
+            if (mon_fd_flags == -1) {
+                return -1;
+            }
+
+            if ((flags & O_ACCMODE) == (mon_fd_flags & O_ACCMODE)) {
+                return mon_fdset_fd->fd;
+            }
+        }
+        errno = EACCES;
+        return -1;
+    }
+    errno = ENOENT;
+    return -1;
+}
+
+int monitor_fdset_dup_fd_add(int64_t fdset_id, int dup_fd)
+{
+    mon_fdset_t *mon_fdset;
+    mon_fdset_fd_t *mon_fdset_fd_dup;
+
+    QLIST_FOREACH(mon_fdset, &mon_fdsets, next) {
+        if (mon_fdset->id != fdset_id) {
+            continue;
+        }
+        QLIST_FOREACH(mon_fdset_fd_dup, &mon_fdset->dup_fds, next) {
+            if (mon_fdset_fd_dup->fd == dup_fd) {
+                return -1;
+            }
+        }
+        mon_fdset_fd_dup = g_malloc0(sizeof(*mon_fdset_fd_dup));
+        mon_fdset_fd_dup->fd = dup_fd;
+        QLIST_INSERT_HEAD(&mon_fdset->dup_fds, mon_fdset_fd_dup, next);
+        mon_fdset->refcount++;
+        return 0;
+    }
+    return -1;
+}
+
+static int _monitor_fdset_dup_fd_find(int dup_fd, bool remove)
+{
+    mon_fdset_t *mon_fdset;
+    mon_fdset_fd_t *mon_fdset_fd_dup;
+
+    QLIST_FOREACH(mon_fdset, &mon_fdsets, next) {
+        QLIST_FOREACH(mon_fdset_fd_dup, &mon_fdset->dup_fds, next) {
+            if (mon_fdset_fd_dup->fd == dup_fd) {
+                if (remove) {
+                    QLIST_REMOVE(mon_fdset_fd_dup, next);
+                    mon_fdset->refcount--;
+                    if (mon_fdset->refcount == 0) {
+                        monitor_fdset_cleanup(mon_fdset);
+                    }
+                }
+                return mon_fdset->id;
+            }
+        }
+    }
+    return -1;
+}
+
+int monitor_fdset_dup_fd_find(int dup_fd)
+{
+    return _monitor_fdset_dup_fd_find(dup_fd, false);
+}
+
+int monitor_fdset_dup_fd_remove(int dup_fd)
+{
+    return _monitor_fdset_dup_fd_find(dup_fd, true);
+}
+
 /* mon_cmds and info_cmds would be sorted at runtime */
 static mon_cmd_t mon_cmds[] = {
 #include "hmp-commands.h"
diff --git a/monitor.h b/monitor.h
index 5f4de1b..30b660f 100644
--- a/monitor.h
+++ b/monitor.h
@@ -86,4 +86,9 @@ int qmp_qom_set(Monitor *mon, const QDict *qdict, QObject **ret);
 
 int qmp_qom_get(Monitor *mon, const QDict *qdict, QObject **ret);
 
+int monitor_fdset_get_fd(int64_t fdset_id, int flags);
+int monitor_fdset_dup_fd_add(int64_t fdset_id, int dup_fd);
+int monitor_fdset_dup_fd_remove(int dup_fd);
+int monitor_fdset_dup_fd_find(int dup_fd);
+
 #endif /* !MONITOR_H */
diff --git a/osdep.c b/osdep.c
index 7f876ae..9ff3561 100644
--- a/osdep.c
+++ b/osdep.c
@@ -48,6 +48,7 @@ extern int madvise(caddr_t, size_t, int);
 #include "qemu-common.h"
 #include "trace.h"
 #include "qemu_socket.h"
+#include "monitor.h"
 
 static bool fips_enabled = false;
 
@@ -78,6 +79,69 @@ int qemu_madvise(void *addr, size_t len, int advice)
 #endif
 }
 
+/*
+ * Dups an fd and sets the flags
+ */
+static int qemu_dup(int fd, int flags)
+{
+    int ret;
+    int serrno;
+    int dup_flags;
+    int setfl_flags;
+
+    if (flags & O_CLOEXEC) {
+#ifdef F_DUPFD_CLOEXEC
+        ret = fcntl(fd, F_DUPFD_CLOEXEC, 0);
+#else
+        ret = dup(fd);
+        if (ret != -1) {
+            qemu_set_cloexec(ret);
+        }
+#endif
+    } else {
+        ret = dup(fd);
+    }
+
+    if (ret == -1) {
+        goto fail;
+    }
+
+    dup_flags = fcntl(ret, F_GETFL);
+    if (dup_flags == -1) {
+        goto fail;
+    }
+
+    if ((flags & O_SYNC) != (dup_flags & O_SYNC)) {
+        errno = EINVAL;
+        goto fail;
+    }
+
+    /* Set/unset flags that we can with fcntl */
+    setfl_flags = O_APPEND | O_ASYNC | O_DIRECT | O_NOATIME | O_NONBLOCK;
+    dup_flags &= ~setfl_flags;
+    dup_flags |= (flags & setfl_flags);
+    if (fcntl(ret, F_SETFL, dup_flags) == -1) {
+        goto fail;
+    }
+
+    /* Truncate the file in the cases that open() would truncate it */
+    if (flags & O_TRUNC ||
+            ((flags & (O_CREAT | O_EXCL)) == (O_CREAT | O_EXCL))) {
+        if (ftruncate(ret, 0) == -1) {
+            goto fail;
+        }
+    }
+
+    return ret;
+
+fail:
+    serrno = errno;
+    if (ret != -1) {
+        close(ret);
+    }
+    errno = serrno;
+    return -1;
+}
 
 /*
  * Opens a file with FD_CLOEXEC set
@@ -87,6 +151,39 @@ int qemu_open(const char *name, int flags, ...)
     int ret;
     int mode = 0;
 
+#ifndef _WIN32
+    const char *fdset_id_str;
+
+    /* Attempt dup of fd from fd set */
+    if (strstart(name, "/dev/fdset/", &fdset_id_str)) {
+        int64_t fdset_id;
+        int fd, dupfd;
+
+        fdset_id = qemu_parse_fdset(fdset_id_str);
+        if (fdset_id == -1) {
+            errno = EINVAL;
+            return -1;
+        }
+
+        fd = monitor_fdset_get_fd(fdset_id, flags);
+        if (fd == -1) {
+            return -1;
+        }
+
+        dupfd = qemu_dup(fd, flags);
+        if (dupfd == -1) {
+            return -1;
+        }
+
+        ret = monitor_fdset_dup_fd_add(fdset_id, dupfd);
+        if (ret == -1) {
+            return -1;
+        }
+
+        return dupfd;
+    }
+#endif
+
     if (flags & O_CREAT) {
         va_list ap;
 
@@ -109,6 +206,21 @@ int qemu_open(const char *name, int flags, ...)
 
 int qemu_close(int fd)
 {
+    int64_t fdset_id;
+
+    /* Close fd that was dup'd from an fdset */
+    fdset_id = monitor_fdset_dup_fd_find(fd);
+    if (fdset_id != -1) {
+        int ret;
+
+        ret = close(fd);
+        if (ret == 0) {
+            monitor_fdset_dup_fd_remove(fd);
+        }
+
+        return ret;
+    }
+
     return close(fd);
 }
 
diff --git a/qemu-common.h b/qemu-common.h
index e53126d..9becb32 100644
--- a/qemu-common.h
+++ b/qemu-common.h
@@ -166,6 +166,7 @@ int qemu_fls(int i);
 int qemu_fdatasync(int fd);
 int fcntl_setfl(int fd, int flag);
 int qemu_parse_fd(const char *param);
+int qemu_parse_fdset(const char *param);
 
 /*
  * strtosz() suffixes used to specify the default treatment of an
diff --git a/qemu-tool.c b/qemu-tool.c
index 318c5fc..b7622f5 100644
--- a/qemu-tool.c
+++ b/qemu-tool.c
@@ -57,6 +57,26 @@ void monitor_protocol_event(MonitorEvent event, QObject *data)
 {
 }
 
+int monitor_fdset_get_fd(int64_t fdset_id, int flags)
+{
+    return -1;
+}
+
+int monitor_fdset_dup_fd_add(int64_t fdset_id, int dup_fd)
+{
+    return -1;
+}
+
+int monitor_fdset_dup_fd_remove(int dup_fd)
+{
+    return -1;
+}
+
+int monitor_fdset_dup_fd_find(int dup_fd)
+{
+    return -1;
+}
+
 int64_t cpu_get_clock(void)
 {
     return qemu_get_clock_ns(rt_clock);
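
Note for readers: the access-mode matching that the commit message describes can be tried outside QEMU with plain POSIX calls. The sketch below is not part of the patch. The helpers find_fd_by_accmode() and dup_matching_fd() are hypothetical names made up for illustration, and a flat int array stands in for the monitor's fd set. It only shows the idea of comparing (flags & O_ACCMODE) against what fcntl(F_GETFL) reports and then handing back a dup, so that the original fd stays open in the set.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): return the first fd in
 * fds[] whose access mode matches the access mode requested in flags.
 * Returns -1 with errno set to EACCES if no fd matches, or with the
 * errno left by fcntl() if querying an fd fails.
 */
static int find_fd_by_accmode(const int *fds, size_t n, int flags)
{
    size_t i;

    assert(fds != NULL || n == 0);

    for (i = 0; i < n; i++) {
        int fd_flags = fcntl(fds[i], F_GETFL);

        if (fd_flags == -1) {
            return -1;
        }
        if ((flags & O_ACCMODE) == (fd_flags & O_ACCMODE)) {
            return fds[i];
        }
    }
    errno = EACCES;
    return -1;
}

/*
 * Hypothetical helper: dup the matching fd so that the caller owns an
 * independent descriptor while the original stays open in the set.
 */
static int dup_matching_fd(const int *fds, size_t n, int flags)
{
    int fd = find_fd_by_accmode(fds, n, flags);

    if (fd == -1) {
        return -1;
    }
    return dup(fd);
}
```

With the two ends of a pipe() as the set, a request for O_WRONLY would be expected to select the write end and a request for O_RDWR to fail with EACCES. The refcount bookkeeping and the flag fix-ups done by qemu_dup() in the patch are deliberately left out of this sketch.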