diff mbox

[4/9] qemu-ga: use key-value store to avoid recycling fd handles after restart

Message ID 1363049600-29456-5-git-send-email-mdroth@linux.vnet.ibm.com
State New
Headers show

Commit Message

Michael Roth March 12, 2013, 12:53 a.m. UTC
Hosts hold on to handles provided by guest-file-open for periods that can
span beyond the life of the qemu-ga process that issued them. Since these
are issued starting from 0 on every restart, we run the risk of issuing
duplicate handles after restarts/reboots.

As a result, users with a stale copy of these handles may end up
reading/writing corrupted data due to their existing handles effectively
being re-assigned to an unexpected file or offset.

We unfortunately do not issue handles as strings, but as integers, so a
solution such as using UUIDs can't be implemented without introducing a
new interface.

As a workaround, we fix this by implementing a persistent key-value store
that will be used to track the value of the last handle that was issued
across restarts/reboots to avoid issuing duplicates.

The store is automatically written to the same directory we currently
set via --statedir to track fsfreeze state, and so should be applicable
for stable releases where this flag is supported.

A follow-up can use this same store for handling fsfreeze state, but
that change is cosmetic and left out for now.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Cc: qemu-stable@nongnu.org

* fixed guest_file_handle_add() return value from uint64_t to int64_t
---
 qga/commands-posix.c   |   25 +++++--
 qga/guest-agent-core.h |    1 +
 qga/main.c             |  184 ++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 204 insertions(+), 6 deletions(-)

Comments

Peter Crosthwaite March 15, 2013, 3:31 a.m. UTC | #1
Hi Michael, Anthony,

On Tue, Mar 12, 2013 at 10:53 AM, Michael Roth
<mdroth@linux.vnet.ibm.com> wrote:
> Hosts hold on to handles provided by guest-file-open for periods that can
> span beyond the life of the qemu-ga process that issued them. Since these
> are issued starting from 0 on every restart, we run the risk of issuing
> duplicate handles after restarts/reboots.
>
> As a result, users with a stale copy of these handles may end up
> reading/writing corrupted data due to their existing handles effectively
> being re-assigned to an unexpected file or offset.
>
> We unfortunately do not issue handles as strings, but as integers, so a
> solution such as using UUIDs can't be implemented without introducing a
> new interface.
>
> As a workaround, we fix this by implementing a persistent key-value store
> that will be used to track the value of the last handle that was issued
> across restarts/reboots to avoid issuing duplicates.
>
> The store is automatically written to the same directory we currently
> set via --statedir to track fsfreeze state, and so should be applicable
> for stable releases where this flag is supported.
>
> A follow-up can use this same store for handling fsfreeze state, but
> that change is cosmetic and left out for now.
>
> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> Cc: qemu-stable@nongnu.org
>
> * fixed guest_file_handle_add() return value from uint64_t to int64_t
> ---
>  qga/commands-posix.c   |   25 +++++--
>  qga/guest-agent-core.h |    1 +
>  qga/main.c             |  184 ++++++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 204 insertions(+), 6 deletions(-)
>
> diff --git a/qga/commands-posix.c b/qga/commands-posix.c
> index 7a0202e..1c2aff3 100644
> --- a/qga/commands-posix.c
> +++ b/qga/commands-posix.c
[snip]
> +    g_assert(pstate);
> +    g_assert(keyfile);
> +    /* if any fields are missing, either because the file was tampered with
> +     * by agents of chaos, or because the field wasn't present at the time the
> +     * file was created, the best we can ever do is start over with the default
> +     * values. so load them now, and ignore any errors in accessing key-value
> +     * pairs
> +     */
> +    set_persistent_state_defaults(pstate);
> +
> +    if (g_key_file_has_key(keyfile, "global", "fd_counter", NULL)) {
> +        pstate->fd_counter =
> +            g_key_file_get_int64(keyfile, "global", "fd_counter", NULL);

This function (and friends) doesn't exist in glib pre v2.26, breaking
the build for a few distros (RHEL 5.6 and Ubuntu 10.04 are two that I
am building with). Are we mandating glib > 2.26 as of this series (and
should configure be updated) or do we need an alternate implementation
of this code?

Regards,
peter
Michael Roth March 15, 2013, 3:52 a.m. UTC | #2
On Thu, Mar 14, 2013 at 10:31 PM, Peter Crosthwaite
<peter.crosthwaite@xilinx.com> wrote:
> Hi Michael, Anthony,
>
> On Tue, Mar 12, 2013 at 10:53 AM, Michael Roth
> <mdroth@linux.vnet.ibm.com> wrote:
>> Hosts hold on to handles provided by guest-file-open for periods that can
>> span beyond the life of the qemu-ga process that issued them. Since these
>> are issued starting from 0 on every restart, we run the risk of issuing
>> duplicate handles after restarts/reboots.
>>
>> As a result, users with a stale copy of these handles may end up
>> reading/writing corrupted data due to their existing handles effectively
>> being re-assigned to an unexpected file or offset.
>>
>> We unfortunately do not issue handles as strings, but as integers, so a
>> solution such as using UUIDs can't be implemented without introducing a
>> new interface.
>>
>> As a workaround, we fix this by implementing a persistent key-value store
>> that will be used to track the value of the last handle that was issued
>> across restarts/reboots to avoid issuing duplicates.
>>
>> The store is automatically written to the same directory we currently
>> set via --statedir to track fsfreeze state, and so should be applicable
>> for stable releases where this flag is supported.
>>
>> A follow-up can use this same store for handling fsfreeze state, but
>> that change is cosmetic and left out for now.
>>
>> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
>> Cc: qemu-stable@nongnu.org
>>
>> * fixed guest_file_handle_add() return value from uint64_t to int64_t
>> ---
>>  qga/commands-posix.c   |   25 +++++--
>>  qga/guest-agent-core.h |    1 +
>>  qga/main.c             |  184 ++++++++++++++++++++++++++++++++++++++++++++++++
>>  3 files changed, 204 insertions(+), 6 deletions(-)
>>
>> diff --git a/qga/commands-posix.c b/qga/commands-posix.c
>> index 7a0202e..1c2aff3 100644
>> --- a/qga/commands-posix.c
>> +++ b/qga/commands-posix.c
> [snip]
>> +    g_assert(pstate);
>> +    g_assert(keyfile);
>> +    /* if any fields are missing, either because the file was tampered with
>> +     * by agents of chaos, or because the field wasn't present at the time the
>> +     * file was created, the best we can ever do is start over with the default
>> +     * values. so load them now, and ignore any errors in accessing key-value
>> +     * pairs
>> +     */
>> +    set_persistent_state_defaults(pstate);
>> +
>> +    if (g_key_file_has_key(keyfile, "global", "fd_counter", NULL)) {
>> +        pstate->fd_counter =
>> +            g_key_file_get_int64(keyfile, "global", "fd_counter", NULL);
>
> This function (and friends) doesn't exist in glib pre v2.26, breaking
> the build for a few distros (RHEL 5.6 and Ubuntu 10.04 are two that I
> am building with). Are we mandating glib > 2.26 as of this series (and
> should configure be updated) or do we need an alternate implementation
> of this code?

Sigh...that certainly wasn't the intention, and something I specifically tried
to avoid doing. GKeyFile was added in 2.6 according to the documentation:

https://developer.gnome.org/glib/2.31/glib-Key-value-file-parser.html

It seems that it may specifically be g_key_file_get_int64() which is
causing the 2.26 dependency.

Would you mind testing with get_int64/set_int64 swapped out for
get_integer/set_integer? If that does the trick I can send an updated
patch tomorrow.

>
> Regards,
> peter
>
Peter Crosthwaite March 15, 2013, 5:09 a.m. UTC | #3
On Fri, Mar 15, 2013 at 1:52 PM, Michael Roth <mdroth@linux.vnet.ibm.com> wrote:
> On Thu, Mar 14, 2013 at 10:31 PM, Peter Crosthwaite
> <peter.crosthwaite@xilinx.com> wrote:
>> Hi Michael, Anthony,
>>
>> On Tue, Mar 12, 2013 at 10:53 AM, Michael Roth
>> <mdroth@linux.vnet.ibm.com> wrote:
>>> Hosts hold on to handles provided by guest-file-open for periods that can
>>> span beyond the life of the qemu-ga process that issued them. Since these
>>> are issued starting from 0 on every restart, we run the risk of issuing
>>> duplicate handles after restarts/reboots.
>>>
>>> As a result, users with a stale copy of these handles may end up
>>> reading/writing corrupted data due to their existing handles effectively
>>> being re-assigned to an unexpected file or offset.
>>>
>>> We unfortunately do not issue handles as strings, but as integers, so a
>>> solution such as using UUIDs can't be implemented without introducing a
>>> new interface.
>>>
>>> As a workaround, we fix this by implementing a persistent key-value store
>>> that will be used to track the value of the last handle that was issued
>>> across restarts/reboots to avoid issuing duplicates.
>>>
>>> The store is automatically written to the same directory we currently
>>> set via --statedir to track fsfreeze state, and so should be applicable
>>> for stable releases where this flag is supported.
>>>
>>> A follow-up can use this same store for handling fsfreeze state, but
>>> that change is cosmetic and left out for now.
>>>
>>> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
>>> Cc: qemu-stable@nongnu.org
>>>
>>> * fixed guest_file_handle_add() return value from uint64_t to int64_t
>>> ---
>>>  qga/commands-posix.c   |   25 +++++--
>>>  qga/guest-agent-core.h |    1 +
>>>  qga/main.c             |  184 ++++++++++++++++++++++++++++++++++++++++++++++++
>>>  3 files changed, 204 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/qga/commands-posix.c b/qga/commands-posix.c
>>> index 7a0202e..1c2aff3 100644
>>> --- a/qga/commands-posix.c
>>> +++ b/qga/commands-posix.c
>> [snip]
>>> +    g_assert(pstate);
>>> +    g_assert(keyfile);
>>> +    /* if any fields are missing, either because the file was tampered with
>>> +     * by agents of chaos, or because the field wasn't present at the time the
>>> +     * file was created, the best we can ever do is start over with the default
>>> +     * values. so load them now, and ignore any errors in accessing key-value
>>> +     * pairs
>>> +     */
>>> +    set_persistent_state_defaults(pstate);
>>> +
>>> +    if (g_key_file_has_key(keyfile, "global", "fd_counter", NULL)) {
>>> +        pstate->fd_counter =
>>> +            g_key_file_get_int64(keyfile, "global", "fd_counter", NULL);
>>
>> This function (and friends) doesn't exist in glib pre v2.26, breaking
>> the build for a few distros (RHEL 5.6 and Ubuntu 10.04 are two that I
>> am building with). Are we mandating glib > 2.26 as of this series (and
>> should configure be updated) or do we need an alternate implementation
>> of this code?
>
> Sigh...that certainly wasn't the intention, and something I specifically tried
> to avoid doing. GKeyFile was added in 2.6 according to the documentation:
>
> https://developer.gnome.org/glib/2.31/glib-Key-value-file-parser.html
>
> It seems that it may specifically be g_key_file_get_int64() which is
> causing the 2.26 dependency.
>
> Would you mind testing with get_int64/set_int64 swapped out for
> get_integer/set_integer? If that does the trick I can send an updated
> patch tomorrow.
>

That did the trick,

Patch up on list if you want to sign it off or suggested-by it.

Regards,
Peter


>>
>> Regards,
>> peter
>>
>
diff mbox

Patch

diff --git a/qga/commands-posix.c b/qga/commands-posix.c
index 7a0202e..1c2aff3 100644
--- a/qga/commands-posix.c
+++ b/qga/commands-posix.c
@@ -129,14 +129,22 @@  static struct {
     QTAILQ_HEAD(, GuestFileHandle) filehandles;
 } guest_file_state;
 
-static void guest_file_handle_add(FILE *fh)
+static int64_t guest_file_handle_add(FILE *fh, Error **errp)
 {
     GuestFileHandle *gfh;
+    int64_t handle;
+
+    handle = ga_get_fd_handle(ga_state, errp);
+    if (error_is_set(errp)) {
+        return 0;
+    }
 
     gfh = g_malloc0(sizeof(GuestFileHandle));
-    gfh->id = fileno(fh);
+    gfh->id = handle;
     gfh->fh = fh;
     QTAILQ_INSERT_TAIL(&guest_file_state.filehandles, gfh, next);
+
+    return handle;
 }
 
 static GuestFileHandle *guest_file_handle_find(int64_t id, Error **err)
@@ -158,7 +166,7 @@  int64_t qmp_guest_file_open(const char *path, bool has_mode, const char *mode, E
 {
     FILE *fh;
     int fd;
-    int64_t ret = -1;
+    int64_t ret = -1, handle;
 
     if (!has_mode) {
         mode = "r";
@@ -184,9 +192,14 @@  int64_t qmp_guest_file_open(const char *path, bool has_mode, const char *mode, E
         return -1;
     }
 
-    guest_file_handle_add(fh);
-    slog("guest-file-open, handle: %d", fd);
-    return fd;
+    handle = guest_file_handle_add(fh, err);
+    if (error_is_set(err)) {
+        fclose(fh);
+        return -1;
+    }
+
+    slog("guest-file-open, handle: %d", handle);
+    return handle;
 }
 
 void qmp_guest_file_close(int64_t handle, Error **err)
diff --git a/qga/guest-agent-core.h b/qga/guest-agent-core.h
index 3354598..624a559 100644
--- a/qga/guest-agent-core.h
+++ b/qga/guest-agent-core.h
@@ -35,6 +35,7 @@  bool ga_is_frozen(GAState *s);
 void ga_set_frozen(GAState *s);
 void ga_unset_frozen(GAState *s);
 const char *ga_fsfreeze_hook(GAState *s);
+int64_t ga_get_fd_handle(GAState *s, Error **errp);
 
 #ifndef _WIN32
 void reopen_fd_to_null(int fd);
diff --git a/qga/main.c b/qga/main.c
index ad75171..99346e1 100644
--- a/qga/main.c
+++ b/qga/main.c
@@ -15,6 +15,7 @@ 
 #include <stdbool.h>
 #include <glib.h>
 #include <getopt.h>
+#include <glib/gstdio.h>
 #ifndef _WIN32
 #include <syslog.h>
 #include <sys/wait.h>
@@ -30,6 +31,7 @@ 
 #include "qapi/qmp/qerror.h"
 #include "qapi/qmp/dispatch.h"
 #include "qga/channel.h"
+#include "qemu/bswap.h"
 #ifdef _WIN32
 #include "qga/service-win32.h"
 #include <windows.h>
@@ -53,6 +55,11 @@ 
 #endif
 #define QGA_SENTINEL_BYTE 0xFF
 
+typedef struct GAPersistentState {
+#define QGA_PSTATE_DEFAULT_FD_COUNTER 1000
+    int64_t fd_counter;
+} GAPersistentState;
+
 struct GAState {
     JSONMessageParser parser;
     GMainLoop *main_loop;
@@ -76,6 +83,8 @@  struct GAState {
 #ifdef CONFIG_FSFREEZE
     const char *fsfreeze_hook;
 #endif
+    const gchar *pstate_filepath;
+    GAPersistentState pstate;
 };
 
 struct GAState *ga_state;
@@ -725,6 +734,171 @@  VOID WINAPI service_main(DWORD argc, TCHAR *argv[])
 }
 #endif
 
+static void set_persistent_state_defaults(GAPersistentState *pstate)
+{
+    g_assert(pstate);
+    pstate->fd_counter = QGA_PSTATE_DEFAULT_FD_COUNTER;
+}
+
+static void persistent_state_from_keyfile(GAPersistentState *pstate,
+                                          GKeyFile *keyfile)
+{
+    g_assert(pstate);
+    g_assert(keyfile);
+    /* if any fields are missing, either because the file was tampered with
+     * by agents of chaos, or because the field wasn't present at the time the
+     * file was created, the best we can ever do is start over with the default
+     * values. so load them now, and ignore any errors in accessing key-value
+     * pairs
+     */
+    set_persistent_state_defaults(pstate);
+
+    if (g_key_file_has_key(keyfile, "global", "fd_counter", NULL)) {
+        pstate->fd_counter =
+            g_key_file_get_int64(keyfile, "global", "fd_counter", NULL);
+    }
+}
+
+static void persistent_state_to_keyfile(const GAPersistentState *pstate,
+                                        GKeyFile *keyfile)
+{
+    g_assert(pstate);
+    g_assert(keyfile);
+
+    g_key_file_set_int64(keyfile, "global", "fd_counter", pstate->fd_counter);
+}
+
+static gboolean write_persistent_state(const GAPersistentState *pstate,
+                                       const gchar *path)
+{
+    GKeyFile *keyfile = g_key_file_new();
+    GError *gerr = NULL;
+    gboolean ret = true;
+    gchar *data = NULL;
+    gsize data_len;
+
+    g_assert(pstate);
+
+    persistent_state_to_keyfile(pstate, keyfile);
+    data = g_key_file_to_data(keyfile, &data_len, &gerr);
+    if (gerr) {
+        g_critical("failed to convert persistent state to string: %s",
+                   gerr->message);
+        ret = false;
+        goto out;
+    }
+
+    g_file_set_contents(path, data, data_len, &gerr);
+    if (gerr) {
+        g_critical("failed to write persistent state to %s: %s",
+                    path, gerr->message);
+        ret = false;
+        goto out;
+    }
+
+out:
+    if (gerr) {
+        g_error_free(gerr);
+    }
+    if (keyfile) {
+        g_key_file_free(keyfile);
+    }
+    g_free(data);
+    return ret;
+}
+
+static gboolean read_persistent_state(GAPersistentState *pstate,
+                                      const gchar *path, gboolean frozen)
+{
+    GKeyFile *keyfile = NULL;
+    GError *gerr = NULL;
+    struct stat st;
+    gboolean ret = true;
+
+    g_assert(pstate);
+
+    if (stat(path, &st) == -1) {
+        /* it's okay if state file doesn't exist, but any other error
+         * indicates a permissions issue or some other misconfiguration
+         * that we likely won't be able to recover from.
+         */
+        if (errno != ENOENT) {
+            g_critical("unable to access state file at path %s: %s",
+                       path, strerror(errno));
+            ret = false;
+            goto out;
+        }
+
+        /* file doesn't exist. initialize state to default values and
+         * attempt to save now. (we could wait till later when we have
+         * modified state we need to commit, but if there's a problem,
+         * such as a missing parent directory, we want to catch it now)
+         *
+         * there is a potential scenario where someone either managed to
+         * update the agent from a version that didn't use a key store
+         * while qemu-ga thought the filesystem was frozen, or
+         * deleted the key store prior to issuing a fsfreeze, prior
+         * to restarting the agent. in this case we go ahead and defer
+         * initial creation till we actually have modified state to
+         * write, otherwise fail to recover from freeze.
+         */
+        set_persistent_state_defaults(pstate);
+        if (!frozen) {
+            ret = write_persistent_state(pstate, path);
+            if (!ret) {
+                g_critical("unable to create state file at path %s", path);
+                ret = false;
+                goto out;
+            }
+        }
+        ret = true;
+        goto out;
+    }
+
+    keyfile = g_key_file_new();
+    g_key_file_load_from_file(keyfile, path, 0, &gerr);
+    if (gerr) {
+        g_critical("error loading persistent state from path: %s, %s",
+                   path, gerr->message);
+        ret = false;
+        goto out;
+    }
+
+    persistent_state_from_keyfile(pstate, keyfile);
+
+out:
+    if (keyfile) {
+        g_key_file_free(keyfile);
+    }
+    if (gerr) {
+        g_error_free(gerr);
+    }
+
+    return ret;
+}
+
+int64_t ga_get_fd_handle(GAState *s, Error **errp)
+{
+    int64_t handle;
+
+    g_assert(s->pstate_filepath);
+    /* we blacklist commands and avoid operations that potentially require
+     * writing to disk when we're in a frozen state. this includes opening
+     * new files, so we should never get here in that situation
+     */
+    g_assert(!ga_is_frozen(s));
+
+    handle = s->pstate.fd_counter++;
+    if (s->pstate.fd_counter < 0) {
+        s->pstate.fd_counter = 0;
+    }
+    if (!write_persistent_state(&s->pstate, s->pstate_filepath)) {
+        error_setg(errp, "failed to commit persistent state to disk");
+    }
+
+    return handle;
+}
+
 int main(int argc, char **argv)
 {
     const char *sopt = "hVvdm:p:l:f:F::b:s:t:";
@@ -854,7 +1028,9 @@  int main(int argc, char **argv)
     ga_enable_logging(s);
     s->state_filepath_isfrozen = g_strdup_printf("%s/qga.state.isfrozen",
                                                  state_dir);
+    s->pstate_filepath = g_strdup_printf("%s/qga.state", state_dir);
     s->frozen = false;
+
 #ifndef _WIN32
     /* check if a previous instance of qemu-ga exited with filesystems' state
      * marked as frozen. this could be a stale value (a non-qemu-ga process
@@ -911,6 +1087,14 @@  int main(int argc, char **argv)
         }
     }
 
+    /* load persistent state from disk */
+    if (!read_persistent_state(&s->pstate,
+                               s->pstate_filepath,
+                               ga_is_frozen(s))) {
+        g_critical("failed to load persistent state");
+        goto out_bad;
+    }
+
     if (blacklist) {
         s->blacklist = blacklist;
         do {