
Updating gdbstub to allow safe multithreading in usermode emulation

Message ID 20220528195344.4160516-1-ben@thefarbeyond.com
State New
Series Updating gdbstub to allow safe multithreading in usermode emulation

Commit Message

Ben Cohen May 28, 2022, 7:54 p.m. UTC
I was testing some multi-threaded code in QEMU's user-mode emulation and ran
into issues with the gdbstub: user-mode emulation spawns a new host thread
whenever the guest process creates a thread, but the gdbstub does not handle
these threads well. The current gdbstub keeps all of its state in a single
global struct, and multiple threads can write to that struct simultaneously,
corrupting gdb packets. Different threads can also try to read from the gdb
socket at the same time, so a single packet can end up divided between
threads. This patch adds a single separate thread for the usermode gdbstub
which handles all the gdb comms and avoids these multithreading issues.

To demonstrate that the multithreading was not working properly before and
that it hopefully works properly now, I wrote a small test program with some
gdb scripts, which can be found here:
http://url6163.thefarbeyond.com/ls/click?upn=dUoerjbXT3NK9PiRMHEMD5Fm1RmJf0eUWTDkIDLODGZSRXu94ynuiGwQ-2FCFcimw5rndUfJbozecGKoOE5zbdfRVgadbVSeCrgP5IlB2UqPU-3DUBT-_E8SO-2FEypfU855L0ybQoiQY4Xaj8Z6NYzBoBK89OH-2BiIJOE8-2BoeErelzsKKZyRONZN5M7Gzw0Zs4K0tnG4gxJSOWW89OLEjRW7ISHPWO2lT6fJJUM88-2BLL6sh4BfexcL-2FKt7KhnEpzqyb9bd5UZ-2FR2iPVIjp7zfshwPtjJEHAHaIGeNZbI4nFw81hhs0N1tt9sAcC1ALVazbgnC5E-2F5ZChA-3D-3D

Signed-off-by: Ben Cohen <ben@thefarbeyond.com>
---
 gdbstub.c              | 105 +++++++++++++++++++++++++++++++++++++++++
 include/exec/gdbstub.h |  13 +++++
 linux-user/signal.c    |   4 ++
 3 files changed, 122 insertions(+)

Comments

Peter Maydell June 24, 2022, 10:37 a.m. UTC | #1
On Sat, 28 May 2022 at 21:11, Ben Cohen <ben@thefarbeyond.com> wrote:
>
> I was testing some multi-threaded code in QEMU's user-mode emulation and ran
> into issues with the gdbstub: user-mode emulation spawns a new host thread
> whenever the guest process creates a thread, but the gdbstub does not handle
> these threads well. The current gdbstub keeps all of its state in a single
> global struct, and multiple threads can write to that struct simultaneously,
> corrupting gdb packets. Different threads can also try to read from the gdb
> socket at the same time, so a single packet can end up divided between
> threads. This patch adds a single separate thread for the usermode gdbstub
> which handles all the gdb comms and avoids these multithreading issues.
>
> To demonstrate that the multithreading was not working properly before and
> that it hopefully works properly now, I wrote a small test program with some
> gdb scripts, which can be found here:
> http://url6163.thefarbeyond.com/ls/click?upn=dUoerjbXT3NK9PiRMHEMD5Fm1RmJf0eUWTDkIDLODGZSRXu94ynuiGwQ-2FCFcimw5rndUfJbozecGKoOE5zbdfRVgadbVSeCrgP5IlB2UqPU-3DUBT-_E8SO-2FEypfU855L0ybQoiQY4Xaj8Z6NYzBoBK89OH-2BiIJOE8-2BoeErelzsKKZyRONZN5M7Gzw0Zs4K0tnG4gxJSOWW89OLEjRW7ISHPWO2lT6fJJUM88-2BLL6sh4BfexcL-2FKt7KhnEpzqyb9bd5UZ-2FR2iPVIjp7zfshwPtjJEHAHaIGeNZbI4nFw81hhs0N1tt9sAcC1ALVazbgnC5E-2F5ZChA-3D-3D

Thanks for this patch. I don't have time to do a full review of it,
but it sounds like you've definitely identified a bug. However,
I'm not sure this is the right way to fix it.

Is signal handling really the only place where we need to be more careful
about the gdbstub handling for a multithreaded guest program?
How about other bits of gdbstub activity?

Some more specific comments:

> +        /*
> +         * This mutex is first locked here to ensure that it is in a locked
> +         * state before the gdb_signal_handler_loop handles the next signal
> +         * and unlocks it.
> +         */
> +        qemu_mutex_lock(&signal_done_mutex);
> +        waiting_signal = &signal;
> +        /*
> +         * The thread locks this mutex again to wait until the
> +         * gdb_signal_handler_loop is finished handling the signal and has
> +         * unlocked the mutex.
> +         */
> +        qemu_mutex_lock(&signal_done_mutex);

This code is locking the same mutex twice in a row. That's not
guaranteed to do anything sensible, so it's not a valid thing to do.

> +        /*
> +         * Finally, unlock this mutex in preparation for the next call to
> +         * this function
> +         */
> +        qemu_mutex_unlock(&signal_done_mutex);
> +        sig = signal_result;
> +        if (!cpu->singlestep_enabled) {
> +            /*
> +             * If this thread is not stepping and is handling its signal
> +             * then it can always safely unlock this mutex.
> +             */
> +            qemu_mutex_unlock(&another_thread_is_stepping);
> +        } else {
> +            /*
> +             * If this thread was already stepping it will already be holding
> +             * this mutex so here try to lock instead of waiting on a lock.
> +             * This lock will prevent other non-stepping threads from handling
> +             * a signal until stepping is done.
> +             */
> +            qemu_mutex_trylock(&another_thread_is_stepping);
> +        }
> +    }
> +    /*
> +     * Unlock here because we are done handling the signal and
> +     * another thread can now start handling a pending signal.
> +     */
> +    qemu_mutex_unlock(&signal_wait_mutex);
> +    return sig;
> +}

I have not analysed the code in detail, but I get the impression that
maybe you're trying to use some of these mutexes to do jobs that
would be better done with a semaphore?
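
For illustration, a minimal sketch of what that hand-off might look like with
a QemuSemaphore from "qemu/thread.h" instead of the double-locked mutex
(signal_done_sem is a hypothetical name, not part of the patch):

/* sketch only: qemu_sem_init(&signal_done_sem, 0) at init time */
static QemuSemaphore signal_done_sem;

/* guest thread side, replacing the lock/lock/unlock sequence */
waiting_signal = &signal;
qemu_sem_wait(&signal_done_sem);    /* blocks until the handler posts */
sig = signal_result;

/* gdb_signal_handler_loop() side */
signal_result = gdb_handlesig(waiting_signal->cpu, waiting_signal->sig);
waiting_signal = NULL;
qemu_sem_post(&signal_done_sem);    /* wake the waiting guest thread */

With a semaphore the "post" is remembered even if it happens before the wait,
so there is no need to pre-lock anything to get the ordering right.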

> diff --git a/include/exec/gdbstub.h b/include/exec/gdbstub.h
> index c35d7334b4..15bfb76cca 100644
> --- a/include/exec/gdbstub.h
> +++ b/include/exec/gdbstub.h
> @@ -47,6 +47,19 @@ void gdb_do_syscallv(gdb_syscall_complete_cb cb, const char *fmt, va_list va);
>  int use_gdb_syscalls(void);
>
>  #ifdef CONFIG_USER_ONLY
> +/**
> + * gdb_thread_handle_signal
> + * @cpu_env: The guest thread's cpu env
> + * @sig: The signal being handled for the guest thread
> + *
> + * This function is a layer in between the gdb_handlesig function and the
> + * guest cpu threads. Instead of directly handling signals in the guest
> + * threads, this function passes off a signal to a handler loop thread running
> + * in the gdbstub that will handle each thread's signal atomically to avoid
> + * having races between threads to read and send data on the gdb socket. The
> + * function returns the signal value from gdb_handlesig
> + */
> +int gdb_thread_handle_signal(CPUState *cpu_env, int sig);
>  /**
>   * gdb_handlesig: yield control to gdb
>   * @cpu: CPU
> diff --git a/linux-user/signal.c b/linux-user/signal.c
> index 8d29bfaa6b..a252791217 100644
> --- a/linux-user/signal.c
> +++ b/linux-user/signal.c
> @@ -1068,7 +1068,11 @@ static void handle_pending_signal(CPUArchState *cpu_env, int sig,
>      /* dequeue signal */
>      k->pending = 0;
>
> +#ifdef CONFIG_USER_ONLY
> +    sig = gdb_thread_handle_signal(cpu, sig);
> +#else
>      sig = gdb_handlesig(cpu, sig);
> +#endif

You might as well have left the function name alone and changed
the implementation of it, which would avoid changing this file.
In particular, for linux-user/ code, as the name suggests,
CONFIG_USER_ONLY will always be set, so the #else here is
unnecessary. Plus you would need to change bsd-user/ if you
change the function name.
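
For example (a sketch of the suggested shape only; "gdb_handlesig_locked" is
a hypothetical name, not something in the tree), the hand-off could sit
behind the existing entry point:

/* gdbstub.c -- sketch only */
/* the current gdb_handlesig() implementation, renamed */
static int gdb_handlesig_locked(CPUState *cpu, int sig);

int gdb_handlesig(CPUState *cpu, int sig)
{
    /* queue the signal for the dedicated gdbstub thread and wait for it */
    return gdb_thread_handle_signal(cpu, sig);
}

with gdb_signal_handler_loop() calling gdb_handlesig_locked() instead, so the
callers in linux-user/ and bsd-user/ stay untouched.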

thanks
-- PMM

Patch

diff --git a/gdbstub.c b/gdbstub.c
index a3ff8702ce..11a76da575 100644
--- a/gdbstub.c
+++ b/gdbstub.c
@@ -29,6 +29,7 @@ 
 #include "qemu/ctype.h"
 #include "qemu/cutils.h"
 #include "qemu/module.h"
+#include "qemu/thread.h"
 #include "trace/trace-root.h"
 #include "exec/gdbstub.h"
 #ifdef CONFIG_USER_ONLY
@@ -370,8 +371,103 @@  typedef struct GDBState {
     int sstep_flags;
     int supported_sstep_flags;
 } GDBState;
+typedef struct GDBSignal {
+    CPUState *cpu;
+    int sig;
+} GDBSignal;
 
 static GDBState gdbserver_state;
+#ifdef CONFIG_USER_ONLY
+static GDBSignal *waiting_signal;
+static QemuMutex signal_wait_mutex;
+static QemuMutex signal_done_mutex;
+static QemuMutex another_thread_is_stepping;
+static int signal_result;
+#endif
+
+static void *gdb_signal_handler_loop(void* args)
+{
+    while (TRUE) {
+        if (NULL != waiting_signal) {
+            signal_result = gdb_handlesig(waiting_signal->cpu,
+                    waiting_signal->sig);
+            waiting_signal = NULL;
+            qemu_mutex_unlock(&signal_done_mutex);
+        }
+    }
+    /* Should never return */
+    return NULL;
+}
+
+int gdb_thread_handle_signal(CPUState *cpu, int sig)
+{
+    GDBSignal signal = {
+        .cpu = cpu,
+        .sig = sig
+    };
+    if (!cpu->singlestep_enabled) {
+        /*
+         * This mutex is locked by all threads that are not stepping to allow
+         * only the thread that is currently stepping to handle signals until
+         * it has finished stepping. Without this, pending signals that are
+         * queued up while the stepping thread handles its signal will race
+         * with the stepping thread to get the next signal handled. This is
+         * bad, because the gdb client expects the first response it receives
+         * to be the reply to the step command it sent.
+         */
+        qemu_mutex_lock(&another_thread_is_stepping);
+    }
+    /*
+     * This mutex is locked to allow only one thread at a time to be
+     * handling a signal. This is necessary, because otherwise multiple
+     * threads will try to talk to the gdb client simultaneously and can
+     * race each other sending and receiving packets, potentially going
+     * out of order or simultaneously reading off of the same socket.
+     */
+    qemu_mutex_lock(&signal_wait_mutex);
+    {
+        /*
+         * This mutex is first locked here to ensure that it is in a locked
+         * state before the gdb_signal_handler_loop handles the next signal
+         * and unlocks it.
+         */
+        qemu_mutex_lock(&signal_done_mutex);
+        waiting_signal = &signal;
+        /*
+         * The thread locks this mutex again to wait until the
+         * gdb_signal_handler_loop is finished handling the signal and has
+         * unlocked the mutex.
+         */
+        qemu_mutex_lock(&signal_done_mutex);
+        /*
+         * Finally, unlock this mutex in preparation for the next call to
+         * this function
+         */
+        qemu_mutex_unlock(&signal_done_mutex);
+        sig = signal_result;
+        if (!cpu->singlestep_enabled) {
+            /*
+             * If this thread is not stepping and is handling its signal
+             * then it can always safely unlock this mutex.
+             */
+            qemu_mutex_unlock(&another_thread_is_stepping);
+        } else {
+            /*
+             * If this thread was already stepping it will already be holding
+             * this mutex so here try to lock instead of waiting on a lock.
+             * This lock will prevent other non-stepping threads from handling
+             * a signal until stepping is done.
+             */
+            qemu_mutex_trylock(&another_thread_is_stepping);
+        }
+    }
+    /*
+     * Unlock here because we are done handling the signal and
+     * another thread can now start handling a pending signal.
+     */
+    qemu_mutex_unlock(&signal_wait_mutex);
+    return sig;
+}
 
 static void init_gdbserver_state(void)
 {
@@ -381,6 +477,15 @@  static void init_gdbserver_state(void)
     gdbserver_state.str_buf = g_string_new(NULL);
     gdbserver_state.mem_buf = g_byte_array_sized_new(MAX_PACKET_LENGTH);
     gdbserver_state.last_packet = g_byte_array_sized_new(MAX_PACKET_LENGTH + 4);
+#ifdef CONFIG_USER_ONLY
+    waiting_signal = NULL;
+    qemu_mutex_init(&signal_wait_mutex);
+    qemu_mutex_init(&signal_done_mutex);
+    qemu_mutex_init(&another_thread_is_stepping);
+    pthread_t signal_thread;
+    pthread_create(&signal_thread, NULL, gdb_signal_handler_loop, NULL);
+    pthread_setname_np(signal_thread, "gdbstub_handler");
+#endif
 
     /*
      * In replay mode all events will come from the log and can't be
diff --git a/include/exec/gdbstub.h b/include/exec/gdbstub.h
index c35d7334b4..15bfb76cca 100644
--- a/include/exec/gdbstub.h
+++ b/include/exec/gdbstub.h
@@ -47,6 +47,19 @@  void gdb_do_syscallv(gdb_syscall_complete_cb cb, const char *fmt, va_list va);
 int use_gdb_syscalls(void);
 
 #ifdef CONFIG_USER_ONLY
+/**
+ * gdb_thread_handle_signal
+ * @cpu_env: The guest thread's cpu env
+ * @sig: The signal being handled for the guest thread
+ *
+ * This function is a layer in between the gdb_handlesig function and the
+ * guest cpu threads. Instead of directly handling signals in the guest
+ * threads, this function passes off a signal to a handler loop thread running
+ * in the gdbstub that will handle each thread's signal atomically to avoid
+ * having races between threads to read and send data on the gdb socket. The
+ * function returns the signal value from gdb_handlesig
+ */
+int gdb_thread_handle_signal(CPUState *cpu_env, int sig);
 /**
  * gdb_handlesig: yield control to gdb
  * @cpu: CPU
diff --git a/linux-user/signal.c b/linux-user/signal.c
index 8d29bfaa6b..a252791217 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -1068,7 +1068,11 @@  static void handle_pending_signal(CPUArchState *cpu_env, int sig,
     /* dequeue signal */
     k->pending = 0;
 
+#ifdef CONFIG_USER_ONLY
+    sig = gdb_thread_handle_signal(cpu, sig);
+#else
     sig = gdb_handlesig(cpu, sig);
+#endif
     if (!sig) {
         sa = NULL;
         handler = TARGET_SIG_IGN;