diff mbox series

[v2,1/1] vl.c: do not execute trace_init_backends() before daemonizing

Message ID 20210105181437.538366-2-danielhb413@gmail.com
State New
Headers show
Series vl.c: fix trace backend init ordering | expand

Commit Message

Daniel Henrique Barboza Jan. 5, 2021, 6:14 p.m. UTC
Commit v5.2.0-190-g0546c0609c ("vl: split various early command line
options to a separate function") moved the trace backend init code to
the qemu_process_early_options(). Which is now being called before
os_daemonize() via qemu_maybe_daemonize().

Turns out that this change of order causes a problem when executing
QEMU in daemon mode and with CONFIG_TRACE_SIMPLE. The trace thread
is now being created by the parent, and the parent is left waiting for
a trace file flush that was registered via st_init(). The result is
that the parent process never exits.

To reproduce, fire up a QEMU process with -daemonize and with
CONFIG_TRACE_SIMPLE enabled. Two QEMU process will be left in the
host:

$ sudo ./x86_64-softmmu/qemu-system-x86_64 -S -no-user-config -nodefaults \
  -nographic -machine none,accel=kvm:tcg -daemonize

$ ps axf | grep qemu
 529710 pts/3    S+     0:00  |       \_ grep --color=auto qemu
 529697 ?        Ssl    0:00  \_ ./x86_64-softmmu/qemu-system-x86_64 -S -no-user-config -nodefaults -nographic -machine none,accel=kvm:tcg -daemonize
 529699 ?        Sl     0:00      \_ ./x86_64-softmmu/qemu-system-x86_64 -S -no-user-config -nodefaults -nographic -machine none,accel=kvm:tcg -daemonize

The parent thread is hang in flush_trace_file:

$ sudo gdb ./x86_64-softmmu/qemu-system-x86_64 529697
(..)
(gdb) bt
 #0  0x00007f9dac6a137d in syscall () at /lib64/libc.so.6
 #1  0x00007f9dacc3c4f3 in g_cond_wait () at /lib64/libglib-2.0.so.0
 #2  0x0000555d12f952da in flush_trace_file (wait=true) at ../trace/simple.c:140
 #3  0x0000555d12f95b4c in st_flush_trace_buffer () at ../trace/simple.c:383
 #4  0x00007f9dac5e43a7 in __run_exit_handlers () at /lib64/libc.so.6
 #5  0x00007f9dac5e4550 in on_exit () at /lib64/libc.so.6
 #6  0x0000555d12d454de in os_daemonize () at ../os-posix.c:255
 #7  0x0000555d12d0bd5c in qemu_maybe_daemonize (pid_file=0x0) at ../softmmu/vl.c:2408
 #8  0x0000555d12d0e566 in qemu_init (argc=8, argv=0x7fffc594d9b8, envp=0x7fffc594da00) at ../softmmu/vl.c:3459
 #9  0x0000555d128edac1 in main (argc=8, argv=0x7fffc594d9b8, envp=0x7fffc594da00) at ../softmmu/main.c:49
(gdb)

Aside from the 'zombie' process in the host, this is directly impacting
Libvirt. Libvirt waits for the parent process to exit to be sure that the
QMP monitor is available in the daemonized process to fetch QEMU
capabilities, and as is now Libvirt hangs at daemon start waiting
for the parent thread to exit.

The fix is simple: just move the trace backend related code back to
be executed after daemonizing.

Fixes: 0546c0609cb5a8d90c1cbac8e0d64b5a048bbb19
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
---
 softmmu/vl.c | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

Comments

Stefan Hajnoczi Jan. 6, 2021, 4:59 p.m. UTC | #1
On Tue, Jan 05, 2021 at 03:14:37PM -0300, Daniel Henrique Barboza wrote:
> Commit v5.2.0-190-g0546c0609c ("vl: split various early command line
> options to a separate function") moved the trace backend init code to
> the qemu_process_early_options(). Which is now being called before
> os_daemonize() via qemu_maybe_daemonize().
> 
> Turns out that this change of order causes a problem when executing
> QEMU in daemon mode and with CONFIG_TRACE_SIMPLE. The trace thread
> is now being created by the parent, and the parent is left waiting for
> a trace file flush that was registered via st_init(). The result is
> that the parent process never exits.
> 
> To reproduce, fire up a QEMU process with -daemonize and with
> CONFIG_TRACE_SIMPLE enabled. Two QEMU process will be left in the
> host:
> 
> $ sudo ./x86_64-softmmu/qemu-system-x86_64 -S -no-user-config -nodefaults \
>   -nographic -machine none,accel=kvm:tcg -daemonize
> 
> $ ps axf | grep qemu
>  529710 pts/3    S+     0:00  |       \_ grep --color=auto qemu
>  529697 ?        Ssl    0:00  \_ ./x86_64-softmmu/qemu-system-x86_64 -S -no-user-config -nodefaults -nographic -machine none,accel=kvm:tcg -daemonize
>  529699 ?        Sl     0:00      \_ ./x86_64-softmmu/qemu-system-x86_64 -S -no-user-config -nodefaults -nographic -machine none,accel=kvm:tcg -daemonize
> 
> The parent thread is hang in flush_trace_file:
> 
> $ sudo gdb ./x86_64-softmmu/qemu-system-x86_64 529697
> (..)
> (gdb) bt
>  #0  0x00007f9dac6a137d in syscall () at /lib64/libc.so.6
>  #1  0x00007f9dacc3c4f3 in g_cond_wait () at /lib64/libglib-2.0.so.0
>  #2  0x0000555d12f952da in flush_trace_file (wait=true) at ../trace/simple.c:140
>  #3  0x0000555d12f95b4c in st_flush_trace_buffer () at ../trace/simple.c:383
>  #4  0x00007f9dac5e43a7 in __run_exit_handlers () at /lib64/libc.so.6
>  #5  0x00007f9dac5e4550 in on_exit () at /lib64/libc.so.6
>  #6  0x0000555d12d454de in os_daemonize () at ../os-posix.c:255
>  #7  0x0000555d12d0bd5c in qemu_maybe_daemonize (pid_file=0x0) at ../softmmu/vl.c:2408
>  #8  0x0000555d12d0e566 in qemu_init (argc=8, argv=0x7fffc594d9b8, envp=0x7fffc594da00) at ../softmmu/vl.c:3459
>  #9  0x0000555d128edac1 in main (argc=8, argv=0x7fffc594d9b8, envp=0x7fffc594da00) at ../softmmu/main.c:49
> (gdb)
> 
> Aside from the 'zombie' process in the host, this is directly impacting
> Libvirt. Libvirt waits for the parent process to exit to be sure that the
> QMP monitor is available in the daemonized process to fetch QEMU
> capabilities, and as is now Libvirt hangs at daemon start waiting
> for the parent thread to exit.
> 
> The fix is simple: just move the trace backend related code back to
> be executed after daemonizing.
> 
> Fixes: 0546c0609cb5a8d90c1cbac8e0d64b5a048bbb19
> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
> ---
>  softmmu/vl.c | 18 +++++++++++++-----
>  1 file changed, 13 insertions(+), 5 deletions(-)

Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Paolo Bonzini Jan. 6, 2021, 7:59 p.m. UTC | #2
Il mer 6 gen 2021, 17:59 Stefan Hajnoczi <stefanha@gmail.com> ha scritto:

> Acked-by: Stefan Hajnoczi <stefanha@redhat.com


I don't have anything queued shortly so feel free to include it yourself.

Paolo


>
Daniel Henrique Barboza Jan. 19, 2021, 2:10 p.m. UTC | #3
Hi,

Is this queued to be pushed? We can't test Libvirt with upstream QEMU
without this fix to be able to daemonize properly.


Thanks,


DHB

On 1/6/21 4:59 PM, Paolo Bonzini wrote:
> 
> 
> Il mer 6 gen 2021, 17:59 Stefan Hajnoczi <stefanha@gmail.com <mailto:stefanha@gmail.com>> ha scritto:
> 
>     Acked-by: Stefan Hajnoczi <stefanha@redhat.com <mailto:stefanha@redhat.com>
> 
> 
> I don't have anything queued shortly so feel free to include it yourself.
> 
> Paolo
> 
>
diff mbox series

Patch

diff --git a/softmmu/vl.c b/softmmu/vl.c
index 0ed5c5ba93..ddf252f5f0 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -2354,11 +2354,6 @@  static void qemu_process_early_options(void)
                       cleanup_add_fd, NULL, &error_fatal);
 #endif
 
-    if (!trace_init_backends()) {
-        exit(1);
-    }
-    trace_init_file();
-
     /* Open the logfile at this point and set the log mask if necessary.  */
     qemu_set_log_filename(log_file, &error_fatal);
     if (log_mask) {
@@ -3458,6 +3453,19 @@  void qemu_init(int argc, char **argv, char **envp)
     qemu_process_help_options();
     qemu_maybe_daemonize(pid_file);
 
+    /*
+     * The trace backend must be initialized after daemonizing.
+     * trace_init_backends() will call st_init(), which will create the
+     * trace thread in the parent, and also register st_flush_trace_buffer()
+     * in atexit(). This function will force the parent to wait for the
+     * writeout thread to finish, which will not occur, and the parent
+     * process will be left in the host.
+     */
+    if (!trace_init_backends()) {
+        exit(1);
+    }
+    trace_init_file();
+
     qemu_init_main_loop(&error_fatal);
     cpu_timers_init();