diff mbox

[for-1.4] libqtest: Wait for the right child PID after killing QEMU

Message ID 1359396916-10418-1-git-send-email-ehabkost@redhat.com
State New
Headers show

Commit Message

Eduardo Habkost Jan. 28, 2013, 6:15 p.m. UTC
When running "make check" with gcov enabled, we get the following
message:

   hw/tmp105.gcda:cannot open data file, assuming not executed

The problem happens because:

 * tmp105-test exits before QEMU exits, because waitpid() at
   qtest_quit() fails;
 * waitpid() fails because there's another process already
   waiting for the QEMU process;
 * The process that is already waiting for QEMU is the child created by
   qtest_init() to run system();
 * qtest_quit() is incorrectly waiting for the QEMU PID directly instead
   of the child created by qtest_init().

This fixes the problem by sending SIGTERM to QEMU, but waiting for the
child process created by qtest_init() (that exits immediately after QEMU
exits).

Reported-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
---
 tests/libqtest.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

Comments

Andreas Färber Jan. 30, 2013, 12:42 a.m. UTC | #1
Am 28.01.2013 19:15, schrieb Eduardo Habkost:
> When running "make check" with gcov enabled, we get the following
> message:
> 
>    hw/tmp105.gcda:cannot open data file, assuming not executed
> 
> The problem happens because:
> 
>  * tmp105-test exits before QEMU exits, because waitpid() at
>    qtest_quit() fails;
>  * waitpid() fails because there's another process already
>    waiting for the QEMU process;
>  * The process that is already waiting for QEMU is the child created by
>    qtest_init() to run system();
>  * qtest_quit() is incorrectly waiting for the QEMU PID directly instead
>    of the child created by qtest_init().
> 
> This fixes the problem by sending SIGTERM to QEMU, but waiting for the
> child process created by qtest_init() (that exits immediately after QEMU
> exits).
> 
> Reported-by: Andreas Färber <afaerber@suse.de>
> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>

Thanks. Still need to test this...

Anthony, might this by any chance fix your sparc 100% CPU issue?

Andreas

> ---
>  tests/libqtest.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/tests/libqtest.c b/tests/libqtest.c
> index 913fa05..762dec4 100644
> --- a/tests/libqtest.c
> +++ b/tests/libqtest.c
> @@ -39,7 +39,8 @@ struct QTestState
>      int qmp_fd;
>      bool irq_level[MAX_IRQ];
>      GString *rx;
> -    gchar *pid_file;
> +    gchar *pid_file; /* QEMU PID file */
> +    int child_pid;   /* Child process created to execute QEMU */
>      char *socket_path, *qmp_socket_path;
>  };
>  
> @@ -144,6 +145,7 @@ QTestState *qtest_init(const char *extra_args)
>  
>      s->rx = g_string_new("");
>      s->pid_file = pid_file;
> +    s->child_pid = pid;
>      for (i = 0; i < MAX_IRQ; i++) {
>          s->irq_level[i] = false;
>      }
> @@ -165,8 +167,9 @@ void qtest_quit(QTestState *s)
>  
>      pid_t pid = qtest_qemu_pid(s);
>      if (pid != -1) {
> +        /* kill QEMU, but wait for the child created by us to run system() */
>          kill(pid, SIGTERM);
> -        waitpid(pid, &status, 0);
> +        waitpid(s->child_pid, &status, 0);
>      }
>  
>      unlink(s->pid_file);
>
Anthony Liguori Jan. 30, 2013, 2:14 a.m. UTC | #2
Andreas Färber <afaerber@suse.de> writes:

> Am 28.01.2013 19:15, schrieb Eduardo Habkost:
>> When running "make check" with gcov enabled, we get the following
>> message:
>> 
>>    hw/tmp105.gcda:cannot open data file, assuming not executed
>> 
>> The problem happens because:
>> 
>>  * tmp105-test exits before QEMU exits, because waitpid() at
>>    qtest_quit() fails;
>>  * waitpid() fails because there's another process already
>>    waiting for the QEMU process;
>>  * The process that is already waiting for QEMU is the child created by
>>    qtest_init() to run system();
>>  * qtest_quit() is incorrectly waiting for the QEMU PID directly instead
>>    of the child created by qtest_init().
>> 
>> This fixes the problem by sending SIGTERM to QEMU, but waiting for the
>> child process created by qtest_init() (that exits immediately after QEMU
>> exits).
>> 
>> Reported-by: Andreas Färber <afaerber@suse.de>
>> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
>
> Thanks. Still need to test this...
>
> Anthony, might this by any chance fix your sparc 100% CPU issue?

I don't see how.  How would this impact the QEMU process?

Regards,

Anthony Liguori


>
> Andreas
>
>> ---
>>  tests/libqtest.c | 7 +++++--
>>  1 file changed, 5 insertions(+), 2 deletions(-)
>> 
>> diff --git a/tests/libqtest.c b/tests/libqtest.c
>> index 913fa05..762dec4 100644
>> --- a/tests/libqtest.c
>> +++ b/tests/libqtest.c
>> @@ -39,7 +39,8 @@ struct QTestState
>>      int qmp_fd;
>>      bool irq_level[MAX_IRQ];
>>      GString *rx;
>> -    gchar *pid_file;
>> +    gchar *pid_file; /* QEMU PID file */
>> +    int child_pid;   /* Child process created to execute QEMU */
>>      char *socket_path, *qmp_socket_path;
>>  };
>>  
>> @@ -144,6 +145,7 @@ QTestState *qtest_init(const char *extra_args)
>>  
>>      s->rx = g_string_new("");
>>      s->pid_file = pid_file;
>> +    s->child_pid = pid;
>>      for (i = 0; i < MAX_IRQ; i++) {
>>          s->irq_level[i] = false;
>>      }
>> @@ -165,8 +167,9 @@ void qtest_quit(QTestState *s)
>>  
>>      pid_t pid = qtest_qemu_pid(s);
>>      if (pid != -1) {
>> +        /* kill QEMU, but wait for the child created by us to run system() */
>>          kill(pid, SIGTERM);
>> -        waitpid(pid, &status, 0);
>> +        waitpid(s->child_pid, &status, 0);
>>      }
>>  
>>      unlink(s->pid_file);
>> 
>
>
> -- 
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
> GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Eduardo Habkost Jan. 30, 2013, 1:02 p.m. UTC | #3
On Tue, Jan 29, 2013 at 08:14:24PM -0600, Anthony Liguori wrote:
> Andreas Färber <afaerber@suse.de> writes:
> 
> > Am 28.01.2013 19:15, schrieb Eduardo Habkost:
> >> When running "make check" with gcov enabled, we get the following
> >> message:
> >> 
> >>    hw/tmp105.gcda:cannot open data file, assuming not executed
> >> 
> >> The problem happens because:
> >> 
> >>  * tmp105-test exits before QEMU exits, because waitpid() at
> >>    qtest_quit() fails;
> >>  * waitpid() fails because there's another process already
> >>    waiting for the QEMU process;
> >>  * The process that is already waiting for QEMU is the child created by
> >>    qtest_init() to run system();
> >>  * qtest_quit() is incorrectly waiting for the QEMU PID directly instead
> >>    of the child created by qtest_init().
> >> 
> >> This fixes the problem by sending SIGTERM to QEMU, but waiting for the
> >> child process created by qtest_init() (that exits immediately after QEMU
> >> exits).
> >> 
> >> Reported-by: Andreas Färber <afaerber@suse.de>
> >> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
> >
> > Thanks. Still need to test this...
> >
> > Anthony, might this by any chance fix your sparc 100% CPU issue?
> 
> I don't see how.  How would this impact the QEMU process?

The only consequence of this bug that I can see, is that the *-test
process may exit before the QEMU process exits, and may leave a zombie
child.

It may affect QEMU in case it tries to do something after receiving
SIGTERM, such as sending data through the closed monitor or qtest
sockets without handling the write errors properly.


> 
> Regards,
> 
> Anthony Liguori
> 
> 
> >
> > Andreas
> >
> >> ---
> >>  tests/libqtest.c | 7 +++++--
> >>  1 file changed, 5 insertions(+), 2 deletions(-)
> >> 
> >> diff --git a/tests/libqtest.c b/tests/libqtest.c
> >> index 913fa05..762dec4 100644
> >> --- a/tests/libqtest.c
> >> +++ b/tests/libqtest.c
> >> @@ -39,7 +39,8 @@ struct QTestState
> >>      int qmp_fd;
> >>      bool irq_level[MAX_IRQ];
> >>      GString *rx;
> >> -    gchar *pid_file;
> >> +    gchar *pid_file; /* QEMU PID file */
> >> +    int child_pid;   /* Child process created to execute QEMU */
> >>      char *socket_path, *qmp_socket_path;
> >>  };
> >>  
> >> @@ -144,6 +145,7 @@ QTestState *qtest_init(const char *extra_args)
> >>  
> >>      s->rx = g_string_new("");
> >>      s->pid_file = pid_file;
> >> +    s->child_pid = pid;
> >>      for (i = 0; i < MAX_IRQ; i++) {
> >>          s->irq_level[i] = false;
> >>      }
> >> @@ -165,8 +167,9 @@ void qtest_quit(QTestState *s)
> >>  
> >>      pid_t pid = qtest_qemu_pid(s);
> >>      if (pid != -1) {
> >> +        /* kill QEMU, but wait for the child created by us to run system() */
> >>          kill(pid, SIGTERM);
> >> -        waitpid(pid, &status, 0);
> >> +        waitpid(s->child_pid, &status, 0);
> >>      }
> >>  
> >>      unlink(s->pid_file);
> >> 
> >
> >
> > -- 
> > SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
> > GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Andreas Färber Feb. 1, 2013, 12:25 a.m. UTC | #4
Am 28.01.2013 19:15, schrieb Eduardo Habkost:
> When running "make check" with gcov enabled, we get the following
> message:
> 
>    hw/tmp105.gcda:cannot open data file, assuming not executed
> 
> The problem happens because:
> 
>  * tmp105-test exits before QEMU exits, because waitpid() at
>    qtest_quit() fails;
>  * waitpid() fails because there's another process already
>    waiting for the QEMU process;
>  * The process that is already waiting for QEMU is the child created by
>    qtest_init() to run system();
>  * qtest_quit() is incorrectly waiting for the QEMU PID directly instead
>    of the child created by qtest_init().
> 
> This fixes the problem by sending SIGTERM to QEMU, but waiting for the
> child process created by qtest_init() (that exits immediately after QEMU
> exits).
> 
> Reported-by: Andreas Färber <afaerber@suse.de>
> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>

Tested-by: Andreas Färber <afaerber@suse.de>

Thanks a lot for the quick analysis and fix!

Andreas
Andreas Färber Feb. 1, 2013, 12:30 a.m. UTC | #5
Am 30.01.2013 03:14, schrieb Anthony Liguori:
> Andreas Färber <afaerber@suse.de> writes:
> 
>> Am 28.01.2013 19:15, schrieb Eduardo Habkost:
>>> When running "make check" with gcov enabled, we get the following
>>> message:
>>>
>>>    hw/tmp105.gcda:cannot open data file, assuming not executed
>>>
>>> The problem happens because:
>>>
>>>  * tmp105-test exits before QEMU exits, because waitpid() at
>>>    qtest_quit() fails;
>>>  * waitpid() fails because there's another process already
>>>    waiting for the QEMU process;
>>>  * The process that is already waiting for QEMU is the child created by
>>>    qtest_init() to run system();
>>>  * qtest_quit() is incorrectly waiting for the QEMU PID directly instead
>>>    of the child created by qtest_init().
>>>
>>> This fixes the problem by sending SIGTERM to QEMU, but waiting for the
>>> child process created by qtest_init() (that exits immediately after QEMU
>>> exits).
>>>
>>> Reported-by: Andreas Färber <afaerber@suse.de>
>>> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
>>
>> Thanks. Still need to test this...
>>
>> Anthony, might this by any chance fix your sparc 100% CPU issue?
> 
> I don't see how.  How would this impact the QEMU process?

I had previously reported failing tests leading to the QEMU process not
being killed and eating CPU and memory until the system becomes very
unresponsive. Possibly that's a second issue because as I understand it
an assertion in the test program makes it exit immediately without any
cleanup.

Regards,

Andreas
Anthony Liguori Feb. 4, 2013, 10:54 p.m. UTC | #6
Applied.  Thanks.

Regards,

Anthony Liguori
diff mbox

Patch

diff --git a/tests/libqtest.c b/tests/libqtest.c
index 913fa05..762dec4 100644
--- a/tests/libqtest.c
+++ b/tests/libqtest.c
@@ -39,7 +39,8 @@  struct QTestState
     int qmp_fd;
     bool irq_level[MAX_IRQ];
     GString *rx;
-    gchar *pid_file;
+    gchar *pid_file; /* QEMU PID file */
+    int child_pid;   /* Child process created to execute QEMU */
     char *socket_path, *qmp_socket_path;
 };
 
@@ -144,6 +145,7 @@  QTestState *qtest_init(const char *extra_args)
 
     s->rx = g_string_new("");
     s->pid_file = pid_file;
+    s->child_pid = pid;
     for (i = 0; i < MAX_IRQ; i++) {
         s->irq_level[i] = false;
     }
@@ -165,8 +167,9 @@  void qtest_quit(QTestState *s)
 
     pid_t pid = qtest_qemu_pid(s);
     if (pid != -1) {
+        /* kill QEMU, but wait for the child created by us to run system() */
         kill(pid, SIGTERM);
-        waitpid(pid, &status, 0);
+        waitpid(s->child_pid, &status, 0);
     }
 
     unlink(s->pid_file);