diff mbox series

[v3,2/6] monitor: flush qmp responses when CLOSED

Message ID 20180615014249.22730-3-peterx@redhat.com
State New
Headers show
Series monitor: enable OOB by default | expand

Commit Message

Peter Xu June 15, 2018, 1:42 a.m. UTC
Previously we clean up the queues when we got CLOSED event.  It was used
to make sure we won't send leftover replies/events of a old client to a
new client.  Now this patch does an extra operation to flush the
response queue before cleaning up.

In most cases, a QMP session will be based on a bidirectional channel (a
TCP port, for example, we read/write to the same socket handle), so in
port and out port of the backend chardev are fundamentally the same
port. In these cases, it does not really matter much on whether we'll
flush the response queue since flushing should possibly fail
anyways (just imagine to write to a socket that has already closed).
However there can be cases where in & out ports of the QMP monitor's
backend chardev are separated.  One true example:

  cat $QMP_COMMANDS | qemu -qmp stdio ... | filter_commands

In this case, the backend is fd-typed, and it is connected to stdio
where in port is stdin and out port is stdout.  Now if we drop all the
events on the response queue then filter_command process might miss some
events that it might expect.  The thing is that, when stdin closes,
stdout might still be there alive!

In practice, I encountered a very strange SHUTDOWN event missing when
running test with iotest 087 with Out-Of-Band enabled. One condition
could be this (after "quit" command is executed and QEMU quits the main
loop):

1. [main thread] QEMU queues one SHUTDOWN event into response queue

2. "cat" terminates (to distinguish it from the animal, I quote it).
   Logically it can terminate even earlier, but let's just assume it's
   here.

3. [monitor iothread] QEMU reads EOF from stdin, which connects to the
   "cat" process

4. [monitor iothread] QEMU calls the CLOSED event hook for the monitor,
   which will clean up the response queue of the monitor, then the
   SHUTDOWN event is dropped

5. [main thread] clean up the monitors in monitor_cleanup(), when
   trying to flush pending responses, it sees nothing.  SHUTDOWN is
   lost forever

Note that before the monitor iothread was introduced, step [4]/[5] could
never happen since the main loop was the only place to detect the EOF
event of stdin and run the CLOSED event hooks.  Now things can happen in
parallel in the iothread.

Without this patch, iotest 087 will have ~10% chance to miss the
SHUTDOWN event and fail when with Out-Of-Band enabled:

087 8s ... - output mismatch (see 087.out.bad)

Comments

Markus Armbruster June 15, 2018, 8:11 a.m. UTC | #1
Peter Xu <peterx@redhat.com> writes:

> Previously we clean up the queues when we got CLOSED event.  It was used
> to make sure we won't send leftover replies/events of a old client to a
> new client.  Now this patch does an extra operation to flush the
> response queue before cleaning up.
>
> In most cases, a QMP session will be based on a bidirectional channel (a
> TCP port, for example, we read/write to the same socket handle), so in
> port and out port of the backend chardev are fundamentally the same
> port. In these cases, it does not really matter much on whether we'll
> flush the response queue since flushing should possibly fail

s/should possibly fail anyways/will fail anyway/

> anyways (just imagine to write to a socket that has already closed).
> However there can be cases where in & out ports of the QMP monitor's
> backend chardev are separated.  One true example:

Suggest to drop "true", or say "Here's an example".

>   cat $QMP_COMMANDS | qemu -qmp stdio ... | filter_commands
>
> In this case, the backend is fd-typed, and it is connected to stdio
> where in port is stdin and out port is stdout.  Now if we drop all the
> events on the response queue then filter_command process might miss some
> events that it might expect.  The thing is that, when stdin closes,
> stdout might still be there alive!
>
> In practice, I encountered a very strange SHUTDOWN event missing when

The event isn't strange, its missing is (sort of).  Suggest to drop
"very strange".

> running test with iotest 087 with Out-Of-Band enabled. One condition
> could be this (after "quit" command is executed and QEMU quits the main
> loop):

"could be" suggests this may or may not demonstrate the loss of the
event.  Misleading.  "Here's one of the ways this can happen" would be
clearer.

>
> 1. [main thread] QEMU queues one SHUTDOWN event into response queue

s/one/a/

>
> 2. "cat" terminates (to distinguish it from the animal, I quote it).
>    Logically it can terminate even earlier, but let's just assume it's
>    here.

Suggest to drop the second sentence.  You're describing one way for the
bug to bite here, not all possible ways.

>
> 3. [monitor iothread] QEMU reads EOF from stdin, which connects to the
>    "cat" process

Suggest "QEMU's monitor iothread reads".

Suggest to drop ", which connects...", because that should be obvious.

> 4. [monitor iothread] QEMU calls the CLOSED event hook for the monitor,

"QEMU's monitor iothread" again.

>    which will clean up the response queue of the monitor, then the
>    SHUTDOWN event is dropped

Suggest to say "destroy" instead of "clean up", to make perfectly clear
that we're throwing away the queue.
>
> 5. [main thread] clean up the monitors in monitor_cleanup(), when

"QEMU's main thread cleans up the monitors in monitor_cleanup().  When"

>    trying to flush pending responses, it sees nothing.  SHUTDOWN is
>    lost forever

Missing a period.

> Note that before the monitor iothread was introduced, step [4]/[5] could
> never happen since the main loop was the only place to detect the EOF
> event of stdin and run the CLOSED event hooks.  Now things can happen in
> parallel in the iothread.
>
> Without this patch, iotest 087 will have ~10% chance to miss the
> SHUTDOWN event and fail when with Out-Of-Band enabled:
>
> 087 8s ... - output mismatch (see 087.out.bad)
> --- /home/peterx/git/qemu/tests/qemu-iotests/087.out    2018-06-01 18:44:22.378982462 +0800
> +++ /home/peterx/git/qemu/bin/tests/qemu-iotests/087.out.bad    2018-06-01 18:53:44.267840928 +0800
> @@ -8,7 +8,6 @@
>  {"return": {}}
>  {"error": {"class": "GenericError", "desc": "'node-name' must be specified for the root node"}}
>  {"return": {}}
> -{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
>
>  === Duplicate ID ===
> @@ -53,7 +52,6 @@
>  {"return": {}}
>  {"return": {}}
>  {"return": {}}
> -{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}

Suggest to indent the quoted diff a bit, so make it more obviously not
part of the patch.  In fact, git-am chokes on it for me.

> This patch fixes the problem.
>
> Fixes: 6d2d563f8c ("qmp: cleanup qmp queues properly", 2018-03-27)
> Suggested-by: Markus Armbruster <armbru@redhat.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>
>
> Signed-off-by: Peter Xu <peterx@redhat.com>

Don't let my nitpicking deceive you; this is a really nice commit
message.

One more suggestion.  Your first paragraph describes the solution.  I
like to start with the problem instead.  More so when the problem
description is long, as it is here, because by the time I'm done reading
the problem, I'm prone to have forgotten the solution, and go "okay, and
how's the patch addressing the problem?"  But please use your judgement.

> ---
>  monitor.c | 30 +++++++++++++++++++++++++-----
>  1 file changed, 25 insertions(+), 5 deletions(-)
>
> diff --git a/monitor.c b/monitor.c
> index 6d0cec552e..e59d4f09ac 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -512,20 +512,39 @@ struct QMPResponse {
>  };
>  typedef struct QMPResponse QMPResponse;
>  
> +static QObject *monitor_qmp_response_pop_one(Monitor *mon)
> +{
> +    QObject *data;
> +
> +    qemu_mutex_lock(&mon->qmp.qmp_queue_lock);
> +    data = g_queue_pop_head(mon->qmp.qmp_responses);
> +    qemu_mutex_unlock(&mon->qmp.qmp_queue_lock);
> +
> +    return data;
> +}

Note for later: returns null only when the queue is empty (we never push
null onto this queue).

> +
> +static void monitor_qmp_response_flush(Monitor *mon)
> +{
> +    QObject *data;
> +
> +    while ((data = monitor_qmp_response_pop_one(mon))) {
> +        monitor_json_emitter_raw(mon, data);
> +        qobject_unref(data);
> +    }
> +}

This flushes the queue to @mon.  Good.

> +
>  /*
>   * Return one QMPResponse.  The response is only valid if
>   * response.data is not NULL.
>   */
> -static QMPResponse monitor_qmp_response_pop_one(void)
> +static QMPResponse monitor_qmp_response_pop(void)
>  {
>      Monitor *mon;
>      QObject *data = NULL;
>  
>      qemu_mutex_lock(&monitor_lock);
>      QTAILQ_FOREACH(mon, &mon_list, entry) {
> -        qemu_mutex_lock(&mon->qmp.qmp_queue_lock);
> -        data = g_queue_pop_head(mon->qmp.qmp_responses);
> -        qemu_mutex_unlock(&mon->qmp.qmp_queue_lock);
> +        data = monitor_qmp_response_pop_one(mon);
>          if (data) {
>              break;
>          }
       }
       qemu_mutex_unlock(&monitor_lock);
       return (QMPResponse) { .mon = mon, .data = data };
   }

Not this patch's fault, but this function gave me pause until I got the
loop's purpose: the function returns a response from *any* monitor.

Moreover, your naming creates an inconsistency with
monitor_qmp_requests_pop_one().

I append my attempt to make it easier to understand, and restore
consistency.  I guess I'd split it into a preparatory patch and a fixup
for this one.

> @@ -539,7 +558,7 @@ static void monitor_qmp_bh_responder(void *opaque)
>      QMPResponse response;
>  
>      while (true) {
> -        response = monitor_qmp_response_pop_one();
> +        response = monitor_qmp_response_pop();
>          if (!response.data) {
>              break;
>          }
> @@ -4366,6 +4385,7 @@ static void monitor_qmp_event(void *opaque, int event)
>          mon_refcount++;
>          break;
>      case CHR_EVENT_CLOSED:
> +        monitor_qmp_response_flush(mon);

This wastes work in the common case when @mon's output fd has been
closed already.  I think that's just fine.

>          monitor_qmp_cleanup_queues(mon);
>          json_message_parser_destroy(&mon->qmp.parser);
>          json_message_parser_init(&mon->qmp.parser, handle_qmp_command);


diff --git a/monitor.c b/monitor.c
index e59d4f09ac..af28977549 100644
--- a/monitor.c
+++ b/monitor.c
@@ -512,7 +512,7 @@ struct QMPResponse {
 };
 typedef struct QMPResponse QMPResponse;
 
-static QObject *monitor_qmp_response_pop_one(Monitor *mon)
+static QObject *monitor_qmp_requests_pop(Monitor *mon)
 {
     QObject *data;
 
@@ -527,41 +527,39 @@ static void monitor_qmp_response_flush(Monitor *mon)
 {
     QObject *data;
 
-    while ((data = monitor_qmp_response_pop_one(mon))) {
+    while ((data = monitor_qmp_requests_pop(mon))) {
         monitor_json_emitter_raw(mon, data);
         qobject_unref(data);
     }
 }
 
 /*
- * Return one QMPResponse.  The response is only valid if
- * response.data is not NULL.
+ * Pop a QMPResponse from any monitor's response queue into @response.
+ * Return NULL when all queues are empty, else @response.
  */
-static QMPResponse monitor_qmp_response_pop(void)
+static QMPResponse *monitor_qmp_response_pop_any(QMPResponse *response)
 {
     Monitor *mon;
     QObject *data = NULL;
 
     qemu_mutex_lock(&monitor_lock);
     QTAILQ_FOREACH(mon, &mon_list, entry) {
-        data = monitor_qmp_response_pop_one(mon);
+        data = monitor_qmp_requests_pop(mon);
         if (data) {
+            response->mon = mon;
+            response->data = data;
             break;
         }
     }
     qemu_mutex_unlock(&monitor_lock);
-    return (QMPResponse) { .mon = mon, .data = data };
+    return data ? response : NULL;
 }
 
 static void monitor_qmp_bh_responder(void *opaque)
 {
     QMPResponse response;
 
-    while (true) {
-        response = monitor_qmp_response_pop();
-        if (!response.data) {
-            break;
-        }
+    while (monitor_qmp_response_pop_any(&response)) {
         monitor_json_emitter_raw(response.mon, response.data);
         qobject_unref(response.data);
     }
@@ -4126,7 +4124,7 @@ static void monitor_qmp_dispatch_one(QMPRequest *req_obj)
  * when we process one request on a specific monitor, we put that
  * monitor to the end of mon_list queue.
  */
-static QMPRequest *monitor_qmp_requests_pop_one(void)
+static QMPRequest *monitor_qmp_requests_pop_any(void)
 {
     QMPRequest *req_obj = NULL;
     Monitor *mon;
@@ -4158,7 +4156,7 @@ static QMPRequest *monitor_qmp_requests_pop_one(void)
 
 static void monitor_qmp_bh_dispatcher(void *data)
 {
-    QMPRequest *req_obj = monitor_qmp_requests_pop_one();
+    QMPRequest *req_obj = monitor_qmp_requests_pop_any();
 
     if (req_obj) {
         trace_monitor_qmp_cmd_in_band(qobject_get_try_str(req_obj->id) ?: "");
Peter Xu June 19, 2018, 4:35 a.m. UTC | #2
On Fri, Jun 15, 2018 at 10:11:34AM +0200, Markus Armbruster wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > Previously we clean up the queues when we got CLOSED event.  It was used
> > to make sure we won't send leftover replies/events of a old client to a
> > new client.  Now this patch does an extra operation to flush the
> > response queue before cleaning up.
> >
> > In most cases, a QMP session will be based on a bidirectional channel (a
> > TCP port, for example, we read/write to the same socket handle), so in
> > port and out port of the backend chardev are fundamentally the same
> > port. In these cases, it does not really matter much on whether we'll
> > flush the response queue since flushing should possibly fail
> 
> s/should possibly fail anyways/will fail anyway/
> 
> > anyways (just imagine to write to a socket that has already closed).
> > However there can be cases where in & out ports of the QMP monitor's
> > backend chardev are separated.  One true example:
> 
> Suggest to drop "true", or say "Here's an example".
> 
> >   cat $QMP_COMMANDS | qemu -qmp stdio ... | filter_commands
> >
> > In this case, the backend is fd-typed, and it is connected to stdio
> > where in port is stdin and out port is stdout.  Now if we drop all the
> > events on the response queue then filter_command process might miss some
> > events that it might expect.  The thing is that, when stdin closes,
> > stdout might still be there alive!
> >
> > In practice, I encountered a very strange SHUTDOWN event missing when
> 
> The event isn't strange, its missing is (sort of).  Suggest to drop
> "very strange".
> 
> > running test with iotest 087 with Out-Of-Band enabled. One condition
> > could be this (after "quit" command is executed and QEMU quits the main
> > loop):
> 
> "could be" suggests this may or may not demonstrate the loss of the
> event.  Misleading.  "Here's one of the ways this can happen" would be
> clearer.
> 
> >
> > 1. [main thread] QEMU queues one SHUTDOWN event into response queue
> 
> s/one/a/
> 
> >
> > 2. "cat" terminates (to distinguish it from the animal, I quote it).
> >    Logically it can terminate even earlier, but let's just assume it's
> >    here.
> 
> Suggest to drop the second sentence.  You're describing one way for the
> bug to bite here, not all possible ways.
> 
> >
> > 3. [monitor iothread] QEMU reads EOF from stdin, which connects to the
> >    "cat" process
> 
> Suggest "QEMU's monitor iothread reads".
> 
> Suggest to drop ", which connects...", because that should be obvious.
> 
> > 4. [monitor iothread] QEMU calls the CLOSED event hook for the monitor,
> 
> "QEMU's monitor iothread" again.
> 
> >    which will clean up the response queue of the monitor, then the
> >    SHUTDOWN event is dropped
> 
> Suggest to say "destroy" instead of "clean up", to make perfectly clear
> that we're throwing away the queue.
> >
> > 5. [main thread] clean up the monitors in monitor_cleanup(), when
> 
> "QEMU's main thread cleans up the monitors in monitor_cleanup().  When"
> 
> >    trying to flush pending responses, it sees nothing.  SHUTDOWN is
> >    lost forever
> 
> Missing a period.
> 
> > Note that before the monitor iothread was introduced, step [4]/[5] could
> > never happen since the main loop was the only place to detect the EOF
> > event of stdin and run the CLOSED event hooks.  Now things can happen in
> > parallel in the iothread.
> >
> > Without this patch, iotest 087 will have ~10% chance to miss the
> > SHUTDOWN event and fail when with Out-Of-Band enabled:
> >
> > 087 8s ... - output mismatch (see 087.out.bad)
> > --- /home/peterx/git/qemu/tests/qemu-iotests/087.out    2018-06-01 18:44:22.378982462 +0800
> > +++ /home/peterx/git/qemu/bin/tests/qemu-iotests/087.out.bad    2018-06-01 18:53:44.267840928 +0800
> > @@ -8,7 +8,6 @@
> >  {"return": {}}
> >  {"error": {"class": "GenericError", "desc": "'node-name' must be specified for the root node"}}
> >  {"return": {}}
> > -{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
> >
> >  === Duplicate ID ===
> > @@ -53,7 +52,6 @@
> >  {"return": {}}
> >  {"return": {}}
> >  {"return": {}}
> > -{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
> 
> Suggest to indent the quoted diff a bit, so make it more obviously not
> part of the patch.  In fact, git-am chokes on it for me.
> 
> > This patch fixes the problem.
> >
> > Fixes: 6d2d563f8c ("qmp: cleanup qmp queues properly", 2018-03-27)
> > Suggested-by: Markus Armbruster <armbru@redhat.com>
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> >
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> 
> Don't let my nitpicking deceive you; this is a really nice commit
> message.

Thanks.  I'll apply all the suggestions until here in my next post.

> 
> One more suggestion.  Your first paragraph describes the solution.  I
> like to start with the problem instead.  More so when the problem
> description is long, as it is here, because by the time I'm done reading
> the problem, I'm prone to have forgotten the solution, and go "okay, and
> how's the patch addressing the problem?"  But please use your judgement.

Yes, talking about the problem first makes sense.  But I'll try to
make it even easier for readers: I'll summarize the problem and
solution first, then I'll keep the rest of the commit message as
detailed description of the problem.  So I'll rewrite my first
paragraph as:

  Previously we clean up the queues when we got CLOSED event.  It was
  used to make sure we won't send leftover replies/events of a old
  client to a new client which makes perfect sense. However this will
  also drop the replies/events even if the output port of the previous
  chardev backend is still open, which can lead to missing of the last
  replies/events.  Now this patch does an extra operation to flush the
  response queue before cleaning up.

Hope this works.

> 
> > ---
> >  monitor.c | 30 +++++++++++++++++++++++++-----
> >  1 file changed, 25 insertions(+), 5 deletions(-)
> >
> > diff --git a/monitor.c b/monitor.c
> > index 6d0cec552e..e59d4f09ac 100644
> > --- a/monitor.c
> > +++ b/monitor.c
> > @@ -512,20 +512,39 @@ struct QMPResponse {
> >  };
> >  typedef struct QMPResponse QMPResponse;
> >  
> > +static QObject *monitor_qmp_response_pop_one(Monitor *mon)
> > +{
> > +    QObject *data;
> > +
> > +    qemu_mutex_lock(&mon->qmp.qmp_queue_lock);
> > +    data = g_queue_pop_head(mon->qmp.qmp_responses);
> > +    qemu_mutex_unlock(&mon->qmp.qmp_queue_lock);
> > +
> > +    return data;
> > +}
> 
> Note for later: returns null only when the queue is empty (we never push
> null onto this queue).
> 
> > +
> > +static void monitor_qmp_response_flush(Monitor *mon)
> > +{
> > +    QObject *data;
> > +
> > +    while ((data = monitor_qmp_response_pop_one(mon))) {
> > +        monitor_json_emitter_raw(mon, data);
> > +        qobject_unref(data);
> > +    }
> > +}
> 
> This flushes the queue to @mon.  Good.
> 
> > +
> >  /*
> >   * Return one QMPResponse.  The response is only valid if
> >   * response.data is not NULL.
> >   */
> > -static QMPResponse monitor_qmp_response_pop_one(void)
> > +static QMPResponse monitor_qmp_response_pop(void)
> >  {
> >      Monitor *mon;
> >      QObject *data = NULL;
> >  
> >      qemu_mutex_lock(&monitor_lock);
> >      QTAILQ_FOREACH(mon, &mon_list, entry) {
> > -        qemu_mutex_lock(&mon->qmp.qmp_queue_lock);
> > -        data = g_queue_pop_head(mon->qmp.qmp_responses);
> > -        qemu_mutex_unlock(&mon->qmp.qmp_queue_lock);
> > +        data = monitor_qmp_response_pop_one(mon);
> >          if (data) {
> >              break;
> >          }
>        }
>        qemu_mutex_unlock(&monitor_lock);
>        return (QMPResponse) { .mon = mon, .data = data };
>    }
> 
> Not this patch's fault, but this function gave me pause until I got the
> loop's purpose: the function returns a response from *any* monitor.
> 
> Moreover, your naming creates an inconsistency with
> monitor_qmp_requests_pop_one().
> 
> I append my attempt to make it easier to understand, and restore
> consistency.  I guess I'd split it into a preparatory patch and a fixup
> for this one.

Will do.

> 
> > @@ -539,7 +558,7 @@ static void monitor_qmp_bh_responder(void *opaque)
> >      QMPResponse response;
> >  
> >      while (true) {
> > -        response = monitor_qmp_response_pop_one();
> > +        response = monitor_qmp_response_pop();
> >          if (!response.data) {
> >              break;
> >          }
> > @@ -4366,6 +4385,7 @@ static void monitor_qmp_event(void *opaque, int event)
> >          mon_refcount++;
> >          break;
> >      case CHR_EVENT_CLOSED:
> > +        monitor_qmp_response_flush(mon);
> 
> This wastes work in the common case when @mon's output fd has been
> closed already.  I think that's just fine.

Yeah.  Since I'll repost, I consider to add some comment for that.

> 
> >          monitor_qmp_cleanup_queues(mon);
> >          json_message_parser_destroy(&mon->qmp.parser);
> >          json_message_parser_init(&mon->qmp.parser, handle_qmp_command);
> 
> 
> diff --git a/monitor.c b/monitor.c
> index e59d4f09ac..af28977549 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -512,7 +512,7 @@ struct QMPResponse {
>  };
>  typedef struct QMPResponse QMPResponse;
>  
> -static QObject *monitor_qmp_response_pop_one(Monitor *mon)
> +static QObject *monitor_qmp_requests_pop(Monitor *mon)

I guess it's a typo; I'll keep the "response" word.

>  {
>      QObject *data;
>  
> @@ -527,41 +527,39 @@ static void monitor_qmp_response_flush(Monitor *mon)
>  {
>      QObject *data;
>  
> -    while ((data = monitor_qmp_response_pop_one(mon))) {
> +    while ((data = monitor_qmp_requests_pop(mon))) {

Same here.

>          monitor_json_emitter_raw(mon, data);
>          qobject_unref(data);
>      }
>  }
>  
>  /*
> - * Return one QMPResponse.  The response is only valid if
> - * response.data is not NULL.
> + * Pop a QMPResponse from any monitor's response queue into @response.
> + * Return NULL when all queues are empty, else @response.

I'll change the return value to a boolean directly if it's okay, since
after all we'll pass the QMPResponse in now.

I'll put the rest of the content into a standalone patch.

Thanks,

>   */
> -static QMPResponse monitor_qmp_response_pop(void)
> +static QMPResponse *monitor_qmp_response_pop_any(QMPResponse *response)
>  {
>      Monitor *mon;
>      QObject *data = NULL;
>  
>      qemu_mutex_lock(&monitor_lock);
>      QTAILQ_FOREACH(mon, &mon_list, entry) {
> -        data = monitor_qmp_response_pop_one(mon);
> +        data = monitor_qmp_requests_pop(mon);
>          if (data) {
> +            response->mon = mon;
> +            response->data = data;
>              break;
>          }
>      }
>      qemu_mutex_unlock(&monitor_lock);
> -    return (QMPResponse) { .mon = mon, .data = data };
> +    return data ? response : NULL;
>  }
>  
>  static void monitor_qmp_bh_responder(void *opaque)
>  {
>      QMPResponse response;
>  
> -    while (true) {
> -        response = monitor_qmp_response_pop();
> -        if (!response.data) {
> -            break;
> -        }
> +    while (monitor_qmp_response_pop_any(&response)) {
>          monitor_json_emitter_raw(response.mon, response.data);
>          qobject_unref(response.data);
>      }
> @@ -4126,7 +4124,7 @@ static void monitor_qmp_dispatch_one(QMPRequest *req_obj)
>   * when we process one request on a specific monitor, we put that
>   * monitor to the end of mon_list queue.
>   */
> -static QMPRequest *monitor_qmp_requests_pop_one(void)
> +static QMPRequest *monitor_qmp_requests_pop_any(void)
>  {
>      QMPRequest *req_obj = NULL;
>      Monitor *mon;
> @@ -4158,7 +4156,7 @@ static QMPRequest *monitor_qmp_requests_pop_one(void)
>  
>  static void monitor_qmp_bh_dispatcher(void *data)
>  {
> -    QMPRequest *req_obj = monitor_qmp_requests_pop_one();
> +    QMPRequest *req_obj = monitor_qmp_requests_pop_any();
>  
>      if (req_obj) {
>          trace_monitor_qmp_cmd_in_band(qobject_get_try_str(req_obj->id) ?: "");
Markus Armbruster June 19, 2018, 1:40 p.m. UTC | #3
Peter Xu <peterx@redhat.com> writes:

> On Fri, Jun 15, 2018 at 10:11:34AM +0200, Markus Armbruster wrote:
[...]
>> diff --git a/monitor.c b/monitor.c
>> index e59d4f09ac..af28977549 100644
>> --- a/monitor.c
>> +++ b/monitor.c
>> @@ -512,7 +512,7 @@ struct QMPResponse {
>>  };
>>  typedef struct QMPResponse QMPResponse;
>>  
>> -static QObject *monitor_qmp_response_pop_one(Monitor *mon)
>> +static QObject *monitor_qmp_requests_pop(Monitor *mon)
>
> I guess it's a typo; I'll keep the "response" word.

You're right.

>>  {
>>      QObject *data;
>>  
>> @@ -527,41 +527,39 @@ static void monitor_qmp_response_flush(Monitor *mon)
>>  {
>>      QObject *data;
>>  
>> -    while ((data = monitor_qmp_response_pop_one(mon))) {
>> +    while ((data = monitor_qmp_requests_pop(mon))) {
>
> Same here.
>
>>          monitor_json_emitter_raw(mon, data);
>>          qobject_unref(data);
>>      }
>>  }
>>  
>>  /*
>> - * Return one QMPResponse.  The response is only valid if
>> - * response.data is not NULL.
>> + * Pop a QMPResponse from any monitor's response queue into @response.
>> + * Return NULL when all queues are empty, else @response.
>
> I'll change the return value to a boolean directly if it's okay, since
> after all we'll pass the QMPResponse in now.

I routinely return the buffer argument in cases like this.  It's quite
traditional, e.g. fgets(), memcpy(), strcpy() do it, too.  Occasionally
permits more concise code, but not here.

[...]
diff mbox series

Patch

--- /home/peterx/git/qemu/tests/qemu-iotests/087.out    2018-06-01 18:44:22.378982462 +0800
+++ /home/peterx/git/qemu/bin/tests/qemu-iotests/087.out.bad    2018-06-01 18:53:44.267840928 +0800
@@ -8,7 +8,6 @@ 
 {"return": {}}
 {"error": {"class": "GenericError", "desc": "'node-name' must be specified for the root node"}}
 {"return": {}}
-{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}

 === Duplicate ID ===
@@ -53,7 +52,6 @@ 
 {"return": {}}
 {"return": {}}
 {"return": {}}
-{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}

This patch fixes the problem.

Fixes: 6d2d563f8c ("qmp: cleanup qmp queues properly", 2018-03-27)
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 monitor.c | 30 +++++++++++++++++++++++++-----
 1 file changed, 25 insertions(+), 5 deletions(-)

diff --git a/monitor.c b/monitor.c
index 6d0cec552e..e59d4f09ac 100644
--- a/monitor.c
+++ b/monitor.c
@@ -512,20 +512,39 @@  struct QMPResponse {
 };
 typedef struct QMPResponse QMPResponse;
 
+static QObject *monitor_qmp_response_pop_one(Monitor *mon)
+{
+    QObject *data;
+
+    qemu_mutex_lock(&mon->qmp.qmp_queue_lock);
+    data = g_queue_pop_head(mon->qmp.qmp_responses);
+    qemu_mutex_unlock(&mon->qmp.qmp_queue_lock);
+
+    return data;
+}
+
+static void monitor_qmp_response_flush(Monitor *mon)
+{
+    QObject *data;
+
+    while ((data = monitor_qmp_response_pop_one(mon))) {
+        monitor_json_emitter_raw(mon, data);
+        qobject_unref(data);
+    }
+}
+
 /*
  * Return one QMPResponse.  The response is only valid if
  * response.data is not NULL.
  */
-static QMPResponse monitor_qmp_response_pop_one(void)
+static QMPResponse monitor_qmp_response_pop(void)
 {
     Monitor *mon;
     QObject *data = NULL;
 
     qemu_mutex_lock(&monitor_lock);
     QTAILQ_FOREACH(mon, &mon_list, entry) {
-        qemu_mutex_lock(&mon->qmp.qmp_queue_lock);
-        data = g_queue_pop_head(mon->qmp.qmp_responses);
-        qemu_mutex_unlock(&mon->qmp.qmp_queue_lock);
+        data = monitor_qmp_response_pop_one(mon);
         if (data) {
             break;
         }
@@ -539,7 +558,7 @@  static void monitor_qmp_bh_responder(void *opaque)
     QMPResponse response;
 
     while (true) {
-        response = monitor_qmp_response_pop_one();
+        response = monitor_qmp_response_pop();
         if (!response.data) {
             break;
         }
@@ -4366,6 +4385,7 @@  static void monitor_qmp_event(void *opaque, int event)
         mon_refcount++;
         break;
     case CHR_EVENT_CLOSED:
+        monitor_qmp_response_flush(mon);
         monitor_qmp_cleanup_queues(mon);
         json_message_parser_destroy(&mon->qmp.parser);
         json_message_parser_init(&mon->qmp.parser, handle_qmp_command);