
[v7,3/7] monitor: suspend monitor instead of send CMD_DROP

Message ID 20180903043149.4076-4-peterx@redhat.com
State New
Series [v7,1/7] qapi: Fix build_params() for empty parameter list

Commit Message

Peter Xu Sept. 3, 2018, 4:31 a.m. UTC
When we receive too many QMP commands, we used to send COMMAND_DROPPED
events to monitors and then drop the requests.  Now, instead of
dropping the command, we stop reading from input when we notice the
queue is getting full.  Note that since we removed the need_resume
flag, we need to be _very_ careful about pairing the suspend/resume
conditions: unmatched condition checks will hang the monitor.
Meanwhile, we now need to read the queue length to decide whether to
resume the monitor, so we have to take the queue lock again even after
popping from it.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 monitor.c | 33 +++++++++++++++------------------
 1 file changed, 15 insertions(+), 18 deletions(-)

Comments

Markus Armbruster Sept. 3, 2018, 7:38 a.m. UTC | #1
Peter Xu <peterx@redhat.com> writes:

> When we received too many qmp commands, previously we'll send
> COMMAND_DROPPED events to monitors, then we'll drop the requests.  Now
> instead of dropping the command we stop reading from input when we
> notice the queue is getting full.  Note that here since we removed the
> need_resume flag we need to be _very_ careful on the suspend/resume
> paring on the conditions since unmatched condition checks will hang
> death the monitor.  Meanwhile, now we will need to read the queue length
> to decide whether we'll need to resume the monitor so we need to take
> the queue lock again even after popping from it.
>
> Signed-off-by: Peter Xu <peterx@redhat.com>

The subject's "send CMD_DROP" fails to highlight the important part:
we're no longer dropping commands.  Moreover, the commit message could
do a better job explaining the tradeoffs.  Here's my try:

    monitor: Suspend monitor instead of dropping commands

    When a QMP client sends in-band commands more quickly than we can
    process them, we can either queue them without limit (QUEUE), drop
    commands when the queue is full (DROP), or suspend receiving commands
    when the queue is full (SUSPEND).  None of them is ideal:

    * QUEUE lets a misbehaving client make QEMU eat memory without bounds.
      Not such a hot idea.

    * With DROP, the client has to cope with dropped in-band commands.  To
      inform the client, we send a COMMAND_DROPPED event then.  The event is
      flawed by design in two ways: it's ambiguous (see commit d621cfe0a17),
      and it brings back the "eat memory without bounds" problem.

    * With SUSPEND, the client has to manage the flow of in-band commands to
      keep the monitor available for out-of-band commands.

    We currently DROP.  Switch to SUSPEND.

    Managing the flow of in-band commands to keep the monitor available for
    out-of-band commands isn't really hard: just count the number of
    "outstanding" in-band commands (commands sent minus replies received),
    and if it exceeds the limit, hold back additional ones until it drops
    below the limit again.

    Note that we need to be careful pairing the suspend with a resume, or
    else the monitor will hang, possibly forever.


> ---
>  monitor.c | 33 +++++++++++++++------------------
>  1 file changed, 15 insertions(+), 18 deletions(-)
>
> diff --git a/monitor.c b/monitor.c
> index 3b90c9eb5f..9e6cad2af6 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -89,6 +89,8 @@
>  #include "hw/s390x/storage-attributes.h"
>  #endif
>  
> +#define  QMP_REQ_QUEUE_LEN_MAX  (8)
> +

Let's drop the parentheses.

>  /*
>   * Supported types:
>   *
> @@ -4124,29 +4126,33 @@ static QMPRequest *monitor_qmp_requests_pop_any(void)
>  static void monitor_qmp_bh_dispatcher(void *data)
>  {
>      QMPRequest *req_obj = monitor_qmp_requests_pop_any();
> +    Monitor *mon;
>      QDict *rsp;
>      bool need_resume;
>  
>      if (!req_obj) {
>          return;
>      }
> -

Let's keep the blank line.

> +    mon = req_obj->mon;
>      /*  qmp_oob_enabled() might change after "qmp_capabilities" */
> -    need_resume = !qmp_oob_enabled(req_obj->mon);
> +    qemu_mutex_lock(&mon->qmp.qmp_queue_lock);
> +    need_resume = !qmp_oob_enabled(req_obj->mon) ||
> +        mon->qmp.qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX - 1;

Note for later: this is the condition guarding monitor_resume().

Is this race-free?  We have two critical sections, one in
monitor_qmp_requests_pop_any(), and one here.  What if two threads
interleave like this

    thread 1                            thread 2
    ------------------------------------------------------------------
    monitor_qmp_requests_pop_any()
        lock
        dequeue
        unlock
                                        monitor_qmp_requests_pop_any()
                                            lock
                                            dequeue
                                            unlock
                                        lock
                                        check queue length
                                        unlock
                                        if queue length demands it
                                             monitor_resume() 
    lock
    check queue length
    unlock
    if queue length demands it
        monitor_resume()

Looks racy to me: if we start with the queue full (and the monitor
suspended), both threads' check queue length see length
QMP_REQ_QUEUE_LEN_MAX - 2, and neither thread resumes the monitor.
Oops.

Simplest fix is probably moving the resume into
monitor_qmp_requests_pop_any()'s critical section.
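Roughly like this, perhaps (untested sketch; the monitor_lock/mon_list
iteration is how I remember monitor_qmp_requests_pop_any() working,
details may differ):

    static QMPRequest *monitor_qmp_requests_pop_any(void)
    {
        QMPRequest *req_obj = NULL;
        Monitor *mon;

        qemu_mutex_lock(&monitor_lock);

        QTAILQ_FOREACH(mon, &mon_list, entry) {
            qemu_mutex_lock(&mon->qmp.qmp_queue_lock);
            req_obj = g_queue_pop_head(mon->qmp.qmp_requests);
            if (req_obj) {
                /*
                 * Decide and resume while we still hold the queue lock, so
                 * "suspended iff queue full" is never observed half-updated.
                 * Pairs with monitor_suspend() in handle_qmp_command().
                 */
                if (!qmp_oob_enabled(mon) ||
                    mon->qmp.qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX - 1) {
                    monitor_resume(mon);
                }
            }
            qemu_mutex_unlock(&mon->qmp.qmp_queue_lock);
            if (req_obj) {
                break;
            }
        }

        qemu_mutex_unlock(&monitor_lock);

        return req_obj;
    }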

> +    qemu_mutex_unlock(&mon->qmp.qmp_queue_lock);
>      if (req_obj->req) {
>          trace_monitor_qmp_cmd_in_band(qobject_get_try_str(req_obj->id) ?: "");
> -        monitor_qmp_dispatch(req_obj->mon, req_obj->req, req_obj->id);
> +        monitor_qmp_dispatch(mon, req_obj->req, req_obj->id);
>      } else {
>          assert(req_obj->err);
>          rsp = qmp_error_response(req_obj->err);
>          req_obj->err = NULL;
> -        monitor_qmp_respond(req_obj->mon, rsp, NULL);
> +        monitor_qmp_respond(mon, rsp, NULL);
>          qobject_unref(rsp);
>      }
>  
>      if (need_resume) {
>          /* Pairs with the monitor_suspend() in handle_qmp_command() */
> -        monitor_resume(req_obj->mon);
> +        monitor_resume(mon);
>      }
>      qmp_request_free(req_obj);

The replacement of req_obj->mon by mon makes this change less clear than
it could be.  Let's figure out the possible race, and then we'll see
whether we'll still want this replacement.

>  
> @@ -4154,8 +4160,6 @@ static void monitor_qmp_bh_dispatcher(void *data)
>      qemu_bh_schedule(qmp_dispatcher_bh);
>  }
>  
> -#define  QMP_REQ_QUEUE_LEN_MAX  (8)
> -
>  static void handle_qmp_command(void *opaque, QObject *req, Error *err)
>  {
>      Monitor *mon = opaque;
> @@ -4205,19 +4209,12 @@ static void handle_qmp_command(void *opaque, QObject *req, Error *err)
       /*
        * If OOB is not enabled on the current monitor, we'll emulate the
        * old behavior that we won't process the current monitor any more
        * until it has responded.  This helps make sure that as long as
        * OOB is not enabled, the server will never drop any command.
        */

This comment is now stale; we don't drop commands anymore.

>      if (!qmp_oob_enabled(mon)) {
>          monitor_suspend(mon);
>      } else {
> -        /* Drop the request if queue is full. */
> -        if (mon->qmp.qmp_requests->length >= QMP_REQ_QUEUE_LEN_MAX) {
> -            qemu_mutex_unlock(&mon->qmp.qmp_queue_lock);
> +        if (mon->qmp.qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX - 1) {
>              /*
> -             * FIXME @id's scope is just @mon, and broadcasting it is
> -             * wrong.  If another monitor's client has a command with
> -             * the same ID in flight, the event will incorrectly claim
> -             * that command was dropped.
> +             * If this is _exactly_ the last request that we can
> +             * queue, we suspend the monitor right now.
>               */
> -            qapi_event_send_command_dropped(id,
> -                                            COMMAND_DROP_REASON_QUEUE_FULL);
> -            qmp_request_free(req_obj);
> -            return;
> +            monitor_suspend(mon);
>          }
>      }

The conditional and its comments look unbalanced.  Moreover, the
condition is visually dissimilar to the one guarding the matching
monitor_resume().  What about:

       /*
        * Suspend the monitor when we can't queue more requests after
        * this one.  Dequeuing in monitor_qmp_bh_dispatcher() will resume
        * it.
        * Note that when OOB is disabled, we queue at most one command,
        * for backward compatibility.
        */
       if (!qmp_oob_enabled(mon) ||
           mon->qmp.qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX - 1) {
           monitor_suspend(mon);
       }

You'll have to update the comment if you move the resume to
monitor_qmp_requests_pop_any().

Next:

       /*
        * Put the request to the end of queue so that requests will be
        * handled in time order.  Ownership for req_obj, req, id,
        * etc. will be delivered to the handler side.
        */

Suggest asserting mon->qmp.qmp_requests->length < QMP_REQ_QUEUE_LEN_MAX
here.

       g_queue_push_tail(mon->qmp.qmp_requests, req_obj);
       qemu_mutex_unlock(&mon->qmp.qmp_queue_lock);
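Concretely, the assertion would sit right before the push, still inside
the queue lock (sketch):

       assert(mon->qmp.qmp_requests->length < QMP_REQ_QUEUE_LEN_MAX);
       g_queue_push_tail(mon->qmp.qmp_requests, req_obj);
       qemu_mutex_unlock(&mon->qmp.qmp_queue_lock);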
Markus Armbruster Sept. 3, 2018, 7:56 a.m. UTC | #2
One more thing: we need test coverage for "suspend on full queue, resume
when the logjam clears".  qmp-test.c got the building blocks.  Something
like this:

    send_cmd_that_blocks() eight times
    send_oob_cmd_that_fails()
    unblock_blocked_cmd()
    recv_cmd_id() for the 1st in-band command
    recv_cmd_id() for the oob command
    unblock_blocked_cmd()
    recv_cmd_id() for the 2nd in-band command
    ... repeat for the remaining six in-band commands ...
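In test code that could look roughly like this (just a sketch; the
helper signatures are assumed from the names above, and the test
function name is made up):

    static void test_queue_full_suspend(QTestState *qts)
    {
        char *id;
        int i;

        /* Fill the request queue (8 == QMP_REQ_QUEUE_LEN_MAX) */
        for (i = 0; i < 8; i++) {
            id = g_strdup_printf("ib-%d", i);
            send_cmd_that_blocks(qts, id);
            g_free(id);
        }

        /* Queue is full, monitor suspended: this one waits in the socket */
        send_oob_cmd_that_fails(qts, "oob-1");

        /*
         * Unblocking the 1st in-band command drains the queue by one and
         * resumes the monitor, so the pending OOB command gets handled.
         */
        unblock_blocked_cmd();
        recv_cmd_id(qts, "ib-0");
        recv_cmd_id(qts, "oob-1");

        /* Drain the remaining in-band commands */
        for (i = 1; i < 8; i++) {
            id = g_strdup_printf("ib-%d", i);
            unblock_blocked_cmd();
            recv_cmd_id(qts, id);
            g_free(id);
        }
    }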
Peter Xu Sept. 3, 2018, 9:06 a.m. UTC | #3
On Mon, Sep 03, 2018 at 09:38:00AM +0200, Markus Armbruster wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > When we received too many qmp commands, previously we'll send
> > COMMAND_DROPPED events to monitors, then we'll drop the requests.  Now
> > instead of dropping the command we stop reading from input when we
> > notice the queue is getting full.  Note that here since we removed the
> > need_resume flag we need to be _very_ careful on the suspend/resume
> > paring on the conditions since unmatched condition checks will hang
> > death the monitor.  Meanwhile, now we will need to read the queue length
> > to decide whether we'll need to resume the monitor so we need to take
> > the queue lock again even after popping from it.
> >
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> 
> The subject's "send CMD_DROP" fails to highlight the important part:
> we're no longer dropping commands.  Moreover, the commit message could
> do a better job explaining the tradeoffs.  Here's my try:
> 
>     monitor: Suspend monitor instead dropping commands
> 
>     When a QMP client sends in-band commands more quickly that we can
>     process them, we can either queue them without limit (QUEUE), drop
>     commands when the queue is full (DROP), or suspend receiving commands
>     when the queue is full (SUSPEND).  None of them is ideal:
> 
>     * QUEUE lets a misbehaving client make QEMU eat memory without bounds.
>       Not such a hot idea.
> 
>     * With DROP, the client has to cope with dropped in-band commands.  To
>       inform the client, we send a COMMAND_DROPPED event then.  The event is
>       flawed by design in two ways: it's ambiguous (see commit d621cfe0a17),
>       and it brings back the "eat memory without bounds" problem.
> 
>     * With SUSPEND, the client has to manage the flow of in-band commands to
>       keep the monitor available for out-of-band commands.
> 
>     We currently DROP.  Switch to SUSPEND.
> 
>     Managing the flow of in-band commands to keep the monitor available for
>     out-of-band commands isn't really hard: just count the number of
>     "outstanding" in-band commands (commands sent minus replies received),
>     and if it exceeds the limit, hold back additional ones until it drops
>     below the limit again.

(Will the simplest QMP client just block at the write() system call
 without notice?  Anyway, the SUSPEND is ideal enough for me... :)

> 
>     Note that we need to be careful pairing the suspend with a resume, or
>     else the monitor will hang, possibly forever.

Will take your version, thanks.

> 
> 
> > ---
> >  monitor.c | 33 +++++++++++++++------------------
> >  1 file changed, 15 insertions(+), 18 deletions(-)
> >
> > diff --git a/monitor.c b/monitor.c
> > index 3b90c9eb5f..9e6cad2af6 100644
> > --- a/monitor.c
> > +++ b/monitor.c
> > @@ -89,6 +89,8 @@
> >  #include "hw/s390x/storage-attributes.h"
> >  #endif
> >  
> > +#define  QMP_REQ_QUEUE_LEN_MAX  (8)
> > +
> 
> Let's drop the parenthesis.

Ok.

> 
> >  /*
> >   * Supported types:
> >   *
> > @@ -4124,29 +4126,33 @@ static QMPRequest *monitor_qmp_requests_pop_any(void)
> >  static void monitor_qmp_bh_dispatcher(void *data)
> >  {
> >      QMPRequest *req_obj = monitor_qmp_requests_pop_any();
> > +    Monitor *mon;
> >      QDict *rsp;
> >      bool need_resume;
> >  
> >      if (!req_obj) {
> >          return;
> >      }
> > -
> 
> Let's keep the blank line.

Ok.

> 
> > +    mon = req_obj->mon;
> >      /*  qmp_oob_enabled() might change after "qmp_capabilities" */
> > -    need_resume = !qmp_oob_enabled(req_obj->mon);
> > +    qemu_mutex_lock(&mon->qmp.qmp_queue_lock);
> > +    need_resume = !qmp_oob_enabled(req_obj->mon) ||
> > +        mon->qmp.qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX - 1;
> 
> Note for later: this is the condition guarding monitor_resume().
> 
> Is this race-free?  We have two criticial sections, one in
> monitor_qmp_requests_pop_any(), and one here.  What if two threads
> interleave like this
> 
>     thread 1                            thread 2
>     ------------------------------------------------------------------
>     monitor_qmp_requests_pop_any()
>         lock
>         dequeue
>         unlock
>                                         monitor_qmp_requests_pop_any()
>                                             lock
>                                             dequeue
>                                             unlock
>                                         lock
>                                         check queue length
>                                         unlock
>                                         if queue length demands it
>                                              monitor_resume() 
>     lock
>     check queue length
>     unlock
>     if queue length demands it
>         monitor_resume()
> 
> Looks racy to me: if we start with the queue full (and the monitor
> suspended), both threads' check queue length see length
> QMP_REQ_QUEUE_LEN_MAX - 2, and neither thread resumes the monitor.
> Oops.
> 
> Simplest fix is probably moving the resume into
> monitor_qmp_requests_pop_any()'s critical section.

But we only have one QMP dispatcher, right?  So IMHO we can't have two
threads doing monitor_qmp_requests_pop_any() at the same time.

But I fully agree that it'd be nicer to keep them together (e.g. in
case we allow dispatching two commands some day); then it'll be
something like the old req_obj->need_resume flag, except set in
monitor_qmp_requests_pop_any() instead.  If so, I'd really prefer to
keep the per-monitor need_resume flag by dropping 176160ce78 and
keeping my patch that moves the flag into the monitor struct (which I
dropped after the rebase of this version).

> 
> > +    qemu_mutex_unlock(&mon->qmp.qmp_queue_lock);
> >      if (req_obj->req) {
> >          trace_monitor_qmp_cmd_in_band(qobject_get_try_str(req_obj->id) ?: "");
> > -        monitor_qmp_dispatch(req_obj->mon, req_obj->req, req_obj->id);
> > +        monitor_qmp_dispatch(mon, req_obj->req, req_obj->id);
> >      } else {
> >          assert(req_obj->err);
> >          rsp = qmp_error_response(req_obj->err);
> >          req_obj->err = NULL;
> > -        monitor_qmp_respond(req_obj->mon, rsp, NULL);
> > +        monitor_qmp_respond(mon, rsp, NULL);
> >          qobject_unref(rsp);
> >      }
> >  
> >      if (need_resume) {
> >          /* Pairs with the monitor_suspend() in handle_qmp_command() */
> > -        monitor_resume(req_obj->mon);
> > +        monitor_resume(mon);
> >      }
> >      qmp_request_free(req_obj);
> 
> The replacement of req_obj->mon by mon makes this change less clear than
> it could be.  Let's figure out the possible race, and then we'll see
> whether we'll still want this replacement.

Sure.

> 
> >  
> > @@ -4154,8 +4160,6 @@ static void monitor_qmp_bh_dispatcher(void *data)
> >      qemu_bh_schedule(qmp_dispatcher_bh);
> >  }
> >  
> > -#define  QMP_REQ_QUEUE_LEN_MAX  (8)
> > -
> >  static void handle_qmp_command(void *opaque, QObject *req, Error *err)
> >  {
> >      Monitor *mon = opaque;
> > @@ -4205,19 +4209,12 @@ static void handle_qmp_command(void *opaque, QObject *req, Error *err)
>        /*
>         * If OOB is not enabled on the current monitor, we'll emulate the
>         * old behavior that we won't process the current monitor any more
>         * until it has responded.  This helps make sure that as long as
>         * OOB is not enabled, the server will never drop any command.
>         */
> 
> This comment is now stale, we don't drop commands anymore.

AFAIU it's describing [1] below but nothing related to COMMAND_DROP?

> 
> >      if (!qmp_oob_enabled(mon)) {
> >          monitor_suspend(mon);

[1]

> >      } else {
> > -        /* Drop the request if queue is full. */
> > -        if (mon->qmp.qmp_requests->length >= QMP_REQ_QUEUE_LEN_MAX) {
> > -            qemu_mutex_unlock(&mon->qmp.qmp_queue_lock);
> > +        if (mon->qmp.qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX - 1) {
> >              /*
> > -             * FIXME @id's scope is just @mon, and broadcasting it is
> > -             * wrong.  If another monitor's client has a command with
> > -             * the same ID in flight, the event will incorrectly claim
> > -             * that command was dropped.
> > +             * If this is _exactly_ the last request that we can
> > +             * queue, we suspend the monitor right now.
> >               */
> > -            qapi_event_send_command_dropped(id,
> > -                                            COMMAND_DROP_REASON_QUEUE_FULL);
> > -            qmp_request_free(req_obj);
> > -            return;
> > +            monitor_suspend(mon);
> >          }
> >      }
> 
> The conditional and its comments look unbalanced.  Moreover, the
> condition is visually dissimilar to the one guarding the matching
> monitor_resume().  What about:
> 
>        /*
>         * Suspend the monitor when we can't queue more requests after
>         * this one.  Dequeuing in monitor_qmp_bh_dispatcher() will resume
>         * it.
>         * Note that when OOB is disabled, we queue at most one command,
>         * for backward compatibility.
>         */
>        if (!qmp_oob_enabled(mon) ||
>            mon->qmp.qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX - 1) {
>            monitor_suspend(mon);
>        }
> 
> You'll have to update the comment if you move the resume to
> monitor_qmp_requests_pop_any().
> 
> Next:
> 
>        /*
>         * Put the request to the end of queue so that requests will be
>         * handled in time order.  Ownership for req_obj, req, id,
>         * etc. will be delivered to the handler side.
>         */
> 
> Suggest asserting mon->qmp.qmp_requests->length < QMP_REQ_QUEUE_LEN_MAX
> here.

Sure I can do this.

> 
>        g_queue_push_tail(mon->qmp.qmp_requests, req_obj);
>        qemu_mutex_unlock(&mon->qmp.qmp_queue_lock);

Thanks,
Markus Armbruster Sept. 3, 2018, 1:16 p.m. UTC | #4
Peter Xu <peterx@redhat.com> writes:

> On Mon, Sep 03, 2018 at 09:38:00AM +0200, Markus Armbruster wrote:
>> Peter Xu <peterx@redhat.com> writes:
>> 
>> > When we received too many qmp commands, previously we'll send
>> > COMMAND_DROPPED events to monitors, then we'll drop the requests.  Now
>> > instead of dropping the command we stop reading from input when we
>> > notice the queue is getting full.  Note that here since we removed the
>> > need_resume flag we need to be _very_ careful on the suspend/resume
>> > paring on the conditions since unmatched condition checks will hang
>> > death the monitor.  Meanwhile, now we will need to read the queue length
>> > to decide whether we'll need to resume the monitor so we need to take
>> > the queue lock again even after popping from it.
>> >
>> > Signed-off-by: Peter Xu <peterx@redhat.com>
>> 
>> The subject's "send CMD_DROP" fails to highlight the important part:
>> we're no longer dropping commands.  Moreover, the commit message could
>> do a better job explaining the tradeoffs.  Here's my try:
>> 
>>     monitor: Suspend monitor instead dropping commands
>> 
>>     When a QMP client sends in-band commands more quickly that we can
>>     process them, we can either queue them without limit (QUEUE), drop
>>     commands when the queue is full (DROP), or suspend receiving commands
>>     when the queue is full (SUSPEND).  None of them is ideal:
>> 
>>     * QUEUE lets a misbehaving client make QEMU eat memory without bounds.
>>       Not such a hot idea.
>> 
>>     * With DROP, the client has to cope with dropped in-band commands.  To
>>       inform the client, we send a COMMAND_DROPPED event then.  The event is
>>       flawed by design in two ways: it's ambiguous (see commit d621cfe0a17),
>>       and it brings back the "eat memory without bounds" problem.
>> 
>>     * With SUSPEND, the client has to manage the flow of in-band commands to
>>       keep the monitor available for out-of-band commands.
>> 
>>     We currently DROP.  Switch to SUSPEND.
>> 
>>     Managing the flow of in-band commands to keep the monitor available for
>>     out-of-band commands isn't really hard: just count the number of
>>     "outstanding" in-band commands (commands sent minus replies received),
>>     and if it exceeds the limit, hold back additional ones until it drops
>>     below the limit again.
>
> (Will the simplest QMP client just block at the write() system call
>  without notice?

Yes.

When you stop reading from a socket or pipe, the peer eventually can't
write more.  "Eventually", because the TCP connection or pipe buffers
some.

A naive client using a blocking write() will block then.

A slightly more sophisticated client using a non-blocking write() has
its write() fail with EAGAIN or EWOULDBLOCK.

In both cases, OOB commands may be stuck in the TCP connection's /
pipe's buffer.
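
To illustrate the client side in plain POSIX terms (nothing
QEMU-specific; the function name is made up):

    #include <errno.h>
    #include <stdbool.h>
    #include <sys/socket.h>
    #include <sys/types.h>

    /* Returns false if the in-band command must be held back for now. */
    static bool try_send_in_band(int qmp_fd, const char *json, size_t len)
    {
        ssize_t n = send(qmp_fd, json, len, 0);

        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            /*
             * The server suspended the monitor and the socket buffer is
             * full.  Keep reading replies; that is what makes room again.
             */
            return false;
        }
        return n == (ssize_t)len;   /* short writes ignored for brevity */
    }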


>                   Anyway, the SUSPEND is ideal enough to me... :)
>
>> 
>>     Note that we need to be careful pairing the suspend with a resume, or
>>     else the monitor will hang, possibly forever.
>
> Will take your version, thanks.
>
>> 
>> 
>> > ---
>> >  monitor.c | 33 +++++++++++++++------------------
>> >  1 file changed, 15 insertions(+), 18 deletions(-)
>> >
>> > diff --git a/monitor.c b/monitor.c
>> > index 3b90c9eb5f..9e6cad2af6 100644
>> > --- a/monitor.c
>> > +++ b/monitor.c
>> > @@ -89,6 +89,8 @@
>> >  #include "hw/s390x/storage-attributes.h"
>> >  #endif
>> >  
>> > +#define  QMP_REQ_QUEUE_LEN_MAX  (8)
>> > +
>> 
>> Let's drop the parenthesis.
>
> Ok.
>
>> 
>> >  /*
>> >   * Supported types:
>> >   *
>> > @@ -4124,29 +4126,33 @@ static QMPRequest *monitor_qmp_requests_pop_any(void)
>> >  static void monitor_qmp_bh_dispatcher(void *data)
>> >  {
>> >      QMPRequest *req_obj = monitor_qmp_requests_pop_any();
>> > +    Monitor *mon;
>> >      QDict *rsp;
>> >      bool need_resume;
>> >  
>> >      if (!req_obj) {
>> >          return;
>> >      }
>> > -
>> 
>> Let's keep the blank line.
>
> Ok.
>
>> 
>> > +    mon = req_obj->mon;
>> >      /*  qmp_oob_enabled() might change after "qmp_capabilities" */
>> > -    need_resume = !qmp_oob_enabled(req_obj->mon);
>> > +    qemu_mutex_lock(&mon->qmp.qmp_queue_lock);
>> > +    need_resume = !qmp_oob_enabled(req_obj->mon) ||
>> > +        mon->qmp.qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX - 1;
>> 
>> Note for later: this is the condition guarding monitor_resume().
>> 
>> Is this race-free?  We have two criticial sections, one in
>> monitor_qmp_requests_pop_any(), and one here.  What if two threads
>> interleave like this
>> 
>>     thread 1                            thread 2
>>     ------------------------------------------------------------------
>>     monitor_qmp_requests_pop_any()
>>         lock
>>         dequeue
>>         unlock
>>                                         monitor_qmp_requests_pop_any()
>>                                             lock
>>                                             dequeue
>>                                             unlock
>>                                         lock
>>                                         check queue length
>>                                         unlock
>>                                         if queue length demands it
>>                                              monitor_resume() 
>>     lock
>>     check queue length
>>     unlock
>>     if queue length demands it
>>         monitor_resume()
>> 
>> Looks racy to me: if we start with the queue full (and the monitor
>> suspended), both threads' check queue length see length
>> QMP_REQ_QUEUE_LEN_MAX - 2, and neither thread resumes the monitor.
>> Oops.
>> 
>> Simplest fix is probably moving the resume into
>> monitor_qmp_requests_pop_any()'s critical section.
>
> But we only have one QMP dispatcher, right?  So IMHO we can't have two
> threads doing monitor_qmp_requests_pop_any() at the same time.

You're right, we currently run monitor_qmp_bh_dispatcher() only in the
main thread, and a thread can't race with itself.  But the main thread
can still race with handle_qmp_command() running in mon_iothread.

    main thread                         mon_iothread
    ------------------------------------------------------------------
    monitor_qmp_requests_pop_any()
        lock
        dequeue
        unlock
                                        lock
                                        if queue length demands it
                                            monitor_suspend()
                                        push onto queue
                                        unlock
    lock
    check queue length
    unlock
    if queue length demands it
        monitor_resume()

If we start with the queue full (and the monitor suspended), the main
thread first determines it'll need to resume.  mon_iothread then fills
the queue again, and suspends the suspended monitor some more.  If I
read the code correctly, this increments mon->suspend_cnt from 1 to 2.
Finally, the main thread checks the queue length:

       need_resume = !qmp_oob_enabled(req_obj->mon) ||
           mon->qmp.qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX - 1;

That yields false, because ->length is now QMP_REQ_QUEUE_LEN_MAX.  The
main thread does *not* resume the monitor.

State after this: queue full, mon->suspend_cnt == 2.  Bad, but we'll
recover on the dequeue after next: the next dequeue decrements
mon->suspend_cnt to 1 without resuming the monitor, the one after that
will decrement it to 0 and resume the monitor.

However, if handle_qmp_command() runs in this spot often enough to push
mon->suspend_cnt above QMP_REQ_QUEUE_LEN_MAX, the monitor will remain
suspended forever.

I'm teaching you multiprogramming 101 here.  The thing that should make
the moderately experienced nose twitch is the anti-pattern

    begin critical section
    do something
    end critical section
    begin critical section
    if we just did X, state must be Y, so we must now do Z
    end critical section

The part "if we just did X, state must be Y" is *wrong*.  You have no
idea what the state is, since code running between the two critical
sections may have changed it.

Here,

    X = "dequeued from a full queue"
    Y = "mon->suspend_cnt == 1"
    Z = monitor_resume() to resume the monitor

I showed that Y does not hold.

On a slightly more abstract level, this anti-pattern applies:

    begin critical section
    start messing with invariant
    end critical section
    // invariant does not hold here, oopsie
    begin critical section
    finish messing with invariant
    end critical section

The invariant here is "monitor suspended exactly when the queue is
full".  Your first critical section can make the queue non-full.  The
second one resumes.  The invariant doesn't hold in between, and all bets
are off.

To maintain the invariant, you *have* to enqueue and suspend in a single
critical section (which you do), *and* dequeue and resume in a single
critical section (which you don't).

> But I fully agree that it'll be nicer to keep them together (e.g. in
> case we allow to dispatch two commands some day), but then if you see
> it'll be something like the old req_obj->need_resume flag, but we set
> it in monitor_qmp_requests_pop_any() instead.  If so, I really prefer
> we just keep that need_resume flag for per-monitor by dropping
> 176160ce78 and keep my patch to move that flag into monitor struct
> (which I dropped after the rebase of this version).

Please have another look.

>> > +    qemu_mutex_unlock(&mon->qmp.qmp_queue_lock);
>> >      if (req_obj->req) {
>> >          trace_monitor_qmp_cmd_in_band(qobject_get_try_str(req_obj->id) ?: "");
>> > -        monitor_qmp_dispatch(req_obj->mon, req_obj->req, req_obj->id);
>> > +        monitor_qmp_dispatch(mon, req_obj->req, req_obj->id);
>> >      } else {
>> >          assert(req_obj->err);
>> >          rsp = qmp_error_response(req_obj->err);
>> >          req_obj->err = NULL;
>> > -        monitor_qmp_respond(req_obj->mon, rsp, NULL);
>> > +        monitor_qmp_respond(mon, rsp, NULL);
>> >          qobject_unref(rsp);
>> >      }
>> >  
>> >      if (need_resume) {
>> >          /* Pairs with the monitor_suspend() in handle_qmp_command() */
>> > -        monitor_resume(req_obj->mon);
>> > +        monitor_resume(mon);
>> >      }
>> >      qmp_request_free(req_obj);
>> 
>> The replacement of req_obj->mon by mon makes this change less clear than
>> it could be.  Let's figure out the possible race, and then we'll see
>> whether we'll still want this replacement.
>
> Sure.
>
>> 
>> >  
>> > @@ -4154,8 +4160,6 @@ static void monitor_qmp_bh_dispatcher(void *data)
>> >      qemu_bh_schedule(qmp_dispatcher_bh);
>> >  }
>> >  
>> > -#define  QMP_REQ_QUEUE_LEN_MAX  (8)
>> > -
>> >  static void handle_qmp_command(void *opaque, QObject *req, Error *err)
>> >  {
>> >      Monitor *mon = opaque;
>> > @@ -4205,19 +4209,12 @@ static void handle_qmp_command(void *opaque, QObject *req, Error *err)
>>        /*
>>         * If OOB is not enabled on the current monitor, we'll emulate the
>>         * old behavior that we won't process the current monitor any more
>>         * until it has responded.  This helps make sure that as long as
>>         * OOB is not enabled, the server will never drop any command.
>>         */
>> 
>> This comment is now stale, we don't drop commands anymore.
>
> AFAIU it's describing [1] below but nothing related to COMMAND_DROP?

The comment suggests the server can drop commands when OOB is enabled.
This is no longer correct after this patch.

>> 
>> >      if (!qmp_oob_enabled(mon)) {
>> >          monitor_suspend(mon);
>
> [1]
>
>> >      } else {
>> > -        /* Drop the request if queue is full. */
>> > -        if (mon->qmp.qmp_requests->length >= QMP_REQ_QUEUE_LEN_MAX) {
>> > -            qemu_mutex_unlock(&mon->qmp.qmp_queue_lock);
>> > +        if (mon->qmp.qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX - 1) {
>> >              /*
>> > -             * FIXME @id's scope is just @mon, and broadcasting it is
>> > -             * wrong.  If another monitor's client has a command with
>> > -             * the same ID in flight, the event will incorrectly claim
>> > -             * that command was dropped.
>> > +             * If this is _exactly_ the last request that we can
>> > +             * queue, we suspend the monitor right now.
>> >               */
>> > -            qapi_event_send_command_dropped(id,
>> > -                                            COMMAND_DROP_REASON_QUEUE_FULL);
>> > -            qmp_request_free(req_obj);
>> > -            return;
>> > +            monitor_suspend(mon);
>> >          }
>> >      }
>> 
>> The conditional and its comments look unbalanced.  Moreover, the
>> condition is visually dissimilar to the one guarding the matching
>> monitor_resume().  What about:
>> 
>>        /*
>>         * Suspend the monitor when we can't queue more requests after
>>         * this one.  Dequeuing in monitor_qmp_bh_dispatcher() will resume
>>         * it.
>>         * Note that when OOB is disabled, we queue at most one command,
>>         * for backward compatibility.
>>         */
>>        if (!qmp_oob_enabled(mon) ||
>>            mon->qmp.qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX - 1) {
>>            monitor_suspend(mon);
>>        }
>> 
>> You'll have to update the comment if you move the resume to
>> monitor_qmp_requests_pop_any().
>> 
>> Next:
>> 
>>        /*
>>         * Put the request to the end of queue so that requests will be
>>         * handled in time order.  Ownership for req_obj, req, id,
>>         * etc. will be delivered to the handler side.
>>         */
>> 
>> Suggest asserting mon->qmp.qmp_requests->length < QMP_REQ_QUEUE_LEN_MAX
>> here.
>
> Sure I can do this.
>
>> 
>>        g_queue_push_tail(mon->qmp.qmp_requests, req_obj);
>>        qemu_mutex_unlock(&mon->qmp.qmp_queue_lock);
>
> Thanks,
Peter Xu Sept. 4, 2018, 3:33 a.m. UTC | #5
On Mon, Sep 03, 2018 at 03:16:55PM +0200, Markus Armbruster wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Mon, Sep 03, 2018 at 09:38:00AM +0200, Markus Armbruster wrote:
> >> Peter Xu <peterx@redhat.com> writes:
> >> 
> >> > When we received too many qmp commands, previously we'll send
> >> > COMMAND_DROPPED events to monitors, then we'll drop the requests.  Now
> >> > instead of dropping the command we stop reading from input when we
> >> > notice the queue is getting full.  Note that here since we removed the
> >> > need_resume flag we need to be _very_ careful on the suspend/resume
> >> > paring on the conditions since unmatched condition checks will hang
> >> > death the monitor.  Meanwhile, now we will need to read the queue length
> >> > to decide whether we'll need to resume the monitor so we need to take
> >> > the queue lock again even after popping from it.
> >> >
> >> > Signed-off-by: Peter Xu <peterx@redhat.com>
> >> 
> >> The subject's "send CMD_DROP" fails to highlight the important part:
> >> we're no longer dropping commands.  Moreover, the commit message could
> >> do a better job explaining the tradeoffs.  Here's my try:
> >> 
> >>     monitor: Suspend monitor instead dropping commands
> >> 
> >>     When a QMP client sends in-band commands more quickly that we can
> >>     process them, we can either queue them without limit (QUEUE), drop
> >>     commands when the queue is full (DROP), or suspend receiving commands
> >>     when the queue is full (SUSPEND).  None of them is ideal:
> >> 
> >>     * QUEUE lets a misbehaving client make QEMU eat memory without bounds.
> >>       Not such a hot idea.
> >> 
> >>     * With DROP, the client has to cope with dropped in-band commands.  To
> >>       inform the client, we send a COMMAND_DROPPED event then.  The event is
> >>       flawed by design in two ways: it's ambiguous (see commit d621cfe0a17),
> >>       and it brings back the "eat memory without bounds" problem.
> >> 
> >>     * With SUSPEND, the client has to manage the flow of in-band commands to
> >>       keep the monitor available for out-of-band commands.
> >> 
> >>     We currently DROP.  Switch to SUSPEND.
> >> 
> >>     Managing the flow of in-band commands to keep the monitor available for
> >>     out-of-band commands isn't really hard: just count the number of
> >>     "outstanding" in-band commands (commands sent minus replies received),
> >>     and if it exceeds the limit, hold back additional ones until it drops
> >>     below the limit again.
> >
> > (Will the simplest QMP client just block at the write() system call
> >  without notice?
> 
> Yes.
> 
> When you stop reading from a socket or pipe, the peer eventually can't
> write more.  "Eventually", because the TCP connection or pipe buffers
> some.
> 
> A naive client using a blocking write() will block then.
> 
> A slightly more sophisticated client using a non-blocking write() has
> its write() fail with EAGAIN or EWOULDBLOCK.
> 
> In both cases, OOB commands may be stuck in the TCP connection's /
> pipe's buffer.
> 
> 
> >                   Anyway, the SUSPEND is ideal enough to me... :)
> >
> >> 
> >>     Note that we need to be careful pairing the suspend with a resume, or
> >>     else the monitor will hang, possibly forever.
> >
> > Will take your version, thanks.
> >
> >> 
> >> 
> >> > ---
> >> >  monitor.c | 33 +++++++++++++++------------------
> >> >  1 file changed, 15 insertions(+), 18 deletions(-)
> >> >
> >> > diff --git a/monitor.c b/monitor.c
> >> > index 3b90c9eb5f..9e6cad2af6 100644
> >> > --- a/monitor.c
> >> > +++ b/monitor.c
> >> > @@ -89,6 +89,8 @@
> >> >  #include "hw/s390x/storage-attributes.h"
> >> >  #endif
> >> >  
> >> > +#define  QMP_REQ_QUEUE_LEN_MAX  (8)
> >> > +
> >> 
> >> Let's drop the parenthesis.
> >
> > Ok.
> >
> >> 
> >> >  /*
> >> >   * Supported types:
> >> >   *
> >> > @@ -4124,29 +4126,33 @@ static QMPRequest *monitor_qmp_requests_pop_any(void)
> >> >  static void monitor_qmp_bh_dispatcher(void *data)
> >> >  {
> >> >      QMPRequest *req_obj = monitor_qmp_requests_pop_any();
> >> > +    Monitor *mon;
> >> >      QDict *rsp;
> >> >      bool need_resume;
> >> >  
> >> >      if (!req_obj) {
> >> >          return;
> >> >      }
> >> > -
> >> 
> >> Let's keep the blank line.
> >
> > Ok.
> >
> >> 
> >> > +    mon = req_obj->mon;
> >> >      /*  qmp_oob_enabled() might change after "qmp_capabilities" */
> >> > -    need_resume = !qmp_oob_enabled(req_obj->mon);
> >> > +    qemu_mutex_lock(&mon->qmp.qmp_queue_lock);
> >> > +    need_resume = !qmp_oob_enabled(req_obj->mon) ||
> >> > +        mon->qmp.qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX - 1;
> >> 
> >> Note for later: this is the condition guarding monitor_resume().
> >> 
> >> Is this race-free?  We have two criticial sections, one in
> >> monitor_qmp_requests_pop_any(), and one here.  What if two threads
> >> interleave like this
> >> 
> >>     thread 1                            thread 2
> >>     ------------------------------------------------------------------
> >>     monitor_qmp_requests_pop_any()
> >>         lock
> >>         dequeue
> >>         unlock
> >>                                         monitor_qmp_requests_pop_any()
> >>                                             lock
> >>                                             dequeue
> >>                                             unlock
> >>                                         lock
> >>                                         check queue length
> >>                                         unlock
> >>                                         if queue length demands it
> >>                                              monitor_resume() 
> >>     lock
> >>     check queue length
> >>     unlock
> >>     if queue length demands it
> >>         monitor_resume()
> >> 
> >> Looks racy to me: if we start with the queue full (and the monitor
> >> suspended), both threads' check queue length see length
> >> QMP_REQ_QUEUE_LEN_MAX - 2, and neither thread resumes the monitor.
> >> Oops.
> >> 
> >> Simplest fix is probably moving the resume into
> >> monitor_qmp_requests_pop_any()'s critical section.
> >
> > But we only have one QMP dispatcher, right?  So IMHO we can't have two
> > threads doing monitor_qmp_requests_pop_any() at the same time.
> 
> You're right, we currently run monitor_qmp_bh_dispatcher() only in the
> main thread, and a thread can't race with itself.  But the main thread
> can still race with handle_qmp_command() running in mon_iothread.
> 
>     main thread                         mon_iothread
>     ------------------------------------------------------------------
>     monitor_qmp_requests_pop_any()
>         lock
>         dequeue
>         unlock
>                                         lock
>                                         if queue length demands it
>                                             monitor_suspend()
>                                         push onto queue
>                                         unlock
>     lock
>     check queue length
>     unlock
>     if queue length demands it
>         monitor_resume()           <---------------------- [1]
> 
> If we start with the queue full (and the monitor suspended), the main
> thread first determines it'll need to resume.  mon_iothread then fills
> the queue again, and suspends the suspended monitor some more.  If I

(Side note: it's tricky here that when we "determine to resume" we
 haven't actually resumed yet, so we are still suspended; hence the
 mon_iothread cannot fill the queue again until the resume at [1]
 above.  So IMHO we're safe here.  But I agree that it's still racy
 in other cases.)

> read the code correctly, this increments mon->suspend_cnt from 1 to 2.
> Finally, the main thread checks the queue length:
> 
>        need_resume = !qmp_oob_enabled(req_obj->mon) ||
>            mon->qmp.qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX - 1;
> 
> The yields false, because ->length is now QMP_REQ_QUEUE_LEN_MAX.  The
> main thread does *not* resume the monitor.
> 
> State after this: queue full, mon->suspend_cnt == 2.  Bad, but we'll
> recover on the dequeue after next: the next dequeue decrements
> mon->suspend_cnt to 1 without resuming the monitor, the one after that
> will decrement it to 0 and resume the monitor.
> 
> However, if handle_qmp_command() runs in this spot often enough to push
> mon->suspend_cnt above QMP_REQ_QUEUE_LEN_MAX, the monitor will remain
> suspended forever.
> 
> I'm teaching you multiprogramming 101 here.  The thing that should make
> the moderately experienced nose twitch is the anti-pattern
> 
>     begin critical section
>     do something
>     end critical section
>     begin critical section
>     if we just did X, state must be Y, so we must now do Z
>     end critical section
> 
> The part "if we just did X, state must be Y" is *wrong*.  You have no
> idea what the state is, since code running between the two critical
> sections may have changed it.
> 
> Here,
> 
>     X = dequeued from a full queue"
>     Y = "mon->suspend_cnt == 1"
>     Z = monitor_resume() to resume the monitor
> 
> I showed that Y does not hold.
> 
> On a slightly more abstract level, this anti-pattern applies:
> 
>     begin critical section
>     start messing with invariant
>     end critical section
>     // invariant does not hold here, oopsie
>     begin critical section
>     finish messing with invariant
>     end critical section
> 
> The invariant here is "monitor suspended exactly when the queue is
> full".  Your first critical section can make the queue non-full.  The
> second one resumes.  The invariant doesn't hold in between, and all bets
> are off.
> 
> To maintain the invariant, you *have* to enqueue and suspend in a single
> critical section (which you do), *and* dequeue and resume in a single
> critical section (which you don't).

Thank you for teaching me the lesson.

> 
> > But I fully agree that it'll be nicer to keep them together (e.g. in
> > case we allow to dispatch two commands some day), but then if you see
> > it'll be something like the old req_obj->need_resume flag, but we set
> > it in monitor_qmp_requests_pop_any() instead.  If so, I really prefer
> > we just keep that need_resume flag for per-monitor by dropping
> > 176160ce78 and keep my patch to move that flag into monitor struct
> > (which I dropped after the rebase of this version).
> 
> Please have another look.

Again, if we want to put pop+check into the same critical section,
then we probably need monitor_qmp_requests_pop_any() to return some
queue length information; in that case I would prefer to keep the
per-monitor need_resume flag.
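
Something like this, roughly (untested sketch; assumes a new
mon->qmp.need_resume field that monitor_qmp_requests_pop_any() sets on
every successful pop, inside its qmp_queue_lock critical section):

    static void monitor_qmp_bh_dispatcher(void *data)
    {
        QMPRequest *req_obj = monitor_qmp_requests_pop_any();
        QDict *rsp;
        Monitor *mon;

        if (!req_obj) {
            return;
        }

        mon = req_obj->mon;
        if (req_obj->req) {
            monitor_qmp_dispatch(mon, req_obj->req, req_obj->id);
        } else {
            assert(req_obj->err);
            rsp = qmp_error_response(req_obj->err);
            req_obj->err = NULL;
            monitor_qmp_respond(mon, rsp, NULL);
            qobject_unref(rsp);
        }

        if (mon->qmp.need_resume) {
            /* Pairs with the monitor_suspend() in handle_qmp_command() */
            monitor_resume(mon);
        }
        qmp_request_free(req_obj);

        /* Reschedule in case more requests are queued */
        qemu_bh_schedule(qmp_dispatcher_bh);
    }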

What do you think?

> 
> >> > +    qemu_mutex_unlock(&mon->qmp.qmp_queue_lock);
> >> >      if (req_obj->req) {
> >> >          trace_monitor_qmp_cmd_in_band(qobject_get_try_str(req_obj->id) ?: "");
> >> > -        monitor_qmp_dispatch(req_obj->mon, req_obj->req, req_obj->id);
> >> > +        monitor_qmp_dispatch(mon, req_obj->req, req_obj->id);
> >> >      } else {
> >> >          assert(req_obj->err);
> >> >          rsp = qmp_error_response(req_obj->err);
> >> >          req_obj->err = NULL;
> >> > -        monitor_qmp_respond(req_obj->mon, rsp, NULL);
> >> > +        monitor_qmp_respond(mon, rsp, NULL);
> >> >          qobject_unref(rsp);
> >> >      }
> >> >  
> >> >      if (need_resume) {
> >> >          /* Pairs with the monitor_suspend() in handle_qmp_command() */
> >> > -        monitor_resume(req_obj->mon);
> >> > +        monitor_resume(mon);
> >> >      }
> >> >      qmp_request_free(req_obj);
> >> 
> >> The replacement of req_obj->mon by mon makes this change less clear than
> >> it could be.  Let's figure out the possible race, and then we'll see
> >> whether we'll still want this replacement.
> >
> > Sure.
> >
> >> 
> >> >  
> >> > @@ -4154,8 +4160,6 @@ static void monitor_qmp_bh_dispatcher(void *data)
> >> >      qemu_bh_schedule(qmp_dispatcher_bh);
> >> >  }
> >> >  
> >> > -#define  QMP_REQ_QUEUE_LEN_MAX  (8)
> >> > -
> >> >  static void handle_qmp_command(void *opaque, QObject *req, Error *err)
> >> >  {
> >> >      Monitor *mon = opaque;
> >> > @@ -4205,19 +4209,12 @@ static void handle_qmp_command(void *opaque, QObject *req, Error *err)
> >>        /*
> >>         * If OOB is not enabled on the current monitor, we'll emulate the
> >>         * old behavior that we won't process the current monitor any more
> >>         * until it has responded.  This helps make sure that as long as
> >>         * OOB is not enabled, the server will never drop any command.
> >>         */
> >> 
> >> This comment is now stale, we don't drop commands anymore.
> >
> > AFAIU it's describing [1] below but nothing related to COMMAND_DROP?
> 
> The comment suggests the server can drop commands when OOB is enabled.
> This is no longer correct after this patch.

Ah yes, the last sentence, sorry.

In any case, I'm taking your suggestion below, so the paragraph will
be dropped.

Thanks,

> 
> >> 
> >> >      if (!qmp_oob_enabled(mon)) {
> >> >          monitor_suspend(mon);
> >
> > [1]
> >
> >> >      } else {
> >> > -        /* Drop the request if queue is full. */
> >> > -        if (mon->qmp.qmp_requests->length >= QMP_REQ_QUEUE_LEN_MAX) {
> >> > -            qemu_mutex_unlock(&mon->qmp.qmp_queue_lock);
> >> > +        if (mon->qmp.qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX - 1) {
> >> >              /*
> >> > -             * FIXME @id's scope is just @mon, and broadcasting it is
> >> > -             * wrong.  If another monitor's client has a command with
> >> > -             * the same ID in flight, the event will incorrectly claim
> >> > -             * that command was dropped.
> >> > +             * If this is _exactly_ the last request that we can
> >> > +             * queue, we suspend the monitor right now.
> >> >               */
> >> > -            qapi_event_send_command_dropped(id,
> >> > -                                            COMMAND_DROP_REASON_QUEUE_FULL);
> >> > -            qmp_request_free(req_obj);
> >> > -            return;
> >> > +            monitor_suspend(mon);
> >> >          }
> >> >      }
> >> 
> >> The conditional and its comments look unbalanced.  Moreover, the
> >> condition is visually dissimilar to the one guarding the matching
> >> monitor_resume().  What about:
> >> 
> >>        /*
> >>         * Suspend the monitor when we can't queue more requests after
> >>         * this one.  Dequeuing in monitor_qmp_bh_dispatcher() will resume
> >>         * it.
> >>         * Note that when OOB is disabled, we queue at most one command,
> >>         * for backward compatibility.
> >>         */
> >>        if (!qmp_oob_enabled(mon) ||
> >>            mon->qmp.qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX - 1) {
> >>            monitor_suspend(mon);
> >>        }
> >> 
> >> You'll have to update the comment if you move the resume to
> >> monitor_qmp_requests_pop_any().
> >> 
> >> Next:
> >> 
> >>        /*
> >>         * Put the request to the end of queue so that requests will be
> >>         * handled in time order.  Ownership for req_obj, req, id,
> >>         * etc. will be delivered to the handler side.
> >>         */
> >> 
> >> Suggest asserting mon->qmp.qmp_requests->length < QMP_REQ_QUEUE_LEN_MAX
> >> here.
> >
> > Sure I can do this.
> >
> >> 
> >>        g_queue_push_tail(mon->qmp.qmp_requests, req_obj);
> >>        qemu_mutex_unlock(&mon->qmp.qmp_queue_lock);
> >
> > Thanks,

Regards,
Markus Armbruster Sept. 4, 2018, 6:17 a.m. UTC | #6
Peter Xu <peterx@redhat.com> writes:

> On Mon, Sep 03, 2018 at 03:16:55PM +0200, Markus Armbruster wrote:
>> Peter Xu <peterx@redhat.com> writes:
>> 
>> > On Mon, Sep 03, 2018 at 09:38:00AM +0200, Markus Armbruster wrote:
>> >> Peter Xu <peterx@redhat.com> writes:
>> >> 
>> >> > When we received too many qmp commands, previously we'll send
>> >> > COMMAND_DROPPED events to monitors, then we'll drop the requests.  Now
>> >> > instead of dropping the command we stop reading from input when we
>> >> > notice the queue is getting full.  Note that here since we removed the
>> >> > need_resume flag we need to be _very_ careful on the suspend/resume
>> >> > paring on the conditions since unmatched condition checks will hang
>> >> > death the monitor.  Meanwhile, now we will need to read the queue length
>> >> > to decide whether we'll need to resume the monitor so we need to take
>> >> > the queue lock again even after popping from it.
>> >> >
>> >> > Signed-off-by: Peter Xu <peterx@redhat.com>
>> >> 
>> >> The subject's "send CMD_DROP" fails to highlight the important part:
>> >> we're no longer dropping commands.  Moreover, the commit message could
>> >> do a better job explaining the tradeoffs.  Here's my try:
>> >> 
>> >>     monitor: Suspend monitor instead dropping commands
>> >> 
>> >>     When a QMP client sends in-band commands more quickly that we can
>> >>     process them, we can either queue them without limit (QUEUE), drop
>> >>     commands when the queue is full (DROP), or suspend receiving commands
>> >>     when the queue is full (SUSPEND).  None of them is ideal:
>> >> 
>> >>     * QUEUE lets a misbehaving client make QEMU eat memory without bounds.
>> >>       Not such a hot idea.
>> >> 
>> >>     * With DROP, the client has to cope with dropped in-band commands.  To
>> >>       inform the client, we send a COMMAND_DROPPED event then.  The event is
>> >>       flawed by design in two ways: it's ambiguous (see commit d621cfe0a17),
>> >>       and it brings back the "eat memory without bounds" problem.
>> >> 
>> >>     * With SUSPEND, the client has to manage the flow of in-band commands to
>> >>       keep the monitor available for out-of-band commands.
>> >> 
>> >>     We currently DROP.  Switch to SUSPEND.
>> >> 
>> >>     Managing the flow of in-band commands to keep the monitor available for
>> >>     out-of-band commands isn't really hard: just count the number of
>> >>     "outstanding" in-band commands (commands sent minus replies received),
>> >>     and if it exceeds the limit, hold back additional ones until it drops
>> >>     below the limit again.
>> >
>> > (Will the simplest QMP client just block at the write() system call
>> >  without notice?
>> 
>> Yes.
>> 
>> When you stop reading from a socket or pipe, the peer eventually can't
>> write more.  "Eventually", because the TCP connection or pipe buffers
>> some.
>> 
>> A naive client using a blocking write() will block then.
>> 
>> A slightly more sophisticated client using a non-blocking write() has
>> its write() fail with EAGAIN or EWOULDBLOCK.
>> 
>> In both cases, OOB commands may be stuck in the TCP connection's /
>> pipe's buffer.
>> 
>> 
>> >                   Anyway, the SUSPEND is ideal enough to me... :)
>> >
>> >> 
>> >>     Note that we need to be careful pairing the suspend with a resume, or
>> >>     else the monitor will hang, possibly forever.
>> >
>> > Will take your version, thanks.
>> >
>> >> 
>> >> 
>> >> > ---
>> >> >  monitor.c | 33 +++++++++++++++------------------
>> >> >  1 file changed, 15 insertions(+), 18 deletions(-)
>> >> >
>> >> > diff --git a/monitor.c b/monitor.c
>> >> > index 3b90c9eb5f..9e6cad2af6 100644
>> >> > --- a/monitor.c
>> >> > +++ b/monitor.c
>> >> > @@ -89,6 +89,8 @@
>> >> >  #include "hw/s390x/storage-attributes.h"
>> >> >  #endif
>> >> >  
>> >> > +#define  QMP_REQ_QUEUE_LEN_MAX  (8)
>> >> > +
>> >> 
>> >> Let's drop the parenthesis.
>> >
>> > Ok.
>> >
>> >> 
>> >> >  /*
>> >> >   * Supported types:
>> >> >   *
>> >> > @@ -4124,29 +4126,33 @@ static QMPRequest *monitor_qmp_requests_pop_any(void)
>> >> >  static void monitor_qmp_bh_dispatcher(void *data)
>> >> >  {
>> >> >      QMPRequest *req_obj = monitor_qmp_requests_pop_any();
>> >> > +    Monitor *mon;
>> >> >      QDict *rsp;
>> >> >      bool need_resume;
>> >> >  
>> >> >      if (!req_obj) {
>> >> >          return;
>> >> >      }
>> >> > -
>> >> 
>> >> Let's keep the blank line.
>> >
>> > Ok.
>> >
>> >> 
>> >> > +    mon = req_obj->mon;
>> >> >      /*  qmp_oob_enabled() might change after "qmp_capabilities" */
>> >> > -    need_resume = !qmp_oob_enabled(req_obj->mon);
>> >> > +    qemu_mutex_lock(&mon->qmp.qmp_queue_lock);
>> >> > +    need_resume = !qmp_oob_enabled(req_obj->mon) ||
>> >> > +        mon->qmp.qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX - 1;
>> >> 
>> >> Note for later: this is the condition guarding monitor_resume().
>> >> 
>> >> Is this race-free?  We have two criticial sections, one in
>> >> monitor_qmp_requests_pop_any(), and one here.  What if two threads
>> >> interleave like this
>> >> 
>> >>     thread 1                            thread 2
>> >>     ------------------------------------------------------------------
>> >>     monitor_qmp_requests_pop_any()
>> >>         lock
>> >>         dequeue
>> >>         unlock
>> >>                                         monitor_qmp_requests_pop_any()
>> >>                                             lock
>> >>                                             dequeue
>> >>                                             unlock
>> >>                                         lock
>> >>                                         check queue length
>> >>                                         unlock
>> >>                                         if queue length demands it
>> >>                                              monitor_resume() 
>> >>     lock
>> >>     check queue length
>> >>     unlock
>> >>     if queue length demands it
>> >>         monitor_resume()
>> >> 
>> >> Looks racy to me: if we start with the queue full (and the monitor
>> >> suspended), both threads' check queue length see length
>> >> QMP_REQ_QUEUE_LEN_MAX - 2, and neither thread resumes the monitor.
>> >> Oops.
>> >> 
>> >> Simplest fix is probably moving the resume into
>> >> monitor_qmp_requests_pop_any()'s critical section.
>> >
>> > But we only have one QMP dispatcher, right?  So IMHO we can't have two
>> > threads doing monitor_qmp_requests_pop_any() at the same time.
>> 
>> You're right, we currently run monitor_qmp_bh_dispatcher() only in the
>> main thread, and a thread can't race with itself.  But the main thread
>> can still race with handle_qmp_command() running in mon_iothread.
>> 
>>     main thread                         mon_iothread
>>     ------------------------------------------------------------------
>>     monitor_qmp_requests_pop_any()
>>         lock
>>         dequeue
>>         unlock
>>                                         lock
>>                                         if queue length demands it
>>                                             monitor_suspend()
>>                                         push onto queue
>>                                         unlock
>>     lock
>>     check queue length
>>     unlock
>>     if queue length demands it
>>         monitor_resume()           <---------------------- [1]
>> 
>> If we start with the queue full (and the monitor suspended), the main
>> thread first determines it'll need to resume.  mon_iothread then fills
>> the queue again, and suspends the suspended monitor some more.  If I
>
> (Side note: here it's tricky that when we "determines to resume" we
>  didn't really resume, so we are still suspended, hence the
>  mon_iothread cannot fill the queue again until the resume at [1]

You're right.

>  above.  Hence IMHO we're safe here.  But I agree that it's still racy
>  in other cases.)

Maybe.

Getting threads to cooperate safely is hard.  Key to success is making
things as simple and obvious as we can.  Suitable invariants help.

They help even more when they're documented and checked with assertions.
Perhaps you can think of ways to do that.
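
To make that concrete, here is a rough sketch of what such an
assertion could look like.  It is only an illustration, not code from
this series: the helper name is invented, mon->suspend_cnt is read
directly (the real code may want an atomic accessor), and it assumes
queue overflow is the only reason an OOB monitor ever gets suspended.

    /* Sketch only.  Call with mon->qmp.qmp_queue_lock held. */
    static void monitor_qmp_assert_queue_invariant(Monitor *mon)
    {
        if (!qmp_oob_enabled(mon)) {
            return;             /* the invariant is about OOB monitors */
        }
        /* "monitor suspended exactly when the queue is full":
         * a full queue must come with a suspended monitor ... */
        assert(mon->qmp.qmp_requests->length < QMP_REQ_QUEUE_LEN_MAX ||
               mon->suspend_cnt > 0);
        /* ... and, assuming overflow is the only reason this monitor is
         * ever suspended, a non-full queue must not. */
        assert(mon->qmp.qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX ||
               mon->suspend_cnt == 0);
    }

Every critical section that touches the queue or the suspend state
could call this just before dropping the lock.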

>> read the code correctly, this increments mon->suspend_cnt from 1 to 2.
>> Finally, the main thread checks the queue length:
>> 
>>        need_resume = !qmp_oob_enabled(req_obj->mon) ||
>>            mon->qmp.qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX - 1;
>> 
>> This yields false, because ->length is now QMP_REQ_QUEUE_LEN_MAX.  The
>> main thread does *not* resume the monitor.
>> 
>> State after this: queue full, mon->suspend_cnt == 2.  Bad, but we'll
>> recover on the dequeue after next: the next dequeue decrements
>> mon->suspend_cnt to 1 without resuming the monitor, the one after that
>> will decrement it to 0 and resume the monitor.
>> 
>> However, if handle_qmp_command() runs in this spot often enough to push
>> mon->suspend_cnt above QMP_REQ_QUEUE_LEN_MAX, the monitor will remain
>> suspended forever.
>> 
>> I'm teaching you multiprogramming 101 here.  The thing that should make
>> the moderately experienced nose twitch is the anti-pattern
>> 
>>     begin critical section
>>     do something
>>     end critical section
>>     begin critical section
>>     if we just did X, state must be Y, so we must now do Z
>>     end critical section
>> 
>> The part "if we just did X, state must be Y" is *wrong*.  You have no
>> idea what the state is, since code running between the two critical
>> sections may have changed it.
>> 
>> Here,
>> 
>>     X = "dequeued from a full queue"
>>     Y = "mon->suspend_cnt == 1"
>>     Z = monitor_resume() to resume the monitor
>> 
>> I showed that Y does not hold.
>> 
>> On a slightly more abstract level, this anti-pattern applies:
>> 
>>     begin critical section
>>     start messing with invariant
>>     end critical section
>>     // invariant does not hold here, oopsie
>>     begin critical section
>>     finish messing with invariant
>>     end critical section
>> 
>> The invariant here is "monitor suspended exactly when the queue is
>> full".  Your first critical section can make the queue non-full.  The
>> second one resumes.  The invariant doesn't hold in between, and all bets
>> are off.
>> 
>> To maintain the invariant, you *have* to enqueue and suspend in a single
>> critical section (which you do), *and* dequeue and resume in a single
>> critical section (which you don't).
>
> Thank you for teaching the lesson for me.
>
>> 
>> > But I fully agree that it'll be nicer to keep them together (e.g. in
>> > case we allow dispatching two commands some day), but then it'll end
>> > up being something like the old req_obj->need_resume flag, except that
>> > we set it in monitor_qmp_requests_pop_any() instead.  If so, I'd really
>> > prefer to keep that need_resume flag per-monitor by dropping
>> > 176160ce78 and keeping my patch that moves the flag into the Monitor
>> > struct (which I dropped after the rebase of this version).
>> 
>> Please have another look.
>
> Again, if we want to put pop+check into the same critical section,
> then we possibly need to return something from the
> monitor_qmp_requests_pop_any() to show the queue length information,
> then I would prefer to keep the per-monitor need_resume flag.
>
> What do you think?

Options:

* Save rather than recompute @need_resume

  This gets rid of the second critical section.

* Have monitor_qmp_requests_pop_any() return @need_resume

  This merges the second critical section into the first.

* Have a single critical section in monitor_qmp_bh_dispatcher()

  This replaces both critical sections by a single larger one.
  Callers of monitor_qmp_bh_dispatcher() must hold monitor_lock.

* Other ideas?

The question to ask regardless of option: what invariant do the critical
sections maintain?

I proposed one: monitor suspended exactly when the queue is full.

handle_qmp_command()'s critical section maintains it: it suspends when
its enqueue fills up the queue.

monitor_qmp_bh_dispatcher()'s critical sections do not.  Even if we
reduce them from two to one, the resulting single critical section
can't maintain it as long as we resume only after the dequeued command
has completed, because that would require holding monitor_lock while
the command executes.  So, to maintain this invariant, we need to
rethink monitor_qmp_bh_dispatcher().  Why can't we resume right after
dequeuing?

You're of course welcome to propose a different invariant.
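
To make the "resume right after dequeuing" idea concrete, a minimal
sketch of the queue-full case (only a sketch: the helper name is
invented, the round-robin over monitors that
monitor_qmp_requests_pop_any() performs is omitted, the
!qmp_oob_enabled() resume is left where it is today, and it assumes
calling monitor_resume() with the queue lock held is safe):

    /* Sketch only: dequeue one request from @mon and, if that drains a
     * full queue, resume in the same critical section, so "suspended
     * exactly when full" is never visible in a half-updated state. */
    static QMPRequest *monitor_qmp_requests_pop_one(Monitor *mon)
    {
        QMPRequest *req_obj;

        qemu_mutex_lock(&mon->qmp.qmp_queue_lock);
        req_obj = g_queue_pop_head(mon->qmp.qmp_requests);
        if (req_obj && qmp_oob_enabled(mon) &&
            mon->qmp.qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX - 1) {
            /* pairs with the monitor_suspend() in handle_qmp_command() */
            monitor_resume(mon);
        }
        qemu_mutex_unlock(&mon->qmp.qmp_queue_lock);
        return req_obj;
    }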

[...]
Peter Xu Sept. 4, 2018, 7:01 a.m. UTC | #7
On Tue, Sep 04, 2018 at 08:17:54AM +0200, Markus Armbruster wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Mon, Sep 03, 2018 at 03:16:55PM +0200, Markus Armbruster wrote:
> >> Peter Xu <peterx@redhat.com> writes:
> >> 
> >> > On Mon, Sep 03, 2018 at 09:38:00AM +0200, Markus Armbruster wrote:
> >> >> Peter Xu <peterx@redhat.com> writes:
> >> >> 
> >> >> > When we received too many qmp commands, previously we'll send
> >> >> > COMMAND_DROPPED events to monitors, then we'll drop the requests.  Now
> >> >> > instead of dropping the command we stop reading from input when we
> >> >> > notice the queue is getting full.  Note that here since we removed the
> >> >> > need_resume flag we need to be _very_ careful on the suspend/resume
> >> >> > paring on the conditions since unmatched condition checks will hang
> >> >> > death the monitor.  Meanwhile, now we will need to read the queue length
> >> >> > to decide whether we'll need to resume the monitor so we need to take
> >> >> > the queue lock again even after popping from it.
> >> >> >
> >> >> > Signed-off-by: Peter Xu <peterx@redhat.com>
> >> >> 
> >> >> The subject's "send CMD_DROP" fails to highlight the important part:
> >> >> we're no longer dropping commands.  Moreover, the commit message could
> >> >> do a better job explaining the tradeoffs.  Here's my try:
> >> >> 
> >> >>     monitor: Suspend monitor instead dropping commands
> >> >> 
> >> >>     When a QMP client sends in-band commands more quickly that we can
> >> >>     process them, we can either queue them without limit (QUEUE), drop
> >> >>     commands when the queue is full (DROP), or suspend receiving commands
> >> >>     when the queue is full (SUSPEND).  None of them is ideal:
> >> >> 
> >> >>     * QUEUE lets a misbehaving client make QEMU eat memory without bounds.
> >> >>       Not such a hot idea.
> >> >> 
> >> >>     * With DROP, the client has to cope with dropped in-band commands.  To
> >> >>       inform the client, we send a COMMAND_DROPPED event then.  The event is
> >> >>       flawed by design in two ways: it's ambiguous (see commit d621cfe0a17),
> >> >>       and it brings back the "eat memory without bounds" problem.
> >> >> 
> >> >>     * With SUSPEND, the client has to manage the flow of in-band commands to
> >> >>       keep the monitor available for out-of-band commands.
> >> >> 
> >> >>     We currently DROP.  Switch to SUSPEND.
> >> >> 
> >> >>     Managing the flow of in-band commands to keep the monitor available for
> >> >>     out-of-band commands isn't really hard: just count the number of
> >> >>     "outstanding" in-band commands (commands sent minus replies received),
> >> >>     and if it exceeds the limit, hold back additional ones until it drops
> >> >>     below the limit again.
> >> >
> >> > (Will the simplest QMP client just block at the write() system call
> >> >  without notice?
> >> 
> >> Yes.
> >> 
> >> When you stop reading from a socket or pipe, the peer eventually can't
> >> write more.  "Eventually", because the TCP connection or pipe buffers
> >> some.
> >> 
> >> A naive client using a blocking write() will block then.
> >> 
> >> A slightly more sophisticated client using a non-blocking write() has
> >> its write() fail with EAGAIN or EWOULDBLOCK.
> >> 
> >> In both cases, OOB commands may be stuck in the TCP connection's /
> >> pipe's buffer.
> >> 
> >> 
> >> >                   Anyway, the SUSPEND is ideal enough to me... :)
> >> >
> >> >> 
> >> >>     Note that we need to be careful pairing the suspend with a resume, or
> >> >>     else the monitor will hang, possibly forever.
> >> >
> >> > Will take your version, thanks.
> >> >
> >> >> 
> >> >> 
> >> >> > ---
> >> >> >  monitor.c | 33 +++++++++++++++------------------
> >> >> >  1 file changed, 15 insertions(+), 18 deletions(-)
> >> >> >
> >> >> > diff --git a/monitor.c b/monitor.c
> >> >> > index 3b90c9eb5f..9e6cad2af6 100644
> >> >> > --- a/monitor.c
> >> >> > +++ b/monitor.c
> >> >> > @@ -89,6 +89,8 @@
> >> >> >  #include "hw/s390x/storage-attributes.h"
> >> >> >  #endif
> >> >> >  
> >> >> > +#define  QMP_REQ_QUEUE_LEN_MAX  (8)
> >> >> > +
> >> >> 
> >> >> Let's drop the parenthesis.
> >> >
> >> > Ok.
> >> >
> >> >> 
> >> >> >  /*
> >> >> >   * Supported types:
> >> >> >   *
> >> >> > @@ -4124,29 +4126,33 @@ static QMPRequest *monitor_qmp_requests_pop_any(void)
> >> >> >  static void monitor_qmp_bh_dispatcher(void *data)
> >> >> >  {
> >> >> >      QMPRequest *req_obj = monitor_qmp_requests_pop_any();
> >> >> > +    Monitor *mon;
> >> >> >      QDict *rsp;
> >> >> >      bool need_resume;
> >> >> >  
> >> >> >      if (!req_obj) {
> >> >> >          return;
> >> >> >      }
> >> >> > -
> >> >> 
> >> >> Let's keep the blank line.
> >> >
> >> > Ok.
> >> >
> >> >> 
> >> >> > +    mon = req_obj->mon;
> >> >> >      /*  qmp_oob_enabled() might change after "qmp_capabilities" */
> >> >> > -    need_resume = !qmp_oob_enabled(req_obj->mon);
> >> >> > +    qemu_mutex_lock(&mon->qmp.qmp_queue_lock);
> >> >> > +    need_resume = !qmp_oob_enabled(req_obj->mon) ||
> >> >> > +        mon->qmp.qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX - 1;
> >> >> 
> >> >> Note for later: this is the condition guarding monitor_resume().
> >> >> 
> >> >> Is this race-free?  We have two criticial sections, one in
> >> >> monitor_qmp_requests_pop_any(), and one here.  What if two threads
> >> >> interleave like this
> >> >> 
> >> >>     thread 1                            thread 2
> >> >>     ------------------------------------------------------------------
> >> >>     monitor_qmp_requests_pop_any()
> >> >>         lock
> >> >>         dequeue
> >> >>         unlock
> >> >>                                         monitor_qmp_requests_pop_any()
> >> >>                                             lock
> >> >>                                             dequeue
> >> >>                                             unlock
> >> >>                                         lock
> >> >>                                         check queue length
> >> >>                                         unlock
> >> >>                                         if queue length demands it
> >> >>                                              monitor_resume() 
> >> >>     lock
> >> >>     check queue length
> >> >>     unlock
> >> >>     if queue length demands it
> >> >>         monitor_resume()
> >> >> 
> >> >> Looks racy to me: if we start with the queue full (and the monitor
> >> >> suspended), both threads' check queue length see length
> >> >> QMP_REQ_QUEUE_LEN_MAX - 2, and neither thread resumes the monitor.
> >> >> Oops.
> >> >> 
> >> >> Simplest fix is probably moving the resume into
> >> >> monitor_qmp_requests_pop_any()'s critical section.
> >> >
> >> > But we only have one QMP dispatcher, right?  So IMHO we can't have two
> >> > threads doing monitor_qmp_requests_pop_any() at the same time.
> >> 
> >> You're right, we currently run monitor_qmp_bh_dispatcher() only in the
> >> main thread, and a thread can't race with itself.  But the main thread
> >> can still race with handle_qmp_command() running in mon_iothread.
> >> 
> >>     main thread                         mon_iothread
> >>     ------------------------------------------------------------------
> >>     monitor_qmp_requests_pop_any()
> >>         lock
> >>         dequeue
> >>         unlock
> >>                                         lock
> >>                                         if queue length demands it
> >>                                             monitor_suspend()
> >>                                         push onto queue
> >>                                         unlock
> >>     lock
> >>     check queue length
> >>     unlock
> >>     if queue length demands it
> >>         monitor_resume()           <---------------------- [1]
> >> 
> >> If we start with the queue full (and the monitor suspended), the main
> >> thread first determines it'll need to resume.  mon_iothread then fills
> >> the queue again, and suspends the suspended monitor some more.  If I
> >
> > (Side note: here it's tricky that when we "determines to resume" we
> >  didn't really resume, so we are still suspended, hence the
> >  mon_iothread cannot fill the queue again until the resume at [1]
> 
> You're right.
> 
> >  above.  Hence IMHO we're safe here.  But I agree that it's still racy
> >  in other cases.)
> 
> Maybe.
> 
> Getting threads to cooperate safely is hard.  Key to success is making
> things as simple and obvious as we can.  Suitable invariants help.
> 
> They help even more when they're documented and checked with assertions.
> Perhaps you can think of ways to do that.
> 
> >> read the code correctly, this increments mon->suspend_cnt from 1 to 2.
> >> Finally, the main thread checks the queue length:
> >> 
> >>        need_resume = !qmp_oob_enabled(req_obj->mon) ||
> >>            mon->qmp.qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX - 1;
> >> 
> >> This yields false, because ->length is now QMP_REQ_QUEUE_LEN_MAX.  The
> >> main thread does *not* resume the monitor.
> >> 
> >> State after this: queue full, mon->suspend_cnt == 2.  Bad, but we'll
> >> recover on the dequeue after next: the next dequeue decrements
> >> mon->suspend_cnt to 1 without resuming the monitor, the one after that
> >> will decrement it to 0 and resume the monitor.
> >> 
> >> However, if handle_qmp_command() runs in this spot often enough to push
> >> mon->suspend_cnt above QMP_REQ_QUEUE_LEN_MAX, the monitor will remain
> >> suspended forever.
> >> 
> >> I'm teaching you multiprogramming 101 here.  The thing that should make
> >> the moderately experienced nose twitch is the anti-pattern
> >> 
> >>     begin critical section
> >>     do something
> >>     end critical section
> >>     begin critical section
> >>     if we just did X, state must be Y, so we must now do Z
> >>     end critical section
> >> 
> >> The part "if we just did X, state must be Y" is *wrong*.  You have no
> >> idea what the state is, since code running between the two critical
> >> sections may have changed it.
> >> 
> >> Here,
> >> 
> >>     X = "dequeued from a full queue"
> >>     Y = "mon->suspend_cnt == 1"
> >>     Z = monitor_resume() to resume the monitor
> >> 
> >> I showed that Y does not hold.
> >> 
> >> On a slightly more abstract level, this anti-pattern applies:
> >> 
> >>     begin critical section
> >>     start messing with invariant
> >>     end critical section
> >>     // invariant does not hold here, oopsie
> >>     begin critical section
> >>     finish messing with invariant
> >>     end critical section
> >> 
> >> The invariant here is "monitor suspended exactly when the queue is
> >> full".  Your first critical section can make the queue non-full.  The
> >> second one resumes.  The invariant doesn't hold in between, and all bets
> >> are off.
> >> 
> >> To maintain the invariant, you *have* to enqueue and suspend in a single
> >> critical section (which you do), *and* dequeue and resume in a single
> >> critical section (which you don't).
> >
> > Thank you for teaching the lesson for me.
> >
> >> 
> >> > But I fully agree that it'll be nicer to keep them together (e.g. in
> >> > case we allow dispatching two commands some day), but then it'll end
> >> > up being something like the old req_obj->need_resume flag, except that
> >> > we set it in monitor_qmp_requests_pop_any() instead.  If so, I'd really
> >> > prefer to keep that need_resume flag per-monitor by dropping
> >> > 176160ce78 and keeping my patch that moves the flag into the Monitor
> >> > struct (which I dropped after the rebase of this version).
> >> 
> >> Please have another look.
> >
> > Again, if we want to put pop+check into the same critical section,
> > then we possibly need to return something from the
> > monitor_qmp_requests_pop_any() to show the queue length information,
> > then I would prefer to keep the per-monitor need_resume flag.
> >
> > What do you think?
> 
> Options:
> 
> * Save rather than recompute @need_resume
> 
>   This gets rid of the second critical section.

This one I always liked.  Actually, it avoids potential risks when the
conditions become more complicated (right now there are only two
checks).  Meanwhile...

> 
> * Have monitor_qmp_requests_pop_any() return @need_resume
> 
>   This merges the second critical section into the first.

(I don't like this one much...)

> 
> * Have a single critical section in monitor_qmp_bh_dispatcher()
> 
>   This replaces both critical sections by a single larger one.
>   Callers of monitor_qmp_bh_dispatcher() must hold monitor_lock.

... I like this one too, since it's clean enough at least for now and
it won't require you to drop a patch already queued in monitor-next.

I'll use this one if no one disagrees.
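
For reference, a single-critical-section dispatcher could look roughly
like this.  Again only a sketch: it handles a single monitor (the
round-robin of monitor_qmp_requests_pop_any() is omitted), skips the
error-response path, and keeps the resume after dispatch, so the window
between the unlock and monitor_resume() discussed above still exists.

    static void monitor_qmp_bh_dispatcher_sketch(Monitor *mon)
    {
        QMPRequest *req_obj;
        bool need_resume;

        qemu_mutex_lock(&mon->qmp.qmp_queue_lock);
        req_obj = g_queue_pop_head(mon->qmp.qmp_requests);
        /* decide about the resume under the same lock as the pop */
        need_resume = req_obj &&
            (!qmp_oob_enabled(mon) ||
             mon->qmp.qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX - 1);
        qemu_mutex_unlock(&mon->qmp.qmp_queue_lock);

        if (!req_obj) {
            return;
        }
        monitor_qmp_dispatch(mon, req_obj->req, req_obj->id);
        if (need_resume) {
            /* pairs with the monitor_suspend() in handle_qmp_command() */
            monitor_resume(mon);
        }
        qmp_request_free(req_obj);
    }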

> 
> * Other ideas?
> 
> The question to ask regardless of option: what invariant do the critical
> sections maintain?
> 
> I proposed one: monitor suspended exactly when the queue is full.
> 
> handle_qmp_command()'s critical section maintains it: it suspends when
> its enqueue fills up the queue.
> 
> monitor_qmp_bh_dispatcher()'s critical sections do not.  Even if we
> reduce them from two to one, the resulting single critical section
> can't maintain it as long as we resume only after the dequeued command
> has completed, because that would require holding monitor_lock while
> the command executes.  So, to maintain this invariant, we need to
> rethink monitor_qmp_bh_dispatcher().  Why can't we resume right after
> dequeuing?

Makes sense.  I can do that in my next post if you like.

> 
> You're of course welcome to propose a different invariant.
> 
> [...]

Thanks,
diff mbox series

Patch

diff --git a/monitor.c b/monitor.c
index 3b90c9eb5f..9e6cad2af6 100644
--- a/monitor.c
+++ b/monitor.c
@@ -89,6 +89,8 @@ 
 #include "hw/s390x/storage-attributes.h"
 #endif
 
+#define  QMP_REQ_QUEUE_LEN_MAX  (8)
+
 /*
  * Supported types:
  *
@@ -4124,29 +4126,33 @@  static QMPRequest *monitor_qmp_requests_pop_any(void)
 static void monitor_qmp_bh_dispatcher(void *data)
 {
     QMPRequest *req_obj = monitor_qmp_requests_pop_any();
+    Monitor *mon;
     QDict *rsp;
     bool need_resume;
 
     if (!req_obj) {
         return;
     }
-
+    mon = req_obj->mon;
     /*  qmp_oob_enabled() might change after "qmp_capabilities" */
-    need_resume = !qmp_oob_enabled(req_obj->mon);
+    qemu_mutex_lock(&mon->qmp.qmp_queue_lock);
+    need_resume = !qmp_oob_enabled(req_obj->mon) ||
+        mon->qmp.qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX - 1;
+    qemu_mutex_unlock(&mon->qmp.qmp_queue_lock);
     if (req_obj->req) {
         trace_monitor_qmp_cmd_in_band(qobject_get_try_str(req_obj->id) ?: "");
-        monitor_qmp_dispatch(req_obj->mon, req_obj->req, req_obj->id);
+        monitor_qmp_dispatch(mon, req_obj->req, req_obj->id);
     } else {
         assert(req_obj->err);
         rsp = qmp_error_response(req_obj->err);
         req_obj->err = NULL;
-        monitor_qmp_respond(req_obj->mon, rsp, NULL);
+        monitor_qmp_respond(mon, rsp, NULL);
         qobject_unref(rsp);
     }
 
     if (need_resume) {
         /* Pairs with the monitor_suspend() in handle_qmp_command() */
-        monitor_resume(req_obj->mon);
+        monitor_resume(mon);
     }
     qmp_request_free(req_obj);
 
@@ -4154,8 +4160,6 @@  static void monitor_qmp_bh_dispatcher(void *data)
     qemu_bh_schedule(qmp_dispatcher_bh);
 }
 
-#define  QMP_REQ_QUEUE_LEN_MAX  (8)
-
 static void handle_qmp_command(void *opaque, QObject *req, Error *err)
 {
     Monitor *mon = opaque;
@@ -4205,19 +4209,12 @@  static void handle_qmp_command(void *opaque, QObject *req, Error *err)
     if (!qmp_oob_enabled(mon)) {
         monitor_suspend(mon);
     } else {
-        /* Drop the request if queue is full. */
-        if (mon->qmp.qmp_requests->length >= QMP_REQ_QUEUE_LEN_MAX) {
-            qemu_mutex_unlock(&mon->qmp.qmp_queue_lock);
+        if (mon->qmp.qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX - 1) {
             /*
-             * FIXME @id's scope is just @mon, and broadcasting it is
-             * wrong.  If another monitor's client has a command with
-             * the same ID in flight, the event will incorrectly claim
-             * that command was dropped.
+             * If this is _exactly_ the last request that we can
+             * queue, we suspend the monitor right now.
              */
-            qapi_event_send_command_dropped(id,
-                                            COMMAND_DROP_REASON_QUEUE_FULL);
-            qmp_request_free(req_obj);
-            return;
+            monitor_suspend(mon);
         }
     }