diff mbox series

[v8,06/20] job.h: define functions called without job lock held

Message ID 20220629141538.3400679-7-eesposit@redhat.com
State New
Headers show
Series job: replace AioContext lock with job_mutex | expand

Commit Message

Emanuele Giuseppe Esposito June 29, 2022, 2:15 p.m. UTC
These functions don't need a _locked() counterpart, since
they are all called outside job.c and take the lock only
internally.

Update also the comments in blockjob.c (and move them in job.c).

Note: at this stage, job_{lock/unlock} and job lock guard macros
are *nop*.

No functional change intended.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 blockjob.c         | 20 --------------------
 include/qemu/job.h | 37 ++++++++++++++++++++++++++++++++++---
 job.c              | 15 +++++++++++++++
 3 files changed, 49 insertions(+), 23 deletions(-)

Comments

Stefan Hajnoczi July 5, 2022, 7:43 a.m. UTC | #1
On Wed, Jun 29, 2022 at 10:15:24AM -0400, Emanuele Giuseppe Esposito wrote:
> These functions don't need a _locked() counterpart, since
> they are all called outside job.c and take the lock only
> internally.
> 
> Update also the comments in blockjob.c (and move them in job.c).
> 
> Note: at this stage, job_{lock/unlock} and job lock guard macros
> are *nop*.
> 
> No functional change intended.
> 
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> ---
>  blockjob.c         | 20 --------------------
>  include/qemu/job.h | 37 ++++++++++++++++++++++++++++++++++---
>  job.c              | 15 +++++++++++++++
>  3 files changed, 49 insertions(+), 23 deletions(-)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Vladimir Sementsov-Ogievskiy July 5, 2022, 10:53 a.m. UTC | #2
On 6/29/22 17:15, Emanuele Giuseppe Esposito wrote:
> These functions don't need a _locked() counterpart, since
> they are all called outside job.c and take the lock only
> internally.
> 
> Update also the comments in blockjob.c (and move them in job.c).

Still, that would be better as a separate patch.

> 
> Note: at this stage, job_{lock/unlock} and job lock guard macros
> are *nop*.
> 
> No functional change intended.
> 
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> ---
>   blockjob.c         | 20 --------------------
>   include/qemu/job.h | 37 ++++++++++++++++++++++++++++++++++---
>   job.c              | 15 +++++++++++++++
>   3 files changed, 49 insertions(+), 23 deletions(-)
> 
> diff --git a/blockjob.c b/blockjob.c
> index 4868453d74..7da59a1f1c 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -36,21 +36,6 @@
>   #include "qemu/main-loop.h"
>   #include "qemu/timer.h"
>   
> -/*
> - * The block job API is composed of two categories of functions.
> - *
> - * The first includes functions used by the monitor.  The monitor is
> - * peculiar in that it accesses the block job list with block_job_get, and
> - * therefore needs consistency across block_job_get and the actual operation
> - * (e.g. block_job_set_speed).  The consistency is achieved with
> - * aio_context_acquire/release.  These functions are declared in blockjob.h.
> - *
> - * The second includes functions used by the block job drivers and sometimes
> - * by the core block layer.  These do not care about locking, because the
> - * whole coroutine runs under the AioContext lock, and are declared in
> - * blockjob_int.h.
> - */
> -
>   static bool is_block_job(Job *job)
>   {
>       return job_type(job) == JOB_TYPE_BACKUP ||
> @@ -433,11 +418,6 @@ static void block_job_event_ready(Notifier *n, void *opaque)
>   }
>   
>   
> -/*
> - * API for block job drivers and the block layer.  These functions are
> - * declared in blockjob_int.h.
> - */
> -
>   void *block_job_create(const char *job_id, const BlockJobDriver *driver,
>                          JobTxn *txn, BlockDriverState *bs, uint64_t perm,
>                          uint64_t shared_perm, int64_t speed, int flags,
> diff --git a/include/qemu/job.h b/include/qemu/job.h
> index 99960cc9a3..b714236c1a 100644
> --- a/include/qemu/job.h
> +++ b/include/qemu/job.h
> @@ -363,6 +363,7 @@ void job_txn_unref_locked(JobTxn *txn);
>   
>   /**
>    * Create a new long-running job and return it.
> + * Called with job_mutex *not* held.
>    *
>    * @job_id: The id of the newly-created job, or %NULL for internal jobs
>    * @driver: The class object for the newly-created job.
> @@ -400,6 +401,8 @@ void job_unref_locked(Job *job);
>    * @done: How much progress the job made since the last call
>    *
>    * Updates the progress counter of the job.
> + *
> + * Progress API is thread safe.

This tell nothing for function user. Finally the whole job_ API will be thread safe, isn't it?

I think here we need simply "called with mutex not held". (Or even "may be called with mutex held or not held" if we need it, or just nothing)

and note about progress API should be somewhere in job.c, as that's implementation details.

>    */
>   void job_progress_update(Job *job, uint64_t done);
>   
> @@ -410,6 +413,8 @@ void job_progress_update(Job *job, uint64_t done);
>    *
>    * Sets the expected end value of the progress counter of a job so that a
>    * completion percentage can be calculated when the progress is updated.
> + *
> + * Progress API is thread safe.
>    */
>   void job_progress_set_remaining(Job *job, uint64_t remaining);
>   
> @@ -425,6 +430,8 @@ void job_progress_set_remaining(Job *job, uint64_t remaining);
>    * length before, and job_progress_update() afterwards.
>    * (So the operation acts as a parenthesis in regards to the main job
>    * operation running in background.)
> + *
> + * Progress API is thread safe.
>    */
>   void job_progress_increase_remaining(Job *job, uint64_t delta);
>   
> @@ -443,6 +450,8 @@ void job_enter_cond_locked(Job *job, bool(*fn)(Job *job));
>    *
>    * Begins execution of a job.
>    * Takes ownership of one reference to the job object.
> + *
> + * Called with job_mutex *not* held.
>    */
>   void job_start(Job *job);
>   
> @@ -450,6 +459,7 @@ void job_start(Job *job);
>    * @job: The job to enter.
>    *
>    * Continue the specified job by entering the coroutine.
> + * Called with job_mutex *not* held.
>    */
>   void job_enter(Job *job);
>   
> @@ -458,6 +468,9 @@ void job_enter(Job *job);
>    *
>    * Pause now if job_pause() has been called. Jobs that perform lots of I/O
>    * must call this between requests so that the job can be paused.
> + *
> + * Called with job_mutex *not* held (we don't want the coroutine
> + * to yield with the lock held!).

The comment in () looks strange, as we know that job_pause_point take the mutex inside.

>    */
>   void coroutine_fn job_pause_point(Job *job);
>   
> @@ -465,6 +478,8 @@ void coroutine_fn job_pause_point(Job *job);
>    * @job: The job that calls the function.
>    *
>    * Yield the job coroutine.
> + * Called with job_mutex *not* held (we don't want the coroutine
> + * to yield with the lock held!).

same here.

>    */
>   void job_yield(Job *job);
>   
> @@ -475,6 +490,9 @@ void job_yield(Job *job);
>    * Put the job to sleep (assuming that it wasn't canceled) for @ns
>    * %QEMU_CLOCK_REALTIME nanoseconds.  Canceling the job will immediately
>    * interrupt the wait.
> + *
> + * Called with job_mutex *not* held (we don't want the coroutine
> + * to yield with the lock held!).
>    */
>   void coroutine_fn job_sleep_ns(Job *job, int64_t ns);
>   
> @@ -496,6 +514,7 @@ bool job_is_cancelled_locked(Job *job);
>   /**
>    * Returns whether the job is scheduled for cancellation (at an
>    * indefinite point).
> + * Called with job_mutex *not* held.
>    */
>   bool job_cancel_requested(Job *job);
>   
> @@ -582,10 +601,16 @@ int job_apply_verb(Job *job, JobVerb verb, Error **errp);
>   /* Same as job_apply_verb, but called with job lock held. */
>   int job_apply_verb_locked(Job *job, JobVerb verb, Error **errp);
>   
> -/** The @job could not be started, free it. */
> +/**
> + * The @job could not be started, free it.
> + * Called with job_mutex *not* held.
> + */
>   void job_early_fail(Job *job);
>   
> -/** Moves the @job from RUNNING to READY */
> +/**
> + * Moves the @job from RUNNING to READY.
> + * Called with job_mutex *not* held.
> + */
>   void job_transition_to_ready(Job *job);
>   
>   /** Asynchronously complete the specified @job. */
> @@ -628,7 +653,13 @@ int job_cancel_sync(Job *job, bool force);
>   /* Same as job_cancel_sync, but called with job lock held. */
>   int job_cancel_sync_locked(Job *job, bool force);
>   
> -/** Synchronously force-cancels all jobs using job_cancel_sync(). */
> +/**
> + * Synchronously force-cancels all jobs using job_cancel_sync_locked().
> + *
> + * Called with job_lock *not* held, unlike most other APIs consumed
> + * by the monitor! This is primarly to avoid adding unnecessary lock-unlock
> + * patterns in the caller.
> + */

I'd prefer just "Called with job_lock *not* held". The function is not exclusion in any manner:
it's normal for functions without _locked suffix to require that mutex is not held.


I'd merge all new comments in job.h to the previous commit, as they are related to the questions risen by it.


>   void job_cancel_sync_all(void);
>   
>   /**
> diff --git a/job.c b/job.c
> index dd44fac8dd..7a3cc93f66 100644
> --- a/job.c
> +++ b/job.c
> @@ -32,12 +32,27 @@
>   #include "trace/trace-root.h"
>   #include "qapi/qapi-events-job.h"
>   
> +/*
> + * The job API is composed of two categories of functions.
> + *
> + * The first includes functions used by the monitor.  The monitor is
> + * peculiar in that it accesses the block job list with job_get, and
> + * therefore needs consistency across job_get and the actual operation
> + * (e.g. job_user_cancel). To achieve this consistency, the caller
> + * calls job_lock/job_unlock itself around the whole operation.
> + *
> + *
> + * The second includes functions used by the block job drivers and sometimes
> + * by the core block layer. These delegate the locking to the callee instead.
> + */
> +
>   /*
>    * job_mutex protects the jobs list, but also makes the
>    * struct job fields thread-safe.
>    */
>   QemuMutex job_mutex;
>   
> +/* Protected by job_mutex */
>   static QLIST_HEAD(, Job) jobs = QLIST_HEAD_INITIALIZER(jobs);
>   
>   /* Job State Transition Table */


So the logic is: the function that doesn't have public _locked counterpart has explicit comment that mutex should be not held. OK.
Vladimir Sementsov-Ogievskiy July 5, 2022, 10:54 a.m. UTC | #3
To subject: hmm, the commit don't define any function..
Emanuele Giuseppe Esposito July 6, 2022, 8:22 a.m. UTC | #4
Am 05/07/2022 um 12:53 schrieb Vladimir Sementsov-Ogievskiy:
> On 6/29/22 17:15, Emanuele Giuseppe Esposito wrote:
>> These functions don't need a _locked() counterpart, since
>> they are all called outside job.c and take the lock only
>> internally.
>>
>> Update also the comments in blockjob.c (and move them in job.c).
> 
> Still, that would be better as a separate patch.
> 
>>
>> Note: at this stage, job_{lock/unlock} and job lock guard macros
>> are *nop*.
>>
>> No functional change intended.
>>
>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
>> ---
>>   blockjob.c         | 20 --------------------
>>   include/qemu/job.h | 37 ++++++++++++++++++++++++++++++++++---
>>   job.c              | 15 +++++++++++++++
>>   3 files changed, 49 insertions(+), 23 deletions(-)
>>
>> diff --git a/blockjob.c b/blockjob.c
>> index 4868453d74..7da59a1f1c 100644
>> --- a/blockjob.c
>> +++ b/blockjob.c
>> @@ -36,21 +36,6 @@
>>   #include "qemu/main-loop.h"
>>   #include "qemu/timer.h"
>>   -/*
>> - * The block job API is composed of two categories of functions.
>> - *
>> - * The first includes functions used by the monitor.  The monitor is
>> - * peculiar in that it accesses the block job list with
>> block_job_get, and
>> - * therefore needs consistency across block_job_get and the actual
>> operation
>> - * (e.g. block_job_set_speed).  The consistency is achieved with
>> - * aio_context_acquire/release.  These functions are declared in
>> blockjob.h.
>> - *
>> - * The second includes functions used by the block job drivers and
>> sometimes
>> - * by the core block layer.  These do not care about locking, because
>> the
>> - * whole coroutine runs under the AioContext lock, and are declared in
>> - * blockjob_int.h.
>> - */
>> -
>>   static bool is_block_job(Job *job)
>>   {
>>       return job_type(job) == JOB_TYPE_BACKUP ||
>> @@ -433,11 +418,6 @@ static void block_job_event_ready(Notifier *n,
>> void *opaque)
>>   }
>>     -/*
>> - * API for block job drivers and the block layer.  These functions are
>> - * declared in blockjob_int.h.
>> - */
>> -
>>   void *block_job_create(const char *job_id, const BlockJobDriver
>> *driver,
>>                          JobTxn *txn, BlockDriverState *bs, uint64_t
>> perm,
>>                          uint64_t shared_perm, int64_t speed, int flags,
>> diff --git a/include/qemu/job.h b/include/qemu/job.h
>> index 99960cc9a3..b714236c1a 100644
>> --- a/include/qemu/job.h
>> +++ b/include/qemu/job.h
>> @@ -363,6 +363,7 @@ void job_txn_unref_locked(JobTxn *txn);
>>     /**
>>    * Create a new long-running job and return it.
>> + * Called with job_mutex *not* held.
>>    *
>>    * @job_id: The id of the newly-created job, or %NULL for internal jobs
>>    * @driver: The class object for the newly-created job.
>> @@ -400,6 +401,8 @@ void job_unref_locked(Job *job);
>>    * @done: How much progress the job made since the last call
>>    *
>>    * Updates the progress counter of the job.
>> + *
>> + * Progress API is thread safe.
> 
> This tell nothing for function user. Finally the whole job_ API will be
> thread safe, isn't it?
> 
> I think here we need simply "called with mutex not held". (Or even "may
> be called with mutex held or not held" if we need it, or just nothing)
> 
> and note about progress API should be somewhere in job.c, as that's
> implementation details.

What about "Progress API is thread safe. Can be called with job mutex
held or not"?

> 
[...]
> 
> I'd merge all new comments in job.h to the previous commit, as they are
> related to the questions risen by it.

I disagree, I think it will be a mess of functions again if we mix these
one that don't need the lock held and the ones that need it.

You understand it because you got the logic of this serie, but others
may not.

> 
> 
>>   void job_cancel_sync_all(void);
>>     /**
>> diff --git a/job.c b/job.c
>> index dd44fac8dd..7a3cc93f66 100644
>> --- a/job.c
>> +++ b/job.c
>> @@ -32,12 +32,27 @@
>>   #include "trace/trace-root.h"
>>   #include "qapi/qapi-events-job.h"
>>   +/*
>> + * The job API is composed of two categories of functions.
>> + *
>> + * The first includes functions used by the monitor.  The monitor is
>> + * peculiar in that it accesses the block job list with job_get, and
>> + * therefore needs consistency across job_get and the actual operation
>> + * (e.g. job_user_cancel). To achieve this consistency, the caller
>> + * calls job_lock/job_unlock itself around the whole operation.
>> + *
>> + *
>> + * The second includes functions used by the block job drivers and
>> sometimes
>> + * by the core block layer. These delegate the locking to the callee
>> instead.
>> + */
>> +
>>   /*
>>    * job_mutex protects the jobs list, but also makes the
>>    * struct job fields thread-safe.
>>    */
>>   QemuMutex job_mutex;
>>   +/* Protected by job_mutex */
>>   static QLIST_HEAD(, Job) jobs = QLIST_HEAD_INITIALIZER(jobs);
>>     /* Job State Transition Table */
> 
> 
> So the logic is: the function that doesn't have public _locked
> counterpart has explicit comment that mutex should be not held. OK.
>
Emanuele Giuseppe Esposito July 6, 2022, 8:23 a.m. UTC | #5
Am 05/07/2022 um 12:54 schrieb Vladimir Sementsov-Ogievskiy:
> To subject: hmm, the commit don't define any function..
> 
mark functions called without job lock held?
Vladimir Sementsov-Ogievskiy July 6, 2022, 9:48 a.m. UTC | #6
On 7/6/22 11:22, Emanuele Giuseppe Esposito wrote:
> 
> 
> Am 05/07/2022 um 12:53 schrieb Vladimir Sementsov-Ogievskiy:
>> On 6/29/22 17:15, Emanuele Giuseppe Esposito wrote:
>>> These functions don't need a _locked() counterpart, since
>>> they are all called outside job.c and take the lock only
>>> internally.
>>>
>>> Update also the comments in blockjob.c (and move them in job.c).
>>
>> Still, that would be better as a separate patch.
>>
>>>
>>> Note: at this stage, job_{lock/unlock} and job lock guard macros
>>> are *nop*.
>>>
>>> No functional change intended.
>>>
>>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
>>> ---
>>>    blockjob.c         | 20 --------------------
>>>    include/qemu/job.h | 37 ++++++++++++++++++++++++++++++++++---
>>>    job.c              | 15 +++++++++++++++
>>>    3 files changed, 49 insertions(+), 23 deletions(-)
>>>
>>> diff --git a/blockjob.c b/blockjob.c
>>> index 4868453d74..7da59a1f1c 100644
>>> --- a/blockjob.c
>>> +++ b/blockjob.c
>>> @@ -36,21 +36,6 @@
>>>    #include "qemu/main-loop.h"
>>>    #include "qemu/timer.h"
>>>    -/*
>>> - * The block job API is composed of two categories of functions.
>>> - *
>>> - * The first includes functions used by the monitor.  The monitor is
>>> - * peculiar in that it accesses the block job list with
>>> block_job_get, and
>>> - * therefore needs consistency across block_job_get and the actual
>>> operation
>>> - * (e.g. block_job_set_speed).  The consistency is achieved with
>>> - * aio_context_acquire/release.  These functions are declared in
>>> blockjob.h.
>>> - *
>>> - * The second includes functions used by the block job drivers and
>>> sometimes
>>> - * by the core block layer.  These do not care about locking, because
>>> the
>>> - * whole coroutine runs under the AioContext lock, and are declared in
>>> - * blockjob_int.h.
>>> - */
>>> -
>>>    static bool is_block_job(Job *job)
>>>    {
>>>        return job_type(job) == JOB_TYPE_BACKUP ||
>>> @@ -433,11 +418,6 @@ static void block_job_event_ready(Notifier *n,
>>> void *opaque)
>>>    }
>>>      -/*
>>> - * API for block job drivers and the block layer.  These functions are
>>> - * declared in blockjob_int.h.
>>> - */
>>> -
>>>    void *block_job_create(const char *job_id, const BlockJobDriver
>>> *driver,
>>>                           JobTxn *txn, BlockDriverState *bs, uint64_t
>>> perm,
>>>                           uint64_t shared_perm, int64_t speed, int flags,
>>> diff --git a/include/qemu/job.h b/include/qemu/job.h
>>> index 99960cc9a3..b714236c1a 100644
>>> --- a/include/qemu/job.h
>>> +++ b/include/qemu/job.h
>>> @@ -363,6 +363,7 @@ void job_txn_unref_locked(JobTxn *txn);
>>>      /**
>>>     * Create a new long-running job and return it.
>>> + * Called with job_mutex *not* held.
>>>     *
>>>     * @job_id: The id of the newly-created job, or %NULL for internal jobs
>>>     * @driver: The class object for the newly-created job.
>>> @@ -400,6 +401,8 @@ void job_unref_locked(Job *job);
>>>     * @done: How much progress the job made since the last call
>>>     *
>>>     * Updates the progress counter of the job.
>>> + *
>>> + * Progress API is thread safe.
>>
>> This tell nothing for function user. Finally the whole job_ API will be
>> thread safe, isn't it?
>>
>> I think here we need simply "called with mutex not held". (Or even "may
>> be called with mutex held or not held" if we need it, or just nothing)
>>
>> and note about progress API should be somewhere in job.c, as that's
>> implementation details.
> 
> What about "Progress API is thread safe. Can be called with job mutex
> held or not"?

OK, if you like, that's not critical. Still, I think that after this series the whole job API should be thread safe, which make a comment about progress API misleading: user will think "hmm.. OK, progress related functions are thread safe. Others are not?"

> 
>>
> [...]
>>
>> I'd merge all new comments in job.h to the previous commit, as they are
>> related to the questions risen by it.
> 
> I disagree, I think it will be a mess of functions again if we mix these
> one that don't need the lock held and the ones that need it.
> 
> You understand it because you got the logic of this serie, but others
> may not.
> 

That's not critical.. Why it seems better in one patch for me:

For a patch like 05 I anyway have to review the whole job.c/job.h checking that everything is correct. When I see that something was not updated, it looks like a mistake to me. Than I find missed part in the next commit..


>>
>>
>>>    void job_cancel_sync_all(void);
>>>      /**
>>> diff --git a/job.c b/job.c
>>> index dd44fac8dd..7a3cc93f66 100644
>>> --- a/job.c
>>> +++ b/job.c
>>> @@ -32,12 +32,27 @@
>>>    #include "trace/trace-root.h"
>>>    #include "qapi/qapi-events-job.h"
>>>    +/*
>>> + * The job API is composed of two categories of functions.
>>> + *
>>> + * The first includes functions used by the monitor.  The monitor is
>>> + * peculiar in that it accesses the block job list with job_get, and
>>> + * therefore needs consistency across job_get and the actual operation
>>> + * (e.g. job_user_cancel). To achieve this consistency, the caller
>>> + * calls job_lock/job_unlock itself around the whole operation.
>>> + *
>>> + *
>>> + * The second includes functions used by the block job drivers and
>>> sometimes
>>> + * by the core block layer. These delegate the locking to the callee
>>> instead.
>>> + */
>>> +
>>>    /*
>>>     * job_mutex protects the jobs list, but also makes the
>>>     * struct job fields thread-safe.
>>>     */
>>>    QemuMutex job_mutex;
>>>    +/* Protected by job_mutex */
>>>    static QLIST_HEAD(, Job) jobs = QLIST_HEAD_INITIALIZER(jobs);
>>>      /* Job State Transition Table */
>>
>>
>> So the logic is: the function that doesn't have public _locked
>> counterpart has explicit comment that mutex should be not held. OK.
>>
>
Vladimir Sementsov-Ogievskiy July 6, 2022, 9:51 a.m. UTC | #7
On 7/6/22 11:23, Emanuele Giuseppe Esposito wrote:
> 
> 
> Am 05/07/2022 um 12:54 schrieb Vladimir Sementsov-Ogievskiy:
>> To subject: hmm, the commit don't define any function..
>>
> mark functions called without job lock held?
> 

Yes, that's better)
diff mbox series

Patch

diff --git a/blockjob.c b/blockjob.c
index 4868453d74..7da59a1f1c 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -36,21 +36,6 @@ 
 #include "qemu/main-loop.h"
 #include "qemu/timer.h"
 
-/*
- * The block job API is composed of two categories of functions.
- *
- * The first includes functions used by the monitor.  The monitor is
- * peculiar in that it accesses the block job list with block_job_get, and
- * therefore needs consistency across block_job_get and the actual operation
- * (e.g. block_job_set_speed).  The consistency is achieved with
- * aio_context_acquire/release.  These functions are declared in blockjob.h.
- *
- * The second includes functions used by the block job drivers and sometimes
- * by the core block layer.  These do not care about locking, because the
- * whole coroutine runs under the AioContext lock, and are declared in
- * blockjob_int.h.
- */
-
 static bool is_block_job(Job *job)
 {
     return job_type(job) == JOB_TYPE_BACKUP ||
@@ -433,11 +418,6 @@  static void block_job_event_ready(Notifier *n, void *opaque)
 }
 
 
-/*
- * API for block job drivers and the block layer.  These functions are
- * declared in blockjob_int.h.
- */
-
 void *block_job_create(const char *job_id, const BlockJobDriver *driver,
                        JobTxn *txn, BlockDriverState *bs, uint64_t perm,
                        uint64_t shared_perm, int64_t speed, int flags,
diff --git a/include/qemu/job.h b/include/qemu/job.h
index 99960cc9a3..b714236c1a 100644
--- a/include/qemu/job.h
+++ b/include/qemu/job.h
@@ -363,6 +363,7 @@  void job_txn_unref_locked(JobTxn *txn);
 
 /**
  * Create a new long-running job and return it.
+ * Called with job_mutex *not* held.
  *
  * @job_id: The id of the newly-created job, or %NULL for internal jobs
  * @driver: The class object for the newly-created job.
@@ -400,6 +401,8 @@  void job_unref_locked(Job *job);
  * @done: How much progress the job made since the last call
  *
  * Updates the progress counter of the job.
+ *
+ * Progress API is thread safe.
  */
 void job_progress_update(Job *job, uint64_t done);
 
@@ -410,6 +413,8 @@  void job_progress_update(Job *job, uint64_t done);
  *
  * Sets the expected end value of the progress counter of a job so that a
  * completion percentage can be calculated when the progress is updated.
+ *
+ * Progress API is thread safe.
  */
 void job_progress_set_remaining(Job *job, uint64_t remaining);
 
@@ -425,6 +430,8 @@  void job_progress_set_remaining(Job *job, uint64_t remaining);
  * length before, and job_progress_update() afterwards.
  * (So the operation acts as a parenthesis in regards to the main job
  * operation running in background.)
+ *
+ * Progress API is thread safe.
  */
 void job_progress_increase_remaining(Job *job, uint64_t delta);
 
@@ -443,6 +450,8 @@  void job_enter_cond_locked(Job *job, bool(*fn)(Job *job));
  *
  * Begins execution of a job.
  * Takes ownership of one reference to the job object.
+ *
+ * Called with job_mutex *not* held.
  */
 void job_start(Job *job);
 
@@ -450,6 +459,7 @@  void job_start(Job *job);
  * @job: The job to enter.
  *
  * Continue the specified job by entering the coroutine.
+ * Called with job_mutex *not* held.
  */
 void job_enter(Job *job);
 
@@ -458,6 +468,9 @@  void job_enter(Job *job);
  *
  * Pause now if job_pause() has been called. Jobs that perform lots of I/O
  * must call this between requests so that the job can be paused.
+ *
+ * Called with job_mutex *not* held (we don't want the coroutine
+ * to yield with the lock held!).
  */
 void coroutine_fn job_pause_point(Job *job);
 
@@ -465,6 +478,8 @@  void coroutine_fn job_pause_point(Job *job);
  * @job: The job that calls the function.
  *
  * Yield the job coroutine.
+ * Called with job_mutex *not* held (we don't want the coroutine
+ * to yield with the lock held!).
  */
 void job_yield(Job *job);
 
@@ -475,6 +490,9 @@  void job_yield(Job *job);
  * Put the job to sleep (assuming that it wasn't canceled) for @ns
  * %QEMU_CLOCK_REALTIME nanoseconds.  Canceling the job will immediately
  * interrupt the wait.
+ *
+ * Called with job_mutex *not* held (we don't want the coroutine
+ * to yield with the lock held!).
  */
 void coroutine_fn job_sleep_ns(Job *job, int64_t ns);
 
@@ -496,6 +514,7 @@  bool job_is_cancelled_locked(Job *job);
 /**
  * Returns whether the job is scheduled for cancellation (at an
  * indefinite point).
+ * Called with job_mutex *not* held.
  */
 bool job_cancel_requested(Job *job);
 
@@ -582,10 +601,16 @@  int job_apply_verb(Job *job, JobVerb verb, Error **errp);
 /* Same as job_apply_verb, but called with job lock held. */
 int job_apply_verb_locked(Job *job, JobVerb verb, Error **errp);
 
-/** The @job could not be started, free it. */
+/**
+ * The @job could not be started, free it.
+ * Called with job_mutex *not* held.
+ */
 void job_early_fail(Job *job);
 
-/** Moves the @job from RUNNING to READY */
+/**
+ * Moves the @job from RUNNING to READY.
+ * Called with job_mutex *not* held.
+ */
 void job_transition_to_ready(Job *job);
 
 /** Asynchronously complete the specified @job. */
@@ -628,7 +653,13 @@  int job_cancel_sync(Job *job, bool force);
 /* Same as job_cancel_sync, but called with job lock held. */
 int job_cancel_sync_locked(Job *job, bool force);
 
-/** Synchronously force-cancels all jobs using job_cancel_sync(). */
+/**
+ * Synchronously force-cancels all jobs using job_cancel_sync_locked().
+ *
+ * Called with job_lock *not* held, unlike most other APIs consumed
+ * by the monitor! This is primarly to avoid adding unnecessary lock-unlock
+ * patterns in the caller.
+ */
 void job_cancel_sync_all(void);
 
 /**
diff --git a/job.c b/job.c
index dd44fac8dd..7a3cc93f66 100644
--- a/job.c
+++ b/job.c
@@ -32,12 +32,27 @@ 
 #include "trace/trace-root.h"
 #include "qapi/qapi-events-job.h"
 
+/*
+ * The job API is composed of two categories of functions.
+ *
+ * The first includes functions used by the monitor.  The monitor is
+ * peculiar in that it accesses the block job list with job_get, and
+ * therefore needs consistency across job_get and the actual operation
+ * (e.g. job_user_cancel). To achieve this consistency, the caller
+ * calls job_lock/job_unlock itself around the whole operation.
+ *
+ *
+ * The second includes functions used by the block job drivers and sometimes
+ * by the core block layer. These delegate the locking to the callee instead.
+ */
+
 /*
  * job_mutex protects the jobs list, but also makes the
  * struct job fields thread-safe.
  */
 QemuMutex job_mutex;
 
+/* Protected by job_mutex */
 static QLIST_HEAD(, Job) jobs = QLIST_HEAD_INITIALIZER(jobs);
 
 /* Job State Transition Table */