diff mbox series

[v3] blockjob: drain all job nodes in block_job_drain

Message ID 20190724094025.12442-1-vsementsov@virtuozzo.com
State New
Headers show
Series [v3] blockjob: drain all job nodes in block_job_drain | expand

Commit Message

Vladimir Sementsov-Ogievskiy July 24, 2019, 9:40 a.m. UTC
Instead of draining additional nodes in each job's own code, let's do it
in the common block_job_drain, draining all of the job's child nodes.
BlockJobDriver.drain becomes unused, so drop it entirely.

It's also a first step to finally get rid of blockjob->blk.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---

v3: just a resend, as I've received some auto-returned mails and am not
    sure that v2 reached the recipients.

v2: apply Max's suggestions:
 - drop BlockJobDriver.drain
 - first do a loop of bdrv_drained_begin and then a separate loop
   of bdrv_drained_end.

   Hmm, a question here: should I call bdrv_drained_end in reverse
   order? Or it's OK as is?

 include/block/blockjob_int.h | 11 -----------
 block/backup.c               | 18 +-----------------
 block/mirror.c               | 26 +++-----------------------
 blockjob.c                   | 13 ++++++++-----
 4 files changed, 12 insertions(+), 56 deletions(-)

Comments

John Snow July 30, 2019, 7:11 p.m. UTC | #1
On 7/24/19 5:40 AM, Vladimir Sementsov-Ogievskiy wrote:
> Instead of draining additional nodes in each job code, let's do it in
> common block_job_drain, draining just all job's children.
> BlockJobDriver.drain becomes unused, so, drop it at all.
> 
> It's also a first step to finally get rid of blockjob->blk.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
> 
> v3: just resend, as I've some auto returned mails and not sure that
>     v2 reached recipients.
> 
> v2: apply Max's suggestions:
>  - drop BlockJobDriver.drain
>  - do firtly loop of bdrv_drained_begin and then separate loop
>    of bdrv_drained_end.
> 
>    Hmm, a question here: should I call bdrv_drained_end in reverse
>    order? Or it's OK as is?
> 

I think it should be OK. These nodes don't necessarily have a well
defined relationship between each other, do they?

>  include/block/blockjob_int.h | 11 -----------
>  block/backup.c               | 18 +-----------------
>  block/mirror.c               | 26 +++-----------------------
>  blockjob.c                   | 13 ++++++++-----
>  4 files changed, 12 insertions(+), 56 deletions(-)
> 

Nice diffstat :)

> diff --git a/include/block/blockjob_int.h b/include/block/blockjob_int.h
> index e4a318dd15..e1abf4ee85 100644
> --- a/include/block/blockjob_int.h
> +++ b/include/block/blockjob_int.h
> @@ -52,17 +52,6 @@ struct BlockJobDriver {
>       * besides job->blk to the new AioContext.
>       */
>      void (*attached_aio_context)(BlockJob *job, AioContext *new_context);
> -
> -    /*
> -     * If the callback is not NULL, it will be invoked when the job has to be
> -     * synchronously cancelled or completed; it should drain BlockDriverStates
> -     * as required to ensure progress.
> -     *
> -     * Block jobs must use the default implementation for job_driver.drain,
> -     * which will in turn call this callback after doing generic block job
> -     * stuff.
> -     */
> -    void (*drain)(BlockJob *job);

I was about to say "huh?" ... but then realized you're deleting this
confusing glob. Good.

>  };
>  
>  /**
> diff --git a/block/backup.c b/block/backup.c
> index 715e1d3be8..7930004bbd 100644
> --- a/block/backup.c
> +++ b/block/backup.c
> @@ -320,21 +320,6 @@ void backup_do_checkpoint(BlockJob *job, Error **errp)
>      hbitmap_set(backup_job->copy_bitmap, 0, backup_job->len);
>  }
>  
> -static void backup_drain(BlockJob *job)
> -{
> -    BackupBlockJob *s = container_of(job, BackupBlockJob, common);
> -
> -    /* Need to keep a reference in case blk_drain triggers execution
> -     * of backup_complete...
> -     */
> -    if (s->target) {
> -        BlockBackend *target = s->target;
> -        blk_ref(target);
> -        blk_drain(target);
> -        blk_unref(target);
> -    }
> -}
> -

Adios ...

>  static BlockErrorAction backup_error_action(BackupBlockJob *job,
>                                              bool read, int error)
>  {
> @@ -493,8 +478,7 @@ static const BlockJobDriver backup_job_driver = {
>          .commit                 = backup_commit,
>          .abort                  = backup_abort,
>          .clean                  = backup_clean,
> -    },
> -    .drain                  = backup_drain,
> +    }
>  };
>  

This pleases the eyes.

>  static int64_t backup_calculate_cluster_size(BlockDriverState *target,
> diff --git a/block/mirror.c b/block/mirror.c
> index 8cb75fb409..8456ccd89d 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -644,14 +644,11 @@ static int mirror_exit_common(Job *job)
>      bdrv_ref(mirror_top_bs);
>      bdrv_ref(target_bs);
>  
> -    /* Remove target parent that still uses BLK_PERM_WRITE/RESIZE before
> +    /*
> +     * Remove target parent that still uses BLK_PERM_WRITE/RESIZE before

(Thanks, patchew...)

>       * inserting target_bs at s->to_replace, where we might not be able to get
>       * these permissions.
> -     *
> -     * Note that blk_unref() alone doesn't necessarily drop permissions because
> -     * we might be running nested inside mirror_drain(), which takes an extra
> -     * reference, so use an explicit blk_set_perm() first. */
> -    blk_set_perm(s->target, 0, BLK_PERM_ALL, &error_abort);
> +     */
>      blk_unref(s->target);
>      s->target = NULL;
>  
> @@ -1143,21 +1140,6 @@ static bool mirror_drained_poll(BlockJob *job)
>      return !!s->in_flight;
>  }
>  
> -static void mirror_drain(BlockJob *job)
> -{
> -    MirrorBlockJob *s = container_of(job, MirrorBlockJob, common);
> -
> -    /* Need to keep a reference in case blk_drain triggers execution
> -     * of mirror_complete...
> -     */
> -    if (s->target) {
> -        BlockBackend *target = s->target;
> -        blk_ref(target);
> -        blk_drain(target);
> -        blk_unref(target);
> -    }
> -}
> -
>  static const BlockJobDriver mirror_job_driver = {
>      .job_driver = {
>          .instance_size          = sizeof(MirrorBlockJob),
> @@ -1172,7 +1154,6 @@ static const BlockJobDriver mirror_job_driver = {
>          .complete               = mirror_complete,
>      },
>      .drained_poll           = mirror_drained_poll,
> -    .drain                  = mirror_drain,
>  };
>  
>  static const BlockJobDriver commit_active_job_driver = {
> @@ -1189,7 +1170,6 @@ static const BlockJobDriver commit_active_job_driver = {
>          .complete               = mirror_complete,
>      },
>      .drained_poll           = mirror_drained_poll,
> -    .drain                  = mirror_drain,
>  };
>  
>  static void coroutine_fn
> diff --git a/blockjob.c b/blockjob.c
> index 20b7f557da..78cf71d6c8 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -92,12 +92,15 @@ void block_job_free(Job *job)
>  void block_job_drain(Job *job)
>  {
>      BlockJob *bjob = container_of(job, BlockJob, job);
> -    const JobDriver *drv = job->driver;
> -    BlockJobDriver *bjdrv = container_of(drv, BlockJobDriver, job_driver);
> +    GSList *l;
>  
> -    blk_drain(bjob->blk);
> -    if (bjdrv->drain) {
> -        bjdrv->drain(bjob);
> +    for (l = bjob->nodes; l; l = l->next) {
> +        BdrvChild *c = l->data;
> +        bdrv_drained_begin(c->bs);
> +    }
> +    for (l = bjob->nodes; l; l = l->next) {
> +        BdrvChild *c = l->data;
> +        bdrv_drained_end(c->bs);
>      }
>  }
>  
> 

Seems much nicer to me. What becomes of the ref/unref pairs?

I guess not needed anymore?, since job cleanup necessarily happens in
the main loop context now and we don't have a backup_complete function
anymore ...?

In the cases where auto_finalize=true, do we have any guarantee that the
completion callbacks cannot be scheduled while we are here?
Vladimir Sementsov-Ogievskiy July 31, 2019, 10:28 a.m. UTC | #2
30.07.2019 22:11, John Snow wrote:
> 
> 
> On 7/24/19 5:40 AM, Vladimir Sementsov-Ogievskiy wrote:
>> Instead of draining additional nodes in each job code, let's do it in
>> common block_job_drain, draining just all job's children.
>> BlockJobDriver.drain becomes unused, so, drop it at all.
>>
>> It's also a first step to finally get rid of blockjob->blk.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>
>> v3: just resend, as I've some auto returned mails and not sure that
>>      v2 reached recipients.
>>
>> v2: apply Max's suggestions:
>>   - drop BlockJobDriver.drain
>>   - do firtly loop of bdrv_drained_begin and then separate loop
>>     of bdrv_drained_end.
>>
>>     Hmm, a question here: should I call bdrv_drained_end in reverse
>>     order? Or it's OK as is?
>>
> 
> I think it should be OK. These nodes don't necessarily have a well
> defined relationship between each other, do they?
> 
>>   include/block/blockjob_int.h | 11 -----------
>>   block/backup.c               | 18 +-----------------
>>   block/mirror.c               | 26 +++-----------------------
>>   blockjob.c                   | 13 ++++++++-----
>>   4 files changed, 12 insertions(+), 56 deletions(-)
>>
> 
> Nice diffstat :)
> 
>> diff --git a/include/block/blockjob_int.h b/include/block/blockjob_int.h
>> index e4a318dd15..e1abf4ee85 100644
>> --- a/include/block/blockjob_int.h
>> +++ b/include/block/blockjob_int.h
>> @@ -52,17 +52,6 @@ struct BlockJobDriver {
>>        * besides job->blk to the new AioContext.
>>        */
>>       void (*attached_aio_context)(BlockJob *job, AioContext *new_context);
>> -
>> -    /*
>> -     * If the callback is not NULL, it will be invoked when the job has to be
>> -     * synchronously cancelled or completed; it should drain BlockDriverStates
>> -     * as required to ensure progress.
>> -     *
>> -     * Block jobs must use the default implementation for job_driver.drain,
>> -     * which will in turn call this callback after doing generic block job
>> -     * stuff.
>> -     */
>> -    void (*drain)(BlockJob *job);
> 
> I was about to say "huh?" ... but then realized you're deleting this
> confusing glob. Good.
> 
>>   };
>>   
>>   /**
>> diff --git a/block/backup.c b/block/backup.c
>> index 715e1d3be8..7930004bbd 100644
>> --- a/block/backup.c
>> +++ b/block/backup.c
>> @@ -320,21 +320,6 @@ void backup_do_checkpoint(BlockJob *job, Error **errp)
>>       hbitmap_set(backup_job->copy_bitmap, 0, backup_job->len);
>>   }
>>   
>> -static void backup_drain(BlockJob *job)
>> -{
>> -    BackupBlockJob *s = container_of(job, BackupBlockJob, common);
>> -
>> -    /* Need to keep a reference in case blk_drain triggers execution
>> -     * of backup_complete...
>> -     */
>> -    if (s->target) {
>> -        BlockBackend *target = s->target;
>> -        blk_ref(target);
>> -        blk_drain(target);
>> -        blk_unref(target);
>> -    }
>> -}
>> -
> 
> Adios ...
> 
>>   static BlockErrorAction backup_error_action(BackupBlockJob *job,
>>                                               bool read, int error)
>>   {
>> @@ -493,8 +478,7 @@ static const BlockJobDriver backup_job_driver = {
>>           .commit                 = backup_commit,
>>           .abort                  = backup_abort,
>>           .clean                  = backup_clean,
>> -    },
>> -    .drain                  = backup_drain,
>> +    }
>>   };
>>   
> 
> This pleases the eyes.
> 
>>   static int64_t backup_calculate_cluster_size(BlockDriverState *target,
>> diff --git a/block/mirror.c b/block/mirror.c
>> index 8cb75fb409..8456ccd89d 100644
>> --- a/block/mirror.c
>> +++ b/block/mirror.c
>> @@ -644,14 +644,11 @@ static int mirror_exit_common(Job *job)
>>       bdrv_ref(mirror_top_bs);
>>       bdrv_ref(target_bs);
>>   
>> -    /* Remove target parent that still uses BLK_PERM_WRITE/RESIZE before
>> +    /*
>> +     * Remove target parent that still uses BLK_PERM_WRITE/RESIZE before
> 
> (Thanks, patchew...)
> 
>>        * inserting target_bs at s->to_replace, where we might not be able to get
>>        * these permissions.
>> -     *
>> -     * Note that blk_unref() alone doesn't necessarily drop permissions because
>> -     * we might be running nested inside mirror_drain(), which takes an extra
>> -     * reference, so use an explicit blk_set_perm() first. */
>> -    blk_set_perm(s->target, 0, BLK_PERM_ALL, &error_abort);
>> +     */
>>       blk_unref(s->target);
>>       s->target = NULL;
>>   
>> @@ -1143,21 +1140,6 @@ static bool mirror_drained_poll(BlockJob *job)
>>       return !!s->in_flight;
>>   }
>>   
>> -static void mirror_drain(BlockJob *job)
>> -{
>> -    MirrorBlockJob *s = container_of(job, MirrorBlockJob, common);
>> -
>> -    /* Need to keep a reference in case blk_drain triggers execution
>> -     * of mirror_complete...
>> -     */
>> -    if (s->target) {
>> -        BlockBackend *target = s->target;
>> -        blk_ref(target);
>> -        blk_drain(target);
>> -        blk_unref(target);
>> -    }
>> -}
>> -
>>   static const BlockJobDriver mirror_job_driver = {
>>       .job_driver = {
>>           .instance_size          = sizeof(MirrorBlockJob),
>> @@ -1172,7 +1154,6 @@ static const BlockJobDriver mirror_job_driver = {
>>           .complete               = mirror_complete,
>>       },
>>       .drained_poll           = mirror_drained_poll,
>> -    .drain                  = mirror_drain,
>>   };
>>   
>>   static const BlockJobDriver commit_active_job_driver = {
>> @@ -1189,7 +1170,6 @@ static const BlockJobDriver commit_active_job_driver = {
>>           .complete               = mirror_complete,
>>       },
>>       .drained_poll           = mirror_drained_poll,
>> -    .drain                  = mirror_drain,
>>   };
>>   
>>   static void coroutine_fn
>> diff --git a/blockjob.c b/blockjob.c
>> index 20b7f557da..78cf71d6c8 100644
>> --- a/blockjob.c
>> +++ b/blockjob.c
>> @@ -92,12 +92,15 @@ void block_job_free(Job *job)
>>   void block_job_drain(Job *job)
>>   {
>>       BlockJob *bjob = container_of(job, BlockJob, job);
>> -    const JobDriver *drv = job->driver;
>> -    BlockJobDriver *bjdrv = container_of(drv, BlockJobDriver, job_driver);
>> +    GSList *l;
>>   
>> -    blk_drain(bjob->blk);
>> -    if (bjdrv->drain) {
>> -        bjdrv->drain(bjob);
>> +    for (l = bjob->nodes; l; l = l->next) {
>> +        BdrvChild *c = l->data;
>> +        bdrv_drained_begin(c->bs);
>> +    }
>> +    for (l = bjob->nodes; l; l = l->next) {
>> +        BdrvChild *c = l->data;
>> +        bdrv_drained_end(c->bs);
>>       }
>>   }
>>   
>>
> 
> Seems much nicer to me. What becomes of the ref/unref pairs?
> 
> I guess not needed anymore?, since job cleanup necessarily happens in
> the main loop context now and we don't have a backup_complete function
> anymore ...?

What pairs do you mean?

> 
> In the cases where auto_finalize=true, do we have any guarantee that the
> completion callbacks cannot be scheduled while we are here?
> 

Hmm, not simple for me to assume.. Is it a problem? And is it about this patch?
John Snow July 31, 2019, 1:39 p.m. UTC | #3
On 7/31/19 6:28 AM, Vladimir Sementsov-Ogievskiy wrote:
> 30.07.2019 22:11, John Snow wrote:
>>
>>
>> On 7/24/19 5:40 AM, Vladimir Sementsov-Ogievskiy wrote:
>>> Instead of draining additional nodes in each job code, let's do it in
>>> common block_job_drain, draining just all job's children.
>>> BlockJobDriver.drain becomes unused, so, drop it at all.
>>>
>>> It's also a first step to finally get rid of blockjob->blk.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>> ---
>>>
>>> v3: just resend, as I've some auto returned mails and not sure that
>>>      v2 reached recipients.
>>>
>>> v2: apply Max's suggestions:
>>>   - drop BlockJobDriver.drain
>>>   - do firtly loop of bdrv_drained_begin and then separate loop
>>>     of bdrv_drained_end.
>>>
>>>     Hmm, a question here: should I call bdrv_drained_end in reverse
>>>     order? Or it's OK as is?
>>>
>>
>> I think it should be OK. These nodes don't necessarily have a well
>> defined relationship between each other, do they?
>>
>>>   include/block/blockjob_int.h | 11 -----------
>>>   block/backup.c               | 18 +-----------------
>>>   block/mirror.c               | 26 +++-----------------------
>>>   blockjob.c                   | 13 ++++++++-----
>>>   4 files changed, 12 insertions(+), 56 deletions(-)
>>>
>>
>> Nice diffstat :)
>>
>>> diff --git a/include/block/blockjob_int.h b/include/block/blockjob_int.h
>>> index e4a318dd15..e1abf4ee85 100644
>>> --- a/include/block/blockjob_int.h
>>> +++ b/include/block/blockjob_int.h
>>> @@ -52,17 +52,6 @@ struct BlockJobDriver {
>>>        * besides job->blk to the new AioContext.
>>>        */
>>>       void (*attached_aio_context)(BlockJob *job, AioContext *new_context);
>>> -
>>> -    /*
>>> -     * If the callback is not NULL, it will be invoked when the job has to be
>>> -     * synchronously cancelled or completed; it should drain BlockDriverStates
>>> -     * as required to ensure progress.
>>> -     *
>>> -     * Block jobs must use the default implementation for job_driver.drain,
>>> -     * which will in turn call this callback after doing generic block job
>>> -     * stuff.
>>> -     */
>>> -    void (*drain)(BlockJob *job);
>>
>> I was about to say "huh?" ... but then realized you're deleting this
>> confusing glob. Good.
>>
>>>   };
>>>   
>>>   /**
>>> diff --git a/block/backup.c b/block/backup.c
>>> index 715e1d3be8..7930004bbd 100644
>>> --- a/block/backup.c
>>> +++ b/block/backup.c
>>> @@ -320,21 +320,6 @@ void backup_do_checkpoint(BlockJob *job, Error **errp)
>>>       hbitmap_set(backup_job->copy_bitmap, 0, backup_job->len);
>>>   }
>>>   
>>> -static void backup_drain(BlockJob *job)
>>> -{
>>> -    BackupBlockJob *s = container_of(job, BackupBlockJob, common);
>>> -
>>> -    /* Need to keep a reference in case blk_drain triggers execution
>>> -     * of backup_complete...
>>> -     */
>>> -    if (s->target) {
>>> -        BlockBackend *target = s->target;
>>> -        blk_ref(target);
>>> -        blk_drain(target);
>>> -        blk_unref(target);
>>> -    }
>>> -}
>>> -
>>
>> Adios ...
>>
>>>   static BlockErrorAction backup_error_action(BackupBlockJob *job,
>>>                                               bool read, int error)
>>>   {
>>> @@ -493,8 +478,7 @@ static const BlockJobDriver backup_job_driver = {
>>>           .commit                 = backup_commit,
>>>           .abort                  = backup_abort,
>>>           .clean                  = backup_clean,
>>> -    },
>>> -    .drain                  = backup_drain,
>>> +    }
>>>   };
>>>   
>>
>> This pleases the eyes.
>>
>>>   static int64_t backup_calculate_cluster_size(BlockDriverState *target,
>>> diff --git a/block/mirror.c b/block/mirror.c
>>> index 8cb75fb409..8456ccd89d 100644
>>> --- a/block/mirror.c
>>> +++ b/block/mirror.c
>>> @@ -644,14 +644,11 @@ static int mirror_exit_common(Job *job)
>>>       bdrv_ref(mirror_top_bs);
>>>       bdrv_ref(target_bs);
>>>   
>>> -    /* Remove target parent that still uses BLK_PERM_WRITE/RESIZE before
>>> +    /*
>>> +     * Remove target parent that still uses BLK_PERM_WRITE/RESIZE before
>>
>> (Thanks, patchew...)
>>
>>>        * inserting target_bs at s->to_replace, where we might not be able to get
>>>        * these permissions.
>>> -     *
>>> -     * Note that blk_unref() alone doesn't necessarily drop permissions because
>>> -     * we might be running nested inside mirror_drain(), which takes an extra
>>> -     * reference, so use an explicit blk_set_perm() first. */
>>> -    blk_set_perm(s->target, 0, BLK_PERM_ALL, &error_abort);
>>> +     */
>>>       blk_unref(s->target);
>>>       s->target = NULL;
>>>   
>>> @@ -1143,21 +1140,6 @@ static bool mirror_drained_poll(BlockJob *job)
>>>       return !!s->in_flight;
>>>   }
>>>   
>>> -static void mirror_drain(BlockJob *job)
>>> -{
>>> -    MirrorBlockJob *s = container_of(job, MirrorBlockJob, common);
>>> -
>>> -    /* Need to keep a reference in case blk_drain triggers execution
>>> -     * of mirror_complete...
>>> -     */
>>> -    if (s->target) {
>>> -        BlockBackend *target = s->target;
>>> -        blk_ref(target);
>>> -        blk_drain(target);
>>> -        blk_unref(target);
>>> -    }
>>> -}
>>> -
>>>   static const BlockJobDriver mirror_job_driver = {
>>>       .job_driver = {
>>>           .instance_size          = sizeof(MirrorBlockJob),
>>> @@ -1172,7 +1154,6 @@ static const BlockJobDriver mirror_job_driver = {
>>>           .complete               = mirror_complete,
>>>       },
>>>       .drained_poll           = mirror_drained_poll,
>>> -    .drain                  = mirror_drain,
>>>   };
>>>   
>>>   static const BlockJobDriver commit_active_job_driver = {
>>> @@ -1189,7 +1170,6 @@ static const BlockJobDriver commit_active_job_driver = {
>>>           .complete               = mirror_complete,
>>>       },
>>>       .drained_poll           = mirror_drained_poll,
>>> -    .drain                  = mirror_drain,
>>>   };
>>>   
>>>   static void coroutine_fn
>>> diff --git a/blockjob.c b/blockjob.c
>>> index 20b7f557da..78cf71d6c8 100644
>>> --- a/blockjob.c
>>> +++ b/blockjob.c
>>> @@ -92,12 +92,15 @@ void block_job_free(Job *job)
>>>   void block_job_drain(Job *job)
>>>   {
>>>       BlockJob *bjob = container_of(job, BlockJob, job);
>>> -    const JobDriver *drv = job->driver;
>>> -    BlockJobDriver *bjdrv = container_of(drv, BlockJobDriver, job_driver);
>>> +    GSList *l;
>>>   
>>> -    blk_drain(bjob->blk);
>>> -    if (bjdrv->drain) {
>>> -        bjdrv->drain(bjob);
>>> +    for (l = bjob->nodes; l; l = l->next) {
>>> +        BdrvChild *c = l->data;
>>> +        bdrv_drained_begin(c->bs);
>>> +    }
>>> +    for (l = bjob->nodes; l; l = l->next) {
>>> +        BdrvChild *c = l->data;
>>> +        bdrv_drained_end(c->bs);
>>>       }
>>>   }
>>>   
>>>
>>
>> Seems much nicer to me. What becomes of the ref/unref pairs?
>>
>> I guess not needed anymore?, since job cleanup necessarily happens in
>> the main loop context now and we don't have a backup_complete function
>> anymore ...?
> 
> What pairs do you mean?

blk_ref / blk_unref in the backup and mirror specific drain paths.

> 
>>
>> In the cases where auto_finalize=true, do we have any guarantee that the
>> completion callbacks cannot be scheduled while we are here?
>>
> 
> Hmm, not simple for me to assume.. Is it a problem? And is it about this patch?
>
Max Reitz Aug. 1, 2019, 7:44 p.m. UTC | #4
On 30.07.19 21:11, John Snow wrote:
> 
> 
> On 7/24/19 5:40 AM, Vladimir Sementsov-Ogievskiy wrote:
>> Instead of draining additional nodes in each job code, let's do it in
>> common block_job_drain, draining just all job's children.
>> BlockJobDriver.drain becomes unused, so, drop it at all.
>>
>> It's also a first step to finally get rid of blockjob->blk.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>
>> v3: just resend, as I've some auto returned mails and not sure that
>>     v2 reached recipients.
>>
>> v2: apply Max's suggestions:
>>  - drop BlockJobDriver.drain
>>  - do firtly loop of bdrv_drained_begin and then separate loop
>>    of bdrv_drained_end.
>>
>>    Hmm, a question here: should I call bdrv_drained_end in reverse
>>    order? Or it's OK as is?
>>
> 
> I think it should be OK. These nodes don't necessarily have a well
> defined relationship between each other, do they?

It’s OK.  If they do have a relationship, the drain code sorts it out by
itself anyway.

[...]

> Seems much nicer to me. What becomes of the ref/unref pairs?
> 
> I guess not needed anymore?, since job cleanup necessarily happens in
> the main loop context now and we don't have a backup_complete function
> anymore ...?
> 
> In the cases where auto_finalize=true, do we have any guarantee that the
> completion callbacks cannot be scheduled while we are here?

Let me try to figure this out...

The only caller of block_job_drain() is job_drain(), whose only caller
is job_finish_sync() in an AIO_WAIT_WHILE() loop.  That loop can only
work in the main loop, so I suppose it does run in the main loop (and
consequentially, block_job_drain() runs in the main loop, too).

That AIO_WAIT_WHILE() loop drains the job (so all nodes) and waits until
the job has completed.  To me that looks like it is designed to have the
completion callback run at some point...?

I suppose anything like that is scheduled by job_enter() in job_drain(),
namely the aio_co_enter() in there.

(A) If the job runs in the main AioContext, aio_co_enter() will enter
the coroutine if we do not run in a coroutine right now (safe), or it
will schedule it for a later point if we do run in a coroutine.  That
latter part may be unsafe, because I guess some coroutine work in
bdrv_drained_begin() or bdrv_drained_end() may wake it up, which can
then mess everything up.

(B) If the job runs in another context, aio_co_enter() will schedule the
job immediately, which I guess means that it can run at any point?  So
it could complete at any point, including block_job_drain().  Ah, no,
actually.  AIO_WAIT_WHILE() will have the job’s context acquired while
evaluating the condition (that is, when calling block_job_drain()).  So
this is safe.


So, well.  Unless Vladimir can give you a guarantee why e.g.
block_job_remove_all_bdrv() won’t run during e.g. bdrv_drained_begin(),
I suppose you’re right and the node list needs to be copied at the
beginning of this function and all nodes should be ref’d.

Max
Vladimir Sementsov-Ogievskiy Aug. 2, 2019, 8:45 a.m. UTC | #5
01.08.2019 22:44, Max Reitz wrote:
> On 30.07.19 21:11, John Snow wrote:
>>
>>
>> On 7/24/19 5:40 AM, Vladimir Sementsov-Ogievskiy wrote:
>>> Instead of draining additional nodes in each job code, let's do it in
>>> common block_job_drain, draining just all job's children.
>>> BlockJobDriver.drain becomes unused, so, drop it at all.
>>>
>>> It's also a first step to finally get rid of blockjob->blk.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>> ---
>>>
>>> v3: just resend, as I've some auto returned mails and not sure that
>>>      v2 reached recipients.
>>>
>>> v2: apply Max's suggestions:
>>>   - drop BlockJobDriver.drain
>>>   - do firtly loop of bdrv_drained_begin and then separate loop
>>>     of bdrv_drained_end.
>>>
>>>     Hmm, a question here: should I call bdrv_drained_end in reverse
>>>     order? Or it's OK as is?
>>>
>>
>> I think it should be OK. These nodes don't necessarily have a well
>> defined relationship between each other, do they?
> 
> It’s OK.  If they do have a relationship, the drain code sorts it out by
> itself anyway.
> 
> [...]
> 
>> Seems much nicer to me. What becomes of the ref/unref pairs?
>>
>> I guess not needed anymore?, since job cleanup necessarily happens in
>> the main loop context now and we don't have a backup_complete function
>> anymore ...?
>>
>> In the cases where auto_finalize=true, do we have any guarantee that the
>> completion callbacks cannot be scheduled while we are here?
> 
> Let me try to figure this out...
> 
> The only caller of block_job_drain() is job_drain(), whose only caller
> is job_finish_sync() in an AIO_WAIT_WHILE() loop.  That loop can only
> work in the main loop, so I suppose it does run in the main loop (and
> consequentially, block_job_drain() runs in the main loop, too).
> 
> That AIO_WAIT_WHILE() loop drains the job (so all nodes) and waits until
> the job has completed.  To me that looks like it is designed to have the
> completion callback run at some point...?
> 
> I suppose anything like that is scheduled by job_enter() in job_drain(),
> namely the aio_co_enter() in there.
> 
> (A) If the job runs in the main AioContext, aio_co_enter() will enter
> the coroutine if we do not run in a coroutine right now (safe), or it
> will schedule it for a later point if we do run in a coroutine.  That
> latter part may be unsafe, because I guess some coroutine work in
> bdrv_drained_begin() or bdrv_drained_end() may wake it up, which can
> then mess everything up.
> 
> (B) If the job runs in another context, aio_co_enter() will schedule the
> job immediately, which I guess means that it can run at any point?  So
> it could complete at any point, including block_job_drain().  Ah, no,
> actually.  AIO_WAIT_WHILE() will have the job’s context acquired while
> evaluating the condition (that is, when calling block_job_drain()).  So
> this is safe.
> 
> 
> So, well.  Unless Vladimir can give you a guarantee why e.g.
> block_job_remove_all_bdrv() won’t run during e.g. bdrv_drained_begin(),
> I suppose you’re right and the node list needs to be copied at the
> beginning of this function and all nodes should be ref’d.
> 

Thanks a lot! OK, I'll resend
Kevin Wolf Aug. 2, 2019, 1:12 p.m. UTC | #6
Am 01.08.2019 um 21:44 hat Max Reitz geschrieben:
> On 30.07.19 21:11, John Snow wrote:
> > 
> > 
> > On 7/24/19 5:40 AM, Vladimir Sementsov-Ogievskiy wrote:
> >> Instead of draining additional nodes in each job code, let's do it in
> >> common block_job_drain, draining just all job's children.
> >> BlockJobDriver.drain becomes unused, so, drop it at all.
> >>
> >> It's also a first step to finally get rid of blockjob->blk.
> >>
> >> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> >> ---
> >>
> >> v3: just resend, as I've some auto returned mails and not sure that
> >>     v2 reached recipients.
> >>
> >> v2: apply Max's suggestions:
> >>  - drop BlockJobDriver.drain
> >>  - do firtly loop of bdrv_drained_begin and then separate loop
> >>    of bdrv_drained_end.
> >>
> >>    Hmm, a question here: should I call bdrv_drained_end in reverse
> >>    order? Or it's OK as is?
> >>
> > 
> > I think it should be OK. These nodes don't necessarily have a well
> > defined relationship between each other, do they?
> 
> It’s OK.  If they do have a relationship, the drain code sorts it out by
> itself anyway.
> 
> [...]
> 
> > Seems much nicer to me. What becomes of the ref/unref pairs?
> > 
> > I guess not needed anymore?, since job cleanup necessarily happens in
> > the main loop context now and we don't have a backup_complete function
> > anymore ...?
> > 
> > In the cases where auto_finalize=true, do we have any guarantee that the
> > completion callbacks cannot be scheduled while we are here?
> 
> Let me try to figure this out...
> 
> The only caller of block_job_drain() is job_drain(), whose only caller
> is job_finish_sync() in an AIO_WAIT_WHILE() loop.  That loop can only
> work in the main loop, so I suppose it does run in the main loop (and
> consequentially, block_job_drain() runs in the main loop, too).
> 
> That AIO_WAIT_WHILE() loop drains the job (so all nodes) and waits until
> the job has completed.  To me that looks like it is designed to have the
> completion callback run at some point...?

Yes, definitely.

However, I wonder why we do this blk_drain(). We are not really
interested in stopping all requests to the nodes involves in the job, we
just want to get the job to make progress so that it will eventually
complete. Draining looks like the entirely wrong tool to me.

This was introduced in commit bae8196d9f9, and it looks to me as if it
was a hack because drain could work cross-AioContext and everything else
couldn't.

Today we use AIO_WAIT_WHILE() in job_finish_sync(), so I wonder if maybe
the whole drain part is unnecessary now and just doing the job_enter()
part would be enough.

> I suppose anything like that is scheduled by job_enter() in job_drain(),
> namely the aio_co_enter() in there.
> 
> (A) If the job runs in the main AioContext, aio_co_enter() will enter
> the coroutine if we do not run in a coroutine right now (safe), or it
> will schedule it for a later point if we do run in a coroutine.  That
> latter part may be unsafe, because I guess some coroutine work in
> bdrv_drained_begin() or bdrv_drained_end() may wake it up, which can
> then mess everything up.

It should be fine, actually. The drain functions drop out of coroutine
context to avoid such problems. So if it gets woken up, it's before the
actual drain has started.

> (B) If the job runs in another context, aio_co_enter() will schedule the
> job immediately, which I guess means that it can run at any point?  So
> it could complete at any point, including block_job_drain().  Ah, no,
> actually.  AIO_WAIT_WHILE() will have the job’s context acquired while
> evaluating the condition (that is, when calling block_job_drain()).  So
> this is safe.
> 
> 
> So, well.  Unless Vladimir can give you a guarantee why e.g.
> block_job_remove_all_bdrv() won’t run during e.g. bdrv_drained_begin(),
> I suppose you’re right and the node list needs to be copied at the
> beginning of this function and all nodes should be ref’d.

At some point, it will probably run, in the polling phase of
bdrv_drained_begin().

Kevin
diff mbox series

Patch

diff --git a/include/block/blockjob_int.h b/include/block/blockjob_int.h
index e4a318dd15..e1abf4ee85 100644
--- a/include/block/blockjob_int.h
+++ b/include/block/blockjob_int.h
@@ -52,17 +52,6 @@  struct BlockJobDriver {
      * besides job->blk to the new AioContext.
      */
     void (*attached_aio_context)(BlockJob *job, AioContext *new_context);
-
-    /*
-     * If the callback is not NULL, it will be invoked when the job has to be
-     * synchronously cancelled or completed; it should drain BlockDriverStates
-     * as required to ensure progress.
-     *
-     * Block jobs must use the default implementation for job_driver.drain,
-     * which will in turn call this callback after doing generic block job
-     * stuff.
-     */
-    void (*drain)(BlockJob *job);
 };
 
 /**
diff --git a/block/backup.c b/block/backup.c
index 715e1d3be8..7930004bbd 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -320,21 +320,6 @@  void backup_do_checkpoint(BlockJob *job, Error **errp)
     hbitmap_set(backup_job->copy_bitmap, 0, backup_job->len);
 }
 
-static void backup_drain(BlockJob *job)
-{
-    BackupBlockJob *s = container_of(job, BackupBlockJob, common);
-
-    /* Need to keep a reference in case blk_drain triggers execution
-     * of backup_complete...
-     */
-    if (s->target) {
-        BlockBackend *target = s->target;
-        blk_ref(target);
-        blk_drain(target);
-        blk_unref(target);
-    }
-}
-
 static BlockErrorAction backup_error_action(BackupBlockJob *job,
                                             bool read, int error)
 {
@@ -493,8 +478,7 @@  static const BlockJobDriver backup_job_driver = {
         .commit                 = backup_commit,
         .abort                  = backup_abort,
         .clean                  = backup_clean,
-    },
-    .drain                  = backup_drain,
+    }
 };
 
 static int64_t backup_calculate_cluster_size(BlockDriverState *target,
diff --git a/block/mirror.c b/block/mirror.c
index 8cb75fb409..8456ccd89d 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -644,14 +644,11 @@  static int mirror_exit_common(Job *job)
     bdrv_ref(mirror_top_bs);
     bdrv_ref(target_bs);
 
-    /* Remove target parent that still uses BLK_PERM_WRITE/RESIZE before
+    /*
+     * Remove target parent that still uses BLK_PERM_WRITE/RESIZE before
      * inserting target_bs at s->to_replace, where we might not be able to get
      * these permissions.
-     *
-     * Note that blk_unref() alone doesn't necessarily drop permissions because
-     * we might be running nested inside mirror_drain(), which takes an extra
-     * reference, so use an explicit blk_set_perm() first. */
-    blk_set_perm(s->target, 0, BLK_PERM_ALL, &error_abort);
+     */
     blk_unref(s->target);
     s->target = NULL;
 
@@ -1143,21 +1140,6 @@  static bool mirror_drained_poll(BlockJob *job)
     return !!s->in_flight;
 }
 
-static void mirror_drain(BlockJob *job)
-{
-    MirrorBlockJob *s = container_of(job, MirrorBlockJob, common);
-
-    /* Need to keep a reference in case blk_drain triggers execution
-     * of mirror_complete...
-     */
-    if (s->target) {
-        BlockBackend *target = s->target;
-        blk_ref(target);
-        blk_drain(target);
-        blk_unref(target);
-    }
-}
-
 static const BlockJobDriver mirror_job_driver = {
     .job_driver = {
         .instance_size          = sizeof(MirrorBlockJob),
@@ -1172,7 +1154,6 @@  static const BlockJobDriver mirror_job_driver = {
         .complete               = mirror_complete,
     },
     .drained_poll           = mirror_drained_poll,
-    .drain                  = mirror_drain,
 };
 
 static const BlockJobDriver commit_active_job_driver = {
@@ -1189,7 +1170,6 @@  static const BlockJobDriver commit_active_job_driver = {
         .complete               = mirror_complete,
     },
     .drained_poll           = mirror_drained_poll,
-    .drain                  = mirror_drain,
 };
 
 static void coroutine_fn
diff --git a/blockjob.c b/blockjob.c
index 20b7f557da..78cf71d6c8 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -92,12 +92,15 @@  void block_job_free(Job *job)
 void block_job_drain(Job *job)
 {
     BlockJob *bjob = container_of(job, BlockJob, job);
-    const JobDriver *drv = job->driver;
-    BlockJobDriver *bjdrv = container_of(drv, BlockJobDriver, job_driver);
+    GSList *l;
 
-    blk_drain(bjob->blk);
-    if (bjdrv->drain) {
-        bjdrv->drain(bjob);
+    for (l = bjob->nodes; l; l = l->next) {
+        BdrvChild *c = l->data;
+        bdrv_drained_begin(c->bs);
+    }
+    for (l = bjob->nodes; l; l = l->next) {
+        BdrvChild *c = l->data;
+        bdrv_drained_end(c->bs);
     }
 }