Message ID | 20190605123229.92848-3-vsementsov@virtuozzo.com |
---|---|
State | New |
Series | introduce pinned blk |
On 05.06.2019 at 14:32, Vladimir Sementsov-Ogievskiy wrote:
> child_role job already has .stay_at_node=true, so on a bdrv_replace_node
> operation these children are unchanged. Make the block job blk behave in the
> same manner, to avoid inconsistent intermediate graph states and workarounds
> like the one in mirror.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

This feels dangerous. It does what you want it to do if the only graph
change below the BlockBackend is the one in mirror_exit_common. But the
user could also take a snapshot, or in the future hopefully insert a
filter node, and you would then want the BlockBackend to move.

To be honest, even BdrvChildRole.stay_at_node is a bit of a hack. But at
least it's only used for permissions and not for the actual data flow.

Kevin

05.06.2019 20:11, Kevin Wolf wrote:
> This feels dangerous. It does what you want it to do if the only graph
> change below the BlockBackend is the one in mirror_exit_common. But the
> user could also take a snapshot, or in the future hopefully insert a
> filter node, and you would then want the BlockBackend to move.
>
> To be honest, even BdrvChildRole.stay_at_node is a bit of a hack. But at
> least it's only used for permissions and not for the actual data flow.

Hmm. Then maybe just add a parameter to bdrv_replace_node that says which
parents to ignore? Would that work?

On 05.06.2019 at 19:16, Vladimir Sementsov-Ogievskiy wrote:
> Hmm. Then maybe just add a parameter to bdrv_replace_node that says which
> parents to ignore? Would that work?

I would have to think a bit more about it, but it does sound safer.

Or we take a step back and ask why it's even a problem for the mirror
block job if the BlockBackend is moved to a different node. The main
reason I see is because of bs->job that is set for the root node of the
BlockBackend and needs to be unset for the same node.

Maybe we can just finally get rid of bs->job? It doesn't have many users
any more.

Kevin

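To make the "parents to ignore" idea concrete, here is a minimal toy sketch. It is not QEMU code and does not reflect the real bdrv_replace_node() signature: ToyNode, ToyParent, ToyIgnoreFn, toy_replace_node() and toy_ignore_one() are all hypothetical names invented for illustration. The point it tries to show is that the exception is chosen per call, so a later graph change (such as a snapshot) can still move the same parent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins only; these are NOT the QEMU data structures. */
typedef struct ToyNode {
    const char *name;
} ToyNode;

typedef struct ToyParent {
    const char *name;
    ToyNode *child;          /* node this parent currently points at */
} ToyParent;

/* Hypothetical callback: true means "leave this parent where it is". */
typedef bool (*ToyIgnoreFn)(const ToyParent *parent, void *opaque);

/*
 * Move every parent of 'from' over to 'to', except the parents the
 * caller asks to ignore for this one call. Returns the number moved.
 */
static size_t toy_replace_node(ToyParent *parents, size_t n,
                               ToyNode *from, ToyNode *to,
                               ToyIgnoreFn ignore, void *opaque)
{
    size_t moved = 0;

    assert(from && to);
    for (size_t i = 0; i < n; i++) {
        if (parents[i].child != from) {
            continue;        /* not a parent of the replaced node */
        }
        if (ignore && ignore(&parents[i], opaque)) {
            continue;        /* caller wants this parent to stay */
        }
        parents[i].child = to;
        moved++;
    }
    return moved;
}

/* Example predicate: ignore exactly the one parent passed as opaque. */
static bool toy_ignore_one(const ToyParent *parent, void *opaque)
{
    return parent == opaque;
}
```

In this model, a caller in the position of mirror_exit_common would pass the job's own parent as the one to ignore, while an unrelated operation would pass no predicate and move every parent, including the job's.
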
06.06.2019 13:05, Kevin Wolf wrote:
> Or we take a step back and ask why it's even a problem for the mirror
> block job if the BlockBackend is moved to a different node. The main
> reason I see is because of bs->job that is set for the root node of the
> BlockBackend and needs to be unset for the same node.
>
> Maybe we can just finally get rid of bs->job? It doesn't have many users
> any more.

Hmm, I looked at it. I'm not sure what should be refactored around the job to
get rid of the "main node" concept, which seems to interact badly with
starting a job on an implicit filter as the main node.

But as for just removing the bs->job pointer, I don't know, at least, what to
do with blk_iostatus_reset and blockdev_mark_auto_del.

On 06.06.2019 at 14:29, Vladimir Sementsov-Ogievskiy wrote:
> But as for just removing the bs->job pointer, I don't know, at least, what to
> do with blk_iostatus_reset and blockdev_mark_auto_del.

blk_iostatus_reset() looks easy. It has only two callers:

1. blk_attach_dev(). This doesn't have anything to do with jobs and
   attaching a new guest device won't solve any problem the job
   encountered, so no reason to reset the iostatus for the job.

2. qmp_cont(). This resets the iostatus for everything. We can just
   call block_job_iostatus_reset() for all block jobs instead of going
   through BlockBackend.

blockdev_mark_auto_del() might be a bit trickier. The whole idea of the
function is: When a guest device gets unplugged, automatically remove
its root block node, too. Commit 12bde0eed6b made it cancel a block job
because that should happen immediately when the device is actually
released by the guest and not only after the job finishes and gives up
its reference. I would like to just change the behaviour, but I'm afraid
we can't do this because of compatibility.

However, just checking bs->job is really only one special case of
another user of the node to be deleted. Maybe we can extend it a little
so that any block jobs that contain the node in job->nodes are
cancelled.

Kevin

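The two suggestions above can be sketched with a small toy model. This is only an illustration under stated assumptions, not the QEMU implementation: ToyNode, ToyJob, toy_job_uses_node(), toy_cancel_jobs_using() and toy_reset_all_job_iostatus() are hypothetical names, and a fixed-size array stands in for whatever job->nodes really is.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_NODES 4

/* Toy stand-ins only; these are NOT the QEMU data structures. */
typedef struct ToyNode {
    const char *name;
} ToyNode;

typedef struct ToyJob {
    const char *id;
    ToyNode *nodes[TOY_MAX_NODES];   /* models the job->nodes list */
    size_t n_nodes;
    bool cancelled;
    bool iostatus_error;             /* models a failed job iostatus */
} ToyJob;

/* True if the job references the node anywhere in its node list. */
static bool toy_job_uses_node(const ToyJob *job, const ToyNode *node)
{
    assert(job->n_nodes <= TOY_MAX_NODES);
    for (size_t i = 0; i < job->n_nodes; i++) {
        if (job->nodes[i] == node) {
            return true;
        }
    }
    return false;
}

/*
 * Unplug path: cancel every job that references the node, instead of
 * only the single job a bs->job style back-pointer would find.
 * Returns the number of jobs newly cancelled.
 */
static size_t toy_cancel_jobs_using(ToyJob *jobs, size_t n,
                                    const ToyNode *node)
{
    size_t cancelled = 0;

    for (size_t i = 0; i < n; i++) {
        if (!jobs[i].cancelled && toy_job_uses_node(&jobs[i], node)) {
            jobs[i].cancelled = true;
            cancelled++;
        }
    }
    return cancelled;
}

/* "cont" path: reset the iostatus of all jobs directly, without
 * reaching them through a BlockBackend. */
static void toy_reset_all_job_iostatus(ToyJob *jobs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        jobs[i].iostatus_error = false;
    }
}
```

Neither function needs a pointer from the node back to "its" job, which is the property that would let bs->job go away in this simplified picture.
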
06.06.2019 16:06, Kevin Wolf wrote:
> blk_iostatus_reset() looks easy. It has only two callers:
> [...]
> However, just checking bs->job is really only one special case of
> another user of the node to be deleted. Maybe we can extend it a little
> so that any block jobs that contain the node in job->nodes are
> cancelled.

OK, thanks. I'll try this way.

diff --git a/block/mirror.c b/block/mirror.c
index f8bdb5b21b..23443116e4 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -713,12 +713,8 @@ static int mirror_exit_common(Job *job)
                       &error_abort);
     bdrv_replace_node(mirror_top_bs, backing_bs(mirror_top_bs), &error_abort);
 
-    /* We just changed the BDS the job BB refers to (with either or both of the
-     * bdrv_replace_node() calls), so switch the BB back so the cleanup does
-     * the right thing. We don't need any permissions any more now. */
-    blk_remove_bs(bjob->blk);
+    /* We don't need any permissions any more now. */
     blk_set_perm(bjob->blk, 0, BLK_PERM_ALL, &error_abort);
-    blk_insert_bs(bjob->blk, mirror_top_bs, &error_abort);
 
     bs_opaque->job = NULL;
 
diff --git a/blockjob.c b/blockjob.c
index 931d675c0c..f5c8d31491 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -398,7 +398,7 @@ void *block_job_create(const char *job_id, const BlockJobDriver *driver,
         job_id = bdrv_get_device_name(bs);
     }
 
-    blk = blk_new(bdrv_get_aio_context(bs), perm, shared_perm);
+    blk = blk_new_pinned(bdrv_get_aio_context(bs), perm, shared_perm);
     ret = blk_insert_bs(blk, bs, errp);
     if (ret < 0) {
         blk_unref(blk);

child_role job already has .stay_at_node=true, so on a bdrv_replace_node
operation these children are unchanged. Make the block job blk behave in the
same manner, to avoid inconsistent intermediate graph states and workarounds
like the one in mirror.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/mirror.c | 6 +-----
 blockjob.c     | 2 +-
 2 files changed, 2 insertions(+), 6 deletions(-)