Message ID | 20171212160450.17510-3-vsementsov@virtuozzo.com |
---|---|
State | New |
Headers | show |
Series | fix bitmaps migration through shared storage | expand |
Am 12.12.2017 um 17:04 hat Vladimir Sementsov-Ogievskiy geschrieben: > Consider migration with shared storage. Persistent bitmaps are stored > on bdrv_inactivate. Then, on destination > process_incoming_migration_bh() calls bdrv_invalidate_cache_all() which > leads to qcow2_load_autoloading_dirty_bitmaps() which fails if bitmaps > are already loaded on destination start. In this case we should call > qcow2_reopen_bitmaps_rw instead. > > Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> > Reviewed-by: John Snow <jsnow@redhat.com> qcow2_invalidate_cache() calls qcow2_close() first, so why are there still any bitmaps loaded? Isn't this a bug? Do we leak bitmaps when a qcow2 image is closed? Kevin
22.12.2017 16:39, Kevin Wolf wrote: > Am 12.12.2017 um 17:04 hat Vladimir Sementsov-Ogievskiy geschrieben: >> Consider migration with shared storage. Persistent bitmaps are stored >> on bdrv_inactivate. Then, on destination >> process_incoming_migration_bh() calls bdrv_invalidate_cache_all() which >> leads to qcow2_load_autoloading_dirty_bitmaps() which fails if bitmaps >> are already loaded on destination start. In this case we should call >> qcow2_reopen_bitmaps_rw instead. >> >> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> >> Reviewed-by: John Snow <jsnow@redhat.com> > qcow2_invalidate_cache() calls qcow2_close() first, so why are there > still any bitmaps loaded? Isn't this a bug? Do we leak bitmaps when a > qcow2 image is closed? > > Kevin Interesting point. Now persistent dirty bitmaps are released at the end of bdrv_inactivate_recurse, which is not called here. It was a separate patch commit 615b5dcf2decbc5f0abb512d13d7e5db2385fa23 Author: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Date: Wed Jun 28 15:05:30 2017 +0300 block: release persistent bitmaps on inactivate May be it is more correct to release them immediately after storing, in qcow2_inactivete. But it will not fix the issue, because qcow2_close will call qcow2_inactivate only if (!(s->flags & BDRV_O_INACTIVE)), but it is not the case. or we can do like this, it fixes the new test: static void qcow2_close(BlockDriverState *bs) { BDRVQcow2State *s = bs->opaque; qemu_vfree(s->l1_table); /* else pre-write overlap checks in cache_destroy may crash */ s->l1_table = NULL; if (!(s->flags & BDRV_O_INACTIVE)) { qcow2_inactivate(bs); } + bdrv_release_persistent_dirty_bitmaps(bs); What do you think?
Am 22.12.2017 um 15:25 hat Vladimir Sementsov-Ogievskiy geschrieben: > 22.12.2017 16:39, Kevin Wolf wrote: > > Am 12.12.2017 um 17:04 hat Vladimir Sementsov-Ogievskiy geschrieben: > > > Consider migration with shared storage. Persistent bitmaps are stored > > > on bdrv_inactivate. Then, on destination > > > process_incoming_migration_bh() calls bdrv_invalidate_cache_all() which > > > leads to qcow2_load_autoloading_dirty_bitmaps() which fails if bitmaps > > > are already loaded on destination start. In this case we should call > > > qcow2_reopen_bitmaps_rw instead. > > > > > > Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> > > > Reviewed-by: John Snow <jsnow@redhat.com> > > qcow2_invalidate_cache() calls qcow2_close() first, so why are there > > still any bitmaps loaded? Isn't this a bug? Do we leak bitmaps when a > > qcow2 image is closed? > > > > Kevin > > Interesting point. > > Now persistent dirty bitmaps are released at the end of > bdrv_inactivate_recurse, > which is not called here. It was a separate patch > > commit 615b5dcf2decbc5f0abb512d13d7e5db2385fa23 > Author: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> > Date: Wed Jun 28 15:05:30 2017 +0300 > > block: release persistent bitmaps on inactivate > > May be it is more correct to release them immediately after storing, in > qcow2_inactivete. I chose the question form because I'm really not deep enough into the bitmap code to have a solid opinion, but I have a feeling that releasing the bitmaps from the block driver that provided them would be cleaner indeed. I suppose the same is true for .bdrv_close. > But it will not fix the issue, because qcow2_close will > call qcow2_inactivate only if (!(s->flags & BDRV_O_INACTIVE)), but it is > not the case. Yes, good point. Is there a reason why bitmaps are already loaded in qcow2_do_open() even if the image is inactive? Can bitmaps be meaningfully used on inactive images? Otherwise, we could just make qcow2_load_autoloading_dirty_bitmaps() conditional on cleared BDRV_O_INACTIVE. > or we can do like this, it fixes the new test: > static void qcow2_close(BlockDriverState *bs) > { > BDRVQcow2State *s = bs->opaque; > qemu_vfree(s->l1_table); > /* else pre-write overlap checks in cache_destroy may crash */ > s->l1_table = NULL; > > if (!(s->flags & BDRV_O_INACTIVE)) { > qcow2_inactivate(bs); > } > + bdrv_release_persistent_dirty_bitmaps(bs); > > What do you think? Hm, I think I don't like this much. We just need to decide what the status of inactive images is supposed to be. If they should have bitmaps, then your patch is probably right. But if inactive images shouldn't have any, we need to change qcow2_do_open() and qcow2_inactivate(). Kevin
22.12.2017 18:43, Kevin Wolf wrote: > Am 22.12.2017 um 15:25 hat Vladimir Sementsov-Ogievskiy geschrieben: >> 22.12.2017 16:39, Kevin Wolf wrote: >>> Am 12.12.2017 um 17:04 hat Vladimir Sementsov-Ogievskiy geschrieben: >>>> Consider migration with shared storage. Persistent bitmaps are stored >>>> on bdrv_inactivate. Then, on destination >>>> process_incoming_migration_bh() calls bdrv_invalidate_cache_all() which >>>> leads to qcow2_load_autoloading_dirty_bitmaps() which fails if bitmaps >>>> are already loaded on destination start. In this case we should call >>>> qcow2_reopen_bitmaps_rw instead. >>>> >>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> >>>> Reviewed-by: John Snow <jsnow@redhat.com> >>> qcow2_invalidate_cache() calls qcow2_close() first, so why are there >>> still any bitmaps loaded? Isn't this a bug? Do we leak bitmaps when a >>> qcow2 image is closed? >>> >>> Kevin >> Interesting point. >> >> Now persistent dirty bitmaps are released at the end of >> bdrv_inactivate_recurse, >> which is not called here. It was a separate patch >> >> commit 615b5dcf2decbc5f0abb512d13d7e5db2385fa23 >> Author: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> >> Date: Wed Jun 28 15:05:30 2017 +0300 >> >> block: release persistent bitmaps on inactivate >> >> May be it is more correct to release them immediately after storing, in >> qcow2_inactivete. > I chose the question form because I'm really not deep enough into the > bitmap code to have a solid opinion, but I have a feeling that releasing > the bitmaps from the block driver that provided them would be cleaner > indeed. I suppose the same is true for .bdrv_close. > >> But it will not fix the issue, because qcow2_close will >> call qcow2_inactivate only if (!(s->flags & BDRV_O_INACTIVE)), but it is >> not the case. > Yes, good point. > > Is there a reason why bitmaps are already loaded in qcow2_do_open() even > if the image is inactive? Can bitmaps be meaningfully used on inactive > images? > > Otherwise, we could just make qcow2_load_autoloading_dirty_bitmaps() > conditional on cleared BDRV_O_INACTIVE. > >> or we can do like this, it fixes the new test: >> static void qcow2_close(BlockDriverState *bs) >> { >> BDRVQcow2State *s = bs->opaque; >> qemu_vfree(s->l1_table); >> /* else pre-write overlap checks in cache_destroy may crash */ >> s->l1_table = NULL; >> >> if (!(s->flags & BDRV_O_INACTIVE)) { >> qcow2_inactivate(bs); >> } >> + bdrv_release_persistent_dirty_bitmaps(bs); >> >> What do you think? > Hm, I think I don't like this much. > > We just need to decide what the status of inactive images is supposed to > be. If they should have bitmaps, then your patch is probably right. But > if inactive images shouldn't have any, we need to change qcow2_do_open() > and qcow2_inactivate(). > > Kevin Does Qemu start in inactive mode when and only when it is incoming migration? In this case I don't see any reason of early-load the bitmaps. Backup in inactive mode should not be allowed too, yes? So, it looks like it's ok to just do not autoload bitmaps if we are in inactive mode. The difference would be that user will not see these bitmaps during migration. And he even may create bitmaps with same names, which will lead to fault. But it don't look like real problem. So, if there will not be other thoughts, I'll make another patch.
Am 22.12.2017 um 17:12 hat Vladimir Sementsov-Ogievskiy geschrieben: > 22.12.2017 18:43, Kevin Wolf wrote: > > Am 22.12.2017 um 15:25 hat Vladimir Sementsov-Ogievskiy geschrieben: > > > 22.12.2017 16:39, Kevin Wolf wrote: > > > > Am 12.12.2017 um 17:04 hat Vladimir Sementsov-Ogievskiy geschrieben: > > > > > Consider migration with shared storage. Persistent bitmaps are stored > > > > > on bdrv_inactivate. Then, on destination > > > > > process_incoming_migration_bh() calls bdrv_invalidate_cache_all() which > > > > > leads to qcow2_load_autoloading_dirty_bitmaps() which fails if bitmaps > > > > > are already loaded on destination start. In this case we should call > > > > > qcow2_reopen_bitmaps_rw instead. > > > > > > > > > > Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> > > > > > Reviewed-by: John Snow <jsnow@redhat.com> > > > > qcow2_invalidate_cache() calls qcow2_close() first, so why are there > > > > still any bitmaps loaded? Isn't this a bug? Do we leak bitmaps when a > > > > qcow2 image is closed? > > > > > > > > Kevin > > > Interesting point. > > > > > > Now persistent dirty bitmaps are released at the end of > > > bdrv_inactivate_recurse, > > > which is not called here. It was a separate patch > > > > > > commit 615b5dcf2decbc5f0abb512d13d7e5db2385fa23 > > > Author: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> > > > Date: Wed Jun 28 15:05:30 2017 +0300 > > > > > > block: release persistent bitmaps on inactivate > > > > > > May be it is more correct to release them immediately after storing, in > > > qcow2_inactivete. > > I chose the question form because I'm really not deep enough into the > > bitmap code to have a solid opinion, but I have a feeling that releasing > > the bitmaps from the block driver that provided them would be cleaner > > indeed. I suppose the same is true for .bdrv_close. > > > > > But it will not fix the issue, because qcow2_close will > > > call qcow2_inactivate only if (!(s->flags & BDRV_O_INACTIVE)), but it is > > > not the case. > > Yes, good point. > > > > Is there a reason why bitmaps are already loaded in qcow2_do_open() even > > if the image is inactive? Can bitmaps be meaningfully used on inactive > > images? > > > > Otherwise, we could just make qcow2_load_autoloading_dirty_bitmaps() > > conditional on cleared BDRV_O_INACTIVE. > > > > > or we can do like this, it fixes the new test: > > > static void qcow2_close(BlockDriverState *bs) > > > { > > > BDRVQcow2State *s = bs->opaque; > > > qemu_vfree(s->l1_table); > > > /* else pre-write overlap checks in cache_destroy may crash */ > > > s->l1_table = NULL; > > > > > > if (!(s->flags & BDRV_O_INACTIVE)) { > > > qcow2_inactivate(bs); > > > } > > > + bdrv_release_persistent_dirty_bitmaps(bs); > > > > > > What do you think? > > Hm, I think I don't like this much. > > > > We just need to decide what the status of inactive images is supposed to > > be. If they should have bitmaps, then your patch is probably right. But > > if inactive images shouldn't have any, we need to change qcow2_do_open() > > and qcow2_inactivate(). > > > > Kevin > > Does Qemu start in inactive mode when and only when it is incoming > migration? In this case I don't see any reason of early-load the > bitmaps. Backup in inactive mode should not be allowed too, yes? Yes, inactive images are only for incoming migration at the moment. I would like to change that into opening all images as inactive and then immediately activate them (just to unify two code paths, one of which isn't tested as much as it should be), but that wouldn't be visible externally, so it should be fine, too. > So, it looks like it's ok to just do not autoload bitmaps if we are in > inactive mode. The difference would be that user will not see these > bitmaps during migration. And he even may create bitmaps with same > names, which will lead to fault. But it don't look like real problem. You can't create persistent bitmaps on an inactive image anyway, because that would involve writes. Conflicts with temporary bitmaps might be possible, though. Kevin
22.12.2017 19:28, Kevin Wolf wrote: > Am 22.12.2017 um 17:12 hat Vladimir Sementsov-Ogievskiy geschrieben: >> 22.12.2017 18:43, Kevin Wolf wrote: >>> Am 22.12.2017 um 15:25 hat Vladimir Sementsov-Ogievskiy geschrieben: >>>> 22.12.2017 16:39, Kevin Wolf wrote: >>>>> Am 12.12.2017 um 17:04 hat Vladimir Sementsov-Ogievskiy geschrieben: >>>>>> Consider migration with shared storage. Persistent bitmaps are stored >>>>>> on bdrv_inactivate. Then, on destination >>>>>> process_incoming_migration_bh() calls bdrv_invalidate_cache_all() which >>>>>> leads to qcow2_load_autoloading_dirty_bitmaps() which fails if bitmaps >>>>>> are already loaded on destination start. In this case we should call >>>>>> qcow2_reopen_bitmaps_rw instead. >>>>>> >>>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> >>>>>> Reviewed-by: John Snow <jsnow@redhat.com> >>>>> qcow2_invalidate_cache() calls qcow2_close() first, so why are there >>>>> still any bitmaps loaded? Isn't this a bug? Do we leak bitmaps when a >>>>> qcow2 image is closed? >>>>> >>>>> Kevin >>>> Interesting point. >>>> >>>> Now persistent dirty bitmaps are released at the end of >>>> bdrv_inactivate_recurse, >>>> which is not called here. It was a separate patch >>>> >>>> commit 615b5dcf2decbc5f0abb512d13d7e5db2385fa23 >>>> Author: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> >>>> Date: Wed Jun 28 15:05:30 2017 +0300 >>>> >>>> block: release persistent bitmaps on inactivate >>>> >>>> May be it is more correct to release them immediately after storing, in >>>> qcow2_inactivete. >>> I chose the question form because I'm really not deep enough into the >>> bitmap code to have a solid opinion, but I have a feeling that releasing >>> the bitmaps from the block driver that provided them would be cleaner >>> indeed. I suppose the same is true for .bdrv_close. >>> >>>> But it will not fix the issue, because qcow2_close will >>>> call qcow2_inactivate only if (!(s->flags & BDRV_O_INACTIVE)), but it is >>>> not the case. >>> Yes, good point. >>> >>> Is there a reason why bitmaps are already loaded in qcow2_do_open() even >>> if the image is inactive? Can bitmaps be meaningfully used on inactive >>> images? >>> >>> Otherwise, we could just make qcow2_load_autoloading_dirty_bitmaps() >>> conditional on cleared BDRV_O_INACTIVE. >>> >>>> or we can do like this, it fixes the new test: >>>> static void qcow2_close(BlockDriverState *bs) >>>> { >>>> BDRVQcow2State *s = bs->opaque; >>>> qemu_vfree(s->l1_table); >>>> /* else pre-write overlap checks in cache_destroy may crash */ >>>> s->l1_table = NULL; >>>> >>>> if (!(s->flags & BDRV_O_INACTIVE)) { >>>> qcow2_inactivate(bs); >>>> } >>>> + bdrv_release_persistent_dirty_bitmaps(bs); >>>> >>>> What do you think? >>> Hm, I think I don't like this much. >>> >>> We just need to decide what the status of inactive images is supposed to >>> be. If they should have bitmaps, then your patch is probably right. But >>> if inactive images shouldn't have any, we need to change qcow2_do_open() >>> and qcow2_inactivate(). >>> >>> Kevin >> Does Qemu start in inactive mode when and only when it is incoming >> migration? In this case I don't see any reason of early-load the >> bitmaps. Backup in inactive mode should not be allowed too, yes? > Yes, inactive images are only for incoming migration at the moment. > > I would like to change that into opening all images as inactive and then > immediately activate them (just to unify two code paths, one of which > isn't tested as much as it should be), but that wouldn't be visible > externally, so it should be fine, too. similar approach led to a problem for us: To make it possible to modify drives through nbd before start (a kind of external reatore), I've added simple @@ -4727,6 +4727,8 @@ int main(int argc, char **argv, char **envp) } } else if (autostart) { vm_start(); + } else { + bdrv_inactivate_all(); } and now, we have no dirty bitmaps between paused start and cont. And it is not comfortable, we can't query, can't do something with these bitmaps before vm start. > >> So, it looks like it's ok to just do not autoload bitmaps if we are in >> inactive mode. The difference would be that user will not see these >> bitmaps during migration. And he even may create bitmaps with same >> names, which will lead to fault. But it don't look like real problem. > You can't create persistent bitmaps on an inactive image anyway, because > that would involve writes. > > Conflicts with temporary bitmaps might be possible, though. > > Kevin
diff --git a/block/qcow2.c b/block/qcow2.c index 1914a940e5..f5f205e1b7 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -1450,7 +1450,13 @@ static int qcow2_do_open(BlockDriverState *bs, QDict *options, int flags, s->autoclear_features &= QCOW2_AUTOCLEAR_MASK; } - if (qcow2_load_autoloading_dirty_bitmaps(bs, &local_err)) { + if (bdrv_has_readonly_bitmaps(bs)) { + if (!bdrv_is_read_only(bs) && !(bdrv_get_flags(bs) & BDRV_O_INACTIVE)) { + bool header_updated = false; + qcow2_reopen_bitmaps_rw_hint(bs, &header_updated, &local_err); + update_header = update_header && !header_updated; + } + } else if (qcow2_load_autoloading_dirty_bitmaps(bs, &local_err)) { update_header = false; } if (local_err != NULL) {