Message ID: 4E316EFD.6080304@redhat.com
State: New
Am 28.07.2011 16:15, schrieb Kevin Wolf:
> Am 28.07.2011 15:50, schrieb Frediano Ziglio:
>> Well, I think this is the first real improvement patch.
>> It's more an RFC than a patch. Yes, some lines are terrible!
>> It collapses refcount decrements during COW.
>> From a first check, the time to run the 015 test dropped from about
>> 600 seconds to 70.
>> This at least proves that refcount updates count!
>> Some doubts:
>> 1- place the code in qcow2-refcount.c, as it updates only refcounts
>>    and not clusters?
>> 2- allow some sort of "begin transaction" / "commit" / "rollback" like
>>    databases instead?
>> 3- allow changing tables from different coroutines?
>>
>> 1) If you have a sequence like (1, 2, 4), these clusters are probably
>> all in the same L2 table, but with this code you get two writes
>> instead of one. I'm thinking about a function in qcow2-refcount.c that
>> accepts an array of clusters instead of a start + len.
>>
>> Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
>
> I think what you're seeing is actually just one special case of a more
> general problem. The problem is that we're interpreting writethrough
> more strictly than required.
>
> The semantics that we really need are that on completion of a request,
> all of its data and metadata must be flushed to disk. There is no
> requirement that we flush all intermediate states.
>
> My recent update to qcow2_update_snapshot_refcount() is just another
> case of the same problem. I think the solution should be similar to
> what I did there, i.e. switch the cache to writeback mode while we're
> operating on it and switch back when we're done. We should probably
> have functions that make both of these a one-liner (I think here we
> have some similarity to your begin/commit idea).
> With the right functions, this could become as easy as this (might
> need better function names, but you get the idea):
>
> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
> index 882f50a..45b67b1 100644
> --- a/block/qcow2-cluster.c
> +++ b/block/qcow2-cluster.c
> @@ -612,6 +612,8 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
>      if (m->nb_clusters == 0)
>          return 0;
>
> +    qcow2_cache_disable_writethrough(bs);
> +
>      old_cluster = qemu_malloc(m->nb_clusters * sizeof(uint64_t));
>
>      /* copy content of unmodified sectors */
> @@ -683,6 +685,7 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
>
>      ret = 0;
>  err:
> +    qcow2_cache_restore_writethrough(bs);
>      qemu_free(old_cluster);
>      return ret;
>  }

Maybe that's a bit too easy for a solution. With coroutines this
requires running under a CoMutex in order to avoid influencing other
requests and possibly missing a cache flush. This is contrary to our
goal of running requests in parallel, so maybe some more changes to how
the cache handles cache=writethrough are required.

Kevin
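[Editor's note: the `qcow2_cache_disable_writethrough()` / `qcow2_cache_restore_writethrough()` functions Kevin proposes above did not exist at the time of this thread. The sketch below only models the idea with a toy cache structure; the type, field names, and nesting counter are all invented for illustration, not QEMU code. The point it demonstrates: while writethrough is disabled, metadata updates are merely marked dirty in memory, and a single flush is paid when writethrough is restored.]

```c
#include <stdbool.h>

/* Toy stand-in for the qcow2 metadata cache (hypothetical, not QEMU). */
typedef struct Qcow2CacheSketch {
    bool writethrough;   /* flush to disk on every update? */
    int  depth;          /* nesting count for disable/restore pairs */
    int  dirty_updates;  /* updates buffered while in write-back mode */
    int  flushes;        /* how many times we hit the disk */
} Qcow2CacheSketch;

static void cache_update(Qcow2CacheSketch *c)
{
    if (c->writethrough) {
        c->flushes++;        /* writethrough: each update costs a flush */
    } else {
        c->dirty_updates++;  /* write-back: just mark dirty in memory */
    }
}

static void cache_disable_writethrough(Qcow2CacheSketch *c)
{
    if (c->depth++ == 0) {
        c->writethrough = false;
    }
}

static void cache_restore_writethrough(Qcow2CacheSketch *c)
{
    if (--c->depth == 0) {
        c->writethrough = true;
        if (c->dirty_updates > 0) {
            c->dirty_updates = 0;
            c->flushes++;    /* one combined flush on request completion */
        }
    }
}
```

With this model, three metadata updates inside a disable/restore pair cost one flush instead of three, which is exactly the "flush only the final state on completion" semantics the email argues for. The nesting counter makes the pair safe to use from helpers that may themselves be called inside an outer pair.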
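[Editor's note: Frediano's point (1), that an array-of-clusters API could merge updates landing in the same table where a start + len API cannot, can be illustrated with a small counting helper. Everything below is hypothetical: `REFCOUNTS_PER_BLOCK` is an assumed constant and the function is not part of QEMU; it only counts how many distinct refcount blocks a sorted list of cluster indices touches, i.e. the number of table writes an array-based update could collapse to.]

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed entries-per-block value for illustration only; the real number
 * depends on the image's cluster size and refcount width. */
#define REFCOUNTS_PER_BLOCK 512

/* Given a sorted array of cluster indices, count the distinct refcount
 * blocks they touch. Each distinct block needs exactly one write if the
 * updates are batched, versus one write per contiguous run with a
 * start + len interface. */
static size_t count_refcount_block_writes(const uint64_t *clusters, size_t n)
{
    size_t writes = 0;
    uint64_t prev_block = UINT64_MAX;

    for (size_t i = 0; i < n; i++) {
        uint64_t block = clusters[i] / REFCOUNTS_PER_BLOCK;
        if (block != prev_block) {
            writes++;            /* crossed into a new block */
            prev_block = block;
        }
    }
    return writes;
}
```

For the sequence (1, 2, 4) from the email, all three clusters fall into refcount block 0, so a batched update would issue one write, whereas updating the two contiguous runs (1, 2) and (4) separately issues two.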