From patchwork Thu Jul 28 14:15:25 2011
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kevin Wolf
X-Patchwork-Id: 107262
Message-ID: <4E316EFD.6080304@redhat.com>
Date: Thu, 28 Jul 2011 16:15:25 +0200
From: Kevin Wolf
To: Frediano Ziglio
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH] [RFC] qcow2: group refcount updates during cow
References: <1311861017-13425-1-git-send-email-freddy77@gmail.com>
In-Reply-To: <1311861017-13425-1-git-send-email-freddy77@gmail.com>

On 28.07.2011 15:50, Frediano Ziglio wrote:
> Well, I think this is the first real improve patch.
> Is more a RFC than a patch. Yes, some lines are terrible!
> It collapses refcount decrement during cow.
> From a first check time executing 015 test passed from about 600 seconds
> to 70.
> This at least prove that refcount updates counts!
> Some doubt:
> 1- place the code in qcow2-refcount.c as it update only refcount and not
> cluster?
> 2- allow some sort of "begin transaction" / "commit" / "rollback" like
> databases instead?
> 3- allow changing tables from different coroutines?
>
> 1) If you have a sequence like (1, 2, 4) probably these clusters are all in
> the same l2 table but with this code you get two write instead of one.
> I'm thinking about a function in qcow2-refcount.c that accept an array of cluster
> instead of a start + len.
>
> Signed-off-by: Frediano Ziglio

I think what you're seeing is actually just one special case of a more
general problem. The problem is that we're interpreting writethrough
more strictly than required.

The semantics that we really need are that on completion of a request,
all of its data and metadata must be flushed to disk.
There is no requirement that we flush all intermediate states.

My recent update to qcow2_update_snapshot_refcount() is just another
case of the same problem. I think the solution should be similar to what
I did there, i.e. switch the cache to writeback mode while we're
operating on it and switch back when we're done.

We should probably have functions that make both of these a one-liner (I
think here we have some similarity to your begin/commit idea).

With the right functions, this could become as easy as this (might need
better function names, but you get the idea):

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 882f50a..45b67b1 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -612,6 +612,8 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
     if (m->nb_clusters == 0)
         return 0;
 
+    qcow2_cache_disable_writethrough(bs);
+
     old_cluster = qemu_malloc(m->nb_clusters * sizeof(uint64_t));
 
     /* copy content of unmodified sectors */
@@ -683,6 +685,7 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
 
     ret = 0;
 err:
+    qcow2_cache_restore_writethrough(bs);
     qemu_free(old_cluster);
     return ret;
 }

Kevin