From patchwork Wed Feb 6 12:31:34 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Benoît Canet
X-Patchwork-Id: 218618
From: Benoît Canet
To: qemu-devel@nongnu.org
Cc: kwolf@redhat.com, Benoît Canet, stefanha@redhat.com
Date: Wed, 6 Feb 2013 13:31:34 +0100
Message-Id: <1360153926-9492-2-git-send-email-benoit@irqsave.net>
In-Reply-To: <1360153926-9492-1-git-send-email-benoit@irqsave.net>
References: <1360153926-9492-1-git-send-email-benoit@irqsave.net>
X-Mailer: git-send-email 1.7.10.4
Subject: [Qemu-devel] [RFC V6 01/33] qcow2: Add deduplication to the qcow2 specification.

Signed-off-by: Benoit Canet
---
 docs/specs/qcow2.txt | 105 +++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 103 insertions(+), 2 deletions(-)

diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
index 36a559d..8e52de1 100644
--- a/docs/specs/qcow2.txt
+++ b/docs/specs/qcow2.txt
@@ -80,7 +80,12 @@ in the description of a field.
                                 tables to repair refcounts before accessing the
                                 image.
 
-                    Bits 1-63:  Reserved (set to 0)
+                    Bit 1:      Deduplication bit.  If this bit is set then
+                                deduplication is used on this image.
+                                The L2 table size (64KB) then differs from
+                                the cluster size (4KB).
+
+                    Bits 2-63:  Reserved (set to 0)
 
          80 -  87:  compatible_features
                     Bitmask of compatible features. An implementation can
@@ -116,6 +121,7 @@ be stored. Each extension has a structure like the following:
                         0x00000000 - End of the header extension area
                         0xE2792ACA - Backing file format name
                         0x6803f857 - Feature name table
+                        0xCD8E819B - Deduplication
                         other      - Unknown header extension, can be safely
                                      ignored
 
@@ -159,6 +165,101 @@ the header extension data. Each entry look like this:
                     terminated if it has full length)
 
 
+== Deduplication ==
+
+The deduplication extension contains information concerning deduplication.
+
+    Byte  0 -  7:   Offset of the RAM deduplication table (RAM lookup)
+
+          8 - 11:   Size of the RAM deduplication table = number of L1 64-bit
+                    pointers
+
+              12:   Hash algorithm enum field
+                        0: SHA-256
+                        1: SHA3
+                        2: SKEIN-256
+
+              13:   Dedup strategies bitmap
+                        Bit 0: RAM based hash lookup (always set to 1 for now)
+                        Bit 1: Disk based hash lookup
+                        Bit 2: Deduplication running if set to 1
+
+         14 - 69:   Set to zero and reserved for future use
+
+The disk based lookup structure will be described in a future QCOW2 specification.
+
+== Deduplication table (RAM method) ==
+
+The deduplication table maps a physical offset to a data hash and
+logical offset. It persistently stores the information needed to
+perform deduplication. It is loaded at startup into a RAM based representation
+used to do the lookups.
+
+The deduplication table contains 64-bit offsets to the level 2 deduplication
+table blocks.
+Each entry of these blocks contains a 32-byte SHA256 hash followed by the
+64-bit logical offset of the first encountered cluster having this hash.
+
+== Deduplication table schematic (RAM method) ==
+
+0     l1_dedup_index                                              Size
+            |
+|--------------------------------------------------------------------|
+|           |                                                        |
+|           |        L1 Deduplication table                          |
+|           |                                                        |
+|--------------------------------------------------------------------|
+            |
+            |
+            |
+0           |        l2_dedup_block_entries
+            |
+|---------------------------------|
+|                                 |
+|     L2 deduplication block      |
+|                                 |
+|         l2_dedup_index          |
+|---------------------------------|
+            |
+         0  |                           40
+            |
+         |-------------------------------|
+         |                               |
+         |   Deduplication table entry   |
+         |                               |
+         |-------------------------------|
+
+
+== Deduplication table entry description (RAM method) ==
+
+Each L2 deduplication table entry has the following structure:
+
+    Byte  0 - 31:   Hash of data cluster
+
+         32 - 39:   Logical offset of first encountered block having
+                    this hash
+
+== Deduplication table arithmetic (RAM method) ==
+
+cluster_size = 4096
+dedup_block_size = 65536 * 5
+l2_size = 65536 * 16 (the factor of 16 comes from the smaller cluster_size)
+refcount_order must be >= 4
+
+Entries in the deduplication table are ordered by physical cluster index.
+
+The number of entries in an L2 deduplication table block is:
+l2_dedup_block_entries = FLOOR(dedup_block_size / (32 + 8))
+
+The index in the level 1 deduplication table is:
+l1_dedup_index = physical_cluster_index / l2_dedup_block_entries
+
+The index in the level 2 deduplication table is:
+l2_dedup_index = physical_cluster_index % l2_dedup_block_entries
+
+The 16 remaining bytes in each L2 deduplication block are set to zero and
+reserved for future use.
+
 == Host cluster management ==
 
 qcow2 manages the allocation of host clusters by maintaining a reference count
@@ -211,7 +312,7 @@ guest clusters to host clusters. They are called L1 and L2 table.
 
 The L1 table has a variable size (stored in the header) and may use multiple
 clusters, however it must be contiguous in the image file. L2 tables are
-exactly one cluster in size.
+exactly one cluster in size, except in the deduplication case.
 
 Given a offset into the virtual disk, the offset into the image file can be
 obtained as follows:
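
The index arithmetic and the 40-byte entry layout that this patch adds under "Deduplication table arithmetics (RAM method)" can be sketched in Python as below. This is a minimal sketch and not code from the patch or from QEMU: the function names are hypothetical, the big-endian packing is an assumption carried over from the rest of the qcow2 format, and deriving physical_cluster_index as a host offset divided by cluster_size is also an assumption.

```python
import struct

# Constants taken from the patch text.
CLUSTER_SIZE = 4096
DEDUP_BLOCK_SIZE = 65536 * 5
HASH_SIZE = 32      # bytes 0 - 31 of an entry
OFFSET_SIZE = 8     # bytes 32 - 39 of an entry
ENTRY_SIZE = HASH_SIZE + OFFSET_SIZE

# l2_dedup_block_entries = FLOOR(dedup_block_size / (32 + 8))
L2_DEDUP_BLOCK_ENTRIES = DEDUP_BLOCK_SIZE // ENTRY_SIZE

# Assumption: big-endian, like the other on-disk qcow2 fields.
ENTRY_FORMAT = ">32sQ"


def dedup_indexes(physical_offset):
    """Return (l1_dedup_index, l2_dedup_index) for a host byte offset.

    Hypothetical helper; assumes the physical cluster index is the
    host offset divided by the cluster size.
    """
    physical_cluster_index = physical_offset // CLUSTER_SIZE
    return divmod(physical_cluster_index, L2_DEDUP_BLOCK_ENTRIES)


def pack_entry(digest, logical_offset):
    """Serialize one entry: 32-byte hash, then 64-bit logical offset."""
    if len(digest) != HASH_SIZE:
        raise ValueError("hash must be exactly 32 bytes")
    return struct.pack(ENTRY_FORMAT, digest, logical_offset)


def unpack_entry(raw):
    """Parse one 40-byte entry into (hash, logical_offset)."""
    return struct.unpack(ENTRY_FORMAT, raw)
```

With the constants as given, dedup_block_size divides evenly by the entry size, so this sketch yields 8192 entries per block with no bytes left over. A 65536-byte block would instead hold 1638 entries and leave 16 bytes, which may be the case the "16 remaining bytes" sentence describes.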