From patchwork Thu Jun 20 14:26:18 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: =?utf-8?q?Beno=C3=AEt_Canet?= X-Patchwork-Id: 252956 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (Client did not present a certificate) by ozlabs.org (Postfix) with ESMTPS id C82232C00A0 for ; Fri, 21 Jun 2013 00:35:35 +1000 (EST) Received: from localhost ([::1]:40270 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UpfxV-0004lC-U6 for incoming@patchwork.ozlabs.org; Thu, 20 Jun 2013 10:35:33 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:60234) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Upfqb-0002wb-KN for qemu-devel@nongnu.org; Thu, 20 Jun 2013 10:28:35 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1UpfqS-0001FK-C5 for qemu-devel@nongnu.org; Thu, 20 Jun 2013 10:28:25 -0400 Received: from nodalink.pck.nerim.net ([62.212.105.220]:37596 helo=paradis.irqsave.net) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UpfqR-0001Eu-Kw for qemu-devel@nongnu.org; Thu, 20 Jun 2013 10:28:16 -0400 Received: by paradis.irqsave.net (Postfix, from userid 1002) id 09B50874673; Thu, 20 Jun 2013 16:28:14 +0200 (CEST) Received: from localhost.localdomain (unknown [192.168.77.1]) by paradis.irqsave.net (Postfix) with ESMTP id 00781874675; Thu, 20 Jun 2013 16:25:01 +0200 (CEST) From: =?UTF-8?q?Beno=C3=AEt=20Canet?= To: qemu-devel@nongnu.org Date: Thu, 20 Jun 2013 16:26:18 +0200 Message-Id: <1371738392-9594-11-git-send-email-benoit@irqsave.net> X-Mailer: git-send-email 1.7.10.4 In-Reply-To: <1371738392-9594-1-git-send-email-benoit@irqsave.net> References: <1371738392-9594-1-git-send-email-benoit@irqsave.net> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 62.212.105.220 Cc: kwolf@redhat.com, =?UTF-8?q?Beno=C3=AEt=20Canet?= , stefanha@redhat.com Subject: [Qemu-devel] [RFC V8 10/24] qcow2: Add qcow2_dedup and related functions X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Signed-off-by: Benoit Canet --- block/qcow2-dedup.c | 415 +++++++++++++++++++++++++++++++++++++++++++++++++++ block/qcow2.h | 5 + 2 files changed, 420 insertions(+) diff --git a/block/qcow2-dedup.c b/block/qcow2-dedup.c index bc6e2c2..0daf77e 100644 --- a/block/qcow2-dedup.c +++ b/block/qcow2-dedup.c @@ -119,3 +119,418 @@ fail: *data = NULL; return ret; } + +/* + * Set a QCowHashInfo into the new state + * + * @hash_info: the hash_info to set to new + */ +static void qcow2_set_hash_info_new(QCowHashInfo *hash_info) +{ + hash_info->physical_sect = QCOW_DEDUP_FLAG_EMPTY; + hash_info->first_logical_sect = QCOW_DEDUP_FLAG_EMPTY; +} + +/* + * Compute the hash of a given cluster + * + * @data: a buffer containing the cluster data + * @hash: a QCowHash where to store the computed hash + * @ret: 0 on success, negative on error + */ +static int qcow2_compute_cluster_hash(BlockDriverState *bs, + QCowHash *hash, + uint8_t *data) +{ + return 0; +} + +/* + * Get a QCowHashInfo corresponding to a cluster data + * + * @phash: if phash can be used no hash is computed + * @data: a buffer containing the cluster + * @ret: 1 if if hash found, 0 if not found, -errno on error + */ +static int qcow2_get_persistent_hash_for_cluster(BlockDriverState *bs, + QcowPersistentHash *phash, + uint8_t *data) +{ + BDRVQcowState *s = bs->opaque; + int ret = 0; + + /* no hash has been provided compute it and store it for later usage */ + if (!phash->reuse) { + ret = qcow2_compute_cluster_hash(bs, + &phash->hash_info.hash, + data); + } + + /* do not reuse the hash anymore if it was precomputed */ + phash->reuse = false; + + /* error */ + if (ret < 0) { + return ret; + } + + /* ask the store if it knows this QCowHashInfo */ + return qcow2_store_get(bs, &s->key_value_store, &phash->hash_info); +} + +/* + * Insert a QCowHashInfo into the store as new + * + * @hash_info: the QCowHashInfo to set to new and insert + * @ret: 0 on succes, -errno on error + */ +static int qcow2_add_new_hash_info_to_store(BlockDriverState *bs, + QCowHashInfo *hash_info) +{ + BDRVQcowState *s = bs->opaque; + + /* set the QCowHashInfo to the new state so we will remember to fill these + * field later with real values + */ + qcow2_set_hash_info_new(hash_info); + return qcow2_store_insert(bs, &s->key_value_store, hash_info); +} + +/* + * Helper used to build a QCowHashElement + * + * @hash: the QCowHash to use + * @ret: a newly allocated QCowHashElement containing the given hash + */ +static QCowHashElement *qcow2_dedup_hash_new(QCowHash *hash) +{ + QCowHashElement *dedup_hash; + dedup_hash = g_new0(QCowHashElement, 1); + memcpy(dedup_hash->hash_info.hash.data, hash->data, HASH_LENGTH); + return dedup_hash; +} + +/* + * Helper used to link a deduplicated cluster in the l2 + * + * @logical_sect: the cluster sector seen by the guest + * @physical_sect: the cluster sector in the QCOW2 file + * @overwrite: true if we must overwrite the L2 table entry + * @ret: + */ +static int qcow2_dedup_link_l2(BlockDriverState *bs, + uint64_t logical_sect, + uint64_t physical_sect, + bool overwrite) +{ + QCowL2Meta m = { + .alloc_offset = physical_sect << 9, + .offset = logical_sect << 9, + .nb_clusters = 1, + .nb_available = 0, + .cow_start = { + .offset = 0, + .nb_sectors = 0, + }, + .cow_end = { + .offset = 0, + .nb_sectors = 0, + }, + .l2_entry_flags = 0, + .overwrite = overwrite, + }; + return qcow2_alloc_cluster_link_l2(bs, &m); +} + +/* Clear the QCOW_OFLAG_COPIED from the first L2 entry written for a physical + * cluster. + * + * @hash_info: the duplicated hash info + * @ret: 0 on success, negative on error + */ +static int qcow2_clear_l2_copied_flag_if_needed(BlockDriverState *bs, + QCowHashInfo *hash_info) +{ + int ret = 0; + uint64_t first_logical_sect = hash_info->first_logical_sect; + + /* QCOW_OFLAG_COPIED already cleared -> do nothing */ + if (!(first_logical_sect & QCOW_OFLAG_COPIED)) { + return 0; + } + + first_logical_sect &= ~QCOW_OFLAG_COPIED; + + /* overwrite first L2 entry to clear QCOW_FLAG_COPIED */ + ret = qcow2_dedup_link_l2(bs, first_logical_sect, + hash_info->physical_sect, + true); + + if (ret < 0) { + return ret; + } + + /* remember that we don't need to clear QCOW_OFLAG_COPIED again */ + hash_info->first_logical_sect = first_logical_sect; + + return 0; +} + +/* This function deduplicate a cluster + * + * @logical_sect: The logical sector of the write + * @hash_info: The duplicated cluster hash info + * @ret: 0 on success, negative on error + */ +static int qcow2_deduplicate_cluster(BlockDriverState *bs, + uint64_t logical_sect, + QCowHashInfo *hash_info) +{ + BDRVQcowState *s = bs->opaque; + uint64_t cluster_index = hash_info->physical_sect / s->cluster_sectors; + int ret = 0; + + /* Increment the refcount of the cluster */ + ret = qcow2_update_cluster_refcount(bs, + cluster_index, + 1); + + if (ret < 0) { + return ret; + } + + /* create new L2 entry */ + return qcow2_dedup_link_l2(bs, logical_sect, + hash_info->physical_sect, + false); +} + +/* This function tries to deduplicate a given cluster. + * + * @sector_num: the logical sector number we are trying to deduplicate + * @phash: Used instead of computing the hash if provided + * @data: the buffer in which to look for a duplicated cluster + * @ret: ret < 0 on error, 1 on deduplication else 0 + */ +static int qcow2_try_dedup_cluster(BlockDriverState *bs, + QcowPersistentHash *phash, + uint64_t sector_num, + uint8_t *data) +{ + BDRVQcowState *s = bs->opaque; + int ret = 0; + uint64_t logical_sect; + uint64_t existing_physical_offset; + int pnum = s->cluster_sectors; + + /* search the tree for duplicated cluster */ + ret = qcow2_get_persistent_hash_for_cluster(bs, + phash, + data); + + /* we won't reuse the hash on error */ + if (ret < 0) { + return ret; + } + + /* if cluster is not duplicated store hash for later usage */ + if (!ret) { + return qcow2_add_new_hash_info_to_store(bs, &phash->hash_info); + } + + logical_sect = sector_num & ~(s->cluster_sectors - 1); + ret = qcow2_get_cluster_offset(bs, logical_sect << 9, + &pnum, &existing_physical_offset); + + if (ret < 0) { + return ret; + } + + /* if we are rewriting the same cluster at the same place do nothing */ + if (existing_physical_offset == phash->hash_info.physical_sect << 9) { + return 1; + } + + /* take care of not having refcount > 1 and QCOW_OFLAG_COPIED at once */ + ret = qcow2_clear_l2_copied_flag_if_needed(bs, &phash->hash_info); + + if (ret < 0) { + return ret; + } + + /* do the deduplication */ + ret = qcow2_deduplicate_cluster(bs, logical_sect, &phash->hash_info); + + if (ret < 0) { + return ret; + } + + return 1; +} + + +static void add_hash_to_undedupable_list(BlockDriverState *bs, + QCowDedupState *ds) +{ + /* memorize hash for later storage in key value store and and disk */ + QCowHashElement *dedup_hash = + qcow2_dedup_hash_new(&ds->phash.hash_info.hash); + QTAILQ_INSERT_TAIL(&ds->undedupables, dedup_hash, next); +} + +static int qcow2_dedup_starting_from_begining(BlockDriverState *bs, + QCowDedupState *ds, + uint64_t sector_num, + uint8_t *data, + int left_to_process) +{ + BDRVQcowState *s = bs->opaque; + int i; + int ret = 0; + + for (i = 0; i < left_to_process; i++) { + ret = qcow2_try_dedup_cluster(bs, + &ds->phash, + sector_num + i * s->cluster_sectors, + data + i * s->cluster_size); + + if (ret < 0) { + return ret; + } + + /* stop if a cluster has not been deduplicated */ + if (ret != 1) { + break; + } + } + + return i; +} + +static int qcow2_count_next_non_dedupable_clusters(BlockDriverState *bs, + QCowDedupState *ds, + uint8_t *data, + int left_to_process) +{ + BDRVQcowState *s = bs->opaque; + int i; + int ret = 0; + + for (i = 0; i < left_to_process; i++) { + ret = qcow2_get_persistent_hash_for_cluster(bs, + &ds->phash, + data + i * s->cluster_size); + + if (ret < 0) { + return ret; + } + + /* found a duplicated cluster : stop here */ + if (ret) { + break; + } + + ret = qcow2_add_new_hash_info_to_store(bs, &ds->phash.hash_info); + + if (ret < 0) { + return ret; + } + add_hash_to_undedupable_list(bs, ds); + } + + return i; +} + + +/* Deduplicate all the cluster that can be deduplicated. + * + * Next it computes the number of non deduplicable sectors to come while storing + * the hashes of these sectors in a linked list for later usage. + * Then it computes the first duplicated cluster hash that comes after non + * deduplicable cluster, this hash will be used at next call of the function + * + * @ds: a structure containing the state of the deduplication + * for this write request + * @sector_num: The logical sector + * @data: the buffer containing the data to deduplicate + * @data_nr: the size of the buffer in sectors + * + */ +int qcow2_dedup(BlockDriverState *bs, + QCowDedupState *ds, + uint64_t sector_num, + uint8_t *data, + int data_nr) +{ + BDRVQcowState *s = bs->opaque; + int ret = 0; + int deduped_clusters_nr = 0; + int left_to_process; + int start_index; + + start_index = sector_num & (s->cluster_sectors - 1); + + left_to_process = (data_nr / s->cluster_sectors) - + ds->nb_clusters_processed; + + data += ds->nb_clusters_processed * s->cluster_size; + + /* start deduplicating all that can be cluster after cluster */ + ret = qcow2_dedup_starting_from_begining(bs, + ds, + sector_num, + data, + left_to_process); + + if (ret < 0) { + return ret; + } + + deduped_clusters_nr = ret; + + left_to_process -= ret; + ds->nb_clusters_processed += ret; + data += ret * s->cluster_size; + + /* We deduped everything till the end */ + if (!left_to_process) { + ds->nb_undedupable_sectors = 0; + goto exit; + } + + /* skip and account the first undedupable cluster found */ + left_to_process--; + ds->nb_clusters_processed++; + data += s->cluster_size; + ds->nb_undedupable_sectors += s->cluster_sectors; + + add_hash_to_undedupable_list(bs, ds); + + /* Count how many non duplicated sector can be written and memorize hashes + * to write them after data has reached disk. + */ + ret = qcow2_count_next_non_dedupable_clusters(bs, + ds, + data, + left_to_process); + + if (ret < 0) { + return ret; + } + + left_to_process -= ret; + ds->nb_clusters_processed += ret; + ds->nb_undedupable_sectors += ret * s->cluster_sectors; + + /* remember to reuse the last hash computed at new qcow2_dedup call */ + if (left_to_process) { + ds->phash.reuse = true; + } + +exit: + if (!deduped_clusters_nr) { + return 0; + } + + return deduped_clusters_nr * s->cluster_sectors - start_index; +} diff --git a/block/qcow2.h b/block/qcow2.h index 3238083..3346842 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -737,6 +737,11 @@ int qcow2_dedup_read_missing_and_concatenate(BlockDriverState *bs, int sectors_nr, uint8_t **dedup_cluster_data, int *dedup_cluster_data_nr); +int qcow2_dedup(BlockDriverState *bs, + QCowDedupState *ds, + uint64_t sector_num, + uint8_t *data, + int data_nr); #endif