From patchwork Sun Oct 26 15:20:47 2014
X-Patchwork-Submitter: lijun
X-Patchwork-Id: 403221
From: Jun Li
To: qemu-devel@nongnu.org
Cc: kwolf@redhat.com, juli@redhat.com, famz@redhat.com, Jun Li, stefanha@redhat.com
Date: Sun, 26 Oct 2014 23:20:47 +0800
Message-Id: <1414336849-21179-2-git-send-email-junmuzi@gmail.com>
In-Reply-To: <1414336849-21179-1-git-send-email-junmuzi@gmail.com>
References: <1414336849-21179-1-git-send-email-junmuzi@gmail.com>
Subject: [Qemu-devel] [PATCH v5 1/3] qcow2: Add qcow2_shrink_l1_and_l2_table for qcow2 shrinking

This patch implements the new function qcow2_shrink_l1_and_l2_table, which
shrinks/discards the l1 and l2 tables when a qcow2 image is shrunk.

Signed-off-by: Jun Li
---
v5:
  Made some modifications based on Max's suggestions. Thanks to Max.
  qcow2_shrink_l1_and_l2_table now shrinks the l2 tables first and the l1
  table second. Shrinking the l1 table requires allocating clusters for the
  new l1 table, so it can now reuse the clusters freed by l2 shrinking.

v4:
  Handle COW clusters in the l2 table. With COW, some values of
  (l2_entry >> s->cluster_bits) are larger than s->refcount_table_size, so
  those l2 entries need to be discarded.

v3:
  Fixed host cluster leak.
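Note for readers: the new function takes new_l1_size and new_l2_index, which the caller derives from the target size. As a rough illustration only, the sketch below shows how a guest offset maps to those two values, assuming the usual qcow2 layout in which each L2 entry is 8 bytes (so l2_bits = cluster_bits - 3). The sketch_* names and the SketchGeometry type are hypothetical stand-ins and are not part of this series; the real code uses size_to_l1() and offset_to_l2_index() on BDRVQcowState.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the few BDRVQcowState fields used here. */
typedef struct {
    int cluster_bits;   /* log2 of the cluster size, e.g. 16 for 64 KiB */
    int l2_bits;        /* log2 of entries per L2 table: cluster_bits - 3 */
} SketchGeometry;

static SketchGeometry sketch_geometry(int cluster_bits)
{
    /* Assumed valid range for qcow2 cluster sizes (512 B to 2 MiB). */
    assert(cluster_bits >= 9 && cluster_bits <= 21);
    SketchGeometry g = { cluster_bits, cluster_bits - 3 };
    return g;
}

/* Number of L1 entries needed to map `size` bytes of guest data. */
static uint64_t sketch_size_to_l1(const SketchGeometry *g, uint64_t size)
{
    int shift = g->cluster_bits + g->l2_bits;
    return (size + (1ULL << shift) - 1) >> shift;
}

/* Index, within its L2 table, of the entry mapping guest offset `offset`. */
static uint64_t sketch_offset_to_l2_index(const SketchGeometry *g,
                                          uint64_t offset)
{
    uint64_t l2_size = 1ULL << g->l2_bits;
    return (offset >> g->cluster_bits) & (l2_size - 1);
}
```

With 64 KiB clusters each L1 entry covers 512 MiB, so a target of 768 MiB gives an L1 size of 2 and an L2 index of 4096. Under the scheme in this patch that would mean the last remaining L2 table is only partially emptied, from that index upward, while L2 tables beyond the new L1 size are freed entirely.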
---
 block/qcow2-cluster.c | 182 ++++++++++++++++++++++++++++++++++++++++++++++++++
 block/qcow2.c         |  37 +++++++++-
 block/qcow2.h         |   2 +
 3 files changed, 218 insertions(+), 3 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 4d888c7..28d2d62 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -29,6 +29,9 @@
 #include "block/qcow2.h"
 #include "trace.h"
 
+static int l2_load(BlockDriverState *bs, uint64_t l2_offset,
+                   uint64_t **l2_table);
+
 int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
                         bool exact_size)
 {
@@ -135,6 +138,185 @@ int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
     return ret;
 }
 
+int qcow2_shrink_l1_and_l2_table(BlockDriverState *bs, uint64_t new_l1_size,
+                                 int new_l2_index, int64_t boundary_size)
+{
+    BDRVQcowState *s = bs->opaque;
+    int new_l1_size2, ret, i;
+    uint64_t *new_l1_table;
+    int64_t new_l1_table_offset;
+    int64_t old_l1_table_offset, old_l1_size;
+    uint8_t data[12];
+    uint64_t l2_offset;
+    uint64_t *l2_table, l2_entry;
+    int64_t l2_free_entry; /* Index of the first l2 entry to free */
+    uint64_t *old_l1_table = s->l1_table;
+    int num = s->l1_size - new_l1_size;
+
+    assert(new_l1_size <= s->l1_size);
+    while ((num >= -1) && (s->l1_size + num - 1 >= 0)) {
+        l2_free_entry = 0;
+        l2_offset = old_l1_table[s->l1_size + num - 1] & L1E_OFFSET_MASK;
+
+        if (l2_offset == 0) {
+            goto retry;
+        }
+
+        if (num == 0) {
+            if (new_l2_index == 0) {
+                goto retry;
+            }
+            l2_free_entry = new_l2_index;
+        }
+
+        /* load l2_table into cache */
+        ret = l2_load(bs, l2_offset, &l2_table);
+
+        if (ret < 0) {
+            return ret;
+        }
+
+        for (i = s->l2_size - 1; i >= 0; i--) {
+            l2_entry = be64_to_cpu(l2_table[i]);
+
+            /* Due to COW, the clusters in the l2 table may
+             * not be in sequential order, so some l2_entry
+             * may be >= boundary_size when shrinking.
+             */
+            if (num == -1) {
+                if (l2_entry >= boundary_size) {
+                    goto free_cluster;
+                } else {
+                    continue;
+                }
+            }
+
+            /* Deal with COW clusters in l2 table when num == 0 */
+            if (i <= l2_free_entry - 1) {
+                if (l2_entry >= boundary_size) {
+                    goto free_cluster;
+                }
+                continue;
+            }
+
+            switch (qcow2_get_cluster_type(l2_entry)) {
+            case QCOW2_CLUSTER_UNALLOCATED:
+                if (!bs->backing_hd) {
+                    continue;
+                }
+                break;
+
+            case QCOW2_CLUSTER_ZERO:
+                continue;
+
+            case QCOW2_CLUSTER_NORMAL:
+            case QCOW2_CLUSTER_COMPRESSED:
+                break;
+
+            default:
+                abort();
+            }
+
+        free_cluster:
+            qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
+
+            if (s->qcow_version >= 3) {
+                l2_table[i] = cpu_to_be64(QCOW_OFLAG_ZERO);
+            } else {
+                l2_table[i] = cpu_to_be64(0);
+            }
+
+            /* Then decrease the refcount */
+            qcow2_free_any_clusters(bs, l2_entry, 1, QCOW2_DISCARD_MAX);
+        }
+
+        ret = qcow2_cache_put(bs, s->l2_table_cache, (void **) &l2_table);
+        if (ret < 0) {
+            return ret;
+        }
+        if (l2_free_entry == 0 && num != -1) {
+            qemu_vfree(l2_table);
+            qcow2_free_clusters(bs, l2_offset, s->cluster_size - 1,
+                                QCOW2_DISCARD_OTHER);
+        }
+    retry:
+        num--;
+    }
+
+    new_l1_size2 = sizeof(uint64_t) * new_l1_size;
+    new_l1_table = qemu_try_blockalign(bs->file,
+                                       align_offset(new_l1_size2, 512));
+    if (new_l1_table == NULL) {
+        return -ENOMEM;
+    }
+    memset(new_l1_table, 0, align_offset(new_l1_size2, 512));
+
+    /* shrinking l1 table */
+    memcpy(new_l1_table, s->l1_table, new_l1_size2);
+
+    /* write new table (align to cluster) */
+    new_l1_table_offset = qcow2_alloc_clusters(bs, new_l1_size2);
+
+    if ((new_l1_table_offset) >= boundary_size) {
+        goto fail;
+    }
+
+    ret = qcow2_cache_flush(bs, s->refcount_block_cache);
+    if (ret < 0) {
+        goto fail;
+    }
+
+    /* the L1 position has not yet been updated, so these clusters must
+     * indeed be completely free */
+    ret = qcow2_pre_write_overlap_check(bs, 0, new_l1_table_offset,
+                                        new_l1_size2);
+
+    if (ret < 0) {
+        goto fail;
+    }
+
+    BLKDBG_EVENT(bs->file, BLKDBG_L1_GROW_WRITE_TABLE);
+
+    for (i = 0; i < new_l1_size; i++) {
+        new_l1_table[i] = cpu_to_be64(new_l1_table[i]);
+    }
+
+    ret = bdrv_pwrite_sync(bs->file, new_l1_table_offset,
+                           new_l1_table, new_l1_size2);
+    if (ret < 0) {
+        goto fail;
+    }
+
+    for (i = 0; i < new_l1_size; i++) {
+        new_l1_table[i] = be64_to_cpu(new_l1_table[i]);
+    }
+
+    /* set new table */
+    BLKDBG_EVENT(bs->file, BLKDBG_L1_GROW_ACTIVATE_TABLE);
+    cpu_to_be32w((uint32_t *)data, new_l1_size);
+    stq_be_p(data + 4, new_l1_table_offset);
+    ret = bdrv_pwrite_sync(bs->file, offsetof(QCowHeader, l1_size),
+                           data, sizeof(data));
+    if (ret < 0) {
+        goto fail;
+    }
+
+    qemu_vfree(s->l1_table);
+    old_l1_table_offset = s->l1_table_offset;
+    s->l1_table_offset = new_l1_table_offset;
+    s->l1_table = new_l1_table;
+    old_l1_size = s->l1_size;
+    s->l1_size = new_l1_size;
+    qcow2_free_clusters(bs, old_l1_table_offset, old_l1_size * sizeof(uint64_t),
+                        QCOW2_DISCARD_OTHER);
+    return 0;
+ fail:
+    qemu_vfree(new_l1_table);
+    qcow2_free_clusters(bs, new_l1_table_offset, new_l1_size2,
+                        QCOW2_DISCARD_OTHER);
+    return ret;
+}
+
 /*
  * l2_load
  *
diff --git a/block/qcow2.c b/block/qcow2.c
index d031515..d2b0dfe 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2111,10 +2111,41 @@ static int qcow2_truncate(BlockDriverState *bs, int64_t offset)
         return -ENOTSUP;
     }
 
-    /* shrinking is currently not supported */
+    /* shrinking image */
     if (offset < bs->total_sectors * 512) {
-        error_report("qcow2 doesn't support shrinking images yet");
-        return -ENOTSUP;
+        /* The l1 table, l2 tables, refcount table, refcount blocks
+         * and file header of the qcow2 image all occupy clusters,
+         * so subtract this metadata from offset.
+         */
+        int64_t nb_l1 = DIV_ROUND_UP((uint64_t)s->l1_size * sizeof(uint64_t),
+                                     s->cluster_size);
+        int64_t nb_l2 = DIV_ROUND_UP(offset, (uint64_t)s->l2_size <<
+                                     s->cluster_bits);
+        int64_t nb_refcount_block_table = DIV_ROUND_UP(offset, (uint64_t)
+                                                       s->cluster_size <<
+                                                       s->refcount_block_bits);
+        int64_t nb_refcount_table = DIV_ROUND_UP(nb_refcount_block_table << 3,
+                                                 s->cluster_size);
+        int64_t total_nb = 2 * nb_l2 + nb_l1 + nb_refcount_block_table +
+                           nb_refcount_table + 1;
+        int64_t offset_for_shrink = offset - (total_nb << s->cluster_bits);
+        int new_l2_index = offset_to_l2_index(s, offset_for_shrink);
+
+        new_l1_size = size_to_l1(s, offset_for_shrink);
+        ret = qcow2_shrink_l1_and_l2_table(bs, new_l1_size, new_l2_index,
+                                           offset);
+        if (ret < 0) {
+            return ret;
+        }
+
+        int64_t actual_size = bdrv_get_allocated_file_size(bs);
+
+        if (offset < actual_size) {
+            ret = bdrv_truncate(bs->file, offset);
+            if (ret < 0) {
+                return ret;
+            }
+        }
     }
 
     new_l1_size = size_to_l1(s, offset);
diff --git a/block/qcow2.h b/block/qcow2.h
index 577ccd1..be1237d 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -516,6 +516,8 @@ int qcow2_pre_write_overlap_check(BlockDriverState *bs, int ign, int64_t offset,
 /* qcow2-cluster.c functions */
 int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
                         bool exact_size);
+int qcow2_shrink_l1_and_l2_table(BlockDriverState *bs, uint64_t new_l1_size,
+                                 int new_l2_index, int64_t boundary_size);
 int qcow2_write_l1_entry(BlockDriverState *bs, int l1_index);
 void qcow2_l2_cache_reset(BlockDriverState *bs);
 int qcow2_decompress_cluster(BlockDriverState *bs, uint64_t cluster_offset);
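To make the metadata estimate in the qcow2_truncate hunk easier to check by hand, here is a small standalone sketch of the same arithmetic. It is only an illustration: SketchState, SKETCH_DIV_ROUND_UP and the sketch_* functions are hypothetical names, the struct is a stand-in for the handful of BDRVQcowState fields the hunk reads, and the refcount_block_bits value in the example below assumes 16-bit refcounts (cluster_bits - 1).

```c
#include <assert.h>
#include <stdint.h>

/* Round-up integer division, same idea as the DIV_ROUND_UP used above. */
#define SKETCH_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical stand-in for the BDRVQcowState fields the estimate reads. */
typedef struct {
    int cluster_bits;        /* log2 of the cluster size */
    int refcount_block_bits; /* log2 of refcount entries per refcount block */
    uint64_t l1_size;        /* current number of L1 entries */
} SketchState;

/* Estimated number of metadata clusters for an image of `offset` bytes. */
static int64_t sketch_metadata_clusters(const SketchState *s, uint64_t offset)
{
    uint64_t cluster_size = 1ULL << s->cluster_bits;
    uint64_t l2_size = cluster_size / sizeof(uint64_t);

    assert(s->refcount_block_bits >= 0);

    uint64_t nb_l1 = SKETCH_DIV_ROUND_UP(s->l1_size * sizeof(uint64_t),
                                         cluster_size);
    uint64_t nb_l2 = SKETCH_DIV_ROUND_UP(offset, l2_size << s->cluster_bits);
    uint64_t nb_refcount_blocks =
        SKETCH_DIV_ROUND_UP(offset, cluster_size << s->refcount_block_bits);
    uint64_t nb_refcount_table =
        SKETCH_DIV_ROUND_UP(nb_refcount_blocks << 3, cluster_size);

    /* Same weighting as the hunk: L2 clusters are counted twice, and one
     * extra cluster is reserved for the file header. */
    return (int64_t)(2 * nb_l2 + nb_l1 + nb_refcount_blocks +
                     nb_refcount_table + 1);
}

/* Guest-visible boundary after subtracting the estimated metadata. */
static int64_t sketch_offset_for_shrink(const SketchState *s, uint64_t offset)
{
    int64_t cluster_size = (int64_t)1 << s->cluster_bits;
    return (int64_t)offset - sketch_metadata_clusters(s, offset) * cluster_size;
}
```

For example, with 64 KiB clusters, refcount_block_bits = 15, a current L1 size of 4 and a target of 1 GiB, the terms are nb_l1 = 1, nb_l2 = 2, one refcount block and one refcount table cluster, so the estimate is 2 * 2 + 1 + 1 + 1 + 1 = 8 clusters and offset_for_shrink is 1 GiB minus 512 KiB, i.e. 1073217536.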