From: "Denis V. Lunev"
Date: Tue, 30 Dec 2014 12:20:43 +0300
Message-Id: <1419931250-19259-2-git-send-email-den@openvz.org>
In-Reply-To: <1419931250-19259-1-git-send-email-den@openvz.org>
References: <1419931250-19259-1-git-send-email-den@openvz.org>
Cc: Kevin Wolf, "Denis V. Lunev", Peter Lieven, qemu-devel@nongnu.org, Stefan Hajnoczi
Subject: [Qemu-devel] [PATCH 1/8] block: prepare bdrv_co_do_write_zeroes to deal with large bl.max_write_zeroes

bdrv_co_do_write_zeroes splits writes using bl.max_write_zeroes or 16 MiB
as the chunk size. It is implemented this way to tolerate buggy block
backends which do not accept requests that are too big.

However, if the bdrv_co_write_zeroes callback is not supported, we fall
back to writing the data explicitly using bdrv_co_writev, and we create a
buffer to accommodate the zeroes. The size of this buffer is the size of
the chunk. Thus, if the underlying layer has a bl.max_write_zeroes that is
high enough, e.g. 4 GiB, the allocation can fail.

Actually, there is no need to allocate such a large amount of memory. We
can simply allocate a 1 MiB buffer and create an iovec whose entries all
point to the same memory.

Signed-off-by: Denis V. Lunev
CC: Kevin Wolf
CC: Stefan Hajnoczi
CC: Peter Lieven
---
 block.c | 35 ++++++++++++++++++++++++-----------
 1 file changed, 24 insertions(+), 11 deletions(-)

diff --git a/block.c b/block.c
index 4165d42..d69c121 100644
--- a/block.c
+++ b/block.c
@@ -3173,14 +3173,18 @@ int coroutine_fn bdrv_co_copy_on_readv(BlockDriverState *bs,
  * of 32768 512-byte sectors (16 MiB) per request.
  */
 #define MAX_WRITE_ZEROES_DEFAULT 32768
+/* allocate iovec with zeroes using 1 MiB chunks to avoid too big allocations */
+#define MAX_ZEROES_CHUNK (1024 * 1024)
 
 static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
     int64_t sector_num, int nb_sectors, BdrvRequestFlags flags)
 {
     BlockDriver *drv = bs->drv;
     QEMUIOVector qiov;
-    struct iovec iov = {0};
     int ret = 0;
+    void *chunk = NULL;
+
+    qemu_iovec_init(&qiov, 0);
 
     int max_write_zeroes = bs->bl.max_write_zeroes ?
                            bs->bl.max_write_zeroes : MAX_WRITE_ZEROES_DEFAULT;
@@ -3217,27 +3221,35 @@ static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
         }
 
         if (ret == -ENOTSUP) {
+            int64_t num_bytes = (int64_t)num << BDRV_SECTOR_BITS;
+            int chunk_size = MIN(MAX_ZEROES_CHUNK, num_bytes);
+
             /* Fall back to bounce buffer if write zeroes is unsupported */
-            iov.iov_len = num * BDRV_SECTOR_SIZE;
-            if (iov.iov_base == NULL) {
-                iov.iov_base = qemu_try_blockalign(bs, num * BDRV_SECTOR_SIZE);
-                if (iov.iov_base == NULL) {
+            if (chunk == NULL) {
+                chunk = qemu_try_blockalign(bs, chunk_size);
+                if (chunk == NULL) {
                     ret = -ENOMEM;
                     goto fail;
                 }
-                memset(iov.iov_base, 0, num * BDRV_SECTOR_SIZE);
+                memset(chunk, 0, chunk_size);
+            }
+
+            while (num_bytes > 0) {
+                int to_add = MIN(chunk_size, num_bytes);
+                qemu_iovec_add(&qiov, chunk, to_add);
+                num_bytes -= to_add;
             }
-            qemu_iovec_init_external(&qiov, &iov, 1);
 
             ret = drv->bdrv_co_writev(bs, sector_num, num, &qiov);
 
             /* Keep bounce buffer around if it is big enough for all
              * all future requests.
              */
-            if (num < max_write_zeroes) {
-                qemu_vfree(iov.iov_base);
-                iov.iov_base = NULL;
+            if (chunk_size != MAX_ZEROES_CHUNK) {
+                qemu_vfree(chunk);
+                chunk = NULL;
             }
+            qemu_iovec_reset(&qiov);
         }
 
         sector_num += num;
@@ -3245,7 +3257,8 @@ static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
     }
 
 fail:
-    qemu_vfree(iov.iov_base);
+    qemu_iovec_destroy(&qiov);
+    qemu_vfree(chunk);
     return ret;
 }
 
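
The idea behind the change, one small zeroed buffer referenced by many iovec entries, can be sketched outside of QEMU. The following is only an illustration under stated assumptions: it uses the plain POSIX struct iovec instead of QEMUIOVector, and ZERO_CHUNK_SIZE, zero_iov_count and zero_iov_fill are hypothetical names that do not exist in block.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/uio.h>

/* Hypothetical constant mirroring the 1 MiB chunk used by the patch. */
#define ZERO_CHUNK_SIZE ((size_t)1024 * 1024)

/* Number of iovec entries needed to describe `total` bytes when each
 * entry covers at most `chunk_size` bytes. */
static size_t zero_iov_count(size_t total, size_t chunk_size)
{
    assert(chunk_size > 0);
    return (total + chunk_size - 1) / chunk_size;
}

/* Point up to `niov` entries at the same zeroed `chunk`, so that together
 * they describe `total` bytes of zeroes without allocating `total` bytes.
 * Returns the number of bytes actually described. */
static size_t zero_iov_fill(struct iovec *iov, size_t niov,
                            void *chunk, size_t chunk_size, size_t total)
{
    size_t described = 0;

    for (size_t i = 0; i < niov && described < total; i++) {
        size_t left = total - described;
        size_t len = left < chunk_size ? left : chunk_size;

        iov[i].iov_base = chunk;   /* every entry shares one buffer */
        iov[i].iov_len = len;      /* the last entry may be shorter */
        described += len;
    }
    return described;
}
```

A caller would allocate the chunk once with calloc(1, ZERO_CHUNK_SIZE), size the array with zero_iov_count, fill it with zero_iov_fill and hand it to a vectored write such as writev. Note that writev accepts at most IOV_MAX entries per call, a limit this sketch does not handle.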