From patchwork Mon Aug 29 14:53:11 2011
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kevin Wolf
X-Patchwork-Id: 112098
From: Kevin Wolf
To: anthony@codemonkey.ws
Date: Mon, 29 Aug 2011 16:53:11 +0200
Message-Id: <1314629618-8308-4-git-send-email-kwolf@redhat.com>
In-Reply-To: <1314629618-8308-1-git-send-email-kwolf@redhat.com>
References: <1314629618-8308-1-git-send-email-kwolf@redhat.com>
Cc: kwolf@redhat.com, qemu-devel@nongnu.org
Subject: [Qemu-devel] [PATCH 03/30] block: add cache=directsync parameter to -drive

From: Stefan Hajnoczi

This patch adds -drive cache=directsync for O_DIRECT | O_SYNC host file I/O
with no disk write cache presented to the guest.

This mode is useful when guests may not be sending flushes when appropriate
and therefore leave data at risk in case of power failure. When
cache=directsync is used, write operations are only completed to the guest
when data is safely on disk.

This new mode is like cache=writethrough but it bypasses the host page cache.

Signed-off-by: Stefan Hajnoczi
Signed-off-by: Kevin Wolf
---
 block.c         |    6 ++++--
 qemu-config.c   |    3 ++-
 qemu-img.c      |    3 ++-
 qemu-options.hx |    8 ++++++--
 4 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/block.c b/block.c
index dbef3ae..4186a2f 100644
--- a/block.c
+++ b/block.c
@@ -448,6 +448,8 @@ int bdrv_parse_cache_flags(const char *mode, int *flags)
 
     if (!strcmp(mode, "off") || !strcmp(mode, "none")) {
         *flags |= BDRV_O_NOCACHE | BDRV_O_CACHE_WB;
+    } else if (!strcmp(mode, "directsync")) {
+        *flags |= BDRV_O_NOCACHE;
     } else if (!strcmp(mode, "writeback")) {
         *flags |= BDRV_O_CACHE_WB;
     } else if (!strcmp(mode, "unsafe")) {
@@ -1188,8 +1190,8 @@ int bdrv_pwrite_sync(BlockDriverState *bs, int64_t offset,
         return ret;
     }
 
-    /* No flush needed for cache=writethrough, it uses O_DSYNC */
-    if ((bs->open_flags & BDRV_O_CACHE_MASK) != 0) {
+    /* No flush needed for cache modes that use O_DSYNC */
+    if ((bs->open_flags & BDRV_O_CACHE_WB) != 0) {
         bdrv_flush(bs);
     }
 
diff --git a/qemu-config.c b/qemu-config.c
index 1eb6b9a..139e077 100644
--- a/qemu-config.c
+++ b/qemu-config.c
@@ -55,7 +55,8 @@ static QemuOptsList qemu_drive_opts = {
         },{
             .name = "cache",
             .type = QEMU_OPT_STRING,
-            .help = "host cache usage (none, writeback, writethrough, unsafe)",
+            .help = "host cache usage (none, writeback, writethrough, "
+                    "directsync, unsafe)",
         },{
             .name = "aio",
             .type = QEMU_OPT_STRING,
diff --git a/qemu-img.c b/qemu-img.c
index 5e203c2..10a3a8b 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -66,7 +66,8 @@ static void help(void)
            " 'filename' is a disk image filename\n"
            " 'fmt' is the disk image format. It is guessed automatically in most cases\n"
            " 'cache' is the cache mode used to write the output disk image, the valid\n"
-           " options are: 'none', 'writeback' (default), 'writethrough' and 'unsafe'\n"
+           " options are: 'none', 'writeback' (default), 'writethrough', 'directsync'\n"
+           " and 'unsafe'\n"
            " 'size' is the disk image size in bytes. Optional suffixes\n"
            " 'k' or 'K' (kilobyte, 1024), 'M' (megabyte, 1024k), 'G' (gigabyte, 1024M)\n"
            " and T (terabyte, 1024G) are supported. 'b' is ignored.\n"
diff --git a/qemu-options.hx b/qemu-options.hx
index d86815d..35d95d1 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -133,7 +133,7 @@ ETEXI
 DEF("drive", HAS_ARG, QEMU_OPTION_drive,
     "-drive [file=file][,if=type][,bus=n][,unit=m][,media=d][,index=i]\n"
     " [,cyls=c,heads=h,secs=s[,trans=t]][,snapshot=on|off]\n"
-    " [,cache=writethrough|writeback|none|unsafe][,format=f]\n"
+    " [,cache=writethrough|writeback|none|directsync|unsafe][,format=f]\n"
     " [,serial=s][,addr=A][,id=name][,aio=threads|native]\n"
     " [,readonly=on|off]\n"
     " use 'file' as a drive image\n", QEMU_ARCH_ALL)
@@ -164,7 +164,7 @@ These options have the same definition as they have in @option{-hdachs}.
 @item snapshot=@var{snapshot}
 @var{snapshot} is "on" or "off" and allows to enable snapshot for given drive (see @option{-snapshot}).
 @item cache=@var{cache}
-@var{cache} is "none", "writeback", "unsafe", or "writethrough" and controls how the host cache is used to access block data.
+@var{cache} is "none", "writeback", "unsafe", "directsync" or "writethrough" and controls how the host cache is used to access block data.
 @item aio=@var{aio}
 @var{aio} is "threads", or "native" and selects between pthread based disk I/O and native Linux AIO.
 @item format=@var{format}
@@ -199,6 +199,10 @@ The host page cache can be avoided entirely with @option{cache=none}. This will
 attempt to do disk IO directly to the guests memory. QEMU may still perform
 an internal copy of the data.
 
+The host page cache can be avoided while only sending write notifications to
+the guest when the data has been reported as written by the storage subsystem
+using @option{cache=directsync}.
+
 Some block drivers perform badly with @option{cache=writethrough}, most notably,
 qcow2. If performance is more important than correctness,
 @option{cache=writeback} should be used with qcow2.
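
For readers following the block.c hunks, here is a minimal standalone sketch of how the cache mode names map onto the two flag bits involved, and why the flush check in bdrv_pwrite_sync() tests only the write-back bit. It is an illustration under stated assumptions, not the QEMU code: the SKETCH_* names, their values, and both helper functions are hypothetical. "unsafe" is left out because its branch body is not shown in the patch. Treating "writethrough" as "neither bit set" is inferred from the commit message, which describes directsync as writethrough minus the host page cache.

```c
#include <string.h>

/*
 * Illustrative flag bits. The names and values are assumptions made for
 * this sketch; they are not QEMU's real BDRV_O_* definitions.
 */
enum {
    SKETCH_O_NOCACHE  = 1 << 0, /* bypass the host page cache */
    SKETCH_O_CACHE_WB = 1 << 1, /* a write cache is presented to the guest */
};

/*
 * Map a cache mode name to flag bits, mirroring the branches visible in
 * the patch. Returns 0 on success and -1 for a mode this sketch does not
 * cover.
 */
static int sketch_parse_cache_flags(const char *mode, int *flags)
{
    /* Clear both cache bits before applying the requested mode. */
    *flags &= ~(SKETCH_O_NOCACHE | SKETCH_O_CACHE_WB);

    if (!strcmp(mode, "off") || !strcmp(mode, "none")) {
        *flags |= SKETCH_O_NOCACHE | SKETCH_O_CACHE_WB;
    } else if (!strcmp(mode, "directsync")) {
        /* No host page cache, and no write cache shown to the guest. */
        *flags |= SKETCH_O_NOCACHE;
    } else if (!strcmp(mode, "writeback")) {
        *flags |= SKETCH_O_CACHE_WB;
    } else if (!strcmp(mode, "writethrough")) {
        /* Assumed: neither bit is set for writethrough. */
    } else {
        return -1;
    }
    return 0;
}

/*
 * Mirror of the revised check in bdrv_pwrite_sync(): an explicit flush is
 * only needed when the write-back bit is set. Testing the whole cache mask
 * would also match directsync, which sets only the no-cache bit.
 */
static int sketch_needs_explicit_flush(int flags)
{
    return (flags & SKETCH_O_CACHE_WB) != 0;
}
```

With these helpers, parsing "directsync" leaves only the no-cache bit set, so sketch_needs_explicit_flush() returns 0. Parsing "none" sets both bits and it returns 1.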