From patchwork Wed Aug 31 14:35:59 2011
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Stefan Berger
X-Patchwork-Id: 112578
Message-Id: <20110831143621.799480525@linux.vnet.ibm.com>
User-Agent: quilt/0.48-1
Date: Wed, 31 Aug 2011 10:35:59 -0400
From: Stefan Berger
To: stefanb@linux.vnet.ibm.com, qemu-devel@nongnu.org
References: <20110831143551.127339744@linux.vnet.ibm.com>
Content-Disposition: inline; filename=qemu_bdrv_lock.diff
Cc: chrisw@redhat.com, anbang.ruan@cs.ox.ac.uk, rrelyea@redhat.com,
 alevy@redhat.com, andreas.niederl@iaik.tugraz.at, serge@hallyn.com
Subject: [Qemu-devel] [PATCH V8 08/14] Introduce file lock for the block layer

This patch introduces file locking via fcntl() for the block layer so that
concurrent access to files shared by 2 Qemu instances, for example via NFS,
can be serialized. This feature is useful primarily during initial phases of
VM migration where the target machine's TIS driver validates the block
storage (and in a later patch checks for missing AES keys) and terminates
Qemu if the storage is found to be faulty. This then allows migration to
be gracefully terminated and Qemu continues running on the source machine.

Support for win32 is based on win32 API and has been lightly tested with a
standalone test program locking shared storage from two different machines.

To enable locking a file multiple times, a counter is used. Actual locking
happens the very first time and unlocking happens when the counter is zero.

v7:
 - fixed compilation error in win32 part

Signed-off-by: Stefan Berger

---
 block.c           |   41 +++++++++++++++++++++++++++++++++++
 block.h           |    8 ++++++
 block/raw-posix.c |   63 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 block/raw-win32.c |   52 ++++++++++++++++++++++++++++++++++++++++++++
 block_int.h       |    4 +++
 5 files changed, 168 insertions(+)

Index: qemu-git/block.c
===================================================================
--- qemu-git.orig/block.c
+++ qemu-git/block.c
@@ -521,6 +521,8 @@ static int bdrv_open_common(BlockDriverS
         goto free_and_fail;
     }
 
+    drv->num_locks = 0;
+
     bs->keep_read_only = bs->read_only = !(open_flags & BDRV_O_RDWR);
 
     ret = refresh_total_sectors(bs, bs->total_sectors);
@@ -1316,6 +1318,45 @@ void bdrv_get_geometry(BlockDriverState
     *nb_sectors_ptr = length;
 }
 
+/* file locking */
+static int bdrv_lock_common(BlockDriverState *bs, BDRVLockType lock_type)
+{
+    BlockDriver *drv = bs->drv;
+
+    if (!drv) {
+        return -ENOMEDIUM;
+    }
+
+    if (bs->file) {
+        drv = bs->file->drv;
+        if (drv->bdrv_lock) {
+            return drv->bdrv_lock(bs->file, lock_type);
+        }
+    }
+
+    if (drv->bdrv_lock) {
+        return drv->bdrv_lock(bs, lock_type);
+    }
+
+    return -ENOTSUP;
+}
+
+
+int bdrv_lock(BlockDriverState *bs)
+{
+    if (bdrv_is_read_only(bs)) {
+        return bdrv_lock_common(bs, BDRV_F_RDLCK);
+    }
+
+    return bdrv_lock_common(bs, BDRV_F_WRLCK);
+}
+
+void bdrv_unlock(BlockDriverState *bs)
+{
+    bdrv_lock_common(bs, BDRV_F_UNLCK);
+}
+
+
 struct partition {
         uint8_t boot_ind;           /* 0x80 - active */
         uint8_t head;               /* starting head */
Index: qemu-git/block.h
===================================================================
--- qemu-git.orig/block.h
+++ qemu-git/block.h
@@ -43,6 +43,12 @@ typedef struct QEMUSnapshotInfo {
 #define BDRV_SECTOR_MASK   ~(BDRV_SECTOR_SIZE - 1)
 
 typedef enum {
+    BDRV_F_UNLCK,
+    BDRV_F_RDLCK,
+    BDRV_F_WRLCK,
+} BDRVLockType;
+
+typedef enum {
     BLOCK_ERR_REPORT, BLOCK_ERR_IGNORE, BLOCK_ERR_STOP_ENOSPC,
     BLOCK_ERR_STOP_ANY
 } BlockErrorAction;
@@ -100,6 +106,8 @@ int bdrv_commit(BlockDriverState *bs);
 void bdrv_commit_all(void);
 int bdrv_change_backing_file(BlockDriverState *bs,
     const char *backing_file, const char *backing_fmt);
+int bdrv_lock(BlockDriverState *bs);
+void bdrv_unlock(BlockDriverState *bs);
 void bdrv_register(BlockDriver *bdrv);
 
 
Index: qemu-git/block/raw-posix.c
===================================================================
--- qemu-git.orig/block/raw-posix.c
+++ qemu-git/block/raw-posix.c
@@ -803,6 +803,67 @@ static int64_t raw_get_allocated_file_si
     return (int64_t)st.st_blocks * 512;
 }
 
+static int raw_lock(BlockDriverState *bs, BDRVLockType lock_type)
+{
+    BlockDriver *drv = bs->drv;
+    BDRVRawState *s = bs->opaque;
+    struct flock flock = {
+        .l_whence = SEEK_SET,
+        .l_start = 0,
+        .l_len = 0,
+    };
+    int n;
+
+    switch (lock_type) {
+    case BDRV_F_RDLCK:
+    case BDRV_F_WRLCK:
+        if (drv->num_locks) {
+            drv->num_locks++;
+            return 0;
+        }
+        flock.l_type = (lock_type == BDRV_F_RDLCK) ? F_RDLCK : F_WRLCK;
+        break;
+
+    case BDRV_F_UNLCK:
+        if (--drv->num_locks > 0) {
+            return 0;
+        }
+
+        assert(drv->num_locks == 0);
+
+        flock.l_type = F_UNLCK;
+        break;
+
+    default:
+        return -EINVAL;
+    }
+
+    while (1) {
+        n = fcntl(s->fd, F_SETLKW, &flock);
+        if (n < 0) {
+            if (errno == EINTR) {
+                continue;
+            }
+            if (errno == EAGAIN) {
+                usleep(10000);
+                continue;
+            }
+        }
+        break;
+    }
+
+    if (n == 0 &&
+        ((lock_type == BDRV_F_RDLCK) || (lock_type == BDRV_F_WRLCK))) {
+        drv->num_locks = 1;
+    }
+
+    if (n) {
+        return -errno;
+    }
+
+    return 0;
+}
+
 static int raw_create(const char *filename, QEMUOptionParameter *options)
 {
     int fd;
@@ -901,6 +962,8 @@ static BlockDriver bdrv_file = {
     .bdrv_get_allocated_file_size
                         = raw_get_allocated_file_size,
 
+    .bdrv_lock = raw_lock,
+
     .create_options = raw_create_options,
 };
 
Index: qemu-git/block_int.h
===================================================================
--- qemu-git.orig/block_int.h
+++ qemu-git/block_int.h
@@ -146,6 +146,10 @@ struct BlockDriver {
      */
     int (*bdrv_has_zero_init)(BlockDriverState *bs);
 
+    /* File locking */
+    int num_locks;
+    int (*bdrv_lock)(BlockDriverState *bs, BDRVLockType lock_type);
+
     QLIST_ENTRY(BlockDriver) list;
 };
 
Index: qemu-git/block/raw-win32.c
===================================================================
--- qemu-git.orig/block/raw-win32.c
+++ qemu-git/block/raw-win32.c
@@ -242,6 +242,57 @@ static int64_t raw_get_allocated_file_si
     return st.st_size;
 }
 
+static int raw_lock(BlockDriverState *bs, int lock_type)
+{
+    BlockDriver *drv = bs->drv;
+    BDRVRawState *s = bs->opaque;
+    OVERLAPPED ov;
+    BOOL res;
+    DWORD num_bytes;
+
+    switch (lock_type) {
+    case BDRV_F_RDLCK:
+    case BDRV_F_WRLCK:
+        if (drv->num_locks) {
+            drv->num_locks++;
+            return 0;
+        }
+
+        memset(&ov, 0, sizeof(ov));
+
+        res = LockFileEx(s->hfile, LOCKFILE_EXCLUSIVE_LOCK, 0, ~0, ~0, &ov);
+
+        if (res == FALSE) {
+            res = GetOverlappedResult(s->hfile, &ov, &num_bytes, TRUE);
+        }
+
+        if (res == TRUE) {
+            drv->num_locks = 1;
+        }
+
+        break;
+
+    case BDRV_F_UNLCK:
+        if (--drv->num_locks > 0) {
+            return 0;
+        }
+
+        assert(drv->num_locks >= 0);
+
+        res = UnlockFile(s->hfile, 0, 0, ~0, ~0);
+        break;
+
+    default:
+        return -EINVAL;
+    }
+
+    if (res == FALSE) {
+        return -EIO;
+    }
+
+    return 0;
+}
+
 static int raw_create(const char *filename, QEMUOptionParameter *options)
 {
     int fd;
@@ -289,6 +340,7 @@ static BlockDriver bdrv_file = {
     .bdrv_get_allocated_file_size
                         = raw_get_allocated_file_size,
 
+    .bdrv_lock = raw_lock,
     .create_options = raw_create_options,
 };
 
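
For readers who want to try the counted-lock scheme from the commit message
outside of QEMU, here is a minimal standalone sketch in POSIX C. It is not
part of the patch. The names FileLock, LockMode, whole_file_fcntl,
file_lock_acquire and file_lock_release are hypothetical and exist only for
this illustration. The sketch also assumes the counter is kept next to the
file descriptor, which is a choice made here to keep the example
self-contained; the patch keeps num_locks in struct BlockDriver.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical per-file lock state; not a QEMU structure. */
typedef struct {
    int fd;         /* descriptor of the file to lock */
    int num_locks;  /* nesting depth held by this process */
} FileLock;

typedef enum { LOCK_SHARED, LOCK_EXCLUSIVE } LockMode;

/*
 * Apply l_type (F_RDLCK, F_WRLCK or F_UNLCK) to the whole file,
 * blocking until the request is granted and retrying on EINTR.
 * Returns 0 on success or a negative errno value.
 */
static int whole_file_fcntl(int fd, short l_type)
{
    struct flock fl = {
        .l_type = l_type,
        .l_whence = SEEK_SET,
        .l_start = 0,
        .l_len = 0,     /* 0 means "up to end of file" */
    };
    int ret;

    do {
        ret = fcntl(fd, F_SETLKW, &fl);
    } while (ret < 0 && errno == EINTR);

    return ret < 0 ? -errno : 0;
}

/* Only the first acquire reaches the kernel; later ones bump the counter. */
static int file_lock_acquire(FileLock *lock, LockMode mode)
{
    int ret;

    if (lock->num_locks > 0) {
        lock->num_locks++;
        return 0;
    }

    ret = whole_file_fcntl(lock->fd,
                           mode == LOCK_SHARED ? F_RDLCK : F_WRLCK);
    if (ret == 0) {
        lock->num_locks = 1;
    }
    return ret;
}

/* Only the release that brings the counter back to zero unlocks the file. */
static int file_lock_release(FileLock *lock)
{
    if (lock->num_locks == 0) {
        return -EINVAL;     /* unbalanced release */
    }
    if (--lock->num_locks > 0) {
        return 0;
    }

    assert(lock->num_locks == 0);
    return whole_file_fcntl(lock->fd, F_UNLCK);
}
```

To use it, open the file read-write, initialise a FileLock with that
descriptor and a zero counter, and pair every file_lock_acquire() with a
file_lock_release(). A second process doing the same on the same file (for
example over NFS, if the mount supports fcntl locks) should block in its
first acquire until the first process has released its last reference.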