From patchwork Fri Dec 3 11:09:52 2010
X-Patchwork-Submitter: ronnie sahlberg
X-Patchwork-Id: 74126
From: ronniesahlberg@gmail.com
To: qemu-devel@nongnu.org
Cc: Ronnie Sahlberg
Date: Fri, 3 Dec 2010 22:09:52 +1100
Message-Id: <1291374593-17448-14-git-send-email-ronniesahlberg@gmail.com>
In-Reply-To: <1291374593-17448-1-git-send-email-ronniesahlberg@gmail.com>
References: <1291374593-17448-1-git-send-email-ronniesahlberg@gmail.com>
X-Mailer: git-send-email 1.7.3.1
Subject: [Qemu-devel] [PATCH 13/14] ./block/iscsi.c
List-Id: qemu-devel.nongnu.org

From: Ronnie Sahlberg

This file adds a new protocol, iSCSI, to QEMU and allows QEMU to connect
directly to iSCSI resources without having to go through the host SCSI layer.

This allows QEMU/KVM to use iSCSI devices without exposing them to the host
system and without polluting the page cache of the host.

This file provides the bindings between QEMU/KVM and the iscsi library,
allowing QEMU to interface to network devices directly.

The library is fully async for read/write and nonblocking, and should provide
good performance.
Further optimizations are possible in future enhancements, for example
reducing the number of data copies performed in the read/write paths.

Syntax for using iscsi devices is
    iscsi://[:]//

Example :
    ... -drive file=iscsi://127.0.0.1:3260/iqn.ronnie.test/1
    ... -cdrom iscsi://127.0.0.1:3260/iqn.ronnie.test/2

This has been tested extensively with TGTS and seems to work well with TGTD
exported devices.

For more general purpose use, support for target initiated NOP exchanges
and CHAP authentication is desired.

Signed-off-by: Ronnie Sahlberg
---
 block/iscsi.c |  500 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 500 insertions(+), 0 deletions(-)
 create mode 100644 block/iscsi.c

diff --git a/block/iscsi.c b/block/iscsi.c
new file mode 100644
index 0000000..20fc288
--- /dev/null
+++ b/block/iscsi.c
@@ -0,0 +1,500 @@
+/*
+ * QEMU Block driver for iSCSI images
+ *
+ * Copyright (c) 2010 Ronnie Sahlberg
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include
+#include "sysemu.h"
+#include "qemu-common.h"
+#include "qemu-error.h"
+#include "block_int.h"
+#include "iscsi/iscsi.h"
+#include "iscsi/scsi-lowlevel.h"
+
+
+typedef struct ISCSILUN {
+    struct iscsi_context *iscsi;
+    int lun;
+    int block_size;
+    unsigned long num_blocks;
+} ISCSILUN;
+
+typedef struct ISCSIAIOCB {
+    BlockDriverAIOCB common;
+    QEMUIOVector *qiov;
+    QEMUBH *bh;
+    ISCSILUN *iscsilun;
+    int canceled;
+    int status;
+    size_t read_size;
+} ISCSIAIOCB;
+
+struct iscsi_task {
+    ISCSILUN *iscsilun;
+    int status;
+    int complete;
+};
+
+static int
+iscsi_is_inserted(BlockDriverState *bs)
+{
+    ISCSILUN *iscsilun = bs->opaque;
+    struct iscsi_context *iscsi = iscsilun->iscsi;
+
+    return iscsi_is_logged_in(iscsi);
+}
+
+
+static void
+iscsi_aio_cancel(BlockDriverAIOCB *blockacb)
+{
+    ISCSIAIOCB *acb = (ISCSIAIOCB *)blockacb;
+
+    acb->status = -EIO;
+    acb->common.cb(acb->common.opaque, acb->status);
+    acb->canceled = 1;
+}
+
+static AIOPool iscsi_aio_pool = {
+    .aiocb_size = sizeof(ISCSIAIOCB),
+    .cancel = iscsi_aio_cancel,
+};
+
+
+static void iscsi_process_read(void *arg);
+static void iscsi_process_write(void *arg);
+
+static void
+iscsi_set_events(ISCSILUN *iscsilun)
+{
+    struct iscsi_context *iscsi = iscsilun->iscsi;
+
+    qemu_aio_set_fd_handler(iscsi_get_fd(iscsi), iscsi_process_read,
+                            (iscsi_which_events(iscsi)&POLLOUT)
+                            ?iscsi_process_write:NULL,
+                            NULL, NULL, iscsilun);
+}
+
+static void
+iscsi_process_read(void *arg)
+{
+    ISCSILUN *iscsilun = arg;
+    struct iscsi_context *iscsi = iscsilun->iscsi;
+
+    iscsi_service(iscsi, POLLIN);
+    iscsi_set_events(iscsilun);
+}
+
+static void
+iscsi_process_write(void *arg)
+{
+    ISCSILUN *iscsilun = arg;
+    struct iscsi_context *iscsi = iscsilun->iscsi;
+
+    iscsi_service(iscsi, POLLOUT);
+    iscsi_set_events(iscsilun);
+}
+
+
+static int
+iscsi_schedule_bh(QEMUBHFunc *cb, ISCSIAIOCB *acb)
+{
+    acb->bh = qemu_bh_new(cb, acb);
+    if (!acb->bh) {
+        error_report("oom: could not create iscsi bh");
+        return -EIO;
+    }
+
+    qemu_bh_schedule(acb->bh);
+    return 0;
+}
+
+static void
+iscsi_readv_writev_bh_cb(void *p)
+{
+    ISCSIAIOCB *acb = p;
+
+    acb->common.cb(acb->common.opaque, acb->status);
+    qemu_aio_release(acb);
+}
+
+static void
+iscsi_aio_write10_cb(struct iscsi_context *iscsi, int status,
+                     void *command_data, void *private_data)
+{
+    ISCSIAIOCB *acb = private_data;
+
+    if (acb->canceled != 0) {
+        qemu_aio_release(acb);
+        return;
+    }
+
+    acb->status = 0;
+    if (status < 0) {
+        error_report("Failed to write10 data to iSCSI lun. %s",
+                     iscsi_get_error(iscsi));
+        acb->status = -EIO;
+    }
+
+    iscsi_schedule_bh(iscsi_readv_writev_bh_cb, acb);
+}
+
+static BlockDriverAIOCB *
+iscsi_aio_writev(BlockDriverState *bs, int64_t sector_num,
+                 QEMUIOVector *qiov, int nb_sectors,
+                 BlockDriverCompletionFunc *cb,
+                 void *opaque)
+{
+    ISCSILUN *iscsilun = bs->opaque;
+    struct iscsi_context *iscsi = iscsilun->iscsi;
+    ISCSIAIOCB *acb;
+    size_t size;
+    unsigned char *buf;
+
+    acb = qemu_aio_get(&iscsi_aio_pool, bs, cb, opaque);
+    if (!acb) {
+        return NULL;
+    }
+
+    acb->iscsilun = iscsilun;
+    acb->qiov = qiov;
+
+    acb->canceled = 0;
+
+    /* XXX we should pass the iovec to write10 to avoid the extra copy */
+    /* this will allow us to get rid of 'buf' completely */
+    size = nb_sectors * BDRV_SECTOR_SIZE;
+    buf = qemu_malloc(size);
+    qemu_iovec_to_buffer(acb->qiov, buf);
+    if (iscsi_write10_async(iscsi, iscsilun->lun, buf, size,
+                            sector_num * BDRV_SECTOR_SIZE
+                            / iscsilun->block_size,
+                            0, 0, iscsilun->block_size,
+                            iscsi_aio_write10_cb, acb) != 0) {
+        error_report("iSCSI: Failed to send write10 command. %s",
+                     iscsi_get_error(iscsi));
+        qemu_free(buf);
+        qemu_aio_release(acb);
+        return NULL;
+    }
+    qemu_free(buf);
+    iscsi_set_events(iscsilun);
+
+    return &acb->common;
+}
+
+static void
+iscsi_aio_read10_cb(struct iscsi_context *iscsi, int status,
+                    void *command_data, void *private_data)
+{
+    ISCSIAIOCB *acb = private_data;
+    struct scsi_task *scsi = command_data;
+
+    if (acb->canceled != 0) {
+        qemu_aio_release(acb);
+        return;
+    }
+
+    acb->status = 0;
+    if (status < 0) {
+        error_report("Failed to read10 data from iSCSI lun. %s",
+                     iscsi_get_error(iscsi));
+        acb->status = -EIO;
+    } else {
+        qemu_iovec_from_buffer(acb->qiov, scsi->datain.data, acb->read_size);
+    }
+
+    iscsi_schedule_bh(iscsi_readv_writev_bh_cb, acb);
+}
+
+static BlockDriverAIOCB *
+iscsi_aio_readv(BlockDriverState *bs, int64_t sector_num,
+                QEMUIOVector *qiov, int nb_sectors,
+                BlockDriverCompletionFunc *cb,
+                void *opaque)
+{
+    ISCSILUN *iscsilun = bs->opaque;
+    struct iscsi_context *iscsi = iscsilun->iscsi;
+    ISCSIAIOCB *acb;
+    size_t qemu_read_size, lun_read_size;
+
+    qemu_read_size = BDRV_SECTOR_SIZE * (size_t)nb_sectors;
+    lun_read_size = (qemu_read_size + iscsilun->block_size - 1)
+                    / iscsilun->block_size * iscsilun->block_size;
+
+    acb = qemu_aio_get(&iscsi_aio_pool, bs, cb, opaque);
+    if (!acb) {
+        return NULL;
+    }
+
+    acb->iscsilun = iscsilun;
+    acb->qiov = qiov;
+
+    acb->canceled = 0;
+    acb->read_size = qemu_read_size;
+
+    if (iscsi_read10_async(iscsi, iscsilun->lun,
+                           sector_num * BDRV_SECTOR_SIZE
+                           / iscsilun->block_size,
+                           lun_read_size, iscsilun->block_size,
+                           iscsi_aio_read10_cb, acb) != 0) {
+        error_report("iSCSI: Failed to send read10 command. %s",
+                     iscsi_get_error(iscsi));
+        qemu_aio_release(acb);
+        return NULL;
+    }
+    iscsi_set_events(iscsilun);
+
+    return &acb->common;
+}
+
+
+static int
+iscsi_flush(BlockDriverState *bs)
+{
+    ISCSILUN *iscsilun = bs->opaque;
+    struct iscsi_context *iscsi = iscsilun->iscsi;
+    struct scsi_task *task;
+
+    task = iscsi_synchronizecache10_sync(iscsi, iscsilun->lun, 0, 0, 0, 0);
+    if (task == NULL) {
+        error_report("iSCSI: Failed to flush() device.");
+        return -1;
+    }
+    if (task->status != 0) {
+        error_report("iSCSI: Failed to flush to LUN : %s",
+                     iscsi_get_error(iscsi));
+        scsi_free_scsi_task(task);
+        return -1;
+    }
+    scsi_free_scsi_task(task);
+    return 0;
+}
+
+static int64_t
+iscsi_getlength(BlockDriverState *bs)
+{
+    ISCSILUN *iscsilun = bs->opaque;
+    int64_t len;
+
+    len = iscsilun->num_blocks;
+    len *= iscsilun->block_size;
+
+    return len;
+}
+
+static void
+iscsi_readcapacity10_cb(struct iscsi_context *iscsi, int status,
+                        void *command_data, void *private_data)
+{
+    struct iscsi_task *task = private_data;
+    struct scsi_readcapacity10 *rc10;
+    struct scsi_task *scsi = command_data;
+
+    if (status != 0) {
+        error_report("iSCSI: Failed to read capacity of iSCSI lun. %s",
+                     iscsi_get_error(iscsi));
+        task->status = 1;
+        task->complete = 1;
+        return;
+    }
+
+    rc10 = scsi_datain_unmarshall(scsi);
+    if (rc10 == NULL) {
+        error_report("iSCSI: Failed to unmarshall readcapacity10 data.");
+        task->status = 1;
+        task->complete = 1;
+        return;
+    }
+
+    task->iscsilun->block_size = rc10->block_size;
+    task->iscsilun->num_blocks = rc10->lba;
+
+    task->status = 0;
+    task->complete = 1;
+}
+
+
+static void
+iscsi_connect_cb(struct iscsi_context *iscsi, int status, void *command_data,
+                 void *private_data)
+{
+    struct iscsi_task *task = private_data;
+
+    if (status != 0) {
+        task->status = 1;
+        task->complete = 1;
+        return;
+    }
+
+    if (iscsi_readcapacity10_async(iscsi, task->iscsilun->lun, 0, 0,
+                                   iscsi_readcapacity10_cb, private_data)
+        != 0) {
+        error_report("iSCSI: failed to send readcapacity command.");
+        task->status = 1;
+        task->complete = 1;
+    }
+}
+
+/*
+ * We support iscsi url's on the form
+ * iscsi://[:]//
+ */
+static int iscsi_open(BlockDriverState *bs, const char *filename, int flags)
+{
+    ISCSILUN *iscsilun = bs->opaque;
+    struct iscsi_context *iscsi = NULL;
+    struct iscsi_task task;
+    char *ptr, *host, *target, *url;
+    char *tmp;
+    int ret, lun;
+
+    bzero(iscsilun, sizeof(ISCSILUN));
+
+    url = qemu_strdup(filename);
+
+    if (strncmp(url, "iscsi://", 8)) {
+        error_report("iSCSI: url does not start with 'iscsi://'");
+        ret = -EINVAL;
+        goto failed;
+    }
+
+    host = url + 8;
+
+    ptr = index(host, '/');
+    if (ptr == NULL) {
+        error_report("iSCSI: host is not '/' terminated.");
+        ret = -EINVAL;
+        goto failed;
+    }
+
+    *ptr++ = 0;
+    target = ptr;
+
+    ptr = index(target, '/');
+    if (ptr == NULL) {
+        error_report("iSCSI: host/target is not '/' terminated.");
+        ret = -EINVAL;
+        goto failed;
+    }
+
+    *ptr++ = 0;
+
+    lun = strtol(ptr, &tmp, 10);
+    if (*tmp) {
+        error_report("iSCSI: Invalid LUN specified.");
+        ret = -EINVAL;
+        goto failed;
+    }
+
+    /* Should really append the KVM name after the ':' here */
+    iscsi = iscsi_create_context("iqn.2008-11.org.linux-kvm:");
+    if (iscsi == NULL) {
+        error_report("iSCSI: Failed to create iSCSI context.");
+        ret = -ENOMEM;
+        goto failed;
+    }
+
+    if (iscsi_set_targetname(iscsi, target)) {
+        error_report("iSCSI: Failed to set target name.");
+        ret = -ENOMEM;
+        goto failed;
+    }
+
+    if (iscsi_set_session_type(iscsi, ISCSI_SESSION_NORMAL) != 0) {
+        error_report("iSCSI: Failed to set session type to normal.");
+        ret = -ENOMEM;
+        goto failed;
+    }
+
+    iscsi_set_header_digest(iscsi, ISCSI_HEADER_DIGEST_NONE_CRC32C);
+
+    task.iscsilun = iscsilun;
+    task.status = 0;
+    task.complete = 0;
+
+    iscsilun->iscsi = iscsi;
+    iscsilun->lun = lun;
+
+    if (iscsi_full_connect_async(iscsi, host, lun, iscsi_connect_cb, &task)
+        != 0) {
+        error_report("iSCSI: Failed to start async connect.");
+        ret = -ENOMEM;
+        goto failed;
+    }
+    async_context_push();
+    while (!task.complete) {
+        iscsi_set_events(iscsilun);
+        qemu_aio_wait();
+    }
+    async_context_pop();
+    if (task.status != 0) {
+        error_report("iSCSI: Failed to connect to LUN : %s",
+                     iscsi_get_error(iscsi));
+        ret = -EINVAL;
+        goto failed;
+    }
+
+    return 0;
+
+failed:
+    qemu_free(url);
+    if (iscsi != NULL) {
+        iscsi_destroy_context(iscsi);
+    }
+    bzero(iscsilun, sizeof(ISCSILUN));
+    return ret;
+}
+
+static void iscsi_close(BlockDriverState *bs)
+{
+    ISCSILUN *iscsilun = bs->opaque;
+    struct iscsi_context *iscsi = iscsilun->iscsi;
+
+    qemu_aio_set_fd_handler(iscsi_get_fd(iscsi), NULL, NULL, NULL, NULL, NULL);
+    iscsi_destroy_context(iscsi);
+    bzero(iscsilun, sizeof(ISCSILUN));
+}
+
+static BlockDriver bdrv_iscsi = {
+    .format_name = "iscsi",
+    .protocol_name = "iscsi",
+
+    .instance_size = sizeof(ISCSILUN),
+    .bdrv_file_open = iscsi_open,
+    .bdrv_close = iscsi_close,
+    .bdrv_flush = iscsi_flush,
+
+    .bdrv_getlength = iscsi_getlength,
+
+    .bdrv_aio_readv = iscsi_aio_readv,
+    .bdrv_aio_writev = iscsi_aio_writev,
+
+    .bdrv_is_inserted = iscsi_is_inserted,
+};
+
+static void iscsi_block_init(void)
+{
+    bdrv_register(&bdrv_iscsi);
+}
+
+block_init(iscsi_block_init);
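
Note on the URL handling in iscsi_open(): the patch splits the filename in
place at the first two '/' characters after the "iscsi://" prefix and parses
the remainder as a decimal LUN. The following is a minimal standalone sketch
of that splitting step, not code from the patch. The names parse_iscsi_url
and struct iscsi_url_parts are hypothetical, and the sketch makes a few
assumptions that differ from the patch: it uses strchr() rather than index(),
copies into fixed-size buffers instead of modifying a duplicated string, and
additionally rejects empty host, target and LUN fields.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the URL pieces; not a type from the patch. */
struct iscsi_url_parts {
    char host[256];   /* host, with any ":port" suffix left attached */
    char target[256]; /* target name */
    int lun;          /* logical unit number */
};

/*
 * Split "iscsi://host/target/lun" into host, target and LUN.
 * Returns 0 on success and -EINVAL if the URL is malformed.
 */
static int parse_iscsi_url(const char *url, struct iscsi_url_parts *out)
{
    static const char prefix[] = "iscsi://";
    const size_t prefix_len = sizeof(prefix) - 1;
    const char *host, *target, *lun_str;
    size_t host_len, target_len;
    char *end;
    long lun;

    assert(url != NULL && out != NULL);

    if (strncmp(url, prefix, prefix_len) != 0) {
        return -EINVAL;
    }
    host = url + prefix_len;

    /* The host part runs up to the first '/'. */
    target = strchr(host, '/');
    if (target == NULL) {
        return -EINVAL;
    }
    host_len = (size_t)(target - host);
    target++;

    /* The target name runs up to the next '/'. */
    lun_str = strchr(target, '/');
    if (lun_str == NULL) {
        return -EINVAL;
    }
    target_len = (size_t)(lun_str - target);
    lun_str++;

    if (host_len == 0 || host_len >= sizeof(out->host) ||
        target_len == 0 || target_len >= sizeof(out->target)) {
        return -EINVAL;
    }

    /* For an empty string strtol() leaves 'end' at the NUL, so check first. */
    if (*lun_str == '\0') {
        return -EINVAL;
    }
    errno = 0;
    lun = strtol(lun_str, &end, 10);
    if (errno != 0 || *end != '\0' || lun < 0 || lun > INT_MAX) {
        return -EINVAL;
    }

    memcpy(out->host, host, host_len);
    out->host[host_len] = '\0';
    memcpy(out->target, target, target_len);
    out->target[target_len] = '\0';
    out->lun = (int)lun;
    return 0;
}
```

With the first example from the commit message,
iscsi://127.0.0.1:3260/iqn.ronnie.test/1, this yields host "127.0.0.1:3260",
target "iqn.ronnie.test" and LUN 1. The port is left attached to the host
string, as in the patch, which passes the host part unchanged to
iscsi_full_connect_async().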