From patchwork Wed Dec 12 12:21:52 2012
X-Patchwork-Submitter: Ezequiel Garcia
X-Patchwork-Id: 205497
From: Ezequiel Garcia <elezegarcia@gmail.com>
To: linux-mtd@lists.infradead.org
Cc: Thomas Petazzoni, Artem Bityutskiy, richard.weinberger@gmail.com,
 Michael Opdenacker, Tim Bird
Subject: [RFC/PATCH v2] ubi: Add ubiblock read-write driver
Date: Wed, 12 Dec 2012 09:21:52 -0300
Message-Id: <1355314912-9321-1-git-send-email-elezegarcia@gmail.com>
List-Id: Linux MTD discussion mailing list

Block device emulation on top of UBI volumes, with read/write support.

Block devices are created upon user request through the 'vol' module
parameter. For instance:

  $ modprobe ubiblock vol=/dev/ubi0_0
  $ modprobe ubiblock vol=0,rootfs

Read/write access is expected to work fairly well because the block
elevator orders transfers in the request queue to achieve spatial
locality; in other words, reads and writes are expected to be ordered
so that consecutive requests address the same LEB.

To help this along, and to reduce accesses to the UBI volume, two
1-LEB-sized caches have been implemented: one for reading and one for
writing. Every read and every write goes through one of these caches.

Read requests can be satisfied from either the read cache or the write
cache. Write requests always fill the write cache (possibly flushing it
first). This fill is done from the read cache when possible; if the
requested LEB is not in either cache, it is loaded from the device into
the write cache.

The flash device is written when the write cache is flushed, which
happens on:
 * ubiblock device release
 * access to a LEB other than the currently cached one
 * an io-barrier received through a REQ_FLUSH request

By creating two caches we decouple read access from write access, in an
effort to improve wear leveling.

Both caches are one LEB in size, vmalloc'ed at open() and freed at
release(); in addition, each ubiblock has an associated workqueue (not
a thread), so an unused ubiblock shouldn't waste many resources.

Signed-off-by: Ezequiel Garcia <elezegarcia@gmail.com>
---
Changes from v1:
 * Switch to two caches, one for reading and one for writing, as
   suggested by Richard Weinberger.
 * IO barriers supported using REQ_FLUSH, as requested by Artem
   Bityutskiy.
 * Block devices are attached to UBI volumes using a module parameter.
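A quick end-to-end exercise of the driver might look as follows (an
untested sketch: it assumes ubi0 is already attached with volume 0
present, and that udev creates the /dev/ubiblock0_0 node from the disk
name the driver sets):

  $ modprobe ubiblock vol=/dev/ubi0_0
  $ mkfs.ext4 /dev/ubiblock0_0
  $ mount /dev/ubiblock0_0 /mnt
  $ echo test > /mnt/foo
  $ umount /mnt     # release() flushes and syncs the write cache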
 drivers/mtd/ubi/Kconfig    |   16 +
 drivers/mtd/ubi/Makefile   |    1 +
 drivers/mtd/ubi/ubiblock.c |  830 ++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 847 insertions(+), 0 deletions(-)
 create mode 100644 drivers/mtd/ubi/ubiblock.c

diff --git a/drivers/mtd/ubi/Kconfig b/drivers/mtd/ubi/Kconfig
index 36663af..12ffa2d 100644
--- a/drivers/mtd/ubi/Kconfig
+++ b/drivers/mtd/ubi/Kconfig
@@ -87,4 +87,20 @@ config MTD_UBI_GLUEBI
 	   work on top of UBI. Do not enable this unless you use legacy
 	   software.
 
+config MTD_UBI_BLOCK
+	tristate "Caching block device access to UBI volumes"
+	help
+	  Since UBI already takes care of eraseblock wear leveling
+	  and bad block handling, it's possible to implement a block
+	  device on top of it and therefore mount regular filesystems
+	  (i.e. not flash-oriented ones, such as ext4).
+
+	  In other words, this is a software flash translation layer.
+
+	  This is a *very* experimental feature. In particular, it is
+	  not yet known how heavily a regular block-oriented filesystem
+	  might impact the wear of the raw flash.
+
+	  If in doubt, say "N".
+
 endif # MTD_UBI
diff --git a/drivers/mtd/ubi/Makefile b/drivers/mtd/ubi/Makefile
index b46b0c97..1578733 100644
--- a/drivers/mtd/ubi/Makefile
+++ b/drivers/mtd/ubi/Makefile
@@ -5,3 +5,4 @@ ubi-y += misc.o debug.o
 ubi-$(CONFIG_MTD_UBI_FASTMAP) += fastmap.o
 obj-$(CONFIG_MTD_UBI_GLUEBI) += gluebi.o
+obj-$(CONFIG_MTD_UBI_BLOCK) += ubiblock.o
diff --git a/drivers/mtd/ubi/ubiblock.c b/drivers/mtd/ubi/ubiblock.c
new file mode 100644
index 0000000..16a545e
--- /dev/null
+++ b/drivers/mtd/ubi/ubiblock.c
@@ -0,0 +1,830 @@
+/*
+ * Copyright (c) 2012 Ezequiel Garcia
+ * Copyright (c) 2011 Free Electrons
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * TODO: Add parameter for autoloading
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/err.h>
+#include <linux/kernel.h>
+#include <linux/list.h>
+#include <linux/mutex.h>
+#include <linux/slab.h>
+#include <linux/vmalloc.h>
+#include <linux/mtd/ubi.h>
+#include <linux/workqueue.h>
+#include <linux/blkdev.h>
+#include <linux/hdreg.h>
+
+#include "ubi-media.h"
+
+#if 0
+static bool auto_vol;
+module_param(auto_vol, bool, 0644);
+MODULE_PARM_DESC(auto_vol, "Automatically attach a block layer to each volume");
+#endif
+
+/* Maximum number of supported devices */
+#define UBIBLOCK_MAX_DEVICES 32
+
+/* Maximum length of the 'vol=' parameter */
+#define UBIBLOCK_VOL_PARAM_LEN 64
+
+/* Maximum number of comma-separated items in the 'vol=' parameter */
+#define UBIBLOCK_VOL_PARAM_COUNT 2
+
+struct ubiblock_vol_param {
+	int ubi_num;
+	int vol_id;
+	char name[UBIBLOCK_VOL_PARAM_LEN];
+};
+
+/* Number of elements set in the @ubiblock_vol_param array */
+static int ubiblock_devs;
+
+/* MTD devices specification parameters */
+static struct ubiblock_vol_param ubiblock_vol_param[UBIBLOCK_MAX_DEVICES];
+
+struct ubiblock_cache {
+	char *buffer;
+	enum { STATE_EMPTY, STATE_CLEAN, STATE_DIRTY } state;
+	int leb_num;
+};
+
+struct ubiblock {
+	struct ubi_volume_desc *desc;
+	struct ubi_volume_info *vi;
+	int ubi_num;
+	int vol_id;
+	int refcnt;
+
+	struct gendisk *gd;
+	struct request_queue *rq;
+
+	struct workqueue_struct *wq;
+	struct work_struct work;
+
+	struct mutex vol_mutex;
+	spinlock_t queue_lock;
+	struct list_head list;
+
+	int leb_size;
+	struct ubiblock_cache read_cache;
+	struct ubiblock_cache write_cache;
+};
+
+/* Linked list of all ubiblock instances */
+static LIST_HEAD(ubiblock_devices);
+static DEFINE_MUTEX(devices_mutex);
+static int ubiblock_major;
+
+/*
+ * Ugh, this parameter parsing code is simply awful :(
+ */
+static int ubiblock_set_vol_param(const char *val,
+				  const struct kernel_param *kp)
+{
+	int len, i, ret;
+	struct ubiblock_vol_param *param;
+	char buf[UBIBLOCK_VOL_PARAM_LEN];
+	char *pbuf = &buf[0];
+	char *tokens[UBIBLOCK_VOL_PARAM_COUNT];
+
+	if (!val)
+		return -EINVAL;
+
+	len = strnlen(val, UBIBLOCK_VOL_PARAM_LEN);
+	if (len == 0) {
+		pr_warn("empty 'vol=' parameter - ignored\n");
+		return 0;
+	}
+
+	if (len == UBIBLOCK_VOL_PARAM_LEN) {
+		pr_err("parameter \"%s\" is too long, max. is %d\n",
+		       val, UBIBLOCK_VOL_PARAM_LEN);
+		return -EINVAL;
+	}
+
+	strcpy(buf, val);
+
+	/* Get rid of the final newline */
+	if (buf[len - 1] == '\n')
+		buf[len - 1] = '\0';
+
+	for (i = 0; i < UBIBLOCK_VOL_PARAM_COUNT; i++)
+		tokens[i] = strsep(&pbuf, ",");
+
+	param = &ubiblock_vol_param[ubiblock_devs];
+	if (tokens[1]) {
+		/* Two parameters: can be 'ubi, vol_id' or 'ubi, vol_name' */
+		ret = kstrtoint(tokens[0], 10, &param->ubi_num);
+		if (ret < 0)
+			return -EINVAL;
+
+		/* Second param can be a number or a name */
+		ret = kstrtoint(tokens[1], 10, &param->vol_id);
+		if (ret < 0) {
+			param->vol_id = -1;
+			strcpy(param->name, tokens[1]);
+		}
+
+	} else {
+		/* One parameter: must be a device path */
+		strcpy(param->name, tokens[0]);
+		param->ubi_num = -1;
+		param->vol_id = -1;
+	}
+
+	ubiblock_devs++;
+
+	return 0;
+}
+
+static const struct kernel_param_ops ubiblock_param_ops = {
+	.set = ubiblock_set_vol_param,
+};
+module_param_cb(vol, &ubiblock_param_ops, NULL, 0644);
+
+static struct ubiblock *find_dev_nolock(int ubi_num, int vol_id)
+{
+	struct ubiblock *dev;
+
+	list_for_each_entry(dev, &ubiblock_devices, list)
+		if (dev->ubi_num == ubi_num && dev->vol_id == vol_id)
+			return dev;
+	return NULL;
+}
+
+static bool leb_on_cache(struct ubiblock_cache *cache, int leb_num)
+{
+	return cache->leb_num == leb_num;
+}
+
+static int ubiblock_fill_cache(struct ubiblock *dev, int leb_num,
+			       struct ubiblock_cache *cache,
+			       struct ubiblock_cache *aux_cache)
+{
+	int ret;
+
+	/* Warn if we fill the cache while it is dirty */
+	WARN_ON(cache->state == STATE_DIRTY);
+
+	cache->leb_num = leb_num;
+	cache->state = STATE_CLEAN;
+
+	/*
+	 * If the leb is on the auxiliary cache, we use it to fill
+	 * this cache. The auxiliary cache then needs to be invalidated.
+	 */
+	if (aux_cache && leb_on_cache(aux_cache, leb_num)) {
+
+		aux_cache->leb_num = -1;
+		aux_cache->state = STATE_EMPTY;
+		memcpy(cache->buffer, aux_cache->buffer, dev->leb_size);
+	} else {
+
+		ret = ubi_read(dev->desc, leb_num, cache->buffer, 0,
+			       dev->leb_size);
+		if (ret) {
+			dev_err(disk_to_dev(dev->gd), "ubi_read error %d\n",
+				ret);
+			return ret;
+		}
+	}
+	return 0;
+}
+
+static int ubiblock_flush(struct ubiblock *dev, bool sync)
+{
+	struct ubiblock_cache *cache = &dev->write_cache;
+	int ret = 0;
+
+	if (cache->state != STATE_DIRTY)
+		return 0;
+
+	/*
+	 * TODO: mtdblock sets STATE_EMPTY, arguing that it prevents the
+	 * underlying media from being changed without notice.
+	 * I'm not fully convinced, so I just set STATE_CLEAN.
+	 */
+	cache->state = STATE_CLEAN;
+
+	/* Atomically change the leb to the buffer contents */
+	ret = ubi_leb_change(dev->desc, cache->leb_num,
+			     cache->buffer, dev->leb_size);
+	if (ret) {
+		dev_err(disk_to_dev(dev->gd), "ubi_leb_change error %d\n", ret);
+		return ret;
+	}
+
+	/* Sync the ubi device on device release and on the block flush ioctl */
+	if (sync)
+		ret = ubi_sync(dev->ubi_num);
+
+	return ret;
+}
+
+static int ubiblock_read(struct ubiblock *dev, char *buffer,
+			 int pos, int len)
+{
+	int leb, offset, ret;
+	int bytes_left = len;
+	int to_read = len;
+	char *cache_buffer;
+
+	/* Get leb:offset address to read from */
+	leb = pos / dev->leb_size;
+	offset = pos % dev->leb_size;
+
+	while (bytes_left) {
+
+		/*
+		 * We can only read one leb at a time.
+		 * Therefore if the read length is larger than
+		 * one leb size, we split the operation.
+		 */
+		if (offset + to_read > dev->leb_size)
+			to_read = dev->leb_size - offset;
+
+		/*
+		 * 1) First try the read cache; if not there...
+		 * 2) then try the write cache; if not there...
+		 * 3) finally, load this leb into the read_cache.
+		 *
+		 * Note that reading never flushes to disk!
+		 */
+		if (leb_on_cache(&dev->read_cache, leb)) {
+			cache_buffer = dev->read_cache.buffer;
+
+		} else if (leb_on_cache(&dev->write_cache, leb)) {
+			cache_buffer = dev->write_cache.buffer;
+
+		} else {
+			/* Leb is not in any cache: fill the read_cache */
+			ret = ubiblock_fill_cache(dev, leb,
+						  &dev->read_cache, NULL);
+			if (ret)
+				return ret;
+			cache_buffer = dev->read_cache.buffer;
+		}
+
+		memcpy(buffer, cache_buffer + offset, to_read);
+		buffer += to_read;
+		bytes_left -= to_read;
+		to_read = bytes_left;
+		leb++;
+		offset = 0;
+	}
+	return 0;
+}
+
+static int ubiblock_write(struct ubiblock *dev, const char *buffer,
+			  int pos, int len)
+{
+	int leb, offset, ret;
+	int bytes_left = len;
+	int to_write = len;
+	struct ubiblock_cache *cache = &dev->write_cache;
+
+	/* Get (leb:offset) address to write to */
+	leb = pos / dev->leb_size;
+	offset = pos % dev->leb_size;
+
+	while (bytes_left) {
+		/*
+		 * We can only write one leb at a time.
+		 * Therefore if the write length is larger than
+		 * one leb size, we split the operation.
+		 */
+		if (offset + to_write > dev->leb_size)
+			to_write = dev->leb_size - offset;
+
+		/*
+		 * If the leb is not in the write cache, we flush the
+		 * currently cached leb to disk. The cache is then filled
+		 * either from the read cache or by reading the device.
+		 */
+		if (!leb_on_cache(cache, leb)) {
+
+			ret = ubiblock_flush(dev, false);
+			if (ret)
+				return ret;
+
+			ret = ubiblock_fill_cache(dev, leb,
+						  cache, &dev->read_cache);
+			if (ret)
+				return ret;
+		}
+
+		memcpy(cache->buffer + offset, buffer, to_write);
+
+		/* This is the only place where we dirty the write cache */
+		cache->state = STATE_DIRTY;
+
+		buffer += to_write;
+		bytes_left -= to_write;
+		to_write = bytes_left;
+		offset = 0;
+		leb++;
+	}
+	return 0;
+}
+
+static int do_ubiblock_request(struct ubiblock *dev, struct request *req)
+{
+	int pos, len;
+
+	if (req->cmd_flags & REQ_FLUSH)
+		return ubiblock_flush(dev, true);
+
+	if (req->cmd_type != REQ_TYPE_FS)
+		return -EIO;
+
+	if (blk_rq_pos(req) + blk_rq_cur_sectors(req) >
+	    get_capacity(req->rq_disk))
+		return -EIO;
+
+	pos = blk_rq_pos(req) << 9;
+	len = blk_rq_cur_bytes(req);
+
+	switch (rq_data_dir(req)) {
+	case READ:
+		return ubiblock_read(dev, req->buffer, pos, len);
+	case WRITE:
+		return ubiblock_write(dev, req->buffer, pos, len);
+	default:
+		return -EIO;
+	}
+}
+
+static void ubiblock_do_work(struct work_struct *work)
+{
+	struct ubiblock *dev =
+		container_of(work, struct ubiblock, work);
+	struct request_queue *rq = dev->rq;
+	struct request *req;
+	int res;
+
+	spin_lock_irq(rq->queue_lock);
+
+	req = blk_fetch_request(rq);
+	while (req) {
+
+		spin_unlock_irq(rq->queue_lock);
+
+		mutex_lock(&dev->vol_mutex);
+		res = do_ubiblock_request(dev, req);
+		mutex_unlock(&dev->vol_mutex);
+
+		spin_lock_irq(rq->queue_lock);
+
+		/*
+		 * If we're done with this request,
+		 * we need to fetch a new one
+		 */
+		if (!__blk_end_request_cur(req, res))
+			req = blk_fetch_request(rq);
+	}
+
+	spin_unlock_irq(rq->queue_lock);
+}
+
+static void ubiblock_request(struct request_queue *rq)
+{
+	struct ubiblock *dev;
+	struct request *req;
+
+	dev = rq->queuedata;
+
+	if (!dev)
+		while ((req = blk_fetch_request(rq)) != NULL)
+			__blk_end_request_all(req, -ENODEV);
+	else
+		queue_work(dev->wq, &dev->work);
+}
+
+static int ubiblock_alloc_cache(struct ubiblock_cache *cache, int size)
+{
+	cache->state = STATE_EMPTY;
+	cache->leb_num = -1;
+	cache->buffer = vmalloc(size);
+	if (!cache->buffer)
+		return -ENOMEM;
+	return 0;
+}
+
+static void ubiblock_free_cache(struct ubiblock_cache *cache)
+{
+	cache->leb_num = -1;
+	cache->state = STATE_EMPTY;
+	vfree(cache->buffer);
+}
+
+static int ubiblock_open(struct block_device *bdev, fmode_t mode)
+{
+	struct ubiblock *dev = bdev->bd_disk->private_data;
+	int ubi_mode = UBI_READONLY;
+	int ret;
+
+	mutex_lock(&dev->vol_mutex);
+	if (dev->refcnt > 0) {
+		/*
+		 * The volume is already opened,
+		 * just increase the reference counter
+		 */
+		dev->refcnt++;
+		mutex_unlock(&dev->vol_mutex);
+		return 0;
+	}
+
+	if (mode & FMODE_WRITE)
+		ubi_mode = UBI_READWRITE;
+
+	dev->desc = ubi_open_volume(dev->ubi_num, dev->vol_id, ubi_mode);
+	if (IS_ERR(dev->desc)) {
+		dev_err(disk_to_dev(dev->gd),
+			"failed to open ubi volume %d_%d\n",
+			dev->ubi_num, dev->vol_id);
+
+		ret = PTR_ERR(dev->desc);
+		dev->desc = NULL;
+		goto out_unlock;
+	}
+
+	dev->vi = kzalloc(sizeof(struct ubi_volume_info), GFP_KERNEL);
+	if (!dev->vi) {
+		ret = -ENOMEM;
+		goto out_close;
+	}
+	ubi_get_volume_info(dev->desc, dev->vi);
+	dev->leb_size = dev->vi->usable_leb_size;
+
+	ret = ubiblock_alloc_cache(&dev->read_cache, dev->leb_size);
+	if (ret)
+		goto out_free;
+
+	ret = ubiblock_alloc_cache(&dev->write_cache, dev->leb_size);
+	if (ret)
+		goto out_free_cache;
+
+	dev->refcnt++;
+	mutex_unlock(&dev->vol_mutex);
+	return 0;
+
+out_free_cache:
+	ubiblock_free_cache(&dev->read_cache);
+out_free:
+	kfree(dev->vi);
+out_close:
+	ubi_close_volume(dev->desc);
+	dev->desc = NULL;
+out_unlock:
+	mutex_unlock(&dev->vol_mutex);
+	return ret;
+}
+
+static int ubiblock_release(struct gendisk *gd, fmode_t mode)
+{
+	struct ubiblock *dev = gd->private_data;
+
+	mutex_lock(&dev->vol_mutex);
+
+	dev->refcnt--;
+	if (dev->refcnt == 0) {
+		ubiblock_flush(dev, true);
+
+		ubiblock_free_cache(&dev->read_cache);
+		ubiblock_free_cache(&dev->write_cache);
+
+		kfree(dev->vi);
+		ubi_close_volume(dev->desc);
+
+		dev->vi = NULL;
+		dev->desc = NULL;
+	}
+
+	mutex_unlock(&dev->vol_mutex);
+	return 0;
+}
+
+static int ubiblock_ioctl(struct block_device *bdev, fmode_t mode,
+			  unsigned int cmd, unsigned long arg)
+{
+	struct ubiblock *dev = bdev->bd_disk->private_data;
+	int ret = -ENXIO;
+
+	if (!dev)
+		return ret;
+
+	mutex_lock(&dev->vol_mutex);
+
+	/* I can't get this to get called. What's going on? */
+	switch (cmd) {
+	case BLKFLSBUF:
+		ret = ubiblock_flush(dev, true);
+		break;
+	default:
+		ret = -ENOTTY;
+	}
+
+	mutex_unlock(&dev->vol_mutex);
+	return ret;
+}
+
+static int ubiblock_getgeo(struct block_device *bdev, struct hd_geometry *geo)
+{
+	/* Some tools might require this information */
+	geo->heads = 1;
+	geo->cylinders = 1;
+	geo->sectors = get_capacity(bdev->bd_disk);
+	geo->start = 0;
+	return 0;
+}
+
+static const struct block_device_operations ubiblock_ops = {
+	.owner = THIS_MODULE,
+	.open = ubiblock_open,
+	.release = ubiblock_release,
+	.ioctl = ubiblock_ioctl,
+	.getgeo = ubiblock_getgeo,
+};
+
+static int ubiblock_add(struct ubi_volume_info *vi)
+{
+	struct ubiblock *dev;
+	struct gendisk *gd;
+	int disk_capacity;
+	int ret;
+
+	/* Check that the volume isn't already handled */
+	mutex_lock(&devices_mutex);
+	if (find_dev_nolock(vi->ubi_num, vi->vol_id)) {
+		mutex_unlock(&devices_mutex);
+		return -EEXIST;
+	}
+	mutex_unlock(&devices_mutex);
+
+	dev = kzalloc(sizeof(struct ubiblock), GFP_KERNEL);
+	if (!dev)
+		return -ENOMEM;
+
+	mutex_init(&dev->vol_mutex);
+
+	dev->ubi_num = vi->ubi_num;
+	dev->vol_id = vi->vol_id;
+
+	/* Initialize the gendisk of this ubiblock device */
+	gd = alloc_disk(1);
+	if (!gd) {
+		pr_err("alloc_disk failed\n");
+		ret = -ENODEV;
+		goto out_free_dev;
+	}
+
+	gd->fops = &ubiblock_ops;
+	gd->major = ubiblock_major;
+	gd->first_minor = dev->ubi_num * UBI_MAX_VOLUMES + dev->vol_id;
+	gd->private_data = dev;
+	sprintf(gd->disk_name, "ubiblock%d_%d", dev->ubi_num, dev->vol_id);
+	disk_capacity = (vi->size * vi->usable_leb_size) >> 9;
+	set_capacity(gd, disk_capacity);
+	dev->gd = gd;
+
+	spin_lock_init(&dev->queue_lock);
+	dev->rq = blk_init_queue(ubiblock_request, &dev->queue_lock);
+	if (!dev->rq) {
+		pr_err("blk_init_queue failed\n");
+		ret = -ENODEV;
+		goto out_put_disk;
+	}
+
+	dev->rq->queuedata = dev;
+	dev->gd->queue = dev->rq;
+
+	blk_queue_flush(dev->rq, REQ_FLUSH);
+
+	/* TODO: Is performance better or worse with this flag? */
+	/* queue_flag_set_unlocked(QUEUE_FLAG_NONROT, dev->rq); */
+
+	/*
+	 * Create one workqueue per volume (per registered block device).
+	 * Remember workqueues are cheap; they're not threads.
+	 */
+	dev->wq = alloc_workqueue(gd->disk_name, 0, 0);
+	if (!dev->wq) {
+		ret = -ENOMEM;
+		goto out_free_queue;
+	}
+	INIT_WORK(&dev->work, ubiblock_do_work);
+
+	mutex_lock(&devices_mutex);
+	list_add_tail(&dev->list, &ubiblock_devices);
+	mutex_unlock(&devices_mutex);
+
+	/* Must be the last step: anyone can call file ops from now on */
+	add_disk(dev->gd);
+
+	dev_info(disk_to_dev(dev->gd), "created from ubi%d:%d(%s)\n",
+		 dev->ubi_num, dev->vol_id, vi->name);
+
+	return 0;
+
+out_free_queue:
+	blk_cleanup_queue(dev->rq);
+out_put_disk:
+	put_disk(dev->gd);
+out_free_dev:
+	kfree(dev);
+
+	return ret;
+}
+
+static void ubiblock_cleanup(struct ubiblock *dev)
+{
+	del_gendisk(dev->gd);
+	blk_cleanup_queue(dev->rq);
+	put_disk(dev->gd);
+}
+
+static void ubiblock_del(struct ubi_volume_info *vi)
+{
+	struct ubiblock *dev;
+
+	mutex_lock(&devices_mutex);
+	dev = find_dev_nolock(vi->ubi_num, vi->vol_id);
+	if (!dev) {
+		mutex_unlock(&devices_mutex);
+		return;
+	}
+	/* Remove from device list */
+	list_del(&dev->list);
+	mutex_unlock(&devices_mutex);
+
+	/* Flush pending work and stop this workqueue */
+	destroy_workqueue(dev->wq);
+
+	mutex_lock(&dev->vol_mutex);
+
+	/*
+	 * A non-NULL desc means the ubiblock device is open and in use.
+	 * However, this shouldn't happen, since we have called
+	 * ubi_open_volume() at open() time, thus preventing volume
+	 * removal while the device is open.
+	 */
+	WARN_ON(dev->desc);
+	ubiblock_cleanup(dev);
+
+	mutex_unlock(&dev->vol_mutex);
+
+	kfree(dev);
+}
+
+static void ubiblock_resize(struct ubi_volume_info *vi)
+{
+	struct ubiblock *dev;
+	int disk_capacity;
+
+	/*
+	 * We don't touch the list, but we better lock it: it could be that the
+	 * device gets removed between the time the device has been found and
+	 * the time we access dev->gd
+	 */
+	mutex_lock(&devices_mutex);
+	dev = find_dev_nolock(vi->ubi_num, vi->vol_id);
+	if (!dev) {
+		mutex_unlock(&devices_mutex);
+		return;
+	}
+	mutex_unlock(&devices_mutex);
+
+	mutex_lock(&dev->vol_mutex);
+	disk_capacity = (vi->size * vi->usable_leb_size) >> 9;
+	set_capacity(dev->gd, disk_capacity);
+	dev_dbg(disk_to_dev(dev->gd), "resized to %d LEBs\n", vi->size);
+	mutex_unlock(&dev->vol_mutex);
+}
+
+static int ubiblock_notify(struct notifier_block *nb,
+			   unsigned long notification_type, void *ns_ptr)
+{
+	struct ubi_notification *nt = ns_ptr;
+
+	switch (notification_type) {
+	case UBI_VOLUME_ADDED:
+		/* Auto-attach is disabled for now (see the auto_vol TODO) */
+		if (0)
+			ubiblock_add(&nt->vi);
+		break;
+	case UBI_VOLUME_REMOVED:
+		ubiblock_del(&nt->vi);
+		break;
+	case UBI_VOLUME_RESIZED:
+		ubiblock_resize(&nt->vi);
+		break;
+	default:
+		break;
+	}
+	return NOTIFY_OK;
+}
+
+static struct notifier_block ubiblock_notifier = {
+	.notifier_call = ubiblock_notify,
+};
+
+static struct ubi_volume_desc *
+open_volume_desc(const char *name, int ubi_num, int vol_id)
+{
+	if (ubi_num == -1)
+		/* No ubi num, name must be a vol device path */
+		return ubi_open_volume_path(name, UBI_READONLY);
+	else if (vol_id == -1)
+		/* No vol_id, must be vol_name */
+		return ubi_open_volume_nm(ubi_num, name, UBI_READONLY);
+	else
+		return ubi_open_volume(ubi_num, vol_id, UBI_READONLY);
+}
+
+static void ubiblock_register_vol_param(void)
+{
+	int i, ret;
+	struct ubiblock_vol_param *p;
+	struct ubi_volume_desc *desc;
+	struct ubi_volume_info vi;
+
+	for (i = 0; i < ubiblock_devs; i++) {
+		p = &ubiblock_vol_param[i];
+
+		desc = open_volume_desc(p->name, p->ubi_num, p->vol_id);
+		if (IS_ERR(desc))
+			continue;
+
+		ubi_get_volume_info(desc, &vi);
+		ret = ubiblock_add(&vi);
+		if (ret)
+			pr_warn("can't add '%s' volume, err=%d\n",
+				vi.name, ret);
+
+		ubi_close_volume(desc);
+	}
+}
+
+static int __init ubiblock_init(void)
+{
+	ubiblock_major = register_blkdev(0, "ubiblock");
+	if (ubiblock_major < 0)
+		return ubiblock_major;
+
+	ubiblock_register_vol_param();
+
+	/*
+	 * Block devices are registered dynamically:
+	 * each ubi volume gets a corresponding block device.
+	 */
+	return ubi_register_volume_notifier(&ubiblock_notifier, 1);
+}
+
+static void __exit ubiblock_exit(void)
+{
+	struct ubiblock *next;
+	struct ubiblock *dev;
+
+	ubi_unregister_volume_notifier(&ubiblock_notifier);
+
+	list_for_each_entry_safe(dev, next, &ubiblock_devices, list) {
+
+		/* Flush pending work and stop this workqueue */
+		destroy_workqueue(dev->wq);
+
+		/* The module is being forcefully removed */
+		WARN_ON(dev->desc);
+
+		/* Remove from device list */
+		list_del(&dev->list);
+
+		ubiblock_cleanup(dev);
+
+		kfree(dev);
+	}
+
+	unregister_blkdev(ubiblock_major, "ubiblock");
+}
+
+module_init(ubiblock_init);
+module_exit(ubiblock_exit);
+
+MODULE_DESCRIPTION("Block device emulation access to UBI volumes");
+MODULE_AUTHOR("David Wagner");
+MODULE_AUTHOR("Ezequiel Garcia <elezegarcia@gmail.com>");
+MODULE_LICENSE("GPL");
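
For reviewers who want to poke at the two-cache behaviour without
flashing anything, here is a minimal user-space model of the scheme
described in the commit message. It is an illustrative sketch only:
LEB_SIZE and LEB_COUNT are made-up numbers, the "flash" is an
in-memory array, and leb_change() merely stands in for
ubi_leb_change(). Only the fill/flush logic mirrors the driver.

#include <stdio.h>
#include <string.h>

#define LEB_SIZE  8
#define LEB_COUNT 4

enum state { STATE_EMPTY, STATE_CLEAN, STATE_DIRTY };

struct cache {
	char buffer[LEB_SIZE];
	enum state state;
	int leb_num;
};

static char flash[LEB_COUNT][LEB_SIZE];	/* stand-in for the UBI volume */
static struct cache rd = { .state = STATE_EMPTY, .leb_num = -1 };
static struct cache wr = { .state = STATE_EMPTY, .leb_num = -1 };

/* Stand-in for ubi_leb_change(): atomically replace a whole LEB */
static void leb_change(int leb, const char *buf)
{
	printf("flash: leb_change on LEB %d\n", leb);
	memcpy(flash[leb], buf, LEB_SIZE);
}

static void flush(void)
{
	if (wr.state != STATE_DIRTY)
		return;
	wr.state = STATE_CLEAN;
	leb_change(wr.leb_num, wr.buffer);
}

/* Fill @c with @leb, stealing from @aux if it already holds the LEB */
static void fill(struct cache *c, struct cache *aux, int leb)
{
	c->leb_num = leb;
	c->state = STATE_CLEAN;
	if (aux && aux->leb_num == leb) {
		memcpy(c->buffer, aux->buffer, LEB_SIZE);
		aux->leb_num = -1;		/* invalidate the aux cache */
		aux->state = STATE_EMPTY;
	} else {
		printf("flash: read of LEB %d\n", leb);
		memcpy(c->buffer, flash[leb], LEB_SIZE);
	}
}

static char read_byte(int pos)
{
	int leb = pos / LEB_SIZE, off = pos % LEB_SIZE;

	if (rd.leb_num == leb)			/* 1) read cache */
		return rd.buffer[off];
	if (wr.leb_num == leb)			/* 2) write cache */
		return wr.buffer[off];
	fill(&rd, NULL, leb);			/* 3) fill the read cache */
	return rd.buffer[off];
}

static void write_byte(int pos, char val)
{
	int leb = pos / LEB_SIZE, off = pos % LEB_SIZE;

	if (wr.leb_num != leb) {		/* switching LEBs: flush first */
		flush();
		fill(&wr, &rd, leb);
	}
	wr.buffer[off] = val;
	wr.state = STATE_DIRTY;			/* the only place we dirty it */
}

int main(void)
{
	write_byte(0, 'a');		/* fills the write cache with LEB 0 */
	write_byte(1, 'b');		/* same LEB: no flash access */
	printf("read: %c\n", read_byte(0));	/* served by the write cache */
	write_byte(LEB_SIZE, 'c');	/* LEB 1: triggers a flush of LEB 0 */
	flush();			/* like REQ_FLUSH or release() */
	printf("flash[0][0]=%c flash[1][0]=%c\n", flash[0][0], flash[1][0]);
	return 0;
}

Running the model shows exactly two leb_change calls for the five
operations above, which is the access reduction the caches are meant
to provide.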