Patchwork [v4] mtd: Do not corrupt backing device of device node inode

login
register
mail settings
Submitter Kirill A. Shutemov
Date May 5, 2010, 3:40 p.m.
Message ID <1273074054-11849-1-git-send-email-kirill@shutemov.name>
Download mbox | patch
Permalink /patch/51704/
State New
Headers show

Comments

Kirill A. Shutemov - May 5, 2010, 3:40 p.m.
We cannot modify file->f_mapping->backing_dev_info, because it will corrupt
backing device of device node inode, since file->f_mapping is equal to
inode->i_mapping (see __dentry_open() in fs/open.c).

Let's introduce separate inode for MTD device with appropriate backing
device.

Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
---
Changelog:
 v3 -> v4:
  - Use igrab() instead of __iget inside the inode_lock;
  - Add stable@ to CC list.
 v2 -> v3:
  - Rebase to mtd-2.6.git.
 v1 -> v2:
  - Fix error handling based on comments by Jan Kara.

---
 drivers/mtd/mtdchar.c   |   74 +++++++++++++++++++++++++++++++++++++++++-----
 drivers/mtd/mtdcore.c   |    3 ++
 include/linux/mtd/mtd.h |    3 ++
 3 files changed, 72 insertions(+), 8 deletions(-)
Artem Bityutskiy - May 6, 2010, 6:39 a.m.
On Wed, 2010-05-05 at 18:40 +0300, Kirill A. Shutemov wrote:
> We cannot modify file->f_mapping->backing_dev_info, because it will corrupt
> backing device of device node inode, since file->f_mapping is equal to
> inode->i_mapping (see __dentry_open() in fs/open.c).
> 
> Let's introduce separate inode for MTD device with appropriate backing
> device.
> 
> Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
> ---
> Changelog:
>  v3 -> v4:
>   - Use igrab() instead of __iget inside the inode_lock;
>   - Add stable@ to CC list.
>  v2 -> v3:
>   - Rebase to mtd-2.6.git.
>  v1 -> v2:
>   - Fix error handling based on comments by Jan Kara.

Pushed to l2-mtd-2.6.git / dunno

Also added Jan's ack which disappeared in v4
David Woodhouse - May 14, 2010, 1:04 a.m.
On Wed, 2010-05-05 at 18:40 +0300, Kirill A. Shutemov wrote:
> We cannot modify file->f_mapping->backing_dev_info, because it will corrupt
> backing device of device node inode, since file->f_mapping is equal to
> inode->i_mapping (see __dentry_open() in fs/open.c).
> 
> Let's introduce separate inode for MTD device with appropriate backing
> device.

I hate the fact that we have to do this -- is it really the only option?

Is it _just_ for the backing_device_info? Can't that be done
differently?

> @@ -85,11 +88,27 @@ static int mtd_open(struct inode *inode, struct file *file)
>  		goto out;
>  	}
>  
> -	if (mtd->backing_dev_info)
> -		file->f_mapping->backing_dev_info = mtd->backing_dev_info;
> +	if (!mtd->inode) {
> +		mtd->inode = new_inode(mtd_inode_mnt->mnt_sb);

I believe that would be a race condition, if it wasn't for the BKL.

And what happens when you close the chardevice and call iput() on the
inode so it's destroyed, and then you re-open the device? You never set
mtd->inode = NULL, so won't it now try to igrab a stale pointer?

You won't have seen this in your testing unless you made it prune the
icache between the close and open calls.
Kirill A. Shutemov - May 17, 2010, 1:19 p.m.
On Fri, May 14, 2010 at 4:04 AM, David Woodhouse <dwmw2@infradead.org> wrote:
> On Wed, 2010-05-05 at 18:40 +0300, Kirill A. Shutemov wrote:
>> We cannot modify file->f_mapping->backing_dev_info, because it will corrupt
>> backing device of device node inode, since file->f_mapping is equal to
>> inode->i_mapping (see __dentry_open() in fs/open.c).
>>
>> Let's introduce separate inode for MTD device with appropriate backing
>> device.
>
> I hate the fact that we have to do this -- is it really the only option?
>
> Is it _just_ for the backing_device_info? Can't that be done
> differently?

Yes, it's ugly, but I don't see options.

>> @@ -85,11 +88,27 @@ static int mtd_open(struct inode *inode, struct file *file)
>>               goto out;
>>       }
>>
>> -     if (mtd->backing_dev_info)
>> -             file->f_mapping->backing_dev_info = mtd->backing_dev_info;
>> +     if (!mtd->inode) {
>> +             mtd->inode = new_inode(mtd_inode_mnt->mnt_sb);
>
> I believe that would be a race condition, if it wasn't for the BKL.

Ok, I'll fix it.

> And what happens when you close the chardevice and call iput() on the
> inode so it's destroyed, and then you re-open the device? You never set
> mtd->inode = NULL, so won't it now try to igrab a stale pointer?

inode destroys only on del_mtd_device() so it's safe to re-open chardevice.

> You won't have seen this in your testing unless you made it prune the
> icache between the close and open calls.
>
> --
> David Woodhouse                            Open Source Technology Centre
> David.Woodhouse@intel.com                              Intel Corporation
>
>
Jan Kara - May 17, 2010, 9:56 p.m.
On Fri 14-05-10 02:04:34, David Woodhouse wrote:
> On Wed, 2010-05-05 at 18:40 +0300, Kirill A. Shutemov wrote:
> > We cannot modify file->f_mapping->backing_dev_info, because it will corrupt
> > backing device of device node inode, since file->f_mapping is equal to
> > inode->i_mapping (see __dentry_open() in fs/open.c).
> > 
> > Let's introduce separate inode for MTD device with appropriate backing
> > device.
> 
> I hate the fact that we have to do this -- is it really the only option?
> 
> Is it _just_ for the backing_device_info? Can't that be done
> differently?
  Well, if I understand the problem MTD tries to solve, what you really need
is that file->f_mapping->backing_dev_info points to your structure so that
you can specify the capability of backing device to support mmap and
whatever else. What I'm not sure about is, why you cannot have this
backing_dev_info directly in the original device inode but since this is
the problem you are originally trying to solve, I guess you have some good
reason for that.
  So with this requirement, you have to at least setup complete struct
address_space to which f_mapping can point. This address_space has to be
linked (via mapping->host) to some inode. So you could point i_mapping
to your address_space structure if that would work for you. But this only
has a reasonable chance to work if you would somehow tie the lifetime
of your address_space with the lifetime of your device inode (code in
block_dev.c does something like this because all inodes which represent
the same block block device share one address_space). Moreover you would
have to do all the address_space initialization inode_init_always does (or
probably split out the mapping initialization from inode_init_always and
call it from MTD code). So I'm not sure it's really better.
  When you decide you don't want to take care about proper setup of
address_space and refcounting and whatever, you have to create a full
inode. But this inode has to live in some filesystem -> what Kirill did is
unavoidable in this case...

									Honza

Patch

diff --git a/drivers/mtd/mtdchar.c b/drivers/mtd/mtdchar.c
index c355491..7f4634e 100644
--- a/drivers/mtd/mtdchar.c
+++ b/drivers/mtd/mtdchar.c
@@ -15,12 +15,15 @@ 
 #include <linux/smp_lock.h>
 #include <linux/backing-dev.h>
 #include <linux/compat.h>
+#include <linux/mount.h>
 
 #include <linux/mtd/mtd.h>
 #include <linux/mtd/compatmac.h>
 
 #include <asm/uaccess.h>
 
+#define MTD_INODE_FS_MAGIC 0x11307854
+static struct vfsmount *mtd_inode_mnt __read_mostly;
 
 /*
  * Data structure to hold the pointer to the mtd device as well
@@ -85,11 +88,27 @@  static int mtd_open(struct inode *inode, struct file *file)
 		goto out;
 	}
 
-	if (mtd->backing_dev_info)
-		file->f_mapping->backing_dev_info = mtd->backing_dev_info;
+	if (!mtd->inode) {
+		mtd->inode = new_inode(mtd_inode_mnt->mnt_sb);
+		if (!mtd->inode) {
+			put_mtd_device(mtd);
+			ret = -ENOMEM;
+			goto out;
+		}
+		mtd->inode->i_mode = S_IFCHR;
+		mtd->inode->i_rdev = inode->i_rdev;
+		if (mtd->backing_dev_info) {
+			mtd->inode->i_data.backing_dev_info =
+				mtd->backing_dev_info;
+		}
+	}
+
+	igrab(mtd->inode);
+	file->f_mapping = mtd->inode->i_mapping;
 
 	/* You can't open it RW if it's not a writeable device */
 	if ((file->f_mode & FMODE_WRITE) && !(mtd->flags & MTD_WRITEABLE)) {
+		iput(mtd->inode);
 		put_mtd_device(mtd);
 		ret = -EACCES;
 		goto out;
@@ -97,6 +116,7 @@  static int mtd_open(struct inode *inode, struct file *file)
 
 	mfi = kzalloc(sizeof(*mfi), GFP_KERNEL);
 	if (!mfi) {
+		iput(mtd->inode);
 		put_mtd_device(mtd);
 		ret = -ENOMEM;
 		goto out;
@@ -122,6 +142,8 @@  static int mtd_close(struct inode *inode, struct file *file)
 	if ((file->f_mode & FMODE_WRITE) && mtd->sync)
 		mtd->sync(mtd);
 
+	iput(mtd->inode);
+
 	put_mtd_device(mtd);
 	file->private_data = NULL;
 	kfree(mfi);
@@ -951,22 +973,58 @@  static const struct file_operations mtd_fops = {
 #endif
 };
 
+static int mtd_inodefs_get_sb(struct file_system_type *fs_type, int flags,
+                               const char *dev_name, void *data,
+                               struct vfsmount *mnt)
+{
+        return get_sb_pseudo(fs_type, "mtd_inode:", NULL, MTD_INODE_FS_MAGIC,
+                             mnt);
+}
+
+static struct file_system_type mtd_inodefs_type = {
+       .name = "mtd_inodefs",
+       .get_sb = mtd_inodefs_get_sb,
+       .kill_sb = kill_anon_super,
+};
+
 static int __init init_mtdchar(void)
 {
-	int status;
+	int ret;
 
-	status = __register_chrdev(MTD_CHAR_MAJOR, 0, 1 << MINORBITS,
+	ret = __register_chrdev(MTD_CHAR_MAJOR, 0, 1 << MINORBITS,
 				   "mtd", &mtd_fops);
-	if (status < 0) {
-		printk(KERN_NOTICE "Can't allocate major number %d for Memory Technology Devices.\n",
-		       MTD_CHAR_MAJOR);
+	if (ret < 0) {
+		pr_notice("Can't allocate major number %d for "
+				"Memory Technology Devices.\n", MTD_CHAR_MAJOR);
+		return ret;
 	}
 
-	return status;
+	ret = register_filesystem(&mtd_inodefs_type);
+	if (ret) {
+		pr_notice("Can't register mtd_inodefs filesystem: %d\n", ret);
+		goto err_unregister_chdev;
+	}
+
+	mtd_inode_mnt = kern_mount(&mtd_inodefs_type);
+	if (IS_ERR(mtd_inode_mnt)) {
+		ret = PTR_ERR(mtd_inode_mnt);
+		pr_notice("Error mounting mtd_inodefs filesystem: %d\n", ret);
+		goto err_unregister_filesystem;
+	}
+
+	return ret;
+
+err_unregister_filesystem:
+	unregister_filesystem(&mtd_inodefs_type);
+err_unregister_chdev:
+	unregister_chrdev(MTD_CHAR_MAJOR, "mtd");
+	return ret;
 }
 
 static void __exit cleanup_mtdchar(void)
 {
+	mntput(mtd_inode_mnt);
+	unregister_filesystem(&mtd_inodefs_type);
 	__unregister_chrdev(MTD_CHAR_MAJOR, 0, 1 << MINORBITS, "mtd");
 }
 
diff --git a/drivers/mtd/mtdcore.c b/drivers/mtd/mtdcore.c
index 70a7858..fa424ad 100644
--- a/drivers/mtd/mtdcore.c
+++ b/drivers/mtd/mtdcore.c
@@ -369,6 +369,9 @@  int del_mtd_device (struct mtd_info *mtd)
 		       mtd->index, mtd->name, mtd->usecount);
 		ret = -EBUSY;
 	} else {
+		if (mtd->inode)
+			iput(mtd->inode);
+
 		device_unregister(&mtd->dev);
 
 		idr_remove(&mtd_idr, mtd->index);
diff --git a/include/linux/mtd/mtd.h b/include/linux/mtd/mtd.h
index 5326435..d66409a 100644
--- a/include/linux/mtd/mtd.h
+++ b/include/linux/mtd/mtd.h
@@ -12,6 +12,7 @@ 
 #include <linux/uio.h>
 #include <linux/notifier.h>
 #include <linux/device.h>
+#include <linux/fs.h>
 
 #include <linux/mtd/compatmac.h>
 #include <mtd/mtd-abi.h>
@@ -174,6 +175,8 @@  struct mtd_info {
 	 */
 	struct backing_dev_info *backing_dev_info;
 
+	/* inode for mtd device */
+	struct inode *inode;
 
 	int (*read) (struct mtd_info *mtd, loff_t from, size_t len, size_t *retlen, u_char *buf);
 	int (*write) (struct mtd_info *mtd, loff_t to, size_t len, size_t *retlen, const u_char *buf);