Patchwork [Q] ext4: the max file size of each case is correct?

login
register
mail settings
Submitter Toshiyuki Okajima
Date Aug. 10, 2010, 2:23 p.m.
Message ID <20100810232321.0eb72247.toshi.okajima@jp.fujitsu.com>
Download mbox | patch
Permalink /patch/61335/
State New
Headers show

Comments

Andreas Dilger - Aug. 10, 2010, 6:35 a.m.
On 2010-08-10, at 10:23, Toshiyuki Okajima wrote:
> Table. the max file size which we can write or seek
>       at each filesystem feature tuning and file flag setting
> +===============================+===============================+
> |                               |                               |
> |     !EXT4_EXTENTS_FL          |        EXT4_EXTETNS_FL        |
> |                               |                               |
> +-------------------------------+-------------------------------+
> |   write:      2194719883264   | write:       --------------   |
> |   seek:       2199023251456   | seek:        --------------   |
> +-------------------------------+-------------------------------+
> |   write:      4402345721856   | write:       17592186044415   |
> |   seek:      17592186044415   | seek:        17592186044415   |
> +-------------------------------+-------------------------------+

Interesting.  The 2TB vs. 4TB difference for block-mapped files is due to the removal of the 2^32 sectors limit imposed by i_blocks, and is not strictly related to extents.

> +loff_t ext4_llseek(struct file *file, loff_t offset, int origin)
> +{
> +	struct inode *inode = file->f_mapping->host;
> +	loff_t maxbytes;
> +
> +	mutex_lock(&inode->i_mutex);
> +	if (!(ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS)))
> +		maxbytes = EXT4_SB(inode->i_sb)->s_bitmap_maxbytes;
> +	else
> +		maxbytes = inode->i_sb->s_maxbytes;

This part doesn't really have to be under i_mutex.  Otherwise, the patch looks reasonable.

> +	switch (origin) {
> +	case SEEK_END:
> +		offset += inode->i_size;
> +		break;
> +	case SEEK_CUR:
> +		if (offset == 0) {
> +			mutex_unlock(&inode->i_mutex);
> +			return file->f_pos;
> +		}
> +		offset += file->f_pos;
> +		break;
> +       	}
> +
> +	if (offset < 0 || offset > maxbytes) {
> +		mutex_unlock(&inode->i_mutex);
> +		return -EINVAL;
> +	}
> +
> +	if (offset != file->f_pos) {
> +		file->f_pos = offset;
> +		file->f_version = 0;
> +	}
> +	mutex_unlock(&inode->i_mutex);
> + 
> +	return offset;
> +}

It's too bad that we have to duplicate the whole generic_file_llseek() code here, but I don't think there is a better solution.  However, it is worthwhile to add a comment to this function like:

/* copied from generic_file_llseek() to handle both block-mapped and
 * extent-mapped maxbytes values.  Should otherwise be identical. */

Cheers, Andreas
--
Andreas Dilger
Lustre Technical Lead
Oracle Corporation Canada Inc.

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Toshiyuki Okajima - Aug. 10, 2010, 2:23 p.m.
Hi.

I have tested the following cases in order to confirm the maximum file size.
For the tests, I selected two parameters:
(These parameters relate to the max file size.)
1) Filesystem Feature
  "extent"
2) File(Inode) Flag
  "EXT4_EXTENTS_FL"
 (This parameter corresponds to the case where people shifts from ext3 into ext4.
  (Files which are created with ext3 have no "EXT4_EXTENTS_FL" flag.))

Table. the max file size which we can write or seek
       at each filesystem feature tuning and file flag setting
+============+===============================+===============================+
| \ File flag|                               |                               |
|      \     |     !EXT4_EXTENTS_FL          |        EXT4_EXTETNS_FL        |
|Fs Features\|                               |                               |
+------------+-------------------------------+-------------------------------+
| !extent    |   write:      2194719883264   | write:       --------------   |
|            |   seek:       2199023251456   | seek:        --------------   |
+------------+-------------------------------+-------------------------------+
|  extent    |   write:      4402345721856   | write:       17592186044415   |
|            |   seek:      17592186044415   | seek:        17592186044415   |
+------------+-------------------------------+-------------------------------+
(
The symbols (!extent, extent) mean: 
 !extent: The filesystem feature "extent" is not set.
	ex.  mkfs.ext3 <dev>; mount -t ext4 <dev>
  extent: The filesystem feature "extent" is set.
	ex.  mkfs.ext3 <dev>; tune2fs -Oextent,huge_file <dev>; mount -t ext4 <dev>
The symbols ("!EXT4_EXTENTS_FL","EXT4_EXTENTS_FL") mean:
 !EXT4_EXTENS_FL: The file flag, "EXT4_EXTENTS_FL" is not set.
  EXT4_EXTENS_FL: The file flag, "EXT4_EXTENTS_FL" is set. 
)

According to the table, if EXT4_EXTETNS_FL flag is not set, the max file
size of write() is different from the one of seek().

These differences don't include in the ext4-specification, do they?
I think the differences are not the ext4-specification.

So, I made a fix patch of the differences.
Toshiyuki Okajima - Aug. 18, 2010, 1:06 a.m.
Hi.

(2010/08/10 15:35), Andreas Dilger wrote:
>
> On 2010-08-10, at 10:23, Toshiyuki Okajima wrote:
<snip>
>> +loff_t ext4_llseek(struct file *file, loff_t offset, int origin)
>> +{
>> +	struct inode *inode = file->f_mapping->host;
>> +	loff_t maxbytes;
>> +
>> +	mutex_lock(&inode->i_mutex);
>> +	if (!(ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS)))
>> +		maxbytes = EXT4_SB(inode->i_sb)->s_bitmap_maxbytes;
>> +	else
>> +		maxbytes = inode->i_sb->s_maxbytes;
>
> This part doesn't really have to be under i_mutex.  Otherwise, the patch looks reasonable.
OK.

>
>> +	switch (origin) {
>> +	case SEEK_END:
>> +		offset += inode->i_size;
>> +		break;
>> +	case SEEK_CUR:
>> +		if (offset == 0) {
>> +			mutex_unlock(&inode->i_mutex);
>> +			return file->f_pos;
>> +		}
>> +		offset += file->f_pos;
>> +		break;
>> +       	}
>> +
>> +	if (offset<  0 || offset>  maxbytes) {
>> +		mutex_unlock(&inode->i_mutex);
>> +		return -EINVAL;
>> +	}
>> +
>> +	if (offset != file->f_pos) {
>> +		file->f_pos = offset;
>> +		file->f_version = 0;
>> +	}
>> +	mutex_unlock(&inode->i_mutex);
>> +
>> +	return offset;
>> +}
>
> It's too bad that we have to duplicate the whole generic_file_llseek() code here, but I don't think there is a better solution.  However, it is worthwhile to add a comment to this function like:
>
> /* copied from generic_file_llseek() to handle both block-mapped and
>   * extent-mapped maxbytes values.  Should otherwise be identical. */
I understand it.

I apply your comments and then rebuild my patch.
Immediately I'll send it.

Thanks,
Toshiyuki Okajima

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

================================================================================
Subject: [PATCH] ext4: create own llseek function to handle the max file size correctly.
From: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com>

If the file has no EXT4_EXTENTS_FL flag, the maximum size which can be written
(write systemcall) is different from the maximum size which can be seeked 
(lseek systemcall).

For example, the following 2 cases show us the differences:
#1: mkfs.ext3 <dev>; mount -t ext4 <dev>
#2: mkfs.ext3 <dev>; tune2fs -Oextent,huge_file <dev>; mount -t ext4 <dev>

Table. the max file size which we can write or seek
       at each filesystem feature tuning and file flag setting
+============+===============================+===============================+
| \ File flag|                               |                               |
|      \     |     !EXT4_EXTENTS_FL          |        EXT4_EXTETNS_FL        |
|case       \|                               |                               |
+------------+-------------------------------+-------------------------------+
| #1         |   write:      2194719883264   | write:       --------------   |
|            |   seek:       2199023251456   | seek:        --------------   |
+------------+-------------------------------+-------------------------------+
| #2         |   write:      4402345721856   | write:       17592186044415   |
|            |   seek:      17592186044415   | seek:        17592186044415   |
+------------+-------------------------------+-------------------------------+

The differences exist because ext4 has 2 max-file-size (sb->s_maxbytes, 
EXT4_SB(sb)->s_bitmap_maxbytes) although generic_file_llseek uses only 
sb->s_maxbytes.  (llseek of ext4_file_operations is generic_file_llseek.)
Therefore we create own llseek function which uses 2 max-file-size.

The new own function originates from generic_file_llseek_nolocked().
If the file flag, "EXT4_EXTENTS_FL" is not set, the function alters 
inode->i_sb->s_maxbytes into EXT4_SB(inode->i_sb)->s_bitmap_maxbytes.

Signed-off-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com>
---
 fs/ext4/dir.c  |    2 +-
 fs/ext4/ext4.h |    1 +
 fs/ext4/file.c |   39 ++++++++++++++++++++++++++++++++++++++-
 3 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/dir.c b/fs/ext4/dir.c
index ea5e6cb..62c9bba 100644
--- a/fs/ext4/dir.c
+++ b/fs/ext4/dir.c
@@ -39,7 +39,7 @@  static int ext4_release_dir(struct inode *inode,
 				struct file *filp);
 
 const struct file_operations ext4_dir_operations = {
-	.llseek		= generic_file_llseek,
+	.llseek		= ext4_llseek,
 	.read		= generic_read_dir,
 	.readdir	= ext4_readdir,		/* we take BKL. needed?*/
 	.unlocked_ioctl = ext4_ioctl,
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 19a4de5..e7050cd 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1870,6 +1870,7 @@  extern const struct file_operations ext4_dir_operations;
 /* file.c */
 extern const struct inode_operations ext4_file_inode_operations;
 extern const struct file_operations ext4_file_operations;
+extern loff_t ext4_llseek(struct file *file, loff_t offset, int origin);
 
 /* namei.c */
 extern const struct inode_operations ext4_dir_inode_operations;
diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 5313ae4..f2e2d57 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -129,8 +129,45 @@  static int ext4_file_open(struct inode * inode, struct file * filp)
 	return dquot_file_open(inode, filp);
 }
 
+loff_t ext4_llseek(struct file *file, loff_t offset, int origin)
+{
+	struct inode *inode = file->f_mapping->host;
+	loff_t maxbytes;
+
+	mutex_lock(&inode->i_mutex);
+	if (!(ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS)))
+		maxbytes = EXT4_SB(inode->i_sb)->s_bitmap_maxbytes;
+	else
+		maxbytes = inode->i_sb->s_maxbytes;
+	switch (origin) {
+	case SEEK_END:
+		offset += inode->i_size;
+		break;
+	case SEEK_CUR:
+		if (offset == 0) {
+			mutex_unlock(&inode->i_mutex);
+			return file->f_pos;
+		}
+		offset += file->f_pos;
+		break;
+       	}
+
+	if (offset < 0 || offset > maxbytes) {
+		mutex_unlock(&inode->i_mutex);
+		return -EINVAL;
+	}
+
+	if (offset != file->f_pos) {
+		file->f_pos = offset;
+		file->f_version = 0;
+	}
+	mutex_unlock(&inode->i_mutex);
+ 
+	return offset;
+}
+
 const struct file_operations ext4_file_operations = {
-	.llseek		= generic_file_llseek,
+	.llseek		= ext4_llseek,
 	.read		= do_sync_read,
 	.write		= do_sync_write,
 	.aio_read	= generic_file_aio_read,