vfs: allow custom EOF in generic_file_llseek code

Message ID 4F9AC770.8080004@redhat.com
State Not Applicable, archived

Commit Message

Eric Sandeen April 27, 2012, 4:21 p.m. UTC
For ext3/4 htree directories, using the vfs llseek function with
SEEK_END goes to i_size like for any other file, but in reality
we want the maximum possible hash value.  Recent changes
in ext4 have cut & pasted generic_file_llseek() back into fs/ext4/dir.c,
but replicating this core code seems like a bad idea, especially
since the copy has already diverged from the vfs.

This patch implements a version of generic_file_llseek which can accept
both a custom maximum offset, and a custom EOF position.  With this
in place, ext4_dir_llseek can pass in the appropriate maximum hash
position for both maxsize and eof, and get what it wants.

As far as I know, this does not fix any bugs - nfs in the kernel
doesn't use SEEK_END, and I don't know of any user who does.  But
some ext4 folks seem keen on doing the right thing here, and I can't
really argue.

(Patch also fixes up some comments slightly)

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
---



--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Eric Sandeen April 27, 2012, 4:28 p.m. UTC | #1
On 4/27/12 11:21 AM, Eric Sandeen wrote:
> For ext3/4 htree directories, using the vfs llseek function with
> SEEK_END goes to i_size like for any other file, but in reality
> we want the maximum possible hash value.  Recent changes
> in ext4 have cut & pasted generic_file_llseek() back into fs/ext4/dir.c,
> but replicating this core code seems like a bad idea, especially
> since the copy has already diverged from the vfs.
> 
> This patch implements a version of generic_file_llseek which can accept
> both a custom maximum offset, and a custom EOF position.  With this
> in place, ext4_dir_llseek can pass in the appropriate maximum hash
> position for both maxsize and eof, and get what it wants.
> 
> As far as I know, this does not fix any bugs - nfs in the kernel
> doesn't use SEEK_END, and I don't know of any user who does.  But
> some ext4 folks seem keen on doing the right thing here, and I can't
> really argue.
> 
> (Patch also fixes up some comments slightly)
> 
> Signed-off-by: Eric Sandeen <sandeen@redhat.com>

I guess I should have done a patch series, although the ext4 patch is so
messy it's hard to read as a patch.  With the new framework in place,
ext4_dir_llseek can just be:

/*
 * ext4_dir_llseek() calls generic_file_llseek_size to handle htree
 * directories, where the "offset" is in terms of the filename hash
 * value instead of the byte offset.
 *
 * Because we may return a 64-bit hash that is well beyond offset limits,
 * we need to pass the max hash as the maximum allowable offset in
 * the htree directory case.
 *
 * For non-htree, ext4_llseek already chooses the proper max offset.
 */
loff_t ext4_dir_llseek(struct file *file, loff_t offset, int origin)
{
        struct inode *inode = file->f_mapping->host;
        int dx_dir = is_dx_dir(inode);
        loff_t htree_max = ext4_get_htree_eof(file);

        if (likely(dx_dir))
                return generic_file_llseek_size_eof(file, offset, origin,
                                                    htree_max, htree_max);
        else
                return ext4_llseek(file, offset, origin);
}


-Eric
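
[Editorial sketch] The dispatch described above can be modeled in user space to show how the two extra arguments behave. This is only an illustration under stated assumptions, not the kernel code: struct model_file, model_llseek_size_eof() and model_dir_llseek() are hypothetical names, SEEK_DATA/SEEK_HOLE and overflow handling are left out, and the non-htree branch simply uses a filesystem-wide limit where the real code calls ext4_llseek().

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h> /* SEEK_SET, SEEK_CUR, SEEK_END */

/* Hypothetical stand-in for struct file and its inode. */
struct model_file {
	int64_t pos;		/* current file position */
	int64_t i_size;		/* byte size of the file */
	int64_t s_maxbytes;	/* filesystem-wide maximum offset */
	int is_htree;		/* nonzero for a hashed directory */
};

/* Core seek: the caller supplies both the upper limit and the SEEK_END base. */
static int64_t model_llseek_size_eof(struct model_file *f, int64_t offset,
				     int origin, int64_t maxsize, int64_t eof)
{
	switch (origin) {
	case SEEK_SET:
		break;
	case SEEK_CUR:
		offset += f->pos;
		break;
	case SEEK_END:
		offset += eof;	/* caller-supplied EOF instead of i_size */
		break;
	default:
		return -EINVAL;
	}

	if (offset < 0 || offset > maxsize)
		return -EINVAL;
	f->pos = offset;
	return offset;
}

/* Directory seek: hashed directories pass hash_max as both limit and EOF. */
static int64_t model_dir_llseek(struct model_file *f, int64_t offset,
				int origin, int64_t hash_max)
{
	if (f->is_htree)
		return model_llseek_size_eof(f, offset, origin,
					     hash_max, hash_max);
	/* Assumption: the linear case falls back to byte size and fs limit. */
	return model_llseek_size_eof(f, offset, origin,
				     f->s_maxbytes, f->i_size);
}
```

With a hashed directory, a SEEK_END of 0 lands on hash_max and anything past it is rejected with -EINVAL; with is_htree cleared, the same call lands on i_size.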

Bernd Schubert April 27, 2012, 10:47 p.m. UTC | #2
On 04/27/2012 06:21 PM, Eric Sandeen wrote:
> For ext3/4 htree directories, using the vfs llseek function with
> SEEK_END goes to i_size like for any other file, but in reality
> we want the maximum possible hash value.  Recent changes
> in ext4 have cut & pasted generic_file_llseek() back into fs/ext4/dir.c,
> but replicating this core code seems like a bad idea, especially
> since the copy has already diverged from the vfs.
> 
> This patch implements a version of generic_file_llseek which can accept
> both a custom maximum offset, and a custom EOF position.  With this
> in place, ext4_dir_llseek can pass in the appropriate maximum hash
> position for both maxsize and eof, and get what it wants.
> 
> As far as I know, this does not fix any bugs - nfs in the kernel
> doesn't use SEEK_END, and I don't know of any user who does.  But
> some ext4 folks seem keen on doing the right thing here, and I can't
> really argue.
> 
> (Patch also fixes up some comments slightly)
> 
> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
> ---
> 
> 
> diff --git a/fs/read_write.c b/fs/read_write.c
> index ffc99d2..ecd1828 100644
> --- a/fs/read_write.c
> +++ b/fs/read_write.c
> @@ -51,14 +51,15 @@ static loff_t lseek_execute(struct file *file, struct inode *inode,
>  }
>  
>  /**
> - * generic_file_llseek_size - generic llseek implementation for regular files
> + * generic_file_llseek_size_eof - generic llseek implementation for regular files
>   * @file:	file structure to seek on
>   * @offset:	file offset to seek to
>   * @origin:	type of seek
> - * @size:	max size of file system
> + * @size:	max size of this file in file system
> + * @eof:	offset used for SEEK_END position
>   *
>   * This is a variant of generic_file_llseek that allows passing in a custom
> - * file size.
> + * maximum file size and a custom EOF position, for e.g. hashed directories
>   *
>   * Synchronization:
>   * SEEK_SET and SEEK_END are unsynchronized (but atomic on 64bit platforms)
> @@ -66,14 +67,14 @@ static loff_t lseek_execute(struct file *file, struct inode *inode,
>   * read/writes behave like SEEK_SET against seeks.
>   */
>  loff_t
> -generic_file_llseek_size(struct file *file, loff_t offset, int origin,
> -		loff_t maxsize)
> +generic_file_llseek_size_eof(struct file *file, loff_t offset, int origin,
> +		loff_t maxsize, loff_t eof)
>  {
>  	struct inode *inode = file->f_mapping->host;
>  
>  	switch (origin) {
>  	case SEEK_END:
> -		offset += i_size_read(inode);
> +		offset += eof;
>  		break;

Here is the only glitch I can see. As Andreas already said before, it
might overflow here. Do we need to care about that? As you already said,
SEEK_END is unlikely to be ever called for directories. But then we also
cannot keep user space from doing weird calls...


>  	case SEEK_CUR:
>  		/*
> @@ -99,7 +100,7 @@ generic_file_llseek_size(struct file *file, loff_t offset, int origin,
>  		 * In the generic case the entire file is data, so as long as
>  		 * offset isn't at the end of the file then the offset is data.
>  		 */
> -		if (offset >= i_size_read(inode))
> +		if (offset >= eof)
>  			return -ENXIO;
>  		break;
>  	case SEEK_HOLE:
> @@ -107,14 +108,35 @@ generic_file_llseek_size(struct file *file, loff_t offset, int origin,
>  		 * There is a virtual hole at the end of the file, so as long as
>  		 * offset isn't i_size or larger, return i_size.
>  		 */
> -		if (offset >= i_size_read(inode))
> +		if (offset >= eof)
>  			return -ENXIO;
> -		offset = i_size_read(inode);
> +		offset = eof;
>  		break;
>  	}
>  
>  	return lseek_execute(file, inode, offset, maxsize);
>  }
> +EXPORT_SYMBOL(generic_file_llseek_size_eof);
> +
> +/**
> + * generic_file_llseek_size - generic llseek implementation for regular files
> + * @file:	file structure to seek on
> + * @offset:	file offset to seek to
> + * @origin:	type of seek
> + * @size:	max size of this file in file system
> + *
> + * This is a variant of generic_file_llseek that allows passing in a custom
> + * maximum file size.
> + */
> +loff_t
> +generic_file_llseek_size(struct file *file, loff_t offset, int origin,
> +		loff_t maxsize)
> +{
> +	struct inode *inode = file->f_mapping->host;
> +
> +	return generic_file_llseek_size_eof(file, offset, origin, maxsize,
> +					i_size_read(inode));
> +}
>  EXPORT_SYMBOL(generic_file_llseek_size);
>  
>  /**
> @@ -131,8 +153,9 @@ loff_t generic_file_llseek(struct file *file, loff_t offset, int origin)
>  {
>  	struct inode *inode = file->f_mapping->host;
>  
> -	return generic_file_llseek_size(file, offset, origin,
> -					inode->i_sb->s_maxbytes);
> +	return generic_file_llseek_size_eof(file, offset, origin,
> +					inode->i_sb->s_maxbytes,
> +					i_size_read(inode));
>  }
>  EXPORT_SYMBOL(generic_file_llseek);
>  
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 8de6755..a6ae7a4 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2402,6 +2402,8 @@ extern loff_t no_llseek(struct file *file, loff_t offset, int origin);
>  extern loff_t generic_file_llseek(struct file *file, loff_t offset, int origin);
>  extern loff_t generic_file_llseek_size(struct file *file, loff_t offset,
>  		int origin, loff_t maxsize);
> +extern loff_t generic_file_llseek_size_eof(struct file *file, loff_t offset,
> +		int origin, loff_t maxsize, loff_t eof);
>  extern int generic_file_open(struct inode * inode, struct file * filp);
>  extern int nonseekable_open(struct inode * inode, struct file * filp);

Another question, wouldn't it be better to entirely move
generic_file_llseek_size() and generic_file_llseek() into fs.h to make
sure it gets inlined?


Cheers,
Bernd
Eric Sandeen April 27, 2012, 11:01 p.m. UTC | #3
On 4/27/12 5:47 PM, Bernd Schubert wrote:
> On 04/27/2012 06:21 PM, Eric Sandeen wrote:

...

>> +generic_file_llseek_size_eof(struct file *file, loff_t offset, int origin,
>> +		loff_t maxsize, loff_t eof)
>>  {
>>  	struct inode *inode = file->f_mapping->host;
>>  
>>  	switch (origin) {
>>  	case SEEK_END:
>> -		offset += i_size_read(inode);
>> +		offset += eof;
>>  		break;
> 
> Here is the only glitch I can see. As Andreas already said before, it
> might overflow here. Do we need to care about that? As you already said,
> SEEK_END is unlikely to be ever called for directories. But then we also
> cannot keep user space from doing weird calls...

It can happen already today, for a sufficiently large file offset.

# ls -l reallybigfile 
-rw-r--r--. 1 root root 9223372036854775807 Apr 27 18:02 reallybigfile

(that's 2^63 - 1)

so overflow protection may be warranted in here, but I think it's a separate problem.

...

>> diff --git a/include/linux/fs.h b/include/linux/fs.h
>> index 8de6755..a6ae7a4 100644
>> --- a/include/linux/fs.h
>> +++ b/include/linux/fs.h
>> @@ -2402,6 +2402,8 @@ extern loff_t no_llseek(struct file *file, loff_t offset, int origin);
>>  extern loff_t generic_file_llseek(struct file *file, loff_t offset, int origin);
>>  extern loff_t generic_file_llseek_size(struct file *file, loff_t offset,
>>  		int origin, loff_t maxsize);
>> +extern loff_t generic_file_llseek_size_eof(struct file *file, loff_t offset,
>> +		int origin, loff_t maxsize, loff_t eof);
>>  extern int generic_file_open(struct inode * inode, struct file * filp);
>>  extern int nonseekable_open(struct inode * inode, struct file * filp);
> 
> Another question, wouldn't it be better to entirely move
> generic_file_llseek_size() and generic_file_llseek() into fs.h to make
> sure it gets inlined?

Hm, perhaps.  It wasn't done for generic_file_llseek_size() so I just followed
that example, for now.

-Eric
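
[Editorial sketch] The overflow raised in this exchange comes from adding a user-supplied offset to a base (the EOF or the current position) that may already be close to 2^63 - 1. One way such protection could look is sketched below in user-space C. It is an assumption for illustration, not what the patch does: checked_seek_add() is a hypothetical helper, and __builtin_add_overflow is a GCC/Clang builtin rather than a kernel interface.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Add a seek offset to a base position without wrapping.
 * Returns 0 and stores the sum in *result on success,
 * or -EOVERFLOW if the sum does not fit in a signed 64-bit value.
 */
static int checked_seek_add(int64_t base, int64_t offset, int64_t *result)
{
	if (__builtin_add_overflow(base, offset, result))
		return -EOVERFLOW;
	return 0;
}
```

For the reallybigfile case above, a base of 2^63 - 1 plus an offset of 1 would be reported as -EOVERFLOW instead of wrapping to a negative position.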
Matthew Wilcox April 28, 2012, 6:33 p.m. UTC | #4
On Fri, Apr 27, 2012 at 11:21:04AM -0500, Eric Sandeen wrote:
> As far as I know, this does not fix any bugs - nfs in the kernel
> doesn't use SEEK_END, and I don't know of any user who does.  But
> some ext4 folks seem keen on doing the right thing here, and I can't
> really argue.

I like it.  In particular it removes a lot of calls to i_size_read() which
may have some nice benefits on 32-bit systems.  

However, there is only one call to generic_file_llseek_size() in the
kernel (and it's in ext4!)  I would suggest simply changing the prototype
of generic_file_llseek_size ... or if you insist, just renaming it to
generic_file_llseek_size_eof().
Eric Sandeen April 30, 2012, 2:17 p.m. UTC | #5
On 4/28/12 1:33 PM, Matthew Wilcox wrote:
> On Fri, Apr 27, 2012 at 11:21:04AM -0500, Eric Sandeen wrote:
>> As far as I know, this does not fix any bugs - nfs in the kernel
>> doesn't use SEEK_END, and I don't know of any user who does.  But
>> some ext4 folks seem keen on doing the right thing here, and I can't
>> really argue.
> 
> I like it.  In particular it removes a lot of calls to i_size_read() which
> may have some nice benefits on 32-bit systems.  
> 
> However, there is only one call to generic_file_llseek_size() in the
> kernel (and it's in ext4!)  I would suggest simply changing the prototype
> of generic_file_llseek_size ... or if you insist, just renaming it to
> generic_file_llseek_size_eof().

Ok, if both users are only in ext* you're right, probably no need to have
both variants with 2 kinds of special sauce.  Is it cool to change the
prototype of the existing function, or should I rename it?  I guess since
it'll properly break any current users in obvious ways, I could just
add a new argument to the _size() variant.  I'll send a V2.

Thanks,
-Eric

Patch

diff --git a/fs/read_write.c b/fs/read_write.c
index ffc99d2..ecd1828 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -51,14 +51,15 @@  static loff_t lseek_execute(struct file *file, struct inode *inode,
 }
 
 /**
- * generic_file_llseek_size - generic llseek implementation for regular files
+ * generic_file_llseek_size_eof - generic llseek implementation for regular files
  * @file:	file structure to seek on
  * @offset:	file offset to seek to
  * @origin:	type of seek
- * @size:	max size of file system
+ * @size:	max size of this file in file system
+ * @eof:	offset used for SEEK_END position
  *
  * This is a variant of generic_file_llseek that allows passing in a custom
- * file size.
+ * maximum file size and a custom EOF position, for e.g. hashed directories
  *
  * Synchronization:
  * SEEK_SET and SEEK_END are unsynchronized (but atomic on 64bit platforms)
@@ -66,14 +67,14 @@  static loff_t lseek_execute(struct file *file, struct inode *inode,
  * read/writes behave like SEEK_SET against seeks.
  */
 loff_t
-generic_file_llseek_size(struct file *file, loff_t offset, int origin,
-		loff_t maxsize)
+generic_file_llseek_size_eof(struct file *file, loff_t offset, int origin,
+		loff_t maxsize, loff_t eof)
 {
 	struct inode *inode = file->f_mapping->host;
 
 	switch (origin) {
 	case SEEK_END:
-		offset += i_size_read(inode);
+		offset += eof;
 		break;
 	case SEEK_CUR:
 		/*
@@ -99,7 +100,7 @@  generic_file_llseek_size(struct file *file, loff_t offset, int origin,
 		 * In the generic case the entire file is data, so as long as
 		 * offset isn't at the end of the file then the offset is data.
 		 */
-		if (offset >= i_size_read(inode))
+		if (offset >= eof)
 			return -ENXIO;
 		break;
 	case SEEK_HOLE:
@@ -107,14 +108,35 @@  generic_file_llseek_size(struct file *file, loff_t offset, int origin,
 		 * There is a virtual hole at the end of the file, so as long as
 		 * offset isn't i_size or larger, return i_size.
 		 */
-		if (offset >= i_size_read(inode))
+		if (offset >= eof)
 			return -ENXIO;
-		offset = i_size_read(inode);
+		offset = eof;
 		break;
 	}
 
 	return lseek_execute(file, inode, offset, maxsize);
 }
+EXPORT_SYMBOL(generic_file_llseek_size_eof);
+
+/**
+ * generic_file_llseek_size - generic llseek implementation for regular files
+ * @file:	file structure to seek on
+ * @offset:	file offset to seek to
+ * @origin:	type of seek
+ * @size:	max size of this file in file system
+ *
+ * This is a variant of generic_file_llseek that allows passing in a custom
+ * maximum file size.
+ */
+loff_t
+generic_file_llseek_size(struct file *file, loff_t offset, int origin,
+		loff_t maxsize)
+{
+	struct inode *inode = file->f_mapping->host;
+
+	return generic_file_llseek_size_eof(file, offset, origin, maxsize,
+					i_size_read(inode));
+}
 EXPORT_SYMBOL(generic_file_llseek_size);
 
 /**
@@ -131,8 +153,9 @@  loff_t generic_file_llseek(struct file *file, loff_t offset, int origin)
 {
 	struct inode *inode = file->f_mapping->host;
 
-	return generic_file_llseek_size(file, offset, origin,
-					inode->i_sb->s_maxbytes);
+	return generic_file_llseek_size_eof(file, offset, origin,
+					inode->i_sb->s_maxbytes,
+					i_size_read(inode));
 }
 EXPORT_SYMBOL(generic_file_llseek);
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 8de6755..a6ae7a4 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2402,6 +2402,8 @@  extern loff_t no_llseek(struct file *file, loff_t offset, int origin);
 extern loff_t generic_file_llseek(struct file *file, loff_t offset, int origin);
 extern loff_t generic_file_llseek_size(struct file *file, loff_t offset,
 		int origin, loff_t maxsize);
+extern loff_t generic_file_llseek_size_eof(struct file *file, loff_t offset,
+		int origin, loff_t maxsize, loff_t eof);
 extern int generic_file_open(struct inode * inode, struct file * filp);
 extern int nonseekable_open(struct inode * inode, struct file * filp);