ext4: Allocate entire range in zero range

Message ID 1424278168-13711-1-git-send-email-lczerner@redhat.com
State Accepted, archived
Headers show

Commit Message

Lukas Czerner Feb. 18, 2015, 4:49 p.m. UTC
Currently there is a bug in the zero range code which causes zero range
calls to allocate only the block-aligned portion of the range, ignoring
the rest in some cases.

In some cases, namely when the end of the range is past i_size, we do
attempt to preallocate the last nonaligned block. However, this might
cause the kernel to BUG() on some carefully crafted zero range requests
on setups where page size > block size.

Fix this problem by first preallocating the entire range, including the
nonaligned edges, and then converting the written extents to unwritten
in the next step. This approach also has the advantage of keeping the
range as linearly contiguous as possible.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
---
 fs/ext4/extents.c | 31 +++++++++++++++++++------------
 1 file changed, 19 insertions(+), 12 deletions(-)
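
For illustration, a minimal user-space sketch of the block rounding the
fix relies on; the blkbits/offset/len values below are hypothetical
examples and the program is not part of the patch:

/*
 * Sketch only: compute the block range covering [offset, offset + len),
 * including the unaligned edges, the same way the patch does with
 * round_down()/round_up().
 */
#include <stdio.h>

int main(void)
{
	unsigned int blkbits = 10;                    /* 1k blocks, 4k pages */
	unsigned long long blksize = 1ULL << blkbits;
	unsigned long long offset = 1234;             /* unaligned start */
	unsigned long long len = 5000;                /* unaligned length */

	/* round_down(offset, blksize) and round_up(offset + len, blksize) */
	unsigned long long start = offset & ~(blksize - 1);
	unsigned long long end = (offset + len + blksize - 1) & ~(blksize - 1);

	printf("first block %llu, block count %llu\n",
	       start >> blkbits, (end - start) >> blkbits);
	return 0;
}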

Comments

Dave Chinner Feb. 19, 2015, 12:49 a.m. UTC | #1
On Wed, Feb 18, 2015 at 05:49:28PM +0100, Lukas Czerner wrote:
> Currently there is a bug in the zero range code which causes zero range
> calls to allocate only the block-aligned portion of the range, ignoring
> the rest in some cases.
> 
> In some cases, namely when the end of the range is past i_size, we do
> attempt to preallocate the last nonaligned block. However, this might
> cause the kernel to BUG() on some carefully crafted zero range requests
> on setups where page size > block size.

Is there a regression test you could write to exercise these cases
in future?

Cheers,

Dave.
Lukas Czerner Feb. 19, 2015, 11:31 a.m. UTC | #2
Hi Dave,

Yes, I am planning to send a regression test for this case as well.

Thanks!
-Lukas
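
A rough sketch of the kind of reproducer such a test could be built
around, assuming a filesystem with block size smaller than the page size
(e.g. 1k blocks with 4k pages); the path and offsets are examples only,
not the exact values from the original report:

/* Reproducer sketch, not the actual xfstests case. */
#define _GNU_SOURCE
#include <err.h>
#include <fcntl.h>
#include <unistd.h>
#include <linux/falloc.h>

int main(void)
{
	int fd = open("/mnt/testfile", O_RDWR | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		err(1, "open");

	/* keep i_size small so the zero range ends past EOF */
	if (ftruncate(fd, 1024))
		err(1, "ftruncate");

	/* block-unaligned zero range reaching past i_size */
	if (fallocate(fd, FALLOC_FL_ZERO_RANGE | FALLOC_FL_KEEP_SIZE,
		      500, 3000))
		err(1, "fallocate");

	close(fd);
	return 0;
}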

Lukas Czerner March 5, 2015, 11:43 a.m. UTC | #3
Eric, can I get some review on this one?

Ted, I think this is quite critical, could you please ACK this one
so we know whether it's going in or not?

Thanks!
-Lukas

Theodore Ts'o April 3, 2015, 4:11 a.m. UTC | #4
Thanks, applied.  Apologies for the delay.

						- Ted

Patch

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index bed4308..aa52242 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -4803,12 +4803,6 @@  static long ext4_zero_range(struct file *file, loff_t offset,
 	else
 		max_blocks -= lblk;
 
-	flags = EXT4_GET_BLOCKS_CREATE_UNWRIT_EXT |
-		EXT4_GET_BLOCKS_CONVERT_UNWRITTEN |
-		EXT4_EX_NOCACHE;
-	if (mode & FALLOC_FL_KEEP_SIZE)
-		flags |= EXT4_GET_BLOCKS_KEEP_SIZE;
-
 	mutex_lock(&inode->i_mutex);
 
 	/*
@@ -4825,15 +4819,28 @@  static long ext4_zero_range(struct file *file, loff_t offset,
 		ret = inode_newsize_ok(inode, new_size);
 		if (ret)
 			goto out_mutex;
-		/*
-		 * If we have a partial block after EOF we have to allocate
-		 * the entire block.
-		 */
-		if (partial_end)
-			max_blocks += 1;
 	}
 
+	flags = EXT4_GET_BLOCKS_CREATE_UNWRIT_EXT;
+	if (mode & FALLOC_FL_KEEP_SIZE)
+		flags |= EXT4_GET_BLOCKS_KEEP_SIZE;
+
+	/* Preallocate the range including the unaligned edges */
+	if (partial_begin || partial_end) {
+		ret = ext4_alloc_file_blocks(file,
+				round_down(offset, 1 << blkbits) >> blkbits,
+				(round_up((offset + len), 1 << blkbits) -
+				 round_down(offset, 1 << blkbits)) >> blkbits,
+				new_size, flags, mode);
+		if (ret)
+			goto out_mutex;
+
+	}
+
+	/* Zero range excluding the unaligned edges */
 	if (max_blocks > 0) {
+		flags |= (EXT4_GET_BLOCKS_CONVERT_UNWRITTEN |
+			  EXT4_EX_NOCACHE);
 
 		/* Now release the pages and zero block aligned part of pages*/
 		truncate_pagecache_range(inode, start, end - 1);