Message ID: 1424278168-13711-1-git-send-email-lczerner@redhat.com
State: Accepted, archived
On Wed, Feb 18, 2015 at 05:49:28PM +0100, Lukas Czerner wrote:
> Currently there is a bug in zero range code which causes zero range
> calls to only allocate block aligned portion of the range, while
> ignoring the rest in some cases.
>
> In some cases, namely if the end of the range is past isize, we do
> attempt to preallocate the last nonaligned block. However this might
> cause kernel to BUG() in some carefully designed zero range requests on
> setups where page size > block size.

Is there a regression test you could write to exercise these cases in
future?

Cheers,

Dave.
Hi Dave,

yes, I am planning to send a regression test for this case as well.

Thanks!
-Lukas

----- Original Message -----
From: "Dave Chinner" <david@fromorbit.com>
To: "Lukas Czerner" <lczerner@redhat.com>
Cc: linux-ext4@vger.kernel.org
Sent: Thursday, February 19, 2015 1:49:35 AM
Subject: Re: [PATCH] ext4: Allocate entire range in zero range

On Wed, Feb 18, 2015 at 05:49:28PM +0100, Lukas Czerner wrote:
> Currently there is a bug in zero range code which causes zero range
> calls to only allocate block aligned portion of the range, while
> ignoring the rest in some cases.
>
> In some cases, namely if the end of the range is past isize, we do
> attempt to preallocate the last nonaligned block. However this might
> cause kernel to BUG() in some carefully designed zero range requests on
> setups where page size > block size.

Is there a regression test you could write to exercise these cases in
future?

Cheers,

Dave.
Eric, can I get some review on this one?

Ted, I think this is quite critical, could you please ACK this one so we
know whether it's going in or not?

Thanks!
-Lukas

On Wed, 18 Feb 2015, Lukas Czerner wrote:

> Date: Wed, 18 Feb 2015 17:49:28 +0100
> From: Lukas Czerner <lczerner@redhat.com>
> To: linux-ext4@vger.kernel.org
> Cc: Lukas Czerner <lczerner@redhat.com>
> Subject: [PATCH] ext4: Allocate entire range in zero range
>
> Currently there is a bug in zero range code which causes zero range
> calls to only allocate block aligned portion of the range, while
> ignoring the rest in some cases.
>
> In some cases, namely if the end of the range is past isize, we do
> attempt to preallocate the last nonaligned block. However this might
> cause kernel to BUG() in some carefully designed zero range requests on
> setups where page size > block size.
>
> Fix this problem by first preallocating the entire range, including the
> nonaligned edges and converting the written extents to unwritten in the
> next step. This approach will also give us the advantage of having the
> range to be as linearly contiguous as possible.
>
> Signed-off-by: Lukas Czerner <lczerner@redhat.com>
> ---
>  fs/ext4/extents.c | 31 +++++++++++++++++++------------
>  1 file changed, 19 insertions(+), 12 deletions(-)
>
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index bed4308..aa52242 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -4803,12 +4803,6 @@ static long ext4_zero_range(struct file *file, loff_t offset,
>  	else
>  		max_blocks -= lblk;
>
> -	flags = EXT4_GET_BLOCKS_CREATE_UNWRIT_EXT |
> -		EXT4_GET_BLOCKS_CONVERT_UNWRITTEN |
> -		EXT4_EX_NOCACHE;
> -	if (mode & FALLOC_FL_KEEP_SIZE)
> -		flags |= EXT4_GET_BLOCKS_KEEP_SIZE;
> -
>  	mutex_lock(&inode->i_mutex);
>
>  	/*
> @@ -4825,15 +4819,28 @@ static long ext4_zero_range(struct file *file, loff_t offset,
>  		ret = inode_newsize_ok(inode, new_size);
>  		if (ret)
>  			goto out_mutex;
> -		/*
> -		 * If we have a partial block after EOF we have to allocate
> -		 * the entire block.
> -		 */
> -		if (partial_end)
> -			max_blocks += 1;
>  	}
>
> +	flags = EXT4_GET_BLOCKS_CREATE_UNWRIT_EXT;
> +	if (mode & FALLOC_FL_KEEP_SIZE)
> +		flags |= EXT4_GET_BLOCKS_KEEP_SIZE;
> +
> +	/* Preallocate the range including the unaligned edges */
> +	if (partial_begin || partial_end) {
> +		ret = ext4_alloc_file_blocks(file,
> +				round_down(offset, 1 << blkbits) >> blkbits,
> +				(round_up((offset + len), 1 << blkbits) -
> +				 round_down(offset, 1 << blkbits)) >> blkbits,
> +				new_size, flags, mode);
> +		if (ret)
> +			goto out_mutex;
> +
> +	}
> +
> +	/* Zero range excluding the unaligned edges */
>  	if (max_blocks > 0) {
> +		flags |= (EXT4_GET_BLOCKS_CONVERT_UNWRITTEN |
> +			  EXT4_EX_NOCACHE);
>
>  		/* Now release the pages and zero block aligned part of pages*/
>  		truncate_pagecache_range(inode, start, end - 1);

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Feb 18, 2015 at 05:49:28PM +0100, Lukas Czerner wrote:
> Currently there is a bug in zero range code which causes zero range
> calls to only allocate block aligned portion of the range, while
> ignoring the rest in some cases.
>
> In some cases, namely if the end of the range is past isize, we do
> attempt to preallocate the last nonaligned block. However this might
> cause kernel to BUG() in some carefully designed zero range requests on
> setups where page size > block size.
>
> Fix this problem by first preallocating the entire range, including the
> nonaligned edges and converting the written extents to unwritten in the
> next step. This approach will also give us the advantage of having the
> range to be as linearly contiguous as possible.
>
> Signed-off-by: Lukas Czerner <lczerner@redhat.com>

Thanks, applied.  Apologies for the delay.

					- Ted
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index bed4308..aa52242 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -4803,12 +4803,6 @@ static long ext4_zero_range(struct file *file, loff_t offset,
 	else
 		max_blocks -= lblk;
 
-	flags = EXT4_GET_BLOCKS_CREATE_UNWRIT_EXT |
-		EXT4_GET_BLOCKS_CONVERT_UNWRITTEN |
-		EXT4_EX_NOCACHE;
-	if (mode & FALLOC_FL_KEEP_SIZE)
-		flags |= EXT4_GET_BLOCKS_KEEP_SIZE;
-
 	mutex_lock(&inode->i_mutex);
 
 	/*
@@ -4825,15 +4819,28 @@ static long ext4_zero_range(struct file *file, loff_t offset,
 		ret = inode_newsize_ok(inode, new_size);
 		if (ret)
 			goto out_mutex;
-		/*
-		 * If we have a partial block after EOF we have to allocate
-		 * the entire block.
-		 */
-		if (partial_end)
-			max_blocks += 1;
 	}
 
+	flags = EXT4_GET_BLOCKS_CREATE_UNWRIT_EXT;
+	if (mode & FALLOC_FL_KEEP_SIZE)
+		flags |= EXT4_GET_BLOCKS_KEEP_SIZE;
+
+	/* Preallocate the range including the unaligned edges */
+	if (partial_begin || partial_end) {
+		ret = ext4_alloc_file_blocks(file,
+				round_down(offset, 1 << blkbits) >> blkbits,
+				(round_up((offset + len), 1 << blkbits) -
+				 round_down(offset, 1 << blkbits)) >> blkbits,
+				new_size, flags, mode);
+		if (ret)
+			goto out_mutex;
+
+	}
+
+	/* Zero range excluding the unaligned edges */
 	if (max_blocks > 0) {
+		flags |= (EXT4_GET_BLOCKS_CONVERT_UNWRITTEN |
+			  EXT4_EX_NOCACHE);
 
 		/* Now release the pages and zero block aligned part of pages*/
 		truncate_pagecache_range(inode, start, end - 1);
Currently there is a bug in the zero range code which causes zero range
calls to allocate only the block-aligned portion of the range, while
ignoring the rest in some cases.

In some cases, namely if the end of the range is past isize, we do
attempt to preallocate the last nonaligned block. However, this might
cause the kernel to BUG() on some carefully designed zero range requests
on setups where page size > block size.

Fix this problem by first preallocating the entire range, including the
nonaligned edges, and converting the written extents to unwritten in the
next step. This approach also has the advantage of keeping the range as
linearly contiguous as possible.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
---
 fs/ext4/extents.c | 31 +++++++++++++++++++------------
 1 file changed, 19 insertions(+), 12 deletions(-)