diff mbox

ext4: invalidate gap cache when writing extents last block

Message ID 1305268616-5167-1-git-send-email-lczerner@redhat.com
State New, archived
Headers show

Commit Message

Lukas Czerner May 13, 2011, 6:36 a.m. UTC
Kazuya Mio reported that he was able to hit BUG_ON(next == lblock)
in ext4_ext_put_gap_in_cache() while creating a sparse file in extent
format and fill the tail of file up to its end. We will hit the BUG_ON
when we write the last block (2^32-1) into the sparse file.

That is because due to defensive programming we planted a lot of
BUG_ON's to prevent the length of the gap cache to be zero, but in this
case it actually will be zero, because there will be no gap at the end
of the file.

We could fix that as Kazuya Mio suggested by lowering the max file size
of extent format file by one block. But I do not think this is necessary
and we should rather fix the BUG_ON's to allow invalidating the gap
cache by setting its lenght to zero and this is what this commit is
doing.

The bug which this commit fixes can be reproduced as follows:

 dd if=/dev/zero of=/mnt/mp1/file bs=<blocksize> count=1 seek=$((2**32-2))
 sync
 dd if=/dev/zero of=/mnt/mp1/file bs=<blocksize> count=1 seek=$((2**32-1))

Reported-by: Kazuya Mio <k-mio@sx.jp.nec.com>
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
---
 fs/ext4/extents.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

Comments

Kazuya Mio May 13, 2011, 8:55 a.m. UTC | #1
2011/05/13 15:36, Lukas Czerner wrote:
> That is because due to defensive programming we planted a lot of
> BUG_ON's to prevent the length of the gap cache to be zero, but in this
> case it actually will be zero, because there will be no gap at the end
> of the file.

Um, I think there is a block (blocksize -1 byte) in gap.
And this gap should be used for the next block searching.
According to ext4_ext_in_cache(), len=0 means the special case
which we have no cache extent, so len=0 should be disallowed.

Moreover, if we create the file which has 2^32-1 offset,
we can't get extent which starts from this offset with FIEMAP ioctl.
That's why I think the maximum file size should be 2^32-1 * blocksize.

Regards,
Kazuya Mio
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Kazuya Mio May 24, 2011, 8:33 a.m. UTC | #2
Hi Lukas,
How do you think about my comment?

Regards,
Kazuya Mio

2011/05/13 17:55, Kazuya Mio wrote:
> 2011/05/13 15:36, Lukas Czerner wrote:
>> That is because due to defensive programming we planted a lot of
>> BUG_ON's to prevent the length of the gap cache to be zero, but in this
>> case it actually will be zero, because there will be no gap at the end
>> of the file.
> 
> Um, I think there is a block (blocksize -1 byte) in gap.
> And this gap should be used for the next block searching.
> According to ext4_ext_in_cache(), len=0 means the special case
> which we have no cache extent, so len=0 should be disallowed.
> 
> Moreover, if we create the file which has 2^32-1 offset,
> we can't get extent which starts from this offset with FIEMAP ioctl.
> That's why I think the maximum file size should be 2^32-1 * blocksize.
> 
> Regards,
> Kazuya Mio
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lukas Czerner May 24, 2011, 8:57 a.m. UTC | #3
Hi Kazuya,

I am really sorry for late answer. I think you're partly right. my
solution is not good, but I still think that your is not good as well. I
need to look at this again and more closely, sorry.

What do you think about this: ext4_ext_next_allocated_block() should
return next allocated block, however instead it in some cases returns
EXT_MAX_BLOCK, which points at the last logical block in a file, which
however in some cases might be equal to the last not allocated block,
not first allocated block in subsequent extent. And boom, we have (next
== lblock).

So if we want to really return next allocated block (or more
specifically, next block which we can not allocate), we should in those
cases return EXT_MAX_BLOCK+1. And we should do this in
ext4_ext_put_gap_in_cache() as well when there is no extent yet. Also
note that as I said EXT_MAX_BLOCK means maximum logical block, however
we use it as lenght in ext4_ext_put_gap_in_cache() which does not sound
right either. It seems all a little bit messy :-/. I need to look at it
and try it to see if it would work, but I think it does make sense.

What do you think?

Thanks!
-Lukas

On Tue, 24 May 2011, Kazuya Mio wrote:

> Hi Lukas,
> How do you think about my comment?
> 
> Regards,
> Kazuya Mio
> 
> 2011/05/13 17:55, Kazuya Mio wrote:
> > 2011/05/13 15:36, Lukas Czerner wrote:
> >> That is because due to defensive programming we planted a lot of
> >> BUG_ON's to prevent the length of the gap cache to be zero, but in this
> >> case it actually will be zero, because there will be no gap at the end
> >> of the file.
> > 
> > Um, I think there is a block (blocksize -1 byte) in gap.
> > And this gap should be used for the next block searching.
> > According to ext4_ext_in_cache(), len=0 means the special case
> > which we have no cache extent, so len=0 should be disallowed.
> > 
> > Moreover, if we create the file which has 2^32-1 offset,
> > we can't get extent which starts from this offset with FIEMAP ioctl.
> > That's why I think the maximum file size should be 2^32-1 * blocksize.
> > 
> > Regards,
> > Kazuya Mio
>
Kazuya Mio May 25, 2011, 8:59 a.m. UTC | #4
2011/05/24 17:57, Lukas Czerner wrote:
> Hi Kazuya,
>
> I am really sorry for late answer. I think you're partly right. my
> solution is not good, but I still think that your is not good as well. I
> need to look at this again and more closely, sorry.
>
> What do you think about this: ext4_ext_next_allocated_block() should
> return next allocated block, however instead it in some cases returns
> EXT_MAX_BLOCK, which points at the last logical block in a file, which
> however in some cases might be equal to the last not allocated block,
> not first allocated block in subsequent extent. And boom, we have (next
> == lblock).

I think so, too.

> So if we want to really return next allocated block (or more
> specifically, next block which we can not allocate), we should in those
> cases return EXT_MAX_BLOCK+1. And we should do this in
> ext4_ext_put_gap_in_cache() as well when there is no extent yet. Also
> note that as I said EXT_MAX_BLOCK means maximum logical block, however
> we use it as lenght in ext4_ext_put_gap_in_cache() which does not sound
> right either. It seems all a little bit messy :-/. I need to look at it
> and try it to see if it would work, but I think it does make sense.
>
> What do you think?

AFAIK, it's the best way to fix this problem. But I have no idea that doesn't
increase the size of struct ext4_ext_cache. I'm looking forward to seeing
the patch.

Regards,
Kazuya Mio
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 4890d6f..779ca49 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -1944,7 +1944,6 @@  ext4_ext_put_in_cache(struct inode *inode, ext4_lblk_t block,
 			__u32 len, ext4_fsblk_t start)
 {
 	struct ext4_ext_cache *cex;
-	BUG_ON(len == 0);
 	spin_lock(&EXT4_I(inode)->i_block_reservation_lock);
 	cex = &EXT4_I(inode)->i_cached_extent;
 	cex->ec_block = block;
@@ -1991,7 +1990,7 @@  ext4_ext_put_gap_in_cache(struct inode *inode, struct ext4_ext_path *path,
 				le32_to_cpu(ex->ee_block),
 				ext4_ext_get_actual_len(ex),
 				block);
-		BUG_ON(next == lblock);
+		BUG_ON((next == lblock) && (next != EXT_MAX_BLOCK));
 		len = next - lblock;
 	} else {
 		lblock = len = 0;