Patchwork [2/2] ext4: fix eofblock flag handling

login
register
mail settings
Submitter Dmitri Monakho
Date April 19, 2010, 2:32 p.m.
Message ID <1271687537-15655-2-git-send-email-dmonakhov@openvz.org>
Download mbox | patch
Permalink /patch/50469/
State Superseded, archived
Delegated to: Theodore Ts'o
Headers show

Comments

Dmitri Monakho - April 19, 2010, 2:32 p.m.
- Fix NULL pointer deference on error path
- Extent header we found may be not latest node of the inode. In order to
  find latest extent we have to traverse a path from very beginning.

https://bugzilla.kernel.org/show_bug.cgi?id=15792

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
---
 fs/ext4/extents.c |   50 +++++++++++++++++++++++++++++++++++++++-----------
 1 files changed, 39 insertions(+), 11 deletions(-)
Theodore Ts'o - May 25, 2010, 4:17 a.m.
On Mon, Apr 19, 2010 at 06:32:17PM +0400, Dmitry Monakhov wrote:
> - Fix NULL pointer deference on error path
> - Extent header we found may be not latest node of the inode. In order to
>   find latest extent we have to traverse a path from very beginning.
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=15792
> 
> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>

I split this patch into two patches, since they are addressing two
very distinct and different bugs.  As it turns out I chose a
completely different way of tackling the second issue, which has the
advantage being much simpler, and not requiring a second call to
ext4_ext_find_extent(), which can end up requiring extra disk reads.

Also note that I supplied a test case to demonstrate the problem.
This was helpful in assuring that the problem was fixed, and also in
proving that there really *was* a problem; as it turns out triggering
it is quite difficult and I would be very surprised if it has really
happenned in real life.

To construct the test case I first of all used a 1k block file system,
and then generated an extremely fragmented free space:

mkdir a; cd a
seq -f %05.0f 1 65536 | xargs touch
seq -f %05.0f 1 65536 | xargs -L 1 fallocate -l 1k
seq -f %05.0f 1 2 65536 | xargs rm
cd ..

I then created the fragmented file with the EOFBLOCKS set:

fallocate -n -l 32m foo

This should generate a file with a two-deep extent tree.  (It is
otherwise *very* hard to create a deep extent tree.)  I then found the
last block in an leaf block in the middle of the tree, and deleted the
last extent in that leaf block, using the tst_extents program found in
lib/ext2fs in the e2fsprogs sources (it is built using "make check").
In the case described in the commit, this happened to be for the
logical block 17925.

Could such a file be generated in real life?  Yes, but you'd have to
be quite unlucky, as it would require extending a sparse, fragmented
file using an i_size-preserving fallocate call, where there was a hole
at precisely the right (wrong) place in the extent tree.

			      	      	 - Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 7c7b1d5..13758c3 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -3327,7 +3327,7 @@  int ext4_ext_get_blocks(handle_t *handle, struct inode *inode,
 {
 	struct ext4_ext_path *path = NULL;
 	struct ext4_extent_header *eh;
-	struct ext4_extent newex, *ex, *last_ex;
+	struct ext4_extent newex, *ex;
 	ext4_fsblk_t newblock;
 	int err = 0, depth, ret, cache_type;
 	unsigned int allocated = 0;
@@ -3510,17 +3510,38 @@  int ext4_ext_get_blocks(handle_t *handle, struct inode *inode,
 	}
 
 	if (unlikely(ext4_test_inode_flag(inode, EXT4_INODE_EOFBLOCKS))) {
-		if (unlikely(!eh->eh_entries)) {
-			EXT4_ERROR_INODE(inode,
-					 "eh->eh_entries == 0 ee_block %d",
-					 ex->ee_block);
-			err = -EIO;
-			goto out2;
-		}
-		last_ex = EXT_LAST_EXTENT(eh);
+		struct ext4_extent *last_ex = EXT_LAST_EXTENT(eh);
+		/*
+		 * Optimization: Extent header we found may not be the latest
+		 * extent of the inode, but it is already in-core memory.
+		 * Let's test against this EH to avoid unecessery IO.
+		*/
+		if (unlikely(!eh->eh_entries))
+			goto bad_eh;
 		if (iblock + ar.len > le32_to_cpu(last_ex->ee_block)
-		    + ext4_ext_get_actual_len(last_ex))
-			ext4_clear_inode_flag(inode, EXT4_INODE_EOFBLOCKS);
+			+ ext4_ext_get_actual_len(last_ex)) {
+			struct ext4_ext_path *path2;
+			struct ext4_extent_header *eh2;
+			/* Real search of the latests extent is necessery */
+			path2 = ext4_ext_find_extent(inode, EXT_MAX_BLOCK, NULL);
+			eh2 = path2[depth].p_hdr;
+			if (IS_ERR(path2)) {
+				err = PTR_ERR(path2);
+				goto out2;
+			}
+			last_ex = path2[depth].p_ext;
+			if (unlikely(!eh2->eh_entries || !last_ex)) {
+				ext4_ext_drop_refs(path2);
+				kfree(path2);
+				goto bad_eh;
+			}
+			if (iblock + ar.len > le32_to_cpu(last_ex->ee_block)
+				+ ext4_ext_get_actual_len(last_ex))
+				ext4_clear_inode_flag(inode, EXT4_INODE_EOFBLOCKS);
+			ext4_ext_drop_refs(path2);
+			kfree(path2);
+		}
+
 	}
 	err = ext4_ext_insert_extent(handle, inode, path, &newex, flags);
 	if (err) {
@@ -3570,6 +3591,12 @@  out2:
 		kfree(path);
 	}
 	return err ? err : allocated;
+
+bad_eh:
+	EXT4_ERROR_INODE(inode, "eh->eh_entries == 0, i_flags = %lx iblock = %u"
+			, EXT4_I(inode)->i_flags, iblock);
+	err = -EIO;
+	goto out2;
 }
 
 void ext4_ext_truncate(struct inode *inode)
@@ -3580,6 +3607,7 @@  void ext4_ext_truncate(struct inode *inode)
 	handle_t *handle;
 	int err = 0;
 
+	BUG_ON(ext4_test_inode_flag(inode, EXT4_INODE_EOFBLOCKS));
 	/*
 	 * probably first extent we're gonna free will be last in block
 	 */