From patchwork Tue Oct 11 07:01:57 2011 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Subject: ext4: handle NULL p_ext in ext4_ext_next_allocated_block() Date: Mon, 10 Oct 2011 21:01:57 -0000 From: Dmitri Monakho X-Patchwork-Id: 118863 Message-Id: <87ipnwnh96.fsf@dmbot.sw.ru> To: Curt Wohlgemuth , Lukas Czerner Cc: tytso@mit.edu, adilger.kernel@dilger.ca, linux-ext4@vger.kernel.org On Mon, 10 Oct 2011 08:28:23 -0700, Curt Wohlgemuth wrote: > Hi Lukas: > > On Mon, Oct 10, 2011 at 12:19 AM, Lukas Czerner wrote: > > On Sat, 8 Oct 2011, Curt Wohlgemuth wrote: > > > >> In ext4_ext_next_allocated_block(), the path[depth] might > >> have a p_ext that is NULL -- see ext4_ext_binsearch().  In > >> such a case, dereferencing it will crash the machine. > >> > >> This patch checks for p_ext == NULL in > >> ext4_ext_next_allocated_block() before dereferencinging it. > >> > >> Tested using a hand-crafted an inode with eh_entries == 0 in > >> an extent block, verified that running FIEMAP on it crashes > >> without this patch, works fine with it. > > > > Hi Curt, > > > > It seems to me that even that the patch fixes the NULL dereference, it > > is not a complete solution. Is it possible that in "normal" case p_ext > > would be NULL ? I think that this is only possible in extent split/add > > case (as noted in ext4_ext_binsearch()) which should be atomic to the > > other operations (locked i_mutex?). > > Yes, unfortunately, it is possible in "normal" cases for p_ext to be NULL. > > We've seen this problem during what appears to be a race between an > inode growth (or truncate?) and another task doing a FIEMAP ioctl. > The real problem is that FIEMAP handing in ext4 is just, well, buggy? Wow, IMHO it not just buggy, it is obviously incorrect, IMHO it is more fair just return -ENOTSUPP, at least it is much safer. Yes calling FIEMAP and truncate/write in parallel is stupid, but not prohibited. > > ext4_ext_walk_space() will get the i_data_sem, construct the path > array, then release the semaphore. But then it does a bazillion > accesses on the extent/header/index pointers in the path array, with > no protection against truncate, growth, or any other changes. As far > as I can tell, this is the only use of a path array retrieved from > ext4_ext_find_extent() that isn't completely covered by i_data_sem. In that case i_data sem protects us from nothing. Path collected can simply disappear under us. And in fact i don't understand the reason why we drop i_data_sem too soon. Are any reason to do that? Seems like following patch should fix the issue. >From 12cd56ccd86c4d132f186034a9c11b0a2441a19f Mon Sep 17 00:00:00 2001 From: Dmitry Monakhov Date: Tue, 11 Oct 2011 10:44:31 +0400 Subject: [PATCH] ext4: do not drop i_data_sem too soon Path returned from ext4_find_extent is valid only while we hold i_data_sem, so we can drop it only after we nolonger need it. Signed-off-by: Dmitry Monakhov --- fs/ext4/extents.c | 20 +++++++++++--------- 1 files changed, 11 insertions(+), 9 deletions(-) diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c index da4583f..c716a1f 100644 --- a/fs/ext4/extents.c +++ b/fs/ext4/extents.c @@ -1847,23 +1847,32 @@ static int ext4_ext_walk_space(struct inode *inode, ext4_lblk_t block, num = last - block; /* find extent for this block */ down_read(&EXT4_I(inode)->i_data_sem); + if (ext_depth(inode) != depth) { + /* depth was changed. we have to realloc path */ + kfree(path); + path = NULL; + } + path = ext4_ext_find_extent(inode, block, path); - up_read(&EXT4_I(inode)->i_data_sem); if (IS_ERR(path)) { err = PTR_ERR(path); + up_read(&EXT4_I(inode)->i_data_sem); path = NULL; break; } depth = ext_depth(inode); if (unlikely(path[depth].p_hdr == NULL)) { + up_read(&EXT4_I(inode)->i_data_sem); EXT4_ERROR_INODE(inode, "path[%d].p_hdr == NULL", depth); err = -EIO; break; } + ex = path[depth].p_ext; next = ext4_ext_next_allocated_block(path); - + up_read(&EXT4_I(inode)->i_data_sem); + ext4_ext_drop_refs(path); exists = 0; if (!ex) { /* there is no extent yet, so try to allocate @@ -1915,7 +1924,6 @@ static int ext4_ext_walk_space(struct inode *inode, ext4_lblk_t block, break; } err = func(inode, next, &cbex, ex, cbdata); - ext4_ext_drop_refs(path); if (err < 0) break; @@ -1927,12 +1935,6 @@ static int ext4_ext_walk_space(struct inode *inode, ext4_lblk_t block, break; } - if (ext_depth(inode) != depth) { - /* depth was changed. we have to realloc path */ - kfree(path); - path = NULL; - } - block = cbex.ec_block + cbex.ec_len; }