Patchwork NULL pointer dereference in ext4_ext_remove_space on 3.5.1

Submitter Theodore Ts'o
Date Aug. 16, 2012, 2:46 a.m.
Message ID <20120816024654.GB3781@thunk.org>
Permalink /patch/177888/
State Superseded

Comments

Theodore Ts'o - Aug. 16, 2012, 2:46 a.m.
On Wed, Aug 15, 2012 at 09:33:29PM +0300, Marti Raudsepp wrote:
> I was moving and deleting some files between two of my ext4 partitions
> when it suddenly crashed and dropped me into a kernel oops screen
> (below). I'm using ext4 on kernel 3.5.1 (Arch Linux). 

> BUG: unable to handle kernel NULL pointer dereference at 000...00028
> IP: [...] ext4_ext_remove_space+0xaa4/0xef0 [ext4]

Someone else has reported a similar crash, but we don't yet have
enough information to narrow it down.

If you could try applying the following debugging patch, and then try
to reproduce the failure, it would be really helpful.

Thanks!!

					- Ted

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Wu Fengguang - Aug. 16, 2012, 11:10 a.m.
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -2432,6 +2432,10 @@ ext4_ext_rm_leaf(handle_t *handle, struct inode *inode,
>  
>  	/* the header must be checked already in ext4_ext_remove_space() */
>  	ext_debug("truncate since %u in leaf to %u\n", start, end);
> +	if (!path[depth].p_hdr && !path[depth].p_bh) {
> +		EXT4_ERROR_INODE(inode, "depth %d", depth);
> +		BUG_ON(1);
> +	}
>  	if (!path[depth].p_hdr)
>  		path[depth].p_hdr = ext_block_hdr(path[depth].p_bh);
>  	eh = path[depth].p_hdr;
> @@ -2730,6 +2734,10 @@ cont:
>  		/* this is index block */
>  		if (!path[i].p_hdr) {
>  			ext_debug("initialize header\n");
> +			if (!path[i].p_hdr && !path[i].p_bh) {
> +				EXT4_ERROR_INODE(inode, "i=%d", i);
> +				BUG_ON(1);
> +			}
>  			path[i].p_hdr = ext_block_hdr(path[i].p_bh);
>  		}
>  

Here is the dmesg. BTW, 3.5.0 doesn't seem to have this issue.

[  640.266836] EXT4-fs error (device md0): ext4_ext_remove_space:2694: inode #12: comm rm: i=1
[  640.275701] ------------[ cut here ]------------
[  640.276684] kernel BUG at /c/wfg/tip/fs/ext4/extents.c:2695!
[  640.276684] invalid opcode: 0000 [#1] SMP
[  640.276684] Modules linked in:
[  640.276684] CPU 7
[  640.276684] Pid: 4079, comm: rm Not tainted 3.6.0-rc1+ #3 Supermicro X7DW3/X7DWN
[  640.276684] RIP: 0010:[<ffffffff811f8980>]  [<ffffffff811f8980>] ext4_ext_remove_space+0x86e/0xbee
[  640.276684] RSP: 0018:ffff88021e749cb8  EFLAGS: 00010287
[  640.276684] RAX: ffff880221072000 RBX: ffff88020fc680d0 RCX: 0000000000000092
[  640.276684] RDX: 0000000000003c3c RSI: 0000000000000092 RDI: ffff880221073800
[  640.276684] RBP: ffff88021e749d98 R08: ffffffff81f6ea88 R09: 0000000000000000
[  640.276684] R10: ffffffff81f19a30 R11: 0000000000000647 R12: ffff880222385840
[  640.276684] R13: 0000000000000001 R14: 0000000000000001 R15: ffff880222385870
[  640.276684] FS:  00007f4461203700(0000) GS:ffff88022f5c0000(0000) knlGS:0000000000000000
[  640.276684] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  640.276684] CR2: 00007f4460cf761d CR3: 000000022115c000 CR4: 00000000000007e0
[  640.276684] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  640.276684] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  640.276684] Process rm (pid: 4079, threadinfo ffff88021e748000, task ffff8802236d2e80)
[  640.276684] Stack:
[  640.276684]  ffffffff819caec0 ffff880220901390 ffff88020fc680d0 ffff880220901390
[  640.276684]  ffff88020fc25750 0000000000790000 ffff88021e749d38 ffffffff811d6b05
[  640.276684]  8000880200000000 ffff88020fc68000 ffff880221072000 ffff8802223858a0
[  640.276684] Call Trace:
[  640.276684]  [<ffffffff811d6b05>] ? ext4_mark_iloc_dirty+0x47a/0x557
[  640.276684]  [<ffffffff811fa6ab>] ext4_ext_truncate+0xd8/0x176
[  640.276684]  [<ffffffff811d6de8>] ? ext4_mark_inode_dirty+0x17e/0x1c0
[  640.276684]  [<ffffffff811d4934>] ext4_truncate+0x7a/0xca
[  640.276684]  [<ffffffff811d8aa2>] ext4_evict_inode+0x2e9/0x422
[  640.276684]  [<ffffffff81163e17>] evict+0xae/0x163
[  640.276684]  [<ffffffff811640c4>] iput+0x1bb/0x1c3
[  640.276684]  [<ffffffff8115a2ca>] do_unlinkat+0x102/0x157
[  640.276684]  [<ffffffff813d550e>] ? trace_hardirqs_on_thunk+0x3a/0x3c
[  640.276684]  [<ffffffff8115bf97>] sys_unlinkat+0x22/0x2d
[  640.276684]  [<ffffffff81985229>] system_call_fastpath+0x16/0x1b
[  640.276684] Code: 75 33 49 8b 47 28 48 85 c0 75 22 45 89 e9 49 c7 c0 ce e8 d2 81 31 c9 ba 86 0a 00 00 48 c7 c6 80 fb 9c 81 48 89 df e8 eb 6c ff ff <0f> 0b 48 8b 40 28 49 89 47 20 49 8b 47 18 48 85 c0 75 1f 49 8b
[  640.276684] RIP  [<ffffffff811f8980>] ext4_ext_remove_space+0x86e/0xbee
[  640.276684]  RSP <ffff88021e749cb8>
[  640.530999] ---[ end trace e00762202fd8e8a0 ]---

Thanks,
Fengguang
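
Note on the fault address: the original report faults at an address ending in 28, and the Code: bytes above include 48 8b 40 28, which decodes as a load from offset 0x28 of the pointer in %rax. That is consistent with ext_block_hdr() reading the b_data field through a NULL buffer_head pointer. The sketch below only illustrates the arithmetic. The struct is a hypothetical stand-in that models the leading fields of struct buffer_head on a 64-bit kernel with fixed 64-bit members; the real layout depends on kernel version and configuration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the leading fields of struct buffer_head
 * on a 64-bit kernel. Every member is modelled as 64 bits wide so the
 * offsets do not depend on the machine this is compiled on.
 * This is an assumption for illustration, not the kernel definition.
 */
struct mock_buffer_head {
	uint64_t b_state;	/* state bitmap */
	uint64_t b_this_page;	/* pointer, modelled as 64 bits */
	uint64_t b_page;	/* pointer, modelled as 64 bits */
	uint64_t b_blocknr;	/* start block number */
	uint64_t b_size;	/* size of mapping */
	uint64_t b_data;	/* pointer to the block data */
};

/*
 * Address touched when b_data is read through a NULL pointer:
 * NULL (0) plus the offset of the field inside the struct.
 */
static size_t mock_null_deref_address(void)
{
	return offsetof(struct mock_buffer_head, b_data);
}
```

Under these assumptions mock_null_deref_address() evaluates to 0x28, five 8-byte fields into the struct.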
Dmitry Monakhov - Sept. 17, 2012, 12:21 p.m.
On Wed, 15 Aug 2012 22:46:54 -0400, Theodore Ts'o <tytso@mit.edu> wrote:
> On Wed, Aug 15, 2012 at 09:33:29PM +0300, Marti Raudsepp wrote:
> > I was moving and deleting some files between two of my ext4 partitions
> > when it suddenly crashed and dropped me into a kernel oops screen
> > (below). I'm using ext4 on kernel 3.5.1 (Arch Linux). 
Oh, I missed this gigantic thread, but I have found the bug anyway.
A patch is available here: http://patchwork.ozlabs.org/patch/183649/
Theodore Ts'o - Sept. 17, 2012, 1:52 p.m.
On Mon, Sep 17, 2012 at 04:21:44PM +0400, Dmitry Monakhov wrote:
> On Wed, 15 Aug 2012 22:46:54 -0400, Theodore Ts'o <tytso@mit.edu> wrote:
> > On Wed, Aug 15, 2012 at 09:33:29PM +0300, Marti Raudsepp wrote:
> > > I was moving and deleting some files between two of my ext4 partitions
> > > when it suddenly crashed and dropped me into a kernel oops screen
> > > (below). I'm using ext4 on kernel 3.5.1 (Arch Linux).
> Oh, I missed this gigantic thread, but I have found the bug anyway.
> A patch is available here: http://patchwork.ozlabs.org/patch/183649/

Dmitry, we have a patch in mainline already which addresses this, and
it's already backported to v3.5.3 or later.

What version was your patch series based against?

						- Ted
Dmitry Monakhov - Sept. 17, 2012, 2:48 p.m.
On Mon, 17 Sep 2012 09:52:15 -0400, Theodore Ts'o <tytso@mit.edu> wrote:
> On Mon, Sep 17, 2012 at 04:21:44PM +0400, Dmitry Monakhov wrote:
> > On Wed, 15 Aug 2012 22:46:54 -0400, Theodore Ts'o <tytso@mit.edu> wrote:
> > > On Wed, Aug 15, 2012 at 09:33:29PM +0300, Marti Raudsepp wrote:
> > > > I was moving and deleting some files between two of my ext4 partitions
> > > > when it suddenly crashed and dropped me into a kernel oops screen
> > > > (below). I'm using ext4 on kernel 3.5.1 (Arch Linux).
> > Oh, I missed this gigantic thread, but I have found the bug anyway.
> > A patch is available here: http://patchwork.ozlabs.org/patch/183649/
> 
> Dmitry, we have a patch in mainline already which addresses this, and
> it's already backported to v3.5.3 or later.
> 
> What version was your patch series based against?
The patch set was prepared against d0f56971992a0bcc7 (the old HEAD of ext4.git).
You are right, your patch fixes the issue, so you can ignore my version.
But the other patches from the 'Bunch of DIO/AIO fixes V2' queue are still
valid and apply to the recent git tree without problems.
Should I resend the whole series, or will you pick up the original one?
> 
> 						- Ted

Patch

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 769151d..3394d52 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -2432,6 +2432,10 @@  ext4_ext_rm_leaf(handle_t *handle, struct inode *inode,
 
 	/* the header must be checked already in ext4_ext_remove_space() */
 	ext_debug("truncate since %u in leaf to %u\n", start, end);
+	if (!path[depth].p_hdr && !path[depth].p_bh) {
+		EXT4_ERROR_INODE(inode, "depth %d", depth);
+		BUG_ON(1);
+	}
 	if (!path[depth].p_hdr)
 		path[depth].p_hdr = ext_block_hdr(path[depth].p_bh);
 	eh = path[depth].p_hdr;
@@ -2730,6 +2734,10 @@  cont:
 		/* this is index block */
 		if (!path[i].p_hdr) {
 			ext_debug("initialize header\n");
+			if (!path[i].p_hdr && !path[i].p_bh) {
+				EXT4_ERROR_INODE(inode, "i=%d", i);
+				BUG_ON(1);
+			}
 			path[i].p_hdr = ext_block_hdr(path[i].p_bh);
 		}
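
For readers who want to see the guard in isolation, here is a minimal user-space sketch of the condition both hunks test. It is not kernel code: mock_bh, mock_path and mock_load_hdr are hypothetical names, the structs keep only the two pointers involved, and the function returns an error where the debugging patch calls EXT4_ERROR_INODE() followed by BUG_ON(1). It assumes, as in the kernel source, that ext_block_hdr() simply returns the buffer's b_data pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins, not the kernel's definitions. */
struct mock_bh {
	char *b_data;		/* block contents */
};

struct mock_path {
	void *p_hdr;		/* cached header pointer, may be NULL */
	struct mock_bh *p_bh;	/* backing buffer, may be NULL */
};

/*
 * Fill in p_hdr from p_bh if it is not cached yet.
 * Returns 0 on success, or -1 when both pointers are NULL, which is
 * the state the debugging patch reports before stopping the kernel.
 */
static int mock_load_hdr(struct mock_path *p)
{
	if (!p->p_hdr && !p->p_bh)
		return -1;	/* the load below would dereference NULL */
	if (!p->p_hdr)
		p->p_hdr = p->p_bh->b_data;
	return 0;
}
```

Without the first check, a path entry with neither pointer set reaches the p_bh->b_data load and faults, which is the kind of crash the original report describes.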