Patchwork Intentionally corrupted fs: [this time really] "kernel BUG at fs/jbd2/transaction.c:161!"

Submitter Sami Liedes
Date April 28, 2012, 12:45 a.m.
Message ID <20120428004522.GC20648@sli.dy.fi>
Permalink /patch/155608/
State New

Comments

Sami Liedes - April 28, 2012, 12:45 a.m.
[This is a distinct bug from the other similar-looking mail that I
just sent to this list, which by mistake carried the subject this
mail should have had...]

The 10 MiB ext4 image available at

   http://www.niksula.hut.fi/~sliedes/ext4/2000177.min.ext4.bz2

causes a crash on a mainline 3.3.4 kernel at umount time when the
following operations are run (probably not all of them are really
necessary):

1.  mount 2000177.min.ext4 /mnt -t ext4 -o errors=continue
2.  cd /mnt
3.  cp -r doc doc2 >&/dev/null
4.  find -xdev >&/dev/null
5.  find -xdev -print0 2>/dev/null |xargs -0 touch -- 2>/dev/null
6.  mkdir tmp >&/dev/null
7.  echo whoah >tmp/filu 2>/dev/null
8.  rm -rf /mnt/* >&/dev/null
9.  cd /
10. umount /mnt

See the dmesg output below (the panic is due to panic_on_oops=1).

The image differs from a pristine, fully working ext4 filesystem
available at

   http://www.niksula.hut.fi/~sliedes/ext4/pristine.ext4.bz2

by only one bit:

------------------------------------------------------------
$ wget http://www.niksula.hut.fi/~sliedes/ext4/pristine.ext4.bz2
$ bunzip2 pristine.ext4.bz2
$ diff -u <(hd pristine.ext4) <(hd 2000177.min.ext4)
------------------------------------------------------------
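
The one-bit claim can also be checked programmatically. The following is a minimal Python sketch, not part of the original report: it XORs two equal-sized byte strings and lists every bit that differs. The helper name `differing_bits` is hypothetical, and the two sample rows are the 16 bytes at offset 0x170410 from the hexdump diff quoted elsewhere in this thread, standing in for the full images.

```python
def differing_bits(a: bytes, b: bytes):
    """Return (byte_offset, bit_index) for every bit that differs between a and b."""
    if len(a) != len(b):
        raise ValueError("images must have the same size")
    result = []
    for offset, (x, y) in enumerate(zip(a, b)):
        delta = x ^ y          # set bits mark the positions that differ
        bit = 0
        while delta:
            if delta & 1:
                result.append((offset, bit))
            delta >>= 1
            bit += 1
    return result


# Row at 0x170410 in the pristine and in the corrupted image (from the hexdump diff).
pristine = bytes.fromhex("0c0002022e2e000008010000e8032801")
corrupted = bytes.fromhex("0c0002022e2e000008000000e8032801")

print(differing_bits(pristine, corrupted))  # [(9, 0)]
```

On the real images one would pass the contents of each file (read in binary mode) instead of the sample rows; a result with a single element confirms that exactly one bit differs.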


(This bug report serves also as a preview of the Berserker toolkit
that automates finding such bugs and minimizing the differences to
pristine filesystems. I plan on making the announcement and making the
toolkit available within the next few hours.)

	Sami


------------------------------------------------------------
EXT4-fs (vdb): mounted filesystem with ordered data mode. Opts: errors=continue
------------[ cut here ]------------
kernel BUG at fs/jbd2/transaction.c:161!
invalid opcode: 0000 [#1]
CPU 0
Pid: 1528, comm: umount Not tainted 3.3.4 #1 Bochs Bochs
RIP: 0010:[<ffffffff811c19b1>]  [<ffffffff811c19b1>] start_this_handle+0x591/0x6d0
RSP: 0018:ffff880006c57b38  EFLAGS: 00010202
RAX: 0000000000000039 RBX: ffff880006d06828 RCX: 0000000000021020
RDX: 0000000000000039 RSI: ffffffff817b4ba0 RDI: ffff880006d06828
RBP: ffff880006c57be8 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000153 R12: 0000000000000062
R13: ffff880006d06800 R14: ffff880006cee040 R15: ffff880006c57ba8
FS:  0000000000000000(0000) GS:ffffffff8161d000(0063) knlGS:00000000f74e9750
CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
CR2: 00000000f75833e0 CR3: 000000000614c000 CR4: 00000000000006b0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process umount (pid: 1528, threadinfo ffff880006c56000, task ffff880006cee040)
Stack:
 0000000000000000 ffff880006cee040 ffffffff810e748d 00000000fffedcd4
 ffff880007252608 ffff880006cee040 ffff880006c0f200 ffff880006cee040
 ffff880000000050 0000000000000296 ffff880006c57b98 ffffffff8107660d
Call Trace:
 [<ffffffff810e748d>] ? kmem_cache_alloc+0xad/0x150
 [<ffffffff8107660d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffff811c1c11>] jbd2__journal_start+0x121/0x1a0
 [<ffffffff81186785>] ? ext4_evict_inode+0x135/0x490
 [<ffffffff811c1c9e>] jbd2_journal_start+0xe/0x10
 [<ffffffff811a07da>] ext4_journal_start_sb+0x7a/0x1d0
 [<ffffffff8114437b>] ? __dquot_initialize+0x2b/0x180
 [<ffffffff81186785>] ext4_evict_inode+0x135/0x490
 [<ffffffff81105057>] evict+0xa7/0x1b0
 [<ffffffff81105de5>] iput+0x105/0x210
 [<ffffffff811cb847>] jbd2_journal_destroy+0x1b7/0x240
 [<ffffffff81053200>] ? abort_exclusive_wait+0xb0/0xb0
 [<ffffffff811a0cd3>] ext4_put_super+0x73/0x320
 [<ffffffff810ee52d>] generic_shutdown_super+0x5d/0xf0
 [<ffffffff810efa3b>] kill_block_super+0x2b/0x80
 [<ffffffff810ee365>] deactivate_locked_super+0x45/0x80
 [<ffffffff810ee3e8>] deactivate_super+0x48/0x60
 [<ffffffff81108bd0>] mntput_no_expire+0xa0/0xf0
 [<ffffffff81109e0c>] sys_umount+0x6c/0x360
 [<ffffffff8110a10b>] sys_oldumount+0xb/0x10
 [<ffffffff813f5331>] sysenter_dispatch+0x7/0x2a
 [<ffffffff8122ee0e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
Code: 47 fa e8 93 16 23 00 48 8b 85 60 ff ff ff 4d 39 be a8 00 00 00 73 07 4d 89 be a8 00 00 00 48 89 c7 e8 a4 17 23 00 e9 30 ff ff ff <0f> 0b 48 c7 c1 50 21 42 81 ba df 00 00 00 31 c0 48 c7 c6 38 8c
RIP  [<ffffffff811c19b1>] start_this_handle+0x591/0x6d0
 RSP <ffff880006c57b38>
---[ end trace a277562ac91d83ad ]---
Kernel panic - not syncing: Fatal exception
Rebooting in 1 seconds..
------------------------------------------------------------
Eric Sandeen - April 28, 2012, 1:42 a.m.

On 4/27/12 7:45 PM, Sami Liedes wrote:
> [This is a distinct bug from the other similar-looking mail that I
> just sent to this list, which by mistake carried the subject this
> mail should have had...]
> 
> The 10 MiB ext4 image available at
> 
>    http://www.niksula.hut.fi/~sliedes/ext4/2000177.min.ext4.bz2
> 
> causes a crash on a mainline 3.3.4 kernel at umount time when the
> following operations are run (probably not all of them are really
> necessary):
> 
> 1.  mount 2000177.min.ext4 /mnt -t ext4 -o errors=continue
> 2.  cd /mnt
> 3.  cp -r doc doc2 >&/dev/null
> 4.  find -xdev >&/dev/null
> 5.  find -xdev -print0 2>/dev/null |xargs -0 touch -- 2>/dev/null
> 6.  mkdir tmp >&/dev/null
> 7.  echo whoah >tmp/filu 2>/dev/null
> 8.  rm -rf /mnt/* >&/dev/null
> 9.  cd /
> 10. umount /mnt
> 
> See the dmesg output below (the panic is due to panic_on_oops=1).
> 
> The image differs from a pristine, fully working ext4 filesystem
> available at
> 
>    http://www.niksula.hut.fi/~sliedes/ext4/pristine.ext4.bz2
> 
> by only one bit:
> 
> ------------------------------------------------------------
> $ wget http://www.niksula.hut.fi/~sliedes/ext4/pristine.ext4.bz2
> $ bunzip2 pristine.ext4.bz2
> $ diff -u <(hd pristine.ext4) <(hd 2000177.min.ext4)
> --- /dev/fd/63 2012-04-28 03:39:00.167101668 +0300
> +++ /dev/fd/62 2012-04-28 03:39:00.167101668 +0300
> @@ -32168,7 +32168,7 @@
>  00170050  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
>  *
>  00170400  07 01 00 00 0c 00 01 02  2e 00 00 00 a4 00 00 00  |................|
> -00170410  0c 00 02 02 2e 2e 00 00  08 01 00 00 e8 03 28 01  |..............(.|
> +00170410  0c 00 02 02 2e 2e 00 00  08 00 00 00 e8 03 28 01  |..............(.|
>  00170420  5c 78 32 66 64 65 76 69  63 65 73 5c 78 32 66 76  |\x2fdevices\x2fv|
>  00170430  69 72 74 75 61 6c 5c 78  32 66 62 6c 6f 63 6b 5c  |irtual\x2fblock\|
>  00170440  78 32 66 64 6d 2d 31 31  00 00 00 00 00 00 00 00  |x2fdm-11........|
> ------------------------------------------------------------

That's block 1473 (this is a 1k block fs, and 0x170400 / 1024 = 1473).
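
The block number follows from the hexdump offset and the block size. As a small Python sketch of that arithmetic (the helper name `locate` is hypothetical; the 1024-byte block size is the "1k block fs" stated above):

```python
def locate(offset: int, block_size: int = 1024):
    """Map a byte offset in the image to (block number, offset within that block)."""
    return divmod(offset, block_size)


# 0x170410 is the hexdump row that contains the changed byte.
block, within = locate(0x170410)
print(block, hex(within))  # 1473 0x10
```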

debugfs:  stat <263>
Inode: 263   Type: directory    Mode:  0755   Flags: 0x80000
Generation: 1162591830    Version: 0x00000002
User:     0   Group:     0   Size: 1024
File ACL: 0    Directory ACL: 0
Links: 2   Blockcount: 2
Fragment:  Address: 0    Number: 0    Size: 0
ctime: 0x48bf269d -- Wed Sep  3 19:06:53 2008
atime: 0x484aa7c8 -- Sat Jun  7 10:22:48 2008
mtime: 0x4845aca6 -- Tue Jun  3 15:42:14 2008
EXTENTS:
(0):1473

which is a directory.

The changed byte is part of the inode number in the directory entry; it should be 264 (0x108) but got changed to 8 (0x8), which is the journal inode.  That can't be good.  I'm testing a patch and will send it shortly if it works.
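
To make the decoding concrete, here is a minimal Python sketch that parses the affected directory entry from the hexdump bytes. It assumes the standard ext4 linear directory entry layout (little-endian 32-bit inode, 16-bit record length, 8-bit name length, 8-bit file type, then the name); `parse_dirent` is a hypothetical helper, and the entry is taken to start at offset 0x170418, right after the "." and ".." entries.

```python
import struct

EXT4_JOURNAL_INO = 8  # reserved inode number of the ext4 journal


def parse_dirent(buf: bytes, pos: int = 0):
    """Decode one directory entry: (inode, rec_len, name_len, file_type, name)."""
    inode, rec_len, name_len, file_type = struct.unpack_from("<IHBB", buf, pos)
    name = buf[pos + 8 : pos + 8 + name_len]
    return inode, rec_len, name_len, file_type, name


# The file name on disk contains a literal backslash followed by "x2f".
NAME = b"\\x2fdevices\\x2fvirtual\\x2fblock\\x2fdm-11"

good = parse_dirent(bytes.fromhex("08010000e8032801") + NAME)  # pristine image
bad = parse_dirent(bytes.fromhex("08000000e8032801") + NAME)   # corrupted image

print(good[0], bad[0], hex(good[0] ^ bad[0]))  # 264 8 0x100
print(bad[0] == EXT4_JOURNAL_INO)              # True
```

The XOR of the two inode numbers is 0x100, a single set bit, which matches the one-bit difference between the images.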

-Eric

> 

> (This bug report serves also as a preview of the Berserker toolkit
> that automates finding such bugs and minimizing the differences to
> pristine filesystems. I plan on making the announcement and making the
> toolkit available within the next few hours.)

sounds a bit like fsfuzzer, how is it different?

> 	Sami
> 
> 
> ------------------------------------------------------------
> EXT4-fs (vdb): mounted filesystem with ordered data mode. Opts: errors=continue
> ------------[ cut here ]------------
> kernel BUG at fs/jbd2/transaction.c:161!

        BUG_ON(journal->j_flags & JBD2_UNMOUNT);
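
The call trace suggests how this assertion is reached: jbd2_journal_destroy drops the journal inode with iput, eviction of that inode tries to start a new handle, and by then the journal is already flagged as unmounting. The following Python toy model is only a sketch of that sequence. The class and function names are hypothetical stand-ins, the flag value is illustrative, and the idea that the journal inode's link count reached zero (because a directory entry pointing at it was removed) is an inference from the trace, not something stated in the thread.

```python
JBD2_UNMOUNT = 0x001  # illustrative flag value; only "is the bit set" matters here


class Journal:
    def __init__(self):
        self.flags = 0


def start_handle(journal):
    # Stand-in for BUG_ON(journal->j_flags & JBD2_UNMOUNT) in start_this_handle().
    if journal.flags & JBD2_UNMOUNT:
        raise RuntimeError("BUG: handle started while journal is unmounting")
    return "handle"


def evict_inode(journal, nlink):
    # Assumption: an inode with zero links is deleted on its final iput,
    # and deleting it needs a journal transaction.
    if nlink == 0:
        return start_handle(journal)
    return None


def journal_destroy(journal, journal_inode_nlink):
    journal.flags |= JBD2_UNMOUNT                     # journal is going away
    return evict_inode(journal, journal_inode_nlink)  # final iput of the journal inode


try:
    journal_destroy(Journal(), journal_inode_nlink=0)
except RuntimeError as exc:
    print(exc)
```

With a nonzero link count the model finishes quietly, which corresponds to a normal umount of an uncorrupted filesystem.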



Sami Liedes - April 28, 2012, 1:56 a.m.
On Fri, Apr 27, 2012 at 08:42:46PM -0500, Eric Sandeen wrote:
> > (This bug report serves also as a preview of the Berserker toolkit
> > that automates finding such bugs and minimizing the differences to
> > pristine filesystems. I plan on making the announcement and making the
> > toolkit available within the next few hours.)
> 
> sounds a bit like fsfuzzer, how is it different?

I hope the announcement I just sent answered this, but basically the
toolkit works at a somewhat higher level. It uses zzuf (not written by
me), which I believe is similar to fsfuzzer but only fuzzes the
filesystem image and does not itself run tests on it. Mostly the
toolkit is about managing a virtual machine to run tests automatically
and about minimizing the differences to pristine filesystems.
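
The idea of minimizing the differences to a pristine filesystem can be sketched as follows. This is not Berserker's actual algorithm, which the thread does not describe; it is a simple greedy reduction in Python. The function name `minimize_diff` and the `still_crashes` callback (which in practice would boot a VM and run the test on the image) are hypothetical.

```python
from typing import Callable


def minimize_diff(pristine: bytes, corrupted: bytes,
                  still_crashes: Callable[[bytes], bool]) -> bytes:
    """Revert differing bytes to their pristine value as long as the crash still reproduces."""
    if len(pristine) != len(corrupted):
        raise ValueError("images must have the same size")
    current = bytearray(corrupted)
    diff_offsets = [i for i, (p, c) in enumerate(zip(pristine, corrupted)) if p != c]
    for offset in diff_offsets:
        saved = current[offset]
        current[offset] = pristine[offset]   # try undoing this one change
        if not still_crashes(bytes(current)):
            current[offset] = saved          # the change is needed for the crash
    return bytes(current)


# Toy example: three fuzzed bytes, but only the one at offset 9 triggers the failure.
pristine = bytes(16)
fuzzed = bytearray(pristine)
fuzzed[3], fuzzed[9], fuzzed[12] = 0xFF, 0x08, 0x42

minimal = minimize_diff(pristine, bytes(fuzzed), lambda img: img[9] == 0x08)
print([i for i in range(16) if minimal[i] != pristine[i]])  # [9]
```

A single greedy pass like this needs one test run per differing byte; a real tool would likely reduce at bit granularity and might use a smarter search to cut down the number of VM runs.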

I confess I'm not intimately familiar with fsfuzzer (I should be!),
but I believe one of the advantages of Berserker is that it automates
the virtual machine handling and provides convenient scripts that can be
used with e.g. git bisect run.

	Sami

Patch

--- /dev/fd/63 2012-04-28 03:39:00.167101668 +0300
+++ /dev/fd/62 2012-04-28 03:39:00.167101668 +0300
@@ -32168,7 +32168,7 @@ 
 00170050  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
 *
 00170400  07 01 00 00 0c 00 01 02  2e 00 00 00 a4 00 00 00  |................|
-00170410  0c 00 02 02 2e 2e 00 00  08 01 00 00 e8 03 28 01  |..............(.|
+00170410  0c 00 02 02 2e 2e 00 00  08 00 00 00 e8 03 28 01  |..............(.|
 00170420  5c 78 32 66 64 65 76 69  63 65 73 5c 78 32 66 76  |\x2fdevices\x2fv|
 00170430  69 72 74 75 61 6c 5c 78  32 66 62 6c 6f 63 6b 5c  |irtual\x2fblock\|
 00170440  78 32 66 64 6d 2d 31 31  00 00 00 00 00 00 00 00  |x2fdm-11........|