Message ID | 877gkls1q7.fsf@openvz.org |
---|---|
State | Superseded, archived |
On Tue, Apr 02, 2013 at 01:47:44PM +0400, Dmitry Monakhov wrote:
> On Mon, 1 Apr 2013 23:15:07 -0700 (PDT), Christian Kujau <lists@nerdbynature.de> wrote:
> > Hi,
> >
> > my machine (PowerBook G4) just crashed and the only thing netconsole was
> > able to transmit was:
> >
> > ------------[ cut here ]------------
> > kernel BUG at /usr/local/src/linux-git/fs/ext4/inode.c:1591!
> >
> > But (unfortunately) nothing more. I have no clear way to reproduce this,
> > but I have some kind of a (longish) backstory to this, see below. The
> > system is running 3.9-rc4, its .config and dmesg:
> >
> > http://nerdbynature.de/bits/3.9.0-rc1/config.gz (oldconfig'ed to -rc4)
> > http://nerdbynature.de/bits/3.9.0-rc1/dmesg.txt (w/o the calltrace at the end)
> >
> >
> > I was having trouble all day downloading a file via bittorrent to an
> > ext4 filesystem.

It looks like the same problem [1], but that should have been fixed in
3.9-rc4. Frankly, I think the root cause is es_cache. Sorry, it hasn't
been well tested.

1. http://www.serverphorums.com/read.php?12,667656

Could you please revert your tree to this commit (3a225670) and try
again? I want to find out whether the regression was already present at
that commit and is still unfixed, or whether it was introduced after it.

Thanks in advance,
- Zheng
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

On Tue, 2 Apr 2013 09:58:46 -0700 (PDT), Christian Kujau <lists@nerdbynature.de> wrote:
> On Tue, 2 Apr 2013 at 13:47, Dmitry Monakhov wrote:
> > Unfortunately it is like a regression which we missed
> > due to s390x and ppc is not well tested.
>
> :-(
>
> > Ohh that is sad. Unfortunately I can't reproduce this on my own
> > environment. I have power mac pro G5 but w/o graphics card, so i cant
> > install linux on it. If you know how to do that w/o monitor please let
> > me know.
>
> Hm, w/o a graphics card..the only way to install any OS would be via a
> serial line, I assume.

I've tried to use qemu, but I cannot even boot the kernel:

Preparing to boot Linux version 3.9.0-rc4 (root@dbuild4.qa.sw.ru) (gcc version 4.4.5 (Debian 4.4.5-8) ) #6 Tue Apr 2 19:12:42 MSK 2013
Detected machine type: 00000400
command line: console=ttyS0,9600 console=tty0
memory layout at init:
  memory_limit : 00000000 (16 MB aligned)
  alloc_bottom : 0164d000
  alloc_top    : 20000000
  alloc_top_hi : 20000000
  rmo_top      : 20000000
  ram_top      : 20000000
found display   : /pci@80000000/QEMU,VGA@1, opening... done
copying OF device tree...
Building dt strings...
Building dt structure...
Device tree strings 0x0164e000 -> 0x0164e4d7
Device tree struct  0x0164f000 -> 0x01651000
Calling quiesce...
returning from prom_init
Trying to write invalid spr 1015 3f7 at c0008bc0

Can anybody help me with a simple thing: building and booting a kernel
via qemu?

>
> > So you just do bunch of writes/mmap to fallocated area.
> > The only guess I have is that some bug in extent status tree
>
> "writes/mmap to fallocated area" - this sounds like the exact thing this
> bittorrent client is doing!
>
> > Please run test with a patch which was posted here:
> > http://marc.info/?l=linux-kernel&m=136455173926544&w=2
> > This patch enable sanity checks for extent_status tree.
> > Also please try following patch. It voluntary disable es_lookup functionality.
>
> I'll find a way to reproduce this first and then play around with those patches.

Probably all you need is to run fsstress
(https://github.com/dmonakhov/xfstests/blob/master/ltp/fsstress.c),
like this:

#fsstress -d $YOUR_PATH -p 4 -z -f rmdir=10 -f link=10 -f creat=10 -f mkdir=10 \
     -f rename=30 -f stat=30 -f unlink=30 -f truncate=20 -n99999999

>
> Thanks for your response,
> Christian.
> --
> BOFH excuse #415:
>
> Maintenance window broken

On Tue, 2 Apr 2013 14:35:29 -0700 (PDT), Christian Kujau <lists@nerdbynature.de> wrote:
> On Tue, 2 Apr 2013 at 13:47, Dmitry Monakhov wrote:
> > So you just do bunch of writes/mmap to fallocated area.
> > The only guess I have is that some bug in extent status tree
> >
> > Please run test with a patch which was posted here:
> > http://marc.info/?l=linux-kernel&m=136455173926544&w=2
> > This patch enable sanity checks for extent_status tree.
> > Also please try following patch. It voluntary disable es_lookup functionality.
>
> I tested your patch below (applied to 3.9-rc4) and now the BUG is gone.
> The machine stays up and the corruption of that torrent file is gone too!
>
> Feel free to add my Tested-by: but I don't know if this will be the final
> solution to this issue, no?

No. This is just proof that es_cache is the root cause.
Please drop that patch and collect logs with a kernel that has only the
0001-enable-ES_AGGRESSIVE_TEST-V2.patch patch applied.
This can help us understand what went wrong.

From CAI Qian's logs (http://marc.info/?l=linux-ext4&m=136489690730402&w=2)
I found that in most cases the assertion failed because es_cache contains
BH_Mapped entries, but extent_tree has no data at all.
There is also another assertion failure where
es_cache {15/1/33490/MAPPED} != extent_tree {15/1/33579/BH_UNWRITTEN}

>
> Thanks!
> Christian.
>
> diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
> index fe3337a..95d27cd 100644
> --- a/fs/ext4/extents_status.c
> +++ b/fs/ext4/extents_status.c
> @@ -689,6 +689,7 @@ int ext4_es_lookup_extent(struct inode *inode, ext4_lblk_t lblk,
>  	trace_ext4_es_lookup_extent_enter(inode, lblk);
>  	es_debug("lookup extent in block %u\n", lblk);
>
> +	return 0;
>  	tree = &EXT4_I(inode)->i_es_tree;
>  	read_lock(&EXT4_I(inode)->i_es_lock);
>
> --
> BOFH excuse #414:
>
> tachyon emissions overloading the system
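
For readers following the thread: the two assertion failures mentioned in this message both amount to a cached extent disagreeing with what the on-disk extent tree reports. The following is a minimal userspace sketch of that kind of cross-check, not the code from the ES_AGGRESSIVE_TEST patch. All names here (`demo_extent`, `demo_status`, `demo_extent_matches`) are hypothetical, and the record layout simply mirrors the `{lblk/len/pblk/status}` notation used in the message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified status values; not the kernel's flags. */
enum demo_status { DEMO_HOLE, DEMO_MAPPED, DEMO_UNWRITTEN };

/* Hypothetical extent record, following the {lblk/len/pblk/status} notation. */
struct demo_extent {
	uint32_t lblk;            /* first logical block */
	uint32_t len;             /* length in blocks */
	uint64_t pblk;            /* first physical block */
	enum demo_status status;
};

/*
 * Return true when the cached entry agrees with the authoritative one.
 * A NULL ondisk pointer models "the extent tree has no data at all";
 * in that case only a cached hole counts as consistent.
 */
static bool demo_extent_matches(const struct demo_extent *cached,
				const struct demo_extent *ondisk)
{
	if (ondisk == NULL)
		return cached->status == DEMO_HOLE;

	return cached->lblk == ondisk->lblk &&
	       cached->len == ondisk->len &&
	       cached->pblk == ondisk->pblk &&
	       cached->status == ondisk->status;
}
```

With the values quoted above, a cached `{15, 1, 33490, DEMO_MAPPED}` compared against an on-disk `{15, 1, 33579, DEMO_UNWRITTEN}` fails the check, as does any mapped cache entry compared against a NULL on-disk result.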

diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index fe3337a..95d27cd 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -689,6 +689,7 @@ int ext4_es_lookup_extent(struct inode *inode, ext4_lblk_t lblk,
 	trace_ext4_es_lookup_extent_enter(inode, lblk);
 	es_debug("lookup extent in block %u\n", lblk);
 
+	return 0;
 	tree = &EXT4_I(inode)->i_es_tree;
 	read_lock(&EXT4_I(inode)->i_es_lock);
 
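
As a rough illustration of why this one-line change hides the bug: assuming a return value of 0 from the lookup means "not found in the cache", an early `return 0` turns every lookup into a miss, so the caller always falls back to the authoritative extent tree. The sketch below models that in userspace. Every name in it (`demo_cache_enabled`, `demo_cache_lookup`, `demo_tree_lookup`, `demo_map_block`) is hypothetical, and the block numbers are borrowed from the mismatch reported earlier in the thread purely as sample values.

```c
#include <stdint.h>

/* Hypothetical switch: 0 mimics the effect of the early "return 0". */
static int demo_cache_enabled = 0;

/* Hypothetical cache lookup: returns 1 on a hit, 0 on a miss. */
static int demo_cache_lookup(uint32_t lblk, uint64_t *pblk)
{
	(void)lblk;
	if (!demo_cache_enabled)
		return 0;       /* always report a miss */
	*pblk = 33490;          /* pretend the cache holds a stale mapping */
	return 1;
}

/* Hypothetical slow path standing in for the on-disk extent tree. */
static uint64_t demo_tree_lookup(uint32_t lblk)
{
	(void)lblk;
	return 33579;           /* pretend this is the authoritative mapping */
}

/* Caller: trust the cache on a hit, otherwise consult the tree. */
static uint64_t demo_map_block(uint32_t lblk)
{
	uint64_t pblk;

	if (demo_cache_lookup(lblk, &pblk))
		return pblk;
	return demo_tree_lookup(lblk);
}
```

With the cache disabled, `demo_map_block(15)` yields the tree's answer; with it enabled, the stale cached value wins. That matches the reading in the thread that the patch is a diagnostic aid pointing at es_cache, not a fix.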