Patchwork logfs unmount bug

Submitter srimugunthan dhandapani
Date Aug. 31, 2011, 7:22 a.m.
Message ID <CAMjNe_d2pOC9+2nSy2PA=2aigF95vZv3_A-jPyWyTXteW3mqFw@mail.gmail.com>
Permalink /patch/112450/
State New

Comments

srimugunthan dhandapani - Aug. 31, 2011, 7:22 a.m.
On Wed, Aug 31, 2011 at 11:28 AM, Jörn Engel <joern@logfs.org> wrote:

> I can do.  Can you ensure you have my patch applied, rerun the test
> case and send me the kernel results?

I applied your patch1 and the bonnie output is the same (stuck at
"Creating  files in sequential order ..."). The kernel is from your
git directory (http://git.kernel.org/?p=linux/kernel/git/joern/logfs.git;a=summary).
I am pretty sure I applied the patch. The git diff output is below [1].

 The kernel log is at
 https://docs.google.com/leaf?id=0BycgLWCW61phNjY0ZDg4ZjUtYzAyMy00YTgwLWFlMmItNjlmZWIzMWFlNGUy&hl=en_US

While I still have your attention, I would like to point out the max
writepage size restriction in logfs. Currently logfs has a max
writepage size of 4K, while large-page NAND flashes and MLC NAND now
come in 8K page sizes. As the use case for logfs is large NAND
flashes (mostly with parallel and DMA write capabilities), it may be
essential to remove the 4K writepage size restriction.
I was only able to change the logfs-tools for >4K writepage size. If
you send a patch that makes logfs usable for writepage size >4K, I
can try it on real hardware instead of nandsim and give you the results
:-)


[1]

Thanks,
mugunthan
Jörn Engel - Aug. 31, 2011, 7:49 a.m.
On Wed, 31 August 2011 12:52:54 +0530, srimugunthan dhandapani wrote:
> On Wed, Aug 31, 2011 at 11:28 AM, Jörn Engel <joern@logfs.org> wrote:
> 
> > I can do.  Can you ensure you have my patch applied, rerun the test
> > case and send me the kernel results?
> 
> I applied your patch1 and the bonnie output is the same (stuck at
> "Creating  files in sequential order ..."). The kernel is from your
> git directory (http://git.kernel.org/?p=linux/kernel/git/joern/logfs.git;a=summary).
> I am pretty sure I applied the patch. The git diff output is below [1].
> 
>  The kernel log is at
>  https://docs.google.com/leaf?id=0BycgLWCW61phNjY0ZDg4ZjUtYzAyMy00YTgwLWFlMmItNjlmZWIzMWFlNGUy&hl=en_US

600M.  And starting at line 34285 or about 0% in, the output mutates
to something like the below.  Not very useful. :(

Aug 31 12:28:06 mll kernel: [  991.561293] logfs_write_obj_alia 8,, 83,(3f(320(#2 #s #es sesaseiaslialilialilialialialialialialialialialialialialialilialialialilialialialililialialialilialialialilililialialialialialilialialiallialilialialialiaali_alj_abjo_ws_fsgfogflo lo] l4] 7581586156.561.591.9 9  [  >[7>[<7
Aug 31 12:28:06 mll kernel: <7
Aug 31 12:28:07 mll kernel: last message repeated 4 times
Aug 31 12:28:06 mll kernel: <
Aug 31 12:28:06 mll kernel: <7
Aug 31 12:28:06 mll kernel: <7
Aug 31 12:28:06 mll kernel: <
Aug 31 12:28:06 mll kernel: <
Aug 31 12:28:06 mll kernel: <7
Aug 31 12:28:06 mll kernel: <7
Aug 31 12:28:06 mll kernel: <

Can you set LOGFS_DEBUG to 0, both to cut down the noise and in the
hope that the log output doesn't get corrupted like this again?

> While I still have your attention, I would like to point out the max
> writepage size restriction in logfs. Currently logfs has a max
> writepage size of 4K, while large-page NAND flashes and MLC NAND now
> come in 8K page sizes. As the use case for logfs is large NAND
> flashes (mostly with parallel and DMA write capabilities), it may be
> essential to remove the 4K writepage size restriction.
> I was only able to change the logfs-tools for >4K writepage size. If
> you send a patch that makes logfs usable for writepage size >4K, I
> can try it on real hardware instead of nandsim and give you the results
> :-)

That sure sounds useful.  I'll have a look...

Jörn
srimugunthan dhandapani - Aug. 31, 2011, 12:49 p.m.
On Wed, Aug 31, 2011 at 1:19 PM, Jörn Engel <joern@logfs.org> wrote:


> Can you set LOGFS_DEBUG to 0, both to cut down the noise and in the
> hope that the log output doesn't get corrupted like this again?

It doesn't crash, so nothing much is printed in kern.log if LOGFS_DEBUG=0.
I produced the log with LOGFS_DEBUG=0xDFF (without LOGFS_DEBUG_ALIASES).
It's a 60MB log file, "kern_LOGFS_DEBUG=0xDFF.log", shared at the following link:

https://docs.google.com/leaf?id=0BycgLWCW61phNjY0ZDg4ZjUtYzAyMy00YTgwLWFlMmItNjlmZWIzMWFlNGUy&hl=en_US

Thanks,
mugunthan
Jörn Engel - Aug. 31, 2011, 2:17 p.m.
On Wed, 31 August 2011 18:19:30 +0530, srimugunthan dhandapani wrote:
> On Wed, Aug 31, 2011 at 1:19 PM, Jörn Engel <joern@logfs.org> wrote:
> 
> > Can you set LOGFS_DEBUG to 0, both to cut down the noise and in the
> > hope that the log output doesn't get corrupted like this again?
> 
> It doesn't crash, so nothing much is printed in kern.log if LOGFS_DEBUG=0.

Ok.  So for those of us who are a bit slow (me), you say that you
cannot reproduce the bug with LOGFS_DEBUG=0.  In order to reproduce
it, you have to set LOGFS_DEBUG=LOGFS_DEBUG_ALL or some such.  Is that
right?

> I produced the log with LOGFS_DEBUG=0xDFF (without LOGFS_DEBUG_ALIASES).
> It's a 60MB log file, "kern_LOGFS_DEBUG=0xDFF.log", shared at the following link:
> 
> https://docs.google.com/leaf?id=0BycgLWCW61phNjY0ZDg4ZjUtYzAyMy00YTgwLWFlMmItNjlmZWIzMWFlNGUy&hl=en_US

LOGFS_DEBUG=0xDFF is equivalent to LOGFS_DEBUG=LOGFS_DEBUG_ALL - the
unset bits aren't used anyway.  Which leaves me utterly confused.
Were you unable to reproduce, but you sent me the log anyway?

Jörn
srimugunthan dhandapani - Sept. 1, 2011, 6:08 a.m.
> Were you unable to reproduce, but you sent me the log anyway?
I was not able to reproduce the file.c:172 bug. Please refer to my
previous mail.

  " On Mon, Aug 29, 2011 at 3:37 PM, srimugunthan dhandapani
<srimugunthan.dhandapani@gmail.com> wrote:

> To clarify on the bugs I reported:
> 1. The bonnie test (bonnie -s 20 -r 10) does not complete. It gets stuck
> at "Creating  files in sequential order ..."
> (tested with nandsim, kernels 3.0.1 and 2.6.38.8; consistently
> reproducible on 2 machines.)
> The free command shows that, while the bonnie test ran for half an
> hour, free space changed from 2982340 KB to 2550156 KB.
>
> 2. With a mount-mkdir-unmount loop, logfs hits a kernel BUG at segment.c:784
> (tested with nandsim, the kernel is from your git.)
>
> 3. With the bonnie test, it sometimes hits a kernel BUG at file.c:172
> (happens only on the unstable kernel that I was trying; not
> consistently reproducible on other kernels)
>
> Regarding the third bug, to double-check, I thought of taking the
> log for your patch once more.
> But I have recompiled the kernel on my machine and unfortunately I am
> no longer able to reproduce the third bug.
>
> I think the first two bugs should be reproducible at your end. If not,
> please let me know, and I will see what's wrong with my test setup. "

It's frustrating when you ask me to take a new set of logs without
reading my previous mails.
To repeat, I am not able to reproduce the third bug. I think the first
two bugs should be reproducible at your end.
I thought that you were asking me to take the logs because you were not
able to reproduce them either.
Out of curiosity, are you able to run bonnie successfully with logfs+nandsim?

Thanks,
mugunthan
srimugunthan dhandapani - Sept. 1, 2011, 9:46 a.m.
>
>  " On Mon, Aug 29, 2011 at 3:37 PM, srimugunthan dhandapani
> <srimugunthan.dhandapani@gmail.com> wrote:
>
>> To clarify on the bugs I reported:
>> 1. The bonnie test (bonnie -s 20 -r 10) does not complete. It gets stuck
>> at "Creating  files in sequential order ..."
>> (tested with nandsim, kernels 3.0.1 and 2.6.38.8; consistently
>> reproducible on 2 machines.)
>> The free command shows that, while the bonnie test ran for half an
>> hour, free space changed from 2982340 KB to 2550156 KB.

As far as I can infer, this bug happens because it is stuck in an
infinite loop in logfs_write_anchor() in journal.c:


again:
	super->s_no_je = 0;
	for_each_area(i) {
		if (!super->s_area[i]->a_is_open)
			continue;
		super->s_sum_index = i;
		err = logfs_write_je(sb, logfs_write_area);
		if (err)
			goto again;
	}
	err = logfs_write_obj_aliases(sb);
	if (err)
		goto again;
		
It is stuck in the goto again loop.

thanks

Patch

diff --git a/fs/logfs/file.c b/fs/logfs/file.c
index c2ad702..ee3c76a 100644
--- a/fs/logfs/file.c
+++ b/fs/logfs/file.c
@@ -158,7 +158,6 @@  static int logfs_writepage(struct page *page, struct writeba
        zero_user_segment(page, offset, PAGE_CACHE_SIZE);
        return __logfs_writepage(page);
 }
-
 static void logfs_invalidatepage(struct page *page, unsigned long offset)
 {
        struct logfs_block *block = logfs_block(page);
@@ -166,12 +165,24 @@  static void logfs_invalidatepage(struct page *page, unsign
        if (block->reserved_bytes) {
                struct super_block *sb = page->mapping->host->i_sb;
                struct logfs_super *super = logfs_super(sb);
-
                super->s_dirty_pages -= block->reserved_bytes;
-               block->ops->free_block(sb, block);
-               BUG_ON(bitmap_weight(block->alias_map, LOGFS_BLOCK_FACTOR));
+       //      block->ops->free_block(sb, block);
+       //      BUG_ON(bitmap_weight(block->alias_map, LOGFS_BLOCK_FACTOR));
+               if (bitmap_weight(block->alias_map, LOGFS_BLOCK_FACTOR))
+               {
+                       printk(KERN_DEBUG"logfs_invalidatepage(%lx, %x, %llx)\n",
+                                       page->mapping->host->i_ino,
+                                       page->mapping->host->i_nlink,
+                                       page->mapping->host->i_size);
+                       move_page_to_btree(page);
+               } else
+               {
+                       block->ops->free_block(sb, block);
+               }
        } else
+       {
                move_page_to_btree(page);
+       }
        BUG_ON(PagePrivate(page) || page->private);
 }

diff --git a/fs/logfs/logfs.h b/fs/logfs/logfs.h
index 9e74902..24c19a6 100644
--- a/fs/logfs/logfs.h
+++ b/fs/logfs/logfs.h
@@ -35,7 +35,9 @@ 
 #define LOGFS_DEBUG_BLOCKMOVE  (0x0400)
 #define LOGFS_DEBUG_ALL                (0xffffffff)

-#define LOGFS_DEBUG            (0x01)
+//#define LOGFS_DEBUG          (0x01)
+
+#define LOGFS_DEBUG LOGFS_DEBUG_ALL
 /*
  * To enable specific log messages, simply define LOGFS_DEBUG to match any
  * or all of the above.