From patchwork Sat May  5 01:38:41 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sami Liedes <sami.liedes@iki.fi>
X-Patchwork-Id: 157014
Return-Path: <linux-ext4-owner@vger.kernel.org>
X-Original-To: patchwork-incoming@ozlabs.org
Delivered-To: patchwork-incoming@ozlabs.org
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by ozlabs.org (Postfix) with ESMTP id 86504B6FD1
	for <patchwork-incoming@ozlabs.org>;
	Sat,  5 May 2012 11:38:54 +1000 (EST)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755732Ab2EEBiw (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);
	Fri, 4 May 2012 21:38:52 -0400
Received: from smtp-4.hut.fi ([130.233.228.94]:34912 "EHLO smtp-4.hut.fi"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751154Ab2EEBiw (ORCPT <rfc822;linux-ext4@vger.kernel.org>);
	Fri, 4 May 2012 21:38:52 -0400
Received: from localhost (katosiko.hut.fi [130.233.228.115])
	by smtp-4.hut.fi (8.13.6/8.12.10) with ESMTP id q451coO5019691;
	Sat, 5 May 2012 04:38:50 +0300
Received: from smtp-4.hut.fi ([130.233.228.94])
	by localhost (katosiko.hut.fi [130.233.228.115]) (amavisd-new,
	port 10024)
	with LMTP id 30777-1581; Sat,  5 May 2012 04:38:49 +0300 (EEST)
Received: from kosh.localdomain (kosh.hut.fi [130.233.228.12])
	by smtp-4.hut.fi (8.13.6/8.12.10) with ESMTP id q451cfeA019688;
	Sat, 5 May 2012 04:38:42 +0300
Received: by kosh.localdomain (Postfix, from userid 43888)
	id CC1566FD; Sat,  5 May 2012 04:38:41 +0300 (EEST)
Date: Sat, 5 May 2012 04:38:41 +0300
From: Sami Liedes <sami.liedes@iki.fi>
To: linux-ext4@vger.kernel.org, linux-fsdev@vger.kernel.org
Subject: ext2 hang on (intentionally) corrupted filesystem
Message-ID: <20120505013841.GD13332@sli.dy.fi>
Mail-Followup-To: linux-ext4@vger.kernel.org, linux-fsdev@vger.kernel.org
MIME-Version: 1.0
Content-Disposition: inline
User-Agent: Mutt/1.5.21 (2010-09-15)
X-TKK-Virus-Scanned: by amavisd-new-2.1.2-hutcc at katosiko.hut.fi
Sender: linux-ext4-owner@vger.kernel.org
Precedence: bulk
List-ID: <linux-ext4.vger.kernel.org>
X-Mailing-List: linux-ext4@vger.kernel.org

Hi,

There seems to be a bug in the ext2 implementation (in vanilla 3.3.4)
where operations on a corrupted ext2 filesystem cause a hung task:

1. wget http://sli.dy.fi/~sliedes/berserker/testcases/ext2.110.min.bz2
2. mount ... /mnt -t ext2 -o errors=continue
3. Do some operations; what I do (it's the rm that crashes):

  timeout 30 cp -r doc doc2 >&/dev/null
  timeout 30 find -xdev >&/dev/null
  timeout 30 find -xdev -print0 2>/dev/null |xargs -0 touch -- 2>/dev/null
  timeout 30 mkdir tmp >&/dev/null
  timeout 30 echo whoah >tmp/filu 2>/dev/null
  timeout 30 rm -rf /mnt/* >&/dev/null

4. The rm task hangs

The filesystem in fact differs from a pristine, fully working ext2
filesystem by only one bit:

------------------------------------------------------------
$ diff -u <(hd testimg.ext2) <(hd testimg.ext2.110.min)
------------------------------------------------------------

The buggy filesystem (10 MiB uncompressed) can be downloaded from

   http://sli.dy.fi/~sliedes/berserker/testcases/ext2.110.min.bz2

and the pristine filesystem from

   http://sli.dy.fi/~sliedes/berserker/testcases/pristine.ext2.bz2

See the dmesg output below.

	Sami


------------------------------------------------------------
INFO: task rm:1549 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
rm              D ffff880006f0cd40     0  1549   1548 0x00020004
 ffff88000560ddc8 0000000000000046 ffff8800068b2040 ffff88000560dfd8
 ffff88000560dfd8 ffff88000560dfd8 ffff880007852040 ffff8800068b2040
 ffff88000560de08 ffff880006f0cd00 ffff8800068b2040 0000000000000246
Call Trace:
 [<ffffffff8171d609>] schedule+0x39/0x50
 [<ffffffff8171baa0>] mutex_lock_nested+0x130/0x2f0
 [<ffffffff810fb467>] ? vfs_rmdir+0x67/0x120
 [<ffffffff810fb467>] vfs_rmdir+0x67/0x120
 [<ffffffff810fb62b>] do_rmdir+0x10b/0x120
 [<ffffffff81556e5d>] ? trace_hardirqs_off_thunk+0x3a/0x3c
 [<ffffffff810fb94d>] sys_unlinkat+0x2d/0x40
 [<ffffffff817204b1>] sysenter_dispatch+0x7/0x2a
 [<ffffffff81556e1e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
2 locks held by rm/1549:
 #0:  (&type->i_mutex_dir_key#4/1){+.+.+.}, at: [<ffffffff810fb58c>] do_rmdir+0x6c/0x120
 #1:  (&type->i_mutex_dir_key#4){+.+.+.}, at: [<ffffffff810fb467>] vfs_rmdir+0x67/0x120
Kernel panic - not syncing: hung_task: blocked tasks
Pid: 361, comm: khungtaskd Not tainted 3.3.4 #3
Call Trace:
 [<ffffffff81713aff>] panic+0xb5/0x1be
 [<ffffffff8108a017>] watchdog+0x2b7/0x2c0
 [<ffffffff81089dc6>] ? watchdog+0x66/0x2c0
 [<ffffffff81089d60>] ? hung_task_panic+0x20/0x20
 [<ffffffff810525cd>] kthread+0x8d/0xa0
 [<ffffffff81720304>] kernel_thread_helper+0x4/0x10
 [<ffffffff8171ec30>] ? retint_restore_args+0x13/0x13
 [<ffffffff81052540>] ? kthread_flush_work_fn+0x10/0x10
 [<ffffffff81720300>] ? gs_change+0x13/0x13
Rebooting in 1 seconds..
------------------------------------------------------------
--- /dev/fd/63 2012-05-05 04:26:49.972546154 +0300
+++ /dev/fd/62 2012-05-05 04:26:49.972546154 +0300
@@ -13520,7 +13520,7 @@
 00902c90  73 64 65 31 00 00 00 00  00 00 00 00 00 00 00 00  |sde1............|
 00902ca0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
 *
-00903000  1d 05 00 00 0c 00 01 02  2e 00 00 00 29 00 00 00  |............)...|
+00903000  1d 05 00 00 0c 00 01 02  6e 00 00 00 29 00 00 00  |........n...)...|
 00903010  0c 00 02 02 2e 2e 00 00  1e 05 00 00 e8 03 26 01  |..............&.|
 00903020  5c 78 32 66 64 65 76 69  63 65 73 5c 78 32 66 76  |\x2fdevices\x2fv|
 00903030  69 72 74 75 61 6c 5c 78  32 66 74 74 79 5c 78 32  |irtual\x2ftty\x2|