From patchwork Fri Aug 17 13:50:17 2012
X-Patchwork-Submitter: Theodore Ts'o
X-Patchwork-Id: 178208
From: Theodore Ts'o
To: Ext4 Developers List
Cc: Theodore Ts'o, stable@vger.kernel.org
Subject: [PATCH] ext4: fix kernel BUG on large-scale rm -rf commands
Date: Fri, 17 Aug 2012 09:50:17 -0400
Message-Id: <1345211417-26968-1-git-send-email-tytso@mit.edu>
In-Reply-To: <20120817131558.GA11439@thunk.org>
References: <20120817131558.GA11439@thunk.org>

Commit 968dee7722: "ext4: fix hole punch failure when depth is greater
than 0" introduced a regression in v3.5.1/v3.6-rc1 which caused kernel
crashes when users ran "rm -rf" on large directory hierarchies on ext4
filesystems on RAID devices:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000028

    Process rm (pid: 18229, threadinfo ffff8801276bc000, task ffff880123631710)
    Call Trace:
     [] ? __ext4_handle_dirty_metadata+0x83/0x110
     [] ext4_ext_truncate+0x193/0x1d0
     [] ? ext4_mark_inode_dirty+0x7f/0x1f0
     [] ext4_truncate+0xf5/0x100
     [] ext4_evict_inode+0x461/0x490
     [] evict+0xa2/0x1a0
     [] iput+0x103/0x1f0
     [] do_unlinkat+0x154/0x1c0
     [] ? sys_newfstatat+0x2a/0x40
     [] sys_unlinkat+0x1b/0x50
     [] system_call_fastpath+0x16/0x1b
    Code: 8b 4d 20 0f b7 41 02 48 8d 04 40 48 8d 04 81 49 89 45 18 0f b7 49 02 48 83 c1 01 49 89 4d 00 e9 ae f8 ff ff 0f 1f 00 49 8b 45 28 <48> 8b 40 28 49 89 45 20 e9 85 f8 ff ff 0f 1f 80 00 00 00
    RIP [] ext4_ext_remove_space+0xa34/0xdf0

This could be reproduced as follows:

The problem in commit 968dee7722 was that it caused the variable 'i'
to be left uninitialized if the truncate required more space than was
available in the journal.  In that case,
ext4_ext_truncate_extend_restart() returns -EAGAIN, which causes
ext4_ext_remove_space() to restart the truncate operation after
starting a new jbd2 handle, and on that restart path 'i' was never
set.  Fix this by setting 'i' to 0 where path[0] is initialized.

Reported-by: Maciej Żenczykowski
Reported-by: Marti Raudsepp
Tested-by: Fengguang Wu
Signed-off-by: "Theodore Ts'o"
Cc: stable@vger.kernel.org
---
 fs/ext4/extents.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index cd0c7ed..aabbb3f 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -2662,6 +2662,7 @@ cont:
 		}
 		path[0].p_depth = depth;
 		path[0].p_hdr = ext_inode_hdr(inode);
+		i = 0;
 
 		if (ext4_ext_check(inode, path[0].p_hdr, depth)) {
 			err = -EIO;
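
For reviewers who want to see the failure mode in isolation, here is a
minimal userspace sketch of the restart pattern described above.  It is
not the ext4 code: restart_start_index(), walk_once(), SKETCH_EAGAIN and
the "budget" counter (a stand-in for journal credits) are all
hypothetical names invented for illustration.  To keep the sketch free
of undefined behaviour, 'i' is merely stale on the restart path rather
than truly uninitialized.

```c
#include <stdlib.h>

#define SKETCH_EAGAIN 11	/* hypothetical stand-in for the kernel's EAGAIN */

struct path_entry { int depth; };	/* placeholder for a per-level path slot */

/* Walk from *i up to max_depth, spending one unit of budget per level.
 * Returns -SKETCH_EAGAIN when the budget runs out before the walk ends. */
static int walk_once(int *i, int max_depth, int *budget)
{
	while (*i < max_depth) {
		if (*budget == 0)
			return -SKETCH_EAGAIN;
		(*budget)--;
		(*i)++;
	}
	return 0;
}

/* Returns the index at which the restarted pass began, -1 if no restart
 * was needed, or -2 on allocation failure.  reset_on_restart selects
 * the fixed (1) or the buggy (0) behaviour. */
int restart_start_index(int max_depth, int budget, int reset_on_restart)
{
	struct path_entry *path = NULL;
	int start_of_restart = -1;
	int restarted = 0;
	int i = 0;
	int err;

again:
	if (!path) {
		path = calloc((size_t)max_depth + 1, sizeof(*path));
		if (!path)
			return -2;
		if (reset_on_restart)
			i = 0;	/* the fix: a fresh path needs a fresh index */
	}
	if (restarted && start_of_restart < 0)
		start_of_restart = i;

	err = walk_once(&i, max_depth, &budget);

	free(path);
	path = NULL;
	if (err == -SKETCH_EAGAIN) {
		budget = max_depth;	/* models starting a new handle */
		restarted = 1;
		goto again;
	}
	return start_of_restart;
}
```

Without the reset, the second pass begins at whatever index the first
pass reached, although the freshly allocated path only describes level
0; with the reset, it begins at 0 again.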