From patchwork Sun Oct 28 04:23:57 2012 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Sandeen X-Patchwork-Id: 194647 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 20ECC2C0095 for ; Sun, 28 Oct 2012 15:24:20 +1100 (EST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1750951Ab2J1EYJ (ORCPT ); Sun, 28 Oct 2012 00:24:09 -0400 Received: from mx1.redhat.com ([209.132.183.28]:35933 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750931Ab2J1EYJ (ORCPT ); Sun, 28 Oct 2012 00:24:09 -0400 Received: from int-mx09.intmail.prod.int.phx2.redhat.com (int-mx09.intmail.prod.int.phx2.redhat.com [10.5.11.22]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id q9S4Nvdl012642 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Sun, 28 Oct 2012 00:23:57 -0400 Received: from liberator.sandeen.net (ovpn01.gateway.prod.ext.phx2.redhat.com [10.5.9.1]) by int-mx09.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id q9S4NtJL021846 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Sun, 28 Oct 2012 00:23:56 -0400 Message-ID: <508CB35D.9080102@redhat.com> Date: Sat, 27 Oct 2012 23:23:57 -0500 From: Eric Sandeen User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:16.0) Gecko/20121010 Thunderbird/16.0.1 MIME-Version: 1.0 To: Nix CC: "Ted Ts'o" , linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org, gregkh@linuxfoundation.org Subject: [PATCH] ext4: fix unjournaled inode bitmap modification References: <87objupjlr.fsf@spindle.srvr.nix> <20121023013343.GB6370@fieldses.org> <87mwzdnuww.fsf@spindle.srvr.nix> <20121023143019.GA3040@fieldses.org> <874nllxi7e.fsf_-_@spindle.srvr.nix> <87pq48nbyz.fsf_-_@spindle.srvr.nix> In-Reply-To: <87pq48nbyz.fsf_-_@spindle.srvr.nix> X-Enigmail-Version: 1.4.5 X-Scanned-By: MIMEDefang 2.68 on 10.5.11.22 Sender: linux-ext4-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org commit 119c0d4460b001e44b41dcf73dc6ee794b98bd31 changed ext4_new_inode() such that the inode bitmap was being modified outside a transaction, which could lead to corruption, and was discovered when journal_checksum found a bad checksum in the journal during log replay. Nix ran into this when using the journal_async_commit mount option, which enables journal checksumming. The ensuing journal replay failures due to the bad checksums led to filesystem corruption reported as the now infamous "Apparent serious progressive ext4 data corruption bug" I've tested this by mounting with journal_checksum and running fsstress then dropping power; I've also tested by hacking DM to create snapshots w/o first quiescing, which allows me to test journal replay repeatedly w/o actually power-cycling the box. Without the patch I hit a journal checksum error every time. With this fix it survives many iterations. Signed-off-by: Eric Sandeen cc: Nix --- A little more going on here to try to properly handle error cases & moving to the next group; despite ext4_handle_release_buffer being a no-op, I've tried to sprinkle it in at the right places. Double checking on review is probably a fine idea ;) -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c index 4facdd2..1d18fba 100644 --- a/fs/ext4/ialloc.c +++ b/fs/ext4/ialloc.c @@ -706,10 +706,17 @@ got_group: continue; } - brelse(inode_bitmap_bh); + if (inode_bitmap_bh) { + ext4_handle_release_buffer(handle, inode_bitmap_bh); + brelse(inode_bitmap_bh); + } inode_bitmap_bh = ext4_read_inode_bitmap(sb, group); if (!inode_bitmap_bh) goto fail; + BUFFER_TRACE(inode_bitmap_bh, "get_write_access"); + err = ext4_journal_get_write_access(handle, inode_bitmap_bh); + if (err) + goto fail; repeat_in_this_group: ino = ext4_find_next_zero_bit((unsigned long *) @@ -734,10 +741,16 @@ repeat_in_this_group: if (ino < EXT4_INODES_PER_GROUP(sb)) goto repeat_in_this_group; } + ext4_handle_release_buffer(handle, inode_bitmap_bh); err = -ENOSPC; goto out; got: + BUFFER_TRACE(inode_bitmap_bh, "call ext4_handle_dirty_metadata"); + err = ext4_handle_dirty_metadata(handle, NULL, inode_bitmap_bh); + if (err) + goto fail; + /* We may have to initialize the block bitmap if it isn't already */ if (ext4_has_group_desc_csum(sb) && gdp->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT)) { @@ -771,11 +784,6 @@ got: goto fail; } - BUFFER_TRACE(inode_bitmap_bh, "get_write_access"); - err = ext4_journal_get_write_access(handle, inode_bitmap_bh); - if (err) - goto fail; - BUFFER_TRACE(group_desc_bh, "get_write_access"); err = ext4_journal_get_write_access(handle, group_desc_bh); if (err) @@ -823,11 +831,6 @@ got: } ext4_unlock_group(sb, group); - BUFFER_TRACE(inode_bitmap_bh, "call ext4_handle_dirty_metadata"); - err = ext4_handle_dirty_metadata(handle, NULL, inode_bitmap_bh); - if (err) - goto fail; - BUFFER_TRACE(group_desc_bh, "call ext4_handle_dirty_metadata"); err = ext4_handle_dirty_metadata(handle, NULL, group_desc_bh); if (err)