From patchwork Thu May 8 00:05:41 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Darrick Wong
X-Patchwork-Id: 346857
Date: Wed, 7 May 2014 17:05:41 -0700
From: "Darrick J.
 Wong"
To: =?utf-8?B?THVrw6HFoQ==?= Czerner
Cc: tytso@mit.edu, linux-ext4@vger.kernel.org
Subject: Re: [PATCH 10/37] e2fsck: verify checksums after checking everything else
Message-ID: <20140508000541.GB8923@birch.djwong.org>
References: <20140501231222.31890.82860.stgit@birch.djwong.org>
 <20140501231328.31890.34436.stgit@birch.djwong.org>
 <20140505225647.GJ8434@birch.djwong.org>
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Source-IP: acsinet22.oracle.com [141.146.126.238]
Sender: linux-ext4-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-ext4@vger.kernel.org

On Tue, May 06, 2014 at 01:32:32PM +0200, Lukáš Czerner wrote:
> On Mon, 5 May 2014, Darrick J. Wong wrote:
>
> > Date: Mon, 5 May 2014 15:56:47 -0700
> > From: Darrick J. Wong
> > To: Lukáš Czerner
> > Cc: tytso@mit.edu, linux-ext4@vger.kernel.org
> > Subject: Re: [PATCH 10/37] e2fsck: verify checksums after checking everything
> >  else
> >
> > On Fri, May 02, 2014 at 02:32:11PM +0200, Lukáš Czerner wrote:
> > > On Thu, 1 May 2014, Darrick J. Wong wrote:
> > >
> > > > Date: Thu, 01 May 2014 16:13:28 -0700
> > > > From: Darrick J. Wong
> > > > To: tytso@mit.edu, darrick.wong@oracle.com
> > > > Cc: linux-ext4@vger.kernel.org
> > > > Subject: [PATCH 10/37] e2fsck: verify checksums after checking everything else
> > > >
> > > > There's a particular problem with e2fsck's user interface where
> > > > checksum errors are concerned: Fixing the first complaint about
> > > > a checksum problem results in the inode being cleared even if e2fsck
> > > > could otherwise have recovered it. While this mode is useful for
> > > > cleaning the remaining broken crud off the filesystem, we could at
> > > > least default to checking everything /else/ and only complaining about
> > > > the incorrect checksum if fsck finds nothing else wrong.
> > > >
> > > > So, plumb in a config option.
> > > > We default to "verify and checksum"
> > > > unless the user tell us otherwise.
> > >
> > > I wonder whether it would not be better to always check the checksum
> > > of an object because it might yield additional information.
> > >
> > > If the checksum is good and the object is somewhat broken that it's
> > > highly likely that we have a problem within a kernel (or possibly
> > > e2fsprogs if some other operations were performed)
> > >
> > > If the checksum is bad and the object is bad, then it's likely that
> > > the corruption happened outside of the file system code, in memory,
> > > on disk or in transfer.
> > >
> > > If checksum is bad and the object is good then it's trickier since it
> > > can be kernel metadata csum bug, unlucky silent corruption, or
> > > intentional change of the metadata.
> > >
> > > It's not huge amount of information we can get from it, but I think
> > > that it might be useful when dealing with corrupted file system.
> >
> > Hm. So right now, the object verification code works roughly like this:
> >
> > A) Verify checksum, offer to zero object if strict_csums and csum failure.
> > B) Check everything else and offer to fix broken things.
> > C) Verify checksum again; if !strict_csums and csum failure, offer to zero the
> >    object.
> >
> > Do you think that it would be helpful to users if e2fsck warned of checksum
> > verification failures during step (A) if strict_csums is set? I think that
> > would help users (or us developers) to distinguish those three scenarios.
> > It wouldn't be difficult to make fix_problem() spit out the message.
>
> Yes, I think that this is going to be helpful to both, users and
> developers.
> I am not sure how easy or hard it would be but having
> e2sfck specifically say that:
>
> "Object checksum is corrupted, but the object seems fine"
>
> or
>
> "Object checksum is ok, but the object itself seems corrupted"
>
> or
>
> "object checksum is corrupted and the object itself is corrupted"
>
> after the checksum verification and object check.
>
> But your solution would be useful as well.

Ok, I've changed the patch to spit out this, what do you think:

Pass 1: Checking inodes, blocks, and sizes
Inode 12 checksum does not match inode. Running sanity checks.
Inode 12 passes checks, but checksum does not match inode. Fix? yes

--D
---
From: Darrick J. Wong
Subject: [PATCH] e2fsck: verify checksums after checking everything else

There's a particular problem with e2fsck's user interface where
checksum errors are concerned: Fixing the first complaint about
a checksum problem results in the inode being cleared even if e2fsck
could otherwise have recovered it. While this mode is useful for
cleaning the remaining broken crud off the filesystem, we could at
least default to checking everything /else/ and only complaining about
the incorrect checksum if fsck finds nothing else wrong.

So, plumb in a config option. We default to "verify and checksum"
unless the user tells us otherwise.

Signed-off-by: Darrick J. Wong
---
 e2fsck/e2fsck.8.in      |   12 ++++++++++++
 e2fsck/e2fsck.conf.5.in |   20 ++++++++++++++++++++
 e2fsck/e2fsck.h         |    1 +
 e2fsck/problem.c        |   25 +++++++++++++++++++++----
 e2fsck/problemP.h       |    1 +
 e2fsck/unix.c           |   11 +++++++++++
 6 files changed, 66 insertions(+), 4 deletions(-)

diff --git a/e2fsck/e2fsck.8.in b/e2fsck/e2fsck.8.in
index f5ed758..43ee063 100644
--- a/e2fsck/e2fsck.8.in
+++ b/e2fsck/e2fsck.8.in
@@ -207,6 +207,18 @@ option may prevent you from further manual data recovery.
 .BI nodiscard
 Do not attempt to discard free blocks and unused inode blocks. This option is
 exactly the opposite of discard option. This is set as default.
+.TP
+.BI strict_csums
+Verify each metadata object's checksum before checking any other fields
+in the metadata object. If the verification fails, offer to clear the item,
+also before checking any of the other fields. This option causes e2fsck to
+favor throwing away broken objects over trying to salvage them.
+.TP
+.BI no_strict_csums
+Perform all regular checks of a metadata object and only verify the checksum if
+no problems were found. This option causes e2fsck to try to salvage slightly
+damaged metadata objects, at the cost of spending processing time on recovering
+data. This is set as the default.
 .RE
 .TP
 .B \-f
diff --git a/e2fsck/e2fsck.conf.5.in b/e2fsck/e2fsck.conf.5.in
index 9ebfbbf..a8219a8 100644
--- a/e2fsck/e2fsck.conf.5.in
+++ b/e2fsck/e2fsck.conf.5.in
@@ -222,6 +222,26 @@ If this boolean relation is true, e2fsck will run as if the option
 .B -v
 is always specified. This will cause e2fsck to print some additional
 information at the end of each full file system check.
+.TP
+.I strict_csums
+If this boolean relation is true, e2fsck will run as if
+.B -E strict_csums
+is set. This causes e2fsck to verify each metadata object's checksum before
+checking any other fields in the metadata object. If the verification
+fails, offer to clear the item, also before checking any of the other fields.
+This option causes e2fsck to favor throwing away broken objects over trying to
+salvage them.
+.IP
+If the boolean relation is false, e2fsck will run as if
+.B -E no_strict_csums
+is set. In this case, e2fsck will perform all regular checks of a metadata
+object and only verify the checksum if no problems were found. This option
+causes e2fsck to try to salvage slightly damaged metadata objects, at the cost
+of spending processing time on recovering data.
+.IP
+The default is for e2fsck to behave as if
+.B -E no_strict_csums
+is set.
 .SH THE [problems] STANZA
 Each tag in the
 .I [problems]
diff --git a/e2fsck/e2fsck.h b/e2fsck/e2fsck.h
index dbd6ea8..d7a7be9 100644
--- a/e2fsck/e2fsck.h
+++ b/e2fsck/e2fsck.h
@@ -167,6 +167,7 @@ struct resource_track {
 #define E2F_OPT_FRAGCHECK	0x0800
 #define E2F_OPT_JOURNAL_ONLY	0x1000 /* only replay the journal */
 #define E2F_OPT_DISCARD		0x2000
+#define E2F_OPT_CSUM_FIRST	0x4000

 /*
  * E2fsck flags
diff --git a/e2fsck/problem.c b/e2fsck/problem.c
index 7f0ad6c..3683dd4 100644
--- a/e2fsck/problem.c
+++ b/e2fsck/problem.c
@@ -970,7 +970,7 @@ static struct e2fsck_problem problem_table[] = {
 	/* inode checksum does not match inode */
 	{ PR_1_INODE_CSUM_INVALID,
 	  N_("@i %i checksum does not match @i. "),
-	  PROMPT_CLEAR, PR_PREEN_OK },
+	  PROMPT_CLEAR, PR_PREEN_OK | PR_INITIAL_CSUM },

 	/* inode passes checks, but checksum does not match inode */
 	{ PR_1_INODE_ONLY_CSUM_INVALID,
@@ -981,7 +981,7 @@ static struct e2fsck_problem problem_table[] = {
 	{ PR_1_EXTENT_CSUM_INVALID,
 	  N_("@i %i extent block checksum does not match extent\n\t(logical @b "
 	     "%c, @n physical @b %b, len %N)\n"),
-	  PROMPT_CLEAR, 0 },
+	  PROMPT_CLEAR, PR_INITIAL_CSUM },

 	/*
 	 * Inode extent block passes checks, but checksum does not match
@@ -996,7 +996,7 @@ static struct e2fsck_problem problem_table[] = {
 	{ PR_1_EA_BLOCK_CSUM_INVALID,
 	  N_("Extended attribute @a @b %b checksum for @i %i does not "
 	     "match. "),
-	  PROMPT_CLEAR, 0 },
+	  PROMPT_CLEAR, PR_INITIAL_CSUM },

 	/*
 	 * Extended attribute block passes checks, but checksum for inode does
@@ -1470,7 +1470,7 @@ static struct e2fsck_problem problem_table[] = {
 	/* leaf node fails checksum */
 	{ PR_2_LEAF_NODE_CSUM_INVALID,
 	  N_("@d @i %i, %B, offset %N: @d fails checksum\n"),
-	  PROMPT_SALVAGE, PR_PREEN_OK },
+	  PROMPT_SALVAGE, PR_PREEN_OK | PR_INITIAL_CSUM },

 	/* leaf node has no checksum */
 	{ PR_2_LEAF_NODE_MISSING_CSUM,
@@ -2030,6 +2030,23 @@ int fix_problem(e2fsck_t ctx, problem_t code, struct problem_context *pctx)
 	}
 	if (ctx->logf && message)
 		print_e2fsck_message(ctx->logf, ctx, message, pctx, 1, 0);
+	/*
+	 * If there is a problem with the initial csum verification and the
+	 * user told e2fsck to verify csums /after/ checking everything else,
+	 * then don't "fix" anything, just warn the user that the csum failed
+	 * and that sanity checks are about to be run.
+	 */
+	if ((ptr->flags & PR_INITIAL_CSUM) &&
+	    !(ctx->options & E2F_OPT_CSUM_FIRST)) {
+		if (*message) {
+			print_e2fsck_message(stdout, ctx,
+				"Running sanity checks.\n", pctx, 1, 0);
+			if (ctx->logf)
+				print_e2fsck_message(ctx->logf, ctx,
+					"Running sanity checks.\n", pctx, 1, 0);
+		}
+		return 0;
+	}
 	if (!(ptr->flags & PR_PREEN_OK) && (ptr->prompt != PROMPT_NONE))
 		preenhalt(ctx);

diff --git a/e2fsck/problemP.h b/e2fsck/problemP.h
index 7944cd6..a983598 100644
--- a/e2fsck/problemP.h
+++ b/e2fsck/problemP.h
@@ -44,3 +44,4 @@ struct latch_descr {
 #define PR_CONFIG	0x080000 /* This problem has been customized from
 				    the config file */
 #define PR_FORCE_NO	0x100000 /* Force the answer to be no */
+#define PR_INITIAL_CSUM	0x200000 /* User can ignore initial csum check */
diff --git a/e2fsck/unix.c b/e2fsck/unix.c
index b39383d..c6cdb49 100644
--- a/e2fsck/unix.c
+++ b/e2fsck/unix.c
@@ -692,6 +692,10 @@ static void parse_extended_opts(e2fsck_t ctx, const char *opts)
 			else
 				ctx->log_fn = string_copy(ctx, arg, 0);
 			continue;
+		} else if (strcmp(token, "strict_csums") == 0) {
+			ctx->options |= E2F_OPT_CSUM_FIRST;
+		} else if (strcmp(token, "no_strict_csums") == 0) {
+			ctx->options &= ~E2F_OPT_CSUM_FIRST;
 		} else {
 			fprintf(stderr, _("Unknown extended option: %s\n"),
 				token);
@@ -710,6 +714,8 @@ static void parse_extended_opts(e2fsck_t ctx, const char *opts)
 		fputs(("\tjournal_only\n"), stderr);
 		fputs(("\tdiscard\n"), stderr);
 		fputs(("\tnodiscard\n"), stderr);
+		fputs(("\tstrict_csums\n"), stderr);
+		fputs(("\tno_strict_csums\n"), stderr);
 		fputc('\n', stderr);
 		exit(1);
 	}
@@ -945,6 +951,11 @@ static errcode_t PRS(int argc, char *argv[], e2fsck_t *ret_ctx)
 	profile_set_syntax_err_cb(syntax_err_report);
 	profile_init(config_fn, &ctx->profile);

+	profile_get_boolean(ctx->profile, "options", "strict_csums", NULL,
+			    0, &c);
+	if (c)
+		ctx->options |= E2F_OPT_CSUM_FIRST;
+
 	profile_get_boolean(ctx->profile, "options", "report_time", 0, 0,
 			    &c);
 	if (c)
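
For anyone reading along without the e2fsck sources handy, here is a rough standalone sketch of the decision this patch introduces. Only the two flag values (E2F_OPT_CSUM_FIRST and PR_INITIAL_CSUM) are taken from the diff. The function names, the verdict enum, and the csum_ok/fields_ok inputs are made up for illustration and are not e2fsck APIs. It models steps (A) through (C) from the thread under those assumptions, not the real pass 1 code.

```c
#include <assert.h>
#include <string.h>

/* Flag values copied from the patch; every other name here is hypothetical. */
#define E2F_OPT_CSUM_FIRST 0x4000   /* set by -E strict_csums */
#define PR_INITIAL_CSUM    0x200000 /* problem is an initial csum check */

/* Hypothetical outcomes of checking one metadata object. */
enum verdict {
	OBJ_OK,            /* nothing to report */
	OBJ_CLEAR_ON_CSUM, /* strict mode: offer to clear before other checks */
	OBJ_REPAIR,        /* the regular checks found something to fix */
	OBJ_FIX_CSUM_ONLY  /* checks passed; only the checksum is wrong */
};

/* Models the two -E tokens added to parse_extended_opts(). */
int apply_csum_token(int options, const char *token)
{
	if (strcmp(token, "strict_csums") == 0)
		return options | E2F_OPT_CSUM_FIRST;
	if (strcmp(token, "no_strict_csums") == 0)
		return options & ~E2F_OPT_CSUM_FIRST;
	return options; /* other tokens are not modelled in this sketch */
}

/* Models the early-return test added to fix_problem(). */
int initial_csum_is_deferred(int problem_flags, int options)
{
	return (problem_flags & PR_INITIAL_CSUM) &&
	       !(options & E2F_OPT_CSUM_FIRST);
}

/* Models steps (A), (B) and (C) for a single object. */
enum verdict check_object(int options, int csum_ok, int fields_ok)
{
	/* (A) initial checksum check: only acted on in strict mode */
	if (!csum_ok && !initial_csum_is_deferred(PR_INITIAL_CSUM, options))
		return OBJ_CLEAR_ON_CSUM;
	/* (B) everything else */
	if (!fields_ok)
		return OBJ_REPAIR;
	/* (C) the checksum is the only thing left that can be wrong */
	return csum_ok ? OBJ_OK : OBJ_FIX_CSUM_ONLY;
}
```

In this model, check_object(0, 0, 1) corresponds to the "Inode 12" transcript earlier in this mail: the initial mismatch is only reported, the sanity checks pass, and the checksum-only fix is offered at the end. With E2F_OPT_CSUM_FIRST set, the same inputs stop at step (A).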