
[RFC] ext4: don't clear orphan list on ro mount with errors

Message ID 503BCA24.7050100@redhat.com
State Accepted, archived

Commit Message

Eric Sandeen Aug. 27, 2012, 7:27 p.m. UTC
When we have a filesystem with an orphan inode list *and* in error
state, things behave differently if:

1) e2fsck -p is done prior to mount: e2fsck fixes things and exits
   happily (barring other significant problems)

vs.

2) mount is done first, then e2fsck -p: due to the orphan inode
   list removal, more errors are found and e2fsck exits with
   UNEXPECTED INCONSISTENCY.

The 2nd case above, on the root filesystem, has the tendency to halt
the boot process, which is unfortunate.

The situation can be improved by not clearing the orphan
inode list when the fs is mounted readonly.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
---


--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Andreas Dilger Aug. 27, 2012, 11:31 p.m. UTC | #1
On 2012-08-27, at 1:27 PM, Eric Sandeen wrote:
> When we have a filesystem with an orphan inode list *and* in error
> state, things behave differently if:
> 
> 1) e2fsck -p is done prior to mount: e2fsck fixes things and exits
>   happily (barring other significant problems)
> 
> vs.
> 
> 2) mount is done first, then e2fsck -p: due to the orphan inode
>   list removal, more errors are found and e2fsck exits with
>   UNEXPECTED INCONSISTENCY.
> 
> The 2nd case above, on the root filesystem, has the tendency to halt
> the boot process, which is unfortunate.

I think the reasoning is that if the filesystem is corrupted, then
processing the orphan list may introduce further corruption.  If one
has to run a full e2fsck run anyway, then there is no benefit to be
had from processing the orphan list in advance, and a potential
downside (e.g. corrupt inode in the list pointing to some valid inode
and causing it to be deleted).

That said, it depends on how robust the orphan handling code is -
if it won't get confused and delete an in-use inode (i.e. dtime == 0)
then it probably is OK.  I wouldn't trust the inode bitmaps to determine
if the inode is in use or not, only whether it is referenced by some
directory.

That said, no value in trying to clear the orphan list on a read-only fs,
so I think your patch is OK.

Acked-by: Andreas Dilger <adilger@dilger.ca>

> The situation can be improved by not clearing the orphan
> inode list when the fs is mounted readonly.
> 
> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
> ---
> 
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 2d51cd9..2e1ea01 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -2165,10 +2165,12 @@ static void ext4_orphan_cleanup(struct super_block *sb,
> 	}
> 
> 	if (EXT4_SB(sb)->s_mount_state & EXT4_ERROR_FS) {
> -		if (es->s_last_orphan)
> +		/* don't clear list on RO mount w/ errors */
> +		if (es->s_last_orphan && !(s_flags & MS_RDONLY)) {
> 			jbd_debug(1, "Errors on filesystem, "
> 				  "clearing orphan list.\n");
> -		es->s_last_orphan = 0;
> +			es->s_last_orphan = 0;
> +		}
> 		jbd_debug(1, "Skipping orphan recovery on fs with errors.\n");
> 		return;
> 	}
> 


Cheers, Andreas





Eric Sandeen Aug. 27, 2012, 11:35 p.m. UTC | #2
On 8/27/12 6:31 PM, Andreas Dilger wrote:
> On 2012-08-27, at 1:27 PM, Eric Sandeen wrote:
>> When we have a filesystem with an orphan inode list *and* in error
>> state, things behave differently if:
>>
>> 1) e2fsck -p is done prior to mount: e2fsck fixes things and exits
>>   happily (barring other significant problems)
>>
>> vs.
>>
>> 2) mount is done first, then e2fsck -p: due to the orphan inode
>>   list removal, more errors are found and e2fsck exits with
>>   UNEXPECTED INCONSISTENCY.
>>
>> The 2nd case above, on the root filesystem, has the tendency to halt
>> the boot process, which is unfortunate.
> 
> I think the reasoning is that if the filesystem is corrupted, then
> processing the orphan list may introduce further corruption.  If one
> has to run a full e2fsck run anyway, then there is no benefit to be
> had from processing the orphan list in advance, and a potential
> downside (e.g. corrupt inode in the list pointing to some valid inode
> and causing it to be deleted).
> 
> That said, it depends on how robust the orphan handling code is -
> if it won't get confused and delete an in-use inode (i.e. dtime == 0)
> then it probably is OK.  I wouldn't trust the inode bitmaps to determine
> if the inode is in use or not, only whether it is referenced by some
> directory.

What's interesting, though, is that e2fsck trusts the orphan inode list
even in the ERROR_FS case.  Seems inconsistent with the kernel, I guess,
although e2fsck will only be processing it, not adding to it... *shrug*

> That said, no value in trying to clear the orphan list on a read-only fs,
> so I think your patch is OK.
> 
> Acked-by: Andreas Dilger <adilger@dilger.ca>

Thanks,
-Eric
 

>> The situation can be improved by not clearing the orphan
>> inode list when the fs is mounted readonly.
>>
>> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
>> ---
>>
>> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
>> index 2d51cd9..2e1ea01 100644
>> --- a/fs/ext4/super.c
>> +++ b/fs/ext4/super.c
>> @@ -2165,10 +2165,12 @@ static void ext4_orphan_cleanup(struct super_block *sb,
>> 	}
>>
>> 	if (EXT4_SB(sb)->s_mount_state & EXT4_ERROR_FS) {
>> -		if (es->s_last_orphan)
>> +		/* don't clear list on RO mount w/ errors */
>> +		if (es->s_last_orphan && !(s_flags & MS_RDONLY)) {
>> 			jbd_debug(1, "Errors on filesystem, "
>> 				  "clearing orphan list.\n");
>> -		es->s_last_orphan = 0;
>> +			es->s_last_orphan = 0;
>> +		}
>> 		jbd_debug(1, "Skipping orphan recovery on fs with errors.\n");
>> 		return;
>> 	}
>>
> 
> 
> Cheers, Andreas
> 
> 
> 
> 
> 

Theodore Ts'o Sept. 27, 2012, 3:32 a.m. UTC | #3
On Mon, Aug 27, 2012 at 02:27:32PM -0500, Eric Sandeen wrote:
> When we have a filesystem with an orphan inode list *and* in error
> state, things behave differently if:
> 
> 1) e2fsck -p is done prior to mount: e2fsck fixes things and exits
>    happily (barring other significant problems)
> 
> vs.
> 
> 2) mount is done first, then e2fsck -p: due to the orphan inode
>    list removal, more errors are found and e2fsck exits with
>    UNEXPECTED INCONSISTENCY.
> 
> The 2nd case above, on the root filesystem, has the tendency to halt
> the boot process, which is unfortunate.
> 
> The situation can be improved by not clearing the orphan
> inode list when the fs is mounted readonly.
> 
> Signed-off-by: Eric Sandeen <sandeen@redhat.com>

I've applied this commit since I agree with Jan's observation that if
the file system is mounted read-only, we should try to minimize
changes to it if it contains errors.  I have modified the commit
description though:

ext4: don't clear orphan list on ro mount with errors

From: Eric Sandeen <sandeen@redhat.com>

If the file system contains errors and it is being mounted read-only,
don't clear the orphan list.  We should minimize changes to the file
system if it is mounted read-only.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

						- Ted
Eric Sandeen Sept. 27, 2012, 4:32 a.m. UTC | #4
On 9/26/12 10:32 PM, Theodore Ts'o wrote:
> On Mon, Aug 27, 2012 at 02:27:32PM -0500, Eric Sandeen wrote:
>> When we have a filesystem with an orphan inode list *and* in error
>> state, things behave differently if:
>>
>> 1) e2fsck -p is done prior to mount: e2fsck fixes things and exits
>>    happily (barring other significant problems)
>>
>> vs.
>>
>> 2) mount is done first, then e2fsck -p: due to the orphan inode
>>    list removal, more errors are found and e2fsck exits with
>>    UNEXPECTED INCONSISTENCY.
>>
>> The 2nd case above, on the root filesystem, has the tendency to halt
>> the boot process, which is unfortunate.
>>
>> The situation can be improved by not clearing the orphan
>> inode list when the fs is mounted readonly.
>>
>> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
> 
> I've applied this commit since I agree with Jan's observation that if
> the file system is mounted read-only, we should try to minimize
> changes to it if it contains errors.  I have modified the commit
> description though:

Fair enough, thanks.

-Eric

> ext4: don't clear orphan list on ro mount with errors
> 
> From: Eric Sandeen <sandeen@redhat.com>
> 
> If the file system contains errors and it is being mounted read-only,
> don't clear the orphan list.  We should minimize changes to the file
> system if it is mounted read-only.
> 
> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
> 
> 						- Ted
> 


Patch

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 2d51cd9..2e1ea01 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -2165,10 +2165,12 @@  static void ext4_orphan_cleanup(struct super_block *sb,
 	}
 
 	if (EXT4_SB(sb)->s_mount_state & EXT4_ERROR_FS) {
-		if (es->s_last_orphan)
+		/* don't clear list on RO mount w/ errors */
+		if (es->s_last_orphan && !(s_flags & MS_RDONLY)) {
 			jbd_debug(1, "Errors on filesystem, "
 				  "clearing orphan list.\n");
-		es->s_last_orphan = 0;
+			es->s_last_orphan = 0;
+		}
 		jbd_debug(1, "Skipping orphan recovery on fs with errors.\n");
 		return;
 	}
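
For readers who want to see the effect of the hunk in isolation, here is a
minimal user-space sketch of the decision it changes. It is not kernel code:
struct model_sb, model_orphan_cleanup() and the MODEL_* constants are
hypothetical stand-ins invented for this illustration (the constant values
are chosen to mirror EXT4_ERROR_FS and MS_RDONLY, but treat that as an
assumption). The debug messages and the rest of ext4_orphan_cleanup() are
left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for EXT4_ERROR_FS and MS_RDONLY (assumed values). */
#define MODEL_ERROR_FS 0x0002u
#define MODEL_RDONLY   0x0001u

/* Hypothetical, reduced view of the superblock fields the hunk touches. */
struct model_sb {
	uint16_t mount_state;  /* models EXT4_SB(sb)->s_mount_state */
	uint32_t last_orphan;  /* models es->s_last_orphan */
};

/*
 * Models only the error-state branch after the patch.
 * Returns true when orphan recovery is skipped because the fs has errors.
 * The orphan list head is cleared only when the mount is read-write.
 */
static bool model_orphan_cleanup(struct model_sb *sb, unsigned long s_flags)
{
	assert(sb != NULL);

	if (sb->mount_state & MODEL_ERROR_FS) {
		/* don't clear list on RO mount w/ errors */
		if (sb->last_orphan && !(s_flags & MODEL_RDONLY))
			sb->last_orphan = 0;
		return true;
	}
	return false;
}
```

With this model, a read-only mount of a filesystem in error state leaves
last_orphan untouched, so a later e2fsck -p still sees the orphan list,
while a read-write mount clears it as before.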