[-next] jbd2: discard last transaction when commit block checksum broken in v2v3

Message ID 20210929035528.1990993-1-yebin10@huawei.com
State New

Commit Message

yebin (H) Sept. 29, 2021, 3:55 a.m. UTC
We hit an issue where the commit block has a broken checksum after a cold
reboot of the device, which causes the mount to fail.
The likely reason is that only some sectors of the block had been stored on
disk when the device powered off, while the checksum is calculated over the
whole logical block. The disk only guarantees atomicity at the sector level.
The previous transactions have already been replayed at that point, so we can
just discard the last transaction, since the descriptor, revocation, commit
and super blocks now each have their own checksum.

Fixes: 80b3767fbe15 ("jbd2: don't wipe the journal on a failed journal checksum")
Signed-off-by: Ye Bin <yebin10@huawei.com>
---
 fs/jbd2/journal.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
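
The failure mode described in the commit message can be illustrated with a
small user-space sketch. It is not jbd2 code: the 4096-byte block and 512-byte
sector sizes, the helper names torn_write() and checksum_survives(), and the
unseeded bitwise CRC32C are all assumptions chosen for illustration (jbd2's
v2/v3 checksums are crc32c-based, but seeded and computed by kernel helpers
that are not reproduced here). The sketch shows that when a power cut persists
only the leading sectors of a block, a checksum computed over the whole block
no longer matches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE  4096u	/* assumed logical (journal) block size */
#define SECTOR_SIZE 512u	/* assumed atomic write unit of the device */

/* Plain bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78. */
static uint32_t crc32c(const uint8_t *buf, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;

	for (size_t i = 0; i < len; i++) {
		crc ^= buf[i];
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return ~crc;
}

/*
 * Simulate a power cut in the middle of a block write: only the first
 * sectors_persisted sectors of new_block reach the medium, the remaining
 * sectors of on_disk keep whatever stale data they held before.
 */
static void torn_write(uint8_t *on_disk, const uint8_t *new_block,
		       unsigned int sectors_persisted)
{
	assert(sectors_persisted <= BLOCK_SIZE / SECTOR_SIZE);
	memcpy(on_disk, new_block, (size_t)sectors_persisted * SECTOR_SIZE);
}

/*
 * Return 1 if the whole-block checksum of what ended up on disk still
 * matches the checksum of the block we intended to write, 0 otherwise.
 * stale is the previous on-disk content of the block and is not modified.
 */
static int checksum_survives(const uint8_t *stale, const uint8_t *new_block,
			     unsigned int sectors_persisted)
{
	uint8_t on_disk[BLOCK_SIZE];

	memcpy(on_disk, stale, BLOCK_SIZE);
	torn_write(on_disk, new_block, sectors_persisted);
	return crc32c(on_disk, BLOCK_SIZE) == crc32c(new_block, BLOCK_SIZE);
}
```

For example, with a stale copy that differs from the new block in a single bit
of the last sector, checksum_survives(stale, new_block, 8) returns 1, while
checksum_survives(stale, new_block, 7) returns 0: a CRC with a 32-bit
polynomial detects any error burst of 32 bits or fewer, so even one stale bit
in an unwritten sector breaks the whole-block checksum.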

Comments

Jan Kara Nov. 22, 2021, 6:22 p.m. UTC | #1
On Wed 29-09-21 11:55:28, Ye Bin wrote:
> We hit an issue where the commit block has a broken checksum after a cold
> reboot of the device, which causes the mount to fail.
> The likely reason is that only some sectors of the block had been stored on
> disk when the device powered off, while the checksum is calculated over the
> whole logical block. The disk only guarantees atomicity at the sector level.
> The previous transactions have already been replayed at that point, so we can
> just discard the last transaction, since the descriptor, revocation, commit
> and super blocks now each have their own checksum.
> 
> Fixes: 80b3767fbe15 ("jbd2: don't wipe the journal on a failed journal checksum")
> Signed-off-by: Ye Bin <yebin10@huawei.com>

Thanks for the patch. It seems to have fallen through the cracks. Sorry for
that.

> ---
>  fs/jbd2/journal.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
> index 35302bc192eb..a3dd7b757b3d 100644
> --- a/fs/jbd2/journal.c
> +++ b/fs/jbd2/journal.c
> @@ -2080,7 +2080,7 @@ int jbd2_journal_load(journal_t *journal)
>  	if (jbd2_journal_recover(journal))
>  		goto recovery_error;
>  
> -	if (journal->j_failed_commit) {
> +	if (journal->j_failed_commit && !jbd2_journal_has_csum_v2or3(journal)) {

I guess this decision is somewhat questionable. If the failed commit was
indeed the last one, I guess losing the last transaction as you suggest is
a sensible thing to do. However, if the checksum failed somewhere in the
middle of the journal because of a bitflip or something like that, we
probably don't want to lose that many transactions and would rather do
fsck and try to recover as much data as possible... What do others think?

								Honza

>  		printk(KERN_ERR "JBD2: journal transaction %u on %s "
>  		       "is corrupt.\n", journal->j_failed_commit,
>  		       journal->j_devname);
> -- 
> 2.31.1
>
Patch

diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index 35302bc192eb..a3dd7b757b3d 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -2080,7 +2080,7 @@  int jbd2_journal_load(journal_t *journal)
 	if (jbd2_journal_recover(journal))
 		goto recovery_error;
 
-	if (journal->j_failed_commit) {
+	if (journal->j_failed_commit && !jbd2_journal_has_csum_v2or3(journal)) {
 		printk(KERN_ERR "JBD2: journal transaction %u on %s "
 		       "is corrupt.\n", journal->j_failed_commit,
 		       journal->j_devname);
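
The distinction raised in the review (a checksum failure in the last commit
versus one somewhere in the middle of the journal) can be sketched as a small
decision helper. This is a hypothetical user-space illustration, not part of
the patch and not existing jbd2 code: the names demo_tid_t,
classify_failed_commit() and last_tid_in_journal are made up, and how recovery
would learn the ID of the last transaction in the journal is left open. The
only behaviour taken from the patch context is that a failed-commit value of 0
means no commit failed, mirroring the "if (journal->j_failed_commit)" test.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's transaction ID type. */
typedef uint32_t demo_tid_t;

enum recovery_action {
	RECOVERY_OK,		/* no commit block failed its checksum */
	RECOVERY_DISCARD_LAST,	/* only the final transaction is bad: drop it */
	RECOVERY_NEEDS_FSCK,	/* corruption mid-journal: report it, run fsck */
};

/*
 * Classify a recovery result. failed_commit is the ID of the transaction
 * whose commit block failed its checksum (0 if none did), and
 * last_tid_in_journal is the ID of the newest transaction found in the
 * journal (an assumed input for this sketch).
 */
static enum recovery_action
classify_failed_commit(demo_tid_t failed_commit, demo_tid_t last_tid_in_journal)
{
	if (failed_commit == 0)
		return RECOVERY_OK;
	if (failed_commit == last_tid_in_journal)
		return RECOVERY_DISCARD_LAST;
	return RECOVERY_NEEDS_FSCK;
}
```

With these assumptions, classify_failed_commit(42, 42) selects the
discard-last-transaction path that the patch takes unconditionally for v2/v3
checksums, while classify_failed_commit(40, 42) keeps the existing
report-the-corruption behaviour that the review argues for.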