diff mbox series

[v2] jbd2: fix descriptor block checksum failed after format with lazy_journal_init=1

Message ID 02cc0284c5cb4c4f9cb5282302807bc2@hikvision.com
State Rejected
Headers show
Series [v2] jbd2: fix descriptor block checksum failed after format with lazy_journal_init=1 | expand

Commit Message

常凤楠 Sept. 8, 2020, 8:51 a.m. UTC
After the last commit, I found that some situations were not considered.
In last post, I only considered the case of descriptor block checksum failed,but there is others situations.
for example:

if you format with lazy_journal_init=1 first time, after mount a short time, you reboot machine, the layout of jbd2 may be like this:

journal Superblock|     [ transactions..... ]      |descriptor_block | data_blocks|  commmit_block | descriptor_block |  data_blocks| commmit_block|[more transactions...
-------------------------|     [some transactions]   |-------------------       transaction x          -----------------|-----------------          transaction x+1         ---------------|

and after reboot, you format with lazy_journal_init=1 second time, after mount a short time, you reboot machine again, the layout of jbd2 may be like this:

1.
journal Superblock|     [ transactions..... ]       |descriptor_block | data_blocks|  commmit_block | descriptor_block |  data_blocks| commmit_block|[more transactions...
-------------------------|     [some transactions]   |---------------          transaction x       ---------------------|
2.
journal Superblock|     [ transactions..... ]       |descriptor_block |  data_blocks | data_blocks  |    data_blocks   | commmit_block| commmit_block|[more transactions...
-------------------------|     [some transactions]   |-----------------------------------               transaction x                  --------------------------------|
3.
journal Superblock|     [ transactions..... ]      |descriptor_block |  data_blocks | data_blocks  |    commmit_block | data_blocks | commmit_block|[more transactions...
-------------------------|    [some transactions]   |--------------------------------     transaction x           ----------------------------|

In case one and two,will be recovery failed. So there is another patch:

 fs/jbd2/recovery.c | 60 +++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 55 insertions(+), 5 deletions(-)

Comments

Andreas Dilger Sept. 8, 2020, 10:48 p.m. UTC | #1
On Sep 8, 2020, at 2:51 AM, 常凤楠 <changfengnan@hikvision.com> wrote:
> 
> After the last commit, I found that some situations were not considered.
> In last post, I only considered the case of descriptor block checksum failed,but there is others situations.

Thank you for the updated patch.  There are several things that need to be
fixed before the patch can be accepted:
- the whitespace of the patch is broken, since it looks like all tabs are lost
- the patch summary should be shorter than 64 characters, maybe like:
  "jbd2: avoid transaction reuse after reformatting" or similar, and then
  put a more complete description in the commit message
- the patch commit message should be shorter than 70 characters.  It would be
  much easier to read if you aligned the transaction blocks vertically, down
  the page, instead having such very long lines, like:

     case 1 journal        case 2 journal        case 3 journal
  ====================  ====================  ===================
  [journal superblock]  [journal superblock]
  [ transactions
      ....
        ....         ]
  [ descriptor block |
  |   data blocks
      ....
        ....         |
  |   commit block   ]
  [ more transactions
       ....
         ....        ]

- the commit message should describe only the patch itself, not the previous
  version of the patch or general comments.  Other comments can be written
  in a separate email that is a reply to the patch itself.
- please run your patch through "checkpatch.pl" to find any style issues
- you need to add a "Signed-off-by: 常凤楠 <changfengnan@hikvision.com>" line
  or maybe "Signed-off-by: 常凤楠 (Chang Fengnan) <changfengnan@hikvision.com>"

Some general comments on the patch inline.

> for example:
> 
> if you format with lazy_journal_init=1 first time, after mount a short time, you reboot machine, the layout of jbd2 may be like this:
> 
> journal Superblock|     [ transactions..... ]      |descriptor_block | data_blocks|  commmit_block | descriptor_block |  data_blocks| commmit_block|[more transactions...
> -------------------------|     [some transactions]   |-------------------       transaction x          -----------------|-----------------          transaction x+1         ---------------|
> 
> and after reboot, you format with lazy_journal_init=1 second time, after mount a short time, you reboot machine again, the layout of jbd2 may be like this:
> 
> 1.
> journal Superblock|     [ transactions..... ]       |descriptor_block | data_blocks|  commmit_block | descriptor_block |  data_blocks| commmit_block|[more transactions...
> -------------------------|     [some transactions]   |---------------          transaction x       ---------------------|
> 2.
> journal Superblock|     [ transactions..... ]       |descriptor_block |  data_blocks | data_blocks  |    data_blocks   | commmit_block| commmit_block|[more transactions...
> -------------------------|     [some transactions]   |-----------------------------------               transaction x                  --------------------------------|
> 3.
> journal Superblock|     [ transactions..... ]      |descriptor_block |  data_blocks | data_blocks  |    commmit_block | data_blocks | commmit_block|[more transactions...
> -------------------------|    [some transactions]   |--------------------------------     transaction x           ----------------------------|
> 
> In case one and two,will be recovery failed. So there is another patch:
> 
> fs/jbd2/recovery.c | 60 +++++++++++++++++++++++++++++++++++++++++++++++++-----
> 1 file changed, 55 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/jbd2/recovery.c b/fs/jbd2/recovery.c
> index a4967b27ffb6..56198c2c1a04 100644
> --- a/fs/jbd2/recovery.c
> +++ b/fs/jbd2/recovery.c
> @@ -33,6 +33,8 @@ struct recovery_info
> intnr_replays;
> intnr_revokes;
> intnr_revoke_hits;
> +int transaction_flag;

(minor) This should be an unsigned int, and there should be a #define'd
constant to describe what "0x1" means instead of using "0x1" in the code.
However, it may be better to make this "ri_commit_block", see below.

> +__be64  last_trans_commit_time;
> };
> 
> enum passtype {PASS_SCAN, PASS_REVOKE, PASS_REPLAY};
> @@ -412,7 +414,27 @@ static int jbd2_block_tag_csum_verify(journal_t *j, journal_block_tag_t *tag,
> else
> return tag->t_checksum == cpu_to_be16(csum32);
> }
> +/*
> + * We check the commit time and compare it with the commit time of the previous transaction,

Please wrap lines at 80 columns.

> + * if it's smaller than previous, We think it's not belong to same journal.
> + */
> +static int is_same_journal(journal_t *journal,struct buffer_head *bh, unsigned long blocknr, __u64 last_commit_sec)

(minor) return type should be "bool"
(style) Wrap at 80 columns.

> +{
> +int commit_block_nr = blocknr + count_tags(journal, bh) + 1;

(defect) "commit_block_nr" should not be a signed int, or it may overflow
(style) it would be good to use a consistent naming for the block
number and the buffer head, like:

	unsigned long commit_block;
	struct buffer_head *commit_bh;

> +struct buffer_head *nbh;
> 
> +int err = jread(&nbh, journal, commit_block_nr);

(style) blank line after variable declarations

> +if (err)
> +return 1;

(defect?) Should the checksum of commit_bh be verified before use?
Otherwise, this may be a random block in the journal that has some
value > last_commit_sec, and not a commit block at all.  However,
it may be enough to avoid this extra commit block checksum by storing
the commit block number as ri_commit_block, and then checking it
is correct when the commit block is later processed.

> +
> +struct commit_header *cbh = (struct commit_header *)nbh->b_data;
> +__u64commit_sec = be64_to_cpu(cbh->h_commit_sec);

(style) no variable declarations in the middle of functions

> +
> +if(commit_sec < last_commit_sec)
> +return 0;
> +else
> +return 1;

(style) no need for "else" after "return".  That said, this function
could just return the boolean comparison value directly:

	return commit_sec >= last_commit_sec;

> +}
> static int do_one_pass(journal_t *journal,
> struct recovery_info *info, enum passtype pass)
> {
> @@ -514,18 +536,31 @@ static int do_one_pass(journal_t *journal,
> switch(blocktype) {
> case JBD2_DESCRIPTOR_BLOCK:
> /* Verify checksum first */
> +if(pass == PASS_SCAN) {
> +info->transaction_flag = 0x1;
> +}

(style) no need for {} around single-line block
(style) could this instead store the block number of the logical commit
block number that is referenced and verified by this descriptor block?
That would be better than just storing the "0x1" flag, to verify that
the proper checksum/timestamp was used by the descriptor block?

I had to re-indent the below code to understand the logic, but I think
I did it correctly.

> 		if (jbd2_journal_has_csum_v2or3(journal))
> 			descr_csum_size =
> 				sizeof(struct jbd2_journal_block_tail);
> 		if (descr_csum_size > 0 &&
> 		    !jbd2_descriptor_block_csum_verify(journal,
> 						       bh->b_data)) {
> -			printk(KERN_ERR "JBD2: Invalid checksum "
> +			if(is_same_journal(journal,bh,next_log_block-1,info->last_trans_commit_time)) {

(style) wrap lines at 80 columns.

> +				printk(KERN_ERR "JBD2: Invalid checksum "
>  				       "recovering block %lu in log\n",
>  				       next_log_block);

(style) console error strings should *not* be wrapped at 80 columns, so it
can be found more easily in the code (e.g. grep for "checksum recovering"
should be able to find this line of code).  This code was written before
the "do not wrap error strings" rule was added, but should be updated to
follow the new code style if modified.

> -			err = -EFSBADCRC;
> -			brelse(bh);
> -			goto failed;
> +				err = -EFSBADCRC;
> +				brelse(bh);
> +				goto failed;

> +			} else {
> +				/*if it's not belong to same journal, just end this recovery, return with success*/

(style) wrap at 80 columns

> +				printk(KERN_ERR "JBD2: Invalid checksum "
> +				       "found in block %lu in log, but not same journal %d\n",
> +				       next_log_block,next_commit_ID);

(style) do not wrap error strings
(minor) this shouldn't really be printed as an "error" to the user, since
it isn't an "error" at all, only a bad coincidence or a short transaction?
This should probably only be a jbd_debug() message or similar?

> +err = 0;
> +brelse(bh);
> +goto done;
> +}
> }
> 
> /* If it is a valid descriptor block, replay it
> @@ -688,6 +723,17 @@ static int do_one_pass(journal_t *journal,
>  * are present verify them in PASS_SCAN; else not
>  * much to do other than move on to the next sequence
>  * number. */
> +if(pass == PASS_SCAN) {
> +struct commit_header *cbh =
> +(struct commit_header *)bh->b_data;
> +if(info->transaction_flag != 0x1) {

(style) This should check "info->ri_commit_block" to verify that this block
is the correct commit block, which is safer than just checking "flag != 0x1".

> +jbd_debug(1, "invalid commit block found in %lu, stop here.\n",next_log_block);
> +brelse(bh);
> +goto done;
> +}
> +info->transaction_flag = 0x0;
> +info->last_trans_commit_time = be64_to_cpu(cbh->h_commit_sec);
> +}
> if (pass == PASS_SCAN &&
>     jbd2_has_feature_checksum(journal)) {
> int chksum_err, chksum_seen;
> @@ -761,7 +807,11 @@ static int do_one_pass(journal_t *journal,
> brelse(bh);
> continue;
> }
> -
> +if (pass != PASS_SCAN && info->transaction_flag != 0x1) {
> +jbd_debug(1, "invalid revoke block found in %lu, stop here.\n",next_log_block);
> +brelse(bh);
> +goto done;
> +}
> err = scan_revoke_records(journal, bh,
>   next_commit_ID, info);
> brelse(bh);


Cheers, Andreas
diff mbox series

Patch

diff --git a/fs/jbd2/recovery.c b/fs/jbd2/recovery.c
index a4967b27ffb6..56198c2c1a04 100644
--- a/fs/jbd2/recovery.c
+++ b/fs/jbd2/recovery.c
@@ -33,6 +33,8 @@  struct recovery_info
 intnr_replays;
 intnr_revokes;
 intnr_revoke_hits;
+int transaction_flag;
+__be64  last_trans_commit_time;
 };

 enum passtype {PASS_SCAN, PASS_REVOKE, PASS_REPLAY};
@@ -412,7 +414,27 @@  static int jbd2_block_tag_csum_verify(journal_t *j, journal_block_tag_t *tag,
 else
 return tag->t_checksum == cpu_to_be16(csum32);
 }
+/*
+ * We check the commit time and compare it with the commit time of the previous transaction,
+ * if it's smaller than previous, We think it's not belong to same journal.
+ */
+static int is_same_journal(journal_t *journal,struct buffer_head *bh, unsigned long blocknr, __u64 last_commit_sec)
+{
+int commit_block_nr = blocknr + count_tags(journal, bh) + 1;
+struct buffer_head *nbh;

+int err = jread(&nbh, journal, commit_block_nr);
+if (err)
+return 1;
+
+struct commit_header *cbh = (struct commit_header *)nbh->b_data;
+__u64commit_sec = be64_to_cpu(cbh->h_commit_sec);
+
+if(commit_sec < last_commit_sec)
+return 0;
+else
+return 1;
+}
 static int do_one_pass(journal_t *journal,
 struct recovery_info *info, enum passtype pass)
 {
@@ -514,18 +536,31 @@  static int do_one_pass(journal_t *journal,
 switch(blocktype) {
 case JBD2_DESCRIPTOR_BLOCK:
 /* Verify checksum first */
+if(pass == PASS_SCAN) {
+info->transaction_flag = 0x1;
+}
 if (jbd2_journal_has_csum_v2or3(journal))
 descr_csum_size =
 sizeof(struct jbd2_journal_block_tail);
 if (descr_csum_size > 0 &&
     !jbd2_descriptor_block_csum_verify(journal,
        bh->b_data)) {
-printk(KERN_ERR "JBD2: Invalid checksum "
+if(is_same_journal(journal,bh,next_log_block-1,info->last_trans_commit_time)) {
+printk(KERN_ERR "JBD2: Invalid checksum "
        "recovering block %lu in log\n",
        next_log_block);
-err = -EFSBADCRC;
-brelse(bh);
-goto failed;
+err = -EFSBADCRC;
+brelse(bh);
+goto failed;
+} else {
+/*if it's not belong to same journal, just end this recovery, return with success*/
+printk(KERN_ERR "JBD2: Invalid checksum "
+       "found in block %lu in log, but not same journal %d\n",
+       next_log_block,next_commit_ID);
+err = 0;
+brelse(bh);
+goto done;
+}
 }

 /* If it is a valid descriptor block, replay it
@@ -688,6 +723,17 @@  static int do_one_pass(journal_t *journal,
  * are present verify them in PASS_SCAN; else not
  * much to do other than move on to the next sequence
  * number. */
+if(pass == PASS_SCAN) {
+struct commit_header *cbh =
+(struct commit_header *)bh->b_data;
+if(info->transaction_flag != 0x1) {
+jbd_debug(1, "invalid commit block found in %lu, stop here.\n",next_log_block);
+brelse(bh);
+goto done;
+}
+info->transaction_flag = 0x0;
+info->last_trans_commit_time = be64_to_cpu(cbh->h_commit_sec);
+}
 if (pass == PASS_SCAN &&
     jbd2_has_feature_checksum(journal)) {
 int chksum_err, chksum_seen;
@@ -761,7 +807,11 @@  static int do_one_pass(journal_t *journal,
 brelse(bh);
 continue;
 }
-
+if (pass != PASS_SCAN && info->transaction_flag != 0x1) {
+jbd_debug(1, "invalid revoke block found in %lu, stop here.\n",next_log_block);
+brelse(bh);
+goto done;
+}
 err = scan_revoke_records(journal, bh,
   next_commit_ID, info);
 brelse(bh);