Message ID | 20200203140458.37397-3-yi.zhang@huawei.com |
---|---|
State | Superseded |
On Mon 03-02-20 22:04:58, zhangyi (F) wrote:
> Commit 904cdbd41d74 ("jbd2: clear dirty flag when revoking a buffer from
> an older transaction") sets the BH_Freed flag when forgetting a metadata
> buffer which belongs to the committing transaction; the flag tells the
> commit process to clear the dirty bits when it is done with the buffer.
> But the commit process also clears the BH_Mapped flag at the same time,
> which may trigger the NULL pointer oops below when block_size < PAGE_SIZE.
>
>    rmdir 1             kjournald2                 mkdir 2
>                     jbd2_journal_commit_transaction
>                     commit transaction N
> jbd2_journal_forget
> set_buffer_freed(bh1)
>                     jbd2_journal_commit_transaction
>                      commit transaction N+1
>                      ...
>                      clear_buffer_mapped(bh1)
>                                                   ext4_getblk(bh2 unmapped)
>                                                   ...
>                                                   grow_dev_page
>                                                    init_page_buffers
>                                                     bh1->b_private=NULL
>                                                     bh2->b_private=NULL
>                      jbd2_journal_put_journal_head(jh1)
>                       __journal_remove_journal_head(bh1)
>                        jh1 is NULL and triggers oops
>
> *) Dir entry blocks bh1 and bh2 belong to one page, and bh2 has
>    already been unmapped.
>
> For the metadata buffer we are forgetting, clearing the dirty flags is
> enough, so this patch adds a BH_Unmap flag for the journal_unmap_buffer()
> case and keeps the mapped flag for the metadata buffer.
>
> Fixes: 904cdbd41d74 ("jbd2: clear dirty flag when revoking a buffer from an older transaction")
> Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>

Good spotting! Thanks for the patch.
Some comments below:

> diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
> index 6396fe70085b..a649cdd1c5e5 100644
> --- a/fs/jbd2/commit.c
> +++ b/fs/jbd2/commit.c
> @@ -987,10 +987,13 @@ void jbd2_journal_commit_transaction(journal_t *journal)
>  		if (buffer_freed(bh) && !jh->b_next_transaction) {
>  			clear_buffer_freed(bh);
>  			clear_buffer_jbddirty(bh);
> -			clear_buffer_mapped(bh);
> -			clear_buffer_new(bh);
> -			clear_buffer_req(bh);
> -			bh->b_bdev = NULL;
> +			if (buffer_unmap(bh)) {
> +				clear_buffer_unmap(bh);
> +				clear_buffer_mapped(bh);
> +				clear_buffer_new(bh);
> +				clear_buffer_req(bh);
> +				bh->b_bdev = NULL;
> +			}

Any reason why you don't want to clear buffer_req and buffer_new flags for
all buffers as well? I agree that b_bdev setting and buffer_mapped need
special treatment.

Also rather than introducing this new buffer_unmap bit, I'd use the fact
this special treatment is needed only for buffers coming from the block device
mapping. And we can check for that like:

	/*
	 * We can (and need to) unmap buffer only for normal mappings.
	 * Block device buffers need to stay mapped all the time.
	 * We need to be careful about the check because the page
	 * mapping can get cleared under our hands.
	 */
	mapping = READ_ONCE(bh->b_page->mapping);
	if (mapping && !sb_is_blkdev_sb(mapping->host->i_sb)) {
		...
	}

Longer term, we might want to rework how the handling of truncated buffers
works with JBD2. There's lots of duplication between jbd2_journal_forget()
and jbd2_journal_unmap_buffer(), the dirtiness is tracked in jh->b_modified
as well as buffer_jbddirty() and it is further redundant with the journal
list the buffer is currently on. So I suspect it could all be simplified if
we took a fresh look at things.

								Honza
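The check suggested above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct model_bh`, `struct model_mapping`, the `MODEL_*` flags and `model_commit_cleanup()` are hypothetical stand-ins invented for illustration, not kernel APIs, and the model ignores the `!jh->b_next_transaction` condition and all locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the buffer state bits. */
enum model_flag {
	MODEL_MAPPED   = 1u << 0,
	MODEL_NEW      = 1u << 1,
	MODEL_REQ      = 1u << 2,
	MODEL_FREED    = 1u << 3,
	MODEL_JBDDIRTY = 1u << 4,
};

struct model_mapping {
	bool is_blkdev;	/* models sb_is_blkdev_sb(mapping->host->i_sb) */
};

struct model_bh {
	unsigned int state;
	const struct model_mapping *mapping;	/* models bh->b_page->mapping */
	bool has_bdev;				/* models bh->b_bdev != NULL */
};

/* Commit-time cleanup of a freed buffer, following the suggested check. */
void model_commit_cleanup(struct model_bh *bh)
{
	const struct model_mapping *mapping = bh->mapping;

	if (!(bh->state & MODEL_FREED))
		return;

	/* Always done: the buffer is no longer freed or journal-dirty. */
	bh->state &= ~(MODEL_FREED | MODEL_JBDDIRTY);

	/* Only buffers that belong to a normal (file) mapping are unmapped. */
	if (mapping && !mapping->is_blkdev) {
		bh->state &= ~(MODEL_MAPPED | MODEL_NEW | MODEL_REQ);
		bh->has_bdev = false;
	}
}
```

In this model a metadata buffer (block device mapping) leaves the cleanup still mapped, while a journalled data buffer (file mapping) is fully unmapped.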
Thanks for the comments.

On 2020/2/6 19:46, Jan Kara wrote:
> On Mon 03-02-20 22:04:58, zhangyi (F) wrote:
[..]
>> diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
>> index 6396fe70085b..a649cdd1c5e5 100644
>> --- a/fs/jbd2/commit.c
>> +++ b/fs/jbd2/commit.c
>> @@ -987,10 +987,13 @@ void jbd2_journal_commit_transaction(journal_t *journal)
>>  		if (buffer_freed(bh) && !jh->b_next_transaction) {
>>  			clear_buffer_freed(bh);
>>  			clear_buffer_jbddirty(bh);
>> -			clear_buffer_mapped(bh);
>> -			clear_buffer_new(bh);
>> -			clear_buffer_req(bh);
>> -			bh->b_bdev = NULL;
>> +			if (buffer_unmap(bh)) {
>> +				clear_buffer_unmap(bh);
>> +				clear_buffer_mapped(bh);
>> +				clear_buffer_new(bh);
>> +				clear_buffer_req(bh);
>> +				bh->b_bdev = NULL;
>> +			}
>
> Any reason why you don't want to clear buffer_req and buffer_new flags for
> all buffers as well? I agree that b_bdev setting and buffer_mapped need
> special treatment.
>

IIUC, a buffer coming from jbd2_journal_forget() is always a 'block device
backed' metadata buffer (I'm not entirely sure), and for these metadata
buffers the buffer_new flag will not be set. At the same time, since such a
buffer is always mapped, it's fine to keep the buffer_req flag even though
the buffer has now been freed by the filesystem, because it means the block
device has committed this buffer, and it seems that it does not affect our
reuse of this buffer. Am I missing something?

> Also rather than introducing this new buffer_unmap bit, I'd use the fact
> this special treatment is needed only for buffers coming from the block device
> mapping. And we can check for that like:
>
> 	/*
> 	 * We can (and need to) unmap buffer only for normal mappings.
> 	 * Block device buffers need to stay mapped all the time.
> 	 * We need to be careful about the check because the page
> 	 * mapping can get cleared under our hands.
> 	 */
> 	mapping = READ_ONCE(bh->b_page->mapping);
> 	if (mapping && !sb_is_blkdev_sb(mapping->host->i_sb)) {
> 		...
> 	}
>

That looks better; I will use this check in the next iteration.

> Longer term, we might want to rework how the handling of truncated buffers
> works with JBD2. There's lots of duplication between jbd2_journal_forget()
> and jbd2_journal_unmap_buffer(), the dirtiness is tracked in jh->b_modified
> as well as buffer_jbddirty() and it is further redundant with the journal
> list the buffer is currently on. So I suspect it could all be simplified if
> we took a fresh look at things.
>

Indeed, it is tricky and not easy to understand now; refactoring this in
the future would be great.

Thanks,
Yi.
On 2020/2/6 19:46, Jan Kara wrote:
> On Mon 03-02-20 22:04:58, zhangyi (F) wrote:
>> Commit 904cdbd41d74 ("jbd2: clear dirty flag when revoking a buffer from
>> an older transaction") sets the BH_Freed flag when forgetting a metadata
>> buffer which belongs to the committing transaction; the flag tells the
>> commit process to clear the dirty bits when it is done with the buffer.
>> But the commit process also clears the BH_Mapped flag at the same time,
>> which may trigger the NULL pointer oops below when block_size < PAGE_SIZE.
>>
>>    rmdir 1             kjournald2                 mkdir 2
>>                     jbd2_journal_commit_transaction
>>                     commit transaction N
>> jbd2_journal_forget
>> set_buffer_freed(bh1)
>>                     jbd2_journal_commit_transaction
>>                      commit transaction N+1
>>                      ...
>>                      clear_buffer_mapped(bh1)
>>                                                   ext4_getblk(bh2 unmapped)
>>                                                   ...
>>                                                   grow_dev_page
>>                                                    init_page_buffers
>>                                                     bh1->b_private=NULL
>>                                                     bh2->b_private=NULL
>>                      jbd2_journal_put_journal_head(jh1)
>>                       __journal_remove_journal_head(bh1)
>>                        jh1 is NULL and triggers oops
>>
>> *) Dir entry blocks bh1 and bh2 belong to one page, and bh2 has
>>    already been unmapped.
>>
>> For the metadata buffer we are forgetting, clearing the dirty flags is
>> enough, so this patch adds a BH_Unmap flag for the journal_unmap_buffer()
>> case and keeps the mapped flag for the metadata buffer.
>>
>> Fixes: 904cdbd41d74 ("jbd2: clear dirty flag when revoking a buffer from an older transaction")
>> Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
[..]
>
> Also rather than introducing this new buffer_unmap bit, I'd use the fact
> this special treatment is needed only for buffers coming from the block device
> mapping. And we can check for that like:
>
> 	/*
> 	 * We can (and need to) unmap buffer only for normal mappings.
> 	 * Block device buffers need to stay mapped all the time.
> 	 * We need to be careful about the check because the page
> 	 * mapping can get cleared under our hands.
> 	 */
> 	mapping = READ_ONCE(bh->b_page->mapping);
> 	if (mapping && !sb_is_blkdev_sb(mapping->host->i_sb)) {
> 		...
> 	}

Thinking about it again, this check may miss clearing the mapped flag if the
'mapping' of a journalled data page was cleared, and eventually trigger an
exception if we reuse the buffer. So I think it should be:

	if (!(mapping && sb_is_blkdev_sb(mapping->host->i_sb))) {
		...
	}

Thanks,
Yi.
On Thu 06-02-20 23:28:01, zhangyi (F) wrote:
> Thanks for the comments.
>
> On 2020/2/6 19:46, Jan Kara wrote:
> > On Mon 03-02-20 22:04:58, zhangyi (F) wrote:
> [..]
> >> diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
> >> index 6396fe70085b..a649cdd1c5e5 100644
> >> --- a/fs/jbd2/commit.c
> >> +++ b/fs/jbd2/commit.c
> >> @@ -987,10 +987,13 @@ void jbd2_journal_commit_transaction(journal_t *journal)
> >>  		if (buffer_freed(bh) && !jh->b_next_transaction) {
> >>  			clear_buffer_freed(bh);
> >>  			clear_buffer_jbddirty(bh);
> >> -			clear_buffer_mapped(bh);
> >> -			clear_buffer_new(bh);
> >> -			clear_buffer_req(bh);
> >> -			bh->b_bdev = NULL;
> >> +			if (buffer_unmap(bh)) {
> >> +				clear_buffer_unmap(bh);
> >> +				clear_buffer_mapped(bh);
> >> +				clear_buffer_new(bh);
> >> +				clear_buffer_req(bh);
> >> +				bh->b_bdev = NULL;
> >> +			}
> >
> > Any reason why you don't want to clear buffer_req and buffer_new flags for
> > all buffers as well? I agree that b_bdev setting and buffer_mapped need
> > special treatment.
> >
> IIUC, a buffer coming from jbd2_journal_forget() is always a 'block
> device backed' metadata buffer (I'm not entirely sure), and for these metadata

Yes, it is.

> buffers the buffer_new flag will not be set. At the same time, since such a
> buffer is always mapped, it's fine to keep the buffer_req flag even though
> the buffer has now been freed by the filesystem, because it means the block
> device has committed this buffer, and it seems that it does not affect our
> reuse of this buffer. Am I missing something?

OK, you're right that buffer_new should never be set for block device backed
buffers and we don't care about buffer_req. So let's keep the split of bits
to clear as you did and just add a comment that for block device buffers it
is enough to clear buffer_jbddirty and buffer_freed, while for file mapping
buffers (i.e., journalled data) we have to be more careful and clear more
bits.

								Honza
On Tue 11-02-20 14:51:10, zhangyi (F) wrote:
> On 2020/2/6 19:46, Jan Kara wrote:
> > On Mon 03-02-20 22:04:58, zhangyi (F) wrote:
> >> Commit 904cdbd41d74 ("jbd2: clear dirty flag when revoking a buffer from
> >> an older transaction") sets the BH_Freed flag when forgetting a metadata
> >> buffer which belongs to the committing transaction; the flag tells the
> >> commit process to clear the dirty bits when it is done with the buffer.
> >> But the commit process also clears the BH_Mapped flag at the same time,
> >> which may trigger the NULL pointer oops below when block_size < PAGE_SIZE.
> >>
> >>    rmdir 1             kjournald2                 mkdir 2
> >>                     jbd2_journal_commit_transaction
> >>                     commit transaction N
> >> jbd2_journal_forget
> >> set_buffer_freed(bh1)
> >>                     jbd2_journal_commit_transaction
> >>                      commit transaction N+1
> >>                      ...
> >>                      clear_buffer_mapped(bh1)
> >>                                                   ext4_getblk(bh2 unmapped)
> >>                                                   ...
> >>                                                   grow_dev_page
> >>                                                    init_page_buffers
> >>                                                     bh1->b_private=NULL
> >>                                                     bh2->b_private=NULL
> >>                      jbd2_journal_put_journal_head(jh1)
> >>                       __journal_remove_journal_head(bh1)
> >>                        jh1 is NULL and triggers oops
> >>
> >> *) Dir entry blocks bh1 and bh2 belong to one page, and bh2 has
> >>    already been unmapped.
> >>
> >> For the metadata buffer we are forgetting, clearing the dirty flags is
> >> enough, so this patch adds a BH_Unmap flag for the journal_unmap_buffer()
> >> case and keeps the mapped flag for the metadata buffer.
> >>
> >> Fixes: 904cdbd41d74 ("jbd2: clear dirty flag when revoking a buffer from an older transaction")
> >> Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
> [..]
> >
> > Also rather than introducing this new buffer_unmap bit, I'd use the fact
> > this special treatment is needed only for buffers coming from the block device
> > mapping. And we can check for that like:
> >
> > 	/*
> > 	 * We can (and need to) unmap buffer only for normal mappings.
> > 	 * Block device buffers need to stay mapped all the time.
> > 	 * We need to be careful about the check because the page
> > 	 * mapping can get cleared under our hands.
> > 	 */
> > 	mapping = READ_ONCE(bh->b_page->mapping);
> > 	if (mapping && !sb_is_blkdev_sb(mapping->host->i_sb)) {
> > 		...
> > 	}
>
> Thinking about it again, this check may miss clearing the mapped flag if the
> 'mapping' of a journalled data page was cleared, and eventually trigger an
> exception if we reuse the buffer. So I think it should be:
>
> 	if (!(mapping && sb_is_blkdev_sb(mapping->host->i_sb))) {
> 		...
> 	}

Well, if b_page->mapping got cleared, it means the page got fully truncated
and in such case buffers can never be reused - the page and buffers will be
freed once we are done with them. So what you are concerned about cannot
happen. But you're right it is good to explain this in the comment.

								Honza
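The two conditions discussed in this exchange differ in exactly one case, which a small truth-table sketch makes visible. The names `struct model_mapping`, `unmap_if_file_mapping()` and `unmap_unless_blkdev()` are hypothetical and exist only for this illustration; `is_blkdev` stands in for `sb_is_blkdev_sb(mapping->host->i_sb)`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_mapping {
	bool is_blkdev;	/* models sb_is_blkdev_sb(mapping->host->i_sb) */
};

/* Condition suggested in review: unmap only if a non-blkdev mapping is seen. */
bool unmap_if_file_mapping(const struct model_mapping *mapping)
{
	return mapping && !mapping->is_blkdev;
}

/* Alternative raised in the reply: unmap unless a blkdev mapping is seen. */
bool unmap_unless_blkdev(const struct model_mapping *mapping)
{
	return !(mapping && mapping->is_blkdev);
}
```

For a block device mapping both predicates return false, and for a file mapping both return true. They disagree only when the mapping pointer is NULL, which is the fully truncated page case that the reply above argues is harmless because the page and its buffers are freed rather than reused.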
Hi,

On 2020/2/12 16:47, Jan Kara wrote:
> On Tue 11-02-20 14:51:10, zhangyi (F) wrote:
>> On 2020/2/6 19:46, Jan Kara wrote:
>>> On Mon 03-02-20 22:04:58, zhangyi (F) wrote:
>>>> Commit 904cdbd41d74 ("jbd2: clear dirty flag when revoking a buffer from
>>>> an older transaction") sets the BH_Freed flag when forgetting a metadata
>>>> buffer which belongs to the committing transaction; the flag tells the
>>>> commit process to clear the dirty bits when it is done with the buffer.
>>>> But the commit process also clears the BH_Mapped flag at the same time,
>>>> which may trigger the NULL pointer oops below when block_size < PAGE_SIZE.
>>>>
>>>>    rmdir 1             kjournald2                 mkdir 2
>>>>                     jbd2_journal_commit_transaction
>>>>                     commit transaction N
>>>> jbd2_journal_forget
>>>> set_buffer_freed(bh1)
>>>>                     jbd2_journal_commit_transaction
>>>>                      commit transaction N+1
>>>>                      ...
>>>>                      clear_buffer_mapped(bh1)
>>>>                                                   ext4_getblk(bh2 unmapped)
>>>>                                                   ...
>>>>                                                   grow_dev_page
>>>>                                                    init_page_buffers
>>>>                                                     bh1->b_private=NULL
>>>>                                                     bh2->b_private=NULL
>>>>                      jbd2_journal_put_journal_head(jh1)
>>>>                       __journal_remove_journal_head(bh1)
>>>>                        jh1 is NULL and triggers oops
>>>>
>>>> *) Dir entry blocks bh1 and bh2 belong to one page, and bh2 has
>>>>    already been unmapped.
>>>>
>>>> For the metadata buffer we are forgetting, clearing the dirty flags is
>>>> enough, so this patch adds a BH_Unmap flag for the journal_unmap_buffer()
>>>> case and keeps the mapped flag for the metadata buffer.
>>>>
>>>> Fixes: 904cdbd41d74 ("jbd2: clear dirty flag when revoking a buffer from an older transaction")
>>>> Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
>> [..]
>>>
>>> Also rather than introducing this new buffer_unmap bit, I'd use the fact
>>> this special treatment is needed only for buffers coming from the block device
>>> mapping. And we can check for that like:
>>>
>>> 	/*
>>> 	 * We can (and need to) unmap buffer only for normal mappings.
>>> 	 * Block device buffers need to stay mapped all the time.
>>> 	 * We need to be careful about the check because the page
>>> 	 * mapping can get cleared under our hands.
>>> 	 */
>>> 	mapping = READ_ONCE(bh->b_page->mapping);
>>> 	if (mapping && !sb_is_blkdev_sb(mapping->host->i_sb)) {
>>> 		...
>>> 	}
>>
>> Thinking about it again, this check may miss clearing the mapped flag if the
>> 'mapping' of a journalled data page was cleared, and eventually trigger an
>> exception if we reuse the buffer. So I think it should be:
>>
>> 	if (!(mapping && sb_is_blkdev_sb(mapping->host->i_sb))) {
>> 		...
>> 	}
>
> Well, if b_page->mapping got cleared, it means the page got fully truncated
> and in such case buffers can never be reused - the page and buffers will be
> freed once we are done with them. So what you are concerned about cannot
> happen. But you're right it is good to explain this in the comment.
>

Yes, you are right: the page and buffers will be freed in
release_buffer_page(), so it seems no exception can occur. After testing, I
will send V3, which goes back to the condition you suggested and adds
comments.

Thanks,
Yi.
diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
index 6396fe70085b..a649cdd1c5e5 100644
--- a/fs/jbd2/commit.c
+++ b/fs/jbd2/commit.c
@@ -987,10 +987,13 @@ void jbd2_journal_commit_transaction(journal_t *journal)
 		if (buffer_freed(bh) && !jh->b_next_transaction) {
 			clear_buffer_freed(bh);
 			clear_buffer_jbddirty(bh);
-			clear_buffer_mapped(bh);
-			clear_buffer_new(bh);
-			clear_buffer_req(bh);
-			bh->b_bdev = NULL;
+			if (buffer_unmap(bh)) {
+				clear_buffer_unmap(bh);
+				clear_buffer_mapped(bh);
+				clear_buffer_new(bh);
+				clear_buffer_req(bh);
+				bh->b_bdev = NULL;
+			}
 		}
 
 		if (buffer_jbddirty(bh)) {
diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index a479cbf8ae54..717964eec9d3 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -2335,6 +2335,7 @@ static int journal_unmap_buffer(journal_t *journal, struct buffer_head *bh,
 		 * should clear dirty bits when it is done with the buffer.
 		 */
 		set_buffer_freed(bh);
+		set_buffer_unmap(bh);
 		if (journal->j_running_transaction && buffer_jbddirty(bh))
 			jh->b_next_transaction = journal->j_running_transaction;
 		may_free = 0;
diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index f613d8529863..f74906ebc73a 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -310,6 +310,7 @@ enum jbd_state_bits {
 	= BH_PrivateStart,
 	BH_JWrite,		/* Being written to log (@@@ DEBUGGING) */
 	BH_Freed,		/* Has been freed (truncated) */
+	BH_Unmap,		/* Has been freed and need to unmap */
 	BH_Revoked,		/* Has been revoked from the log */
 	BH_RevokeValid,		/* Revoked flag is valid */
 	BH_JBDDirty,		/* Is dirty but journaled */
@@ -328,6 +329,7 @@ TAS_BUFFER_FNS(Revoked, revoked)
 BUFFER_FNS(RevokeValid, revokevalid)
 TAS_BUFFER_FNS(RevokeValid, revokevalid)
 BUFFER_FNS(Freed, freed)
+BUFFER_FNS(Unmap, unmap)
 BUFFER_FNS(Shadow, shadow)
 BUFFER_FNS(Verified, verified)
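The `BUFFER_FNS(Unmap, unmap)` line in the patch generates the `set_buffer_unmap()`, `clear_buffer_unmap()` and `buffer_unmap()` helpers used in the other hunks. A rough userspace approximation of that pattern is sketched below. It is an assumption-laden model: `MODEL_BUFFER_FNS`, `struct model_bh` and the `MODEL_BH_*` bits are hypothetical names, and plain bit operations replace the kernel's atomic `set_bit()`, `clear_bit()` and `test_bit()`.

```c
#include <assert.h>

/* Hypothetical bit numbers; the kernel starts these at BH_PrivateStart. */
enum model_state_bits {
	MODEL_BH_Freed,
	MODEL_BH_Unmap,
};

struct model_bh {
	unsigned long b_state;
};

/* Generates set_/clear_/test helpers for one state bit (non-atomic model). */
#define MODEL_BUFFER_FNS(bit, name)					\
static inline void set_buffer_##name(struct model_bh *bh)		\
{									\
	bh->b_state |= 1UL << MODEL_BH_##bit;				\
}									\
static inline void clear_buffer_##name(struct model_bh *bh)		\
{									\
	bh->b_state &= ~(1UL << MODEL_BH_##bit);			\
}									\
static inline int buffer_##name(const struct model_bh *bh)		\
{									\
	return (bh->b_state >> MODEL_BH_##bit) & 1UL;			\
}

MODEL_BUFFER_FNS(Freed, freed)
MODEL_BUFFER_FNS(Unmap, unmap)
```

Each flag occupies its own bit, so setting or clearing the unmap bit in this model leaves the freed bit untouched.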
Commit 904cdbd41d74 ("jbd2: clear dirty flag when revoking a buffer from
an older transaction") sets the BH_Freed flag when forgetting a metadata
buffer which belongs to the committing transaction; the flag tells the
commit process to clear the dirty bits when it is done with the buffer.
But the commit process also clears the BH_Mapped flag at the same time,
which may trigger the NULL pointer oops below when block_size < PAGE_SIZE.

   rmdir 1             kjournald2                 mkdir 2
                    jbd2_journal_commit_transaction
                    commit transaction N
jbd2_journal_forget
set_buffer_freed(bh1)
                    jbd2_journal_commit_transaction
                     commit transaction N+1
                     ...
                     clear_buffer_mapped(bh1)
                                                  ext4_getblk(bh2 unmapped)
                                                  ...
                                                  grow_dev_page
                                                   init_page_buffers
                                                    bh1->b_private=NULL
                                                    bh2->b_private=NULL
                     jbd2_journal_put_journal_head(jh1)
                      __journal_remove_journal_head(bh1)
                       jh1 is NULL and triggers oops

*) Dir entry blocks bh1 and bh2 belong to one page, and bh2 has
   already been unmapped.

For the metadata buffer we are forgetting, clearing the dirty flags is
enough, so this patch adds a BH_Unmap flag for the journal_unmap_buffer()
case and keeps the mapped flag for the metadata buffer.

Fixes: 904cdbd41d74 ("jbd2: clear dirty flag when revoking a buffer from an older transaction")
Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
---
 fs/jbd2/commit.c      | 11 +++++++----
 fs/jbd2/transaction.c |  1 +
 include/linux/jbd2.h  |  2 ++
 3 files changed, 10 insertions(+), 4 deletions(-)
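The race described in the commit message can be replayed in a simplified, single-threaded userspace model. This is a sketch under assumptions, not kernel code: `struct model_bh`, `struct model_jh`, `model_init_page_buffers()` and `model_old_commit_cleanup()` are hypothetical names, and the model keeps only the one behaviour that matters here, namely that buffers found unmapped on the page are reinitialised and lose their `b_private` pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_jh {
	int refcount;	/* stands in for the journal head still held by jbd2 */
};

struct model_bh {
	bool mapped;
	struct model_jh *b_private;	/* journal head attached to the buffer */
};

/* Models the step that reinitialises every unmapped buffer on a page. */
void model_init_page_buffers(struct model_bh *bhs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!bhs[i].mapped) {
			bhs[i].b_private = NULL;
			bhs[i].mapped = true;
		}
	}
}

/* Models the pre-patch commit cleanup, which unmapped freed metadata. */
void model_old_commit_cleanup(struct model_bh *bh)
{
	bh->mapped = false;
}
```

If bh1 still carries its journal head when the old cleanup unmaps it, a later reinitialisation of the shared page clears `b_private` behind the journal's back. Keeping bh1 mapped, as the patch does for metadata buffers, leaves `b_private` intact in this model.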