diff mbox series

[1/3] jbd2: protect buffers release with j_checkpoint_mutex

Message ID 20210408113618.1033785-2-yi.zhang@huawei.com
State New
Headers show
Series ext4: fix two issue about bdev_try_to_free_page() | expand

Commit Message

Zhang Yi April 8, 2021, 11:36 a.m. UTC
There is a race between jbd2_journal_try_to_free_buffers() and
jbd2_journal_destroy(), so the jbd2_log_do_checkpoint() may still
missing to detect the buffer write io error flag and lead to filesystem
inconsistency.

jbd2_journal_try_to_free_buffers()     ext4_put_super()
                                        jbd2_journal_destroy()
  __jbd2_journal_remove_checkpoint()
  detect buffer write error              jbd2_log_do_checkpoint()
                                         jbd2_cleanup_journal_tail()
                                           <--- lead to inconsistency
  jbd2_journal_abort()

Fix this issue by add j_checkpoint_mutex to protect journal buffer
release on jbd2_journal_try_to_free_buffers().

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
---
 fs/jbd2/transaction.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Jan Kara April 8, 2021, 1:45 p.m. UTC | #1
On Thu 08-04-21 19:36:16, Zhang Yi wrote:
> There is a race between jbd2_journal_try_to_free_buffers() and
> jbd2_journal_destroy(), so the jbd2_log_do_checkpoint() may still
> missing to detect the buffer write io error flag and lead to filesystem
> inconsistency.
> 
> jbd2_journal_try_to_free_buffers()     ext4_put_super()
>                                         jbd2_journal_destroy()
>   __jbd2_journal_remove_checkpoint()
>   detect buffer write error              jbd2_log_do_checkpoint()
>                                          jbd2_cleanup_journal_tail()
>                                            <--- lead to inconsistency
>   jbd2_journal_abort()
> 
> Fix this issue by add j_checkpoint_mutex to protect journal buffer
> release on jbd2_journal_try_to_free_buffers().
> 
> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>

Thanks for the patch Zhang. I agree with your problem analysis but I don't
think the solution is correct:

>  	J_ASSERT(PageLocked(page));
>  
> +	mutex_lock(&journal->j_checkpoint_mutex);

We cannot grab j_checkpoint_mutex inside jbd2_journal_try_to_free_buffers()
(or even ext4_releasepage()) because that function is called withe a page
lock which ranks below the checkpoint mutex - generally page locks are
acquired within a transaction and thus all locks required to start a
transaction (and j_checkpoint_mutex is one of them) rank above the page
lock.

Also even if the lock ordering was OK, grabbing j_checkpoint_mutex for
every page from memory reclaim just to close this rare race seems like a
performance overkill.

What we seem to need is a quick way of marking the journal as "IO error
occured" in __journal_try_to_free_buffer() before actually removing the
buffer from the checkpoint list. Perhaps this marking could even happen
already in __jbd2_journal_remove_checkpoint() and we can reuse it in
jbd2_log_do_checkpoint() for IO error handling as well... And then once we
are in a safer context, we can do:

	if (!is_journal_aborted(journal) && journal_io_error_happened(journal))
		jbd2_journal_abort(...)

								Honza

>  	head = page_buffers(page);
>  	bh = head;
>  	do {
> @@ -2163,6 +2164,7 @@ int jbd2_journal_try_to_free_buffers(journal_t *journal, struct page *page)
>  	if (has_write_io_error)
>  		jbd2_journal_abort(journal, -EIO);
>  
> +	mutex_unlock(&journal->j_checkpoint_mutex);
>  	return ret;
>  }
>  
> -- 
> 2.25.4
>
diff mbox series

Patch

diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index 9396666b7314..b935b20cbae4 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -2123,6 +2123,7 @@  int jbd2_journal_try_to_free_buffers(journal_t *journal, struct page *page)
 
 	J_ASSERT(PageLocked(page));
 
+	mutex_lock(&journal->j_checkpoint_mutex);
 	head = page_buffers(page);
 	bh = head;
 	do {
@@ -2163,6 +2164,7 @@  int jbd2_journal_try_to_free_buffers(journal_t *journal, struct page *page)
 	if (has_write_io_error)
 		jbd2_journal_abort(journal, -EIO);
 
+	mutex_unlock(&journal->j_checkpoint_mutex);
 	return ret;
 }