Message ID | 20240624170127.3253-3-jack@suse.cz |
---|---|
State | Awaiting Upstream |
Headers | show |
Series | jbd2: Avoid infinite transaction commit loop | expand |
On 2024/6/25 1:01, Jan Kara wrote: > Commit 9f356e5a4f12 ("jbd2: Account descriptor blocks into > t_outstanding_credits") started to account descriptor blocks into > transactions outstanding credits. However it didn't appropriately > decrease the maximum amount of credits available to userspace. Thus if > the filesystem requests a transaction smaller than > j_max_transaction_buffers but large enough that when descriptor blocks > are added the size exceeds j_max_transaction_buffers, we confuse > add_transaction_credits() into thinking previous handles have grown the > transaction too much and enter infinite journal commit loop in > start_this_handle() -> add_transaction_credits() trying to create > transaction with enough credits available. > Hello Jan! I understand that the incorrect max transaction limit in start_this_handle() could lead to infinite loop in start_this_handle()-> add_transaction_credits() with large enough userspace credits (from j_max_transaction_buffers - overheads to j_max_transaction_buffers), but I don't get how could it lead to ran out of space in the journal commit traction? IIUC, below codes in add_transaction_credits() could make sure that we have enough space when committing traction: static int add_transaction_credits() { ... if (jbd2_log_space_left(journal) < journal->j_max_transaction_buffers) { ... return 1; ... } ... } I can't open and download the image Alexander gave, so I can't get to the bottom of this issue, please let me know what happened with jbd2_journal_next_log_block(). Thanks, Yi. > Fix the problem by properly accounting for transaction space reserved > for descriptor blocks when verifying requested transaction handle size. > > CC: stable@vger.kernel.org > Fixes: 9f356e5a4f12 ("jbd2: Account descriptor blocks into t_outstanding_credits") > Reported-by: Alexander Coffin <alex.coffin@maticrobots.com> > Link: https://lore.kernel.org/all/CA+hUFcuGs04JHZ_WzA1zGN57+ehL2qmHOt5a7RMpo+rv6Vyxtw@mail.gmail.com > Signed-off-by: Jan Kara <jack@suse.cz> > --- > fs/jbd2/transaction.c | 21 ++++++++++++++------- > 1 file changed, 14 insertions(+), 7 deletions(-) > > diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c > index a095f1a3114b..66513c18ca29 100644 > --- a/fs/jbd2/transaction.c > +++ b/fs/jbd2/transaction.c > @@ -191,6 +191,13 @@ static void sub_reserved_credits(journal_t *journal, int blocks) > wake_up(&journal->j_wait_reserved); > } > > +/* Maximum number of blocks for user transaction payload */ > +static int jbd2_max_user_trans_buffers(journal_t *journal) > +{ > + return journal->j_max_transaction_buffers - > + journal->j_transaction_overhead_buffers; > +} > + > /* > * Wait until we can add credits for handle to the running transaction. Called > * with j_state_lock held for reading. Returns 0 if handle joined the running > @@ -240,12 +247,12 @@ __must_hold(&journal->j_state_lock) > * big to fit this handle? Wait until reserved credits are freed. > */ > if (atomic_read(&journal->j_reserved_credits) + total > > - journal->j_max_transaction_buffers) { > + jbd2_max_user_trans_buffers(journal)) { > read_unlock(&journal->j_state_lock); > jbd2_might_wait_for_commit(journal); > wait_event(journal->j_wait_reserved, > atomic_read(&journal->j_reserved_credits) + total <= > - journal->j_max_transaction_buffers); > + jbd2_max_user_trans_buffers(journal)); > __acquire(&journal->j_state_lock); /* fake out sparse */ > return 1; > } > @@ -285,14 +292,14 @@ __must_hold(&journal->j_state_lock) > > needed = atomic_add_return(rsv_blocks, &journal->j_reserved_credits); > /* We allow at most half of a transaction to be reserved */ > - if (needed > journal->j_max_transaction_buffers / 2) { > + if (needed > jbd2_max_user_trans_buffers(journal) / 2) { > sub_reserved_credits(journal, rsv_blocks); > atomic_sub(total, &t->t_outstanding_credits); > read_unlock(&journal->j_state_lock); > jbd2_might_wait_for_commit(journal); > wait_event(journal->j_wait_reserved, > atomic_read(&journal->j_reserved_credits) + rsv_blocks > - <= journal->j_max_transaction_buffers / 2); > + <= jbd2_max_user_trans_buffers(journal) / 2); > __acquire(&journal->j_state_lock); /* fake out sparse */ > return 1; > } > @@ -322,12 +329,12 @@ static int start_this_handle(journal_t *journal, handle_t *handle, > * size and limit the number of total credits to not exceed maximum > * transaction size per operation. > */ > - if ((rsv_blocks > journal->j_max_transaction_buffers / 2) || > - (rsv_blocks + blocks > journal->j_max_transaction_buffers)) { > + if (rsv_blocks > jbd2_max_user_trans_buffers(journal) / 2 || > + rsv_blocks + blocks > jbd2_max_user_trans_buffers(journal)) { > printk(KERN_ERR "JBD2: %s wants too many credits " > "credits:%d rsv_credits:%d max:%d\n", > current->comm, blocks, rsv_blocks, > - journal->j_max_transaction_buffers); > + jbd2_max_user_trans_buffers(journal)); > WARN_ON(1); > return -ENOSPC; > } >
On Wed 26-06-24 15:38:42, Zhang Yi wrote: > On 2024/6/25 1:01, Jan Kara wrote: > > Commit 9f356e5a4f12 ("jbd2: Account descriptor blocks into > > t_outstanding_credits") started to account descriptor blocks into > > transactions outstanding credits. However it didn't appropriately > > decrease the maximum amount of credits available to userspace. Thus if > > the filesystem requests a transaction smaller than > > j_max_transaction_buffers but large enough that when descriptor blocks > > are added the size exceeds j_max_transaction_buffers, we confuse > > add_transaction_credits() into thinking previous handles have grown the > > transaction too much and enter infinite journal commit loop in > > start_this_handle() -> add_transaction_credits() trying to create > > transaction with enough credits available. > > I understand that the incorrect max transaction limit in > start_this_handle() could lead to infinite loop in > start_this_handle()-> add_transaction_credits() with large enough > userspace credits (from j_max_transaction_buffers - overheads to > j_max_transaction_buffers), but I don't get how could it lead to ran > out of space in the journal commit traction? IIUC, below codes in > add_transaction_credits() could make sure that we have enough space > when committing traction: > > static int add_transaction_credits() > { > ... > if (jbd2_log_space_left(journal) < journal->j_max_transaction_buffers) { > ... > return 1; > ... > } > ... > } > > I can't open and download the image Alexander gave, so I can't get to > the bottom of this issue, please let me know what happened with > jbd2_journal_next_log_block(). Sure. So what was exactly happening is a loop like this: start_this_handle() blocks = 252 (handle->h_total_credits) - starts a new transaction - t_outstanding_credits set to 6 to account for the commit block and descriptor blocks add_transaction_credits(journal, 252, 0) needed = atomic_add_return(252, &t->t_outstanding_credits); if (needed > journal->j_max_transaction_buffers) { /* Yes, this is exceeded due to descriptor blocks being in * t_outstanding_credits */ ... wait_transaction_locked(journal); - this commits an empty transaction - contains only the commit block return 1 goto repeat So we commit single block transactions in a loop until we exhaust all the journal space. The condition in add_transaction_credits() whether there's enough space in the journal is never reached so we don't ever push the journal tail to make space in the journal. Honza
On 2024/6/26 19:22, Jan Kara wrote: > On Wed 26-06-24 15:38:42, Zhang Yi wrote: >> On 2024/6/25 1:01, Jan Kara wrote: >>> Commit 9f356e5a4f12 ("jbd2: Account descriptor blocks into >>> t_outstanding_credits") started to account descriptor blocks into >>> transactions outstanding credits. However it didn't appropriately >>> decrease the maximum amount of credits available to userspace. Thus if >>> the filesystem requests a transaction smaller than >>> j_max_transaction_buffers but large enough that when descriptor blocks >>> are added the size exceeds j_max_transaction_buffers, we confuse >>> add_transaction_credits() into thinking previous handles have grown the >>> transaction too much and enter infinite journal commit loop in >>> start_this_handle() -> add_transaction_credits() trying to create >>> transaction with enough credits available. >> >> I understand that the incorrect max transaction limit in >> start_this_handle() could lead to infinite loop in >> start_this_handle()-> add_transaction_credits() with large enough >> userspace credits (from j_max_transaction_buffers - overheads to >> j_max_transaction_buffers), but I don't get how could it lead to ran >> out of space in the journal commit traction? IIUC, below codes in >> add_transaction_credits() could make sure that we have enough space >> when committing traction: >> >> static int add_transaction_credits() >> { >> ... >> if (jbd2_log_space_left(journal) < journal->j_max_transaction_buffers) { >> ... >> return 1; >> ... >> } >> ... >> } >> >> I can't open and download the image Alexander gave, so I can't get to >> the bottom of this issue, please let me know what happened with >> jbd2_journal_next_log_block(). > > Sure. So what was exactly happening is a loop like this: > > start_this_handle() > blocks = 252 (handle->h_total_credits) > - starts a new transaction > - t_outstanding_credits set to 6 to account for the commit block and > descriptor blocks > add_transaction_credits(journal, 252, 0) > needed = atomic_add_return(252, &t->t_outstanding_credits); > if (needed > journal->j_max_transaction_buffers) { > /* Yes, this is exceeded due to descriptor blocks being in > * t_outstanding_credits */ > ... > wait_transaction_locked(journal); > - this commits an empty transaction - contains only the commit > block > return 1 > goto repeat > > So we commit single block transactions in a loop until we exhaust all the > journal space. The condition in add_transaction_credits() whether there's > enough space in the journal is never reached so we don't ever push the > journal tail to make space in the journal. > mm-hm, ha, yeah, this will lead to submit an empty transaction in each loop, but I still have one question, although the journal tail can't be updated in add_transaction_credits(), I think the journal tail should be updated in jbd2_journal_commit_transaction()->jbd2_update_log_tail() since we don't add empty transactions to journal->j_checkpoint_transactions list, the loop in jbd2_journal_commit_transaction() should like this: ... jbd2_journal_commit_transaction() update_tail = jbd2_journal_get_log_tail() //journal->j_checkpoint_transactions should be NULL, tid is the //committing transaction's tid, which should be large than the //tail tid since the second loop, so this should be true after //the second loop if (freed < journal->j_max_transaction_buffers) update_tail = 0; //update_tail should be true after j_max_transaction_buffers loop jbd2_update_log_tail() //j_free should be increased in each //j_max_transaction_buffers loop if (tid_gt(tid, journal->j_tail_sequence)) //it's true __jbd2_update_log_tail() journal->j_free += freed; //update log tail and increase j_free //j_max_transaction_buffers blocks ... As I understand it, the journal space can't be exhausted, I don't know how it happened, am I missing something? Thanks, Yi.
On Wed 26-06-24 21:24:13, Zhang Yi wrote: > On 2024/6/26 19:22, Jan Kara wrote: > > On Wed 26-06-24 15:38:42, Zhang Yi wrote: > >> On 2024/6/25 1:01, Jan Kara wrote: > >>> Commit 9f356e5a4f12 ("jbd2: Account descriptor blocks into > >>> t_outstanding_credits") started to account descriptor blocks into > >>> transactions outstanding credits. However it didn't appropriately > >>> decrease the maximum amount of credits available to userspace. Thus if > >>> the filesystem requests a transaction smaller than > >>> j_max_transaction_buffers but large enough that when descriptor blocks > >>> are added the size exceeds j_max_transaction_buffers, we confuse > >>> add_transaction_credits() into thinking previous handles have grown the > >>> transaction too much and enter infinite journal commit loop in > >>> start_this_handle() -> add_transaction_credits() trying to create > >>> transaction with enough credits available. > >> > >> I understand that the incorrect max transaction limit in > >> start_this_handle() could lead to infinite loop in > >> start_this_handle()-> add_transaction_credits() with large enough > >> userspace credits (from j_max_transaction_buffers - overheads to > >> j_max_transaction_buffers), but I don't get how could it lead to ran > >> out of space in the journal commit traction? IIUC, below codes in > >> add_transaction_credits() could make sure that we have enough space > >> when committing traction: > >> > >> static int add_transaction_credits() > >> { > >> ... > >> if (jbd2_log_space_left(journal) < journal->j_max_transaction_buffers) { > >> ... > >> return 1; > >> ... > >> } > >> ... > >> } > >> > >> I can't open and download the image Alexander gave, so I can't get to > >> the bottom of this issue, please let me know what happened with > >> jbd2_journal_next_log_block(). > > > > Sure. So what was exactly happening is a loop like this: > > > > start_this_handle() > > blocks = 252 (handle->h_total_credits) > > - starts a new transaction > > - t_outstanding_credits set to 6 to account for the commit block and > > descriptor blocks > > add_transaction_credits(journal, 252, 0) > > needed = atomic_add_return(252, &t->t_outstanding_credits); > > if (needed > journal->j_max_transaction_buffers) { > > /* Yes, this is exceeded due to descriptor blocks being in > > * t_outstanding_credits */ > > ... > > wait_transaction_locked(journal); > > - this commits an empty transaction - contains only the commit > > block > > return 1 > > goto repeat > > > > So we commit single block transactions in a loop until we exhaust all the > > journal space. The condition in add_transaction_credits() whether there's > > enough space in the journal is never reached so we don't ever push the > > journal tail to make space in the journal. > > > > mm-hm, ha, yeah, this will lead to submit an empty transaction in each loop, > but I still have one question, although the journal tail can't be updated > in add_transaction_credits(), I think the journal tail should be updated in > jbd2_journal_commit_transaction()->jbd2_update_log_tail() since we don't > add empty transactions to journal->j_checkpoint_transactions list, the loop > in jbd2_journal_commit_transaction() should like this: > > ... > jbd2_journal_commit_transaction() > update_tail = jbd2_journal_get_log_tail() > //journal->j_checkpoint_transactions should be NULL, tid is the > //committing transaction's tid, which should be large than the > //tail tid since the second loop, so this should be true after > //the second loop > if (freed < journal->j_max_transaction_buffers) > update_tail = 0; > //update_tail should be true after j_max_transaction_buffers loop > > jbd2_update_log_tail() //j_free should be increased in each > //j_max_transaction_buffers loop > if (tid_gt(tid, journal->j_tail_sequence)) //it's true > __jbd2_update_log_tail() > journal->j_free += freed; //update log tail and increase j_free > //j_max_transaction_buffers blocks > ... > > As I understand it, the journal space can't be exhausted, I don't know how > it happened, am I missing something? Well, at the beginning of the journal there are a few non-empty transactions whose buffers didn't get checkpointed before we entered the commit loop. So we cannot just push the journal tail without doing a checkpoint and we never call into the checkpointing code in the commit loop... Honza
On 2024/6/26 22:55, Jan Kara wrote: > On Wed 26-06-24 21:24:13, Zhang Yi wrote: >> On 2024/6/26 19:22, Jan Kara wrote: >>> On Wed 26-06-24 15:38:42, Zhang Yi wrote: >>>> On 2024/6/25 1:01, Jan Kara wrote: >>>>> Commit 9f356e5a4f12 ("jbd2: Account descriptor blocks into >>>>> t_outstanding_credits") started to account descriptor blocks into >>>>> transactions outstanding credits. However it didn't appropriately >>>>> decrease the maximum amount of credits available to userspace. Thus if >>>>> the filesystem requests a transaction smaller than >>>>> j_max_transaction_buffers but large enough that when descriptor blocks >>>>> are added the size exceeds j_max_transaction_buffers, we confuse >>>>> add_transaction_credits() into thinking previous handles have grown the >>>>> transaction too much and enter infinite journal commit loop in >>>>> start_this_handle() -> add_transaction_credits() trying to create >>>>> transaction with enough credits available. >>>> >>>> I understand that the incorrect max transaction limit in >>>> start_this_handle() could lead to infinite loop in >>>> start_this_handle()-> add_transaction_credits() with large enough >>>> userspace credits (from j_max_transaction_buffers - overheads to >>>> j_max_transaction_buffers), but I don't get how could it lead to ran >>>> out of space in the journal commit traction? IIUC, below codes in >>>> add_transaction_credits() could make sure that we have enough space >>>> when committing traction: >>>> >>>> static int add_transaction_credits() >>>> { >>>> ... >>>> if (jbd2_log_space_left(journal) < journal->j_max_transaction_buffers) { >>>> ... >>>> return 1; >>>> ... >>>> } >>>> ... >>>> } >>>> >>>> I can't open and download the image Alexander gave, so I can't get to >>>> the bottom of this issue, please let me know what happened with >>>> jbd2_journal_next_log_block(). >>> >>> Sure. So what was exactly happening is a loop like this: >>> >>> start_this_handle() >>> blocks = 252 (handle->h_total_credits) >>> - starts a new transaction >>> - t_outstanding_credits set to 6 to account for the commit block and >>> descriptor blocks >>> add_transaction_credits(journal, 252, 0) >>> needed = atomic_add_return(252, &t->t_outstanding_credits); >>> if (needed > journal->j_max_transaction_buffers) { >>> /* Yes, this is exceeded due to descriptor blocks being in >>> * t_outstanding_credits */ >>> ... >>> wait_transaction_locked(journal); >>> - this commits an empty transaction - contains only the commit >>> block >>> return 1 >>> goto repeat >>> >>> So we commit single block transactions in a loop until we exhaust all the >>> journal space. The condition in add_transaction_credits() whether there's >>> enough space in the journal is never reached so we don't ever push the >>> journal tail to make space in the journal. >>> >> >> mm-hm, ha, yeah, this will lead to submit an empty transaction in each loop, >> but I still have one question, although the journal tail can't be updated >> in add_transaction_credits(), I think the journal tail should be updated in >> jbd2_journal_commit_transaction()->jbd2_update_log_tail() since we don't >> add empty transactions to journal->j_checkpoint_transactions list, the loop >> in jbd2_journal_commit_transaction() should like this: >> >> ... >> jbd2_journal_commit_transaction() >> update_tail = jbd2_journal_get_log_tail() >> //journal->j_checkpoint_transactions should be NULL, tid is the >> //committing transaction's tid, which should be large than the >> //tail tid since the second loop, so this should be true after >> //the second loop >> if (freed < journal->j_max_transaction_buffers) >> update_tail = 0; >> //update_tail should be true after j_max_transaction_buffers loop >> >> jbd2_update_log_tail() //j_free should be increased in each >> //j_max_transaction_buffers loop >> if (tid_gt(tid, journal->j_tail_sequence)) //it's true >> __jbd2_update_log_tail() >> journal->j_free += freed; //update log tail and increase j_free >> //j_max_transaction_buffers blocks >> ... >> >> As I understand it, the journal space can't be exhausted, I don't know how >> it happened, am I missing something? > > Well, at the beginning of the journal there are a few non-empty > transactions whose buffers didn't get checkpointed before we entered the > commit loop. So we cannot just push the journal tail without doing a > checkpoint and we never call into the checkpointing code in the commit > loop... Oh, indeed, I see, thanks for explaining and this patch looks good to me. Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Thanks, Yi.
diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c index a095f1a3114b..66513c18ca29 100644 --- a/fs/jbd2/transaction.c +++ b/fs/jbd2/transaction.c @@ -191,6 +191,13 @@ static void sub_reserved_credits(journal_t *journal, int blocks) wake_up(&journal->j_wait_reserved); } +/* Maximum number of blocks for user transaction payload */ +static int jbd2_max_user_trans_buffers(journal_t *journal) +{ + return journal->j_max_transaction_buffers - + journal->j_transaction_overhead_buffers; +} + /* * Wait until we can add credits for handle to the running transaction. Called * with j_state_lock held for reading. Returns 0 if handle joined the running @@ -240,12 +247,12 @@ __must_hold(&journal->j_state_lock) * big to fit this handle? Wait until reserved credits are freed. */ if (atomic_read(&journal->j_reserved_credits) + total > - journal->j_max_transaction_buffers) { + jbd2_max_user_trans_buffers(journal)) { read_unlock(&journal->j_state_lock); jbd2_might_wait_for_commit(journal); wait_event(journal->j_wait_reserved, atomic_read(&journal->j_reserved_credits) + total <= - journal->j_max_transaction_buffers); + jbd2_max_user_trans_buffers(journal)); __acquire(&journal->j_state_lock); /* fake out sparse */ return 1; } @@ -285,14 +292,14 @@ __must_hold(&journal->j_state_lock) needed = atomic_add_return(rsv_blocks, &journal->j_reserved_credits); /* We allow at most half of a transaction to be reserved */ - if (needed > journal->j_max_transaction_buffers / 2) { + if (needed > jbd2_max_user_trans_buffers(journal) / 2) { sub_reserved_credits(journal, rsv_blocks); atomic_sub(total, &t->t_outstanding_credits); read_unlock(&journal->j_state_lock); jbd2_might_wait_for_commit(journal); wait_event(journal->j_wait_reserved, atomic_read(&journal->j_reserved_credits) + rsv_blocks - <= journal->j_max_transaction_buffers / 2); + <= jbd2_max_user_trans_buffers(journal) / 2); __acquire(&journal->j_state_lock); /* fake out sparse */ return 1; } @@ -322,12 +329,12 @@ static int start_this_handle(journal_t *journal, handle_t *handle, * size and limit the number of total credits to not exceed maximum * transaction size per operation. */ - if ((rsv_blocks > journal->j_max_transaction_buffers / 2) || - (rsv_blocks + blocks > journal->j_max_transaction_buffers)) { + if (rsv_blocks > jbd2_max_user_trans_buffers(journal) / 2 || + rsv_blocks + blocks > jbd2_max_user_trans_buffers(journal)) { printk(KERN_ERR "JBD2: %s wants too many credits " "credits:%d rsv_credits:%d max:%d\n", current->comm, blocks, rsv_blocks, - journal->j_max_transaction_buffers); + jbd2_max_user_trans_buffers(journal)); WARN_ON(1); return -ENOSPC; }
Commit 9f356e5a4f12 ("jbd2: Account descriptor blocks into t_outstanding_credits") started to account descriptor blocks into transactions outstanding credits. However it didn't appropriately decrease the maximum amount of credits available to userspace. Thus if the filesystem requests a transaction smaller than j_max_transaction_buffers but large enough that when descriptor blocks are added the size exceeds j_max_transaction_buffers, we confuse add_transaction_credits() into thinking previous handles have grown the transaction too much and enter infinite journal commit loop in start_this_handle() -> add_transaction_credits() trying to create transaction with enough credits available. Fix the problem by properly accounting for transaction space reserved for descriptor blocks when verifying requested transaction handle size. CC: stable@vger.kernel.org Fixes: 9f356e5a4f12 ("jbd2: Account descriptor blocks into t_outstanding_credits") Reported-by: Alexander Coffin <alex.coffin@maticrobots.com> Link: https://lore.kernel.org/all/CA+hUFcuGs04JHZ_WzA1zGN57+ehL2qmHOt5a7RMpo+rv6Vyxtw@mail.gmail.com Signed-off-by: Jan Kara <jack@suse.cz> --- fs/jbd2/transaction.c | 21 ++++++++++++++------- 1 file changed, 14 insertions(+), 7 deletions(-)