diff mbox series

[16/22] jbd2: Account descriptor blocks into t_outstanding_credits

Message ID 20191003220613.10791-16-jack@suse.cz
State Superseded
Headers show
Series ext4: Fix transaction overflow due to revoke descriptors | expand

Commit Message

Jan Kara Oct. 3, 2019, 10:06 p.m. UTC
Currently, journal descriptor blocks were not accounted in
transaction->t_outstanding_credits and we were just leaving some slack
space in the journal for them (in jbd2_log_space_left() and
jbd2_space_needed()). This is making proper accounting (and reservation
we want to add) of descriptor blocks difficult so switch to accounting
descriptor blocks in transaction->t_outstanding_credits and just reserve
the same amount of credits in t_outstanding credits for journal
descriptor blocks when creating transaction.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/jbd2/commit.c      |  3 +++
 fs/jbd2/journal.c     |  1 +
 fs/jbd2/transaction.c | 20 ++++++++++++--------
 include/linux/jbd2.h  | 16 +++-------------
 4 files changed, 19 insertions(+), 21 deletions(-)

Comments

Theodore Ts'o Oct. 21, 2019, 9:04 p.m. UTC | #1
On Fri, Oct 04, 2019 at 12:06:02AM +0200, Jan Kara wrote:
> Currently, journal descriptor blocks were not accounted in
> transaction->t_outstanding_credits and we were just leaving some slack
> space in the journal for them (in jbd2_log_space_left() and
> jbd2_space_needed()). This is making proper accounting (and reservation
> we want to add) of descriptor blocks difficult so switch to accounting
> descriptor blocks in transaction->t_outstanding_credits and just reserve
> the same amount of credits in t_outstanding credits for journal
> descriptor blocks when creating transaction.

This changes the meaning of t_oustanding credits; in particular the
documentation of t_outstanding_credits in include/linux/jbd2.h is no
longer correct, as it currently defines it has containing:

     Number of buffers reserved for use by all handles in this transaction
     handle but not yet modified. [none]

Previously, t_outstanding_credits would go to zero once all of the
handles attached to the transaction were closed.  Now, it is
initialized to j_max_transaction_buffers >> 32, and once all of the
handles are closed t_outstanding_credits will go back to that value.
It then gets decremented as we write each jbd descriptor block
(whether it is for a revoke block or a data block) during the commit
and we throw a warning if we ever write more than j_max_transaction_buffers >> 32
descriptor blocks.

Is that a fair summary of what happens after this commit?

The thing is, I don't see how this helps the rest of the patch series;
we account for space needed for the revoke blocks in later patches,
but I don't see that adjusting t_outstanding credits.  We reserve
extra space for the revoke blocks, and we then account for that space,
but the fact that we have accounted for all of the extra descriptor
blocks in t_outstanding_credits doesn't seem to be changed.  As a
result, we appear to be double-counting the space needed for the
revoke descriptor blocks.  Which is fine; I don't mind the accounting
being a bit more conservative, but I find myself being a bit puzzled
about why this change is necessary or adds value.

What am I missing?

						- Ted
Jan Kara Oct. 23, 2019, 1:09 p.m. UTC | #2
On Mon 21-10-19 17:04:20, Theodore Y. Ts'o wrote:
> On Fri, Oct 04, 2019 at 12:06:02AM +0200, Jan Kara wrote:
> > Currently, journal descriptor blocks were not accounted in
> > transaction->t_outstanding_credits and we were just leaving some slack
> > space in the journal for them (in jbd2_log_space_left() and
> > jbd2_space_needed()). This is making proper accounting (and reservation
> > we want to add) of descriptor blocks difficult so switch to accounting
> > descriptor blocks in transaction->t_outstanding_credits and just reserve
> > the same amount of credits in t_outstanding credits for journal
> > descriptor blocks when creating transaction.
> 
> This changes the meaning of t_oustanding credits; in particular the
> documentation of t_outstanding_credits in include/linux/jbd2.h is no
> longer correct, as it currently defines it has containing:
> 
>      Number of buffers reserved for use by all handles in this transaction
>      handle but not yet modified. [none]

Right, I can improve the description to better match the new meaning.

> Previously, t_outstanding_credits would go to zero once all of the
> handles attached to the transaction were closed.  Now, it is
> initialized to j_max_transaction_buffers >> 32, and once all of the
> handles are closed t_outstanding_credits will go back to that value.
> It then gets decremented as we write each jbd descriptor block
> (whether it is for a revoke block or a data block) during the commit
> and we throw a warning if we ever write more than j_max_transaction_buffers >> 32
> descriptor blocks.
> 
> Is that a fair summary of what happens after this commit?

Yes, that's correct.

> The thing is, I don't see how this helps the rest of the patch series;
> we account for space needed for the revoke blocks in later patches,
> but I don't see that adjusting t_outstanding credits.  We reserve
> extra space for the revoke blocks, and we then account for that space,
> but the fact that we have accounted for all of the extra descriptor
> blocks in t_outstanding_credits doesn't seem to be changed.  As a
> result, we appear to be double-counting the space needed for the
> revoke descriptor blocks.  Which is fine; I don't mind the accounting
> being a bit more conservative, but I find myself being a bit puzzled
> about why this change is necessary or adds value.

As you properly mentioned above the new meaning of t_outstanding_credits
is meant to be "the amount of space reserved for the transaction in the
journal". This is including all the descriptor blocks the transaction may
need. And it seemed easier to me to change t_outstanding_credits to this
new meaning because later the amount of space reserved for descriptor
blocks stops being constant so we would have to change several places to
use "t_outstanding_credits + t_descritor_credits" instead which gets
especially tricky in cases where we manipulate t_outstanding_credits
atomically (start_this_handle(), add_transaction_credits() in particular).

WRT double-accounting of credits reserved for descriptor blocks: Yes,
revoke descriptor blocks get accounted separately later in the series and
my plan was to shrink the estimate in jbd2_descriptor_blocks_per_trans() at
the end of the series to match the fact that now we need to account only
for commit block and other control blocks which are much more limited.
Which I forgot about in the end. So I will add a patch to do that now.
Thanks for the review!

								Honza
diff mbox series

Patch

diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
index b67e2d0cff88..43f2dde5bb47 100644
--- a/fs/jbd2/commit.c
+++ b/fs/jbd2/commit.c
@@ -889,6 +889,9 @@  void jbd2_journal_commit_transaction(journal_t *journal)
 	if (err)
 		jbd2_journal_abort(journal, err);
 
+	WARN_ON_ONCE(
+		atomic_read(&commit_transaction->t_outstanding_credits) < 0);
+
 	/*
 	 * Now disk caches for filesystem device are flushed so we are safe to
 	 * erase checkpointed transactions from the log by updating journal
diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index 1c58859aa592..a4ec198b10c5 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -840,6 +840,7 @@  jbd2_journal_get_descriptor_buffer(transaction_t *transaction, int type)
 	bh = __getblk(journal->j_dev, blocknr, journal->j_blocksize);
 	if (!bh)
 		return NULL;
+	atomic_dec(&transaction->t_outstanding_credits);
 	lock_buffer(bh);
 	memset(bh->b_data, 0, journal->j_blocksize);
 	header = (journal_header_t *)bh->b_data;
diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index d4ee02e5161b..a364d0623884 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -62,6 +62,17 @@  void jbd2_journal_free_transaction(transaction_t *transaction)
 	kmem_cache_free(transaction_cache, transaction);
 }
 
+/*
+ * We reserve t_outstanding_credits >> JBD2_CONTROL_BLOCKS_SHIFT for
+ * transaction descriptor blocks.
+ */
+#define JBD2_CONTROL_BLOCKS_SHIFT 5
+
+static int jbd2_descriptor_blocks_per_trans(journal_t *journal)
+{
+	return journal->j_max_transaction_buffers >> JBD2_CONTROL_BLOCKS_SHIFT;
+}
+
 /*
  * jbd2_get_transaction: obtain a new transaction_t object.
  *
@@ -88,6 +99,7 @@  static void jbd2_get_transaction(journal_t *journal,
 	spin_lock_init(&transaction->t_handle_lock);
 	atomic_set(&transaction->t_updates, 0);
 	atomic_set(&transaction->t_outstanding_credits,
+		   jbd2_descriptor_blocks_per_trans(journal) +
 		   atomic_read(&journal->j_reserved_credits));
 	atomic_set(&transaction->t_handle_count, 0);
 	INIT_LIST_HEAD(&transaction->t_inode_list);
@@ -634,14 +646,6 @@  int jbd2_journal_extend(handle_t *handle, int nblocks)
 		goto unlock;
 	}
 
-	if (wanted + (wanted >> JBD2_CONTROL_BLOCKS_SHIFT) >
-	    jbd2_log_space_left(journal)) {
-		jbd_debug(3, "denied handle %p %d blocks: "
-			  "insufficient log space\n", handle, nblocks);
-		atomic_sub(nblocks, &transaction->t_outstanding_credits);
-		goto unlock;
-	}
-
 	trace_jbd2_handle_extend(journal->j_fs_dev->bd_dev,
 				 transaction->t_tid,
 				 handle->h_type, handle->h_line_no,
diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index 727ff91d7f3e..1bb37d3e3839 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -1560,20 +1560,13 @@  static inline int jbd2_journal_has_csum_v2or3(journal_t *journal)
 	return journal->j_chksum_driver != NULL;
 }
 
-/*
- * We reserve t_outstanding_credits >> JBD2_CONTROL_BLOCKS_SHIFT for
- * transaction control blocks.
- */
-#define JBD2_CONTROL_BLOCKS_SHIFT 5
-
 /*
  * Return the minimum number of blocks which must be free in the journal
  * before a new transaction may be started.  Must be called under j_state_lock.
  */
 static inline int jbd2_space_needed(journal_t *journal)
 {
-	int nblocks = journal->j_max_transaction_buffers;
-	return nblocks + (nblocks >> JBD2_CONTROL_BLOCKS_SHIFT);
+	return journal->j_max_transaction_buffers;
 }
 
 /*
@@ -1585,11 +1578,8 @@  static inline unsigned long jbd2_log_space_left(journal_t *journal)
 	long free = journal->j_free - 32;
 
 	if (journal->j_committing_transaction) {
-		unsigned long committing = atomic_read(&journal->
-			j_committing_transaction->t_outstanding_credits);
-
-		/* Transaction + control blocks */
-		free -= committing + (committing >> JBD2_CONTROL_BLOCKS_SHIFT);
+		free -= atomic_read(&journal->
+                        j_committing_transaction->t_outstanding_credits);
 	}
 	return max_t(long, free, 0);
 }