diff mbox series

[20/25] jbd2: Reserve space for revoke descriptor blocks

Message ID 20191105164437.32602-20-jack@suse.cz
State Accepted, archived
Headers show
Series None | expand

Commit Message

Jan Kara Nov. 5, 2019, 4:44 p.m. UTC
Extend functions for starting, extending, and restarting transaction
handles to take number of revoke records handle must be able to
accommodate. These functions then make sure transaction has enough
credits to be able to store resulting revoke descriptor blocks. Also
revoke code tracks number of revoke records created by a handle to catch
situation where some place didn't reserve enough space for revoke
records. Similarly to standard transaction credits, space for unused
reserved revoke records is released when the handle is stopped.

On the ext4 side we currently take a simplistic approach of reserving
space for 1024 revoke records for any transaction. This grows amount of
credits reserved for each handle only by a few and is enough for any
normal workload so that we don't hit warnings in jbd2. We will refine
the logic in following commits.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ext4/ext4_jbd2.c   |  2 +-
 fs/ext4/ext4_jbd2.h   |  4 ++--
 fs/jbd2/journal.c     | 21 ++++++++++++++++++++
 fs/jbd2/revoke.c      |  6 ++++++
 fs/jbd2/transaction.c | 54 ++++++++++++++++++++++++++++++++++++++++++++-------
 fs/ocfs2/journal.c    |  4 ++--
 include/linux/jbd2.h  | 43 ++++++++++++++++++++++++++++++----------
 7 files changed, 112 insertions(+), 22 deletions(-)

Comments

Eric Biggers Nov. 15, 2019, 7:52 a.m. UTC | #1
On Tue, Nov 05, 2019 at 05:44:26PM +0100, Jan Kara wrote:
>  static inline int jbd2_handle_buffer_credits(handle_t *handle)
>  {
> -	return handle->h_buffer_credits;
> +	journal_t *journal = handle->h_transaction->t_journal;
> +
> +	return handle->h_buffer_credits -
> +		DIV_ROUND_UP(handle->h_revoke_credits_requested,
> +			     journal->j_revoke_records_per_block);
>  }

This patch is causing a crash with 'kvm-xfstests -c dioread_nolock ext4/024'.
Looks like this code incorrectly assumes that h_transaction is always valid
rather than the other member of the union, h_journal.


BUG: kernel NULL pointer dereference, address: 0000000000000614
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0 
Oops: 0000 [#1] SMP
CPU: 1 PID: 105 Comm: kworker/u4:3 Not tainted 5.4.0-rc3-00020-gfdc3ef882a5d #18
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20191013_105130-anatol 04/01/2014
Workqueue: ext4-rsv-conversion ext4_end_io_rsv_work
RIP: 0010:jbd2_handle_buffer_credits include/linux/jbd2.h:1656 [inline]
RIP: 0010:__ext4_journal_start_reserved+0x38/0x1f0 fs/ext4/ext4_jbd2.c:122
Code: 83 ec 10 48 81 ff ff 0f 00 00 89 75 d4 89 55 d0 0f 86 f5 00 00 00 48 8b 07 49 89 fc 48 8b 5d 08 4c 8b a8 40 07 00 6
RSP: 0018:ffffc90000457d40 EFLAGS: 00010296
RAX: 0000000000000038 RBX: ffffffff812e68fb RCX: 000000000000000c
RDX: 000000000000000b RSI: 000000000000137f RDI: ffff8880779c5468
RBP: ffffc90000457d78 R08: 0000000000001000 R09: 0000000000001000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8880779c5468
R13: ffff88807b726000 R14: ffff8880779ad9e8 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff88807fd00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000614 CR3: 000000007a0dd000 CR4: 00000000003406e0
Call Trace:
 ext4_convert_unwritten_extents+0x8b/0x250 fs/ext4/extents.c:4991
 ext4_end_io fs/ext4/page-io.c:152 [inline]
 ext4_do_flush_completed_IO fs/ext4/page-io.c:226 [inline]
 ext4_end_io_rsv_work+0x11a/0x1f0 fs/ext4/page-io.c:240
 process_one_work+0x227/0x5b0 kernel/workqueue.c:2269
 worker_thread+0x4b/0x3c0 kernel/workqueue.c:2415
 kthread+0x125/0x140 kernel/kthread.c:255
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
CR2: 0000000000000614
---[ end trace d8eaf4e1225480d5 ]---
Jan Kara Nov. 15, 2019, 10:02 a.m. UTC | #2
On Thu 14-11-19 23:52:23, Eric Biggers wrote:
> On Tue, Nov 05, 2019 at 05:44:26PM +0100, Jan Kara wrote:
> >  static inline int jbd2_handle_buffer_credits(handle_t *handle)
> >  {
> > -	return handle->h_buffer_credits;
> > +	journal_t *journal = handle->h_transaction->t_journal;
> > +
> > +	return handle->h_buffer_credits -
> > +		DIV_ROUND_UP(handle->h_revoke_credits_requested,
> > +			     journal->j_revoke_records_per_block);
> >  }
> 
> This patch is causing a crash with 'kvm-xfstests -c dioread_nolock ext4/024'.
> Looks like this code incorrectly assumes that h_transaction is always valid
> rather than the other member of the union, h_journal.

Right, thanks for the report! Just out of curiosity: You have to have that
tracepoint enabled for the crash to trigger, don't you? Because I'm pretty
sure I did dioread_nolock runs...

I'll send a fix shortly.

								Honza

> 
> 
> BUG: kernel NULL pointer dereference, address: 0000000000000614
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> PGD 0 P4D 0 
> Oops: 0000 [#1] SMP
> CPU: 1 PID: 105 Comm: kworker/u4:3 Not tainted 5.4.0-rc3-00020-gfdc3ef882a5d #18
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20191013_105130-anatol 04/01/2014
> Workqueue: ext4-rsv-conversion ext4_end_io_rsv_work
> RIP: 0010:jbd2_handle_buffer_credits include/linux/jbd2.h:1656 [inline]
> RIP: 0010:__ext4_journal_start_reserved+0x38/0x1f0 fs/ext4/ext4_jbd2.c:122
> Code: 83 ec 10 48 81 ff ff 0f 00 00 89 75 d4 89 55 d0 0f 86 f5 00 00 00 48 8b 07 49 89 fc 48 8b 5d 08 4c 8b a8 40 07 00 6
> RSP: 0018:ffffc90000457d40 EFLAGS: 00010296
> RAX: 0000000000000038 RBX: ffffffff812e68fb RCX: 000000000000000c
> RDX: 000000000000000b RSI: 000000000000137f RDI: ffff8880779c5468
> RBP: ffffc90000457d78 R08: 0000000000001000 R09: 0000000000001000
> R10: 0000000000000000 R11: 0000000000000000 R12: ffff8880779c5468
> R13: ffff88807b726000 R14: ffff8880779ad9e8 R15: 0000000000000000
> FS:  0000000000000000(0000) GS:ffff88807fd00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000614 CR3: 000000007a0dd000 CR4: 00000000003406e0
> Call Trace:
>  ext4_convert_unwritten_extents+0x8b/0x250 fs/ext4/extents.c:4991
>  ext4_end_io fs/ext4/page-io.c:152 [inline]
>  ext4_do_flush_completed_IO fs/ext4/page-io.c:226 [inline]
>  ext4_end_io_rsv_work+0x11a/0x1f0 fs/ext4/page-io.c:240
>  process_one_work+0x227/0x5b0 kernel/workqueue.c:2269
>  worker_thread+0x4b/0x3c0 kernel/workqueue.c:2415
>  kthread+0x125/0x140 kernel/kthread.c:255
>  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
> CR2: 0000000000000614
> ---[ end trace d8eaf4e1225480d5 ]---
Theodore Ts'o Nov. 15, 2019, 2:20 p.m. UTC | #3
On Fri, Nov 15, 2019 at 11:02:22AM +0100, Jan Kara wrote:
> On Thu 14-11-19 23:52:23, Eric Biggers wrote:
> > On Tue, Nov 05, 2019 at 05:44:26PM +0100, Jan Kara wrote:
> > >  static inline int jbd2_handle_buffer_credits(handle_t *handle)
> > >  {
> > > -	return handle->h_buffer_credits;
> > > +	journal_t *journal = handle->h_transaction->t_journal;
> > > +
> > > +	return handle->h_buffer_credits -
> > > +		DIV_ROUND_UP(handle->h_revoke_credits_requested,
> > > +			     journal->j_revoke_records_per_block);
> > >  }
> > 
> > This patch is causing a crash with 'kvm-xfstests -c dioread_nolock ext4/024'.
> > Looks like this code incorrectly assumes that h_transaction is always valid
> > rather than the other member of the union, h_journal.
> 
> Right, thanks for the report! Just out of curiosity: You have to have that
> tracepoint enabled for the crash to trigger, don't you? Because I'm pretty
> sure I did dioread_nolock runs...

I've been *definitely* been doing dioread_nolock runs (including two
last night), with no failures.

ext4/dioread_nolock: 485 tests, 40 skipped, 5142 seconds

		     	 	   	    - Ted
Eric Biggers Nov. 15, 2019, 5:10 p.m. UTC | #4
On Fri, Nov 15, 2019 at 09:20:33AM -0500, Theodore Y. Ts'o wrote:
> On Fri, Nov 15, 2019 at 11:02:22AM +0100, Jan Kara wrote:
> > On Thu 14-11-19 23:52:23, Eric Biggers wrote:
> > > On Tue, Nov 05, 2019 at 05:44:26PM +0100, Jan Kara wrote:
> > > >  static inline int jbd2_handle_buffer_credits(handle_t *handle)
> > > >  {
> > > > -	return handle->h_buffer_credits;
> > > > +	journal_t *journal = handle->h_transaction->t_journal;
> > > > +
> > > > +	return handle->h_buffer_credits -
> > > > +		DIV_ROUND_UP(handle->h_revoke_credits_requested,
> > > > +			     journal->j_revoke_records_per_block);
> > > >  }
> > > 
> > > This patch is causing a crash with 'kvm-xfstests -c dioread_nolock ext4/024'.
> > > Looks like this code incorrectly assumes that h_transaction is always valid
> > > rather than the other member of the union, h_journal.
> > 
> > Right, thanks for the report! Just out of curiosity: You have to have that
> > tracepoint enabled for the crash to trigger, don't you? Because I'm pretty
> > sure I did dioread_nolock runs...
> 
> I've been *definitely* been doing dioread_nolock runs (including two
> last night), with no failures.
> 
> ext4/dioread_nolock: 485 tests, 40 skipped, 5142 seconds
> 

No I didn't enable the tracepoint.  I think the difference is that I had
CONFIG_UBSAN enabled.  I get the crash if I use the following kconfig:

	curl -o .config 'https://git.kernel.org/pub/scm/fs/ext2/xfstests-bld.git/plain/kernel-configs/x86_64-config-5.4'
	echo CONFIG_UBSAN=y >> .config
	make olddefconfig

... but not if I don't enable UBSAN.

No idea why UBSAN makes a difference here, though.  I'm using gcc 9.2.0.

- Eric
diff mbox series

Patch

diff --git a/fs/ext4/ext4_jbd2.c b/fs/ext4/ext4_jbd2.c
index 731bbfdbce5b..b81190bee32d 100644
--- a/fs/ext4/ext4_jbd2.c
+++ b/fs/ext4/ext4_jbd2.c
@@ -78,7 +78,7 @@  handle_t *__ext4_journal_start_sb(struct super_block *sb, unsigned int line,
 	journal = EXT4_SB(sb)->s_journal;
 	if (!journal)
 		return ext4_get_nojournal();
-	return jbd2__journal_start(journal, blocks, rsv_blocks, GFP_NOFS,
+	return jbd2__journal_start(journal, blocks, rsv_blocks, 1024, GFP_NOFS,
 				   type, line);
 }
 
diff --git a/fs/ext4/ext4_jbd2.h b/fs/ext4/ext4_jbd2.h
index 36aa72599646..aca05e52e317 100644
--- a/fs/ext4/ext4_jbd2.h
+++ b/fs/ext4/ext4_jbd2.h
@@ -328,14 +328,14 @@  static inline handle_t *ext4_journal_current_handle(void)
 static inline int ext4_journal_extend(handle_t *handle, int nblocks)
 {
 	if (ext4_handle_valid(handle))
-		return jbd2_journal_extend(handle, nblocks);
+		return jbd2_journal_extend(handle, nblocks, 1024);
 	return 0;
 }
 
 static inline int ext4_journal_restart(handle_t *handle, int nblocks)
 {
 	if (ext4_handle_valid(handle))
-		return jbd2_journal_restart(handle, nblocks);
+		return jbd2__journal_restart(handle, nblocks, 1024, GFP_NOFS);
 	return 0;
 }
 
diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index 22b14b3ca197..eef809f61722 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -1500,6 +1500,21 @@  void jbd2_journal_update_sb_errno(journal_t *journal)
 }
 EXPORT_SYMBOL(jbd2_journal_update_sb_errno);
 
+static int journal_revoke_records_per_block(journal_t *journal)
+{
+	int record_size;
+	int space = journal->j_blocksize - sizeof(jbd2_journal_revoke_header_t);
+
+	if (jbd2_has_feature_64bit(journal))
+		record_size = 8;
+	else
+		record_size = 4;
+
+	if (jbd2_journal_has_csum_v2or3(journal))
+		space -= sizeof(struct jbd2_journal_block_tail);
+	return space / record_size;
+}
+
 /*
  * Read the superblock for a given journal, performing initial
  * validation of the format.
@@ -1608,6 +1623,8 @@  static int journal_get_superblock(journal_t *journal)
 						   sizeof(sb->s_uuid));
 	}
 
+	journal->j_revoke_records_per_block =
+				journal_revoke_records_per_block(journal);
 	set_buffer_verified(bh);
 
 	return 0;
@@ -1928,6 +1945,8 @@  int jbd2_journal_set_features (journal_t *journal, unsigned long compat,
 	sb->s_feature_ro_compat |= cpu_to_be32(ro);
 	sb->s_feature_incompat  |= cpu_to_be32(incompat);
 	unlock_buffer(journal->j_sb_buffer);
+	journal->j_revoke_records_per_block =
+				journal_revoke_records_per_block(journal);
 
 	return 1;
 #undef COMPAT_FEATURE_ON
@@ -1958,6 +1977,8 @@  void jbd2_journal_clear_features(journal_t *journal, unsigned long compat,
 	sb->s_feature_compat    &= ~cpu_to_be32(compat);
 	sb->s_feature_ro_compat &= ~cpu_to_be32(ro);
 	sb->s_feature_incompat  &= ~cpu_to_be32(incompat);
+	journal->j_revoke_records_per_block =
+				journal_revoke_records_per_block(journal);
 }
 EXPORT_SYMBOL(jbd2_journal_clear_features);
 
diff --git a/fs/jbd2/revoke.c b/fs/jbd2/revoke.c
index f08073d7bbf5..fa608788b93d 100644
--- a/fs/jbd2/revoke.c
+++ b/fs/jbd2/revoke.c
@@ -371,6 +371,11 @@  int jbd2_journal_revoke(handle_t *handle, unsigned long long blocknr,
 	}
 #endif
 
+	if (WARN_ON_ONCE(handle->h_revoke_credits <= 0)) {
+		if (!bh_in)
+			brelse(bh);
+		return -EIO;
+	}
 	/* We really ought not ever to revoke twice in a row without
            first having the revoke cancelled: it's illegal to free a
            block twice without allocating it in between! */
@@ -391,6 +396,7 @@  int jbd2_journal_revoke(handle_t *handle, unsigned long long blocknr,
 			__brelse(bh);
 		}
 	}
+	handle->h_revoke_credits--;
 
 	jbd_debug(2, "insert revoke for block %llu, bh_in=%p\n",blocknr, bh_in);
 	err = insert_revoke_hash(journal, blocknr,
diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index ba388da7e02b..1c121afbcf8f 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -101,6 +101,7 @@  static void jbd2_get_transaction(journal_t *journal,
 	atomic_set(&transaction->t_outstanding_credits,
 		   jbd2_descriptor_blocks_per_trans(journal) +
 		   atomic_read(&journal->j_reserved_credits));
+	atomic_set(&transaction->t_outstanding_revokes, 0);
 	atomic_set(&transaction->t_handle_count, 0);
 	INIT_LIST_HEAD(&transaction->t_inode_list);
 	INIT_LIST_HEAD(&transaction->t_private_list);
@@ -418,6 +419,7 @@  static int start_this_handle(journal_t *journal, handle_t *handle,
 	update_t_max_wait(transaction, ts);
 	handle->h_transaction = transaction;
 	handle->h_requested_credits = blocks;
+	handle->h_revoke_credits_requested = handle->h_revoke_credits;
 	handle->h_start_jiffies = jiffies;
 	atomic_inc(&transaction->t_updates);
 	atomic_inc(&transaction->t_handle_count);
@@ -451,8 +453,8 @@  static handle_t *new_handle(int nblocks)
 }
 
 handle_t *jbd2__journal_start(journal_t *journal, int nblocks, int rsv_blocks,
-			      gfp_t gfp_mask, unsigned int type,
-			      unsigned int line_no)
+			      int revoke_records, gfp_t gfp_mask,
+			      unsigned int type, unsigned int line_no)
 {
 	handle_t *handle = journal_current_handle();
 	int err;
@@ -466,6 +468,8 @@  handle_t *jbd2__journal_start(journal_t *journal, int nblocks, int rsv_blocks,
 		return handle;
 	}
 
+	nblocks += DIV_ROUND_UP(revoke_records,
+				journal->j_revoke_records_per_block);
 	handle = new_handle(nblocks);
 	if (!handle)
 		return ERR_PTR(-ENOMEM);
@@ -481,6 +485,7 @@  handle_t *jbd2__journal_start(journal_t *journal, int nblocks, int rsv_blocks,
 		rsv_handle->h_journal = journal;
 		handle->h_rsv_handle = rsv_handle;
 	}
+	handle->h_revoke_credits = revoke_records;
 
 	err = start_this_handle(journal, handle, gfp_mask);
 	if (err < 0) {
@@ -521,7 +526,7 @@  EXPORT_SYMBOL(jbd2__journal_start);
  */
 handle_t *jbd2_journal_start(journal_t *journal, int nblocks)
 {
-	return jbd2__journal_start(journal, nblocks, 0, GFP_NOFS, 0, 0);
+	return jbd2__journal_start(journal, nblocks, 0, 0, GFP_NOFS, 0, 0);
 }
 EXPORT_SYMBOL(jbd2_journal_start);
 
@@ -598,6 +603,7 @@  EXPORT_SYMBOL(jbd2_journal_start_reserved);
  * int jbd2_journal_extend() - extend buffer credits.
  * @handle:  handle to 'extend'
  * @nblocks: nr blocks to try to extend by.
+ * @revoke_records: number of revoke records to try to extend by.
  *
  * Some transactions, such as large extends and truncates, can be done
  * atomically all at once or in several stages.  The operation requests
@@ -614,7 +620,7 @@  EXPORT_SYMBOL(jbd2_journal_start_reserved);
  * return code < 0 implies an error
  * return code > 0 implies normal transaction-full status.
  */
-int jbd2_journal_extend(handle_t *handle, int nblocks)
+int jbd2_journal_extend(handle_t *handle, int nblocks, int revoke_records)
 {
 	transaction_t *transaction = handle->h_transaction;
 	journal_t *journal;
@@ -636,6 +642,12 @@  int jbd2_journal_extend(handle_t *handle, int nblocks)
 		goto error_out;
 	}
 
+	nblocks += DIV_ROUND_UP(
+			handle->h_revoke_credits_requested + revoke_records,
+			journal->j_revoke_records_per_block) -
+		DIV_ROUND_UP(
+			handle->h_revoke_credits_requested,
+			journal->j_revoke_records_per_block);
 	spin_lock(&transaction->t_handle_lock);
 	wanted = atomic_add_return(nblocks,
 				   &transaction->t_outstanding_credits);
@@ -655,6 +667,8 @@  int jbd2_journal_extend(handle_t *handle, int nblocks)
 
 	handle->h_buffer_credits += nblocks;
 	handle->h_requested_credits += nblocks;
+	handle->h_revoke_credits += revoke_records;
+	handle->h_revoke_credits_requested += revoke_records;
 	result = 0;
 
 	jbd_debug(3, "extended handle %p by %d\n", handle, nblocks);
@@ -669,10 +683,31 @@  static void stop_this_handle(handle_t *handle)
 {
 	transaction_t *transaction = handle->h_transaction;
 	journal_t *journal = transaction->t_journal;
+	int revokes;
 
 	J_ASSERT(journal_current_handle() == handle);
 	J_ASSERT(atomic_read(&transaction->t_updates) > 0);
 	current->journal_info = NULL;
+	/*
+	 * Subtract necessary revoke descriptor blocks from handle credits. We
+	 * take care to account only for revoke descriptor blocks the
+	 * transaction will really need as large sequences of transactions with
+	 * small numbers of revokes are relatively common.
+	 */
+	revokes = handle->h_revoke_credits_requested - handle->h_revoke_credits;
+	if (revokes) {
+		int t_revokes, revoke_descriptors;
+		int rr_per_blk = journal->j_revoke_records_per_block;
+
+		WARN_ON_ONCE(DIV_ROUND_UP(revokes, rr_per_blk)
+				> handle->h_buffer_credits);
+		t_revokes = atomic_add_return(revokes,
+				&transaction->t_outstanding_revokes);
+		revoke_descriptors =
+			DIV_ROUND_UP(t_revokes, rr_per_blk) -
+			DIV_ROUND_UP(t_revokes - revokes, rr_per_blk);
+		handle->h_buffer_credits -= revoke_descriptors;
+	}
 	atomic_sub(handle->h_buffer_credits,
 		   &transaction->t_outstanding_credits);
 	if (handle->h_rsv_handle)
@@ -692,6 +727,7 @@  static void stop_this_handle(handle_t *handle)
  * int jbd2_journal_restart() - restart a handle .
  * @handle:  handle to restart
  * @nblocks: nr credits requested
+ * @revoke_records: number of revoke record credits requested
  * @gfp_mask: memory allocation flags (for start_this_handle)
  *
  * Restart a handle for a multi-transaction filesystem
@@ -704,7 +740,8 @@  static void stop_this_handle(handle_t *handle)
  * credits. We preserve reserved handle if there's any attached to the
  * passed in handle.
  */
-int jbd2__journal_restart(handle_t *handle, int nblocks, gfp_t gfp_mask)
+int jbd2__journal_restart(handle_t *handle, int nblocks, int revoke_records,
+			  gfp_t gfp_mask)
 {
 	transaction_t *transaction = handle->h_transaction;
 	journal_t *journal;
@@ -735,7 +772,10 @@  int jbd2__journal_restart(handle_t *handle, int nblocks, gfp_t gfp_mask)
 	read_unlock(&journal->j_state_lock);
 	if (need_to_start)
 		jbd2_log_start_commit(journal, tid);
-	handle->h_buffer_credits = nblocks;
+	handle->h_buffer_credits = nblocks +
+		DIV_ROUND_UP(revoke_records,
+			     journal->j_revoke_records_per_block);
+	handle->h_revoke_credits = revoke_records;
 	return start_this_handle(journal, handle, gfp_mask);
 }
 EXPORT_SYMBOL(jbd2__journal_restart);
@@ -743,7 +783,7 @@  EXPORT_SYMBOL(jbd2__journal_restart);
 
 int jbd2_journal_restart(handle_t *handle, int nblocks)
 {
-	return jbd2__journal_restart(handle, nblocks, GFP_NOFS);
+	return jbd2__journal_restart(handle, nblocks, 0, GFP_NOFS);
 }
 EXPORT_SYMBOL(jbd2_journal_restart);
 
diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c
index 019aaf2a3f8a..a032f0297dad 100644
--- a/fs/ocfs2/journal.c
+++ b/fs/ocfs2/journal.c
@@ -426,7 +426,7 @@  int ocfs2_extend_trans(handle_t *handle, int nblocks)
 #ifdef CONFIG_OCFS2_DEBUG_FS
 	status = 1;
 #else
-	status = jbd2_journal_extend(handle, nblocks);
+	status = jbd2_journal_extend(handle, nblocks, 0);
 	if (status < 0) {
 		mlog_errno(status);
 		goto bail;
@@ -466,7 +466,7 @@  int ocfs2_allocate_extend_trans(handle_t *handle, int thresh)
 	if (old_nblks < thresh)
 		return 0;
 
-	status = jbd2_journal_extend(handle, OCFS2_MAX_TRANS_DATA);
+	status = jbd2_journal_extend(handle, OCFS2_MAX_TRANS_DATA, 0);
 	if (status < 0) {
 		mlog_errno(status);
 		goto bail;
diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index 1dd2703a8e26..2a3d5f50e7a1 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -478,6 +478,7 @@  struct jbd2_revoke_table_s;
  * @h_journal: Which journal handle belongs to - used iff h_reserved set.
  * @h_rsv_handle: Handle reserved for finishing the logical operation.
  * @h_buffer_credits: Number of remaining buffers we are allowed to dirty.
+ * @h_revoke_credits: Number of remaining revoke records available for handle
  * @h_ref: Reference count on this handle.
  * @h_err: Field for caller's use to track errors through large fs operations.
  * @h_sync: Flag for sync-on-close.
@@ -488,6 +489,7 @@  struct jbd2_revoke_table_s;
  * @h_line_no: For handle statistics.
  * @h_start_jiffies: Handle Start time.
  * @h_requested_credits: Holds @h_buffer_credits after handle is started.
+ * @h_revoke_credits_requested: Holds @h_revoke_credits after handle is started.
  * @saved_alloc_context: Saved context while transaction is open.
  **/
 
@@ -505,6 +507,8 @@  struct jbd2_journal_handle
 
 	handle_t		*h_rsv_handle;
 	int			h_buffer_credits;
+	int			h_revoke_credits;
+	int			h_revoke_credits_requested;
 	int			h_ref;
 	int			h_err;
 
@@ -688,6 +692,17 @@  struct transaction_s
 	 */
 	atomic_t		t_outstanding_credits;
 
+	/*
+	 * Number of revoke records for this transaction added by already
+	 * stopped handles. [none]
+	 */
+	atomic_t		t_outstanding_revokes;
+
+	/*
+	 * How many handles used this transaction? [none]
+	 */
+	atomic_t		t_handle_count;
+
 	/*
 	 * Forward and backward links for the circular list of all transactions
 	 * awaiting checkpoint. [j_list_lock]
@@ -705,11 +720,6 @@  struct transaction_s
 	 */
 	ktime_t			t_start_time;
 
-	/*
-	 * How many handles used this transaction? [none]
-	 */
-	atomic_t		t_handle_count;
-
 	/*
 	 * This transaction is being forced and some process is
 	 * waiting for it to finish.
@@ -1026,6 +1036,13 @@  struct journal_s
 	 */
 	int			j_max_transaction_buffers;
 
+	/**
+	 * @j_revoke_records_per_block:
+	 *
+	 * Number of revoke records that fit in one descriptor block.
+	 */
+	int			j_revoke_records_per_block;
+
 	/**
 	 * @j_commit_interval:
 	 *
@@ -1360,14 +1377,16 @@  static inline handle_t *journal_current_handle(void)
 
 extern handle_t *jbd2_journal_start(journal_t *, int nblocks);
 extern handle_t *jbd2__journal_start(journal_t *, int blocks, int rsv_blocks,
-				     gfp_t gfp_mask, unsigned int type,
-				     unsigned int line_no);
+				     int revoke_records, gfp_t gfp_mask,
+				     unsigned int type, unsigned int line_no);
 extern int	 jbd2_journal_restart(handle_t *, int nblocks);
-extern int	 jbd2__journal_restart(handle_t *, int nblocks, gfp_t gfp_mask);
+extern int	 jbd2__journal_restart(handle_t *, int nblocks,
+				       int revoke_records, gfp_t gfp_mask);
 extern int	 jbd2_journal_start_reserved(handle_t *handle,
 				unsigned int type, unsigned int line_no);
 extern void	 jbd2_journal_free_reserved(handle_t *handle);
-extern int	 jbd2_journal_extend (handle_t *, int nblocks);
+extern int	 jbd2_journal_extend(handle_t *handle, int nblocks,
+				     int revoke_records);
 extern int	 jbd2_journal_get_write_access(handle_t *, struct buffer_head *);
 extern int	 jbd2_journal_get_create_access (handle_t *, struct buffer_head *);
 extern int	 jbd2_journal_get_undo_access(handle_t *, struct buffer_head *);
@@ -1631,7 +1650,11 @@  static inline tid_t  jbd2_get_latest_transaction(journal_t *journal)
 
 static inline int jbd2_handle_buffer_credits(handle_t *handle)
 {
-	return handle->h_buffer_credits;
+	journal_t *journal = handle->h_transaction->t_journal;
+
+	return handle->h_buffer_credits -
+		DIV_ROUND_UP(handle->h_revoke_credits_requested,
+			     journal->j_revoke_records_per_block);
 }
 
 #ifdef __KERNEL__