diff mbox

jbd2: Fix return value of jbd2_journal_start_commit()

Message ID 1233597309-20233-3-git-send-email-jack@suse.cz
State Accepted, archived
Headers show

Commit Message

Jan Kara Feb. 2, 2009, 5:55 p.m. UTC
jbd2_journal_start_commit() returns 1 if either a transaction is committing or
the function has queued a transaction commit. But it returns 0 if we raced with
somebody queueing the transaction commit as well. This resulted in
ext4_sync_fs() not functioning correctly (description from Arthur Jones): In
the case of a data=ordered umount with pending long symlinks which are delayed
due to a long list of other I/O on the backing block device, this causes the
buffer associated with the long symlinks to not be moved to the inode dirty
list in the second phase of fsync_super.  Then, before they can be dirtied
again, kjournald exits, seeing the UMOUNT flag and the dirty pages are never
written to the backing block device, causing long symlink corruption and
exposing new or previously freed block data to userspace.

This can be reproduced with a script created by Eric Sandeen
<sandeen@redhat.com>:

        #!/bin/bash

        umount /mnt/test2
        mount /dev/sdb4 /mnt/test2
        rm -f /mnt/test2/*
        dd if=/dev/zero of=/mnt/test2/bigfile bs=1M count=512
        touch /mnt/test2/thisisveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryverylongfilename
        ln -s /mnt/test2/thisisveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryverylongfilename
        /mnt/test2/link
        umount /mnt/test2
        mount /dev/sdb4 /mnt/test2
        ls /mnt/test2/

This patch fixes jbd2_journal_start_commit() to always return 1 when there's
a transaction committing or queued for commit.

CC: Eric Sandeen <sandeen@redhat.com>
CC: linux-ext4@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/jbd2/journal.c |   17 +++++++++++------
 1 files changed, 11 insertions(+), 6 deletions(-)
diff mbox

Patch

diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index 5667530..42d8953 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -450,7 +450,7 @@  int __jbd2_log_space_left(journal_t *journal)
 }
 
 /*
- * Called under j_state_lock.  Returns true if a transaction was started.
+ * Called under j_state_lock.  Returns true if a transaction commit was started.
  */
 int __jbd2_log_start_commit(journal_t *journal, tid_t target)
 {
@@ -518,7 +518,8 @@  int jbd2_journal_force_commit_nested(journal_t *journal)
 
 /*
  * Start a commit of the current running transaction (if any).  Returns true
- * if a transaction was started, and fills its tid in at *ptid
+ * if a transaction is going to be committed (or is currently already
+ * committing), and fills its tid in at *ptid
  */
 int jbd2_journal_start_commit(journal_t *journal, tid_t *ptid)
 {
@@ -528,15 +529,19 @@  int jbd2_journal_start_commit(journal_t *journal, tid_t *ptid)
 	if (journal->j_running_transaction) {
 		tid_t tid = journal->j_running_transaction->t_tid;
 
-		ret = __jbd2_log_start_commit(journal, tid);
-		if (ret && ptid)
+		__jbd2_log_start_commit(journal, tid);
+		/* There's a running transaction and we've just made sure
+		 * it's commit has been scheduled. */
+		if (ptid)
 			*ptid = tid;
-	} else if (journal->j_committing_transaction && ptid) {
+		ret = 1;
+	} else if (journal->j_committing_transaction) {
 		/*
 		 * If ext3_write_super() recently started a commit, then we
 		 * have to wait for completion of that transaction
 		 */
-		*ptid = journal->j_committing_transaction->t_tid;
+		if (ptid)
+			*ptid = journal->j_committing_transaction->t_tid;
 		ret = 1;
 	}
 	spin_unlock(&journal->j_state_lock);