[RFC,v2,4/5] ext4: data=journal: add inode to transaction inode list in ext4_page_mkwrite()

Message ID 20200810010210.3305322-5-mfo@canonical.com
State Superseded
Series ext4/jbd2: data=journal: write-protect pages on transaction commit

Commit Message

Mauricio Faria de Oliveira Aug. 10, 2020, 1:02 a.m. UTC
Since we only add the inode to the transaction's inode list in
__ext4_journalled_writepage(), we depend on msync() or writeback work
(which call it) for the write-protect mechanism to work.

This test snippet shows the problem: pwrite() gets the inode into a
transaction (which is not the same as adding it to the transaction's
inode list), and the addr[] write access then gets the page mapped
writable in user space.

    fd = open("file");
    addr = mmap(fd);
    pwrite(fd, "a", 1, 0); // journals inode via ext4_write_begin()
    addr[0] = 'a'; // page is writeably mapped to user space.
    // periodic journal commit / jbd2 thread runs now.
    // __ext4_journalled_writepage() was not called yet.

Now a subsequent addr[] write access can race with the commit
function, and possibly hit the window that causes invalid
checksums.
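
For reference, the snippet above could be fleshed out into compilable C
roughly as follows. This is only a sketch of the setup sequence, not the
reproducer used for this patch: the function name dirty_page_via_mmap()
and the one-page file size are assumptions made for illustration, and
the race itself can only be observed on an ext4 filesystem mounted with
data=journal while the jbd2 thread commits.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): performs the same steps
 * as the snippet in the commit message. Returns 0 on success, -1 on error.
 */
int dirty_page_via_mmap(const char *path)
{
	long page_size = sysconf(_SC_PAGESIZE);
	char *addr;
	int fd;

	if (page_size <= 0)
		return -1;

	fd = open(path, O_RDWR | O_CREAT, 0644);
	if (fd < 0)
		return -1;

	/* Make sure the file covers the page we are about to map. */
	if (ftruncate(fd, (off_t)page_size) != 0)
		goto err_close;

	addr = mmap(NULL, (size_t)page_size, PROT_READ | PROT_WRITE,
		    MAP_SHARED, fd, 0);
	if (addr == MAP_FAILED)
		goto err_close;

	/* On data=journal, this journals the inode via ext4_write_begin(). */
	if (pwrite(fd, "a", 1, 0) != 1)
		goto err_unmap;

	/* Write fault: the page becomes writably mapped to user space. */
	addr[0] = 'a';

	/*
	 * A later store through addr[] is the access that may race with
	 * the journal commit if the page was not write-protected.
	 */
	addr[0] = 'b';

	munmap(addr, (size_t)page_size);
	close(fd);
	return 0;

err_unmap:
	munmap(addr, (size_t)page_size);
err_close:
	close(fd);
	return -1;
}
```

Calling dirty_page_via_mmap() on a file in a data=journal mount only
sets up the state described above. Whether the second store actually
lands in the commit window depends on timing, so a single call is not
expected to trigger the invalid checksum reliably.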
---
 fs/ext4/inode.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

Patch

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 978ccde8454f..ce5464f92a7e 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -6008,9 +6008,10 @@  vm_fault_t ext4_page_mkwrite(struct vm_fault *vmf)
 		len = PAGE_SIZE;
 	/*
 	 * Return if we have all the buffers mapped. This avoids the need to do
-	 * journal_start/journal_stop which can block and take a long time
+	 * journal_start/journal_stop which can block and take a long time. But
+	 * not on data journalling, as we have to add the inode to the txn list.
 	 */
-	if (page_has_buffers(page)) {
+	if (page_has_buffers(page) && !ext4_should_journal_data(inode)) {
 		if (!ext4_walk_page_buffers(NULL, page_buffers(page),
 					    0, len, NULL,
 					    ext4_bh_unmapped)) {
@@ -6043,6 +6044,12 @@  vm_fault_t ext4_page_mkwrite(struct vm_fault *vmf)
 			goto out;
 		}
 		ext4_set_inode_state(inode, EXT4_STATE_JDATA);
+		if (ext4_jbd2_inode_add_write(handle, inode, 0, PAGE_SIZE)) {
+			unlock_page(page);
+			ret = VM_FAULT_SIGBUS;
+			ext4_journal_stop(handle);
+			goto out;
+		}
 	}
 	ext4_journal_stop(handle);
 	if (err == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries))