ext4: Speedup WB_SYNC_ALL pass

Message ID 1393887208-19462-1-git-send-email-jack@suse.cz
State Accepted, archived

Commit Message

Jan Kara March 3, 2014, 10:53 p.m. UTC
When doing filesystem wide sync, there's no need to force transaction
commit (or synchronously write inode buffer) separately for each inode
because ext4_sync_fs() takes care of forcing commit at the end (VFS
takes care of flushing buffer cache, respectively). Most of the time
this slowness doesn't manifest because previous WB_SYNC_NONE writeback
doesn't leave much to write but when there are processes aggressively
creating new files and several filesystems to sync, the sync slowness
can be noticeable. In the following test script sync(1) takes around 6
minutes when there are two ext4 filesystems mounted on a standard SATA
drive. After this patch sync takes a couple of seconds so we have about
two orders of magnitude improvement.

      function run_writers
      {
        for (( i = 0; i < 10; i++ )); do
          mkdir $1/dir$i
          for (( j = 0; j < 40000; j++ )); do
            dd if=/dev/zero of=$1/dir$i/$j bs=4k count=4 &>/dev/null
          done &
        done
      }

      for dir in "$@"; do
        run_writers $dir
      done

      sleep 40
      time sync

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ext4/inode.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)
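
The two conditions changed by this patch can be modelled outside the kernel. The sketch below is a simplified illustration, not the kernel code: `struct wbc_model`, `enum sync_mode_model`, and the helpers `forces_commit()`, `syncs_inode_buffer()` and `check_model()` are hypothetical stand-ins, and it assumes (as in the upstream function, though the hunk context does not show it) that the first hunk is on the journalled path and the second on the no-journal path.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-ins for the kernel's writeback sync modes and
 * struct writeback_control; only the two fields the hunks test.
 */
enum sync_mode_model { WB_SYNC_NONE, WB_SYNC_ALL };

struct wbc_model {
	enum sync_mode_model sync_mode;
	bool for_sync;	/* writeback issued on behalf of sync(2) */
};

/* First hunk: would the per-inode path call ext4_force_commit()? */
bool forces_commit(const struct wbc_model *wbc)
{
	if (wbc->sync_mode != WB_SYNC_ALL || wbc->for_sync)
		return false;
	return true;
}

/* Second hunk: would the per-inode path call sync_dirty_buffer()? */
bool syncs_inode_buffer(const struct wbc_model *wbc)
{
	return wbc->sync_mode == WB_SYNC_ALL && !wbc->for_sync;
}

/* Walk the three interesting combinations of the two fields. */
void check_model(void)
{
	const struct wbc_model integrity  = { WB_SYNC_ALL, false };
	const struct wbc_model whole_sync = { WB_SYNC_ALL, true };
	const struct wbc_model background = { WB_SYNC_NONE, false };

	/* WB_SYNC_ALL without for_sync: per-inode work is unchanged. */
	assert(forces_commit(&integrity));
	assert(syncs_inode_buffer(&integrity));

	/* WB_SYNC_ALL on behalf of sync(2): per-inode work is skipped. */
	assert(!forces_commit(&whole_sync));
	assert(!syncs_inode_buffer(&whole_sync));

	/* WB_SYNC_NONE: skipped before and after the patch. */
	assert(!forces_commit(&background));
	assert(!syncs_inode_buffer(&background));
}
```

Only the WB_SYNC_ALL case with `for_sync` set changes behaviour in this model. That is the case the commit message describes, where ext4_sync_fs() forces the commit once at the end.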

Comments

Theodore Ts'o March 4, 2014, 3:54 p.m. UTC | #1
On Mon, Mar 03, 2014 at 11:53:28PM +0100, Jan Kara wrote:
> When doing filesystem wide sync, there's no need to force transaction
> commit (or synchronously write inode buffer) separately for each inode
> because ext4_sync_fs() takes care of forcing commit at the end (VFS
> takes care of flushing buffer cache, respectively). Most of the time
> this slowness doesn't manifest because previous WB_SYNC_NONE writeback
> doesn't leave much to write but when there are processes aggressively
> creating new files and several filesystems to sync, the sync slowness
> can be noticeable. In the following test script sync(1) takes around 6
> minutes when there are two ext4 filesystems mounted on a standard SATA
> drive. After this patch sync takes a couple of seconds so we have about
> two orders of magnitude improvement.
> 
>       function run_writers
>       {
>         for (( i = 0; i < 10; i++ )); do
>           mkdir $1/dir$i
>           for (( j = 0; j < 40000; j++ )); do
>             dd if=/dev/zero of=$1/dir$i/$j bs=4k count=4 &>/dev/null
>           done &
>         done
>       }
> 
>       for dir in "$@"; do
>         run_writers $dir
>       done
> 
>       sleep 40
>       time sync
> 
> Signed-off-by: Jan Kara <jack@suse.cz>

Looks good, thanks for the patch!

						- Ted

Patch

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 6e39895a91b8..7850584b0679 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4443,7 +4443,12 @@  int ext4_write_inode(struct inode *inode, struct writeback_control *wbc)
 			return -EIO;
 		}
 
-		if (wbc->sync_mode != WB_SYNC_ALL)
+		/*
+		 * No need to force transaction in WB_SYNC_NONE mode. Also
+		 * ext4_sync_fs() will force the commit after everything is
+		 * written.
+		 */
+		if (wbc->sync_mode != WB_SYNC_ALL || wbc->for_sync)
 			return 0;
 
 		err = ext4_force_commit(inode->i_sb);
@@ -4453,7 +4458,11 @@  int ext4_write_inode(struct inode *inode, struct writeback_control *wbc)
 		err = __ext4_get_inode_loc(inode, &iloc, 0);
 		if (err)
 			return err;
-		if (wbc->sync_mode == WB_SYNC_ALL)
+		/*
+		 * sync(2) will flush the whole buffer cache. No need to do
+		 * it here separately for each inode.
+		 */
+		if (wbc->sync_mode == WB_SYNC_ALL && !wbc->for_sync)
 			sync_dirty_buffer(iloc.bh);
 		if (buffer_req(iloc.bh) && !buffer_uptodate(iloc.bh)) {
 			EXT4_ERROR_INODE_BLOCK(inode, iloc.bh->b_blocknr,