Message ID | 1381886317-19539-1-git-send-email-ming.lei@canonical.com |
---|---|
State | Accepted, archived |

On Wed, Oct 16, 2013 at 09:18:37AM +0800, Ming Lei wrote:
> Commit 4e7ea81db5 (ext4: restructure writeback path) introduces
> another performance regression on random write:
>
> - one more page may be added to ext4 extent in mpage_prepare_extent_to_map,
>   and will be submitted for I/O so nr_to_write will become -1 before 'done'
>   is set
>
> - the worse thing is that dirty pages may still be retrieved from page
>   cache after nr_to_write becomes negative, so lots of small chunks can be
>   submitted to block device when page writeback is catching up with write
>   path, and performance is hurt.
>
> On one arm A15 board with sata 3.0 SSD (CPU: 1.5GHz dual core, RAM: 2GB,
> SATA controller: 3.0Gbps), this patch can improve below test's result
> from 157MB/sec to 174MB/sec (>10%):
>
>     dd if=/dev/zero of=./z.img bs=8K count=512K
>
> The above test is actually prototype of block write in bonnie++ utility.
>
> This patch makes sure no more pages than nr_to_write can be added to extent
> for mapping, so that nr_to_write won't become negative.
>
> Cc: Ted Tso <tytso@mit.edu>
> Cc: linux-ext4@vger.kernel.org
> Cc: "linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>
> Acked-by: Jan Kara <jack@suse.cz>
> Signed-off-by: Ming Lei <ming.lei@canonical.com>

Thanks, applied.

- Ted

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 32c04ab..32beaa4 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2295,6 +2295,7 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
 	struct address_space *mapping = mpd->inode->i_mapping;
 	struct pagevec pvec;
 	unsigned int nr_pages;
+	long left = mpd->wbc->nr_to_write;
 	pgoff_t index = mpd->first_page;
 	pgoff_t end = mpd->last_page;
 	int tag;
@@ -2330,6 +2331,17 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
 			if (page->index > end)
 				goto out;
 
+			/*
+			 * Accumulated enough dirty pages? This doesn't apply
+			 * to WB_SYNC_ALL mode. For integrity sync we have to
+			 * keep going because someone may be concurrently
+			 * dirtying pages, and we might have synced a lot of
+			 * newly appeared dirty pages, but have not synced all
+			 * of the old dirty pages.
+			 */
+			if (mpd->wbc->sync_mode == WB_SYNC_NONE && left <= 0)
+				goto out;
+
 			/* If we can't merge this page, we are done. */
 			if (mpd->map.m_len > 0 && mpd->next_page != page->index)
 				goto out;
@@ -2364,19 +2376,7 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
 			if (err <= 0)
 				goto out;
 			err = 0;
-
-			/*
-			 * Accumulated enough dirty pages? This doesn't apply
-			 * to WB_SYNC_ALL mode. For integrity sync we have to
-			 * keep going because someone may be concurrently
-			 * dirtying pages, and we might have synced a lot of
-			 * newly appeared dirty pages, but have not synced all
-			 * of the old dirty pages.
-			 */
-			if (mpd->wbc->sync_mode == WB_SYNC_NONE &&
-			    mpd->next_page - mpd->first_page >=
-			    mpd->wbc->nr_to_write)
-				goto out;
+			left--;
 		}
 		pagevec_release(&pvec);
 		cond_resched();
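
For readers who want to see the accounting in isolation, the following is a minimal user-space sketch of the "check the budget before taking the page, then decrement" pattern that the patch adds. It is not kernel code: collect_pages() and budget_after() are hypothetical helpers, the enum only mirrors the names WB_SYNC_NONE and WB_SYNC_ALL from the diff, and dirty pages are modelled as a plain count rather than a page cache lookup.

```c
#include <assert.h>

/* Toy stand-in for the kernel's sync modes; only the names mirror the patch. */
enum sync_mode { WB_SYNC_NONE, WB_SYNC_ALL };

/*
 * Hypothetical model of the collection loop: walk n_dirty contiguous dirty
 * pages and return how many are taken given a budget of nr_to_write.
 */
long collect_pages(long n_dirty, long nr_to_write, enum sync_mode mode)
{
	long left = nr_to_write;	/* remaining budget, like 'left' in the patch */
	long collected = 0;

	assert(n_dirty >= 0);

	for (long i = 0; i < n_dirty; i++) {
		/* Budget check happens before the page is taken. */
		if (mode == WB_SYNC_NONE && left <= 0)
			break;

		collected++;	/* page joins the extent */
		left--;		/* one unit of budget consumed */
	}
	return collected;
}

/* Budget remaining after one pass over n_dirty pages. */
long budget_after(long n_dirty, long nr_to_write, enum sync_mode mode)
{
	return nr_to_write - collect_pages(n_dirty, nr_to_write, mode);
}
```

In this model, a WB_SYNC_NONE pass with a non-negative budget never takes more pages than the budget allows, so the remaining budget cannot go below zero. A WB_SYNC_ALL pass ignores the budget and takes every dirty page, matching the comment in the diff about integrity sync.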