Message ID | 20090407221933.GB7031@mit.edu |
---|---|
State | Not Applicable, archived |
On Tue, 7 Apr 2009 18:19:33 -0400 Theodore Tso <tytso@mit.edu> wrote:

> Now that we have a distinction between WRITE_SYNC and WRITE_SYNC_PLUG,
> use WRITE_SYNC_PLUG in __block_write_full_page() to avoid unplugging
> the block device I/O queue between each page that gets flushed out.
>
> The upstream callers of block_write_full_page() which wait for the
> writes to finish call wait_on_buffer(), wait_on_writeback_range()
> (which ultimately calls sync_page(), which calls
> blk_run_backing_dev(), which will unplug the device queue), and so on.
>

<sob>

>
> We should get this applied to avoid any performance regressions
> resulting from commit a64c8610.
>
>  fs/buffer.c |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/fs/buffer.c b/fs/buffer.c
> index 977e12a..95b5390 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -1646,7 +1646,8 @@ static int __block_write_full_page(struct inode *inode, struct page *page,
>  	struct buffer_head *bh, *head;
>  	const unsigned blocksize = 1 << inode->i_blkbits;
>  	int nr_underway = 0;
> -	int write_op = (wbc->sync_mode == WB_SYNC_ALL ? WRITE_SYNC : WRITE);
> +	int write_op = (wbc->sync_mode == WB_SYNC_ALL ?
> +			WRITE_SYNC_PLUG : WRITE);
>
>  	BUG_ON(!PageLocked(page));

So how does WRITE_SYNC_PLUG differ from WRITE, and what effect does
this change have upon kernel behaviour?

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Apr 07, 2009 at 04:09:44PM -0700, Andrew Morton wrote:
> >
> > The upstream callers of block_write_full_page() which wait for the
> > writes to finish call wait_on_buffer(), wait_on_writeback_range()
> > (which ultimately calls sync_page(), which calls
> > blk_run_backing_dev(), which will unplug the device queue), and so on.
>
> <sob>

No question, this stuff needs to be better documented; the codepaths
involved are scattered between files in the block/, fs/, and mm/
directories, and it's not well documented at *all* what a filesystem
developer is supposed to do.

> >  	const unsigned blocksize = 1 << inode->i_blkbits;
> >  	int nr_underway = 0;
> > -	int write_op = (wbc->sync_mode == WB_SYNC_ALL ? WRITE_SYNC : WRITE);
> > +	int write_op = (wbc->sync_mode == WB_SYNC_ALL ?
> > +			WRITE_SYNC_PLUG : WRITE);
> >
> >  	BUG_ON(!PageLocked(page));
>
> So how does WRITE_SYNC_PLUG differ from WRITE, and what effect does
> this change have upon kernel behaviour?

The difference between WRITE_SYNC_PLUG and WRITE is that, from the
perspective of the I/O scheduler, WRITE_SYNC_PLUG writes are
prioritized as "synchronous" operations. Some I/O schedulers, such as
AS and CFQ, prioritize synchronous writes and put them in the same
bucket as synchronous reads, and above asynchronous writes.

Currently, when wbc->sync_mode is WB_SYNC_ALL, we use WRITE_SYNC, which
carries an implicit unplug. WRITE_SYNC_PLUG removes the implicit
unplug, which was the issue you had expressed concern about.

						- Ted
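The relationship between the three write types described above can be modelled as a set of flag bits. This is only a sketch: the `DEMO_*` names and bit positions are hypothetical and chosen for illustration, and the kernel's real definitions are not quoted anywhere in this thread. It encodes just what the reply states: WRITE_SYNC_PLUG is a write marked synchronous for the I/O scheduler, and WRITE_SYNC is the same thing plus an implicit unplug.

```c
#include <stdbool.h>

/*
 * Hypothetical bit positions, for illustration only. They are NOT the
 * kernel's actual values, which this thread does not show.
 */
enum {
	DEMO_RW_WRITE  = 1 << 0,	/* request is a write */
	DEMO_RW_SYNCIO = 1 << 1,	/* scheduler treats it as synchronous */
	DEMO_RW_UNPLUG = 1 << 2,	/* unplug the queue after submission */
};

#define DEMO_WRITE		(DEMO_RW_WRITE)
#define DEMO_WRITE_SYNC_PLUG	(DEMO_WRITE | DEMO_RW_SYNCIO)
#define DEMO_WRITE_SYNC		(DEMO_WRITE_SYNC_PLUG | DEMO_RW_UNPLUG)

/* True if the I/O scheduler would prioritize this op as synchronous. */
static bool demo_is_sync(int op)
{
	return (op & DEMO_RW_SYNCIO) != 0;
}

/* True if submitting this op implicitly unplugs the queue. */
static bool demo_unplugs(int op)
{
	return (op & DEMO_RW_UNPLUG) != 0;
}

/* Mirrors the selection made by the patched __block_write_full_page(). */
static int demo_pick_write_op(bool sync_all)
{
	return sync_all ? DEMO_WRITE_SYNC_PLUG : DEMO_WRITE;
}
```

Under this model, `demo_pick_write_op(true)` yields an op for which `demo_is_sync()` is true and `demo_unplugs()` is false, which is the behaviour change the patch is after: the scheduler priority is kept and the per-page unplug is dropped.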
On Tue, Apr 07 2009, Theodore Tso wrote:
> Now that we have a distinction between WRITE_SYNC and WRITE_SYNC_PLUG,
> use WRITE_SYNC_PLUG in __block_write_full_page() to avoid unplugging
> the block device I/O queue between each page that gets flushed out.
>
> The upstream callers of block_write_full_page() which wait for the
> writes to finish call wait_on_buffer(), wait_on_writeback_range()
> (which ultimately calls sync_page(), which calls
> blk_run_backing_dev(), which will unplug the device queue), and so on.
>
> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
> ---
>
> We should get this applied to avoid any performance regressions
> resulting from commit a64c8610.
>
>  fs/buffer.c |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/fs/buffer.c b/fs/buffer.c
> index 977e12a..95b5390 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -1646,7 +1646,8 @@ static int __block_write_full_page(struct inode *inode, struct page *page,
>  	struct buffer_head *bh, *head;
>  	const unsigned blocksize = 1 << inode->i_blkbits;
>  	int nr_underway = 0;
> -	int write_op = (wbc->sync_mode == WB_SYNC_ALL ? WRITE_SYNC : WRITE);
> +	int write_op = (wbc->sync_mode == WB_SYNC_ALL ?
> +			WRITE_SYNC_PLUG : WRITE);
>
>  	BUG_ON(!PageLocked(page));

I think you should comment on why we don't need to do the actual unplug.
See what I added in fs/jbd/commit.c:journal_commit_transaction():

	/*
	 * Use plugged writes here, since we want to submit several
	 * before we unplug the device. We don't do explicit
	 * unplugging in here, instead we rely on sync_buffer() doing
	 * the unplug for us.
	 */
On Wed, Apr 08, 2009 at 08:00:33AM +0200, Jens Axboe wrote:
>
> I think you should comment on why we don't need to do the actual unplug.
> See what I added in fs/jbd/commit.c:journal_commit_transaction():
>
> 	/*
> 	 * Use plugged writes here, since we want to submit several
> 	 * before we unplug the device. We don't do explicit
> 	 * unplugging in here, instead we rely on sync_buffer() doing
> 	 * the unplug for us.
> 	 */

OK, agreed. I'll add a comment explaining what is going on in the
patch; better there than in the commit log.

						- Ted
diff --git a/fs/buffer.c b/fs/buffer.c
index 977e12a..95b5390 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -1646,7 +1646,8 @@ static int __block_write_full_page(struct inode *inode, struct page *page,
 	struct buffer_head *bh, *head;
 	const unsigned blocksize = 1 << inode->i_blkbits;
 	int nr_underway = 0;
-	int write_op = (wbc->sync_mode == WB_SYNC_ALL ? WRITE_SYNC : WRITE);
+	int write_op = (wbc->sync_mode == WB_SYNC_ALL ?
+			WRITE_SYNC_PLUG : WRITE);

 	BUG_ON(!PageLocked(page));
Now that we have a distinction between WRITE_SYNC and WRITE_SYNC_PLUG,
use WRITE_SYNC_PLUG in __block_write_full_page() to avoid unplugging
the block device I/O queue between each page that gets flushed out.

The upstream callers of block_write_full_page() which wait for the
writes to finish call wait_on_buffer(), wait_on_writeback_range()
(which ultimately calls sync_page(), which calls
blk_run_backing_dev(), which will unplug the device queue), and so on.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
---

We should get this applied to avoid any performance regressions
resulting from commit a64c8610.

 fs/buffer.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)
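The batching effect the commit message describes can be illustrated with a small userspace model. It is only a sketch under loud assumptions: `struct demo_queue` and every `demo_*` function are hypothetical, not kernel APIs, and real queue plugging has further unplug triggers that are not modelled here. The model counts how many separate batches reach the "device" when the queue is unplugged after every page, compared with once when the waiter finally unplugs it.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_QUEUE_CAPACITY 64	/* arbitrary limit for the toy queue */

/* Hypothetical stand-in for a plugged block device queue. */
struct demo_queue {
	int pending;	/* requests held back while the queue is plugged */
	int dispatches;	/* number of batches pushed to the device */
	int completed;	/* total requests handed to the device */
};

/* Push everything pending to the "device" as one batch. */
static void demo_unplug(struct demo_queue *q)
{
	if (q->pending == 0)
		return;
	q->completed += q->pending;
	q->pending = 0;
	q->dispatches++;
}

/* Queue one page write; unplug immediately only if the op asks for it. */
static void demo_submit(struct demo_queue *q, bool unplug_now)
{
	assert(q->pending < DEMO_QUEUE_CAPACITY);
	q->pending++;
	if (unplug_now)
		demo_unplug(q);
}

/*
 * Write nr_pages pages and then wait for them. The final demo_unplug()
 * stands in for the unplug that the waiting caller triggers.
 * Returns the number of batches the device saw.
 */
static int demo_flush_pages(int nr_pages, bool unplug_each)
{
	struct demo_queue q = {0};

	for (int i = 0; i < nr_pages; i++)
		demo_submit(&q, unplug_each);
	demo_unplug(&q);
	assert(q.completed == nr_pages);
	return q.dispatches;
}
```

In this model, `demo_flush_pages(8, true)` corresponds to the WRITE_SYNC behaviour and produces eight single-request batches, while `demo_flush_pages(8, false)` corresponds to WRITE_SYNC_PLUG and produces one batch of eight.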