From patchwork Fri Jun 25 23:32:15 2010 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: gregkh@suse.de X-Patchwork-Id: 57036 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 4CFA01054FA for ; Sat, 26 Jun 2010 09:32:40 +1000 (EST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756920Ab0FYXci (ORCPT ); Fri, 25 Jun 2010 19:32:38 -0400 Received: from kroah.org ([198.145.64.141]:33847 "EHLO coco.kroah.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1756520Ab0FYXch (ORCPT ); Fri, 25 Jun 2010 19:32:37 -0400 Received: from localhost (c-24-16-163-131.hsd1.wa.comcast.net [24.16.163.131]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by coco.kroah.org (Postfix) with ESMTPSA id 27F1448A57; Fri, 25 Jun 2010 16:32:37 -0700 (PDT) Subject: Patch "ext4: Fix file fragmentation during large file write." has been added to the 2.6.27-stable tree To: aneesh.kumar@linux.vnet.ibm.com, david@fromorbit.com, dev@jaysonking.com, gregkh@suse.de, Kay.Diederichs@uni-konstanz.de, linux-ext4@vger.kernel.org, tytso@mit.edu Cc: , From: Date: Fri, 25 Jun 2010 16:32:15 -0700 In-Reply-To: <4C001901.1070207@jaysonking.com> Message-ID: <12775087352667@site> MIME-Version: 1.0 Sender: linux-ext4-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org This is a note to let you know that I've just added the patch titled ext4: Fix file fragmentation during large file write. to the 2.6.27-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: ext4-fix-file-fragmentation-during-large-file-write.patch and it can be found in the queue-2.6.27 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let know about it. From dev@jaysonking.com Fri Jun 25 15:33:09 2010 From: Aneesh Kumar K.V Date: Fri, 28 May 2010 14:26:57 -0500 Subject: ext4: Fix file fragmentation during large file write. Cc: "Jayson R. King" , Theodore Ts'o , "Aneesh Kumar K.V" , Dave Chinner , Ext4 Developers List , Kay Diederichs Message-ID: <4C001901.1070207@jaysonking.com> From: Aneesh Kumar K.V commit 22208dedbd7626e5fc4339c417f8d24cc21f79d7 upstream. The range_cyclic writeback mode uses the address_space writeback_index as the start index for writeback. With delayed allocation we were updating writeback_index wrongly resulting in highly fragmented file. This patch reduces the number of extents reduced from 4000 to 27 for a 3GB file. Signed-off-by: Aneesh Kumar K.V Signed-off-by: Theodore Ts'o [dev@jaysonking.com: Some changed lines from the original version of this patch were dropped, since they were rolled up with another cherry-picked patch applied to 2.6.27.y earlier.] [dev@jaysonking.com: Use of wbc->no_nrwrite_index_update was dropped, since write_cache_pages_da() implies it.] Signed-off-by: Jayson R. King Signed-off-by: Greg Kroah-Hartman --- fs/ext4/inode.c | 79 ++++++++++++++++++++++++++++++++------------------------ 1 file changed, 46 insertions(+), 33 deletions(-) Patches currently in stable-queue which might be from aneesh.kumar@linux.vnet.ibm.com are queue-2.6.27/ext4-fix-file-fragmentation-during-large-file-write.patch queue-2.6.27/ext4-use-our-own-write_cache_pages.patch queue-2.6.27/ext4-implement-range_cyclic-in-ext4_da_writepages-instead-of-write_cache_pages.patch -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -1721,7 +1721,11 @@ static int mpage_da_submit_io(struct mpa pages_skipped = mpd->wbc->pages_skipped; err = mapping->a_ops->writepage(page, mpd->wbc); - if (!err) + if (!err && (pages_skipped == mpd->wbc->pages_skipped)) + /* + * have successfully written the page + * without skipping the same + */ mpd->pages_written++; /* * In error case, we have to continue because @@ -2295,7 +2299,6 @@ static int mpage_da_writepages(struct ad struct writeback_control *wbc, struct mpage_da_data *mpd) { - long to_write; int ret; if (!mpd->get_block) @@ -2310,19 +2313,18 @@ static int mpage_da_writepages(struct ad mpd->pages_written = 0; mpd->retval = 0; - to_write = wbc->nr_to_write; - ret = write_cache_pages_da(mapping, wbc, mpd); - /* * Handle last extent of pages */ if (!mpd->io_done && mpd->next_page != mpd->first_page) { if (mpage_da_map_blocks(mpd) == 0) mpage_da_submit_io(mpd); - } - wbc->nr_to_write = to_write - mpd->pages_written; + mpd->io_done = 1; + ret = MPAGE_DA_EXTENT_TAIL; + } + wbc->nr_to_write -= mpd->pages_written; return ret; } @@ -2567,11 +2569,13 @@ static int ext4_da_writepages_trans_bloc static int ext4_da_writepages(struct address_space *mapping, struct writeback_control *wbc) { + pgoff_t index; + int range_whole = 0; handle_t *handle = NULL; struct mpage_da_data mpd; struct inode *inode = mapping->host; + long pages_written = 0, pages_skipped; int needed_blocks, ret = 0, nr_to_writebump = 0; - long to_write, pages_skipped = 0; struct ext4_sb_info *sbi = EXT4_SB(mapping->host->i_sb); /* @@ -2605,16 +2609,20 @@ static int ext4_da_writepages(struct add nr_to_writebump = sbi->s_mb_stream_request - wbc->nr_to_write; wbc->nr_to_write = sbi->s_mb_stream_request; } + if (wbc->range_start == 0 && wbc->range_end == LLONG_MAX) + range_whole = 1; - - pages_skipped = wbc->pages_skipped; + if (wbc->range_cyclic) + index = mapping->writeback_index; + else + index = wbc->range_start >> PAGE_CACHE_SHIFT; mpd.wbc = wbc; mpd.inode = mapping->host; -restart_loop: - to_write = wbc->nr_to_write; - while (!ret && to_write > 0) { + pages_skipped = wbc->pages_skipped; + + while (!ret && wbc->nr_to_write > 0) { /* * we insert one extent at a time. So we need @@ -2647,46 +2655,51 @@ restart_loop: goto out_writepages; } } - to_write -= wbc->nr_to_write; - mpd.get_block = ext4_da_get_block_write; ret = mpage_da_writepages(mapping, wbc, &mpd); ext4_journal_stop(handle); - if (mpd.retval == -ENOSPC) + if (mpd.retval == -ENOSPC) { + /* commit the transaction which would + * free blocks released in the transaction + * and try again + */ jbd2_journal_force_commit_nested(sbi->s_journal); - - /* reset the retry count */ - if (ret == MPAGE_DA_EXTENT_TAIL) { + wbc->pages_skipped = pages_skipped; + ret = 0; + } else if (ret == MPAGE_DA_EXTENT_TAIL) { /* * got one extent now try with * rest of the pages */ - to_write += wbc->nr_to_write; + pages_written += mpd.pages_written; + wbc->pages_skipped = pages_skipped; ret = 0; - } else if (wbc->nr_to_write) { + } else if (wbc->nr_to_write) /* * There is no more writeout needed * or we requested for a noblocking writeout * and we found the device congested */ - to_write += wbc->nr_to_write; break; - } - wbc->nr_to_write = to_write; - } - - if (!wbc->range_cyclic && (pages_skipped != wbc->pages_skipped)) { - /* We skipped pages in this loop */ - wbc->nr_to_write = to_write + - wbc->pages_skipped - pages_skipped; - wbc->pages_skipped = pages_skipped; - goto restart_loop; } + if (pages_skipped != wbc->pages_skipped) + printk(KERN_EMERG "This should not happen leaving %s " + "with nr_to_write = %ld ret = %d\n", + __func__, wbc->nr_to_write, ret); + + /* Update index */ + index += pages_written; + if (wbc->range_cyclic || (range_whole && wbc->nr_to_write > 0)) + /* + * set the writeback_index so that range_cyclic + * mode will write it back later + */ + mapping->writeback_index = index; out_writepages: - wbc->nr_to_write = to_write - nr_to_writebump; + wbc->nr_to_write -= nr_to_writebump; return ret; }