Message ID | 1299180594.2826.6.camel@mingming-laptop |
---|---|
State | Accepted, archived |
On Thu, Mar 03, 2011 at 11:29:54AM -0800, Mingming Cao wrote:
> While running ext4 testing on multiple core, we found there are per cpu ext4-dio-unwritten threads processing
> conversion from unwritten extents to written for IOs completed from async direct IO patch.
> Per filesystem is enough, we don't need per cpu threads to work on conversion.
>
> Signed-off-by: Mingming Cao <cmm@us.ibm.com>

Thanks, added to the ext4 patch queue.

- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Mar 03, 2011 at 11:29:54AM -0800, Mingming Cao wrote:
> While running ext4 testing on multiple core, we found there are per
> cpu ext4-dio-unwritten threads processing conversion from unwritten
> extents to written for IOs completed from async direct IO patch.
> Per filesystem is enough, we don't need per cpu threads to work on
> conversion.
>
> Signed-off-by: Mingming Cao <cmm@us.ibm.com>

Eric, would you be able to do a very quick sanity check on your
48-core machine? I can definitely see how having a huge number of
threads per file system could be problematic, especially on a system
with 32 or 64 ext4 file systems. I'm curious though if we'll end up
taking a performance hit on direct I/O workloads.

If I remember correctly we currently have large file create with DIO
turned off, right? Would it be possible to do a large file create
with DIO enabled, and do a quick run both with and without this patch?

In the future it would also be interesting to see how we are doing
versus other file systems using a DIO workload. This is probably
another area where I suspect some lockstat and oprofile runs may give
us opportunities for further optimization.

- Ted
On 03/05/2011 12:46 PM, Ted Ts'o wrote:
> On Thu, Mar 03, 2011 at 11:29:54AM -0800, Mingming Cao wrote:
>> While running ext4 testing on multiple core, we found there are per
>> cpu ext4-dio-unwritten threads processing conversion from unwritten
>> extents to written for IOs completed from async direct IO patch.
>> Per filesystem is enough, we don't need per cpu threads to work on
>> conversion.
>>
>> Signed-off-by: Mingming Cao<cmm@us.ibm.com>
>
> Eric, would you be able to do a very quick sanity check on your
> 48-core machine? I can definitely see how having a huge number of
> threads per file system could be problematic, especially on a system
> with 32 or 64 ext4 file systems. I'm curious though if we'll end up
> taking a performance hit on direct I/O workloads.
>

Hi Ted:

Sure, I can do that - I'll queue it up once I'm done with the "for .39"
patch measurements.

> If I remember correctly we currently have large file create with DIO
> turned off, right? Would it be possible to do a large file create
> with DIO enabled, and do a quick run both with and without this patch?

That's right, we're not measuring DIO right now. I think I've got
enough hardware to run a filesystem per core (or more), and I think it
should be straightforward to write a modified ffsb profile to run (say)
48 filesystems in parallel.

>
> In the future it would also be interesting to see how we are doing
> versus other file systems using a DIO workload. This is a probably
> another area where I suspect some lockstat and oprofile runs may give
> us opportunities for further optimization.

Yes - as discussed at Plumber's. I'll put that on the list as well.
With luck, there should be some time towards the end of the .39 merge
window.

Eric

>
> - Ted
On Sat, 2011-03-05 at 12:46 -0500, Ted Ts'o wrote:
> On Thu, Mar 03, 2011 at 11:29:54AM -0800, Mingming Cao wrote:
> > While running ext4 testing on multiple core, we found there are per
> > cpu ext4-dio-unwritten threads processing conversion from unwritten
> > extents to written for IOs completed from async direct IO patch.
> > Per filesystem is enough, we don't need per cpu threads to work on
> > conversion.
> >
> > Signed-off-by: Mingming Cao <cmm@us.ibm.com>
>
> Eric, would you be able to do a very quick sanity check on your
> 48-core machine? I can definitely see how having a huge number of
> threads per file system could be problematic, especially on a system
> with 32 or 64 ext4 file systems. I'm curious though if we'll end up
> taking a performance hit on direct I/O workloads.
>
> If I remember correctly we currently have large file create with DIO
> turned off, right? Would it be possible to do a large file create
> with DIO enabled, and do a quick run both with and without this patch?
>

The background thread performs the conversion when async DIO writes to
holes or preallocated space complete. So we would need to set up
fallocated files and run async direct IO over them to exercise any
potential scalability issue with the background DIO conversion thread...

I took a look at FFSB; it doesn't support fallocate or async IO yet.
But fio does support aio and fallocate. This is a simple fio profile I
use: the test file is set up by fallocate() and random aio dio is run
over it. It may be useful for Eric to try, or to use as a reference, on
his 48-core machine.

examples$ cat aio-setup
; Random read/write to fallocated files with aio dio
[global]
ioengine=libaio
direct=1
rw=randrw
bs=4k
size=2m
filesize=1024m
fallocate=1
directory=/tmp

[file1]
iodepth=4

> In the future it would also be interesting to see how we are doing
> versus other file systems using a DIO workload. This is a probably
> another area where I suspect some lockstat and oprofile runs may give
> us opportunities for further optimization.
>
> - Ted
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index f6a318f..c76a6a5 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -3509,7 +3509,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
 	percpu_counter_set(&sbi->s_dirtyblocks_counter, 0);
 
 no_journal:
-	EXT4_SB(sb)->dio_unwritten_wq = create_workqueue("ext4-dio-unwritten");
+	EXT4_SB(sb)->dio_unwritten_wq = create_singlethread_workqueue("ext4-dio-unwritten");
 	if (!EXT4_SB(sb)->dio_unwritten_wq) {
 		printk(KERN_ERR "EXT4-fs: failed to create DIO workqueue\n");
 		goto failed_mount_wq;
While testing ext4 on a multi-core machine, we found per-CPU
ext4-dio-unwritten threads converting unwritten extents to written
for IOs completed from the async direct IO patch. One thread per
filesystem is enough; we don't need per-CPU threads to do the
conversion.

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
---
 fs/ext4/super.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)
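
To put rough numbers on the thread-count concern raised in this thread, here is a small userspace C sketch. It is not kernel code. The helper name dio_worker_threads() is hypothetical, and the thread model is an assumption that takes the patch description at face value: one ext4-dio-unwritten worker per CPU per mounted filesystem before the patch, one per filesystem after it.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical helper (not part of the kernel or of this patch).
 *
 * Estimates how many ext4-dio-unwritten worker threads exist system-wide.
 * Assumption: a per-CPU workqueue contributes one worker per CPU for each
 * mounted filesystem, while a single-threaded workqueue contributes one
 * worker per mounted filesystem.
 */
unsigned long dio_worker_threads(unsigned long ncpus,
                                 unsigned long nfilesystems,
                                 int per_cpu)
{
	if (!per_cpu)
		return nfilesystems;

	/* Guard the multiplication against unsigned overflow. */
	assert(nfilesystems == 0 || ncpus <= ULONG_MAX / nfilesystems);
	return ncpus * nfilesystems;
}
```

Under that assumption, a 48-core machine with 64 ext4 filesystems mounted gives dio_worker_threads(48, 64, 1) == 3072 before the patch and dio_worker_threads(48, 64, 0) == 64 after it. The sketch only counts threads; it says nothing about whether DIO throughput changes, which is the open question in the thread.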