Message ID | 20120703174147.GA14986@gmail.com |
---|---|
State | Accepted, archived |
Headers | show |
> workload frequently does a uninitialized extent conversion. Thus, now > we set it to 256 (1MB chunk), and put it into super block in order to > adjust it dynamically in sysfs. It's a bit of a tangent, but this caught my eye. > + /* If extent has less than 2*s_extent_zeroout_len zerout directly */ > + if (ee_len<= 2*sbi->s_extent_zeroout_len&& > (EXT4_EXT_MAY_ZEROOUT& split_flag)) { > - if (allocated<= EXT4_EXT_ZERO_LEN&& > + if (allocated<= sbi->s_extent_zeroout_len&& > (EXT4_EXT_MAY_ZEROOUT& split_flag)) { > } else if ((map->m_lblk - ee_block + map->m_len< > - EXT4_EXT_ZERO_LEN)&& > + sbi->s_extent_zeroout_len)&& I'd be worried about having to verify that nothing bad happened if these sbi s_extent_zeroout_len references could see different values if they raced with a sysfs update. Can they do that? Maybe avoid the risk all together and have an on-stack copy that's only referenced once at the start? - z -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Zach, Thanks for reviewing this patch. On Tue, Jul 03, 2012 at 10:57:35AM -0700, Zach Brown wrote: > > >workload frequently does a uninitialized extent conversion. Thus, now > >we set it to 256 (1MB chunk), and put it into super block in order to > >adjust it dynamically in sysfs. > > It's a bit of a tangent, but this caught my eye. Oh, actually now the default value is set to 1MB in this patch. But I think maybe other users want to change this value. So I add an interface in sysfs to adjust dynaimically. Certainly it is convenient for me to do the above tests. :-) > > >+ /* If extent has less than 2*s_extent_zeroout_len zerout directly */ > >+ if (ee_len<= 2*sbi->s_extent_zeroout_len&& > > (EXT4_EXT_MAY_ZEROOUT& split_flag)) { > > >- if (allocated<= EXT4_EXT_ZERO_LEN&& > >+ if (allocated<= sbi->s_extent_zeroout_len&& > > (EXT4_EXT_MAY_ZEROOUT& split_flag)) { > > > } else if ((map->m_lblk - ee_block + map->m_len< > >- EXT4_EXT_ZERO_LEN)&& > >+ sbi->s_extent_zeroout_len)&& > > I'd be worried about having to verify that nothing bad happened if these > sbi s_extent_zeroout_len references could see different values if they > raced with a sysfs update. Can they do that? > > Maybe avoid the risk all together and have an on-stack copy that's only > referenced once at the start? I don't think 's_extent_zeroout_len' is updated frequently by user, but using an on-stack copy quite can avoid the risk. I will fix it if most of all think that this patch is useful and can be applied. Regards, Zheng -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h index cfc4e01..0f44577 100644 --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -1265,6 +1265,9 @@ struct ext4_sb_info { /* locality groups */ struct ext4_locality_group __percpu *s_locality_groups; + /* the length of zero-out chunk */ + unsigned int s_extent_zeroout_len; + /* for write statistics */ unsigned long s_sectors_written_start; u64 s_kbytes_written; diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c index 91341ec..e921c02 100644 --- a/fs/ext4/extents.c +++ b/fs/ext4/extents.c @@ -3029,7 +3029,6 @@ out: return err ? err : map->m_len; } -#define EXT4_EXT_ZERO_LEN 7 /* * This function is called by ext4_ext_map_blocks() if someone tries to write * to an uninitialized extent. It may result in splitting the uninitialized @@ -3055,6 +3054,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle, struct ext4_map_blocks *map, struct ext4_ext_path *path) { + struct ext4_sb_info *sbi; struct ext4_extent_header *eh; struct ext4_map_blocks split_map; struct ext4_extent zero_ex; @@ -3069,6 +3069,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle, "block %llu, max_blocks %u\n", inode->i_ino, (unsigned long long)map->m_lblk, map->m_len); + sbi = EXT4_SB(inode->i_sb); eof_block = (inode->i_size + inode->i_sb->s_blocksize - 1) >> inode->i_sb->s_blocksize_bits; if (eof_block < map->m_lblk + map->m_len) @@ -3168,8 +3169,8 @@ static int ext4_ext_convert_to_initialized(handle_t *handle, */ split_flag |= ee_block + ee_len <= eof_block ? EXT4_EXT_MAY_ZEROOUT : 0; - /* If extent has less than 2*EXT4_EXT_ZERO_LEN zerout directly */ - if (ee_len <= 2*EXT4_EXT_ZERO_LEN && + /* If extent has less than 2*s_extent_zeroout_len zerout directly */ + if (ee_len <= 2*sbi->s_extent_zeroout_len && (EXT4_EXT_MAY_ZEROOUT & split_flag)) { err = ext4_ext_zeroout(inode, ex); if (err) @@ -3195,7 +3196,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle, split_map.m_len = map->m_len; if (allocated > map->m_len) { - if (allocated <= EXT4_EXT_ZERO_LEN && + if (allocated <= sbi->s_extent_zeroout_len && (EXT4_EXT_MAY_ZEROOUT & split_flag)) { /* case 3 */ zero_ex.ee_block = @@ -3209,7 +3210,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle, split_map.m_lblk = map->m_lblk; split_map.m_len = allocated; } else if ((map->m_lblk - ee_block + map->m_len < - EXT4_EXT_ZERO_LEN) && + sbi->s_extent_zeroout_len) && (EXT4_EXT_MAY_ZEROOUT & split_flag)) { /* case 2 */ if (map->m_lblk != ee_block) { diff --git a/fs/ext4/super.c b/fs/ext4/super.c index eb7aa3e..ad6cf73 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -2535,6 +2535,7 @@ EXT4_RW_ATTR_SBI_UI(mb_order2_req, s_mb_order2_reqs); EXT4_RW_ATTR_SBI_UI(mb_stream_req, s_mb_stream_request); EXT4_RW_ATTR_SBI_UI(mb_group_prealloc, s_mb_group_prealloc); EXT4_RW_ATTR_SBI_UI(max_writeback_mb_bump, s_max_writeback_mb_bump); +EXT4_RW_ATTR_SBI_UI(extent_zeroout_len, s_extent_zeroout_len); EXT4_ATTR(trigger_fs_error, 0200, NULL, trigger_test_error); static struct attribute *ext4_attrs[] = { @@ -2550,6 +2551,7 @@ static struct attribute *ext4_attrs[] = { ATTR_LIST(mb_stream_req), ATTR_LIST(mb_group_prealloc), ATTR_LIST(max_writeback_mb_bump), + ATTR_LIST(extent_zeroout_len), ATTR_LIST(trigger_fs_error), NULL, }; @@ -3626,6 +3628,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent) sbi->s_stripe = ext4_get_stripe_size(sbi); sbi->s_max_writeback_mb_bump = 128; + sbi->s_extent_zeroout_len = 256; /* * set up enough so that it can read an inode