Message ID: 4CFE7409.9090609@redhat.com
State: Not Applicable, archived
On Tue, 07 Dec 2010 11:51:05 -0600 Eric Sandeen <sandeen@redhat.com> wrote:

> The addition of 64k block capability in the rec_len_from_disk
> and rec_len_to_disk functions added a bit of math overhead which
> slows down file create workloads needlessly when the architecture
> cannot even support 64k blocks, thanks to page size limits.
>
> The directory entry checking can also be optimized a bit
> by sprinkling in some unlikely() conditions to move the
> error handling out of line.
>
> bonnie++ sequential file creates on a 512MB ramdisk speeds up
> from about 2200/s to about 2500/s, about a 14% improvement.

hrm, that's an improbably-large sounding improvement from eliminating
just a few test-n-branches from a pretty heavyweight operation.

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On 12/7/10 3:07 PM, Andrew Morton wrote:
> On Tue, 07 Dec 2010 11:51:05 -0600
> Eric Sandeen <sandeen@redhat.com> wrote:
>
>> The addition of 64k block capability in the rec_len_from_disk
>> and rec_len_to_disk functions added a bit of math overhead which
>> slows down file create workloads needlessly when the architecture
>> cannot even support 64k blocks, thanks to page size limits.
>>
>> The directory entry checking can also be optimized a bit
>> by sprinkling in some unlikely() conditions to move the
>> error handling out of line.
>>
>> bonnie++ sequential file creates on a 512MB ramdisk speeds up
>> from about 2200/s to about 2500/s, about a 14% improvement.
>
> hrm, that's an improbably-large sounding improvement from eliminating
> just a few test-n-branches from a pretty heavyweight operation.

And yet ...

Yeah, I dunno.  Part of it is that ext2_add_link does a linear
search, so when you do rec_len_from_disk 50,000 times on a dir,
that little bit adds up quite badly I suppose.

Retesting at a bunch of different number-of-files in bonnie
(with a small sample size so probably a little noise)

          |files per sec|
files     stock   patched   delta
10,000    12300   14700     +19%
20,000     6300    7600     +20%
30,000     4200    5000     +20%
40,000     3150    3700     +17%
50,000     2500    3000     +20%

(again all on a 512MB ramdisk)

*shrug* I'll believe my lyin' eyes, I guess. :)

-Eric
On Tue, 07 Dec 2010 15:22:55 -0600 Eric Sandeen <sandeen@redhat.com> wrote:

> On 12/7/10 3:07 PM, Andrew Morton wrote:
>> On Tue, 07 Dec 2010 11:51:05 -0600
>> Eric Sandeen <sandeen@redhat.com> wrote:
>>
>>> The addition of 64k block capability in the rec_len_from_disk
>>> and rec_len_to_disk functions added a bit of math overhead which
>>> slows down file create workloads needlessly when the architecture
>>> cannot even support 64k blocks, thanks to page size limits.
>>>
>>> The directory entry checking can also be optimized a bit
>>> by sprinkling in some unlikely() conditions to move the
>>> error handling out of line.
>>>
>>> bonnie++ sequential file creates on a 512MB ramdisk speeds up
>>> from about 2200/s to about 2500/s, about a 14% improvement.
>>
>> hrm, that's an improbably-large sounding improvement from eliminating
>> just a few test-n-branches from a pretty heavyweight operation.
>
> And yet ...
>
> Yeah, I dunno.  Part of it is that ext2_add_link does a linear
> search, so when you do rec_len_from_disk 50,000 times on a dir,
> that little bit adds up quite badly I suppose.

oh yeah, that would do it.

> Retesting at a bunch of different number-of-files in bonnie
> (with a small sample size so probably a little noise)
>
>           |files per sec|
> files     stock   patched   delta
> 10,000    12300   14700     +19%
> 20,000     6300    7600     +20%
> 30,000     4200    5000     +20%
> 40,000     3150    3700     +17%
> 50,000     2500    3000     +20%
>
> (again all on a 512MB ramdisk)
>
> *shrug* I'll believe my lyin' eyes, I guess. :)

I bet other tweaks in there would yield similar goodliness.
On 12/7/10 3:33 PM, Andrew Morton wrote:
> I bet other tweaks in there would yield similar goodliness.
Yep, probably so ...
We could also speed up ext3/4 a bit by not doing the extN_check_dir_entry
call on every single access, but only when we read it from disk.
Given that I've talked about it for 4 years and never done it, maybe
I'll try to find a volunteer ;)
-Eric
On 2010-12-07, at 14:33, Andrew Morton wrote:
> On Tue, 07 Dec 2010 15:22:55 -0600
> Eric Sandeen <sandeen@redhat.com> wrote:
>> Retesting at a bunch of different number-of-files in bonnie
>> (with a small sample size so probably a little noise)
>>
>>           |files per sec|
>> files     stock   patched   delta
>> 10,000    12300   14700     +19%
>> 20,000     6300    7600     +20%
>> 30,000     4200    5000     +20%
>> 40,000     3150    3700     +17%
>> 50,000     2500    3000     +20%
>>
>> (again all on a 512MB ramdisk)
>>
>> *shrug* I'll believe my lyin' eyes, I guess. :)
>
> I bet other tweaks in there would yield similar goodliness.

I think an important factor here is that this is being tested on a
ramdisk, and is likely CPU bound, so any CPU reduction will directly
be measured as a performance improvement.  Probably oprofile is in
order to see where other major CPU users are.

In the past, ext3_mark_inode_dirty() was a major offender, and I
recall that we discussed on the ext4 concall a mechanism to only copy
the inode once per transaction so that overhead could be removed.

Cheers, Andreas
On 12/08/2010 01:01 PM, Andreas Dilger wrote:
> On 2010-12-07, at 14:33, Andrew Morton wrote:
>> I bet other tweaks in there would yield similar goodliness.
>
> I think an important factor here is that this is being tested on a
> ramdisk, and is likely CPU bound, so any CPU reduction will directly
> be measured as a performance improvement.  Probably oprofile is in
> order to see where other major CPU users are.

Yep, I ran oprofile.

samples   %        app name    symbol name
1140046   41.8702  ext2.ko     ext2_find_entry
1052117   38.6408  ext2.ko     ext2_add_link
98424      3.6148  vmlinux     native_safe_halt
40461      1.4860  vmlinux     wait_on_page_read
29084      1.0682  vmlinux     find_get_page

pretty slammed on those 2 ext2 functions!  I think it's pretty
overwhelmed by the linear search.

-Eric

> In the past, ext3_mark_inode_dirty() was a major offender, and I
> recall that we discussed on the ext4 concall a mechanism to only copy
> the inode once per transaction so that overhead could be removed.
>
> Cheers, Andreas
On 2010-12-08, at 14:07, Eric Sandeen wrote:
> On 12/08/2010 01:01 PM, Andreas Dilger wrote:
>> I think an important factor here is that this is being tested on a
>> ramdisk, and is likely CPU bound, so any CPU reduction will directly
>> be measured as a performance improvement.  Probably oprofile is in
>> order to see where other major CPU users are.
>
> Yep, I ran oprofile.
>
> samples   %        app name    symbol name
> 1140046   41.8702  ext2.ko     ext2_find_entry
> 1052117   38.6408  ext2.ko     ext2_add_link
> 98424      3.6148  vmlinux     native_safe_halt
> 40461      1.4860  vmlinux     wait_on_page_read
> 29084      1.0682  vmlinux     find_get_page
>
> pretty slammed on those 2 ext2 functions!  I think it's pretty
> overwhelmed by the linear search.

Can you test ext4 with nojournal mode, but with dir_index enabled?  I
suspect that testing ext2 for directory performance is pointless.  My
personal threshold for ext2 directories was 10k files before I
considered it a lost cause, and all of your tests are with 10k+ files
per directory.

Just another log on the fire beneath getting rid of ext2 (and
eventually ext3) in favour of ext4, IMHO.  I'd be surprised if there
are many benchmarks that ext2 can beat ext4 in nojournal mode, if
allowed to enable "reversible" format changes like dir_index,
uninit_bg, etc.

Cheers, Andreas
On 12/08/2010 04:44 PM, Andreas Dilger wrote:
> Can you test ext4 with nojournal mode, but with dir_index enabled?  I
> suspect that testing ext2 for directory performance is pointless.  My
> personal threshold for ext2 directories was 10k files before I
> considered it a lost cause, and all of your tests are with 10k+ files
> per directory.
>
> Just another log on the fire beneath getting rid of ext2 (and
> eventually ext3) in favour of ext4, IMHO.  I'd be surprised if there
> are many benchmarks that ext2 can beat ext4 in nojournal mode, if
> allowed to enable "reversible" format changes like dir_index,
> uninit_bg, etc.
>
> Cheers, Andreas

If we could get rid of ext2 (and eventually ext3), it would actually
help reduce the testing matrix and possibly let us invest even more in
testing ext4.  Having to maintain three very similar code bases and
test them all for correctness and performance is a real pain :)

ric
On 12/8/10 3:44 PM, Andreas Dilger wrote:
> Can you test ext4 with nojournal mode, but with dir_index enabled?  I
> suspect that testing ext2 for directory performance is pointless.

Oh, I agree.  I just had a report of a regression in a certain
distro, which is frowned upon...  :)

I'm positive ext4 nojournal would beat the pants off it due to dir_index.

> My personal threshold for ext2 directories was 10k files before I
> considered it a lost cause, and all of your tests are with 10k+ files
> per directory.

Agreed, it's not very realistic but then it was a simple fix too.
Still took time though...

> Just another log on the fire beneath getting rid of ext2 (and
> eventually ext3) in favour of ext4, IMHO.  I'd be surprised if there
> are many benchmarks that ext2 can beat ext4 in nojournal mode, if
> allowed to enable "reversible" format changes like dir_index,
> uninit_bg, etc.

Hey, I wouldn't complain.

-Eric

> Cheers, Andreas
On Wed, Dec 08, 2010 at 08:06:25PM -0500, Ric Wheeler wrote:
> If we could get rid of ext2 (and eventually ext3), it would actually
> help reduce the testing matrix and possibly let us invest even more
> in testing ext4.  Having to maintain three very similar code bases
> and test them all for correctness and performance is a real pain :)

A distribution can do that at any time, just by unconfiguring
CONFIG_EXT2 and/or CONFIG_EXT3 and configuring
CONFIG_EXT4_USE_FOR_EXT23.  :-)

- Ted
On 12/10/2010 12:55 PM, Ted Ts'o wrote:
> On Wed, Dec 08, 2010 at 08:06:25PM -0500, Ric Wheeler wrote:
>> If we could get rid of ext2 (and eventually ext3), it would actually
>> help reduce the testing matrix and possibly let us invest even more
>> in testing ext4.  Having to maintain three very similar code bases
>> and test them all for correctness and performance is a real pain :)
>
> A distribution can do that at any time, just by unconfiguring
> CONFIG_EXT2 and/or CONFIG_EXT3 and configuring
> CONFIG_EXT4_USE_FOR_EXT23.  :-)
>
> - Ted

Way too easy :)

What makes users stop using things is google searches that show up Ted
saying "We don't need ext2 any more, just use ext4 with no journal
mode" ;)

Ric
On 2010-12-10, at 10:55, Ted Ts'o wrote:
> On Wed, Dec 08, 2010 at 08:06:25PM -0500, Ric Wheeler wrote:
>> If we could get rid of ext2 (and eventually ext3), it would actually
>> help reduce the testing matrix and possibly let us invest even more
>> in testing ext4.  Having to maintain three very similar code bases
>> and test them all for correctness and performance is a real pain :)
>
> A distribution can do that at any time, just by unconfiguring
> CONFIG_EXT2 and/or CONFIG_EXT3 and configuring
> CONFIG_EXT4_USE_FOR_EXT23.  :-)

That doesn't get rid of the maintenance efforts, however.  There are a
constant stream of "patch for ext[234] needs to be ported to ext[342]".

Cheers, Andreas
On 12/10/10 5:40 PM, Andreas Dilger wrote:
> On 2010-12-10, at 10:55, Ted Ts'o wrote:
>> On Wed, Dec 08, 2010 at 08:06:25PM -0500, Ric Wheeler wrote:
>>> If we could get rid of ext2 (and eventually ext3), it would actually
>>> help reduce the testing matrix and possibly let us invest even more
>>> in testing ext4.  Having to maintain three very similar code bases
>>> and test them all for correctness and performance is a real pain :)
>>
>> A distribution can do that at any time, just by unconfiguring
>> CONFIG_EXT2 and/or CONFIG_EXT3 and configuring
>> CONFIG_EXT4_USE_FOR_EXT23.  :-)
>
> That doesn't get rid of the maintenance efforts, however.  There are a
> constant stream of "patch for ext[234] needs to be ported to ext[342]".

Agreed.... our 3-way fork is getting cumbersome.

> Cheers, Andreas
diff --git a/fs/ext2/dir.c b/fs/ext2/dir.c
index 2709b34..47cda41 100644
--- a/fs/ext2/dir.c
+++ b/fs/ext2/dir.c
@@ -28,21 +28,30 @@
 typedef struct ext2_dir_entry_2 ext2_dirent;
 
+/*
+ * Tests against MAX_REC_LEN etc were put in place for 64k block
+ * sizes; if that is not possible on this arch, we can skip
+ * those tests and speed things up.
+ */
 static inline unsigned ext2_rec_len_from_disk(__le16 dlen)
 {
 	unsigned len = le16_to_cpu(dlen);
 
+#if (PAGE_CACHE_SIZE >= 65536)
 	if (len == EXT2_MAX_REC_LEN)
 		return 1 << 16;
+#endif
 	return len;
 }
 
 static inline __le16 ext2_rec_len_to_disk(unsigned len)
 {
+#if (PAGE_CACHE_SIZE >= 65536)
 	if (len == (1 << 16))
 		return cpu_to_le16(EXT2_MAX_REC_LEN);
 	else
 		BUG_ON(len > (1 << 16));
+#endif
 	return cpu_to_le16(len);
 }
 
@@ -129,15 +138,15 @@ static void ext2_check_page(struct page *page, int quiet)
 		p = (ext2_dirent *)(kaddr + offs);
 		rec_len = ext2_rec_len_from_disk(p->rec_len);
 
-		if (rec_len < EXT2_DIR_REC_LEN(1))
+		if (unlikely(rec_len < EXT2_DIR_REC_LEN(1)))
 			goto Eshort;
-		if (rec_len & 3)
+		if (unlikely(rec_len & 3))
 			goto Ealign;
-		if (rec_len < EXT2_DIR_REC_LEN(p->name_len))
+		if (unlikely(rec_len < EXT2_DIR_REC_LEN(p->name_len)))
 			goto Enamelen;
-		if (((offs + rec_len - 1) ^ offs) & ~(chunk_size-1))
+		if (unlikely(((offs + rec_len - 1) ^ offs) & ~(chunk_size-1)))
 			goto Espan;
-		if (le32_to_cpu(p->inode) > max_inumber)
+		if (unlikely(le32_to_cpu(p->inode) > max_inumber))
 			goto Einumber;
 	}
 	if (offs != limit)
The addition of 64k block capability in the rec_len_from_disk
and rec_len_to_disk functions added a bit of math overhead which
slows down file create workloads needlessly when the architecture
cannot even support 64k blocks, thanks to page size limits.

The directory entry checking can also be optimized a bit
by sprinkling in some unlikely() conditions to move the
error handling out of line.

bonnie++ sequential file creates on a 512MB ramdisk speeds up
from about 2200/s to about 2500/s, about a 14% improvement.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
---