Message ID | 4C0DF9CD.9010807@kit.edu |
---|---|
State | New |
Headers | show |
On 8 June 2010 09:05, Johannes Singler wrote: > Jonathan Wakely wrote: >> On 8 June 2010 08:21, Johannes Singler wrote: >>> This patch corrects the calculation of the part lengths for parallel >>> partial_sum, leading to the expected behavior for >>> partial_sum_dilation!=1, and thus better performance. >>> >>> Tested x86_64-unknown-linux-gnu: No regressions. >>> >>> Please approve for mainline. >>> >>> 2010-06-08 Johannes Singler <singler@kit.edu> >>> >>> * include/parallel/partial_sum.h >>> (__parallel_partial_sum_linear): >>> Correctly calculate part lengths for partial_sum_dilation!=1. >> >> Please declare each variable on its own line, as per >> http://gcc.gnu.org/onlinedocs/libstdc++/manual/source_code_style.html >> >> char* c = "abc"; // each variable goes on its own line, always. > > Okay, I hope you like the attached version better. Yes, thanks. Are the casts to float necessary in the initializer for first_part_length? The 1.0 means the result will be a double anyway. Should that be 1.0f, or are the (float) casts just visual noise? In any case, I think it's OK for mainline, thanks. Jonathan
Jonathan Wakely wrote: > On 8 June 2010 09:05, Johannes Singler wrote: >> Jonathan Wakely wrote: >>> On 8 June 2010 08:21, Johannes Singler wrote: >>>> This patch corrects the calculation of the part lengths for parallel >>>> partial_sum, leading to the expected behavior for >>>> partial_sum_dilation!=1, and thus better performance. >>>> >>>> Tested x86_64-unknown-linux-gnu: No regressions. >>>> >>>> Please approve for mainline. >>>> >>>> 2010-06-08 Johannes Singler <singler@kit.edu> >>>> >>>> * include/parallel/partial_sum.h >>>> (__parallel_partial_sum_linear): >>>> Correctly calculate part lengths for partial_sum_dilation!=1. >>> Please declare each variable on its own line, as per >>> http://gcc.gnu.org/onlinedocs/libstdc++/manual/source_code_style.html >>> >>> char* c = "abc"; // each variable goes on its own line, always. >> Okay, I hope you like the attached version better. > > Yes, thanks. Are the casts to float necessary in the initializer for > first_part_length? The 1.0 means the result will be a double anyway. > Should that be 1.0f, or are the (float) casts just visual noise? Good point, I have cleaned that up, and commited. Johannes
Index: include/parallel/partial_sum.h =================================================================== --- include/parallel/partial_sum.h (revision 160424) +++ include/parallel/partial_sum.h (working copy) @@ -127,10 +127,13 @@ equally_split(__n, __num_threads + 1, __borders); else { + _DifferenceType __first_part_length = + std::max<_DifferenceType>(1, (float)__n / + (1.0 + __s.partial_sum_dilation * (float)__num_threads)); _DifferenceType __chunk_length = - ((double)__n - / ((double)__num_threads + __s.partial_sum_dilation)), - __borderstart = __n - __num_threads * __chunk_length; + (__n - __first_part_length) / __num_threads; + _DifferenceType __borderstart = + __n - __num_threads * __chunk_length; __borders[0] = 0; for (_ThreadIndex __i = 1; __i < (__num_threads + 1); ++__i) {