From patchwork Tue Jun 8 07:21:56 2010
X-Patchwork-Submitter: Johannes Singler
X-Patchwork-Id: 54944
Message-ID: <4C0DEF94.5020404@kit.edu>
Date: Tue, 08 Jun 2010 09:21:56 +0200
From: Johannes Singler
To: libstdc++, gcc-patches@gcc.gnu.org
Subject: [PATCH][libstdc++-v3 parallel mode] Correct part lengths calculation for parallel partial_sum

This patch corrects the
calculation of the part lengths for parallel partial_sum, leading to the
expected behavior for partial_sum_dilation!=1, and thus better performance.

Tested x86_64-unknown-linux-gnu: No regressions.

Please approve for mainline.

2010-06-08  Johannes Singler

        * include/parallel/partial_sum.h (__parallel_partial_sum_linear):
        Correctly calculate part lengths for partial_sum_dilation!=1.

Johannes

Index: include/parallel/partial_sum.h
===================================================================
--- include/parallel/partial_sum.h	(revision 160253)
+++ include/parallel/partial_sum.h	(working copy)
@@ -127,10 +127,12 @@
         equally_split(__n, __num_threads + 1, __borders);
       else
         {
-          _DifferenceType __chunk_length =
-            ((double)__n
-             / ((double)__num_threads + __s.partial_sum_dilation)),
-            __borderstart = __n - __num_threads * __chunk_length;
+          _DifferenceType
+            __first_part_length = std::max<_DifferenceType>(1,
+              (float)__n /
+              (1.0 + __s.partial_sum_dilation * (float)__num_threads)),
+            __chunk_length = (__n - __first_part_length) / __num_threads,
+            __borderstart = __n - __num_threads * __chunk_length;
           __borders[0] = 0;
           for (_ThreadIndex __i = 1; __i < (__num_threads + 1); ++__i)
             {
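
For readers who want to try the arithmetic outside the library, the two
formulas in the hunk above can be compared in isolation. The following is a
minimal standalone sketch, not the libstdc++ code: DifferenceType,
PartLengths, old_lengths and new_lengths are hypothetical names introduced
here, and the dilation is taken as a plain double parameter rather than
read from the settings object. Only the expressions themselves are copied
from the diff.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical stand-in for the library's _DifferenceType.
using DifferenceType = std::ptrdiff_t;

// The two values both versions of the hunk compute.
struct PartLengths {
  DifferenceType chunk_length;  // length of each of the num_threads later parts
  DifferenceType borderstart;   // length of the first part (end of part 0)
};

// Formula removed by the patch: the remainder left for the first part
// comes out as roughly dilation * chunk_length.
PartLengths old_lengths(DifferenceType n, int num_threads, double dilation) {
  DifferenceType chunk_length = static_cast<DifferenceType>(
      static_cast<double>(n) / (static_cast<double>(num_threads) + dilation));
  return {chunk_length, n - num_threads * chunk_length};
}

// Formula added by the patch: the first part is sized first (at least 1),
// so each later chunk comes out as roughly dilation * first_part_length.
PartLengths new_lengths(DifferenceType n, int num_threads, double dilation) {
  DifferenceType first_part_length = std::max<DifferenceType>(
      1, static_cast<DifferenceType>(
             static_cast<float>(n) /
             (1.0 + dilation * static_cast<float>(num_threads))));
  DifferenceType chunk_length = (n - first_part_length) / num_threads;
  return {chunk_length, n - num_threads * chunk_length};
}
```

With the illustrative inputs n = 1000, num_threads = 4 and dilation = 2.0
(values chosen here, not taken from any benchmark), old_lengths yields a
chunk length of 166 and a first part of 336, while new_lengths yields a
chunk length of 222 and a first part of 112. With dilation = 1.0 both
functions return 200 for both values, which is consistent with the
description above that only partial_sum_dilation!=1 was affected.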