From patchwork Wed Jun 9 15:07:19 2010 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jan Hubicka X-Patchwork-Id: 55104 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) by ozlabs.org (Postfix) with SMTP id 3397CB7D20 for ; Thu, 10 Jun 2010 01:07:35 +1000 (EST) Received: (qmail 7423 invoked by alias); 9 Jun 2010 15:07:33 -0000 Received: (qmail 7415 invoked by uid 22791); 9 Jun 2010 15:07:32 -0000 X-SWARE-Spam-Status: No, hits=-1.8 required=5.0 tests=AWL, BAYES_00, T_RP_MATCHES_RCVD X-Spam-Check-By: sourceware.org Received: from nikam-dmz.ms.mff.cuni.cz (HELO nikam.ms.mff.cuni.cz) (195.113.20.16) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Wed, 09 Jun 2010 15:07:21 +0000 Received: from localhost (occam.ms.mff.cuni.cz [195.113.18.121]) by nikam.ms.mff.cuni.cz (Postfix) with ESMTP id 33FF39AC86B; Wed, 9 Jun 2010 17:07:19 +0200 (CEST) Received: by localhost (Postfix, from userid 16202) id 2EE7E5641AD; Wed, 9 Jun 2010 17:07:19 +0200 (CEST) Date: Wed, 9 Jun 2010 17:07:19 +0200 From: Jan Hubicka To: Paolo Carlini Cc: Richard Guenther , Jan Hubicka , gcc-patches@gcc.gnu.org Subject: Re: Speedup bitmap iterators Message-ID: <20100609150718.GC15770@kam.mff.cuni.cz> References: <20100609095137.GA18910@kam.mff.cuni.cz> <4C0F77C7.1090900@oracle.com> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <4C0F77C7.1090900@oracle.com> User-Agent: Mutt/1.5.18 (2008-05-17) Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Hi, here is updated patch addressing the code duplication. With __builtin_ctzl I really believe we should fix it at codegen side if it turns out to be slow on hosts we care about. It is codegen bug, __builtin_ctzl exists for this purpose IMO. On i386 the codegen is good even if the topmost bit is usually set. It is slower than test+jcc case but not by that big margin. I might make some statistics on average amount of bits skipped, but build time testing+oprofiling definitly didn't found any regression here. Honza * bitmap.h (bmp_iter_next_bit): New. (bmp_iter_set, bmp_iter_and, bmp_iter_and_compl): Use it. Index: bitmap.h =================================================================== --- bitmap.h (revision 160447) +++ bitmap.h (working copy) @@ -385,6 +385,27 @@ bmp_iter_next (bitmap_iterator *bi, unsi *bit_no += 1; } +/* Advance to first set bit in BI. */ + +static inline void +bmp_iter_next_bit (bitmap_iterator * bi, unsigned *bit_no) +{ +#if (GCC_VERSION >= 3004) + { + unsigned int n = __builtin_ctzl (bi->bits); + gcc_assert (sizeof (unsigned long) == sizeof (BITMAP_WORD)); + bi->bits >>= n; + *bit_no += n; + } +#else + while (!(bi->bits & 1)) + { + bi->bits >>= 1; + *bit_no += 1; + } +#endif +} + /* Advance to the next nonzero bit of a single bitmap, we will have already advanced past the just iterated bit. Return true if there is a bit to iterate. */ @@ -396,11 +417,7 @@ bmp_iter_set (bitmap_iterator *bi, unsig if (bi->bits) { next_bit: - while (!(bi->bits & 1)) - { - bi->bits >>= 1; - *bit_no += 1; - } + bmp_iter_next_bit (bi, bit_no); return true; } @@ -443,11 +460,7 @@ bmp_iter_and (bitmap_iterator *bi, unsig if (bi->bits) { next_bit: - while (!(bi->bits & 1)) - { - bi->bits >>= 1; - *bit_no += 1; - } + bmp_iter_next_bit (bi, bit_no); return true; } @@ -510,11 +523,7 @@ bmp_iter_and_compl (bitmap_iterator *bi, if (bi->bits) { next_bit: - while (!(bi->bits & 1)) - { - bi->bits >>= 1; - *bit_no += 1; - } + bmp_iter_next_bit (bi, bit_no); return true; }