From patchwork Fri Apr 19 21:31:52 2013
X-Patchwork-Submitter: Andi Kleen
X-Patchwork-Id: 238118
From: Andi Kleen
To: gcc-patches@gcc.gnu.org
Cc: hubicka@ucw.cz, Andi Kleen
Subject: [PATCH 4/9] Add murmurhash2a
Date: Fri, 19 Apr 2013 23:31:52 +0200
Message-Id: <1366407117-18462-5-git-send-email-andi@firstfloor.org>
In-Reply-To: <1366407117-18462-1-git-send-email-andi@firstfloor.org>
References: <1366407117-18462-1-git-send-email-andi@firstfloor.org>

From: Andi Kleen

Used in the next patch. I use Austin Appleby's Public Domain
Murmur2A reference code. I don't own that code.

Murmur hash is available from
http://code.google.com/p/smhasher/wiki/MurmurHash

gcc/:

2013-04-18  Andi Kleen

	* murmurhash.h: New file
---
 gcc/murmurhash.h | 106 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 106 insertions(+)
 create mode 100644 gcc/murmurhash.h

diff --git a/gcc/murmurhash.h b/gcc/murmurhash.h
new file mode 100644
index 0000000..3051e6d
--- /dev/null
+++ b/gcc/murmurhash.h
@@ -0,0 +1,106 @@
+/* CMurmurHash2A, by Austin Appleby with minor changes for gcc.
+   This is a variant of MurmurHash2 modified to use the Merkle-Damgard
+   construction. Bulk speed should be identical to Murmur2, small-key speed
+   will be 10%-20% slower due to the added overhead at the end of the hash.
+   This is a sample implementation of MurmurHash2A designed to work
+   incrementally.
+
+   Only uses 4 bytes, no special 64bit implementation.
+   However gcc is typically more bound by hash misses on collisions than
+   the speed of the hash function itself. */
+
+#define mmix(h,k) { k *= m; k ^= k >> r; k *= m; h *= m; h ^= k; }
+
+class cmurmurhash2A
+{
+ public:
+
+  void begin(uint32_t seed = 0)
+  {
+    m_hash  = seed;
+    m_tail  = 0;
+    m_count = 0;
+    m_size  = 0;
+  }
+
+  /* Note the caller must pass in 4 byte aligned values */
+  void add(const unsigned char * data, int len)
+  {
+    m_size += len;
+
+    mix_tail (data,len);
+
+    while(len >= 4)
+      {
+        uint32_t k = *(const uint32_t *)data;
+
+        mmix (m_hash, k);
+
+        data += 4;
+        len -= 4;
+      }
+
+    mix_tail (data,len);
+  }
+
+  void add_int(uint32_t i)
+  {
+    add ((const unsigned char *)&i, 4);
+  }
+
+  void add_ptr(void *p)
+  {
+    add ((const unsigned char *)&p, sizeof (void *));
+  }
+
+  void merge(cmurmurhash2A &s)
+  {
+    // better way?
+    add_int (s.end());
+  }
+
+  /* Does not destroy state. */
+  uint32_t end(void)
+  {
+    uint32_t t_hash = m_hash;
+    uint32_t t_tail = m_tail;
+    uint32_t t_size = m_size;
+
+    mmix (t_hash, t_tail);
+    mmix (t_hash, t_size);
+
+    t_hash ^= t_hash >> 13;
+    t_hash *= m;
+    t_hash ^= t_hash >> 15;
+
+    return t_hash;
+  }
+
+ private:
+
+  static const uint32_t m = 0x5bd1e995;
+  static const int r = 24;
+
+  void mix_tail (const unsigned char * & data, int & len)
+  {
+    while (len && ((len<4) || m_count))
+      {
+        m_tail |= (*data++) << (m_count * 8);
+
+        m_count++;
+        len--;
+
+        if(m_count == 4)
+          {
+            mmix (m_hash, m_tail);
+            m_tail = 0;
+            m_count = 0;
+          }
+      }
+  }
+
+  uint32_t m_hash;
+  uint32_t m_tail;
+  uint32_t m_count;
+  uint32_t m_size;
+};
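
For anyone who wants to play with the incremental interface outside the
gcc tree, here is a minimal standalone sketch. It is not the code from the
patch: the names incremental_hash_sketch, hash_once and hash_in_chunks are
hypothetical, and the sketch assembles each 32-bit word byte by byte in
little-endian order instead of casting the data pointer to uint32_t, so it
makes no alignment assumption. Whether it produces the same values as
cmurmurhash2A on a given host is an assumption that was not checked here.
What it does show is the property the header comment describes: the result
depends only on the byte stream, not on how the stream is split across
add() calls, and end() can be called without disturbing the state.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical standalone sketch of an incremental MurmurHash2A-style
// hasher.  Illustration only; this is not the class from the patch.
class incremental_hash_sketch
{
 public:
  explicit incremental_hash_sketch (uint32_t seed = 0)
    : m_hash (seed), m_tail (0), m_count (0), m_size (0) {}

  // Feed LEN bytes.  Bytes are packed into a 32-bit word, least
  // significant byte first, and mixed in whenever a word is complete.
  void add (const void *p, size_t len)
  {
    const unsigned char *data = static_cast<const unsigned char *> (p);
    m_size += static_cast<uint32_t> (len);
    for (size_t i = 0; i < len; i++)
      {
        // Cast before shifting so the shift is done on an unsigned value.
        m_tail |= static_cast<uint32_t> (data[i]) << (m_count * 8);
        if (++m_count == 4)
          {
            mix (m_hash, m_tail);
            m_tail = 0;
            m_count = 0;
          }
      }
  }

  // Finalize on copies, so the hasher can keep accepting data afterwards.
  uint32_t end () const
  {
    uint32_t h = m_hash;
    mix (h, m_tail);
    mix (h, m_size);
    h ^= h >> 13;
    h *= 0x5bd1e995u;
    h ^= h >> 15;
    return h;
  }

 private:
  // Same arithmetic as the mmix macro, with k passed by value.
  static void mix (uint32_t &h, uint32_t k)
  {
    const uint32_t m = 0x5bd1e995u;
    const int r = 24;
    k *= m; k ^= k >> r; k *= m;
    h *= m; h ^= k;
  }

  uint32_t m_hash;
  uint32_t m_tail;
  uint32_t m_count;
  uint32_t m_size;
};

// Hash the whole string with a single add() call.
uint32_t hash_once (const std::string &s, uint32_t seed = 0)
{
  incremental_hash_sketch h (seed);
  h.add (s.data (), s.size ());
  return h.end ();
}

// Hash the same string, feeding it CHUNK bytes at a time.
uint32_t hash_in_chunks (const std::string &s, size_t chunk, uint32_t seed = 0)
{
  assert (chunk > 0);
  incremental_hash_sketch h (seed);
  for (size_t off = 0; off < s.size (); off += chunk)
    {
      size_t n = s.size () - off < chunk ? s.size () - off : chunk;
      h.add (s.data () + off, n);
    }
  return h.end ();
}
```

With this sketch, hash_in_chunks (s, 3) and hash_once (s) return the same
value for any string s, because chunk boundaries only change when the tail
buffer fills up, not which bytes end up in which word.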