Patchwork Update jhash.h with the new version of Jenkins' hash

login
register
mail settings
Submitter Jozsef Kadlecsik
Date Feb. 10, 2009, 7:40 p.m.
Message ID <alpine.DEB.2.00.0902102026060.10299@blackhole.kfki.hu>
Download mbox | patch
Permalink /patch/22885/
State Not Applicable
Delegated to: David Miller
Headers show

Comments

Jozsef Kadlecsik - Feb. 10, 2009, 7:40 p.m.
Hi Dave,

Please consider applying the following patch:

The current jhash.h implements the lookup2() hash function by Bob Jenkins. 
However, lookup2() is outdated as Bob wrote a new hash function called 
lookup3(). The new hash function

- mixes better than lookup2(): it passes the check that every input bit 
  changes every output bit 50% of the time, while lookup2() failed it.
- performs better: compiled with -O2 on Core2 Duo, lookup3() 20-40% faster
  than lookup2() depending on the key length.

The patch replaces the lookup2() implementation of the 'jhash*' 
functions with that of lookup3().

You can read a longer comparison of the two and other hash functions at 
http://burtleburtle.net/bob/hash/doobs.html.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>



Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlec@mail.kfki.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : KFKI Research Institute for Particle and Nuclear Physics
          H-1525 Budapest 114, POB. 49, Hungary
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Scott Feldman - Feb. 10, 2009, 9:19 p.m.
Jozsef Kadlecsik wrote:
> The patch replaces the lookup2() implementation of the 'jhash*' 
> functions with that of lookup3().

Should the lookup3() be added to rather than replacing lookup2()?  In 
case some hardware vendor used the lookup2() version for weird things 
like flow classification.

>  /* The golden ration: an arbitrary value */
> -#define JHASH_GOLDEN_RATIO	0x9e3779b9
> +#define JHASH_GOLDEN_RATIO	0xdeadbeef

The #define seems mis-named now.

-scott
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller - Feb. 11, 2009, 1:17 a.m.
From: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Date: Tue, 10 Feb 2009 20:40:47 +0100 (CET)

> Please consider applying the following patch:
> 
> The current jhash.h implements the lookup2() hash function by Bob Jenkins. 
> However, lookup2() is outdated as Bob wrote a new hash function called 
> lookup3(). The new hash function
> 
> - mixes better than lookup2(): it passes the check that every input bit 
>   changes every output bit 50% of the time, while lookup2() failed it.
> - performs better: compiled with -O2 on Core2 Duo, lookup3() 20-40% faster
>   than lookup2() depending on the key length.
> 
> The patch replaces the lookup2() implementation of the 'jhash*' 
> functions with that of lookup3().
> 
> You can read a longer comparison of the two and other hash functions at 
> http://burtleburtle.net/bob/hash/doobs.html.
> 
> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

I support this change and I also don't think it makes sense
to have a jhash3.h new header file, that's just a waste of
time.  Either the new stuff is better or it isn't, and that
whole "matching hashes" argument is pure garbage.

But this kind of change isn't for me to decide, changing jhash
impacts the entire tree not just networking, and also I know
that Rusty Russell was investigating making this change and
he even did his own timings on various platforms.

Therefore this needs to be proposed on linux-kernel with appropriate
parties CC:'d.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/include/linux/jhash.h b/include/linux/jhash.h
index 2a2f99f..2000b9f 100644
--- a/include/linux/jhash.h
+++ b/include/linux/jhash.h
@@ -3,80 +3,95 @@ 
 
 /* jhash.h: Jenkins hash support.
  *
- * Copyright (C) 1996 Bob Jenkins (bob_jenkins@burtleburtle.net)
+ * Copyright (C) 2006. Bob Jenkins (bob_jenkins@burtleburtle.net)
  *
  * http://burtleburtle.net/bob/hash/
  *
  * These are the credits from Bob's sources:
  *
- * lookup2.c, by Bob Jenkins, December 1996, Public Domain.
- * hash(), hash2(), hash3, and mix() are externally useful functions.
- * Routines to test the hash are included if SELF_TEST is defined.
- * You can use this free for any purpose.  It has no warranty.
+ * lookup3.c, by Bob Jenkins, May 2006, Public Domain.
  *
- * Copyright (C) 2003 David S. Miller (davem@redhat.com)
+ * These are functions for producing 32-bit hashes for hash table lookup.
+ * hashword(), hashlittle(), hashlittle2(), hashbig(), mix(), and final() 
+ * are externally useful functions.  Routines to test the hash are included 
+ * if SELF_TEST is defined.  You can use this free for any purpose.  It's in
+ * the public domain.  It has no warranty.
+ *
+ * Copyright (C) 2009 Jozsef Kadlecsik (kadlec@blackhole.kfki.hu)
  *
  * I've modified Bob's hash to be useful in the Linux kernel, and
- * any bugs present are surely my fault.  -DaveM
+ * any bugs present are my fault.  Jozsef
  */
 
-/* NOTE: Arguments are modified. */
-#define __jhash_mix(a, b, c) \
+#define __rot(x,k) (((x)<<(k)) | ((x)>>(32-(k))))
+
+/* __jhash_mix - mix 3 32-bit values reversibly. */
+#define __jhash_mix(a,b,c) \
+{ \
+  a -= c;  a ^= __rot(c, 4);  c += b; \
+  b -= a;  b ^= __rot(a, 6);  a += c; \
+  c -= b;  c ^= __rot(b, 8);  b += a; \
+  a -= c;  a ^= __rot(c,16);  c += b; \
+  b -= a;  b ^= __rot(a,19);  a += c; \
+  c -= b;  c ^= __rot(b, 4);  b += a; \
+}
+
+/* __jhash_final - final mixing of 3 32-bit values (a,b,c) into c */
+#define __jhash_final(a,b,c) \
 { \
-  a -= b; a -= c; a ^= (c>>13); \
-  b -= c; b -= a; b ^= (a<<8); \
-  c -= a; c -= b; c ^= (b>>13); \
-  a -= b; a -= c; a ^= (c>>12);  \
-  b -= c; b -= a; b ^= (a<<16); \
-  c -= a; c -= b; c ^= (b>>5); \
-  a -= b; a -= c; a ^= (c>>3);  \
-  b -= c; b -= a; b ^= (a<<10); \
-  c -= a; c -= b; c ^= (b>>15); \
+  c ^= b; c -= __rot(b,14); \
+  a ^= c; a -= __rot(c,11); \
+  b ^= a; b -= __rot(a,25); \
+  c ^= b; c -= __rot(b,16); \
+  a ^= c; a -= __rot(c,4);  \
+  b ^= a; b -= __rot(a,14); \
+  c ^= b; c -= __rot(b,24); \
 }
 
 /* The golden ration: an arbitrary value */
-#define JHASH_GOLDEN_RATIO	0x9e3779b9
+#define JHASH_GOLDEN_RATIO	0xdeadbeef
 
 /* The most generic version, hashes an arbitrary sequence
  * of bytes.  No alignment or length assumptions are made about
- * the input key.
+ * the input key. The result depends on endianness.
  */
 static inline u32 jhash(const void *key, u32 length, u32 initval)
 {
-	u32 a, b, c, len;
+	u32 a,b,c;
 	const u8 *k = key;
 
-	len = length;
-	a = b = JHASH_GOLDEN_RATIO;
-	c = initval;
-
-	while (len >= 12) {
-		a += (k[0] +((u32)k[1]<<8) +((u32)k[2]<<16) +((u32)k[3]<<24));
-		b += (k[4] +((u32)k[5]<<8) +((u32)k[6]<<16) +((u32)k[7]<<24));
-		c += (k[8] +((u32)k[9]<<8) +((u32)k[10]<<16)+((u32)k[11]<<24));
-
-		__jhash_mix(a,b,c);
+	/* Set up the internal state */
+	a = b = c = JHASH_GOLDEN_RATIO + length + initval;
 
+	/* all but the last block: affect some 32 bits of (a,b,c) */
+	while (length > 12) {
+    		a += (k[0] + ((u32)k[1]<<8) + ((u32)k[2]<<16) + ((u32)k[3]<<24));
+		b += (k[4] + ((u32)k[5]<<8) + ((u32)k[6]<<16) + ((u32)k[7]<<24));
+		c += (k[8] + ((u32)k[9]<<8) + ((u32)k[10]<<16) + ((u32)k[11]<<24));
+		__jhash_mix(a, b, c);
+		length -= 12;
 		k += 12;
-		len -= 12;
 	}
 
-	c += length;
-	switch (len) {
-	case 11: c += ((u32)k[10]<<24);
-	case 10: c += ((u32)k[9]<<16);
-	case 9 : c += ((u32)k[8]<<8);
-	case 8 : b += ((u32)k[7]<<24);
-	case 7 : b += ((u32)k[6]<<16);
-	case 6 : b += ((u32)k[5]<<8);
+	/* last block: affect all 32 bits of (c) */
+	/* all the case statements fall through */
+	switch (length) {
+	case 12: c += (u32)k[11]<<24;
+	case 11: c += (u32)k[10]<<16;
+	case 10: c += (u32)k[9]<<8;
+	case 9 : c += k[8];
+	case 8 : b += (u32)k[7]<<24;
+	case 7 : b += (u32)k[6]<<16;
+	case 6 : b += (u32)k[5]<<8;
 	case 5 : b += k[4];
-	case 4 : a += ((u32)k[3]<<24);
-	case 3 : a += ((u32)k[2]<<16);
-	case 2 : a += ((u32)k[1]<<8);
+	case 4 : a += (u32)k[3]<<24;
+	case 3 : a += (u32)k[2]<<16;
+	case 2 : a += (u32)k[1]<<8;
 	case 1 : a += k[0];
-	};
-
-	__jhash_mix(a,b,c);
+		__jhash_final(a, b, c);
+	case 0 :
+		break;
+	}
 
 	return c;
 }
@@ -86,58 +101,57 @@  static inline u32 jhash(const void *key, u32 length, u32 initval)
  */
 static inline u32 jhash2(const u32 *k, u32 length, u32 initval)
 {
-	u32 a, b, c, len;
+	u32 a, b, c;
 
-	a = b = JHASH_GOLDEN_RATIO;
-	c = initval;
-	len = length;
+	/* Set up the internal state */
+	a = b = c = JHASH_GOLDEN_RATIO + (length<<2) + initval;
 
-	while (len >= 3) {
+	/* handle most of the key */
+	while (length > 3) {
 		a += k[0];
 		b += k[1];
 		c += k[2];
 		__jhash_mix(a, b, c);
-		k += 3; len -= 3;
+		length -= 3;
+		k += 3;
 	}
 
-	c += length * 4;
-
-	switch (len) {
-	case 2 : b += k[1];
-	case 1 : a += k[0];
-	};
-
-	__jhash_mix(a,b,c);
+	/* handle the last 3 u32's */
+	/* all the case statements fall through */ 
+	switch (length) {
+	case 3: c += k[2];
+	case 2: b += k[1];
+	case 1: a += k[0];
+		__jhash_final(a, b, c);
+	case 0:     /* case 0: nothing left to add */
+		break;
+	}
 
 	return c;
 }
 
-
 /* A special ultra-optimized versions that knows they are hashing exactly
  * 3, 2 or 1 word(s).
- *
- * NOTE: In partilar the "c += length; __jhash_mix(a,b,c);" normally
- *       done at the end is not done here.
  */
 static inline u32 jhash_3words(u32 a, u32 b, u32 c, u32 initval)
 {
-	a += JHASH_GOLDEN_RATIO;
-	b += JHASH_GOLDEN_RATIO;
-	c += initval;
+	a += JHASH_GOLDEN_RATIO + initval;
+	b += JHASH_GOLDEN_RATIO + initval;
+	c += JHASH_GOLDEN_RATIO + initval;
 
-	__jhash_mix(a, b, c);
+	__jhash_final(a, b, c);
 
 	return c;
 }
 
 static inline u32 jhash_2words(u32 a, u32 b, u32 initval)
 {
-	return jhash_3words(a, b, 0, initval);
+	return jhash_3words(0, a, b, initval);
 }
 
 static inline u32 jhash_1word(u32 a, u32 initval)
 {
-	return jhash_3words(a, 0, 0, initval);
+	return jhash_3words(0, 0, a, initval);
 }
 
 #endif /* _LINUX_JHASH_H */