diff mbox

Do CRC 4 bits at a time

Message ID 78941ada-34a7-432f-f991-a0002a615797@acm.org
State New
Headers show

Commit Message

Nathan Sidwell April 25, 2017, 4:47 p.m. UTC
Hi,
our current CRC routine processes 1 bit at a time, and permits arbitrary 
numbers of bits from 1 to 32.  However we only ever feed it multiples of 
8 bits to process.

So part of this patch changes the interface to use a crc32_unsigned_n 
worker function, which crcs a N-byte integer.

The other change is to do 4 bits at a time.  This is possible because 
the feedback syndrome is '0x04c11db7', which has the top 5 bits clear. 
But 5's an awkward number to work with, so just go with nibble at a time.

bootstrapped on x86_64-linux, ok for trunk?

nathan

Comments

Richard Biener April 26, 2017, 8:52 a.m. UTC | #1
On Tue, Apr 25, 2017 at 6:47 PM, Nathan Sidwell <nathan@acm.org> wrote:
> Hi,
> our current CRC routine processes 1 bit at a time, and permits arbitrary
> numbers of bits from 1 to 32.  However we only ever feed it multiples of 8
> bits to process.
>
> So part of this patch changes the interface to use a crc32_unsigned_n worker
> function, which crcs a N-byte integer.
>
> The other change is to do 4 bits at a time.  This is possible because the
> feedback syndrome is '0x04c11db7', which has the top 5 bits clear. But 5's
> an awkward number to work with, so just go with nibble at a time.
>
> bootstrapped on x86_64-linux, ok for trunk?

Please use 'inline' rather than 'static inline'.

Did you test the patch produces the same CRCs than before?  Did you do
any performance measurements?

Otherwise looks ok to me.  I wonder why we have this "copy" at all rather
than using libiberties xcrc32?

Thanks,
Richard.

> nathan
>
> --
> Nathan Sidwell
Nathan Sidwell April 26, 2017, 2:04 p.m. UTC | #2
On 04/26/2017 04:52 AM, Richard Biener wrote:
> On Tue, Apr 25, 2017 at 6:47 PM, Nathan Sidwell <nathan@acm.org> wrote:

> Please use 'inline' rather than 'static inline'.

Oh, ok (must have been misled by some exiting static inline somewhere)
> 
> Did you test the patch produces the same CRCs than before?  Did you do
> any performance measurements?

Yes.
1) applied both to a random incoming checksum and random value.  A 
billion iterations showed no differences.

2) 100 million iterations show the new version slightly more than twice 
as fast.

> Otherwise looks ok to me.  I wonder why we have this "copy" at all rather
> than using libiberties xcrc32?

Hm, not entirely sure, I originally introduced crc32_string back in 
2003, which could have used libiberty's (unless perhaps at that time 
things were out of sync, so we didn't have it there?)  For some reason I 
chose not to.

But now, we commonly get the CRC of individual byte values or unsigneds, 
which the xcrc32 interface doesn't do well.  David Li broke it apart to 
make that useful in 2011.

ok with the static inline fix?

nathan
Richard Biener April 26, 2017, 2:25 p.m. UTC | #3
On Wed, Apr 26, 2017 at 4:04 PM, Nathan Sidwell <nathan@acm.org> wrote:
> On 04/26/2017 04:52 AM, Richard Biener wrote:
>>
>> On Tue, Apr 25, 2017 at 6:47 PM, Nathan Sidwell <nathan@acm.org> wrote:
>
>
>> Please use 'inline' rather than 'static inline'.
>
>
> Oh, ok (must have been misled by some exiting static inline somewhere)
>>
>>
>> Did you test the patch produces the same CRCs than before?  Did you do
>> any performance measurements?
>
>
> Yes.
> 1) applied both to a random incoming checksum and random value.  A billion
> iterations showed no differences.
>
> 2) 100 million iterations show the new version slightly more than twice as
> fast.
>
>> Otherwise looks ok to me.  I wonder why we have this "copy" at all rather
>> than using libiberties xcrc32?
>
>
> Hm, not entirely sure, I originally introduced crc32_string back in 2003,
> which could have used libiberty's (unless perhaps at that time things were
> out of sync, so we didn't have it there?)  For some reason I chose not to.
>
> But now, we commonly get the CRC of individual byte values or unsigneds,
> which the xcrc32 interface doesn't do well.  David Li broke it apart to make
> that useful in 2011.

I see.

> ok with the static inline fix?

Ok.

Thanks,
Richard.

> nathan
>
> --
> Nathan Sidwell
diff mbox

Patch

2017-04-25  Nathan Sidwell  <nathan@acm.org>

	* tree.h (crc32_unsigned_n): Declare.
	(crc32_unsigned, crc32_unsigned): Make inline.
	* tree.c (crc32_unsigned_bits): Replace with ...
	(crc32_unsigned_n): ... this.
	(crc32_unsigned, crc32_byte): Remove.
	(crc32_string): Remove unnecessary braces.

Index: tree.c
===================================================================
--- tree.c	(revision 247207)
+++ tree.c	(working copy)
@@ -9611,38 +9611,34 @@  dump_tree_statistics (void)
 
 #define FILE_FUNCTION_FORMAT "_GLOBAL__%s_%s"
 
-/* Generate a crc32 of a byte.  */
+/* Generate a crc32 of the low BYTES bytes of VALUE.  */
 
-static unsigned
-crc32_unsigned_bits (unsigned chksum, unsigned value, unsigned bits)
+unsigned
+crc32_unsigned_n (unsigned chksum, unsigned value, unsigned bytes)
 {
-  unsigned ix;
-
-  for (ix = bits; ix--; value <<= 1)
+  /* This relies on the raw feedback's top 4 bits being zero.  */
+#define FEEDBACK(X) ((X) * 0x04c11db7)
+#define SYNDROME(X) (FEEDBACK ((X) & 1) ^ FEEDBACK ((X) & 2) \
+		     ^ FEEDBACK ((X) & 4) ^ FEEDBACK ((X) & 8))
+  static const unsigned syndromes[16] =
     {
-      unsigned feedback;
-      
-      feedback = (value ^ chksum) & 0x80000000 ? 0x04c11db7 : 0;
-      chksum <<= 1;
-      chksum ^= feedback;
-    }
-  return chksum;
-}
-
-/* Generate a crc32 of a 32-bit unsigned.  */
+      SYNDROME(0x0), SYNDROME(0x1), SYNDROME(0x2), SYNDROME(0x3),
+      SYNDROME(0x4), SYNDROME(0x5), SYNDROME(0x6), SYNDROME(0x7),
+      SYNDROME(0x8), SYNDROME(0x9), SYNDROME(0xa), SYNDROME(0xb),
+      SYNDROME(0xc), SYNDROME(0xd), SYNDROME(0xe), SYNDROME(0xf),
+    };
+#undef FEEDBACK
+#undef SYNDROME
 
-unsigned
-crc32_unsigned (unsigned chksum, unsigned value)
-{
-  return crc32_unsigned_bits (chksum, value, 32);
-}
+  value <<= (32 - bytes * 8);
+  for (unsigned ix = bytes * 2; ix--; value <<= 4)
+    {
+      unsigned feedback = syndromes[((value ^ chksum) >> 28) & 0xf];
 
-/* Generate a crc32 of a byte.  */
+      chksum = (chksum << 4) ^ feedback;
+    }
 
-unsigned
-crc32_byte (unsigned chksum, char byte)
-{
-  return crc32_unsigned_bits (chksum, (unsigned) byte << 24, 8);
+  return chksum;
 }
 
 /* Generate a crc32 of a string.  */
@@ -9651,9 +9647,7 @@  unsigned
 crc32_string (unsigned chksum, const char *string)
 {
   do
-    {
-      chksum = crc32_byte (chksum, *string);
-    }
+    chksum = crc32_byte (chksum, *string);
   while (*string++);
   return chksum;
 }
Index: tree.h
===================================================================
--- tree.h	(revision 247207)
+++ tree.h	(working copy)
@@ -4688,9 +4688,18 @@  inlined_function_outer_scope_p (const_tr
        function_args_iter_next (&(ITER)))
 
 /* In tree.c */
+extern unsigned crc32_unsigned_n (unsigned, unsigned, unsigned);
 extern unsigned crc32_string (unsigned, const char *);
-extern unsigned crc32_byte (unsigned, char);
-extern unsigned crc32_unsigned (unsigned, unsigned);
+static inline unsigned
+crc32_unsigned (unsigned chksum, unsigned value)
+{
+  return crc32_unsigned_n (chksum, value, 4);
+}
+static inline unsigned
+crc32_byte (unsigned chksum, char byte)
+{
+  return crc32_unsigned_n (chksum, byte, 1);
+}
 extern void clean_symbol_name (char *);
 extern tree get_file_function_name (const char *);
 extern tree get_callee_fndecl (const_tree);