Patchwork libcpp - tiny speedup

login
register
mail settings
Submitter Nicola Pero
Date Oct. 22, 2010, 6:17 p.m.
Message ID <1287771464.882432386@192.168.2.230>
Download mbox | patch
Permalink /patch/68895/
State New
Headers show

Comments

Nicola Pero - Oct. 22, 2010, 6:17 p.m.
This patch inlines a libcpp hash function for speed.  The speedup is to tiny (I had to
run 100m get_identifier() calls in a loop to measure an 8% reduction in time) that I'm
not sure it's worth the extra complication of having a static inline function in the headers.

Anyway since I've already done the work, here it is.  If the libcpp maintainers 
like it I can commit it.  If not, I won't be offended - I was going to trash it anyway. ;-)

Thanks

PS: calc_hash was already automatically inlined by the compiler even if not declared 
inline.  I inlined the other call, which the compiler doesn't do automatically probably
because the function to inline is too big.
Dodji Seketeli - Oct. 26, 2010, 4:50 a.m.
Thank you for your bothering doing this, Nicola.

"Nicola Pero" <nicola.pero@meta-innovation.com> a écrit:

> This patch inlines a libcpp hash function for speed.  The speedup is
> to tiny (I had to run 100m get_identifier() calls in a loop to measure
> an 8% reduction in time) that I'm not sure it's worth the extra
> complication of having a static inline function in the headers.

Just to make sure, did you notice 8% of reduction time of the
get_identifier call or 8% of reduction of something bigger like, says,
running the preprocessor on some input? I would guess that having
improvement figures on something like the overall preprocessing time of
some somewhat significant input would be useful.
Tom Tromey - Jan. 6, 2011, 3:16 p.m.
>>>>> "Dodji" == Dodji Seketeli <dodji@seketeli.org> writes:

Dodji> Thank you for your bothering doing this, Nicola.
Dodji> "Nicola Pero" <nicola.pero@meta-innovation.com> a écrit:

Nicola> This patch inlines a libcpp hash function for speed.  The speedup is
Nicola> to tiny (I had to run 100m get_identifier() calls in a loop to measure
Nicola> an 8% reduction in time) that I'm not sure it's worth the extra
Nicola> complication of having a static inline function in the headers.

Dodji> Just to make sure, did you notice 8% of reduction time of the
Dodji> get_identifier call or 8% of reduction of something bigger like, says,
Dodji> running the preprocessor on some input? I would guess that having
Dodji> improvement figures on something like the overall preprocessing time of
Dodji> some somewhat significant input would be useful.

I would also like to know the answer to this.
I am generally in favor of libcpp speedups, even at the cost of making
the code more complicated or harder to maintain.
But, if the speedup is not really visible, it doesn't seem worthwhile.

Tom
Nicola Pero - Jan. 6, 2011, 4:32 p.m.
> Dodji> Thank you for your bothering doing this, Nicola.
> Dodji> "Nicola Pero" <nicola.pero@meta-innovation.com> a écrit:
>
> Nicola> This patch inlines a libcpp hash function for speed.  The  
> speedup is
> Nicola> to tiny (I had to run 100m get_identifier() calls in a loop  
> to measure
> Nicola> an 8% reduction in time) that I'm not sure it's worth the  
> extra
> Nicola> complication of having a static inline function in the  
> headers.
>
> Dodji> Just to make sure, did you notice 8% of reduction time of the
> Dodji> get_identifier call or 8% of reduction of something bigger  
> like, says,
> Dodji> running the preprocessor on some input? I would guess that  
> having
> Dodji> improvement figures on something like the overall  
> preprocessing time of
> Dodji> some somewhat significant input would be useful.
>
> I would also like to know the answer to this.
> I am generally in favor of libcpp speedups, even at the cost of making
> the code more complicated or harder to maintain.
> But, if the speedup is not really visible, it doesn't seem worthwhile.

Apologies for not answering before.

I only saw an 8% speedup in get_identifier().

I suppose that it may percolate down to some speedup in actual  
practice, but if so,
it was small and I didn't have a system setup to properly measure it.   
When I tried
to measure it by compiling actual, real files with my compiler, any  
speedup was lost
in the experimental noise.  But, the compiler was built with --enable- 
checking=yes
(I suppose for speed testing you'd need to build with --enable- 
checking=no) and I
wasn't sure about what files to use for testing (presumably there are  
some particular
types of files/tasks that more intensive utilize get_identifier(); I  
tried some headers
and ObjC files and I couldn't measure any difference).

So I don't have much information but the actual speedup (if any) looks  
very small
(ie, max 0.1% even in the best case ?), and I'd suggest we just ignore  
the patch.

Thanks

Patch

Index: symtab.c
===================================================================
--- symtab.c    (revision 165822)
+++ symtab.c    (working copy)
@@ -30,27 +30,12 @@ 
    intrinsically how to calculate a hash value, and how to compare an
    existing entry with a potential new one.  */
 
-static unsigned int calc_hash (const unsigned char *, size_t);
 static void ht_expand (hash_table *);
 static double approx_sqrt (double);
 
 /* A deleted entry.  */
 #define DELETED ((hashnode) -1)
 
-/* Calculate the hash of the string STR of length LEN.  */
-
-static unsigned int
-calc_hash (const unsigned char *str, size_t len)
-{
-  size_t n = len;
-  unsigned int r = 0;
-
-  while (n--)
-    r = HT_HASHSTEP (r, *str++);
-
-  return HT_HASHFINISH (r, len);
-}
-
 /* Initialize an identifier hashtable.  */
 
 hash_table *
@@ -85,20 +70,7 @@ 
   free (table);
 }
 
-/* Returns the hash entry for the a STR of length LEN.  If that string
-   already exists in the table, returns the existing entry.  If the
-   identifier hasn't been seen before, and INSERT is CPP_NO_INSERT,
-   returns NULL.  Otherwise insert and returns a new entry.  A new
-   string is allocated.  */
 hashnode
-ht_lookup (hash_table *table, const unsigned char *str, size_t len,
-          enum ht_lookup_option insert)
-{
-  return ht_lookup_with_hash (table, str, len, calc_hash (str, len),
-                             insert);
-}
-
-hashnode
 ht_lookup_with_hash (hash_table *table, const unsigned char *str,
                     size_t len, unsigned int hash,
                     enum ht_lookup_option insert)
Index: include/symtab.h
===================================================================
--- include/symtab.h    (revision 165822)
+++ include/symtab.h    (working copy)
@@ -76,14 +76,34 @@ 
 /* Frees all memory associated with a hash table.  */
 extern void ht_destroy (hash_table *);
 
-extern hashnode ht_lookup (hash_table *, const unsigned char *,
-                          size_t, enum ht_lookup_option);
 extern hashnode ht_lookup_with_hash (hash_table *, const unsigned char *,
                                      size_t, unsigned int,
                                      enum ht_lookup_option);
+
 #define HT_HASHSTEP(r, c) ((r) * 67 + ((c) - 113));
 #define HT_HASHFINISH(r, len) ((r) + (len))
 
+/* Returns the hash entry for the a STR of length LEN.  If that string
+   already exists in the table, returns the existing entry.  If the
+   identifier hasn't been seen before, and INSERT is CPP_NO_INSERT,
+   returns NULL.  Otherwise insert and returns a new entry.  A new
+   string is allocated.  */
+static inline hashnode
+ht_lookup (hash_table *table, const unsigned char *str, size_t len,
+          enum ht_lookup_option insert)
+{
+  /* Calculate the hash of the string STR of length LEN.  */
+  size_t n = len;
+  unsigned int r = 0;
+  const unsigned char *s = str;
+
+  while (n--)
+    r = HT_HASHSTEP (r, *s++);
+
+  return ht_lookup_with_hash (table, str, len, HT_HASHFINISH (r, len),
+                             insert);
+}
+
 /* For all nodes in TABLE, make a callback.  The callback takes
    TABLE->PFILE, the node, and a PTR, and the callback sequence stops
    if the callback returns zero.  */

Index: ChangeLog
===================================================================
--- ChangeLog   (revision 165822)
+++ ChangeLog   (working copy)
@@ -1,3 +1,9 @@ 
+2010-10-22  Nicola Pero <nicola.pero@meta-innovation.com>
+
+       * include/symtab.h (ht_lookup): Added as static inline.
+       * symtab.c (calc_hash): Removed.
+       (ht_lookup): Removed.
+       
 2010-10-19  Basile Starynkevitch  <basile@starynkevitch.net>
        * line-map.h (source_location): Remove obsolete comment
        mentioning location_s.