Patchwork Patch: speed up compiler a little bit by optimizing lookup_attribute() and is_attribute_p()

login
register
mail settings
Submitter Dimitrios Apostolou
Date June 21, 2011, 12:20 p.m.
Message ID <alpine.LNX.2.02.1106211457130.1102@localhost.localdomain>
Download mbox | patch
Permalink /patch/101280/
State New
Headers show

Comments

Dimitrios Apostolou - June 21, 2011, 12:20 p.m.
FWIW I think that most of the speedup is due to inlining 
lookup_attribute(). I got almost the same by applying only the attached 
very simple patch, since strlen() was called too often (according to the 
profile at [1]). I used the always_inline attribute to avoid using a 
macro.

I was going to send it together with some other changes I'm trying and 
after proper measurements. Anyway, better late than ever. Thanks to 
Christophe Jaillet (CC'd) for pointing it to me.


Thanks,
Dimitris


[1] http://gcc.gnu.org/wiki/OptimisingGCC




On Tue, 21 Jun 2011, Richard Guenther wrote:

> On Tue, Jun 21, 2011 at 12:17 PM, Nicola Pero
> <nicola.pero@meta-innovation.com> wrote:
>>>> This patch speeds up the C/C++/ObjC/ObjC++ compiler a little bit by optimizing
>>>> lookup_attribute() and is_attribute_p().  The main change is that these functions
>>>> are now inline.
>>>
>>> I don't think this is a good idea.
>>
>> Can you explain why ?  You never do in your response :-)
>>
>> I'm guessing it's because you think the inline functions are too big and that's
>> counter-productive ?
>
> Yes.
>
>>> For this case I'd factor out the NULL attribute list check into an
>>> inline function and keeping the non-NULL attribute list pieces out-of-line.
>>
>> Ok ... so is this what you'd like - the common, hot case in the inline function,
>> and the less common, cold case in the out-of-line one ? :-)
>
> Right.
>
>> That makes sense - I'll try that out tonight (it takes a few hours to run all
>> the benchmarks). ;-)
>
> Thanks.
>
>>
>>> Does it work for all languages?  And yes, I agree it would be nice to
>>> canonicalize to the form without _s even in the attribute lists itself.
>>> Or go one step further - do not store an identifier but an enum as we
>>> in fact only ever insert known attributes into the list (which should
>>> be a VEC instead, possibly even sorted to allow binary search).
>>
>> Sure ... one step at a time. :-)
>
> Of course ;)
>
> Richard.
>
>> Thanks
>>
>>
>
Richard Guenther - June 21, 2011, 12:22 p.m.
On Tue, Jun 21, 2011 at 2:20 PM, Dimitrios Apostolou <jimis@gmx.net> wrote:
> FWIW I think that most of the speedup is due to inlining lookup_attribute().
> I got almost the same by applying only the attached very simple patch, since
> strlen() was called too often (according to the profile at [1]). I used the
> always_inline attribute to avoid using a macro.
>
> I was going to send it together with some other changes I'm trying and after
> proper measurements. Anyway, better late than ever. Thanks to Christophe
> Jaillet (CC'd) for pointing it to me.

Don't use always_inline, and also inline the check for no present attributes.

Richard.

>
> Thanks,
> Dimitris
>
>
> [1] http://gcc.gnu.org/wiki/OptimisingGCC
>
>
>
>
> On Tue, 21 Jun 2011, Richard Guenther wrote:
>
>> On Tue, Jun 21, 2011 at 12:17 PM, Nicola Pero
>> <nicola.pero@meta-innovation.com> wrote:
>>>>>
>>>>> This patch speeds up the C/C++/ObjC/ObjC++ compiler a little bit by
>>>>> optimizing
>>>>> lookup_attribute() and is_attribute_p().  The main change is that these
>>>>> functions
>>>>> are now inline.
>>>>
>>>> I don't think this is a good idea.
>>>
>>> Can you explain why ?  You never do in your response :-)
>>>
>>> I'm guessing it's because you think the inline functions are too big and
>>> that's
>>> counter-productive ?
>>
>> Yes.
>>
>>>> For this case I'd factor out the NULL attribute list check into an
>>>> inline function and keeping the non-NULL attribute list pieces
>>>> out-of-line.
>>>
>>> Ok ... so is this what you'd like - the common, hot case in the inline
>>> function,
>>> and the less common, cold case in the out-of-line one ? :-)
>>
>> Right.
>>
>>> That makes sense - I'll try that out tonight (it takes a few hours to run
>>> all
>>> the benchmarks). ;-)
>>
>> Thanks.
>>
>>>
>>>> Does it work for all languages?  And yes, I agree it would be nice to
>>>> canonicalize to the form without _s even in the attribute lists itself.
>>>> Or go one step further - do not store an identifier but an enum as we
>>>> in fact only ever insert known attributes into the list (which should
>>>> be a VEC instead, possibly even sorted to allow binary search).
>>>
>>> Sure ... one step at a time. :-)
>>
>> Of course ;)
>>
>> Richard.
>>
>>> Thanks
>>>
>>>
>
Nicola Pero - June 21, 2011, 1:25 p.m.
Dimitrious

I didn't realize you were working on this.  Your patch is indeed very similar. :-)

Can I go ahead and rewrite mine following Richard's suggestions (which would make it
even more similar to yours), and add your name to the ChangeLog entry too ?

Thanks

On 21 Jun 2011, at 13:20, Dimitrios Apostolou wrote:

> FWIW I think that most of the speedup is due to inlining lookup_attribute(). I got almost the same by applying only the attached very simple patch, since strlen() was called too often (according to the profile at [1]). I used the always_inline attribute to avoid using a macro.
> 
> I was going to send it together with some other changes I'm trying and after proper measurements. Anyway, better late than ever. Thanks to Christophe Jaillet (CC'd) for pointing it to me.
> 
> 
> Thanks,
> Dimitris
> 
> 
> [1] http://gcc.gnu.org/wiki/OptimisingGCC
> 
> 
> 
> 
> On Tue, 21 Jun 2011, Richard Guenther wrote:
> 
>> On Tue, Jun 21, 2011 at 12:17 PM, Nicola Pero
>> <nicola.pero@meta-innovation.com> wrote:
>>>>> This patch speeds up the C/C++/ObjC/ObjC++ compiler a little bit by optimizing
>>>>> lookup_attribute() and is_attribute_p().  The main change is that these functions
>>>>> are now inline.
>>>> 
>>>> I don't think this is a good idea.
>>> 
>>> Can you explain why ?  You never do in your response :-)
>>> 
>>> I'm guessing it's because you think the inline functions are too big and that's
>>> counter-productive ?
>> 
>> Yes.
>> 
>>>> For this case I'd factor out the NULL attribute list check into an
>>>> inline function and keeping the non-NULL attribute list pieces out-of-line.
>>> 
>>> Ok ... so is this what you'd like - the common, hot case in the inline function,
>>> and the less common, cold case in the out-of-line one ? :-)
>> 
>> Right.
>> 
>>> That makes sense - I'll try that out tonight (it takes a few hours to run all
>>> the benchmarks). ;-)
>> 
>> Thanks.
>> 
>>> 
>>>> Does it work for all languages?  And yes, I agree it would be nice to
>>>> canonicalize to the form without _s even in the attribute lists itself.
>>>> Or go one step further - do not store an identifier but an enum as we
>>>> in fact only ever insert known attributes into the list (which should
>>>> be a VEC instead, possibly even sorted to allow binary search).
>>> 
>>> Sure ... one step at a time. :-)
>> 
>> Of course ;)
>> 
>> Richard.
>> 
>>> Thanks
>>> 
>>> 
> <gcc-lookup_attr.patch>
Dimitrios Apostolou - June 21, 2011, 1:39 p.m.
Hi Nicola,

my patch is too simple compared to yours, feel free to work on it as much 
as you wish, no need to credit me since you posted it independantly. I 
just posted it to note that the inlining part is the one providing most 
performance benefit.

richi: I used always_inline because it is the case that you *never* want 
it to be out-of-line, since it's a small wrapper function providing 
important performance benefit. Do you think a macro would be better?


Thanks,
Dimitris

P.S. Nicola, if you remember it, please keep me Cc'd in further posts of 
yours related to performance of gcc


On Tue, 21 Jun 2011, Nicola Pero wrote:
> Dimitrious
>
> I didn't realize you were working on this.  Your patch is indeed very similar. :-)
>
> Can I go ahead and rewrite mine following Richard's suggestions (which would make it
> even more similar to yours), and add your name to the ChangeLog entry too ?
>
> Thanks
Richard Guenther - June 21, 2011, 3:33 p.m.
On Tue, Jun 21, 2011 at 3:39 PM, Dimitrios Apostolou <jimis@gmx.net> wrote:
> Hi Nicola,
>
> my patch is too simple compared to yours, feel free to work on it as much as
> you wish, no need to credit me since you posted it independantly. I just
> posted it to note that the inlining part is the one providing most
> performance benefit.
>
> richi: I used always_inline because it is the case that you *never* want it
> to be out-of-line, since it's a small wrapper function providing important
> performance benefit. Do you think a macro would be better?

Well, just keep it as static inline function.  We can't use always-inline
because we have to support non-GCC host compilers.

Richard.

>
> Thanks,
> Dimitris
>
> P.S. Nicola, if you remember it, please keep me Cc'd in further posts of
> yours related to performance of gcc
>
>
> On Tue, 21 Jun 2011, Nicola Pero wrote:
>>
>> Dimitrious
>>
>> I didn't realize you were working on this.  Your patch is indeed very
>> similar. :-)
>>
>> Can I go ahead and rewrite mine following Richard's suggestions (which
>> would make it
>> even more similar to yours), and add your name to the ChangeLog entry too
>> ?
>>
>> Thanks
>

Patch

--- gcc-4.6.0.orig/gcc/tree.c	2011-03-14 14:20:48.000000000 +0200

+++ gcc-4.6.0-mine/gcc/tree.c	2011-06-21 12:57:00.000000000 +0300

@@ -5230,10 +5230,9 @@  is_attribute_p (const char *attr, const_

    be passed back in if further occurrences are wanted.  */
 
 tree
-lookup_attribute (const char *attr_name, tree list)

+lookup_attribute_len (const char *attr_name, size_t attr_len, tree list)

 {
   tree l;
-  size_t attr_len = strlen (attr_name);

 
   for (l = list; l; l = TREE_CHAIN (l))
     {

--- gcc-4.6.0.orig/gcc/tree.h	2011-03-12 00:38:58.000000000 +0200

+++ gcc-4.6.0-mine/gcc/tree.h	2011-06-21 13:13:02.000000000 +0300

@@ -4369,7 +4369,16 @@  extern int is_attribute_p (const char *,

 /* Given an attribute name and a list of attributes, return the list element
    of the attribute or NULL_TREE if not found.  */
 
-extern tree lookup_attribute (const char *, tree);

+extern tree lookup_attribute_len (const char *, size_t, tree);

+

+/* implemented as inline so that the compiler optimises away the strlen() for

+   known strings*/

+static __attribute__((__always_inline__)) tree

+lookup_attribute (const char *attr_name, tree list)

+{

+  return lookup_attribute_len(attr_name, strlen (attr_name), list);

+}

+

 
 /* Remove any instances of attribute ATTR_NAME in LIST and return the
    modified list.  */