diff mbox

Time profiler - phase 2

Message ID 20131205133454.GA15832@kam.mff.cuni.cz
State New
Headers show

Commit Message

Jan Hubicka Dec. 5, 2013, 1:34 p.m. UTC
> Hello,
>    there are dumps for Inkscape, it looks very well. There are few of
> functions that look like this (wpa cgraph):
> 
> _ZL13resync_activeP19_EgeSelectOneActionii/2604322 (resync_active)
> @0x7f84af42cea0
>   Type: function definition analyzed
>   Visibility: prevailing_def_ironly
>   References:
>   Referring:
>   Read from file: libinkscape.a
>   Availability: local
>   First run: 4422
>   Function flags: executed 47x local
>   Called by: ege_select_one_action_set_active_text/2604300 (0.34 per
> call) (can throw external)
> _ZL21commit_pending_changeP19_EgeSelectOneAction/2604327 (0.16 per
> call) (can throw external)
> _ZL34ege_select_one_action_set_propertyP8_GObjectjPK7_GValueP11_GParamSpec/2604316
> (47x) (0.16 per call) (can throw external)
>   Calls: _ZL13resync_activeP19_EgeSelectOneActionii.part.0/2604456
> (10x) (0.21 per call) (can throw external)
> 
> _ZL13resync_activeP19_EgeSelectOneActionii.part.0/2604456
> (_ZL13resync_activeP19_EgeSelectOneActionii.part.0) @0x7f84af42cd68
>   Type: function definition analyzed
>   Visibility: artificial
>   References: _ZL7signals/2604291 (read)
>   Referring:
>   Read from file: libinkscape.a
>   Availability: local
>   First run: 0
>   Function flags: executed 10x local
>   Called by: _ZL13resync_activeP19_EgeSelectOneActionii/2604322 (10x)
> (0.21 per call) (can throw external)
> 
> First function has a profile (position is correct according to
> valgrind) and second not. Both of them comes from the same object
> file. The problem is that the second one is called according to
> valgrind. What does .part.X means, is it a part of function that was
> separated to a different function? Is there any was these two profiles
> could be merged?

.part.X are functions created by function splitting.  I assume we want to modify
ipa-split.c as follows:

Can you, please, send me the -flto systemtaps for gimp and/or inkscape so we can decide
on the patch? We should enable it earlier in stage3 rather than later.

Honza

Comments

Jan Hubicka Dec. 5, 2013, 1:38 p.m. UTC | #1
> Can you, please, send me the -flto systemtaps for gimp and/or inkscape so we can decide
> on the patch? We should enable it earlier in stage3 rather than later.

I see, the PDF was included within the tar file.  Was this with
-freorder-blocks-and-partition?  If so, the patch is OK.
I still think we should put cold parts of hot/normal function into a subsection different
from unlikely section, but lets handle that incrementally.

Thanks,
Honza
> 
> Honza
Martin Liška Dec. 5, 2013, 7:38 p.m. UTC | #2
Hello,
   thank you for the trick in ipa-split.c. It really helped! I
prepared 2 tests for Inkscape, first was just with my function
reordering pass. And for the second, I enable also
-freorder-blocks-and-partition (note: files ending with _blocks.xxx in
attached tar).

Touched pages:
just reorder functions: 1108
with -freorder-blocks-and-partition: 1120

Please see dumps at:
https://drive.google.com/file/d/0B0pisUJ80pO1R19zSXR6U1Q4NmM/edit?usp=sharing
Note: I put all function to a single section (for easier layout orientation).

If I'm correct, there a small chunk of about 10 functions after the
boundary of untouched functions and a single miss at the end of
binary: __libc_csu_init.
If you look at the graph line, it looks really well.

I will prepare patch for mailing list as a phase 2.

Martin

On 5 December 2013 14:38, Jan Hubicka <hubicka@ucw.cz> wrote:
>> Can you, please, send me the -flto systemtaps for gimp and/or inkscape so we can decide
>> on the patch? We should enable it earlier in stage3 rather than later.
>
> I see, the PDF was included within the tar file.  Was this with
> -freorder-blocks-and-partition?  If so, the patch is OK.
> I still think we should put cold parts of hot/normal function into a subsection different
> from unlikely section, but lets handle that incrementally.
>
> Thanks,
> Honza
>>
>> Honza
Jan Hubicka Dec. 5, 2013, 9:32 p.m. UTC | #3
> Hello,
>    thank you for the trick in ipa-split.c. It really helped! I

Good!, this patch is pre-approved after testing.
> prepared 2 tests for Inkscape, first was just with my function
> reordering pass. And for the second, I enable also
> -freorder-blocks-and-partition (note: files ending with _blocks.xxx in
> attached tar).
> 
> Touched pages:
> just reorder functions: 1108
> with -freorder-blocks-and-partition: 1120
> 
> Please see dumps at:
> https://drive.google.com/file/d/0B0pisUJ80pO1R19zSXR6U1Q4NmM/edit?usp=sharing
> Note: I put all function to a single section (for easier layout orientation).

I think for -freorder-blocks-and-partition you will need to split the sections
again, otherwise we will not see the effect of the pass. Can you, please, make
one extra systemtap with this (it would be good to have both
-fno-reorder-blocks-and-partition and -freorder-blocks-and-partition so we can
compare size of bloth)?
But overall it looks good!

Honza
> 
> If I'm correct, there a small chunk of about 10 functions after the
> boundary of untouched functions and a single miss at the end of
> binary: __libc_csu_init.
> If you look at the graph line, it looks really well.
> 
> I will prepare patch for mailing list as a phase 2.
> 
> Martin
> 
> On 5 December 2013 14:38, Jan Hubicka <hubicka@ucw.cz> wrote:
> >> Can you, please, send me the -flto systemtaps for gimp and/or inkscape so we can decide
> >> on the patch? We should enable it earlier in stage3 rather than later.
> >
> > I see, the PDF was included within the tar file.  Was this with
> > -freorder-blocks-and-partition?  If so, the patch is OK.
> > I still think we should put cold parts of hot/normal function into a subsection different
> > from unlikely section, but lets handle that incrementally.
> >
> > Thanks,
> > Honza
> >>
> >> Honza
diff mbox

Patch

Index: ipa-split.c
===================================================================
--- ipa-split.c (revision 205531)
+++ ipa-split.c (working copy)
@@ -1233,6 +1233,7 @@ 
                                     !split_part_return_p,
                                     split_point->split_bbs,
                                     split_point->entry_bb, "part");
+  node->tp_first_run = cur_node->tp_first_run + 1;
   /* For usual cloning it is enough to clear builtin only when signature
      changes.  For partial inlining we however can not expect the part
      of builtin implementation to have same semantic as the whole.  */