Patchwork Generate a label for the split cold function while using -freorder-blocks-and-partition

login
register
mail settings
Submitter Sriraman Tallam
Date April 23, 2013, 10:58 p.m.
Message ID <CAAs8HmxQidaPqi3fFAjtdrK_d2R1d6uwYD=tJqJD8KYLV1yADQ@mail.gmail.com>
Download mbox | patch
Permalink /patch/239022/
State New
Headers show

Comments

Sriraman Tallam - April 23, 2013, 10:58 p.m.
Hi,

  This patch generates labels for cold function parts that are split when
using the option -freorder-blocks-and-partition.  The cold label name
is generated by suffixing ".cold" to the assembler name of the hot
function.

  This is useful when getting back traces from gdb when the cold function
part does get executed.

        * final.c (final_scan_insn): Generate cold label name by suffixing
        ".cold" to function's assembler name.
        * gcc.dg/tree-prof/cold_partition_label.c: New test.

Comments please.

Thanks
Sri
Patch to generate labels for cold function parts that are split when
using the option -freorder-blocks-and-partition.  The cold label name
is generated by suffixing ".cold" to the assembler name of the hot
function.

This is useful when getting back traces from gdb when the cold function
part does get executed. 

	* final.c (final_scan_insn): Generate cold label name by suffixing
	".cold" to function's assembler name.
	* gcc.dg/tree-prof/cold_partition_label.c: New test.
Jakub Jelinek - April 24, 2013, 4:59 a.m.
On Tue, Apr 23, 2013 at 03:58:06PM -0700, Sriraman Tallam wrote:
>   This patch generates labels for cold function parts that are split when
> using the option -freorder-blocks-and-partition.  The cold label name
> is generated by suffixing ".cold" to the assembler name of the hot
> function.
> 
>   This is useful when getting back traces from gdb when the cold function
> part does get executed.
> 
>         * final.c (final_scan_insn): Generate cold label name by suffixing
>         ".cold" to function's assembler name.
>         * gcc.dg/tree-prof/cold_partition_label.c: New test.

This doesn't honor NO_DOT_IN_LABEL (and NO_DOLLAR_IN_LABEL).
Also, don't some function start in cold section and then switch into hot
section?

	Jakub
Teresa Johnson - April 24, 2013, 12:55 p.m.
On Tue, Apr 23, 2013 at 9:59 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Tue, Apr 23, 2013 at 03:58:06PM -0700, Sriraman Tallam wrote:
>>   This patch generates labels for cold function parts that are split when
>> using the option -freorder-blocks-and-partition.  The cold label name
>> is generated by suffixing ".cold" to the assembler name of the hot
>> function.
>>
>>   This is useful when getting back traces from gdb when the cold function
>> part does get executed.
>>
>>         * final.c (final_scan_insn): Generate cold label name by suffixing
>>         ".cold" to function's assembler name.
>>         * gcc.dg/tree-prof/cold_partition_label.c: New test.
>
> This doesn't honor NO_DOT_IN_LABEL (and NO_DOLLAR_IN_LABEL).
> Also, don't some function start in cold section and then switch into hot
> section?

That's a good question - with the current trunk implementation it can,
but I have a patch that fixes this and many other bugs I found when
trying to use -freorder-blocks-and-partition. I had sent the patch out
for review awhile ago but it still needs review. See:

http://gcc.gnu.org/ml/gcc-patches/2012-11/msg01303.html
http://gcc.gnu.org/ml/gcc-patches/2012-11/msg02141.html (most recent
version of patch).

Teresa

>
>         Jakub



--
Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
Steven Bosscher - May 14, 2013, 10:36 p.m.
On Wed, Apr 24, 2013 at 6:59 AM, Jakub Jelinek wrote:
> Also, don't some function start in cold section and then switch into hot
> section?

Yes, this can happen, and there is nothing in the
find_rarely_executed_basic_blocks_and_crossing_edges algorithm to
prevent it. It's not supposed to happen, though, and it is only
possible to trigger the problem if your profile is based on multiple
runs with different test inputs.

Currently the decision about to which partition a basic block is
assigned is based on the return value of probably_never_executed_bb_p,
which looks like this:

static vec<edge>
find_rarely_executed_basic_blocks_and_crossing_edges (void)
{
  ...
  FOR_EACH_BB (bb)
    {
      if (probably_never_executed_bb_p (cfun, bb))
        BB_SET_PARTITION (bb, BB_COLD_PARTITION);
      else
        BB_SET_PARTITION (bb, BB_HOT_PARTITION);
    }
  ....
}

bool
probably_never_executed_bb_p (struct function *fun, const_basic_block bb)
{
  if (profile_info && flag_branch_probabilities)
    return ((bb->count + profile_info->runs / 2) / profile_info->runs) == 0;
   ...
  return false;
}

Consider a test case which has, say, profile_info->runs==6, and a
function in the test case that is only used in one of the runs so that
bb->count==1. In that case, the entry block will be cold, and the
supposed-to-be-imposed rule that a hot region is never dominated by a
cold region is broken. See attached test case with resulting .dot
file.

IMHO this is a bug in the bbpart implementation, and the checking code
I proposed will expose these bugs.

What find_rarely_executed_basic_blocks_and_crossing_edges should do,
is identify hot regions and connect them to the entry block and
(usually) to the exit block. From what I understand from Teresa's
patches, this is what she has implemented.

I don't know if we can trigger this situation with the current test
infrastructure.

Ciao!
Steven


$ cat t.c
#define N 1024*1024*1024

unsigned int a[N];

void __attribute__((__noinline__,__noclone__))
foo (void)
{
  unsigned int i;
  for (i = 0; i < N; i = i + 2)
    a[i] = i % 19;
}

int
main (int argc,
      char *argv[] __attribute__((__unused__)))
{
  if (argc > 1)
    foo ();
  return 0;
}

$ ./xgcc -B. -isystem ./include -O2 -fprofile-generate t.c
$ for i in 1 2 3 4 5 ; do ./a.out ; done
$ ./a.out 1
$ ./xgcc -B. -isystem ./include -O2 -fprofile-use t.c
-fdump-rtl-bbpart{,-graph} -freorder-blocks-and-partition
Steven Bosscher - May 14, 2013, 10:47 p.m.
On Wed, Apr 24, 2013 at 12:58 AM, Sriraman Tallam wrote:
> Hi,
>
>   This patch generates labels for cold function parts that are split when
> using the option -freorder-blocks-and-partition.  The cold label name
> is generated by suffixing ".cold" to the assembler name of the hot
> function.
>
>   This is useful when getting back traces from gdb when the cold function
> part does get executed.

Can you show how this changes your gdb back traces? Can you not
already let GDB inform you that it's in another .text section?

I don't like this label idea much. It seems so arbitrary, unless
you're only interested in knowing when a piece of function from
another .text section is executed. But I'm personally more interested
in the whole function, i.e. also in not-so-hot (or even cold) blocks
that are not in the .text.unlikely section. That means somehow passing
the profile-use information from the compiler to the debugger.

So what I would really like to have instead, is some kind of DWARF
annotations for profile information in the function, with some way in
GDB to inspect and modify the values, and to add break points based on
execution count threshold.

I haven't really thought this through yet, but I have this dream of
having the data of a gcno file encoded in DWARF instead, and to have
profile information included in the FDO-compiled binary to add the
ability to verify that those parts of the program that you expect to
be hot are really also considered hot by the compiler. The only way to
check this now is by inspecting some dumps and that's IMHO not very
practical.

Ciao!
Steven
Teresa Johnson - Nov. 8, 2013, 3:45 p.m.
On Tue, May 14, 2013 at 3:47 PM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
> On Wed, Apr 24, 2013 at 12:58 AM, Sriraman Tallam wrote:
>> Hi,
>>
>>   This patch generates labels for cold function parts that are split when
>> using the option -freorder-blocks-and-partition.  The cold label name
>> is generated by suffixing ".cold" to the assembler name of the hot
>> function.
>>
>>   This is useful when getting back traces from gdb when the cold function
>> part does get executed.
>
> Can you show how this changes your gdb back traces? Can you not
> already let GDB inform you that it's in another .text section?
>
> I don't like this label idea much. It seems so arbitrary, unless
> you're only interested in knowing when a piece of function from
> another .text section is executed. But I'm personally more interested
> in the whole function, i.e. also in not-so-hot (or even cold) blocks
> that are not in the .text.unlikely section. That means somehow passing
> the profile-use information from the compiler to the debugger.
>
> So what I would really like to have instead, is some kind of DWARF
> annotations for profile information in the function, with some way in
> GDB to inspect and modify the values, and to add break points based on
> execution count threshold.
>
> I haven't really thought this through yet, but I have this dream of
> having the data of a gcno file encoded in DWARF instead, and to have
> profile information included in the FDO-compiled binary to add the
> ability to verify that those parts of the program that you expect to
> be hot are really also considered hot by the compiler. The only way to
> check this now is by inspecting some dumps and that's IMHO not very
> practical.
>
> Ciao!
> Steven

I'd like to revisit this old patch from Sri to generate a label for
the split out cold section, now that all the patches to fix the
failures and insanities from -freorder-blocks-and-partition are in
(and I want to send a patch to enable it by default with fdo). The
problem is that the split code doesn't currently have any label to
identify where it came from, so in the final binary it appears to be
part of whatever non-cold function the linker happens to place it
after. This confuses tools like objdump, profilers and the debugger.

For example, from mcf with -freorder-blocks-and-partition:

main:
.LFB40:
        .cfi_startproc
        ...
        jne     .L27           <---- jump to main's text.unlikely
        ...
        .cfi_endproc
        .section        .text.unlikely
        .cfi_startproc
.L27:                           <--- start of main's text.unlikely
        ...

$ objdump -d mcf

0000000000400aa0 <main>:
  ...
  400b1a:       0f 85 1b fb ff ff       jne    40063b
<replace_weaker_arc+0x17b>   <---- jump to main's text.unlikely

00000000004004c0 <replace_weaker_arc>:
  ...
  40063b:       bf 06 8a 49 00          mov    $0x498a06,%edi    <---
start of main's text.unlikely
  ...
  40065e:       e9 0d 05 00 00          jmpq   400b70 <main+0xd0>
<--- jump back to main's hot text


Note that objdump thinks we are jumping to/from another function. If
we end up executing the cold section (e.g. due to non-representative
profiles), profiling tools and gdb are going to think we are executing
replace_weaker_arc. With Sri's patch, this becomes much clearer:

0000000000400aa0 <main>:
  ...
  400b1a:       0f 85 1b fb ff ff       jne    40063b <main.cold>

000000000040063b <main.cold>:
  40063b:       bf 06 8a 49 00          mov    $0x498a06,%edi
  ...
  40065e:       e9 0d 05 00 00          jmpq   400b70 <main+0xd0>


Steven, I think the ideas you describe above about passing profile
information in DWARF and doing some special handling in gdb to use
that seem orthogonal to this.

Given that, is this patch ok for trunk (pending successful re-runs of
regression testing)

Thanks,
Teresa
Steven Bosscher - Nov. 8, 2013, 4:14 p.m.
On Fri, Nov 8, 2013 at 4:45 PM, Teresa Johnson wrote:
> On Tue, May 14, 2013 at 3:47 PM, Steven Bosscher wrote:
> I'd like to revisit this old patch from Sri to generate a label for
> the split out cold section, now that all the patches to fix the
> failures and insanities from -freorder-blocks-and-partition are in
> (and I want to send a patch to enable it by default with fdo). The
> problem is that the split code doesn't currently have any label to
> identify where it came from, so in the final binary it appears to be
> part of whatever non-cold function the linker happens to place it
> after. This confuses tools like objdump, profilers and the debugger.

Surely there are .loc directives in the code to clarify where the code
came from? So at least the debugger should know what function the code
belongs to.

> For example, from mcf with -freorder-blocks-and-partition:
>
> main:
> .LFB40:
>         .cfi_startproc
>         ...
>         jne     .L27           <---- jump to main's text.unlikely
>         ...
>         .cfi_endproc
>         .section        .text.unlikely
>         .cfi_startproc
> .L27:                           <--- start of main's text.unlikely
>         ...
>
> $ objdump -d mcf
>
> 0000000000400aa0 <main>:
>   ...
>   400b1a:       0f 85 1b fb ff ff       jne    40063b
> <replace_weaker_arc+0x17b>   <---- jump to main's text.unlikely
>
> 00000000004004c0 <replace_weaker_arc>:
>   ...
>   40063b:       bf 06 8a 49 00          mov    $0x498a06,%edi    <---
> start of main's text.unlikely
>   ...
>   40065e:       e9 0d 05 00 00          jmpq   400b70 <main+0xd0>
> <--- jump back to main's hot text
>
>
> Note that objdump thinks we are jumping to/from another function. If
> we end up executing the cold section (e.g. due to non-representative
> profiles), profiling tools and gdb are going to think we are executing
> replace_weaker_arc. With Sri's patch, this becomes much clearer:
>
> 0000000000400aa0 <main>:
>   ...
>   400b1a:       0f 85 1b fb ff ff       jne    40063b <main.cold>
>
> 000000000040063b <main.cold>:
>   40063b:       bf 06 8a 49 00          mov    $0x498a06,%edi
>   ...
>   40065e:       e9 0d 05 00 00          jmpq   400b70 <main+0xd0>
>

Does -ffunction-sections help here?


> Steven, I think the ideas you describe above about passing profile
> information in DWARF and doing some special handling in gdb to use
> that seem orthogonal to this.

It's not just about profile information but about what function some
code belongs to. I just cannot believe that there isn't already
something in DWARF to mark address ranges as being part of a given
function.


> Given that, is this patch ok for trunk (pending successful re-runs of
> regression testing)


IMHO, no. This is conceptually wrong. You create artificial named
labels in the compiler that might clash with user labels. Is that
likely to happen? No. But it could and therefore the approach of
adding functionname.cold labels is wrong.

Ciao!
Steven
Teresa Johnson - Nov. 8, 2013, 4:20 p.m.
On Fri, Nov 8, 2013 at 8:14 AM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
> On Fri, Nov 8, 2013 at 4:45 PM, Teresa Johnson wrote:
>> On Tue, May 14, 2013 at 3:47 PM, Steven Bosscher wrote:
>> I'd like to revisit this old patch from Sri to generate a label for
>> the split out cold section, now that all the patches to fix the
>> failures and insanities from -freorder-blocks-and-partition are in
>> (and I want to send a patch to enable it by default with fdo). The
>> problem is that the split code doesn't currently have any label to
>> identify where it came from, so in the final binary it appears to be
>> part of whatever non-cold function the linker happens to place it
>> after. This confuses tools like objdump, profilers and the debugger.
>
> Surely there are .loc directives in the code to clarify where the code
> came from? So at least the debugger should know what function the code
> belongs to.
>
>> For example, from mcf with -freorder-blocks-and-partition:
>>
>> main:
>> .LFB40:
>>         .cfi_startproc
>>         ...
>>         jne     .L27           <---- jump to main's text.unlikely
>>         ...
>>         .cfi_endproc
>>         .section        .text.unlikely
>>         .cfi_startproc
>> .L27:                           <--- start of main's text.unlikely
>>         ...
>>
>> $ objdump -d mcf
>>
>> 0000000000400aa0 <main>:
>>   ...
>>   400b1a:       0f 85 1b fb ff ff       jne    40063b
>> <replace_weaker_arc+0x17b>   <---- jump to main's text.unlikely
>>
>> 00000000004004c0 <replace_weaker_arc>:
>>   ...
>>   40063b:       bf 06 8a 49 00          mov    $0x498a06,%edi    <---
>> start of main's text.unlikely
>>   ...
>>   40065e:       e9 0d 05 00 00          jmpq   400b70 <main+0xd0>
>> <--- jump back to main's hot text
>>
>>
>> Note that objdump thinks we are jumping to/from another function. If
>> we end up executing the cold section (e.g. due to non-representative
>> profiles), profiling tools and gdb are going to think we are executing
>> replace_weaker_arc. With Sri's patch, this becomes much clearer:
>>
>> 0000000000400aa0 <main>:
>>   ...
>>   400b1a:       0f 85 1b fb ff ff       jne    40063b <main.cold>
>>
>> 000000000040063b <main.cold>:
>>   40063b:       bf 06 8a 49 00          mov    $0x498a06,%edi
>>   ...
>>   40065e:       e9 0d 05 00 00          jmpq   400b70 <main+0xd0>
>>
>
> Does -ffunction-sections help here?

No. It just moves the cold section to a different spot (so the
function it appears part of is different, but same issue).

>
>
>> Steven, I think the ideas you describe above about passing profile
>> information in DWARF and doing some special handling in gdb to use
>> that seem orthogonal to this.
>
> It's not just about profile information but about what function some
> code belongs to. I just cannot believe that there isn't already
> something in DWARF to mark address ranges as being part of a given
> function.

Cary, any ideas here?

>
>
>> Given that, is this patch ok for trunk (pending successful re-runs of
>> regression testing)
>
>
> IMHO, no. This is conceptually wrong. You create artificial named
> labels in the compiler that might clash with user labels. Is that
> likely to happen? No. But it could and therefore the approach of
> adding functionname.cold labels is wrong.

Is there a format for compiler-defined labels that would not be able
to clash with other user-generated labels?

Teresa

>
> Ciao!
> Steven
Steven Bosscher - Nov. 8, 2013, 4:22 p.m.
On Fri, Nov 8, 2013 at 4:45 PM, Teresa Johnson wrote:
> For example, from mcf with -freorder-blocks-and-partition:
>
> main:
> .LFB40:
>         .cfi_startproc
>         ...
>         jne     .L27           <---- jump to main's text.unlikely
>         ...
>         .cfi_endproc
>         .section        .text.unlikely
>         .cfi_startproc
> .L27:                           <--- start of main's text.unlikely
>         ...
>
> $ objdump -d mcf
>
> 0000000000400aa0 <main>:
>   ...
>   400b1a:       0f 85 1b fb ff ff       jne    40063b
> <replace_weaker_arc+0x17b>   <---- jump to main's text.unlikely
>
> 00000000004004c0 <replace_weaker_arc>:
>   ...
>   40063b:       bf 06 8a 49 00          mov    $0x498a06,%edi    <---
> start of main's text.unlikely
>   ...
>   40065e:       e9 0d 05 00 00          jmpq   400b70 <main+0xd0>
> <--- jump back to main's hot text
>
>
> Note that objdump thinks we are jumping to/from another function. If
> we end up executing the cold section (e.g. due to non-representative
> profiles), profiling tools and gdb are going to think we are executing
> replace_weaker_arc.

Isn't this something that should be expressed in DWARF with
DW_AT_ranges? See DWARF4, section 2.17,

Does GCC generate such ranges?

Ciao!
Steven
Teresa Johnson - Nov. 12, 2013, 2:49 p.m.
On Fri, Nov 8, 2013 at 8:20 AM, Teresa Johnson <tejohnson@google.com> wrote:
> On Fri, Nov 8, 2013 at 8:14 AM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
>> On Fri, Nov 8, 2013 at 4:45 PM, Teresa Johnson wrote:
>>> On Tue, May 14, 2013 at 3:47 PM, Steven Bosscher wrote:
>>> I'd like to revisit this old patch from Sri to generate a label for
>>> the split out cold section, now that all the patches to fix the
>>> failures and insanities from -freorder-blocks-and-partition are in
>>> (and I want to send a patch to enable it by default with fdo). The
>>> problem is that the split code doesn't currently have any label to
>>> identify where it came from, so in the final binary it appears to be
>>> part of whatever non-cold function the linker happens to place it
>>> after. This confuses tools like objdump, profilers and the debugger.
>>
>> Surely there are .loc directives in the code to clarify where the code
>> came from? So at least the debugger should know what function the code
>> belongs to.
>>
>>> For example, from mcf with -freorder-blocks-and-partition:
>>>
>>> main:
>>> .LFB40:
>>>         .cfi_startproc
>>>         ...
>>>         jne     .L27           <---- jump to main's text.unlikely
>>>         ...
>>>         .cfi_endproc
>>>         .section        .text.unlikely
>>>         .cfi_startproc
>>> .L27:                           <--- start of main's text.unlikely
>>>         ...
>>>
>>> $ objdump -d mcf
>>>
>>> 0000000000400aa0 <main>:
>>>   ...
>>>   400b1a:       0f 85 1b fb ff ff       jne    40063b
>>> <replace_weaker_arc+0x17b>   <---- jump to main's text.unlikely
>>>
>>> 00000000004004c0 <replace_weaker_arc>:
>>>   ...
>>>   40063b:       bf 06 8a 49 00          mov    $0x498a06,%edi    <---
>>> start of main's text.unlikely
>>>   ...
>>>   40065e:       e9 0d 05 00 00          jmpq   400b70 <main+0xd0>
>>> <--- jump back to main's hot text
>>>
>>>
>>> Note that objdump thinks we are jumping to/from another function. If
>>> we end up executing the cold section (e.g. due to non-representative
>>> profiles), profiling tools and gdb are going to think we are executing
>>> replace_weaker_arc. With Sri's patch, this becomes much clearer:
>>>
>>> 0000000000400aa0 <main>:
>>>   ...
>>>   400b1a:       0f 85 1b fb ff ff       jne    40063b <main.cold>
>>>
>>> 000000000040063b <main.cold>:
>>>   40063b:       bf 06 8a 49 00          mov    $0x498a06,%edi
>>>   ...
>>>   40065e:       e9 0d 05 00 00          jmpq   400b70 <main+0xd0>
>>>
>>
>> Does -ffunction-sections help here?
>
> No. It just moves the cold section to a different spot (so the
> function it appears part of is different, but same issue).
>
>>
>>
>>> Steven, I think the ideas you describe above about passing profile
>>> information in DWARF and doing some special handling in gdb to use
>>> that seem orthogonal to this.
>>
>> It's not just about profile information but about what function some
>> code belongs to. I just cannot believe that there isn't already
>> something in DWARF to mark address ranges as being part of a given
>> function.
>
> Cary, any ideas here?

I spoke with Cary a little bit (Cary, feel free to add to this).
Regarding this question and the follow-up email:

> Isn't this something that should be expressed in DWARF with
> DW_AT_ranges? See DWARF4, section 2.17,
>
> Does GCC generate such ranges?

GCC does generate these ranges. However, according to Cary many tools
do not rely on dwarf info for locating the corresponding function
name, they just look at the symbols to identify what function an
address resides in. Nor would we want tools such as objdump and
profilers to rely on dwarf for locating the function names as this
would not work for binaries that were generated without -g options or
had their debug info stripped.

>
>>
>>
>>> Given that, is this patch ok for trunk (pending successful re-runs of
>>> regression testing)
>>
>>
>> IMHO, no. This is conceptually wrong. You create artificial named
>> labels in the compiler that might clash with user labels. Is that
>> likely to happen? No. But it could and therefore the approach of
>> adding functionname.cold labels is wrong.
>
> Is there a format for compiler-defined labels that would not be able
> to clash with other user-generated labels?

My understanding is that the "." in the generated name ensures that it
will not clash with user-generated labels which cannot contain ".". So
this should not be an issue.

Thanks,
Teresa

>
> Teresa
>
>>
>> Ciao!
>> Steven
>
>
>
> --
> Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
Steven Bosscher - Nov. 12, 2013, 3:54 p.m.
On Tue, Nov 12, 2013 at 3:49 PM, Teresa Johnson wrote:
>> Is there a format for compiler-defined labels that would not be able
>> to clash with other user-generated labels?
>
> My understanding is that the "." in the generated name ensures that it
> will not clash with user-generated labels which cannot contain ".". So
> this should not be an issue.

Is that true for all languages? I don't think we can make that
assumption. Also, there may be clashes with other compiler generated
symbols. I know of at least one Fortran compiler that composes names
for module symbols as "modulename.symbolname", and if symbolname is
"cold" then you can have a clash.

Ciao!
Steven
Sriraman Tallam - Nov. 12, 2013, 5:20 p.m.
On Tue, Nov 12, 2013 at 7:54 AM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
> On Tue, Nov 12, 2013 at 3:49 PM, Teresa Johnson wrote:
>>> Is there a format for compiler-defined labels that would not be able
>>> to clash with other user-generated labels?
>>
>> My understanding is that the "." in the generated name ensures that it
>> will not clash with user-generated labels which cannot contain ".". So
>> this should not be an issue.
>
> Is that true for all languages? I don't think we can make that
> assumption. Also, there may be clashes with other compiler generated
> symbols. I know of at least one Fortran compiler that composes names
> for module symbols as "modulename.symbolname", and if symbolname is
> "cold" then you can have a clash.

Can we use a language hook to decide what label is kosher for that
language and use the "." suffixed label for c,c++? Also, other passes
that clone functions seem to be doing something similar, like isra.
Function multiversioning too uses ".<target-name>" to disambiguate asm
names of versioned functions. For c++, c++filt already does a good job
of demangling symbols with "." suffixes. For instance "_Z3foov.cold"
demangles to "foo() [clone .cold]" which seems perfect for our
purposes.

Thanks,
Sri


>
> Ciao!
> Steven
Cary Coutant - Nov. 12, 2013, 6:10 p.m.
>>> Is there a format for compiler-defined labels that would not be able
>>> to clash with other user-generated labels?
>>
>> My understanding is that the "." in the generated name ensures that it
>> will not clash with user-generated labels which cannot contain ".". So
>> this should not be an issue.
>
> Is that true for all languages? I don't think we can make that
> assumption. Also, there may be clashes with other compiler generated
> symbols. I know of at least one Fortran compiler that composes names
> for module symbols as "modulename.symbolname", and if symbolname is
> "cold" then you can have a clash.

GCC already uses "." to generate names for cloned functions, and picks
a different separator for targets where "." is legal (see
clone_function_name in cgraphclones.c). It would probably be a good
idea to use clone_function_name to generate the new name (in
particular, because the C++ demangler already knows how to handle that
style when it's applied to mangled names).

-cary
Cary Coutant - Nov. 12, 2013, 6:14 p.m.
>> Isn't this something that should be expressed in DWARF with
>> DW_AT_ranges? See DWARF4, section 2.17,
>>
>> Does GCC generate such ranges?
>
> GCC does generate these ranges. However, according to Cary many tools
> do not rely on dwarf info for locating the corresponding function
> name, they just look at the symbols to identify what function an
> address resides in. Nor would we want tools such as objdump and
> profilers to rely on dwarf for locating the function names as this
> would not work for binaries that were generated without -g options or
> had their debug info stripped.

Yes, while the information needed is in the DWARF info, I don't think
it's a good idea to depend on having debug info in all binaries. It's
quite common to need to symbolize binaries that don't have debug info,
and without a symbol such as Sri and Teresa are proposing, the result
will be not just an address that didn't get symbolized, but an address
that gets symbolized incorrectly (in a way that will often be quite
misleading).

-cary
Richard Sandiford - Nov. 16, 2013, 10:41 a.m.
Cary Coutant <ccoutant@google.com> writes:
>>> Isn't this something that should be expressed in DWARF with
>>> DW_AT_ranges? See DWARF4, section 2.17,
>>>
>>> Does GCC generate such ranges?
>>
>> GCC does generate these ranges. However, according to Cary many tools
>> do not rely on dwarf info for locating the corresponding function
>> name, they just look at the symbols to identify what function an
>> address resides in. Nor would we want tools such as objdump and
>> profilers to rely on dwarf for locating the function names as this
>> would not work for binaries that were generated without -g options or
>> had their debug info stripped.
>
> Yes, while the information needed is in the DWARF info, I don't think
> it's a good idea to depend on having debug info in all binaries. It's
> quite common to need to symbolize binaries that don't have debug info,
> and without a symbol such as Sri and Teresa are proposing, the result
> will be not just an address that didn't get symbolized, but an address
> that gets symbolized incorrectly (in a way that will often be quite
> misleading).

+1 FWIW.

Another reason is that on MIPS, we could be throwing cold MIPS and
MIPS16/microMIPS code into the same section.  Tools like objdump rely
on symbols to figure out which ISA mode is being used where.

Thanks,
Richard

Patch

Index: final.c
===================================================================
--- final.c	(revision 198212)
+++ final.c	(working copy)
@@ -2101,6 +2101,21 @@  final_scan_insn (rtx insn, FILE *file, int optimiz
 	  targetm.asm_out.function_switched_text_sections (asm_out_file,
 							   current_function_decl,
 							   in_cold_section_p);
+	  /* Emit a label for the split cold section.  Form label name by
+	     suffixing ".cold" to the function's assembler name.  */
+	  if (in_cold_section_p)
+	    {
+	      char *cold_function_name;
+	      const char *mangled_function_name;
+	      tree asm_name = DECL_ASSEMBLER_NAME (current_function_decl);
+
+	      mangled_function_name = IDENTIFIER_POINTER (asm_name);
+	      cold_function_name = XNEWVEC (char,
+	          strlen (mangled_function_name) + strlen (".cold") + 1);
+	      sprintf (cold_function_name, "%s.cold", mangled_function_name);
+	      ASM_OUTPUT_LABEL (asm_out_file, cold_function_name);
+	      XDELETEVEC (cold_function_name);
+	    }
 	  break;
 
 	case NOTE_INSN_BASIC_BLOCK:
Index: testsuite/gcc.dg/tree-prof/cold_partition_label.c
===================================================================
--- testsuite/gcc.dg/tree-prof/cold_partition_label.c	(revision 0)
+++ testsuite/gcc.dg/tree-prof/cold_partition_label.c	(revision 0)
@@ -0,0 +1,39 @@ 
+/* Test case to check if function foo gets split and the cold function
+   gets a label.  */
+/* { dg-require-effective-target freorder } */
+/* { dg-options "-O2 -freorder-blocks-and-partition --save-temps" } */
+
+#define SIZE 10000
+
+const char *sarr[SIZE];
+const char *buf_hot;
+const char *buf_cold;
+
+__attribute__((noinline))
+void 
+foo (int path)
+{
+  int i;
+  if (path)
+    {
+      for (i = 0; i < SIZE; i++)
+	sarr[i] = buf_hot;
+    }
+  else
+    {
+      for (i = 0; i < SIZE; i++)
+	sarr[i] = buf_cold;
+    }
+}
+
+int
+main (int argc, char *argv[])
+{
+  buf_hot =  "hello";
+  buf_cold = "world";
+  foo (argc);
+  return 0;
+}
+
+/* { dg-final-use { scan-assembler "foo.cold" } } */
+/* { dg-final-use { cleanup-saved-temps } } */