Message ID | 20170905212725.GA2291@tucnak |
---|---|
State | New |
Headers | show |
Series | Fix rs6000 sysv4 -fPIC hot/cold partitioning handling (PR target/81979) | expand |
Hi, On Tue, Sep 05, 2017 at 11:27:25PM +0200, Jakub Jelinek wrote: > On powerpc with sysv4 -fPIC we emit something like > .LCL0: > .long .LCTOC1-.LCF0 > before we start emitting the function, and in the prologue we emit > .LCF0: > and some code. This fails to assemble if the prologue is emitted in a > different partition from the start of the function, as e.g. the following > testcase, where the start of the function is hot, i.e. in .text section, > but the shrink-wrapped prologue is cold, emitted in .text.unlikely section. > .LCL0 is still emitted in the section the function starts, thus .text, and > there is no relocation for subtraction of two symbols in other sections > (the second - operand has to be in the current section so that a PC-relative > relocation can be used). This probably never worked, but is now more > severe, as we enable hot/cold partitioning in GCC 8, where it > has been previously only enabled for -fprofile-use. I wonder if that helps performance at all, for rs6000 anyway... It's is a never-ending source of ICEs though :-( > --- gcc/config/rs6000/rs6000.c.jj 2017-09-04 09:55:28.000000000 +0200 > +++ gcc/config/rs6000/rs6000.c 2017-09-04 16:36:49.033213325 +0200 > @@ -25248,12 +25248,15 @@ get_TOC_alias_set (void) > > /* This returns nonzero if the current function uses the TOC. This is > determined by the presence of (use (unspec ... UNSPEC_TOC)), which > - is generated by the ABI_V4 load_toc_* patterns. */ > + is generated by the ABI_V4 load_toc_* patterns. > + Return 2 instead of 1 if the load_toc_* pattern is in the function > + partition that doesn't start the function. */ > #if TARGET_ELF > static int > uses_TOC (void) > { > rtx_insn *insn; > + int ret = 1; > > for (insn = get_insns (); insn; insn = NEXT_INSN (insn)) { > if (INSN_P (insn)) > @@ -25270,10 +25273,14 @@ uses_TOC (void) > sub = XEXP (sub, 0); > if (GET_CODE (sub) == UNSPEC > && XINT (sub, 1) == UNSPEC_TOC) > - return 1; > + return ret; > } > } > } > + else if (crtl->has_bb_partition > + && NOTE_P (insn) > + && NOTE_KIND (insn) == NOTE_INSN_SWITCH_TEXT_SECTIONS) > + ret = 2; } > return 0; > } > #endif > @@ -33304,14 +33311,20 @@ rs6000_elf_declare_function_name (FILE * > return; > } > > + int uses_toc; > if (DEFAULT_ABI == ABI_V4 > && (TARGET_RELOCATABLE || flag_pic > 1) > && !TARGET_SECURE_PLT > && (!constant_pool_empty_p () || crtl->profile) > - && uses_TOC ()) > + && (uses_toc = uses_TOC ())) > { > char buf[256]; > > + if (uses_toc == 2) > + { > + in_cold_section_p = !in_cold_section_p; > + switch_to_section (current_function_section ()); > + } > (*targetm.asm_out.internal_label) (file, "LCL", rs6000_pic_labelno); > > fprintf (file, "\t.long "); > @@ -33321,6 +33334,11 @@ rs6000_elf_declare_function_name (FILE * > ASM_GENERATE_INTERNAL_LABEL (buf, "LCF", rs6000_pic_labelno); > assemble_name (file, buf); > putc ('\n', file); > + if (uses_toc == 2) > + { > + in_cold_section_p = !in_cold_section_p; > + switch_to_section (current_function_section ()); > + } > } Hrm, does that work if not hot/cold partitioning? Oh, that cannot happen because uses_toc==2. Tricky. Maybe this "switch to the other section" thing should be abstracted out? Messing with in_cold_section_p is a bit dirty. Otherwise looks okay; please add the {} in the first fragment. Thanks, Segher
On Wed, Sep 06, 2017 at 11:10:07AM -0500, Segher Boessenkool wrote: > > for (insn = get_insns (); insn; insn = NEXT_INSN (insn)) > > { > > > if (INSN_P (insn)) > > @@ -25270,10 +25273,14 @@ uses_TOC (void) > > sub = XEXP (sub, 0); > > if (GET_CODE (sub) == UNSPEC > > && XINT (sub, 1) == UNSPEC_TOC) > > - return 1; > > + return ret; > > } > > } > > } > > + else if (crtl->has_bb_partition > > + && NOTE_P (insn) > > + && NOTE_KIND (insn) == NOTE_INSN_SWITCH_TEXT_SECTIONS) > > + ret = 2; > > } Ok. > > + if (uses_toc == 2) I could repeat the crtl->has_bb_partition test here if it made things clearer, but it is redundant with the above. > > + { > > + in_cold_section_p = !in_cold_section_p; > > + switch_to_section (current_function_section ()); > > + } > > (*targetm.asm_out.internal_label) (file, "LCL", rs6000_pic_labelno); > > > > fprintf (file, "\t.long "); > > @@ -33321,6 +33334,11 @@ rs6000_elf_declare_function_name (FILE * > > ASM_GENERATE_INTERNAL_LABEL (buf, "LCF", rs6000_pic_labelno); > > assemble_name (file, buf); > > putc ('\n', file); > > + if (uses_toc == 2) > > + { > > + in_cold_section_p = !in_cold_section_p; > > + switch_to_section (current_function_section ()); > > + } > > } > > Hrm, does that work if not hot/cold partitioning? Oh, that cannot happen > because uses_toc==2. Tricky. > > Maybe this "switch to the other section" thing should be abstracted out? > Messing with in_cold_section_p is a bit dirty. But it reflects the reality, and is what final.c and varasm.c also do. Without changing in_cold_section_p, that flag will be incorrect while inside of the other section. There are no switch_to_* functions except to switch_to_section, and as argument that can use current_function_section which uses the in_cold_section_p flag, or unlikely_text_section which hardcodes true for in cold, or function_section which uses first_function_block_is_cold. Even if we introduced function_other_section that used !first_function_block_is_cold the in_cold_section_p flag would be incorrect there. Jakub
On Wed, Sep 06, 2017 at 06:26:10PM +0200, Jakub Jelinek wrote: > > Maybe this "switch to the other section" thing should be abstracted out? > > Messing with in_cold_section_p is a bit dirty. > > But it reflects the reality, and is what final.c and varasm.c also do. Yes, but those aren't target code :-) I'm suggesting adding a generic switch_from_hot_to_cold_or_the_other_way_around function (but with a better name ;-) ) that just does these same two lines, only not in target code. Seems cleaner to me, less surprising. But, okay either way. Segher
On Wed, Sep 06, 2017 at 11:48:00AM -0500, Segher Boessenkool wrote: > On Wed, Sep 06, 2017 at 06:26:10PM +0200, Jakub Jelinek wrote: > > > Maybe this "switch to the other section" thing should be abstracted out? > > > Messing with in_cold_section_p is a bit dirty. > > > > But it reflects the reality, and is what final.c and varasm.c also do. > > Yes, but those aren't target code :-) > > I'm suggesting adding a generic switch_from_hot_to_cold_or_the_other_way_around > function (but with a better name ;-) ) that just does these same two lines, > only not in target code. Seems cleaner to me, less surprising. Richard, is this generic change ok? 2017-09-07 Jakub Jelinek <jakub@redhat.com> PR target/81979 * output.h (switch_to_other_text_partition): New declaration. * varasm.c (switch_to_other_text_partition): New function. * config/rs6000/rs6000.c (uses_TOC): Return 2 if NOTE_INSN_SWITCH_TEXT_SECTIONS is seen before finding load_toc_* insn. (rs6000_elf_declare_function_name): If uses_TOC returned 2, switch to the other text partition before emitting LCL label and switch back after emitting the word after it. * gcc.dg/pr81979.c: New test. --- gcc/output.h.jj 2017-09-01 09:26:48.000000000 +0200 +++ gcc/output.h 2017-09-07 11:38:48.668090305 +0200 @@ -537,6 +537,7 @@ extern section *mergeable_constant_secti extern section *function_section (tree); extern section *unlikely_text_section (void); extern section *current_function_section (void); +extern void switch_to_other_text_partition (void); /* Return the numbered .ctors.N (if CONSTRUCTOR_P) or .dtors.N (if not) section for PRIORITY. */ --- gcc/varasm.c.jj 2017-09-06 11:09:39.000000000 +0200 +++ gcc/varasm.c 2017-09-07 11:35:34.366442544 +0200 @@ -695,6 +695,16 @@ unlikely_text_section_p (section *sect) return sect == function_section_1 (current_function_decl, true); } +/* Switch to the other function partition (if inside of hot section + into cold section, otherwise into the hot section). */ + +void +switch_to_other_text_partition (void) +{ + in_cold_section_p = !in_cold_section_p; + switch_to_section (current_function_section ()); +} + /* Return the read-only data section associated with function DECL. */ section * --- gcc/config/rs6000/rs6000.c.jj 2017-09-05 23:28:22.238928428 +0200 +++ gcc/config/rs6000/rs6000.c 2017-09-07 11:39:25.781640963 +0200 @@ -25324,32 +25324,41 @@ get_TOC_alias_set (void) /* This returns nonzero if the current function uses the TOC. This is determined by the presence of (use (unspec ... UNSPEC_TOC)), which - is generated by the ABI_V4 load_toc_* patterns. */ + is generated by the ABI_V4 load_toc_* patterns. + Return 2 instead of 1 if the load_toc_* pattern is in the function + partition that doesn't start the function. */ #if TARGET_ELF static int uses_TOC (void) { rtx_insn *insn; + int ret = 1; for (insn = get_insns (); insn; insn = NEXT_INSN (insn)) - if (INSN_P (insn)) - { - rtx pat = PATTERN (insn); - int i; + { + if (INSN_P (insn)) + { + rtx pat = PATTERN (insn); + int i; - if (GET_CODE (pat) == PARALLEL) - for (i = 0; i < XVECLEN (pat, 0); i++) - { - rtx sub = XVECEXP (pat, 0, i); - if (GET_CODE (sub) == USE) - { - sub = XEXP (sub, 0); - if (GET_CODE (sub) == UNSPEC - && XINT (sub, 1) == UNSPEC_TOC) - return 1; - } - } - } + if (GET_CODE (pat) == PARALLEL) + for (i = 0; i < XVECLEN (pat, 0); i++) + { + rtx sub = XVECEXP (pat, 0, i); + if (GET_CODE (sub) == USE) + { + sub = XEXP (sub, 0); + if (GET_CODE (sub) == UNSPEC + && XINT (sub, 1) == UNSPEC_TOC) + return ret; + } + } + } + else if (crtl->has_bb_partition + && NOTE_P (insn) + && NOTE_KIND (insn) == NOTE_INSN_SWITCH_TEXT_SECTIONS) + ret = 2; + } return 0; } #endif @@ -33380,14 +33389,17 @@ rs6000_elf_declare_function_name (FILE * return; } + int uses_toc; if (DEFAULT_ABI == ABI_V4 && (TARGET_RELOCATABLE || flag_pic > 1) && !TARGET_SECURE_PLT && (!constant_pool_empty_p () || crtl->profile) - && uses_TOC ()) + && (uses_toc = uses_TOC ())) { char buf[256]; + if (uses_toc == 2) + switch_to_other_text_partition (); (*targetm.asm_out.internal_label) (file, "LCL", rs6000_pic_labelno); fprintf (file, "\t.long "); @@ -33397,6 +33409,8 @@ rs6000_elf_declare_function_name (FILE * ASM_GENERATE_INTERNAL_LABEL (buf, "LCF", rs6000_pic_labelno); assemble_name (file, buf); putc ('\n', file); + if (uses_toc == 2) + switch_to_other_text_partition (); } ASM_OUTPUT_TYPE_DIRECTIVE (file, name, "function"); --- gcc/testsuite/gcc.dg/pr81979.c.jj 2017-09-07 11:31:15.893566211 +0200 +++ gcc/testsuite/gcc.dg/pr81979.c 2017-09-07 11:31:15.893566211 +0200 @@ -0,0 +1,32 @@ +/* PR target/81979 */ +/* { dg-do link } */ +/* { dg-options "-O2 -w" } */ +/* { dg-additional-options "-fPIC" { target fpic } } */ +/* { dg-additional-options "-freorder-blocks-and-partition" { target freorder } } */ + +int d; + +__attribute__((noinline, noclone)) void +foo (int x) +{ + int c; + while (c < 1) + { + int o; + for (o = 0; o < 4; ++o) + c /= (x != 0) ? 2 : x; + } + + d = 1; + for (;;) + ; +} + +int +main () +{ + asm volatile ("" : : "r" (&d) : "memory"); + foo (d); + asm volatile ("" : : "r" (&d) : "memory"); + return 0; +} Jakub
On Thu, 7 Sep 2017, Jakub Jelinek wrote: > On Wed, Sep 06, 2017 at 11:48:00AM -0500, Segher Boessenkool wrote: > > On Wed, Sep 06, 2017 at 06:26:10PM +0200, Jakub Jelinek wrote: > > > > Maybe this "switch to the other section" thing should be abstracted out? > > > > Messing with in_cold_section_p is a bit dirty. > > > > > > But it reflects the reality, and is what final.c and varasm.c also do. > > > > Yes, but those aren't target code :-) > > > > I'm suggesting adding a generic switch_from_hot_to_cold_or_the_other_way_around > > function (but with a better name ;-) ) that just does these same two lines, > > only not in target code. Seems cleaner to me, less surprising. > > Richard, is this generic change ok? works for me. Thanks, Richard. > 2017-09-07 Jakub Jelinek <jakub@redhat.com> > > PR target/81979 > * output.h (switch_to_other_text_partition): New declaration. > * varasm.c (switch_to_other_text_partition): New function. > * config/rs6000/rs6000.c (uses_TOC): Return 2 if > NOTE_INSN_SWITCH_TEXT_SECTIONS is seen before finding load_toc_* insn. > (rs6000_elf_declare_function_name): If uses_TOC returned 2, switch > to the other text partition before emitting LCL label and switch back > after emitting the word after it. > > * gcc.dg/pr81979.c: New test. > > --- gcc/output.h.jj 2017-09-01 09:26:48.000000000 +0200 > +++ gcc/output.h 2017-09-07 11:38:48.668090305 +0200 > @@ -537,6 +537,7 @@ extern section *mergeable_constant_secti > extern section *function_section (tree); > extern section *unlikely_text_section (void); > extern section *current_function_section (void); > +extern void switch_to_other_text_partition (void); > > /* Return the numbered .ctors.N (if CONSTRUCTOR_P) or .dtors.N (if > not) section for PRIORITY. */ > --- gcc/varasm.c.jj 2017-09-06 11:09:39.000000000 +0200 > +++ gcc/varasm.c 2017-09-07 11:35:34.366442544 +0200 > @@ -695,6 +695,16 @@ unlikely_text_section_p (section *sect) > return sect == function_section_1 (current_function_decl, true); > } > > +/* Switch to the other function partition (if inside of hot section > + into cold section, otherwise into the hot section). */ > + > +void > +switch_to_other_text_partition (void) > +{ > + in_cold_section_p = !in_cold_section_p; > + switch_to_section (current_function_section ()); > +} > + > /* Return the read-only data section associated with function DECL. */ > > section * > --- gcc/config/rs6000/rs6000.c.jj 2017-09-05 23:28:22.238928428 +0200 > +++ gcc/config/rs6000/rs6000.c 2017-09-07 11:39:25.781640963 +0200 > @@ -25324,32 +25324,41 @@ get_TOC_alias_set (void) > > /* This returns nonzero if the current function uses the TOC. This is > determined by the presence of (use (unspec ... UNSPEC_TOC)), which > - is generated by the ABI_V4 load_toc_* patterns. */ > + is generated by the ABI_V4 load_toc_* patterns. > + Return 2 instead of 1 if the load_toc_* pattern is in the function > + partition that doesn't start the function. */ > #if TARGET_ELF > static int > uses_TOC (void) > { > rtx_insn *insn; > + int ret = 1; > > for (insn = get_insns (); insn; insn = NEXT_INSN (insn)) > - if (INSN_P (insn)) > - { > - rtx pat = PATTERN (insn); > - int i; > + { > + if (INSN_P (insn)) > + { > + rtx pat = PATTERN (insn); > + int i; > > - if (GET_CODE (pat) == PARALLEL) > - for (i = 0; i < XVECLEN (pat, 0); i++) > - { > - rtx sub = XVECEXP (pat, 0, i); > - if (GET_CODE (sub) == USE) > - { > - sub = XEXP (sub, 0); > - if (GET_CODE (sub) == UNSPEC > - && XINT (sub, 1) == UNSPEC_TOC) > - return 1; > - } > - } > - } > + if (GET_CODE (pat) == PARALLEL) > + for (i = 0; i < XVECLEN (pat, 0); i++) > + { > + rtx sub = XVECEXP (pat, 0, i); > + if (GET_CODE (sub) == USE) > + { > + sub = XEXP (sub, 0); > + if (GET_CODE (sub) == UNSPEC > + && XINT (sub, 1) == UNSPEC_TOC) > + return ret; > + } > + } > + } > + else if (crtl->has_bb_partition > + && NOTE_P (insn) > + && NOTE_KIND (insn) == NOTE_INSN_SWITCH_TEXT_SECTIONS) > + ret = 2; > + } > return 0; > } > #endif > @@ -33380,14 +33389,17 @@ rs6000_elf_declare_function_name (FILE * > return; > } > > + int uses_toc; > if (DEFAULT_ABI == ABI_V4 > && (TARGET_RELOCATABLE || flag_pic > 1) > && !TARGET_SECURE_PLT > && (!constant_pool_empty_p () || crtl->profile) > - && uses_TOC ()) > + && (uses_toc = uses_TOC ())) > { > char buf[256]; > > + if (uses_toc == 2) > + switch_to_other_text_partition (); > (*targetm.asm_out.internal_label) (file, "LCL", rs6000_pic_labelno); > > fprintf (file, "\t.long "); > @@ -33397,6 +33409,8 @@ rs6000_elf_declare_function_name (FILE * > ASM_GENERATE_INTERNAL_LABEL (buf, "LCF", rs6000_pic_labelno); > assemble_name (file, buf); > putc ('\n', file); > + if (uses_toc == 2) > + switch_to_other_text_partition (); > } > > ASM_OUTPUT_TYPE_DIRECTIVE (file, name, "function"); > --- gcc/testsuite/gcc.dg/pr81979.c.jj 2017-09-07 11:31:15.893566211 +0200 > +++ gcc/testsuite/gcc.dg/pr81979.c 2017-09-07 11:31:15.893566211 +0200 > @@ -0,0 +1,32 @@ > +/* PR target/81979 */ > +/* { dg-do link } */ > +/* { dg-options "-O2 -w" } */ > +/* { dg-additional-options "-fPIC" { target fpic } } */ > +/* { dg-additional-options "-freorder-blocks-and-partition" { target freorder } } */ > + > +int d; > + > +__attribute__((noinline, noclone)) void > +foo (int x) > +{ > + int c; > + while (c < 1) > + { > + int o; > + for (o = 0; o < 4; ++o) > + c /= (x != 0) ? 2 : x; > + } > + > + d = 1; > + for (;;) > + ; > +} > + > +int > +main () > +{ > + asm volatile ("" : : "r" (&d) : "memory"); > + foo (d); > + asm volatile ("" : : "r" (&d) : "memory"); > + return 0; > +} > > > Jakub > >
On Thu, Sep 07, 2017 at 11:46:09AM +0200, Jakub Jelinek wrote: > On Wed, Sep 06, 2017 at 11:48:00AM -0500, Segher Boessenkool wrote: > > On Wed, Sep 06, 2017 at 06:26:10PM +0200, Jakub Jelinek wrote: > > > > Maybe this "switch to the other section" thing should be abstracted out? > > > > Messing with in_cold_section_p is a bit dirty. > > > > > > But it reflects the reality, and is what final.c and varasm.c also do. > > > > Yes, but those aren't target code :-) > > > > I'm suggesting adding a generic switch_from_hot_to_cold_or_the_other_way_around > > function (but with a better name ;-) ) that just does these same two lines, > > only not in target code. Seems cleaner to me, less surprising. > > Richard, is this generic change ok? Thanks Jakub. The rs6000 parts are okay, if I didn't say that yet. Segher > PR target/81979 > * output.h (switch_to_other_text_partition): New declaration. > * varasm.c (switch_to_other_text_partition): New function. > * config/rs6000/rs6000.c (uses_TOC): Return 2 if > NOTE_INSN_SWITCH_TEXT_SECTIONS is seen before finding load_toc_* insn. > (rs6000_elf_declare_function_name): If uses_TOC returned 2, switch > to the other text partition before emitting LCL label and switch back > after emitting the word after it. > > * gcc.dg/pr81979.c: New test.
--- gcc/config/rs6000/rs6000.c.jj 2017-09-04 09:55:28.000000000 +0200 +++ gcc/config/rs6000/rs6000.c 2017-09-04 16:36:49.033213325 +0200 @@ -25248,12 +25248,15 @@ get_TOC_alias_set (void) /* This returns nonzero if the current function uses the TOC. This is determined by the presence of (use (unspec ... UNSPEC_TOC)), which - is generated by the ABI_V4 load_toc_* patterns. */ + is generated by the ABI_V4 load_toc_* patterns. + Return 2 instead of 1 if the load_toc_* pattern is in the function + partition that doesn't start the function. */ #if TARGET_ELF static int uses_TOC (void) { rtx_insn *insn; + int ret = 1; for (insn = get_insns (); insn; insn = NEXT_INSN (insn)) if (INSN_P (insn)) @@ -25270,10 +25273,14 @@ uses_TOC (void) sub = XEXP (sub, 0); if (GET_CODE (sub) == UNSPEC && XINT (sub, 1) == UNSPEC_TOC) - return 1; + return ret; } } } + else if (crtl->has_bb_partition + && NOTE_P (insn) + && NOTE_KIND (insn) == NOTE_INSN_SWITCH_TEXT_SECTIONS) + ret = 2; return 0; } #endif @@ -33304,14 +33311,20 @@ rs6000_elf_declare_function_name (FILE * return; } + int uses_toc; if (DEFAULT_ABI == ABI_V4 && (TARGET_RELOCATABLE || flag_pic > 1) && !TARGET_SECURE_PLT && (!constant_pool_empty_p () || crtl->profile) - && uses_TOC ()) + && (uses_toc = uses_TOC ())) { char buf[256]; + if (uses_toc == 2) + { + in_cold_section_p = !in_cold_section_p; + switch_to_section (current_function_section ()); + } (*targetm.asm_out.internal_label) (file, "LCL", rs6000_pic_labelno); fprintf (file, "\t.long "); @@ -33321,6 +33334,11 @@ rs6000_elf_declare_function_name (FILE * ASM_GENERATE_INTERNAL_LABEL (buf, "LCF", rs6000_pic_labelno); assemble_name (file, buf); putc ('\n', file); + if (uses_toc == 2) + { + in_cold_section_p = !in_cold_section_p; + switch_to_section (current_function_section ()); + } } ASM_OUTPUT_TYPE_DIRECTIVE (file, name, "function"); --- gcc/testsuite/gcc.dg/pr81979.c.jj 2017-09-04 16:49:08.839334897 +0200 +++ gcc/testsuite/gcc.dg/pr81979.c 2017-09-04 16:48:54.000000000 +0200 @@ -0,0 +1,32 @@ +/* PR target/81979 */ +/* { dg-do link } */ +/* { dg-options "-O2 -w" } */ +/* { dg-additional-options "-fPIC" { target fpic } } */ +/* { dg-additional-options "-freorder-blocks-and-partition" { target freorder } } */ + +int d; + +__attribute__((noinline, noclone)) void +foo (int x) +{ + int c; + while (c < 1) + { + int o; + for (o = 0; o < 4; ++o) + c /= (x != 0) ? 2 : x; + } + + d = 1; + for (;;) + ; +} + +int +main () +{ + asm volatile ("" : : "r" (&d) : "memory"); + foo (d); + asm volatile ("" : : "r" (&d) : "memory"); + return 0; +}