Message ID | 20180902142758.GA7162@gmail.com |
---|---|
State | New |
Headers | show |
Series | x86: Replace hard frame pointer with stack pointer - UNITS_PER_WORD | expand |
Hi, On Sun, 2 Sep 2018, H.J. Lu wrote: > Here is the patch to replace hard frame pointer with stack pointer > - UNITS_PER_WORD in x86 backend. > > OK for trunk? > > +// { dg-options "-O -g -fno-omit-frame-pointer -fvar-tracking" } > + > +struct Foo > +{ > + int bar(int a, int b, int c, int i1, int i2, int i3, int d); > +}; > + > +int Foo::bar(int a, int b, int c, int i1, int i2, int i3, int d) > +{ > + return 0; > +} > diff --git a/gcc/testsuite/g++.dg/pr86593-2.C b/gcc/testsuite/g++.dg/pr86593-2.C > +// { dg-options "-O -g -fno-omit-frame-pointer -fno-var-tracking" } > + > +struct Foo > +{ > + int bar(int a, int b, int c, int i1, int i2, int i3, int d); > +}; > + > +int Foo::bar(int a, int b, int c, int i1, int i2, int i3, int d) > +{ > + return 0; > +} I can't seem to reproduce any ICEs with trunk on the above two testcases (configured for x86_64-linux): % ./gcc/cc1plus -quiet -O -g -fno-omit-frame-pointer -fno-var-tracking x.cc % ./gcc/cc1plus -quiet -O -g -fno-omit-frame-pointer -fvar-tracking x.cc I get the feeling this needs to be fixed somewhere else, namely eliminate_regs. Basically if the frame pointer isn't needed, then why is hard_frame_pointer_rtx even somewhere in the final RTL? The elimination you do explicitely in your new routines should have been done as part of general register elimination. Why isn't it? Ciao, Michael.
On Mon, Sep 3, 2018 at 7:32 AM, Michael Matz <matz@suse.de> wrote: > Hi, > > On Sun, 2 Sep 2018, H.J. Lu wrote: > >> Here is the patch to replace hard frame pointer with stack pointer >> - UNITS_PER_WORD in x86 backend. >> >> OK for trunk? >> >> +// { dg-options "-O -g -fno-omit-frame-pointer -fvar-tracking" } >> + >> +struct Foo >> +{ >> + int bar(int a, int b, int c, int i1, int i2, int i3, int d); >> +}; >> + >> +int Foo::bar(int a, int b, int c, int i1, int i2, int i3, int d) >> +{ >> + return 0; >> +} >> diff --git a/gcc/testsuite/g++.dg/pr86593-2.C b/gcc/testsuite/g++.dg/pr86593-2.C >> +// { dg-options "-O -g -fno-omit-frame-pointer -fno-var-tracking" } >> + >> +struct Foo >> +{ >> + int bar(int a, int b, int c, int i1, int i2, int i3, int d); >> +}; >> + >> +int Foo::bar(int a, int b, int c, int i1, int i2, int i3, int d) >> +{ >> + return 0; >> +} > > I can't seem to reproduce any ICEs with trunk on the above two testcases > (configured for x86_64-linux): > > % ./gcc/cc1plus -quiet -O -g -fno-omit-frame-pointer -fno-var-tracking x.cc > % ./gcc/cc1plus -quiet -O -g -fno-omit-frame-pointer -fvar-tracking x.cc It is because of gcc_assert ((SUPPORTS_STACK_ALIGNMENT && (elim == hard_frame_pointer_rtx <<<<<<<<<<<< It allows hard frame pointer even if frame pointer isn't available. || elim == stack_pointer_rtx)) || elim == (frame_pointer_needed ? hard_frame_pointer_rtx : stack_pointer_rtx)); > I get the feeling this needs to be fixed somewhere else, namely > eliminate_regs. Basically if the frame pointer isn't needed, then why is > hard_frame_pointer_rtx even somewhere in the final RTL? The elimination > you do explicitely in your new routines should have been done as part of > general register elimination. Why isn't it? hard frame pointer isn't referenced directly in the final RTL. It is arg pointer in debug info, which is eliminated by hard frame pointer in based_loc_desc.
Hi, On Mon, 3 Sep 2018, H.J. Lu wrote: > > % ./gcc/cc1plus -quiet -O -g -fno-omit-frame-pointer -fno-var-tracking x.cc > > % ./gcc/cc1plus -quiet -O -g -fno-omit-frame-pointer -fvar-tracking x.cc > > It is because of > > gcc_assert ((SUPPORTS_STACK_ALIGNMENT > && (elim == hard_frame_pointer_rtx <<<<<<<<<<<< So, what's the testcase testing then? Before the patch it doesn't ICE, after the patch it doesn't ICE. What should I look out for so I can see that what the testcase is producing without the patch is wrong? > > I get the feeling this needs to be fixed somewhere else, namely > > eliminate_regs. Basically if the frame pointer isn't needed, then why is > > hard_frame_pointer_rtx even somewhere in the final RTL? The elimination > > you do explicitely in your new routines should have been done as part of > > general register elimination. Why isn't it? > > hard frame pointer isn't referenced directly in the final RTL. It is arg > pointer in debug info, which is eliminated by hard frame pointer in > based_loc_desc. You talking about this, right: /* We only use "frame base" when we're sure we're talking about the post-prologue local stack frame. We do this by *not* running register elimination until this point, and recognizing the special argument pointer and soft frame pointer rtx's. */ if (reg == arg_pointer_rtx || reg == frame_pointer_rtx) { rtx elim = (ira_use_lra_p ? lra_eliminate_regs (reg, VOIDmode, NULL_RTX) : eliminate_regs (reg, VOIDmode, NULL_RTX)); if (elim != reg) { ... So, why would eliminate_regs return hard_frame_pointer_rtx if no frame pointer is desired? Ciao, Michael.
On Mon, Sep 3, 2018 at 8:20 AM, Michael Matz <matz@suse.de> wrote: > Hi, > > On Mon, 3 Sep 2018, H.J. Lu wrote: > >> > % ./gcc/cc1plus -quiet -O -g -fno-omit-frame-pointer -fno-var-tracking x.cc >> > % ./gcc/cc1plus -quiet -O -g -fno-omit-frame-pointer -fvar-tracking x.cc >> >> It is because of >> >> gcc_assert ((SUPPORTS_STACK_ALIGNMENT >> && (elim == hard_frame_pointer_rtx <<<<<<<<<<<< > > So, what's the testcase testing then? Before the patch it doesn't ICE, > after the patch it doesn't ICE. What should I look out for so I can see > that what the testcase is producing without the patch is wrong? Before the patch, debug info is wrong since it uses hard frame pointer which isn't set up for the function. You can do "readelf -w" on .o file to verify the debug info. >> > I get the feeling this needs to be fixed somewhere else, namely >> > eliminate_regs. Basically if the frame pointer isn't needed, then why is >> > hard_frame_pointer_rtx even somewhere in the final RTL? The elimination >> > you do explicitely in your new routines should have been done as part of >> > general register elimination. Why isn't it? >> >> hard frame pointer isn't referenced directly in the final RTL. It is arg >> pointer in debug info, which is eliminated by hard frame pointer in >> based_loc_desc. > > You talking about this, right: > > /* We only use "frame base" when we're sure we're talking about the > post-prologue local stack frame. We do this by *not* running > register elimination until this point, and recognizing the special > argument pointer and soft frame pointer rtx's. */ > if (reg == arg_pointer_rtx || reg == frame_pointer_rtx) > { > rtx elim = (ira_use_lra_p > ? lra_eliminate_regs (reg, VOIDmode, NULL_RTX) > : eliminate_regs (reg, VOIDmode, NULL_RTX)); > > if (elim != reg) > { > ... > > So, why would eliminate_regs return hard_frame_pointer_rtx if no frame > pointer is desired? > Frame pointer was skipped at the last minute in x86_finalize_stack_frame_flags. But eliminate_regs uses the info which was computed when frame pointer was available.
Hi, On Mon, 3 Sep 2018, H.J. Lu wrote: > > So, what's the testcase testing then? Before the patch it doesn't ICE, > > after the patch it doesn't ICE. What should I look out for so I can see > > that what the testcase is producing without the patch is wrong? > > Before the patch, debug info is wrong since it uses hard frame pointer > which isn't set up for the function. You can do "readelf -w" on .o file to > verify the debug info. Yeah, that's what I thought as well, but it's correct: % ./gcc/cc1plus -quiet -O2 -g -fno-omit-frame-pointer -fvar-tracking x.cc % gcc -c x.s % readelf -wfi x.o ... <1><8a>: Abbrev Number: 9 (DW_TAG_subprogram) <8b> DW_AT_specification: <0x3a> <8f> DW_AT_decl_line : 6 <90> DW_AT_decl_column : 5 <91> DW_AT_object_pointer: <0xa7> <95> DW_AT_low_pc : 0x0 <9d> DW_AT_high_pc : 0x3 <a5> DW_AT_frame_base : 1 byte block: 9c (DW_OP_call_frame_cfa) <a7> DW_AT_GNU_all_call_sites: 1 ... <2><fe>: Abbrev Number: 11 (DW_TAG_formal_parameter) <ff> DW_AT_name : d <101> DW_AT_decl_file : 1 <102> DW_AT_decl_line : 6 <103> DW_AT_decl_column : 63 <104> DW_AT_type : <0x78> <108> DW_AT_location : 2 byte block: 91 8 (DW_OP_fbreg: 8) ... DW_CFA_def_cfa: r7 (rsp) ofs 8 DW_CFA_offset: r16 (rip) at cfa-8 DW_CFA_nop DW_CFA_nop ... So, argument 'd' is supposed to be at DW_AT_frame_base + 8, which is %rsp+8+8, aka %rsp+16, which is correct given that it's the eigth argument (including the implicit this parameter). So, can you actually show here what's broken before patch? > > You talking about this, right: > > > > /* We only use "frame base" when we're sure we're talking about the > > post-prologue local stack frame. We do this by *not* running > > register elimination until this point, and recognizing the special > > argument pointer and soft frame pointer rtx's. */ > > if (reg == arg_pointer_rtx || reg == frame_pointer_rtx) > > { > > rtx elim = (ira_use_lra_p > > ? lra_eliminate_regs (reg, VOIDmode, NULL_RTX) > > : eliminate_regs (reg, VOIDmode, NULL_RTX)); > > > > if (elim != reg) > > { > > ... > > > > So, why would eliminate_regs return hard_frame_pointer_rtx if no frame > > pointer is desired? > > Frame pointer was skipped at the last minute in > x86_finalize_stack_frame_flags. But eliminate_regs uses the info which > was computed when frame pointer was available. Let's assume something needs fixing (though I can't reproduce what right now) then I think changing frame_pointer_needed somehow needs to affect calls to {lra_,}eliminate_regs that come afterwards (by e.g. recalculating its info). Everything else is just asking for hacks upon hacks. Ciao, Michael.
On Mon, Sep 3, 2018 at 9:44 AM, Michael Matz <matz@suse.de> wrote: > Hi, > > On Mon, 3 Sep 2018, H.J. Lu wrote: > >> > So, what's the testcase testing then? Before the patch it doesn't ICE, >> > after the patch it doesn't ICE. What should I look out for so I can see >> > that what the testcase is producing without the patch is wrong? >> >> Before the patch, debug info is wrong since it uses hard frame pointer >> which isn't set up for the function. You can do "readelf -w" on .o file to >> verify the debug info. > > Yeah, that's what I thought as well, but it's correct: > > % ./gcc/cc1plus -quiet -O2 -g -fno-omit-frame-pointer -fvar-tracking x.cc > % gcc -c x.s > % readelf -wfi x.o > ... > <1><8a>: Abbrev Number: 9 (DW_TAG_subprogram) > <8b> DW_AT_specification: <0x3a> > <8f> DW_AT_decl_line : 6 > <90> DW_AT_decl_column : 5 > <91> DW_AT_object_pointer: <0xa7> > <95> DW_AT_low_pc : 0x0 > <9d> DW_AT_high_pc : 0x3 > <a5> DW_AT_frame_base : 1 byte block: 9c (DW_OP_call_frame_cfa) > <a7> DW_AT_GNU_all_call_sites: 1 > ... > <2><fe>: Abbrev Number: 11 (DW_TAG_formal_parameter) > <ff> DW_AT_name : d > <101> DW_AT_decl_file : 1 > <102> DW_AT_decl_line : 6 > <103> DW_AT_decl_column : 63 > <104> DW_AT_type : <0x78> > <108> DW_AT_location : 2 byte block: 91 8 (DW_OP_fbreg: 8) > ... > DW_CFA_def_cfa: r7 (rsp) ofs 8 > DW_CFA_offset: r16 (rip) at cfa-8 > DW_CFA_nop > DW_CFA_nop > ... > > So, argument 'd' is supposed to be at DW_AT_frame_base + 8, which is > %rsp+8+8, aka %rsp+16, which is correct given that it's the eigth argument > (including the implicit this parameter). Can we use DW_AT_frame_base when the frame pointer isn't available? If yes, gcc_assert ((SUPPORTS_STACK_ALIGNMENT && (elim == hard_frame_pointer_rtx || elim == stack_pointer_rtx)) || elim == (frame_pointer_needed ? hard_frame_pointer_rtx : stack_pointer_rtx)); should be changed to gcc_assert (elim == hard_frame_pointer_rtx || elim == stack_pointer_rtx); This will also fix: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86593 > So, can you actually show here what's broken before patch? > >> > You talking about this, right: >> > >> > /* We only use "frame base" when we're sure we're talking about the >> > post-prologue local stack frame. We do this by *not* running >> > register elimination until this point, and recognizing the special >> > argument pointer and soft frame pointer rtx's. */ >> > if (reg == arg_pointer_rtx || reg == frame_pointer_rtx) >> > { >> > rtx elim = (ira_use_lra_p >> > ? lra_eliminate_regs (reg, VOIDmode, NULL_RTX) >> > : eliminate_regs (reg, VOIDmode, NULL_RTX)); >> > >> > if (elim != reg) >> > { >> > ... >> > >> > So, why would eliminate_regs return hard_frame_pointer_rtx if no frame >> > pointer is desired? >> >> Frame pointer was skipped at the last minute in >> x86_finalize_stack_frame_flags. But eliminate_regs uses the info which >> was computed when frame pointer was available. > > Let's assume something needs fixing (though I can't reproduce what right > now) then I think changing frame_pointer_needed somehow needs to affect > calls to {lra_,}eliminate_regs that come afterwards (by e.g. recalculating > its info). Everything else is just asking for hacks upon hacks. > The only reference to hard frame pointer is in debug info. Is recomputing eliminate info really necessary?
* H. J. Lu: > With > > commit cd557ff63f388ad27c376d0a225e74d3594a6f9d > Author: hjl <hjl@138bc75d-0d04-0410-961f-82ee72b054a4> > Date: Thu Aug 10 15:29:05 2017 +0000 > > i386: Don't use frame pointer without stack access > > When there is no stack access, there is no need to use frame pointer > even if -fno-omit-frame-pointer is used and caller's frame pointer is > unchanged. > > frame pointer may not be available even if -fno-omit-frame-pointer is I think you should reference a Subversion revision number, not a Git commit hash.
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 8672a666024..8166ea06d42 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -13161,6 +13161,11 @@ ix86_finalize_stack_frame_flags (void) df_compute_regs_ever_live (true); df_analyze (); + /* Replace hard frame pointer with stack pointer in debug + info. */ + cfun->machine->replace_fp_with_sp_in_debug_info + = debug_info_level > DINFO_LEVEL_NONE; + if (flag_var_tracking) { /* Since frame pointer is no longer available, replace it with @@ -41944,6 +41949,99 @@ ix86_seh_fixup_eh_fallthru (void) } } +/* Help function for ix86_replace_fp_with_sp. */ + +static rtx +ix86_replace_fp_with_sp_1 (rtx x) +{ + if (!x) + return x; + + if (x == hard_frame_pointer_rtx) + return plus_constant (Pmode, stack_pointer_rtx, -UNITS_PER_WORD); + + const char *fmt = GET_RTX_FORMAT (GET_CODE (x)); + int i, j; + for (i = GET_RTX_LENGTH (GET_CODE (x)) - 1; i >= 0; i--) + { + if (fmt[i] == 'e') + XEXP (x, i) = ix86_replace_fp_with_sp_1 (XEXP (x, i)); + else if (fmt[i] == 'E') + for (j = XVECLEN (x, i) - 1; j >= 0; j--) + XVECEXP (x, i, j) = ix86_replace_fp_with_sp_1 (XVECEXP (x, i, j)); + } + + return x; +} + +/* Replace hard frame pointer in debug info with stack pointer + - UNITS_PER_WORD. */ + +static void +ix86_replace_fp_with_sp (void) +{ + if (flag_var_tracking) + { + rtx_insn *insn, *next; + rtx replace, note; + for (insn = get_insns (); insn; insn = next) + { + next = NEXT_INSN (insn); + if (NOTE_P (insn)) + { + if (NOTE_KIND (insn) == NOTE_INSN_VAR_LOCATION) + { + replace = PATTERN (insn); + if (reg_mentioned_p (arg_pointer_rtx, replace)) + { + replace = eliminate_regs (replace, VOIDmode, + NULL_RTX); + if (reg_mentioned_p (hard_frame_pointer_rtx, + replace)) + { + ix86_replace_fp_with_sp_1 (replace); + PATTERN (insn) = replace; + } + } + } + } + else if (CALL_P (insn)) + { + note = find_reg_note (insn, REG_CALL_ARG_LOCATION, + NULL_RTX); + if (note && reg_mentioned_p (arg_pointer_rtx, note)) + { + replace = eliminate_regs (XEXP (note, 0), VOIDmode, + NULL_RTX); + if (reg_mentioned_p (hard_frame_pointer_rtx, replace)) + { + ix86_replace_fp_with_sp_1 (replace); + remove_note (insn, note); + add_reg_note (insn, REG_CALL_ARG_LOCATION, + replace); + } + } + } + } + } + + for (tree arg = DECL_ARGUMENTS (current_function_decl); + arg; + arg = TREE_CHAIN (arg)) + { + rtx incoming = DECL_INCOMING_RTL (arg); + if (reg_mentioned_p (arg_pointer_rtx, incoming)) + { + incoming = eliminate_regs (incoming, VOIDmode, NULL_RTX); + if (reg_mentioned_p (hard_frame_pointer_rtx, incoming)) + { + ix86_replace_fp_with_sp_1 (incoming); + DECL_INCOMING_RTL (arg) = incoming; + } + } + } +} + /* Implement machine specific optimizations. We implement padding of returns for K8 CPUs and pass to avoid 4 jumps in the single 16 byte window. */ static void @@ -41956,6 +42054,11 @@ ix86_reorg (void) if (TARGET_SEH && current_function_has_exception_handlers ()) ix86_seh_fixup_eh_fallthru (); + /* Replace hard frame pointer in debug info with stack pointer + - UNITS_PER_WORD after the variable tracking pass. */ + if (cfun->machine->replace_fp_with_sp_in_debug_info) + ix86_replace_fp_with_sp (); + if (optimize && optimize_function_for_speed_p (cfun)) { if (TARGET_PAD_SHORT_FUNCTION) diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h index 2a46fccdec1..23cef7a9c61 100644 --- a/gcc/config/i386/i386.h +++ b/gcc/config/i386/i386.h @@ -2610,6 +2610,10 @@ struct GTY(()) machine_function { /* Nonzero if the function places outgoing arguments on stack. */ BOOL_BITFIELD outgoing_args_on_stack : 1; + /* If true, replace hard frame pointer with stack pointer in debug + info. */ + BOOL_BITFIELD replace_fp_with_sp_in_debug_info : 1; + /* The largest alignment, in bytes, of stack slot actually used. */ unsigned int max_used_stack_alignment; diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c index 77317ed2575..a879ac1f18c 100644 --- a/gcc/dwarf2out.c +++ b/gcc/dwarf2out.c @@ -14327,8 +14327,7 @@ based_loc_descr (rtx reg, poly_int64 offset, { elim = strip_offset_and_add (elim, &offset); gcc_assert ((SUPPORTS_STACK_ALIGNMENT - && (elim == hard_frame_pointer_rtx - || elim == stack_pointer_rtx)) + && elim == stack_pointer_rtx) || elim == (frame_pointer_needed ? hard_frame_pointer_rtx : stack_pointer_rtx)); @@ -20515,8 +20514,7 @@ compute_frame_pointer_to_fb_displacement (poly_int64 offset) frame_pointer_fb_offset, we won't need one either. */ frame_pointer_fb_offset_valid = ((SUPPORTS_STACK_ALIGNMENT - && (elim == hard_frame_pointer_rtx - || elim == stack_pointer_rtx)) + && elim == stack_pointer_rtx) || elim == (frame_pointer_needed ? hard_frame_pointer_rtx : stack_pointer_rtx)); diff --git a/gcc/testsuite/g++.dg/pr86593-1.C b/gcc/testsuite/g++.dg/pr86593-1.C new file mode 100644 index 00000000000..31dc4153cb9 --- /dev/null +++ b/gcc/testsuite/g++.dg/pr86593-1.C @@ -0,0 +1,11 @@ +// { dg-options "-O -g -fno-omit-frame-pointer -fvar-tracking" } + +struct Foo +{ + int bar(int a, int b, int c, int i1, int i2, int i3, int d); +}; + +int Foo::bar(int a, int b, int c, int i1, int i2, int i3, int d) +{ + return 0; +} diff --git a/gcc/testsuite/g++.dg/pr86593-2.C b/gcc/testsuite/g++.dg/pr86593-2.C new file mode 100644 index 00000000000..65aeac98099 --- /dev/null +++ b/gcc/testsuite/g++.dg/pr86593-2.C @@ -0,0 +1,11 @@ +// { dg-options "-O -g -fno-omit-frame-pointer -fno-var-tracking" } + +struct Foo +{ + int bar(int a, int b, int c, int i1, int i2, int i3, int d); +}; + +int Foo::bar(int a, int b, int c, int i1, int i2, int i3, int d) +{ + return 0; +}