diff mbox series

i386: For noreturn functions save at least the bp register if it is used [PR114116]

Message ID Zd2gF9RPOF0z9fIE@tucnak
State New
Headers show
Series i386: For noreturn functions save at least the bp register if it is used [PR114116] | expand

Commit Message

Jakub Jelinek Feb. 27, 2024, 8:40 a.m. UTC
Hi!

As mentioned in the PR, on x86_64 currently a lot of ICEs end up
with crashes in the unwinder like:
during RTL pass: expand
pr114044-2.c: In function ‘foo’:
pr114044-2.c:5:3: internal compiler error: in expand_fn_using_insn, at internal-fn.cc:208
    5 |   __builtin_clzg (a);
      |   ^~~~~~~~~~~~~~~~~~
0x7d9246 expand_fn_using_insn
	../../gcc/internal-fn.cc:208

pr114044-2.c:5:3: internal compiler error: Segmentation fault
0x1554262 crash_signal
	../../gcc/toplev.cc:319
0x2b20320 x86_64_fallback_frame_state
	./md-unwind-support.h:63
0x2b20320 uw_frame_state_for
	../../../libgcc/unwind-dw2.c:1013
0x2b2165d _Unwind_Backtrace
	../../../libgcc/unwind.inc:303
0x2acbd69 backtrace_full
	../../libbacktrace/backtrace.c:127
0x2a32fa6 diagnostic_context::action_after_output(diagnostic_t)
	../../gcc/diagnostic.cc:781
0x2a331bb diagnostic_action_after_output(diagnostic_context*, diagnostic_t)
	../../gcc/diagnostic.h:1002
0x2a331bb diagnostic_context::report_diagnostic(diagnostic_info*)
	../../gcc/diagnostic.cc:1633
0x2a33543 diagnostic_impl
	../../gcc/diagnostic.cc:1767
0x2a33c26 internal_error(char const*, ...)
	../../gcc/diagnostic.cc:2225
0xe232c8 fancy_abort(char const*, int, char const*)
	../../gcc/diagnostic.cc:2336
0x7d9246 expand_fn_using_insn
	../../gcc/internal-fn.cc:208
Segmentation fault (core dumped)

The problem are the PR38534 r14-8470 changes which avoid saving call-saved
registers in noreturn functions.  If such functions ever touch the
bp register but because of the r14-8470 changes don't save it in the
prologue, the caller or any other function in the backtrace uses a frame
pointer and the noreturn function or anything it calls directly or
indirectly calls backtrace, then the unwinder crashes, because bp register
contains some unrelated value, but in the frames which do use frame pointer
CFA is based on the bp register.

In theory this could happen with any other call-saved register, e.g. code
written by hand in assembly with .cfi_* directives could use any other
call-saved register as register into which store the CFA or something
related to that, but in reality at least compiler generated code and usual
assembly probably just making sure bp doesn't contain garbage could be
enough for backtrace purposes.  In the debugger of course it will not be
enough, the values of the arguments etc. can be lost (if DW_CFA_undefined
is emitted) or garbage.

So, I think for noreturn function we should at least save the bp register
if we use it.  If user asks for it using no_callee_saved_registers
attribute, let's honor what is asked for (but then it is up to the user
to make sure e.g. backtrace isn't called from the function or anything it
calls).  As discussed in the PR, whether to save bp or not shouldn't be
based on whether compiling with -g or -g0, because we don't want code
generation changes without/with debugging, it would also break
-fcompare-debug, and users can call backtrace(3), that doesn't use debug
info, just unwind info, even backtrace_symbols{,_fd}(3) don't use debug info
but just looks at dynamic symbol table.

The patch also adds check for no_caller_saved_registers
attribute in the implicit addition of not saving callee saved register
in noreturn functions, because on I think
__attribute__((no_caller_saved_registers, noreturn)) will otherwise
error that no_caller_saved_registers and no_callee_saved_registers
attributes are incompatible (but user didn't specify anything like that).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-02-27  Jakub Jelinek  <jakub@redhat.com>

	PR target/114116
	* config/i386/i386.h (enum call_saved_registers_type): Add
	TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP enumerator.
	* config/i386/i386-options.cc (ix86_set_func_type): Remove
	has_no_callee_saved_registers variable, add no_callee_saved_registers
	instead, initialize it depending on whether it is
	no_callee_saved_registers function or not.  Don't set it if
	no_caller_saved_registers attribute is present.  Adjust users.
	* config/i386/i386.cc (ix86_function_ok_for_sibcall): Handle
	TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP like
	TYPE_NO_CALLEE_SAVED_REGISTERS.
	(ix86_save_reg): Handle TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP.

	* gcc.target/i386/pr38534-1.c: Allow push/pop of bp.
	* gcc.target/i386/pr38534-4.c: Likewise.
	* gcc.target/i386/pr38534-2.c: Likewise.
	* gcc.target/i386/pr38534-3.c: Likewise.
	* gcc.target/i386/pr114097-1.c: Likewise.
	* gcc.target/i386/stack-check-17.c: Expect no pop on ! ia32.


	Jakub

Comments

Richard Biener Feb. 27, 2024, 8:54 a.m. UTC | #1
On Tue, Feb 27, 2024 at 9:42 AM Jakub Jelinek <jakub@redhat.com> wrote:
>
> Hi!
>
> As mentioned in the PR, on x86_64 currently a lot of ICEs end up
> with crashes in the unwinder like:
> during RTL pass: expand
> pr114044-2.c: In function ‘foo’:
> pr114044-2.c:5:3: internal compiler error: in expand_fn_using_insn, at internal-fn.cc:208
>     5 |   __builtin_clzg (a);
>       |   ^~~~~~~~~~~~~~~~~~
> 0x7d9246 expand_fn_using_insn
>         ../../gcc/internal-fn.cc:208
>
> pr114044-2.c:5:3: internal compiler error: Segmentation fault
> 0x1554262 crash_signal
>         ../../gcc/toplev.cc:319
> 0x2b20320 x86_64_fallback_frame_state
>         ./md-unwind-support.h:63
> 0x2b20320 uw_frame_state_for
>         ../../../libgcc/unwind-dw2.c:1013
> 0x2b2165d _Unwind_Backtrace
>         ../../../libgcc/unwind.inc:303
> 0x2acbd69 backtrace_full
>         ../../libbacktrace/backtrace.c:127
> 0x2a32fa6 diagnostic_context::action_after_output(diagnostic_t)
>         ../../gcc/diagnostic.cc:781
> 0x2a331bb diagnostic_action_after_output(diagnostic_context*, diagnostic_t)
>         ../../gcc/diagnostic.h:1002
> 0x2a331bb diagnostic_context::report_diagnostic(diagnostic_info*)
>         ../../gcc/diagnostic.cc:1633
> 0x2a33543 diagnostic_impl
>         ../../gcc/diagnostic.cc:1767
> 0x2a33c26 internal_error(char const*, ...)
>         ../../gcc/diagnostic.cc:2225
> 0xe232c8 fancy_abort(char const*, int, char const*)
>         ../../gcc/diagnostic.cc:2336
> 0x7d9246 expand_fn_using_insn
>         ../../gcc/internal-fn.cc:208
> Segmentation fault (core dumped)
>
> The problem are the PR38534 r14-8470 changes which avoid saving call-saved
> registers in noreturn functions.  If such functions ever touch the
> bp register but because of the r14-8470 changes don't save it in the
> prologue, the caller or any other function in the backtrace uses a frame
> pointer and the noreturn function or anything it calls directly or
> indirectly calls backtrace, then the unwinder crashes, because bp register
> contains some unrelated value, but in the frames which do use frame pointer
> CFA is based on the bp register.
>
> In theory this could happen with any other call-saved register, e.g. code
> written by hand in assembly with .cfi_* directives could use any other
> call-saved register as register into which store the CFA or something
> related to that, but in reality at least compiler generated code and usual
> assembly probably just making sure bp doesn't contain garbage could be
> enough for backtrace purposes.  In the debugger of course it will not be
> enough, the values of the arguments etc. can be lost (if DW_CFA_undefined
> is emitted) or garbage.
>
> So, I think for noreturn function we should at least save the bp register
> if we use it.  If user asks for it using no_callee_saved_registers
> attribute, let's honor what is asked for (but then it is up to the user
> to make sure e.g. backtrace isn't called from the function or anything it
> calls).  As discussed in the PR, whether to save bp or not shouldn't be
> based on whether compiling with -g or -g0, because we don't want code
> generation changes without/with debugging, it would also break
> -fcompare-debug, and users can call backtrace(3), that doesn't use debug
> info, just unwind info, even backtrace_symbols{,_fd}(3) don't use debug info
> but just looks at dynamic symbol table.
>
> The patch also adds check for no_caller_saved_registers
> attribute in the implicit addition of not saving callee saved register
> in noreturn functions, because on I think
> __attribute__((no_caller_saved_registers, noreturn)) will otherwise
> error that no_caller_saved_registers and no_callee_saved_registers
> attributes are incompatible (but user didn't specify anything like that).
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Just to add we've in the past opted to avoid tail-calling abort () and friends
exactly to help debugging.  So treating noreturn functions specially with
respect to caller saves looks inconsistent in this regard - it makes debugging
harder since a lot of info in the calling frame is going to be lost?

I hope we at least avoid that at -O0, possibly also with -Og?

> 2024-02-27  Jakub Jelinek  <jakub@redhat.com>
>
>         PR target/114116
>         * config/i386/i386.h (enum call_saved_registers_type): Add
>         TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP enumerator.
>         * config/i386/i386-options.cc (ix86_set_func_type): Remove
>         has_no_callee_saved_registers variable, add no_callee_saved_registers
>         instead, initialize it depending on whether it is
>         no_callee_saved_registers function or not.  Don't set it if
>         no_caller_saved_registers attribute is present.  Adjust users.
>         * config/i386/i386.cc (ix86_function_ok_for_sibcall): Handle
>         TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP like
>         TYPE_NO_CALLEE_SAVED_REGISTERS.
>         (ix86_save_reg): Handle TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP.
>
>         * gcc.target/i386/pr38534-1.c: Allow push/pop of bp.
>         * gcc.target/i386/pr38534-4.c: Likewise.
>         * gcc.target/i386/pr38534-2.c: Likewise.
>         * gcc.target/i386/pr38534-3.c: Likewise.
>         * gcc.target/i386/pr114097-1.c: Likewise.
>         * gcc.target/i386/stack-check-17.c: Expect no pop on ! ia32.
>
> --- gcc/config/i386/i386.h.jj   2024-01-27 22:59:53.121319813 +0100
> +++ gcc/config/i386/i386.h      2024-02-26 16:33:14.644745362 +0100
> @@ -2730,9 +2730,12 @@ enum call_saved_registers_type
>    /* The current function is a function specified with the "interrupt"
>       or "no_caller_saved_registers" attribute.  */
>    TYPE_NO_CALLER_SAVED_REGISTERS,
> +  /* The current function is a function specified with the
> +     "no_callee_saved_registers" attribute.  */
> +  TYPE_NO_CALLEE_SAVED_REGISTERS,
>    /* The current function is a function specified with the "noreturn"
> -     or "no_callee_saved_registers" attribute.  */
> -  TYPE_NO_CALLEE_SAVED_REGISTERS
> +     attribute.  */
> +  TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP,
>  };
>
>  enum queued_insn_type
> --- gcc/config/i386/i386-options.cc.jj  2024-02-26 16:29:07.471201524 +0100
> +++ gcc/config/i386/i386-options.cc     2024-02-26 16:42:01.730358609 +0100
> @@ -3384,7 +3384,9 @@ ix86_set_func_type (tree fndecl)
>  {
>    /* No need to save and restore callee-saved registers for a noreturn
>       function with nothrow or compiled with -fno-exceptions unless when
> -     compiling with -O0 or -Og.
> +     compiling with -O0 or -Og.  So that backtrace works for those at least
> +     in most cases, save the bp register if it is used, because it often
> +     is used in callers to compute CFA.
>
>       NB: Can't use just TREE_THIS_VOLATILE to check if this is a noreturn
>       function.  The local-pure-const pass turns an interrupt function
> @@ -3394,15 +3396,20 @@ ix86_set_func_type (tree fndecl)
>       function is marked with TREE_THIS_VOLATILE in the IR output, which
>       leads to the incompatible attribute error in LTO1.  Ignore the
>       interrupt function in this case.  */
> -  bool has_no_callee_saved_registers
> -    = ((TREE_THIS_VOLATILE (fndecl)
> -       && !lookup_attribute ("interrupt",
> -                             TYPE_ATTRIBUTES (TREE_TYPE (fndecl)))
> -       && optimize
> -       && !optimize_debug
> -       && (TREE_NOTHROW (fndecl) || !flag_exceptions))
> -       || lookup_attribute ("no_callee_saved_registers",
> -                           TYPE_ATTRIBUTES (TREE_TYPE (fndecl))));
> +  enum call_saved_registers_type no_callee_saved_registers
> +    = TYPE_DEFAULT_CALL_SAVED_REGISTERS;
> +  if (lookup_attribute ("no_callee_saved_registers",
> +                       TYPE_ATTRIBUTES (TREE_TYPE (fndecl))))
> +    no_callee_saved_registers = TYPE_NO_CALLEE_SAVED_REGISTERS;
> +  else if (TREE_THIS_VOLATILE (fndecl)
> +          && optimize
> +          && !optimize_debug
> +          && (TREE_NOTHROW (fndecl) || !flag_exceptions)
> +          && !lookup_attribute ("interrupt",
> +                                TYPE_ATTRIBUTES (TREE_TYPE (fndecl)))
> +          && !lookup_attribute ("no_caller_saved_registers",
> +                                TYPE_ATTRIBUTES (TREE_TYPE (fndecl))))
> +    no_callee_saved_registers = TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP;
>
>    if (cfun->machine->func_type == TYPE_UNKNOWN)
>      {
> @@ -3413,7 +3420,7 @@ ix86_set_func_type (tree fndecl)
>             error_at (DECL_SOURCE_LOCATION (fndecl),
>                       "interrupt and naked attributes are not compatible");
>
> -         if (has_no_callee_saved_registers)
> +         if (no_callee_saved_registers)
>             error_at (DECL_SOURCE_LOCATION (fndecl),
>                       "%qs and %qs attributes are not compatible",
>                       "interrupt", "no_callee_saved_registers");
> @@ -3442,7 +3449,7 @@ ix86_set_func_type (tree fndecl)
>                                 TYPE_ATTRIBUTES (TREE_TYPE (fndecl))))
>             cfun->machine->call_saved_registers
>               = TYPE_NO_CALLER_SAVED_REGISTERS;
> -         if (has_no_callee_saved_registers)
> +         if (no_callee_saved_registers)
>             {
>               if (cfun->machine->call_saved_registers
>                   == TYPE_NO_CALLER_SAVED_REGISTERS)
> @@ -3451,7 +3458,7 @@ ix86_set_func_type (tree fndecl)
>                           "no_caller_saved_registers",
>                           "no_callee_saved_registers");
>               cfun->machine->call_saved_registers
> -               = TYPE_NO_CALLEE_SAVED_REGISTERS;
> +               = no_callee_saved_registers;
>             }
>         }
>      }
> --- gcc/config/i386/i386.cc.jj  2024-02-26 11:12:36.275453513 +0100
> +++ gcc/config/i386/i386.cc     2024-02-26 17:24:50.701602352 +0100
> @@ -984,8 +984,9 @@ ix86_function_ok_for_sibcall (tree decl,
>
>    /* Sibling call isn't OK if callee has no callee-saved registers
>       and the calling function has callee-saved registers.  */
> -  if ((cfun->machine->call_saved_registers
> -       != TYPE_NO_CALLEE_SAVED_REGISTERS)
> +  if (cfun->machine->call_saved_registers != TYPE_NO_CALLEE_SAVED_REGISTERS
> +      && (cfun->machine->call_saved_registers
> +         != TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP)
>        && lookup_attribute ("no_callee_saved_registers",
>                            TYPE_ATTRIBUTES (type)))
>      return false;
> @@ -6649,6 +6650,11 @@ ix86_save_reg (unsigned int regno, bool
>
>      case TYPE_NO_CALLEE_SAVED_REGISTERS:
>        return false;
> +
> +    case TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP:
> +      if (regno != HARD_FRAME_POINTER_REGNUM)
> +       return false;
> +      break;
>      }
>
>    if (regno == REAL_PIC_OFFSET_TABLE_REGNUM
> --- gcc/testsuite/gcc.target/i386/pr38534-1.c.jj        2024-02-01 16:00:37.443735635 +0100
> +++ gcc/testsuite/gcc.target/i386/pr38534-1.c   2024-02-27 08:59:08.421878166 +0100
> @@ -22,5 +22,5 @@ no_return_to_caller (void)
>    while (1);
>  }
>
> -/* { dg-final { scan-assembler-not "push" } } */
> -/* { dg-final { scan-assembler-not "pop" } } */
> +/* { dg-final { scan-assembler-not "push\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
> +/* { dg-final { scan-assembler-not "pop\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
> --- gcc/testsuite/gcc.target/i386/pr38534-4.c.jj        2024-02-01 16:00:37.443735635 +0100
> +++ gcc/testsuite/gcc.target/i386/pr38534-4.c   2024-02-27 09:00:09.869021464 +0100
> @@ -12,7 +12,7 @@ foo (fn_t bar)
>    fn ();
>  }
>
> -/* { dg-final { scan-assembler-not "push" } } */
> -/* { dg-final { scan-assembler-not "pop" } } */
> +/* { dg-final { scan-assembler-not "push\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
> +/* { dg-final { scan-assembler-not "pop\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
>  /* { dg-final { scan-assembler-not "jmp" } } */
>  /* { dg-final { scan-assembler "call\[\\t \]+" } } */
> --- gcc/testsuite/gcc.target/i386/pr38534-2.c.jj        2024-02-01 16:00:37.443735635 +0100
> +++ gcc/testsuite/gcc.target/i386/pr38534-2.c   2024-02-27 08:59:35.723497526 +0100
> @@ -12,7 +12,7 @@ foo (void)
>    fn ();
>  }
>
> -/* { dg-final { scan-assembler-not "push" } } */
> -/* { dg-final { scan-assembler-not "pop" } } */
> +/* { dg-final { scan-assembler-not "push\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
> +/* { dg-final { scan-assembler-not "pop\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
>  /* { dg-final { scan-assembler-not "jmp\[\\t \]+_?bar" } } */
>  /* { dg-final { scan-assembler "call\[\\t \]+_?bar" } } */
> --- gcc/testsuite/gcc.target/i386/pr38534-3.c.jj        2024-02-01 16:00:37.443735635 +0100
> +++ gcc/testsuite/gcc.target/i386/pr38534-3.c   2024-02-27 08:59:49.669303090 +0100
> @@ -13,7 +13,7 @@ foo (void)
>    fn ();
>  }
>
> -/* { dg-final { scan-assembler-not "push" } } */
> -/* { dg-final { scan-assembler-not "pop" } } */
> +/* { dg-final { scan-assembler-not "push\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
> +/* { dg-final { scan-assembler-not "pop\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
>  /* { dg-final { scan-assembler-not "jmp" } } */
>  /* { dg-final { scan-assembler "call\[\\t \]+" } } */
> --- gcc/testsuite/gcc.target/i386/pr114097-1.c.jj       2024-02-26 16:29:07.529200714 +0100
> +++ gcc/testsuite/gcc.target/i386/pr114097-1.c  2024-02-27 09:01:37.876794453 +0100
> @@ -22,5 +22,5 @@ no_return_to_caller (void)
>    while (1);
>  }
>
> -/* { dg-final { scan-assembler-not "push" } } */
> -/* { dg-final { scan-assembler-not "pop" } } */
> +/* { dg-final { scan-assembler-not "push\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
> +/* { dg-final { scan-assembler-not "pop\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
> --- gcc/testsuite/gcc.target/i386/stack-check-17.c.jj   2024-01-27 22:59:53.186318915 +0100
> +++ gcc/testsuite/gcc.target/i386/stack-check-17.c      2024-02-27 09:06:10.805989259 +0100
> @@ -32,5 +32,5 @@ f3 (void)
>     register on ia32 for a noreturn function.  */
>  /* { dg-final { scan-assembler-times "push\[ql\]" 1 { target { ! ia32 } } } }  */
>  /* { dg-final { scan-assembler-times "push\[ql\]" 3 { target ia32 } } }  */
> -/* { dg-final { scan-assembler-times "pop" 1 } } */
> -
> +/* { dg-final { scan-assembler-not "pop" { target { ! ia32 } } } } */
> +/* { dg-final { scan-assembler-times "pop" 1 { target ia32 } } } */
>
>         Jakub
>
Jakub Jelinek Feb. 27, 2024, 9:04 a.m. UTC | #2
On Tue, Feb 27, 2024 at 09:54:45AM +0100, Richard Biener wrote:
> Just to add we've in the past opted to avoid tail-calling abort () and friends
> exactly to help debugging.  So treating noreturn functions specially with
> respect to caller saves looks inconsistent in this regard - it makes debugging
> harder since a lot of info in the calling frame is going to be lost?

I know and am not very happy about the changes either.

> I hope we at least avoid that at -O0, possibly also with -Og?

r14-8495 fixed at least that.

Of course, it can break debugging experience even when the noreturn function
is compiled with -O2 but some or all callers of that are -O0 or -Og.
So, if unlucky and e.g. abort function in glibc gets the
no_callee_saved_registers treatment and actually uses some call saved
registers, backtraces when something aborts will be worse or useless.
And we don't even have any attribute which would tell gcc not to do that.

So sure, another option is just revert the PR38534 changes.

	Jakub
Jakub Jelinek Feb. 27, 2024, 9:13 a.m. UTC | #3
On Tue, Feb 27, 2024 at 10:04:06AM +0100, Jakub Jelinek wrote:
> > I hope we at least avoid that at -O0, possibly also with -Og?
> 
> r14-8495 fixed at least that.
> 
> Of course, it can break debugging experience even when the noreturn function
> is compiled with -O2 but some or all callers of that are -O0 or -Og.
> So, if unlucky and e.g. abort function in glibc gets the
> no_callee_saved_registers treatment and actually uses some call saved
> registers, backtraces when something aborts will be worse or useless.
> And we don't even have any attribute which would tell gcc not to do that.
> 
> So sure, another option is just revert the PR38534 changes.

For __libc_start_main, glibc surely just could use no_callee_saved_registers
attribute, because that is typically the outermost frame in backtrace,
there is no need to save those there.
And for kernel if it really wants it and nothing will use the backtraces,
perhaps the patch wouldn't need to be reverted completely but just guarded
the implicit no_callee_saved_registers treatment of noreturn
functions on -mcmodel=kernel or -fno-asynchronous-unwind-tables.

	Jakub
Richard Biener Feb. 27, 2024, 9:50 a.m. UTC | #4
On Tue, Feb 27, 2024 at 10:13 AM Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Tue, Feb 27, 2024 at 10:04:06AM +0100, Jakub Jelinek wrote:
> > > I hope we at least avoid that at -O0, possibly also with -Og?
> >
> > r14-8495 fixed at least that.
> >
> > Of course, it can break debugging experience even when the noreturn function
> > is compiled with -O2 but some or all callers of that are -O0 or -Og.
> > So, if unlucky and e.g. abort function in glibc gets the
> > no_callee_saved_registers treatment and actually uses some call saved
> > registers, backtraces when something aborts will be worse or useless.
> > And we don't even have any attribute which would tell gcc not to do that.
> >
> > So sure, another option is just revert the PR38534 changes.
>
> For __libc_start_main, glibc surely just could use no_callee_saved_registers
> attribute, because that is typically the outermost frame in backtrace,
> there is no need to save those there.
> And for kernel if it really wants it and nothing will use the backtraces,
> perhaps the patch wouldn't need to be reverted completely but just guarded
> the implicit no_callee_saved_registers treatment of noreturn
> functions on -mcmodel=kernel or -fno-asynchronous-unwind-tables.

I guess the kernel could also add the attribute to appropriate places?  Or
we have a -mno-noreturn-callee-saved-registers

But yeah, I think this is an overall bad change (to make this the default
behavior for noreturn functions).

Richard.

>         Jakub
>
Jakub Jelinek Feb. 27, 2024, 9:55 a.m. UTC | #5
On Tue, Feb 27, 2024 at 10:13:14AM +0100, Jakub Jelinek wrote:
> On Tue, Feb 27, 2024 at 10:04:06AM +0100, Jakub Jelinek wrote:
> > > I hope we at least avoid that at -O0, possibly also with -Og?
> > 
> > r14-8495 fixed at least that.
> > 
> > Of course, it can break debugging experience even when the noreturn function
> > is compiled with -O2 but some or all callers of that are -O0 or -Og.
> > So, if unlucky and e.g. abort function in glibc gets the
> > no_callee_saved_registers treatment and actually uses some call saved
> > registers, backtraces when something aborts will be worse or useless.
> > And we don't even have any attribute which would tell gcc not to do that.
> > 
> > So sure, another option is just revert the PR38534 changes.
> 
> For __libc_start_main, glibc surely just could use no_callee_saved_registers
> attribute, because that is typically the outermost frame in backtrace,
> there is no need to save those there.
> And for kernel if it really wants it and nothing will use the backtraces,
> perhaps the patch wouldn't need to be reverted completely but just guarded
> the implicit no_callee_saved_registers treatment of noreturn
> functions on -mcmodel=kernel or -fno-asynchronous-unwind-tables.

So, let's look at admittedly artificial testcase which could very well
happen in real world code though, just using some meaningful code creating the
register preassure instead of it being created artificially, noipa attribute
just a replacement for functions in different TUs or too large to be
inlinable, and the noreturn function in a different library where everything
is built with -O2, not -O0 or -Og for debugging.
Note, it doesn't have to be even abort, can be exit too.

I've compiled this with -Og -g with gcc 12, trunk and trunk patched with the
patch I've posted, in each case I run it under debugger, type run and bt
when it aborts.
The gcc 12 case is the expected one:
#0  0x00007ffff7dbd765 in abort () from /lib64/libc.so.6
#1  0x00000000004011ca in bar () at /tmp/1.c:30
#2  0x00000000004011f1 in baz (a=a@entry=42, b=b@entry=43, c=c@entry=44, d=d@entry=45, e=e@entry=46, f=f@entry=47, g=48, h=49) at /tmp/1.c:38
#3  0x00000000004012d8 in qux () at /tmp/1.c:55
#4  0x0000000000401319 in main () at /tmp/1.c:62
The gcc trunk hits the backtrace not possible problem because rbp is
clobbered and needed in upper frame CFA computation:
#0  0x00007ffff7dbd765 in abort () from /lib64/libc.so.6
#1  0x00000000004011b0 in bar () at /tmp/1.c:30
#2  0x00000000004011d1 in baz (a=<error reading variable: Cannot access memory at address 0xdeadbebb>, b=<error reading variable: Cannot access memory at address 0xdeadbeb7>, 
    c=<error reading variable: Cannot access memory at address 0xdeadbeb3>, d=d@entry=-559038737, e=e@entry=-559038737, f=f@entry=-559038737, g=48, h=49) at /tmp/1.c:38
#3  0x00000000004012a9 in qux () at /tmp/1.c:55
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
And in the patched gcc backtrace works but many of the values are bogus:
#0  0x00007ffff7dbd765 in abort () from /lib64/libc.so.6
#1  0x00000000004011b1 in bar () at /tmp/1.c:30
#2  0x00000000004011d2 in baz (a=a@entry=42, b=b@entry=43, c=c@entry=44, d=d@entry=-559038737, e=e@entry=-559038737, f=f@entry=-559038737, g=48, h=49) at /tmp/1.c:38
#3  0x00000000004012aa in qux () at /tmp/1.c:55
#4  0x00000000004012e4 in main () at /tmp/1.c:62

extern void abort (void);
volatile unsigned v = 0xdeadbeefU;
int w;

__attribute__((noipa)) void
corge (char *p)
{
  (void) p;
}

__attribute__((noipa)) int
foo (int x)
{
  return x;
}

__attribute__((noipa, noreturn, optimize (2))) void
bar (void)
{
  unsigned a = v;
  unsigned b = v;
  unsigned c = v;
  unsigned d = v;
  unsigned e = v;
  unsigned f = v;
  unsigned g = v;
  unsigned h = v;
  int i = foo (50);
  v = a + b + c + d + e + f + g + h;
  abort ();
}

__attribute__((noipa)) void
baz (int a, int b, int c, int d, int e, int f, int g, int h)
{
  int i = foo (51);
  if (w)
    bar ();
}

__attribute__((noipa)) void
qux (void)
{
  int a = foo (42);
  int b = foo (43);
  int c = foo (44);
  int d = foo (45);
  int e = foo (46);
  int f = foo (47);
  int g = foo (48);
  int h = foo (49);
  corge (__builtin_alloca (foo (52)));
  baz (a, b, c, d, e, f, g, h);
  w++;
  baz (a, b, c, d, e, f, g, h);
  baz (a, b, c, d, e, f, g, h);
}

int
main ()
{
  qux ();
}


	Jakub
Jakub Jelinek Feb. 27, 2024, 12:09 p.m. UTC | #6
On Tue, Feb 27, 2024 at 10:13:14AM +0100, Jakub Jelinek wrote:
> For __libc_start_main, glibc surely just could use no_callee_saved_registers
> attribute, because that is typically the outermost frame in backtrace,
> there is no need to save those there.
> And for kernel if it really wants it and nothing will use the backtraces,
> perhaps the patch wouldn't need to be reverted completely but just guarded
> the implicit no_callee_saved_registers treatment of noreturn
> functions on -mcmodel=kernel or -fno-asynchronous-unwind-tables.

Guarding on -fno-asynchronous-unwind-tables isn't a good idea,
with just -g we emit in that case unwind info in .debug_frame section
and even that shouldn't break, and we shouldn't generate different code for
-g vs. -g0.
The problem with the changes is that it breaks the unwinding and debugging
experience not just in the functions on which the optimization triggers,
but on all functions in the backtrace as well.

So, IMHO either revert the changes altogether, or guard on -mcmodel=kernel
(but talk to kernel people on linux-toolchains if that is what they actually
want).

	Jakub
Michael Matz Feb. 29, 2024, 4:25 p.m. UTC | #7
Hello,

On Tue, 27 Feb 2024, Jakub Jelinek wrote:

> On Tue, Feb 27, 2024 at 10:13:14AM +0100, Jakub Jelinek wrote:
> > For __libc_start_main, glibc surely just could use no_callee_saved_registers
> > attribute, because that is typically the outermost frame in backtrace,
> > there is no need to save those there.
> > And for kernel if it really wants it and nothing will use the backtraces,
> > perhaps the patch wouldn't need to be reverted completely but just guarded
> > the implicit no_callee_saved_registers treatment of noreturn
> > functions on -mcmodel=kernel or -fno-asynchronous-unwind-tables.
> 
> Guarding on -fno-asynchronous-unwind-tables isn't a good idea,
> with just -g we emit in that case unwind info in .debug_frame section
> and even that shouldn't break, and we shouldn't generate different code for
> -g vs. -g0.
> The problem with the changes is that it breaks the unwinding and debugging
> experience not just in the functions on which the optimization triggers,
> but on all functions in the backtrace as well.
> 
> So, IMHO either revert the changes altogether, or guard on -mcmodel=kernel
> (but talk to kernel people on linux-toolchains if that is what they actually
> want).

What is the underlying real purpose of the changes anyway?  It's a 
nano-optimization: for functions to be called multiple times 
they must either return or be recursive.  The latter is not very likely, 
so a noreturn function is called only once in the vast majority of cases.  
Any optimizations that diddle with the frame setup code for functions 
called only once seems to be ... well, not so very useful, especially so 
when they impact anything that is actually useful, like debugging.

I definitely think this shouldn't be done by default.


Ciao,
Michael.
diff mbox series

Patch

--- gcc/config/i386/i386.h.jj	2024-01-27 22:59:53.121319813 +0100
+++ gcc/config/i386/i386.h	2024-02-26 16:33:14.644745362 +0100
@@ -2730,9 +2730,12 @@  enum call_saved_registers_type
   /* The current function is a function specified with the "interrupt"
      or "no_caller_saved_registers" attribute.  */
   TYPE_NO_CALLER_SAVED_REGISTERS,
+  /* The current function is a function specified with the
+     "no_callee_saved_registers" attribute.  */
+  TYPE_NO_CALLEE_SAVED_REGISTERS,
   /* The current function is a function specified with the "noreturn"
-     or "no_callee_saved_registers" attribute.  */
-  TYPE_NO_CALLEE_SAVED_REGISTERS
+     attribute.  */
+  TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP,
 };
 
 enum queued_insn_type
--- gcc/config/i386/i386-options.cc.jj	2024-02-26 16:29:07.471201524 +0100
+++ gcc/config/i386/i386-options.cc	2024-02-26 16:42:01.730358609 +0100
@@ -3384,7 +3384,9 @@  ix86_set_func_type (tree fndecl)
 {
   /* No need to save and restore callee-saved registers for a noreturn
      function with nothrow or compiled with -fno-exceptions unless when
-     compiling with -O0 or -Og.
+     compiling with -O0 or -Og.  So that backtrace works for those at least
+     in most cases, save the bp register if it is used, because it often
+     is used in callers to compute CFA.
 
      NB: Can't use just TREE_THIS_VOLATILE to check if this is a noreturn
      function.  The local-pure-const pass turns an interrupt function
@@ -3394,15 +3396,20 @@  ix86_set_func_type (tree fndecl)
      function is marked with TREE_THIS_VOLATILE in the IR output, which
      leads to the incompatible attribute error in LTO1.  Ignore the
      interrupt function in this case.  */
-  bool has_no_callee_saved_registers
-    = ((TREE_THIS_VOLATILE (fndecl)
-	&& !lookup_attribute ("interrupt",
-			      TYPE_ATTRIBUTES (TREE_TYPE (fndecl)))
-	&& optimize
-	&& !optimize_debug
-	&& (TREE_NOTHROW (fndecl) || !flag_exceptions))
-       || lookup_attribute ("no_callee_saved_registers",
-			    TYPE_ATTRIBUTES (TREE_TYPE (fndecl))));
+  enum call_saved_registers_type no_callee_saved_registers
+    = TYPE_DEFAULT_CALL_SAVED_REGISTERS;
+  if (lookup_attribute ("no_callee_saved_registers",
+			TYPE_ATTRIBUTES (TREE_TYPE (fndecl))))
+    no_callee_saved_registers = TYPE_NO_CALLEE_SAVED_REGISTERS;
+  else if (TREE_THIS_VOLATILE (fndecl)
+	   && optimize
+	   && !optimize_debug
+	   && (TREE_NOTHROW (fndecl) || !flag_exceptions)
+	   && !lookup_attribute ("interrupt",
+				 TYPE_ATTRIBUTES (TREE_TYPE (fndecl)))
+	   && !lookup_attribute ("no_caller_saved_registers",
+				 TYPE_ATTRIBUTES (TREE_TYPE (fndecl))))
+    no_callee_saved_registers = TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP;
 
   if (cfun->machine->func_type == TYPE_UNKNOWN)
     {
@@ -3413,7 +3420,7 @@  ix86_set_func_type (tree fndecl)
 	    error_at (DECL_SOURCE_LOCATION (fndecl),
 		      "interrupt and naked attributes are not compatible");
 
-	  if (has_no_callee_saved_registers)
+	  if (no_callee_saved_registers)
 	    error_at (DECL_SOURCE_LOCATION (fndecl),
 		      "%qs and %qs attributes are not compatible",
 		      "interrupt", "no_callee_saved_registers");
@@ -3442,7 +3449,7 @@  ix86_set_func_type (tree fndecl)
 				TYPE_ATTRIBUTES (TREE_TYPE (fndecl))))
 	    cfun->machine->call_saved_registers
 	      = TYPE_NO_CALLER_SAVED_REGISTERS;
-	  if (has_no_callee_saved_registers)
+	  if (no_callee_saved_registers)
 	    {
 	      if (cfun->machine->call_saved_registers
 		  == TYPE_NO_CALLER_SAVED_REGISTERS)
@@ -3451,7 +3458,7 @@  ix86_set_func_type (tree fndecl)
 			  "no_caller_saved_registers",
 			  "no_callee_saved_registers");
 	      cfun->machine->call_saved_registers
-		= TYPE_NO_CALLEE_SAVED_REGISTERS;
+		= no_callee_saved_registers;
 	    }
 	}
     }
--- gcc/config/i386/i386.cc.jj	2024-02-26 11:12:36.275453513 +0100
+++ gcc/config/i386/i386.cc	2024-02-26 17:24:50.701602352 +0100
@@ -984,8 +984,9 @@  ix86_function_ok_for_sibcall (tree decl,
 
   /* Sibling call isn't OK if callee has no callee-saved registers
      and the calling function has callee-saved registers.  */
-  if ((cfun->machine->call_saved_registers
-       != TYPE_NO_CALLEE_SAVED_REGISTERS)
+  if (cfun->machine->call_saved_registers != TYPE_NO_CALLEE_SAVED_REGISTERS
+      && (cfun->machine->call_saved_registers
+	  != TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP)
       && lookup_attribute ("no_callee_saved_registers",
 			   TYPE_ATTRIBUTES (type)))
     return false;
@@ -6649,6 +6650,11 @@  ix86_save_reg (unsigned int regno, bool
 
     case TYPE_NO_CALLEE_SAVED_REGISTERS:
       return false;
+
+    case TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP:
+      if (regno != HARD_FRAME_POINTER_REGNUM)
+	return false;
+      break;
     }
 
   if (regno == REAL_PIC_OFFSET_TABLE_REGNUM
--- gcc/testsuite/gcc.target/i386/pr38534-1.c.jj	2024-02-01 16:00:37.443735635 +0100
+++ gcc/testsuite/gcc.target/i386/pr38534-1.c	2024-02-27 08:59:08.421878166 +0100
@@ -22,5 +22,5 @@  no_return_to_caller (void)
   while (1);
 }
 
-/* { dg-final { scan-assembler-not "push" } } */
-/* { dg-final { scan-assembler-not "pop" } } */
+/* { dg-final { scan-assembler-not "push\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
+/* { dg-final { scan-assembler-not "pop\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
--- gcc/testsuite/gcc.target/i386/pr38534-4.c.jj	2024-02-01 16:00:37.443735635 +0100
+++ gcc/testsuite/gcc.target/i386/pr38534-4.c	2024-02-27 09:00:09.869021464 +0100
@@ -12,7 +12,7 @@  foo (fn_t bar)
   fn ();
 }
 
-/* { dg-final { scan-assembler-not "push" } } */
-/* { dg-final { scan-assembler-not "pop" } } */
+/* { dg-final { scan-assembler-not "push\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
+/* { dg-final { scan-assembler-not "pop\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
 /* { dg-final { scan-assembler-not "jmp" } } */
 /* { dg-final { scan-assembler "call\[\\t \]+" } } */
--- gcc/testsuite/gcc.target/i386/pr38534-2.c.jj	2024-02-01 16:00:37.443735635 +0100
+++ gcc/testsuite/gcc.target/i386/pr38534-2.c	2024-02-27 08:59:35.723497526 +0100
@@ -12,7 +12,7 @@  foo (void)
   fn ();
 }
 
-/* { dg-final { scan-assembler-not "push" } } */
-/* { dg-final { scan-assembler-not "pop" } } */
+/* { dg-final { scan-assembler-not "push\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
+/* { dg-final { scan-assembler-not "pop\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
 /* { dg-final { scan-assembler-not "jmp\[\\t \]+_?bar" } } */
 /* { dg-final { scan-assembler "call\[\\t \]+_?bar" } } */
--- gcc/testsuite/gcc.target/i386/pr38534-3.c.jj	2024-02-01 16:00:37.443735635 +0100
+++ gcc/testsuite/gcc.target/i386/pr38534-3.c	2024-02-27 08:59:49.669303090 +0100
@@ -13,7 +13,7 @@  foo (void)
   fn ();
 }
 
-/* { dg-final { scan-assembler-not "push" } } */
-/* { dg-final { scan-assembler-not "pop" } } */
+/* { dg-final { scan-assembler-not "push\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
+/* { dg-final { scan-assembler-not "pop\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
 /* { dg-final { scan-assembler-not "jmp" } } */
 /* { dg-final { scan-assembler "call\[\\t \]+" } } */
--- gcc/testsuite/gcc.target/i386/pr114097-1.c.jj	2024-02-26 16:29:07.529200714 +0100
+++ gcc/testsuite/gcc.target/i386/pr114097-1.c	2024-02-27 09:01:37.876794453 +0100
@@ -22,5 +22,5 @@  no_return_to_caller (void)
   while (1);
 }
 
-/* { dg-final { scan-assembler-not "push" } } */
-/* { dg-final { scan-assembler-not "pop" } } */
+/* { dg-final { scan-assembler-not "push\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
+/* { dg-final { scan-assembler-not "pop\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
--- gcc/testsuite/gcc.target/i386/stack-check-17.c.jj	2024-01-27 22:59:53.186318915 +0100
+++ gcc/testsuite/gcc.target/i386/stack-check-17.c	2024-02-27 09:06:10.805989259 +0100
@@ -32,5 +32,5 @@  f3 (void)
    register on ia32 for a noreturn function.  */
 /* { dg-final { scan-assembler-times "push\[ql\]" 1 { target { ! ia32 } } } }  */
 /* { dg-final { scan-assembler-times "push\[ql\]" 3 { target ia32 } } }  */
-/* { dg-final { scan-assembler-times "pop" 1 } } */
-
+/* { dg-final { scan-assembler-not "pop" { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-times "pop" 1 { target ia32 } } } */