diff mbox series

[ARM] Fix ICE during thunk generation with -mlong-calls

Message ID 1722777.YijAB52ccF@polaris
State New
Headers show
Series [ARM] Fix ICE during thunk generation with -mlong-calls | expand

Commit Message

Eric Botcazou Sept. 17, 2018, 11:19 a.m. UTC
Hi,

this is a regression present on mainline, 8 and 7 branches.  The new, RTL 
implementation of arm32_output_mi_thunk breaks during the libstdc++ build if 
you configure the compiler with -mlong-calls by default:

0xdb57eb gen_reg_rtx(machine_mode)
        /home/eric/svn/gcc/gcc/emit-rtl.c:1155
0xde9ae7 force_reg(machine_mode, rtx_def*)
        /home/eric/svn/gcc/gcc/explow.c:654
0x1bf73bf gen_sibcall(rtx_def*, rtx_def*, rtx_def*)
        /home/eric/svn/gcc/gcc/config/arm/arm.md:8272
0x187d3b1 arm32_output_mi_thunk
        /home/eric/svn/gcc/gcc/config/arm/arm.c:26762
0x187d4af arm_output_mi_thunk
        /home/eric/svn/gcc/gcc/config/arm/arm.c:26783
0xcb9c94 cgraph_node::expand_thunk(bool, bool)
        /home/eric/svn/gcc/gcc/cgraphunit.c:1783

because the code is wired for a short call.  Moreover, in PIC mode you need to 
work harder and fix up the minipool too with -mlong-calls.

Tested on ARM/Linux, OK for mainline, 8 and 7 branches?


2018-09-17  Eric Botcazou  <ebotcazou@adacore.com>

	* config/arm/arm.c (arm_reorg): Skip Thumb reorg pass for thunks.
	(arm32_output_mi_thunk): Deal with long calls.


2018-09-17  Eric Botcazou  <ebotcazou@adacore.com>

	* g++.dg/other/thunk2a.C: New test.
	* g++.dg/other/thunk2b.C: Likewise.

Comments

Richard Earnshaw (lists) Sept. 17, 2018, 12:54 p.m. UTC | #1
On 17/09/18 12:19, Eric Botcazou wrote:
> Hi,
> 
> this is a regression present on mainline, 8 and 7 branches.  The new, RTL 
> implementation of arm32_output_mi_thunk breaks during the libstdc++ build if 
> you configure the compiler with -mlong-calls by default:
> 
> 0xdb57eb gen_reg_rtx(machine_mode)
>         /home/eric/svn/gcc/gcc/emit-rtl.c:1155
> 0xde9ae7 force_reg(machine_mode, rtx_def*)
>         /home/eric/svn/gcc/gcc/explow.c:654
> 0x1bf73bf gen_sibcall(rtx_def*, rtx_def*, rtx_def*)
>         /home/eric/svn/gcc/gcc/config/arm/arm.md:8272
> 0x187d3b1 arm32_output_mi_thunk
>         /home/eric/svn/gcc/gcc/config/arm/arm.c:26762
> 0x187d4af arm_output_mi_thunk
>         /home/eric/svn/gcc/gcc/config/arm/arm.c:26783
> 0xcb9c94 cgraph_node::expand_thunk(bool, bool)
>         /home/eric/svn/gcc/gcc/cgraphunit.c:1783
> 
> because the code is wired for a short call.  Moreover, in PIC mode you need to 
> work harder and fix up the minipool too with -mlong-calls.
> 
> Tested on ARM/Linux, OK for mainline, 8 and 7 branches?
> 
> 
> 2018-09-17  Eric Botcazou  <ebotcazou@adacore.com>
> 
> 	* config/arm/arm.c (arm_reorg): Skip Thumb reorg pass for thunks.

this seems to contradict your statement above about having to work
harder to fix up minipools.  why is it correct to skip this entirely?

> 	(arm32_output_mi_thunk): Deal with long calls.
> 
> 
> 2018-09-17  Eric Botcazou  <ebotcazou@adacore.com>
> 
> 	* g++.dg/other/thunk2a.C: New test.
> 	* g++.dg/other/thunk2b.C: Likewise.
> 
> 
> p.diff
> 
> 
> Index: config/arm/arm.c
> ===================================================================
> --- config/arm/arm.c	(revision 264342)
> +++ config/arm/arm.c	(working copy)
> @@ -17647,7 +17647,9 @@ arm_reorg (void)
>  
>    if (use_cmse)
>      cmse_nonsecure_call_clear_caller_saved ();
> -  if (TARGET_THUMB1)
> +  if (cfun->is_thunk)
> +    ;
> +  else if (TARGET_THUMB1)
>      thumb1_reorg ();
>    else if (TARGET_THUMB2)
>      thumb2_reorg ();
> @@ -26721,6 +26723,8 @@ static void
>  arm32_output_mi_thunk (FILE *file, tree, HOST_WIDE_INT delta,
>  		       HOST_WIDE_INT vcall_offset, tree function)
>  {
> +  const bool long_call_p = arm_is_long_call_p (function);
> +
>    /* On ARM, this_regno is R0 or R1 depending on
>       whether the function returns an aggregate or not.
>    */
> @@ -26758,9 +26762,22 @@ arm32_output_mi_thunk (FILE *file, tree,
>        TREE_USED (function) = 1;
>      }
>    rtx funexp = XEXP (DECL_RTL (function), 0);
> +  if (long_call_p)
> +    {
> +      emit_move_insn (temp, funexp);
> +      funexp = temp;
> +    }
>    funexp = gen_rtx_MEM (FUNCTION_MODE, funexp);
> -  rtx_insn * insn = emit_call_insn (gen_sibcall (funexp, const0_rtx, NULL_RTX));
> +  rtx_insn *insn = emit_call_insn (gen_sibcall (funexp, const0_rtx, NULL_RTX));
>    SIBLING_CALL_P (insn) = 1;
> +  emit_barrier ();

Why do we need a barrier here unconditionally (ie in the non-longcall case)?

R.

> +
> +  /* Indirect calls require a bit of fixup in PIC mode.  */
> +  if (long_call_p)
> +    {
> +      split_all_insns_noflow ();
> +      arm_reorg ();
> +    }
>  
>    insn = get_insns ();
>    shorten_branches (insn);
> 
> 
> thunk2a.C
> 
> 
> // { dg-do compile { target arm*-*-* } }
> // { dg-options "-mlong-calls -ffunction-sections }
> 
> class a {
> public:
>   virtual ~a();
> };
> 
> class b : virtual a {};
> 
> class c : b {
>   ~c();
> };
> 
> c::~c() {}
> 
> 
> thunk2b.C
> 
> 
> // { dg-do compile { target arm*-*-* && fpic } }
> // { dg-options "-mlong-calls -ffunction-sections -fPIC }
> 
> class a {
> public:
>   virtual ~a();
> };
> 
> class b : virtual a {};
> 
> class c : b {
>   ~c();
> };
> 
> c::~c() {}
>
Eric Botcazou Sept. 18, 2018, 9 a.m. UTC | #2
> this seems to contradict your statement above about having to work
> harder to fix up minipools.

Why?  Fixing up minipools is done in the generic ARM reorg pass, not in the 
Thumb reorg pass(es).

> Why do we need a barrier here unconditionally (ie in the non-longcall case)?

We don't, but it doesn't harm to put it either.  For example, the x86, PowerPC 
and SPARC ports always do it.
Richard Earnshaw (lists) Sept. 18, 2018, 10:26 a.m. UTC | #3
On 18/09/18 10:00, Eric Botcazou wrote:
>> this seems to contradict your statement above about having to work
>> harder to fix up minipools.
> 
> Why?  Fixing up minipools is done in the generic ARM reorg pass, not in the 
> Thumb reorg pass(es).
> 

Ah!  But that still doesn't explain why you want to skip these passes
when building thunks.

>> Why do we need a barrier here unconditionally (ie in the non-longcall case)?
> 
> We don't, but it doesn't harm to put it either.  For example, the x86, PowerPC 
> and SPARC ports always do it.
> 

So is the barrier correct, or isn't it?  There's really no two ways
about this.  I don't like arbitrary changes that are justified solely on
'that's what another port does'.

R.
Eric Botcazou Sept. 24, 2018, 9:19 a.m. UTC | #4
> Ah!  But that still doesn't explain why you want to skip these passes
> when building thunks.

They simply don't work because there is no CFG for thunks; I can add a blurb 
about that.

> So is the barrier correct, or isn't it?  There's really no two ways
> about this.  I don't like arbitrary changes that are justified solely on
> 'that's what another port does'.

The barrier is required by the arm_reorg pass, but it is optional when the 
pass is not run.  I think that we can consider that it is also correct.
Richard Earnshaw (lists) Sept. 25, 2018, 2:27 p.m. UTC | #5
On 24/09/18 10:19, Eric Botcazou wrote:
>> Ah!  But that still doesn't explain why you want to skip these passes
>> when building thunks.
> 
> They simply don't work because there is no CFG for thunks; I can add a blurb 
> about that.

Yes, this needs a comment as it's far from obvious when looking at the code.

> 
>> So is the barrier correct, or isn't it?  There's really no two ways
>> about this.  I don't like arbitrary changes that are justified solely on
>> 'that's what another port does'.
> 
> The barrier is required by the arm_reorg pass, but it is optional when the 
> pass is not run.  I think that we can consider that it is also correct.
> 

Ah, because you're now calling arm_reorg directly and it needs the
barrier to drop the minipool in the right place.

So OK with the additional comment.

R.
diff mbox series

Patch

Index: config/arm/arm.c
===================================================================
--- config/arm/arm.c	(revision 264342)
+++ config/arm/arm.c	(working copy)
@@ -17647,7 +17647,9 @@  arm_reorg (void)
 
   if (use_cmse)
     cmse_nonsecure_call_clear_caller_saved ();
-  if (TARGET_THUMB1)
+  if (cfun->is_thunk)
+    ;
+  else if (TARGET_THUMB1)
     thumb1_reorg ();
   else if (TARGET_THUMB2)
     thumb2_reorg ();
@@ -26721,6 +26723,8 @@  static void
 arm32_output_mi_thunk (FILE *file, tree, HOST_WIDE_INT delta,
 		       HOST_WIDE_INT vcall_offset, tree function)
 {
+  const bool long_call_p = arm_is_long_call_p (function);
+
   /* On ARM, this_regno is R0 or R1 depending on
      whether the function returns an aggregate or not.
   */
@@ -26758,9 +26762,22 @@  arm32_output_mi_thunk (FILE *file, tree,
       TREE_USED (function) = 1;
     }
   rtx funexp = XEXP (DECL_RTL (function), 0);
+  if (long_call_p)
+    {
+      emit_move_insn (temp, funexp);
+      funexp = temp;
+    }
   funexp = gen_rtx_MEM (FUNCTION_MODE, funexp);
-  rtx_insn * insn = emit_call_insn (gen_sibcall (funexp, const0_rtx, NULL_RTX));
+  rtx_insn *insn = emit_call_insn (gen_sibcall (funexp, const0_rtx, NULL_RTX));
   SIBLING_CALL_P (insn) = 1;
+  emit_barrier ();
+
+  /* Indirect calls require a bit of fixup in PIC mode.  */
+  if (long_call_p)
+    {
+      split_all_insns_noflow ();
+      arm_reorg ();
+    }
 
   insn = get_insns ();
   shorten_branches (insn);