Patchwork Problem with ARM longcalls

login
register
mail settings
Submitter Bernd Schmidt
Date March 23, 2011, 3:46 p.m.
Message ID <4D8A15E2.5020005@codesourcery.com>
Download mbox | patch
Permalink /patch/88099/
State New
Headers show

Comments

Bernd Schmidt - March 23, 2011, 3:46 p.m.
I've discovered a problem with -mlong-calls on ARM. The bug was first
reported against a new target, but I'd copied the relevant code from the
ARM backend.

We use current_function_section in arm_is_long_call_p to decide whether
we're calling something that goes into the same section. The problem
with this is that current_function_section can only be used during
final, since it relies on the global variable in_cold_section_p which is
set up only in assemble_start_function. On ARM, this problem manifests
as short-calls when a long-call would be required; in the other port it
was an "insn doesn't satisfy its constraints" error.

The following patch is against 4.5, since the problem appears hidden in
mainline (the initialization of first_function_block_is_cold has
changed). Ok for trunk and branches after arm-linux tests complete?


Bernd
* function.c (init_function_start): Call decide_function_section.
	* varasm.c (decide_function_section): New function.
	(assemble_start_function): When not using
	flag_reorder_blocks_and_partition, don't compute in_cold_section_p
	or first_function_block_is_cold.
	* rtl.h (decide_function_section): Declare.

	* gcc.target/arm/cold-lc.c: New test.
Richard Earnshaw - March 24, 2011, 2:24 p.m.
On Wed, 2011-03-23 at 16:46 +0100, Bernd Schmidt wrote:
> I've discovered a problem with -mlong-calls on ARM. The bug was first
> reported against a new target, but I'd copied the relevant code from the
> ARM backend.
> 
> We use current_function_section in arm_is_long_call_p to decide whether
> we're calling something that goes into the same section. The problem
> with this is that current_function_section can only be used during
> final, since it relies on the global variable in_cold_section_p which is
> set up only in assemble_start_function. On ARM, this problem manifests
> as short-calls when a long-call would be required; in the other port it
> was an "insn doesn't satisfy its constraints" error.
> 
> The following patch is against 4.5, since the problem appears hidden in
> mainline (the initialization of first_function_block_is_cold has
> changed). Ok for trunk and branches after arm-linux tests complete?
> 
> 
> Bernd

The ARM port currently doesn't support hot/cold partitioning of code
(and can't until the constant pool code is rewritten to deal with it),
so how is this a problem?

R.
Bernd Schmidt - March 24, 2011, 2:27 p.m.
On 03/24/2011 03:24 PM, Richard Earnshaw wrote:
> 
> On Wed, 2011-03-23 at 16:46 +0100, Bernd Schmidt wrote:
>> I've discovered a problem with -mlong-calls on ARM. The bug was first
>> reported against a new target, but I'd copied the relevant code from the
>> ARM backend.
>>
>> We use current_function_section in arm_is_long_call_p to decide whether
>> we're calling something that goes into the same section. The problem
>> with this is that current_function_section can only be used during
>> final, since it relies on the global variable in_cold_section_p which is
>> set up only in assemble_start_function. On ARM, this problem manifests
>> as short-calls when a long-call would be required; in the other port it
>> was an "insn doesn't satisfy its constraints" error.
>>
>> The following patch is against 4.5, since the problem appears hidden in
>> mainline (the initialization of first_function_block_is_cold has
>> changed). Ok for trunk and branches after arm-linux tests complete?
>>
>>
>> Bernd
> 
> The ARM port currently doesn't support hot/cold partitioning of code
> (and can't until the constant pool code is rewritten to deal with it),
> so how is this a problem?

Different functions can still go into different sections. When we start
expanding a function, in_cold_section_p still contains the value that
was correct for the previous one. It will change at the start of final,
which means that current_function_section can return different values
while compiling a function. See the included testcase.


Bernd
Bernd Schmidt - April 14, 2011, 12:40 p.m.
Ping.  Contains only changes outside config/arm.

http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01509.html


Bernd

On 03/23/2011 04:46 PM, Bernd Schmidt wrote:
> I've discovered a problem with -mlong-calls on ARM. The bug was first
> reported against a new target, but I'd copied the relevant code from the
> ARM backend.
> 
> We use current_function_section in arm_is_long_call_p to decide whether
> we're calling something that goes into the same section. The problem
> with this is that current_function_section can only be used during
> final, since it relies on the global variable in_cold_section_p which is
> set up only in assemble_start_function. On ARM, this problem manifests
> as short-calls when a long-call would be required; in the other port it
> was an "insn doesn't satisfy its constraints" error.
> 
> The following patch is against 4.5, since the problem appears hidden in
> mainline (the initialization of first_function_block_is_cold has
> changed). Ok for trunk and branches after arm-linux tests complete?

> 	* function.c (init_function_start): Call decide_function_section.
> 	* varasm.c (decide_function_section): New function.
> 	(assemble_start_function): When not using
> 	flag_reorder_blocks_and_partition, don't compute in_cold_section_p
> 	or first_function_block_is_cold.
> 	* rtl.h (decide_function_section): Declare.
> 
> 	* gcc.target/arm/cold-lc.c: New test.
Richard Guenther - April 14, 2011, 1 p.m.
On Thu, Apr 14, 2011 at 2:40 PM, Bernd Schmidt <bernds@codesourcery.com> wrote:
> Ping.  Contains only changes outside config/arm.
>
> http://gcc.gnu.org/m/gcc-patches/2011-03/msg01509.html

Ok.

Thanks,
Richard.

>
> Bernd
>
> On 03/23/2011 04:46 PM, Bernd Schmidt wrote:
>> I've discovered a problem with -mlong-calls on ARM. The bug was first
>> reported against a new target, but I'd copied the relevant code from the
>> ARM backend.
>>
>> We use current_function_section in arm_is_long_call_p to decide whether
>> we're calling something that goes into the same section. The problem
>> with this is that current_function_section can only be used during
>> final, since it relies on the global variable in_cold_section_p which is
>> set up only in assemble_start_function. On ARM, this problem manifests
>> as short-calls when a long-call would be required; in the other port it
>> was an "insn doesn't satisfy its constraints" error.
>>
>> The following patch is against 4.5, since the problem appears hidden in
>> mainline (the initialization of first_function_block_is_cold has
>> changed). Ok for trunk and branches after arm-linux tests complete?
>
>>       * function.c (init_function_start): Call decide_function_section.
>>       * varasm.c (decide_function_section): New function.
>>       (assemble_start_function): When not using
>>       flag_reorder_blocks_and_partition, don't compute in_cold_section_p
>>       or first_function_block_is_cold.
>>       * rtl.h (decide_function_section): Declare.
>>
>>       * gcc.target/arm/cold-lc.c: New test.
>

Patch

Index: gcc/testsuite/gcc.target/arm/cold-lc.c
===================================================================
--- gcc/testsuite/gcc.target/arm/cold-lc.c	(revision 0)
+++ gcc/testsuite/gcc.target/arm/cold-lc.c	(revision 0)
@@ -0,0 +1,22 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -mlong-calls" } */
+/* { dg-final { scan-assembler-not "bl\[^\n\]*dump_stack" } } */
+
+extern void dump_stack (void) __attribute__ ((__cold__)) __attribute__ ((noinline));
+struct thread_info {
+    struct task_struct *task;
+};
+extern struct thread_info *current_thread_info (void);
+
+void dump_stack (void)
+{
+    unsigned long stack;
+    show_stack ((current_thread_info ()->task), &stack);
+}
+
+void die (char *str, void *fp, int nr)
+{
+    dump_stack ();
+    while (1);
+}
+
Index: gcc/function.c
===================================================================
--- gcc/function.c	(revision 170052)
+++ gcc/function.c	(working copy)
@@ -4227,6 +4227,7 @@  init_function_start (tree subr)
   else
     allocate_struct_function (subr, false);
   prepare_function_start ();
+  decide_function_section (subr);
 
   /* Warn if this value is an aggregate type,
      regardless of which calling convention we are using for it.  */
Index: gcc/varasm.c
===================================================================
--- gcc/varasm.c	(revision 170052)
+++ gcc/varasm.c	(working copy)
@@ -1687,6 +1687,38 @@  notice_global_symbol (tree decl)
     }
 }
 
+/* If not using flag_reorder_blocks_and_partition, decide early whether the
+   current function goes into the cold section, so that targets can use
+   current_function_section during RTL expansion.  DECL describes the
+   function.  */
+
+void
+decide_function_section (tree decl)
+{
+  first_function_block_is_cold = false;
+
+  if (flag_reorder_blocks_and_partition)
+    /* We will decide in assemble_start_function.  */
+    return;
+
+ if (DECL_SECTION_NAME (decl))
+    {
+      /* Calls to function_section rely on first_function_block_is_cold
+	 being accurate.  The first block may be cold even if we aren't
+	 doing partitioning, if the entire function was decided by
+	 choose_function_section (predict.c) to be cold.  */
+
+      initialize_cold_section_name ();
+
+      if (crtl->subsections.unlikely_text_section_name
+	  && strcmp (TREE_STRING_POINTER (DECL_SECTION_NAME (decl)),
+		     crtl->subsections.unlikely_text_section_name) == 0)
+	first_function_block_is_cold = true;
+    }
+
+  in_cold_section_p = first_function_block_is_cold;
+}
+
 /* Output assembler code for the constant pool of a function and associated
    with defining the name of the function.  DECL describes the function.
    NAME is the function's name.  For the constant pool, we use the current
@@ -1701,7 +1733,6 @@  assemble_start_function (tree decl, cons
 
   crtl->subsections.unlikely_text_section_name = NULL;
 
-  first_function_block_is_cold = false;
   if (flag_reorder_blocks_and_partition)
     {
       ASM_GENERATE_INTERNAL_LABEL (tmp_label, "LHOTB", const_labelno);
@@ -1738,6 +1769,8 @@  assemble_start_function (tree decl, cons
 
   if (flag_reorder_blocks_and_partition)
     {
+      first_function_block_is_cold = false;
+
       switch_to_section (unlikely_text_section ());
       assemble_align (DECL_ALIGN (decl));
       ASM_OUTPUT_LABEL (asm_out_file, crtl->subsections.cold_section_label);
@@ -1754,23 +1787,8 @@  assemble_start_function (tree decl, cons
 	  hot_label_written = true;
 	  first_function_block_is_cold = true;
 	}
+      in_cold_section_p = first_function_block_is_cold;
     }
-  else if (DECL_SECTION_NAME (decl))
-    {
-      /* Calls to function_section rely on first_function_block_is_cold
-	 being accurate.  The first block may be cold even if we aren't
-	 doing partitioning, if the entire function was decided by
-	 choose_function_section (predict.c) to be cold.  */
-
-      initialize_cold_section_name ();
-
-      if (crtl->subsections.unlikely_text_section_name
-	  && strcmp (TREE_STRING_POINTER (DECL_SECTION_NAME (decl)),
-		     crtl->subsections.unlikely_text_section_name) == 0)
-	first_function_block_is_cold = true;
-    }
-
-  in_cold_section_p = first_function_block_is_cold;
 
   /* Switch to the correct text section for the start of the function.  */
 
Index: gcc/rtl.h
===================================================================
--- gcc/rtl.h	(revision 170052)
+++ gcc/rtl.h	(working copy)
@@ -1644,6 +1644,7 @@  extern rtx get_pool_constant (rtx);
 extern rtx get_pool_constant_mark (rtx, bool *);
 extern enum machine_mode get_pool_mode (const_rtx);
 extern rtx simplify_subtraction (rtx);
+extern void decide_function_section (tree);
 
 /* In function.c  */
 extern rtx assign_stack_local (enum machine_mode, HOST_WIDE_INT, int);