diff mbox

[ARM,v2] PR69770 -mlong-calls does not affect calls to __gnu_mcount_nc generated by -pg

Message ID CADnVucB6z2yLRmvhfObO_OV2HMSs8VZSnR4z1u6y4+vvM=sa=g@mail.gmail.com
State New
Headers show

Commit Message

Charles Baylis March 24, 2016, 9:02 p.m. UTC
When compiling with -mlong-calls and -pg, calls to the __gnu_mcount_nc
function are not generated as long calls.

This is the sequel to this patch
    https://gcc.gnu.org/ml/gcc-patches/2016-02/msg00881.html

This patch fixes the following problems with the previous patch.
. Nested functions now work (thanks to Richard E for spotting this)
. Thumb-1 now works

This patch works by adding new patterns (one for ARM/Thumb-2 and one
for Thumb-1) which are placed in the prologue as a placeholder for
some RTL which describes the address. This is either a SYMBOL_REF for
targets with MOVW/MOVT, or a literal pool reference for other targets.
The implementation of ARM_FUNCTION_PROFILER is changed to search for
this insn so that the the address of the __gnu_mcount_nc function can
be loaded using an appropriate sequence for the target.

I also tried generating the profiling call sequence directly in the
prologue, but this requires some unpleasant hacks to prevent spurious
register pushes from ASM_OUTPUT_REG_PUSH.

Tested with no new regressions on arm-unknown-linux-gnueabihf on QEMU.
The generated code sequences have been inspected for normal and nested
functions on ARM v6, ARM v7, Thumb-1, and Thumb-2 targets.

This does not fix a regression, so I don't expect to apply it for
GCC6, is it OK for when stage 1 re-opens.

gcc/ChangeLog:

2016-03-24  Charles Baylis  <charles.baylis@linaro.org>

        * config/arm/arm-protos.h (arm_emit_long_call_profile): New function.
        * config/arm/arm.c (arm_emit_long_call_profile_insn): New function.
        (arm_expand_prologue): Likewise.
        (thumb1_expand_prologue): Likewise.
        (arm_output_long_call_to_profile_func): Likewise.
        (arm_emit_long_call_profile): Likewise.
        * config/arm/arm.h: (ASM_OUTPUT_REG_PUSH) Update comment.
        * config/arm/arm.md (arm_long_call_profile): New pattern.
        * config/arm/bpabi.h (ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS): New
        define.
        * config/arm/thumb1.md (thumb1_long_call_profile): New pattern.
        * config/arm/unspecs.md (unspecv): Add VUNSPEC_LONG_CALL_PROFILE.

gcc/testsuite/ChangeLog:

2016-03-24  Charles Baylis  <charles.baylis@linaro.org>

        * gcc.target/arm/pr69770.c: New test.

Comments

Kugan Vivekanandarajah March 29, 2016, 1:16 a.m. UTC | #1
On 25/03/16 08:02, Charles Baylis wrote:
> When compiling with -mlong-calls and -pg, calls to the __gnu_mcount_nc
> function are not generated as long calls.
> 
> This is the sequel to this patch
>     https://gcc.gnu.org/ml/gcc-patches/2016-02/msg00881.html
> 
> This patch fixes the following problems with the previous patch.
> . Nested functions now work (thanks to Richard E for spotting this)
> . Thumb-1 now works
> 
> This patch works by adding new patterns (one for ARM/Thumb-2 and one
> for Thumb-1) which are placed in the prologue as a placeholder for
> some RTL which describes the address. This is either a SYMBOL_REF for
> targets with MOVW/MOVT, or a literal pool reference for other targets.
> The implementation of ARM_FUNCTION_PROFILER is changed to search for
> this insn so that the the address of the __gnu_mcount_nc function can
> be loaded using an appropriate sequence for the target.
> 
> I also tried generating the profiling call sequence directly in the
> prologue, but this requires some unpleasant hacks to prevent spurious
> register pushes from ASM_OUTPUT_REG_PUSH.
> 
> Tested with no new regressions on arm-unknown-linux-gnueabihf on QEMU.
> The generated code sequences have been inspected for normal and nested
> functions on ARM v6, ARM v7, Thumb-1, and Thumb-2 targets.
> 
> This does not fix a regression, so I don't expect to apply it for
> GCC6, is it OK for when stage 1 re-opens.
> 

Hi Charles,

+static void
+arm_emit_long_call_profile_insn ()
+{
+  rtx sym_ref = gen_rtx_SYMBOL_REF (Pmode, "__gnu_mcount_nc");
+  /* if movt/movw are not available, use a constant pool */
+  if (!arm_arch_thumb2)

Should this be !TARGET_USE_MOVT?

Thanks,
Kugan
Ramana Radhakrishnan March 30, 2016, 9:33 a.m. UTC | #2
On 24/03/16 21:02, Charles Baylis wrote:
> When compiling with -mlong-calls and -pg, calls to the __gnu_mcount_nc
> function are not generated as long calls.
> 
> This is the sequel to this patch
>     https://gcc.gnu.org/ml/gcc-patches/2016-02/msg00881.html
> 
> This patch fixes the following problems with the previous patch.
> . Nested functions now work (thanks to Richard E for spotting this)
> . Thumb-1 now works

> 
> This patch works by adding new patterns (one for ARM/Thumb-2 and one
> for Thumb-1) which are placed in the prologue as a placeholder for
> some RTL which describes the address. This is either a SYMBOL_REF for
> targets with MOVW/MOVT, or a literal pool reference for other targets.
> The implementation of ARM_FUNCTION_PROFILER is changed to search for
> this insn so that the the address of the __gnu_mcount_nc function can
> be loaded using an appropriate sequence for the target.

I'm reasonably happy with the approach but there are nits.

> 
> I also tried generating the profiling call sequence directly in the
> prologue, but this requires some unpleasant hacks to prevent spurious
> register pushes from ASM_OUTPUT_REG_PUSH.
> 
> Tested with no new regressions on arm-unknown-linux-gnueabihf on QEMU.
> The generated code sequences have been inspected for normal and nested
> functions on ARM v6, ARM v7, Thumb-1, and Thumb-2 targets.
> 
> This does not fix a regression, so I don't expect to apply it for
> GCC6, is it OK for when stage 1 re-opens.


I'm not sure how much testing coverage you get by just running the testsuite. Doing a profiled bootstrap with -mlong-calls and a regression test run for arm and / or thumb2 would be a useful test to do additionally - Given that this originally came from kernel builds with allyesconfig how important is this to be fixed for GCC 6 ? I'd rather take the fix into GCC 6 to get the kernel building again but that's something we can discuss with the RM's once the issues with the patch are fixed.
> 
> gcc/ChangeLog:
> 
> 2016-03-24  Charles Baylis  <charles.baylis@linaro.org>
> 
>         * config/arm/arm-protos.h (arm_emit_long_call_profile): New function.
>         * config/arm/arm.c (arm_emit_long_call_profile_insn): New function.
>         (arm_expand_prologue): Likewise.
>         (thumb1_expand_prologue): Likewise.
>         (arm_output_long_call_to_profile_func): Likewise.
>         (arm_emit_long_call_profile): Likewise.
>         * config/arm/arm.h: (ASM_OUTPUT_REG_PUSH) Update comment.
>         * config/arm/arm.md (arm_long_call_profile): New pattern.
>         * config/arm/bpabi.h (ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS): New
>         define.
>         * config/arm/thumb1.md (thumb1_long_call_profile): New pattern.
>         * config/arm/unspecs.md (unspecv): Add VUNSPEC_LONG_CALL_PROFILE.
> 
> gcc/testsuite/ChangeLog:
> 
> 2016-03-24  Charles Baylis  <charles.baylis@linaro.org>
> 
>         * gcc.target/arm/pr69770.c: New test.
> 
> 
> 0001-PR69770-mlong-calls-does-not-affect-calls-to-__gnu_m.patch
> 
> 
> From 5a39451f34be9b6ca98b3460bf40d879d6ee61a5 Mon Sep 17 00:00:00 2001
> From: Charles Baylis <charles.baylis@linaro.org>
> Date: Thu, 24 Mar 2016 20:43:25 +0000
> Subject: [PATCH] PR69770 -mlong-calls does not affect calls to __gnu_mcount_nc
>  generated by -pg
> 
> gcc/ChangeLog:
> 
> 2016-03-24  Charles Baylis  <charles.baylis@linaro.org>
> 
>         * config/arm/arm-protos.h (arm_emit_long_call_profile): New function.
>         * config/arm/arm.c (arm_emit_long_call_profile_insn): New function.
>         (arm_expand_prologue): Likewise.
>         (thumb1_expand_prologue): Likewise.
>         (arm_output_long_call_to_profile_func): Likewise.
>         (arm_emit_long_call_profile): Likewise.
>         * config/arm/arm.h: (ASM_OUTPUT_REG_PUSH) Update comment.
>         * config/arm/arm.md (arm_long_call_profile): New pattern.
>         * config/arm/bpabi.h (ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS): New
> 	define.
>         * config/arm/thumb1.md (thumb1_long_call_profile): New pattern.
>         * config/arm/unspecs.md (unspecv): Add VUNSPEC_LONG_CALL_PROFILE.
> 
> gcc/testsuite/ChangeLog:
> 
> 2016-03-24  Charles Baylis  <charles.baylis@linaro.org>
> 
>         * gcc.target/arm/pr69770.c: New test.
> 
> Change-Id: I9b8de01fea083f17f729c3801f83174bedb3b0c6
> 
> diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
> index 0083673..324c9f4 100644
> --- a/gcc/config/arm/arm-protos.h
> +++ b/gcc/config/arm/arm-protos.h
> @@ -343,6 +343,7 @@ extern void arm_register_target_pragmas (void);
>  extern void arm_cpu_cpp_builtins (struct cpp_reader *);
>  
>  extern bool arm_is_constant_pool_ref (rtx);
> +void arm_emit_long_call_profile ();
>  
>  /* Flags used to identify the presence of processor capabilities.  */
>  
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index c868490..040b255 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -21426,6 +21426,22 @@ output_probe_stack_range (rtx reg1, rtx reg2)
>    return "";
>  }
>  
> +static void
> +arm_emit_long_call_profile_insn ()

s/()/(void)

> +{
> +  rtx sym_ref = gen_rtx_SYMBOL_REF (Pmode, "__gnu_mcount_nc");
> +  /* if movt/movw are not available, use a constant pool */
> +  if (!arm_arch_thumb2)

Please don't do that - use TARGET_USE_MOVT && !target_word_relocations please. And better still look at adding !target_word_relocations to the define of TARGET_USE_MOVT.

> +  {
> +    sym_ref = force_const_mem(Pmode, sym_ref);

Space between force_const_mem and `('

> +  }
> +  rtvec vec = gen_rtvec (1, sym_ref);
> +  rtx tmp =
> +    gen_rtx_UNSPEC_VOLATILE (VOIDmode, vec, VUNSPEC_LONG_CALL_PROFILE);
> +  emit_insn (tmp);
> +}
> +

> +
>  /* Generate the prologue instructions for entry into an ARM or Thumb-2
>     function.  */
>  void
> @@ -21789,6 +21805,10 @@ arm_expand_prologue (void)
>        arm_load_pic_register (mask);
>      }
>  
> +  if (crtl->profile && TARGET_LONG_CALLS
> +      && ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS)
> +    arm_emit_long_call_profile_insn ();
> +
>    /* If we are profiling, make sure no instructions are scheduled before
>       the call to mcount.  Similarly if the user has requested no
>       scheduling in the prolog.  Similarly if we want non-call exceptions
> @@ -24985,6 +25005,10 @@ thumb1_expand_prologue (void)
>    if (frame_pointer_needed)
>      thumb_set_frame_pointer (offsets);
>  
> +  if (crtl->profile && TARGET_LONG_CALLS
> +      && ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS)
> +    arm_emit_long_call_profile_insn ();
> +
>    /* If we are profiling, make sure no instructions are scheduled before
>       the call to mcount.  Similarly if the user has requested no
>       scheduling in the prolog.  Similarly if we want non-call exceptions
> @@ -30289,4 +30313,70 @@ arm_sched_fusion_priority (rtx_insn *insn, int max_pri,
>    return;
>  }
>  
> +static void
> +arm_output_long_call_to_profile_func (rtx * operands, bool push_scratch)
> +{

Missing comment above function.

> +  /* operands[0] is the address of the __gnu_mcount_nc function
> +     operands[1] is the scratch register we use to load that address */
> +  if (push_scratch)
> +    output_asm_insn ("push\t{%1}", operands);
> +  output_asm_insn ("push\t{lr}", operands);
> +  if (GET_CODE (operands[0]) == SYMBOL_REF)
> +    {
> +      output_asm_insn ("movw\t%1, #:lower16:%c0", operands);
> +      output_asm_insn ("movt\t%1, #:upper16:%c0", operands);
> +    }
> +  else
> +    {
> +      output_asm_insn ("ldr\t%1, %0", operands);
> +    }
> +  if (!arm_arch5)
> +    {
> +      output_asm_insn ("mov\tlr, pc", operands);
> +      output_asm_insn ("mov\tpc, %1", operands);
> +    }
> +  else
> +    output_asm_insn ("blx\t%1", operands);
> +  if (push_scratch)
> +    output_asm_insn ("pop\t{%1}", operands);
> +}
> +


> +void
> +arm_emit_long_call_profile()

s/()/ (void)

Missing comment.

> +{
> +  rtx alcp = NULL;
> +  rtx operands[2];
> +  bool push_scratch;
> +  /* find the arm_long_call_profile */
> +  for (rtx_insn * insn = get_insns (); insn; insn = NEXT_INSN (insn))
> +    {
> +      if (NOTE_P (insn) && NOTE_KIND (insn) == NOTE_INSN_PROLOGUE_END)
> +	break;
> +      if (INSN_CODE (insn) == CODE_FOR_arm_long_call_profile
> +	  || INSN_CODE (insn) == CODE_FOR_thumb1_long_call_profile)
> +	{
> +	  alcp = PATTERN (insn);
> +	  break;
> +	}
> +    }
> +  gcc_assert (alcp);
> +
> +  operands[0] = XEXP (XEXP (alcp, 0), 0);
> +  if (TARGET_32BIT)
> +    {
> +      operands[1] = gen_rtx_REG (SImode, IP_REGNUM);
> +      push_scratch = false;
> +    }
> +  else
> +    {
> +      /* for nested functions, we can set push_scratch to false, since
> +	 final.c:profile_function.c and ASM_OUTPUT_REG_PUSH preserve it as
> +         part of the sequence to preserve ip across the call to the 
> +         profiling function. */
> +      operands[1] = gen_rtx_REG (SImode, R0_REGNUM + 7);
> +      push_scratch = !IS_NESTED (arm_current_func_type ());
> +    }
> +  arm_output_long_call_to_profile_func (operands, push_scratch);
> +}
> +
>  #include "gt-arm.h"
> diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
> index 6352140..44f1c64 100644
> --- a/gcc/config/arm/arm.h
> +++ b/gcc/config/arm/arm.h
> @@ -2044,7 +2044,9 @@ extern int making_const_table;
>     that ASM_OUTPUT_REG_PUSH will be matched with ASM_OUTPUT_REG_POP, and
>     that r7 isn't used by the function profiler, so we can use it as a
>     scratch reg.  WARNING: This isn't safe in the general case!  It may be
> -   sensitive to future changes in final.c:profile_function.  */
> +   sensitive to future changes in final.c:profile_function. This is also
> +   relied on in arm_emit_long_call_profile() which assumes r7 can be
> +   used as a scratch register to load the address of the function profiler.  */
>  #define ASM_OUTPUT_REG_PUSH(STREAM, REGNO)		\
>    do							\
>      {							\
> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
> index 47171b9..beb44d5 100644
> --- a/gcc/config/arm/arm.md
> +++ b/gcc/config/arm/arm.md
> @@ -11424,6 +11424,15 @@
>    DONE;
>  })
>  
> +(define_insn "arm_long_call_profile"
> +  [(unspec_volatile [(match_operand:SI 0 "general_operand" "ji,m")
> +                    ] VUNSPEC_LONG_CALL_PROFILE)]
> +  "TARGET_32BIT"
> +  "%@ arm_long_call_profile"
> +  [(set_attr "arm_pool_range" "*,4096")
> +   (set_attr "arm_neg_pool_range" "*,4084")]
> +)

Please set the length of this insn appropriately. IIUC this is the length of the instructions from FUNCTION_PROFILER, though I suspect we could push all of this out into the RTL stream. Underestimating the length of this insn can cause issues with the minipool placement code later especially with -pg.

> +
>  ;; Vector bits common to IWMMXT and Neon
>  (include "vec-common.md")
>  ;; Load the Intel Wireless Multimedia Extension patterns
> diff --git a/gcc/config/arm/bpabi.h b/gcc/config/arm/bpabi.h
> index 5d6c4ed..7acf66b 100644
> --- a/gcc/config/arm/bpabi.h
> +++ b/gcc/config/arm/bpabi.h
> @@ -174,11 +174,18 @@
>  
>  #undef  NO_PROFILE_COUNTERS
>  #define NO_PROFILE_COUNTERS 1
> +#undef  ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS
> +#define ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS 1

Why do we need this ?

>  #undef  ARM_FUNCTION_PROFILER
>  #define ARM_FUNCTION_PROFILER(STREAM, LABELNO)  			\
>  {									\
> -  fprintf (STREAM, "\tpush\t{lr}\n");					\
> -  fprintf (STREAM, "\tbl\t__gnu_mcount_nc\n");				\
> +  if (TARGET_LONG_CALLS)						\
> +  { 									\
> +    arm_emit_long_call_profile();					\
> +  } else {								\
> +    fprintf (STREAM, "\tpush\t{lr}\n");					\
> +    fprintf (STREAM, "\tbl\t__gnu_mcount_nc\n");			\
> +  }									\
>  }
>  

While you are here delete the definition of FUNCTION_PROFILER that is conditional on THUMB_FUNCTION_PROFILER, looks like that code is dead.

>  #undef SUBTARGET_FRAME_POINTER_REQUIRED
> diff --git a/gcc/config/arm/thumb1.md b/gcc/config/arm/thumb1.md
> index 072ed4d..9b1de5f 100644
> --- a/gcc/config/arm/thumb1.md
> +++ b/gcc/config/arm/thumb1.md
> @@ -1798,7 +1798,7 @@
>    [(unspec_volatile [(match_operand:SI 0 "s_register_operand" "l")]
>  		    VUNSPEC_EH_RETURN)
>     (clobber (match_scratch:SI 1 "=&l"))]
> -  "TARGET_THUMB1"
> +  "TARGET_THUMB1 && 0"

This means it wasn't tested for Thumb1 or you had to turn this off to get proper results for Thumb1 ?

Please retest.

>    "#"
>    "&& reload_completed"
>    [(const_int 0)]
> @@ -1809,4 +1809,13 @@
>    }"
>    [(set_attr "type" "mov_reg")]
>  )
> +
> +(define_insn "thumb1_long_call_profile"
> +  [(unspec_volatile [(match_operand:SI 0 "general_operand" "j,m")
> +                    ] VUNSPEC_LONG_CALL_PROFILE)]
> +  "TARGET_THUMB1"
> +  "%@ thumb1_long_call_profile"
> +  [(set_attr "pool_range" "1018")]
> +)
> +
>  

Please set length accordingly as well as check on the pool_range here.

> diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md
> index 5744c62..d7ddc3a 100644
> --- a/gcc/config/arm/unspecs.md
> +++ b/gcc/config/arm/unspecs.md
> @@ -148,6 +148,7 @@
>    VUNSPEC_GET_FPSCR	; Represent fetch of FPSCR content.
>    VUNSPEC_SET_FPSCR	; Represent assign of FPSCR content.
>    VUNSPEC_PROBE_STACK_RANGE ; Represent stack range probing.
> +  VUNSPEC_LONG_CALL_PROFILE ; Represent a long call to profile function
>  ])
>  
>  ;; Enumerators for NEON unspecs.
> diff --git a/gcc/testsuite/gcc.target/arm/pr69770.c b/gcc/testsuite/gcc.target/arm/pr69770.c
> new file mode 100644
> index 0000000..61e5c6d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/pr69770.c
> @@ -0,0 +1,9 @@
> +/* { dg-do compile } */
> +/* { dg-options "-pg -mlong-calls" } */
> +
> +extern void g(void);
> +
> +int f() { g(); return 0; }
> +
> +/* { dg-final { scan-assembler-not "bl\[ \t\]+__gnu_mcount_nc" } } */
> +/* { dg-final { scan-assembler "__gnu_mcount_nc" } } */
> -- 1.9.1
>
Ramana Radhakrishnan March 30, 2016, 9:44 a.m. UTC | #3
And some more formatting issues.

On 30/03/16 10:33, Ramana Radhakrishnan wrote:
> 
> 
> On 24/03/16 21:02, Charles Baylis wrote:
>> When compiling with -mlong-calls and -pg, calls to the __gnu_mcount_nc
>> function are not generated as long calls.
>>
>> This is the sequel to this patch
>>     https://gcc.gnu.org/ml/gcc-patches/2016-02/msg00881.html
>>
>> This patch fixes the following problems with the previous patch.
>> . Nested functions now work (thanks to Richard E for spotting this)
>> . Thumb-1 now works
> 
>>
>> This patch works by adding new patterns (one for ARM/Thumb-2 and one
>> for Thumb-1) which are placed in the prologue as a placeholder for
>> some RTL which describes the address. This is either a SYMBOL_REF for
>> targets with MOVW/MOVT, or a literal pool reference for other targets.
>> The implementation of ARM_FUNCTION_PROFILER is changed to search for
>> this insn so that the the address of the __gnu_mcount_nc function can
>> be loaded using an appropriate sequence for the target.
> 
> I'm reasonably happy with the approach but there are nits.
> 
>>
>> I also tried generating the profiling call sequence directly in the
>> prologue, but this requires some unpleasant hacks to prevent spurious
>> register pushes from ASM_OUTPUT_REG_PUSH.
>>
>> Tested with no new regressions on arm-unknown-linux-gnueabihf on QEMU.
>> The generated code sequences have been inspected for normal and nested
>> functions on ARM v6, ARM v7, Thumb-1, and Thumb-2 targets.
>>
>> This does not fix a regression, so I don't expect to apply it for
>> GCC6, is it OK for when stage 1 re-opens.
> 
> 
> I'm not sure how much testing coverage you get by just running the testsuite. Doing a profiled bootstrap with -mlong-calls and a regression test run for arm and / or thumb2 would be a useful test to do additionally - Given that this originally came from kernel builds with allyesconfig how important is this to be fixed for GCC 6 ? I'd rather take the fix into GCC 6 to get the kernel building again but that's something we can discuss with the RM's once the issues with the patch are fixed.
>>
>> gcc/ChangeLog:
>>
>> 2016-03-24  Charles Baylis  <charles.baylis@linaro.org>
>>
>>         * config/arm/arm-protos.h (arm_emit_long_call_profile): New function.
>>         * config/arm/arm.c (arm_emit_long_call_profile_insn): New function.
>>         (arm_expand_prologue): Likewise.
>>         (thumb1_expand_prologue): Likewise.
>>         (arm_output_long_call_to_profile_func): Likewise.
>>         (arm_emit_long_call_profile): Likewise.
>>         * config/arm/arm.h: (ASM_OUTPUT_REG_PUSH) Update comment.
>>         * config/arm/arm.md (arm_long_call_profile): New pattern.
>>         * config/arm/bpabi.h (ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS): New
>>         define.
>>         * config/arm/thumb1.md (thumb1_long_call_profile): New pattern.
>>         * config/arm/unspecs.md (unspecv): Add VUNSPEC_LONG_CALL_PROFILE.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2016-03-24  Charles Baylis  <charles.baylis@linaro.org>
>>
>>         * gcc.target/arm/pr69770.c: New test.
>>
>>
>> 0001-PR69770-mlong-calls-does-not-affect-calls-to-__gnu_m.patch
>>
>>
>> From 5a39451f34be9b6ca98b3460bf40d879d6ee61a5 Mon Sep 17 00:00:00 2001
>> From: Charles Baylis <charles.baylis@linaro.org>
>> Date: Thu, 24 Mar 2016 20:43:25 +0000
>> Subject: [PATCH] PR69770 -mlong-calls does not affect calls to __gnu_mcount_nc
>>  generated by -pg
>>
>> gcc/ChangeLog:
>>
>> 2016-03-24  Charles Baylis  <charles.baylis@linaro.org>
>>
>>         * config/arm/arm-protos.h (arm_emit_long_call_profile): New function.
>>         * config/arm/arm.c (arm_emit_long_call_profile_insn): New function.
>>         (arm_expand_prologue): Likewise.
>>         (thumb1_expand_prologue): Likewise.
>>         (arm_output_long_call_to_profile_func): Likewise.
>>         (arm_emit_long_call_profile): Likewise.
>>         * config/arm/arm.h: (ASM_OUTPUT_REG_PUSH) Update comment.
>>         * config/arm/arm.md (arm_long_call_profile): New pattern.
>>         * config/arm/bpabi.h (ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS): New
>> 	define.
>>         * config/arm/thumb1.md (thumb1_long_call_profile): New pattern.
>>         * config/arm/unspecs.md (unspecv): Add VUNSPEC_LONG_CALL_PROFILE.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2016-03-24  Charles Baylis  <charles.baylis@linaro.org>
>>
>>         * gcc.target/arm/pr69770.c: New test.
>>
>> Change-Id: I9b8de01fea083f17f729c3801f83174bedb3b0c6
>>
>> diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
>> index 0083673..324c9f4 100644
>> --- a/gcc/config/arm/arm-protos.h
>> +++ b/gcc/config/arm/arm-protos.h
>> @@ -343,6 +343,7 @@ extern void arm_register_target_pragmas (void);
>>  extern void arm_cpu_cpp_builtins (struct cpp_reader *);
>>  
>>  extern bool arm_is_constant_pool_ref (rtx);
>> +void arm_emit_long_call_profile ();
>>  
>>  /* Flags used to identify the presence of processor capabilities.  */
>>  
>> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
>> index c868490..040b255 100644
>> --- a/gcc/config/arm/arm.c
>> +++ b/gcc/config/arm/arm.c
>> @@ -21426,6 +21426,22 @@ output_probe_stack_range (rtx reg1, rtx reg2)
>>    return "";
>>  }
>>  
>> +static void
>> +arm_emit_long_call_profile_insn ()
> 
> s/()/(void)
> 
>> +{
>> +  rtx sym_ref = gen_rtx_SYMBOL_REF (Pmode, "__gnu_mcount_nc");
>> +  /* if movt/movw are not available, use a constant pool */
>> +  if (!arm_arch_thumb2)
> 
> Please don't do that - use TARGET_USE_MOVT && !target_word_relocations please. And better still look at adding !target_word_relocations to the define of TARGET_USE_MOVT.
> 
>> +  {
>> +    sym_ref = force_const_mem(Pmode, sym_ref);
> 
> Space between force_const_mem and `('
> 
>> +  }
>> +  rtvec vec = gen_rtvec (1, sym_ref);
>> +  rtx tmp =
>> +    gen_rtx_UNSPEC_VOLATILE (VOIDmode, vec, VUNSPEC_LONG_CALL_PROFILE);
>> +  emit_insn (tmp);
>> +}
>> +
> 
>> +
>>  /* Generate the prologue instructions for entry into an ARM or Thumb-2
>>     function.  */
>>  void
>> @@ -21789,6 +21805,10 @@ arm_expand_prologue (void)
>>        arm_load_pic_register (mask);
>>      }
>>  
>> +  if (crtl->profile && TARGET_LONG_CALLS
>> +      && ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS)
>> +    arm_emit_long_call_profile_insn ();
>> +
>>    /* If we are profiling, make sure no instructions are scheduled before
>>       the call to mcount.  Similarly if the user has requested no
>>       scheduling in the prolog.  Similarly if we want non-call exceptions
>> @@ -24985,6 +25005,10 @@ thumb1_expand_prologue (void)
>>    if (frame_pointer_needed)
>>      thumb_set_frame_pointer (offsets);
>>  
>> +  if (crtl->profile && TARGET_LONG_CALLS
>> +      && ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS)
>> +    arm_emit_long_call_profile_insn ();
>> +
>>    /* If we are profiling, make sure no instructions are scheduled before
>>       the call to mcount.  Similarly if the user has requested no
>>       scheduling in the prolog.  Similarly if we want non-call exceptions
>> @@ -30289,4 +30313,70 @@ arm_sched_fusion_priority (rtx_insn *insn, int max_pri,
>>    return;
>>  }
>>  
>> +static void
>> +arm_output_long_call_to_profile_func (rtx * operands, bool push_scratch)
>> +{
> 
> Missing comment above function.

Blank lines between if blocks please

> 
>> +  /* operands[0] is the address of the __gnu_mcount_nc function
>> +     operands[1] is the scratch register we use to load that address */
>> +  if (push_scratch)
>> +    output_asm_insn ("push\t{%1}", operands);

Here .

>> +  output_asm_insn ("push\t{lr}", operands);
>> +  if (GET_CODE (operands[0]) == SYMBOL_REF)
>> +    {
>> +      output_asm_insn ("movw\t%1, #:lower16:%c0", operands);
>> +      output_asm_insn ("movt\t%1, #:upper16:%c0", operands);
>> +    }
>> +  else
>> +    {
>> +      output_asm_insn ("ldr\t%1, %0", operands);
>> +    }

Here.

>> +  if (!arm_arch5)
>> +    {
>> +      output_asm_insn ("mov\tlr, pc", operands);
>> +      output_asm_insn ("mov\tpc, %1", operands);
>> +    }
>> +  else
>> +    output_asm_insn ("blx\t%1", operands);

Here.

>> +  if (push_scratch)
>> +    output_asm_insn ("pop\t{%1}", operands);

Here and check the rest of the patch.


>> +}
>> +
> 
> 
>> +void
>> +arm_emit_long_call_profile()
> 
> s/()/ (void)
> 
> Missing comment.
> 
>> +{
>> +  rtx alcp = NULL;
>> +  rtx operands[2];
>> +  bool push_scratch;
>> +  /* find the arm_long_call_profile */
>> +  for (rtx_insn * insn = get_insns (); insn; insn = NEXT_INSN (insn))
>> +    {
>> +      if (NOTE_P (insn) && NOTE_KIND (insn) == NOTE_INSN_PROLOGUE_END)
>> +	break;
>> +      if (INSN_CODE (insn) == CODE_FOR_arm_long_call_profile
>> +	  || INSN_CODE (insn) == CODE_FOR_thumb1_long_call_profile)
>> +	{
>> +	  alcp = PATTERN (insn);
>> +	  break;
>> +	}
>> +    }
>> +  gcc_assert (alcp);
>> +
>> +  operands[0] = XEXP (XEXP (alcp, 0), 0);
>> +  if (TARGET_32BIT)
>> +    {
>> +      operands[1] = gen_rtx_REG (SImode, IP_REGNUM);
>> +      push_scratch = false;
>> +    }
>> +  else
>> +    {
>> +      /* for nested functions, we can set push_scratch to false, since
>> +	 final.c:profile_function.c and ASM_OUTPUT_REG_PUSH preserve it as
>> +         part of the sequence to preserve ip across the call to the 
>> +         profiling function. */
>> +      operands[1] = gen_rtx_REG (SImode, R0_REGNUM + 7);
>> +      push_scratch = !IS_NESTED (arm_current_func_type ());
>> +    }
>> +  arm_output_long_call_to_profile_func (operands, push_scratch);
>> +}
>> +
>>  #include "gt-arm.h"
>> diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
>> index 6352140..44f1c64 100644
>> --- a/gcc/config/arm/arm.h
>> +++ b/gcc/config/arm/arm.h
>> @@ -2044,7 +2044,9 @@ extern int making_const_table;
>>     that ASM_OUTPUT_REG_PUSH will be matched with ASM_OUTPUT_REG_POP, and
>>     that r7 isn't used by the function profiler, so we can use it as a
>>     scratch reg.  WARNING: This isn't safe in the general case!  It may be
>> -   sensitive to future changes in final.c:profile_function.  */
>> +   sensitive to future changes in final.c:profile_function. This is also
>> +   relied on in arm_emit_long_call_profile() which assumes r7 can be
>> +   used as a scratch register to load the address of the function profiler.  */
>>  #define ASM_OUTPUT_REG_PUSH(STREAM, REGNO)		\
>>    do							\
>>      {							\
>> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
>> index 47171b9..beb44d5 100644
>> --- a/gcc/config/arm/arm.md
>> +++ b/gcc/config/arm/arm.md
>> @@ -11424,6 +11424,15 @@
>>    DONE;
>>  })
>>  
>> +(define_insn "arm_long_call_profile"
>> +  [(unspec_volatile [(match_operand:SI 0 "general_operand" "ji,m")
>> +                    ] VUNSPEC_LONG_CALL_PROFILE)]
>> +  "TARGET_32BIT"
>> +  "%@ arm_long_call_profile"
>> +  [(set_attr "arm_pool_range" "*,4096")
>> +   (set_attr "arm_neg_pool_range" "*,4084")]
>> +)
> 
> Please set the length of this insn appropriately. IIUC this is the length of the instructions from FUNCTION_PROFILER, though I suspect we could push all of this out into the RTL stream. Underestimating the length of this insn can cause issues with the minipool placement code later especially with -pg.
> 
>> +
>>  ;; Vector bits common to IWMMXT and Neon
>>  (include "vec-common.md")
>>  ;; Load the Intel Wireless Multimedia Extension patterns
>> diff --git a/gcc/config/arm/bpabi.h b/gcc/config/arm/bpabi.h
>> index 5d6c4ed..7acf66b 100644
>> --- a/gcc/config/arm/bpabi.h
>> +++ b/gcc/config/arm/bpabi.h
>> @@ -174,11 +174,18 @@
>>  
>>  #undef  NO_PROFILE_COUNTERS
>>  #define NO_PROFILE_COUNTERS 1
>> +#undef  ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS
>> +#define ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS 1
> 
> Why do we need this ?
> 
>>  #undef  ARM_FUNCTION_PROFILER
>>  #define ARM_FUNCTION_PROFILER(STREAM, LABELNO)  			\
>>  {									\
>> -  fprintf (STREAM, "\tpush\t{lr}\n");					\
>> -  fprintf (STREAM, "\tbl\t__gnu_mcount_nc\n");				\
>> +  if (TARGET_LONG_CALLS)						\
>> +  { 									\
>> +    arm_emit_long_call_profile();					\
>> +  } else {								\
>> +    fprintf (STREAM, "\tpush\t{lr}\n");					\
>> +    fprintf (STREAM, "\tbl\t__gnu_mcount_nc\n");			\
>> +  }									\
>>  }
>>  
> 
> While you are here delete the definition of FUNCTION_PROFILER that is conditional on THUMB_FUNCTION_PROFILER, looks like that code is dead.
> 
>>  #undef SUBTARGET_FRAME_POINTER_REQUIRED
>> diff --git a/gcc/config/arm/thumb1.md b/gcc/config/arm/thumb1.md
>> index 072ed4d..9b1de5f 100644
>> --- a/gcc/config/arm/thumb1.md
>> +++ b/gcc/config/arm/thumb1.md
>> @@ -1798,7 +1798,7 @@
>>    [(unspec_volatile [(match_operand:SI 0 "s_register_operand" "l")]
>>  		    VUNSPEC_EH_RETURN)
>>     (clobber (match_scratch:SI 1 "=&l"))]
>> -  "TARGET_THUMB1"
>> +  "TARGET_THUMB1 && 0"
> 
> This means it wasn't tested for Thumb1 or you had to turn this off to get proper results for Thumb1 ?
> 
> Please retest.
> 
>>    "#"
>>    "&& reload_completed"
>>    [(const_int 0)]
>> @@ -1809,4 +1809,13 @@
>>    }"
>>    [(set_attr "type" "mov_reg")]
>>  )
>> +
>> +(define_insn "thumb1_long_call_profile"
>> +  [(unspec_volatile [(match_operand:SI 0 "general_operand" "j,m")
>> +                    ] VUNSPEC_LONG_CALL_PROFILE)]
>> +  "TARGET_THUMB1"
>> +  "%@ thumb1_long_call_profile"
>> +  [(set_attr "pool_range" "1018")]
>> +)
>> +
>>  
> 
> Please set length accordingly as well as check on the pool_range here.
> 
>> diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md
>> index 5744c62..d7ddc3a 100644
>> --- a/gcc/config/arm/unspecs.md
>> +++ b/gcc/config/arm/unspecs.md
>> @@ -148,6 +148,7 @@
>>    VUNSPEC_GET_FPSCR	; Represent fetch of FPSCR content.
>>    VUNSPEC_SET_FPSCR	; Represent assign of FPSCR content.
>>    VUNSPEC_PROBE_STACK_RANGE ; Represent stack range probing.
>> +  VUNSPEC_LONG_CALL_PROFILE ; Represent a long call to profile function
>>  ])
>>  
>>  ;; Enumerators for NEON unspecs.
>> diff --git a/gcc/testsuite/gcc.target/arm/pr69770.c b/gcc/testsuite/gcc.target/arm/pr69770.c
>> new file mode 100644
>> index 0000000..61e5c6d
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/arm/pr69770.c
>> @@ -0,0 +1,9 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-pg -mlong-calls" } */
>> +
>> +extern void g(void);
>> +
>> +int f() { g(); return 0; }
>> +
>> +/* { dg-final { scan-assembler-not "bl\[ \t\]+__gnu_mcount_nc" } } */
>> +/* { dg-final { scan-assembler "__gnu_mcount_nc" } } */
>> -- 1.9.1
>>
diff mbox

Patch

From 5a39451f34be9b6ca98b3460bf40d879d6ee61a5 Mon Sep 17 00:00:00 2001
From: Charles Baylis <charles.baylis@linaro.org>
Date: Thu, 24 Mar 2016 20:43:25 +0000
Subject: [PATCH] PR69770 -mlong-calls does not affect calls to __gnu_mcount_nc
 generated by -pg

gcc/ChangeLog:

2016-03-24  Charles Baylis  <charles.baylis@linaro.org>

        * config/arm/arm-protos.h (arm_emit_long_call_profile): New function.
        * config/arm/arm.c (arm_emit_long_call_profile_insn): New function.
        (arm_expand_prologue): Likewise.
        (thumb1_expand_prologue): Likewise.
        (arm_output_long_call_to_profile_func): Likewise.
        (arm_emit_long_call_profile): Likewise.
        * config/arm/arm.h: (ASM_OUTPUT_REG_PUSH) Update comment.
        * config/arm/arm.md (arm_long_call_profile): New pattern.
        * config/arm/bpabi.h (ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS): New
	define.
        * config/arm/thumb1.md (thumb1_long_call_profile): New pattern.
        * config/arm/unspecs.md (unspecv): Add VUNSPEC_LONG_CALL_PROFILE.

gcc/testsuite/ChangeLog:

2016-03-24  Charles Baylis  <charles.baylis@linaro.org>

        * gcc.target/arm/pr69770.c: New test.

Change-Id: I9b8de01fea083f17f729c3801f83174bedb3b0c6

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 0083673..324c9f4 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -343,6 +343,7 @@  extern void arm_register_target_pragmas (void);
 extern void arm_cpu_cpp_builtins (struct cpp_reader *);
 
 extern bool arm_is_constant_pool_ref (rtx);
+void arm_emit_long_call_profile ();
 
 /* Flags used to identify the presence of processor capabilities.  */
 
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index c868490..040b255 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -21426,6 +21426,22 @@  output_probe_stack_range (rtx reg1, rtx reg2)
   return "";
 }
 
+static void
+arm_emit_long_call_profile_insn ()
+{
+  rtx sym_ref = gen_rtx_SYMBOL_REF (Pmode, "__gnu_mcount_nc");
+  /* if movt/movw are not available, use a constant pool */
+  if (!arm_arch_thumb2)
+  {
+    sym_ref = force_const_mem(Pmode, sym_ref);
+  }
+  rtvec vec = gen_rtvec (1, sym_ref);
+  rtx tmp =
+    gen_rtx_UNSPEC_VOLATILE (VOIDmode, vec, VUNSPEC_LONG_CALL_PROFILE);
+  emit_insn (tmp);
+}
+
+
 /* Generate the prologue instructions for entry into an ARM or Thumb-2
    function.  */
 void
@@ -21789,6 +21805,10 @@  arm_expand_prologue (void)
       arm_load_pic_register (mask);
     }
 
+  if (crtl->profile && TARGET_LONG_CALLS
+      && ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS)
+    arm_emit_long_call_profile_insn ();
+
   /* If we are profiling, make sure no instructions are scheduled before
      the call to mcount.  Similarly if the user has requested no
      scheduling in the prolog.  Similarly if we want non-call exceptions
@@ -24985,6 +25005,10 @@  thumb1_expand_prologue (void)
   if (frame_pointer_needed)
     thumb_set_frame_pointer (offsets);
 
+  if (crtl->profile && TARGET_LONG_CALLS
+      && ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS)
+    arm_emit_long_call_profile_insn ();
+
   /* If we are profiling, make sure no instructions are scheduled before
      the call to mcount.  Similarly if the user has requested no
      scheduling in the prolog.  Similarly if we want non-call exceptions
@@ -30289,4 +30313,70 @@  arm_sched_fusion_priority (rtx_insn *insn, int max_pri,
   return;
 }
 
+static void
+arm_output_long_call_to_profile_func (rtx * operands, bool push_scratch)
+{
+  /* operands[0] is the address of the __gnu_mcount_nc function
+     operands[1] is the scratch register we use to load that address */
+  if (push_scratch)
+    output_asm_insn ("push\t{%1}", operands);
+  output_asm_insn ("push\t{lr}", operands);
+  if (GET_CODE (operands[0]) == SYMBOL_REF)
+    {
+      output_asm_insn ("movw\t%1, #:lower16:%c0", operands);
+      output_asm_insn ("movt\t%1, #:upper16:%c0", operands);
+    }
+  else
+    {
+      output_asm_insn ("ldr\t%1, %0", operands);
+    }
+  if (!arm_arch5)
+    {
+      output_asm_insn ("mov\tlr, pc", operands);
+      output_asm_insn ("mov\tpc, %1", operands);
+    }
+  else
+    output_asm_insn ("blx\t%1", operands);
+  if (push_scratch)
+    output_asm_insn ("pop\t{%1}", operands);
+}
+
+void
+arm_emit_long_call_profile()
+{
+  rtx alcp = NULL;
+  rtx operands[2];
+  bool push_scratch;
+  /* find the arm_long_call_profile */
+  for (rtx_insn * insn = get_insns (); insn; insn = NEXT_INSN (insn))
+    {
+      if (NOTE_P (insn) && NOTE_KIND (insn) == NOTE_INSN_PROLOGUE_END)
+	break;
+      if (INSN_CODE (insn) == CODE_FOR_arm_long_call_profile
+	  || INSN_CODE (insn) == CODE_FOR_thumb1_long_call_profile)
+	{
+	  alcp = PATTERN (insn);
+	  break;
+	}
+    }
+  gcc_assert (alcp);
+
+  operands[0] = XEXP (XEXP (alcp, 0), 0);
+  if (TARGET_32BIT)
+    {
+      operands[1] = gen_rtx_REG (SImode, IP_REGNUM);
+      push_scratch = false;
+    }
+  else
+    {
+      /* for nested functions, we can set push_scratch to false, since
+	 final.c:profile_function.c and ASM_OUTPUT_REG_PUSH preserve it as
+         part of the sequence to preserve ip across the call to the 
+         profiling function. */
+      operands[1] = gen_rtx_REG (SImode, R0_REGNUM + 7);
+      push_scratch = !IS_NESTED (arm_current_func_type ());
+    }
+  arm_output_long_call_to_profile_func (operands, push_scratch);
+}
+
 #include "gt-arm.h"
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 6352140..44f1c64 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -2044,7 +2044,9 @@  extern int making_const_table;
    that ASM_OUTPUT_REG_PUSH will be matched with ASM_OUTPUT_REG_POP, and
    that r7 isn't used by the function profiler, so we can use it as a
    scratch reg.  WARNING: This isn't safe in the general case!  It may be
-   sensitive to future changes in final.c:profile_function.  */
+   sensitive to future changes in final.c:profile_function. This is also
+   relied on in arm_emit_long_call_profile() which assumes r7 can be
+   used as a scratch register to load the address of the function profiler.  */
 #define ASM_OUTPUT_REG_PUSH(STREAM, REGNO)		\
   do							\
     {							\
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 47171b9..beb44d5 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -11424,6 +11424,15 @@ 
   DONE;
 })
 
+(define_insn "arm_long_call_profile"
+  [(unspec_volatile [(match_operand:SI 0 "general_operand" "ji,m")
+                    ] VUNSPEC_LONG_CALL_PROFILE)]
+  "TARGET_32BIT"
+  "%@ arm_long_call_profile"
+  [(set_attr "arm_pool_range" "*,4096")
+   (set_attr "arm_neg_pool_range" "*,4084")]
+)
+
 ;; Vector bits common to IWMMXT and Neon
 (include "vec-common.md")
 ;; Load the Intel Wireless Multimedia Extension patterns
diff --git a/gcc/config/arm/bpabi.h b/gcc/config/arm/bpabi.h
index 5d6c4ed..7acf66b 100644
--- a/gcc/config/arm/bpabi.h
+++ b/gcc/config/arm/bpabi.h
@@ -174,11 +174,18 @@ 
 
 #undef  NO_PROFILE_COUNTERS
 #define NO_PROFILE_COUNTERS 1
+#undef  ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS
+#define ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS 1
 #undef  ARM_FUNCTION_PROFILER
 #define ARM_FUNCTION_PROFILER(STREAM, LABELNO)  			\
 {									\
-  fprintf (STREAM, "\tpush\t{lr}\n");					\
-  fprintf (STREAM, "\tbl\t__gnu_mcount_nc\n");				\
+  if (TARGET_LONG_CALLS)						\
+  { 									\
+    arm_emit_long_call_profile();					\
+  } else {								\
+    fprintf (STREAM, "\tpush\t{lr}\n");					\
+    fprintf (STREAM, "\tbl\t__gnu_mcount_nc\n");			\
+  }									\
 }
 
 #undef SUBTARGET_FRAME_POINTER_REQUIRED
diff --git a/gcc/config/arm/thumb1.md b/gcc/config/arm/thumb1.md
index 072ed4d..9b1de5f 100644
--- a/gcc/config/arm/thumb1.md
+++ b/gcc/config/arm/thumb1.md
@@ -1798,7 +1798,7 @@ 
   [(unspec_volatile [(match_operand:SI 0 "s_register_operand" "l")]
 		    VUNSPEC_EH_RETURN)
    (clobber (match_scratch:SI 1 "=&l"))]
-  "TARGET_THUMB1"
+  "TARGET_THUMB1 && 0"
   "#"
   "&& reload_completed"
   [(const_int 0)]
@@ -1809,4 +1809,13 @@ 
   }"
   [(set_attr "type" "mov_reg")]
 )
+
+(define_insn "thumb1_long_call_profile"
+  [(unspec_volatile [(match_operand:SI 0 "general_operand" "j,m")
+                    ] VUNSPEC_LONG_CALL_PROFILE)]
+  "TARGET_THUMB1"
+  "%@ thumb1_long_call_profile"
+  [(set_attr "pool_range" "1018")]
+)
+
 
diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md
index 5744c62..d7ddc3a 100644
--- a/gcc/config/arm/unspecs.md
+++ b/gcc/config/arm/unspecs.md
@@ -148,6 +148,7 @@ 
   VUNSPEC_GET_FPSCR	; Represent fetch of FPSCR content.
   VUNSPEC_SET_FPSCR	; Represent assign of FPSCR content.
   VUNSPEC_PROBE_STACK_RANGE ; Represent stack range probing.
+  VUNSPEC_LONG_CALL_PROFILE ; Represent a long call to profile function
 ])
 
 ;; Enumerators for NEON unspecs.
diff --git a/gcc/testsuite/gcc.target/arm/pr69770.c b/gcc/testsuite/gcc.target/arm/pr69770.c
new file mode 100644
index 0000000..61e5c6d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr69770.c
@@ -0,0 +1,9 @@ 
+/* { dg-do compile } */
+/* { dg-options "-pg -mlong-calls" } */
+
+extern void g(void);
+
+int f() { g(); return 0; }
+
+/* { dg-final { scan-assembler-not "bl\[ \t\]+__gnu_mcount_nc" } } */
+/* { dg-final { scan-assembler "__gnu_mcount_nc" } } */
-- 
1.9.1