From patchwork Thu Mar 24 21:02:35 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Charles Baylis X-Patchwork-Id: 601810 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3qWJnT0xNRz9sCk for ; Fri, 25 Mar 2016 08:03:40 +1100 (AEDT) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=Wm/EGZOG; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :mime-version:date:message-id:subject:from:to:cc:content-type; q=dns; s=default; b=bXSbPOAdYmRxY5yEyAiFoB/32WQmB9VndkwCy6ZHLGs X/dvOzZQrLpmrY6d4CnRVAF1fYAoj0lKxPYyhUHdsIzYjMhqugyoiL63o+v3A37s Nn/9Pm56pjgtvb6JJ1pMjXC1mJqy9TvP0ekAgT1459ydipRRQc59eRssB4FEHx5c = DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :mime-version:date:message-id:subject:from:to:cc:content-type; s=default; bh=6yudsA2R/W504UtfEkItnGi4dJQ=; b=Wm/EGZOG6f/vOgELa 2UdaaCBYKgB/AHWKKGhkUZ8SPTdUjrbN+3uLvEV4T1MvQ2vl90yV90Tjpsit5pwY QrSnkxTUn2BgEX1dZjlY3UpEHocbO+SzktM2zWoYI+HgFLrm1RAOjrgMiGZ9hyPG LueJ+iVTlaEleDxxutGP/ehKjI= Received: (qmail 42745 invoked by alias); 24 Mar 2016 21:02:52 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 42633 invoked by uid 89); 24 Mar 2016 21:02:51 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.1 required=5.0 tests=AWL, BAYES_00, RCVD_IN_DNSWL_LOW, SPF_PASS autolearn=ham version=3.3.2 spammy=wireless, Wireless, scheduled, GET_CODE X-HELO: mail-ob0-f181.google.com Received: from mail-ob0-f181.google.com (HELO mail-ob0-f181.google.com) (209.85.214.181) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-GCM-SHA256 encrypted) ESMTPS; Thu, 24 Mar 2016 21:02:37 +0000 Received: by mail-ob0-f181.google.com with SMTP id fp4so47064690obb.2 for ; Thu, 24 Mar 2016 14:02:37 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:date:message-id:subject:from:to:cc; bh=/N485rzxrcdYBxvsfuTCXVIJXX9EQVCXWZDiH09AO/s=; b=TR5pzgqp95zJAc90sU2oyisdSj2DA75DlL3U4dE5OllWC33hch9IaH2fLz9l5ODBf+ iEPgnL/X+oKvCxwjQnpHARb32+o3DAidoPwiDIbGejxuDb9MDnABaVnSFJgEWJ3Szqi3 CJVgERley9N90/ZgOStkipyUltdEUc4md/vd9PbiNZIfX+BlNLR6hZMBctB5FAbuAJeT PfeNQ5G8BAgThp5YaLcY5dkHBl4vClVcdfV3euMUiGvmQbB7rHzuEJTUn9UFn2Z3faXA vz/8YZCV7ar1rYZSjpndU5k2NXI01QjDQoUWvJ2/UGFsjaeuu36tOro5qhdBoOnIVwpg xjBA== X-Gm-Message-State: AD7BkJJDFIQbidEc9GFwbtVpZxqL+Q73Tb0f8lu2QTtV3h2bLWgAyDu/4IMHc5DHEpc1wn9Nw5qaH5hmAYuTeFG8 MIME-Version: 1.0 X-Received: by 10.182.138.73 with SMTP id qo9mr5697591obb.0.1458853355710; Thu, 24 Mar 2016 14:02:35 -0700 (PDT) Received: by 10.202.214.2 with HTTP; Thu, 24 Mar 2016 14:02:35 -0700 (PDT) Date: Thu, 24 Mar 2016 21:02:35 +0000 Message-ID: Subject: [PATCH ARM v2] PR69770 -mlong-calls does not affect calls to __gnu_mcount_nc generated by -pg From: Charles Baylis To: Kyrylo Tkachov , Ramana Radhakrishnan , Richard Earnshaw Cc: GCC Patches X-IsSubscribed: yes When compiling with -mlong-calls and -pg, calls to the __gnu_mcount_nc function are not generated as long calls. This is the sequel to this patch https://gcc.gnu.org/ml/gcc-patches/2016-02/msg00881.html This patch fixes the following problems with the previous patch. . Nested functions now work (thanks to Richard E for spotting this) . Thumb-1 now works This patch works by adding new patterns (one for ARM/Thumb-2 and one for Thumb-1) which are placed in the prologue as a placeholder for some RTL which describes the address. This is either a SYMBOL_REF for targets with MOVW/MOVT, or a literal pool reference for other targets. The implementation of ARM_FUNCTION_PROFILER is changed to search for this insn so that the the address of the __gnu_mcount_nc function can be loaded using an appropriate sequence for the target. I also tried generating the profiling call sequence directly in the prologue, but this requires some unpleasant hacks to prevent spurious register pushes from ASM_OUTPUT_REG_PUSH. Tested with no new regressions on arm-unknown-linux-gnueabihf on QEMU. The generated code sequences have been inspected for normal and nested functions on ARM v6, ARM v7, Thumb-1, and Thumb-2 targets. This does not fix a regression, so I don't expect to apply it for GCC6, is it OK for when stage 1 re-opens. gcc/ChangeLog: 2016-03-24 Charles Baylis * config/arm/arm-protos.h (arm_emit_long_call_profile): New function. * config/arm/arm.c (arm_emit_long_call_profile_insn): New function. (arm_expand_prologue): Likewise. (thumb1_expand_prologue): Likewise. (arm_output_long_call_to_profile_func): Likewise. (arm_emit_long_call_profile): Likewise. * config/arm/arm.h: (ASM_OUTPUT_REG_PUSH) Update comment. * config/arm/arm.md (arm_long_call_profile): New pattern. * config/arm/bpabi.h (ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS): New define. * config/arm/thumb1.md (thumb1_long_call_profile): New pattern. * config/arm/unspecs.md (unspecv): Add VUNSPEC_LONG_CALL_PROFILE. gcc/testsuite/ChangeLog: 2016-03-24 Charles Baylis * gcc.target/arm/pr69770.c: New test. From 5a39451f34be9b6ca98b3460bf40d879d6ee61a5 Mon Sep 17 00:00:00 2001 From: Charles Baylis Date: Thu, 24 Mar 2016 20:43:25 +0000 Subject: [PATCH] PR69770 -mlong-calls does not affect calls to __gnu_mcount_nc generated by -pg gcc/ChangeLog: 2016-03-24 Charles Baylis * config/arm/arm-protos.h (arm_emit_long_call_profile): New function. * config/arm/arm.c (arm_emit_long_call_profile_insn): New function. (arm_expand_prologue): Likewise. (thumb1_expand_prologue): Likewise. (arm_output_long_call_to_profile_func): Likewise. (arm_emit_long_call_profile): Likewise. * config/arm/arm.h: (ASM_OUTPUT_REG_PUSH) Update comment. * config/arm/arm.md (arm_long_call_profile): New pattern. * config/arm/bpabi.h (ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS): New define. * config/arm/thumb1.md (thumb1_long_call_profile): New pattern. * config/arm/unspecs.md (unspecv): Add VUNSPEC_LONG_CALL_PROFILE. gcc/testsuite/ChangeLog: 2016-03-24 Charles Baylis * gcc.target/arm/pr69770.c: New test. Change-Id: I9b8de01fea083f17f729c3801f83174bedb3b0c6 diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h index 0083673..324c9f4 100644 --- a/gcc/config/arm/arm-protos.h +++ b/gcc/config/arm/arm-protos.h @@ -343,6 +343,7 @@ extern void arm_register_target_pragmas (void); extern void arm_cpu_cpp_builtins (struct cpp_reader *); extern bool arm_is_constant_pool_ref (rtx); +void arm_emit_long_call_profile (); /* Flags used to identify the presence of processor capabilities. */ diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index c868490..040b255 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -21426,6 +21426,22 @@ output_probe_stack_range (rtx reg1, rtx reg2) return ""; } +static void +arm_emit_long_call_profile_insn () +{ + rtx sym_ref = gen_rtx_SYMBOL_REF (Pmode, "__gnu_mcount_nc"); + /* if movt/movw are not available, use a constant pool */ + if (!arm_arch_thumb2) + { + sym_ref = force_const_mem(Pmode, sym_ref); + } + rtvec vec = gen_rtvec (1, sym_ref); + rtx tmp = + gen_rtx_UNSPEC_VOLATILE (VOIDmode, vec, VUNSPEC_LONG_CALL_PROFILE); + emit_insn (tmp); +} + + /* Generate the prologue instructions for entry into an ARM or Thumb-2 function. */ void @@ -21789,6 +21805,10 @@ arm_expand_prologue (void) arm_load_pic_register (mask); } + if (crtl->profile && TARGET_LONG_CALLS + && ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS) + arm_emit_long_call_profile_insn (); + /* If we are profiling, make sure no instructions are scheduled before the call to mcount. Similarly if the user has requested no scheduling in the prolog. Similarly if we want non-call exceptions @@ -24985,6 +25005,10 @@ thumb1_expand_prologue (void) if (frame_pointer_needed) thumb_set_frame_pointer (offsets); + if (crtl->profile && TARGET_LONG_CALLS + && ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS) + arm_emit_long_call_profile_insn (); + /* If we are profiling, make sure no instructions are scheduled before the call to mcount. Similarly if the user has requested no scheduling in the prolog. Similarly if we want non-call exceptions @@ -30289,4 +30313,70 @@ arm_sched_fusion_priority (rtx_insn *insn, int max_pri, return; } +static void +arm_output_long_call_to_profile_func (rtx * operands, bool push_scratch) +{ + /* operands[0] is the address of the __gnu_mcount_nc function + operands[1] is the scratch register we use to load that address */ + if (push_scratch) + output_asm_insn ("push\t{%1}", operands); + output_asm_insn ("push\t{lr}", operands); + if (GET_CODE (operands[0]) == SYMBOL_REF) + { + output_asm_insn ("movw\t%1, #:lower16:%c0", operands); + output_asm_insn ("movt\t%1, #:upper16:%c0", operands); + } + else + { + output_asm_insn ("ldr\t%1, %0", operands); + } + if (!arm_arch5) + { + output_asm_insn ("mov\tlr, pc", operands); + output_asm_insn ("mov\tpc, %1", operands); + } + else + output_asm_insn ("blx\t%1", operands); + if (push_scratch) + output_asm_insn ("pop\t{%1}", operands); +} + +void +arm_emit_long_call_profile() +{ + rtx alcp = NULL; + rtx operands[2]; + bool push_scratch; + /* find the arm_long_call_profile */ + for (rtx_insn * insn = get_insns (); insn; insn = NEXT_INSN (insn)) + { + if (NOTE_P (insn) && NOTE_KIND (insn) == NOTE_INSN_PROLOGUE_END) + break; + if (INSN_CODE (insn) == CODE_FOR_arm_long_call_profile + || INSN_CODE (insn) == CODE_FOR_thumb1_long_call_profile) + { + alcp = PATTERN (insn); + break; + } + } + gcc_assert (alcp); + + operands[0] = XEXP (XEXP (alcp, 0), 0); + if (TARGET_32BIT) + { + operands[1] = gen_rtx_REG (SImode, IP_REGNUM); + push_scratch = false; + } + else + { + /* for nested functions, we can set push_scratch to false, since + final.c:profile_function.c and ASM_OUTPUT_REG_PUSH preserve it as + part of the sequence to preserve ip across the call to the + profiling function. */ + operands[1] = gen_rtx_REG (SImode, R0_REGNUM + 7); + push_scratch = !IS_NESTED (arm_current_func_type ()); + } + arm_output_long_call_to_profile_func (operands, push_scratch); +} + #include "gt-arm.h" diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h index 6352140..44f1c64 100644 --- a/gcc/config/arm/arm.h +++ b/gcc/config/arm/arm.h @@ -2044,7 +2044,9 @@ extern int making_const_table; that ASM_OUTPUT_REG_PUSH will be matched with ASM_OUTPUT_REG_POP, and that r7 isn't used by the function profiler, so we can use it as a scratch reg. WARNING: This isn't safe in the general case! It may be - sensitive to future changes in final.c:profile_function. */ + sensitive to future changes in final.c:profile_function. This is also + relied on in arm_emit_long_call_profile() which assumes r7 can be + used as a scratch register to load the address of the function profiler. */ #define ASM_OUTPUT_REG_PUSH(STREAM, REGNO) \ do \ { \ diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index 47171b9..beb44d5 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -11424,6 +11424,15 @@ DONE; }) +(define_insn "arm_long_call_profile" + [(unspec_volatile [(match_operand:SI 0 "general_operand" "ji,m") + ] VUNSPEC_LONG_CALL_PROFILE)] + "TARGET_32BIT" + "%@ arm_long_call_profile" + [(set_attr "arm_pool_range" "*,4096") + (set_attr "arm_neg_pool_range" "*,4084")] +) + ;; Vector bits common to IWMMXT and Neon (include "vec-common.md") ;; Load the Intel Wireless Multimedia Extension patterns diff --git a/gcc/config/arm/bpabi.h b/gcc/config/arm/bpabi.h index 5d6c4ed..7acf66b 100644 --- a/gcc/config/arm/bpabi.h +++ b/gcc/config/arm/bpabi.h @@ -174,11 +174,18 @@ #undef NO_PROFILE_COUNTERS #define NO_PROFILE_COUNTERS 1 +#undef ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS +#define ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS 1 #undef ARM_FUNCTION_PROFILER #define ARM_FUNCTION_PROFILER(STREAM, LABELNO) \ { \ - fprintf (STREAM, "\tpush\t{lr}\n"); \ - fprintf (STREAM, "\tbl\t__gnu_mcount_nc\n"); \ + if (TARGET_LONG_CALLS) \ + { \ + arm_emit_long_call_profile(); \ + } else { \ + fprintf (STREAM, "\tpush\t{lr}\n"); \ + fprintf (STREAM, "\tbl\t__gnu_mcount_nc\n"); \ + } \ } #undef SUBTARGET_FRAME_POINTER_REQUIRED diff --git a/gcc/config/arm/thumb1.md b/gcc/config/arm/thumb1.md index 072ed4d..9b1de5f 100644 --- a/gcc/config/arm/thumb1.md +++ b/gcc/config/arm/thumb1.md @@ -1798,7 +1798,7 @@ [(unspec_volatile [(match_operand:SI 0 "s_register_operand" "l")] VUNSPEC_EH_RETURN) (clobber (match_scratch:SI 1 "=&l"))] - "TARGET_THUMB1" + "TARGET_THUMB1 && 0" "#" "&& reload_completed" [(const_int 0)] @@ -1809,4 +1809,13 @@ }" [(set_attr "type" "mov_reg")] ) + +(define_insn "thumb1_long_call_profile" + [(unspec_volatile [(match_operand:SI 0 "general_operand" "j,m") + ] VUNSPEC_LONG_CALL_PROFILE)] + "TARGET_THUMB1" + "%@ thumb1_long_call_profile" + [(set_attr "pool_range" "1018")] +) + diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md index 5744c62..d7ddc3a 100644 --- a/gcc/config/arm/unspecs.md +++ b/gcc/config/arm/unspecs.md @@ -148,6 +148,7 @@ VUNSPEC_GET_FPSCR ; Represent fetch of FPSCR content. VUNSPEC_SET_FPSCR ; Represent assign of FPSCR content. VUNSPEC_PROBE_STACK_RANGE ; Represent stack range probing. + VUNSPEC_LONG_CALL_PROFILE ; Represent a long call to profile function ]) ;; Enumerators for NEON unspecs. diff --git a/gcc/testsuite/gcc.target/arm/pr69770.c b/gcc/testsuite/gcc.target/arm/pr69770.c new file mode 100644 index 0000000..61e5c6d --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/pr69770.c @@ -0,0 +1,9 @@ +/* { dg-do compile } */ +/* { dg-options "-pg -mlong-calls" } */ + +extern void g(void); + +int f() { g(); return 0; } + +/* { dg-final { scan-assembler-not "bl\[ \t\]+__gnu_mcount_nc" } } */ +/* { dg-final { scan-assembler "__gnu_mcount_nc" } } */ -- 1.9.1