[v6] add -fprolog-pad=N,M option

Message ID 20170217165721.GA25613@suse.de
State New

Commit Message

Torsten Duwe Feb. 17, 2017, 4:57 p.m. UTC
Hi,

Thanks for all the feedback. Hopefully it's all incorporated now.
I will reply to you individually on the specific topics, but here is
the new v6 for you to rip apart ;-)

Changes since v5:

* ChangeLogs split, reshuffled, reformatted.

* cmdline option parsing again with integral_argument ()

* Documentation has less "pad"s

* completely reworked default_print_prolog_pad ()
  -- never liked the old version either.

	Torsten
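For readers following the two-value syntax, here is a minimal standalone sketch of how an "N[,M]" argument can be split and validated. It is not the patch code: the patch uses GCC's integral_argument (), while this sketch swaps in strtol so that it compiles outside the GCC tree. The name parse_pad_arg is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper, not part of the patch: parse "N" or "N,M".
   Stores N in *SIZE and M in *ENTRY (M defaults to 0).  Returns 0 on
   success, -1 on malformed input, negative values, or M > N.  */
int
parse_pad_arg (const char *arg, long *size, long *entry)
{
  char *end;
  long n, m = 0;

  assert (arg && size && entry);

  errno = 0;
  n = strtol (arg, &end, 10);
  if (end == arg || errno == ERANGE || n < 0)
    return -1;

  if (*end == ',')
    {
      const char *second = end + 1;

      errno = 0;
      m = strtol (second, &end, 10);
      if (end == second || errno == ERANGE || m < 0)
	return -1;
    }

  /* Reject trailing garbage and an entry offset beyond the pad.  */
  if (*end != '\0' || m > n)
    return -1;

  *size = n;
  *entry = m;
  return 0;
}
```

The checks mirror the conditions the opts.c hunk rejects (negative size, negative entry, size smaller than entry). Unlike the patch, the sketch also rejects trailing characters explicitly.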



gcc/c-family/ChangeLog
2017-02-17  Torsten Duwe  <duwe@suse.de>

	* c-attribs.c (c_common_attribute_table): Add entry for "prolog_pad".

gcc/lto/ChangeLog
2017-02-17  Torsten Duwe  <duwe@suse.de>

	* lto-lang.c (lto_attribute_table): Add entry for "prolog_pad".

gcc/ChangeLog
2017-02-17  Torsten Duwe  <duwe@suse.de>

	* common.opt: Introduce -fprolog-pad command line option,
	and its variables prolog_nop_pad_size and prolog_nop_pad_entry.
	* opts.c (common_handle_option): Add OPT_fprolog_pad_ case,
	including a two-value parser.
	* target.def (print_prolog_pad): New target hook.
	* targhooks.h (default_print_prolog_pad): New function.
	* targhooks.c (default_print_prolog_pad): Likewise.
	* toplev.c (process_options): Switch off IPA-RA if
	prolog pads are being generated.
	* varasm.c (assemble_start_function): Look at the prolog-pad command
	line switch and current function attributes and maybe generate NOP
	instructions by calling the print_prolog_pad hook.
	* doc/extend.texi: Document prolog_pad attribute.
	* doc/invoke.texi: Document -fprolog-pad command line option.
	* doc/tm.texi.in (TARGET_ASM_PRINT_PROLOG_PAD): New target hook.
	* doc/tm.texi: Likewise.

gcc/testsuite/ChangeLog
2017-02-17  Torsten Duwe  <duwe@suse.de>

	* c-c++-common/attribute-prolog_pad-1.c: New test.

Comments

Sandra Loosemore Feb. 18, 2017, 6:30 a.m. UTC | #1
On 02/17/2017 09:57 AM, Torsten Duwe wrote:

> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index 3d1546a..ef7e985 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -3076,6 +3076,23 @@ that affect more than one function.
>   This attribute should be used for debugging purposes only.  It is not
>   suitable in production code.
>
> +@item prolog_pad
> +@cindex @code{prolog_pad} function attribute

I'm only a documentation maintainer so this is out of my area of 
responsibility, but I really wish we could rename the attribute and 
command-line option.  Per

https://gcc.gnu.org/codingconventions.html#Spelling

the correct spelling is "prologue".

> +@cindex extra NOP instructions at the function entry point
> +In case the target's text segment can be made writable at run time
> +by any means, padding the function entry with a number of NOPs can
> +be used to provide a universal tool for instrumentation.  Usually,
> +prolog padding is enabled globally using the @option{-fprolog-pad=N,M}

definitely s/prolog/prologue/ in the running text here.

> +command-line switch, and disabled with attribute @code{prolog_pad (0)}
> +for functions that are part of the actual instrumentation framework.
> +This conveniently avoids an endless recursion.
> +The @code{prolog_pad} function attribute can be used to
> +change the pad size to any desired value.  The two-value syntax is
> +the same as for the command-line switch @option{-fprolog-pad=N,M},

Add a cross-reference here.

> +generating a NOP pad of size @var{N}, with the function entry point

Sizes are usually expressed in bytes.  I think some other unit is 
intended here, though, so I'd avoid "size" and use some other way to 
describe it.  Maybe "generating a pad of @var{N} NOP instructions".

> +@var{M} NOP instructions into the pad.  @var{M} defaults to 0
> +if omitted e.g. function entry point is before the first NOP.
> +
>   @item pure
>   @cindex @code{pure} function attribute
>   @cindex functions that have no side effects
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 56ca53f..75a7e2c 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -11370,6 +11370,31 @@ of the function name, it is considered to be a match.  For C99 and C++
>   extended identifiers, the function name must be given in UTF-8, not
>   using universal character names.
>
> +@item -fprolog-pad=@var{N}[,@var{M}]
> +@opindex fprolog-pad
> +Generate a pad of @var{N} NOPs right at the beginning
> +of each function, with the function entry point @var{M} NOPs into
> +the pad.  If @var{M} is omitted, it defaults to @code{0} so the
> +function entry points to the address just at the first NOP.
> +The NOP instructions reserve extra space which can be used to patch in
> +any desired instrumentation at run time, provided that the code segment
> +is writable.  The amount of space is only controllable indirectly via
> +the number of NOPs, so implementers are advised to use the smallest
> +NOP instruction available for the current CPU mode should there be a
> +choice, in order to achieve the finest granularity.

The audience of the GCC user manual is users, not implementers.  If this 
is really just "advice" on what the option should do in the presence of 
multiple instruction sizes, and not a firm requirement, then rewrite as 
something like:

The amount of space reserved is expressed as the number of NOP 
instructions to insert. On targets that have multiple instruction sizes, 
typically the smallest NOP instruction available for the current CPU 
mode is used to achieve the finest granularity.

...except that I don't think "CPU mode" is really what you intend here.
E.g. on nios2, support for 16-bit instructions is a code generation
option (-mcdx) rather than a -mcpu= or -march= option, and there is 
certainly no runtime processor mode selection involved.

If this is really a firm requirement, I think the burden is on you to 
identify all backends that have multiple NOP sizes for which the default 
hook implementation won't give the required behavior, and either provide 
an appropriate hook or work with the backend maintainers to develop one.
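
To make the units concrete, here is a rough sketch of the arithmetic implied by the documentation text under review. It assumes that every NOP in the pad has the same width (nop_bytes), which is exactly the point in question for targets with several instruction sizes. The struct and function names are hypothetical and appear nowhere in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of one pad, in bytes.  */
struct pad_layout
{
  uintptr_t start;  /* Address of the first NOP (the recorded location).  */
  uintptr_t end;    /* One past the last NOP.  */
  uintptr_t bytes;  /* Total patchable space.  */
};

/* Compute the pad for a function whose entry label is at ENTRY, with N
   NOPs in total and M of them placed before the entry label.
   NOP_BYTES is the assumed fixed width of one NOP instruction.  */
struct pad_layout
pad_layout_for (uintptr_t entry, unsigned n, unsigned m, unsigned nop_bytes)
{
  struct pad_layout l;

  assert (m <= n && nop_bytes > 0);

  l.bytes = (uintptr_t) n * nop_bytes;
  l.start = entry - (uintptr_t) m * nop_bytes;
  l.end = l.start + l.bytes;
  return l;
}
```

With N=3, M=1 and a 4-byte NOP, a function entered at 0x1000 gets 12 patchable bytes spanning 0xffc up to (not including) 0x1008. With a 2-byte NOP the same option reserves only 6 bytes, which is why the choice of NOP instruction matters to users.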

I'd put a paragraph break here, before:

> +For run-time identification, the starting addresses
> +of these pads, which correspond to their respective function entries
> +minus @var{M}, are additionally collected in the @code{__prolog_pads_loc}
> +section of the resulting binary.
> +
> +Note that the value of @code{__attribute__ ((prolog_pad (N,M)))} takes
> +precedence over command-line option @option{-fprolog-pad=N,M}.

@var{N} and @var{M} in both places, please.  And add a cross-reference.
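
The precedence rule can be sketched as follows. This is a simplified model of what the varasm.c hunk in this patch appears to do, not the patch code itself: an attribute, when present, replaces both command-line values (a one-value attribute implies M = 0), and an entry offset larger than the pad size is reset to 0. The names pad_spec and effective_pad are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

struct pad_spec
{
  unsigned long size;   /* N: total number of NOPs in the pad.  */
  unsigned long entry;  /* M: NOPs placed before the entry label.  */
};

/* Hypothetical helper: pick the pad for one function.  CMDLINE holds
   the -fprolog-pad=N,M values; ATTR holds the attribute values and is
   only consulted when HAS_ATTR is true.  */
struct pad_spec
effective_pad (struct pad_spec cmdline, bool has_attr, struct pad_spec attr)
{
  struct pad_spec p = has_attr ? attr : cmdline;

  /* As in the patch, an inconsistent entry offset falls back to 0.  */
  if (p.entry > p.size)
    p.entry = 0;

  assert (p.entry <= p.size);
  return p;
}
```

Applied to the test case in this patch (compiled with -fprolog-pad=3,1), f1 would get (2,1), f2 (3,0), f3 (0,0) and therefore no pad, and f4 the command-line default (3,1).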

> +This can be used to increase the pad size or to remove it completely
> +on a single function.  If @code{N=0}, no pad location is recorded.

That's sloppy markup.  How about

If @var{N} is zero, ....

> +The NOP instructions are inserted at (and maybe before) the function entry
> +address, even before the prologue.
> +
>   @end table
>
>
> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> index 348fd68..5155d10 100644
> --- a/gcc/doc/tm.texi
> +++ b/gcc/doc/tm.texi
> @@ -4566,6 +4566,10 @@ will select the smallest suitable mode.
>   This section describes the macros that output function entry
>   (@dfn{prologue}) and exit (@dfn{epilogue}) code.
>
> +@deftypefn {Target Hook} void TARGET_ASM_PRINT_PROLOG_PAD (FILE *@var{file}, unsigned HOST_WIDE_INT @var{pad_size}, bool @var{record_p})
> +Generate prologue pad

Sigh again.  Every other prologue-related target hook uses the PROLOGUE 
spelling.  :-S

Missing punctuation at the end of the sentence, and there's not nearly 
enough information about what this hook should do.

Also, I suggest moving this down in the section instead of listing it 
first, because of the usual principle of giving the most important 
information first.

> +@end deftypefn
> +
>   @deftypefn {Target Hook} void TARGET_ASM_FUNCTION_PROLOGUE (FILE *@var{file}, HOST_WIDE_INT @var{size})
>   If defined, a function that outputs the assembler code for entry to a
>   function.  The prologue is responsible for setting up the stack frame,

-Sandra
Torsten Duwe March 1, 2017, 11:26 a.m. UTC | #2
On Fri, Feb 17, 2017 at 11:30:29PM -0700, Sandra Loosemore wrote:
> >
> >+@item prolog_pad
> >+@cindex @code{prolog_pad} function attribute
> 
> I'm only a documentation maintainer so this is out of my area of
> responsibility, but I really wish we could rename the attribute and
> command-line option.  Per
> 
> per https://gcc.gnu.org/codingconventions.html#Spelling
> 
> the correct spelling is "prologue".
> 
> >+@cindex extra NOP instructions at the function entry point
> >+In case the target's text segment can be made writable at run time
> >+by any means, padding the function entry with a number of NOPs can
> >+be used to provide a universal tool for instrumentation.  Usually,
> >+prolog padding is enabled globally using the @option{-fprolog-pad=N,M}
> 
> definitely s/prolog/prologue/ in the running text here.

Well, you're definitely right in both cases.

There are about 400 occurrences of "prolog" in the source, not counting
ChangeLogs, mainly in gcc/tree-vect-loop-manip.c and in
libgcc/config/libbid/bid_conf.h, versus about 3000 lines with "prologue".
However, there is a "-mprolog-function" switch. One might call this a
broken window. I don't want to contribute to that.

However, writing some more documentation and being asked for clarity,
I found it more descriptive to talk about the function entry point than
about the prologue. Also, this is about generic instrumentation, and it
surely involves NOPs.

So, hereby I'd like to start a small poll for a good name for this feature.
Anyone with a better idea please speak up now. Otherwise I'll just
s/prolog/prologue/g.

> The amount of space reserved is expressed as the number of NOP instructions
> to insert. On targets that have multiple instruction sizes, typically the
> smallest NOP instruction available for the current CPU mode is used to
> achieve the finest granularity.

I've made another improvement which makes the code even more robust now.
+DEF_TARGET_INSN (nop, (void))
In gcc/target-insns.def. This way I can easily check whether there is a
(define_insn "nop" ...) in the target md. Currently, all CPUs have it, but
who knows.

This will also be the default instruction used (it can be overridden
in the target hook), so that rule has changed.

So, before the next version, any clever name suggestions?

	Torsten
Richard Earnshaw (lists) March 1, 2017, 11:34 a.m. UTC | #3
On 01/03/17 11:26, Torsten Duwe wrote:
> On Fri, Feb 17, 2017 at 11:30:29PM -0700, Sandra Loosemore wrote:
>>>
>>> +@item prolog_pad
>>> +@cindex @code{prolog_pad} function attribute
>>
>> I'm only a documentation maintainer so this is out of my area of
>> responsibility, but I really wish we could rename the attribute and
>> command-line option.  Per
>>
>> per https://gcc.gnu.org/codingconventions.html#Spelling
>>
>> the correct spelling is "prologue".
>>
>>> +@cindex extra NOP instructions at the function entry point
>>> +In case the target's text segment can be made writable at run time
>>> +by any means, padding the function entry with a number of NOPs can
>>> +be used to provide a universal tool for instrumentation.  Usually,
>>> +prolog padding is enabled globally using the @option{-fprolog-pad=N,M}
>>
>> definitely s/prolog/prologue/ in the running text here.
> 
> Well, you're definitely right in both cases.
> 
> About 400 occurrences of "prolog" in the source without ChangeLogs,
> mainly in gcc/tree-vect-loop-manip.c and in libgcc/config/libbid/bid_conf.h;
> about 3000 lines with "prologue". However there is a "-mprolog-function"
> switch. One might call this a broken window. I don't want to contribute
> to that.
> 
> However, writing some more documentation and being asked for clarity,
> I found it more depicting to talk about the function entry point than
> about the prologue. Also, this is about generic instrumentation, and it
> surely involves NOPs.
> 
> So, hereby I'd like to start a small poll for a good name for this feature.
> Anyone with a better idea please speak up now. Otherwise I'll just
> s/prolog/prologue/g.

Hmm, I'd prefer the bike shed to be green :-)

How about -fpatchable-function-entry=<size-spec>?

> 
>> The amount of space reserved is expressed as the number of NOP instructions
>> to insert. On targets that have multiple instruction sizes, typically the
>> smallest NOP instruction available for the current CPU mode is used to
>> achieve the finest granularity.
> 
> I've made another improvement which makes the code even more robust now.
> +DEF_TARGET_INSN (nop, (void))
> In gcc/target-insns.def. This way I can easily check whether there is a
> (define_insn "nop" ...) in the target md. Currently, all CPUs have it, but
> who knows.

The mid-end already has direct calls to gen_nop with no guards on the
pattern existing, so the compiler won't build without a NOP pattern.

> 
> This will also be the default instruction used (It can be overridden
> in the target hook), so that rule has changed.
> 
> So, before the next version, any clever name suggestions?
> 
> 	Torsten
> 

R.
Torsten Duwe March 1, 2017, 1:32 p.m. UTC | #4
On Wed, Mar 01, 2017 at 11:34:37AM +0000, Richard Earnshaw (lists) wrote:
> On 01/03/17 11:26, Torsten Duwe wrote:
> > 
> > However, writing some more documentation and being asked for clarity,
> > I found it more depicting to talk about the function entry point than
> > about the prologue. Also, this is about generic instrumentation, and it
> > surely involves NOPs.
> > 
> > So, hereby I'd like to start a small poll for a good name for this feature.
> > Anyone with a better idea please speak up now. Otherwise I'll just
> > s/prolog/prologue/g.
> 
> Hmm, I'd prefer the bike shed to be green :-)
> 
> How about --fpatchable-function-entry=<size-spec>?
> 
IMHO qualifies as "better". And green is best anyway :-]

> > I've made another improvement which makes the code even more robust now.
> > +DEF_TARGET_INSN (nop, (void))
> > In gcc/target-insns.def. This way I can easily check whether there is a
> > (define_insn "nop" ...) in the target md. Currently, all CPUs have it, but
> > who knows.
> 
> The mid-end already has direct calls to gen_nop with no guards on the
> pattern existing,  So the compiler won't build without a NOP pattern.

Richard told me "don't do that", and we found the DEF_TARGET_INSN. So far
I can see gen_nop only in target specifics and in cfgrtl.c -- admittedly
I don't know what that does.

So the v6 code is basically OK?

Names better than -fpatchable-function-entry anyone?

	Torsten
Richard Earnshaw (lists) March 1, 2017, 1:35 p.m. UTC | #5
On 01/03/17 13:32, Torsten Duwe wrote:
> On Wed, Mar 01, 2017 at 11:34:37AM +0000, Richard Earnshaw (lists) wrote:
>> On 01/03/17 11:26, Torsten Duwe wrote:
>>>
>>> However, writing some more documentation and being asked for clarity,
>>> I found it more depicting to talk about the function entry point than
>>> about the prologue. Also, this is about generic instrumentation, and it
>>> surely involves NOPs.
>>>
>>> So, hereby I'd like to start a small poll for a good name for this feature.
>>> Anyone with a better idea please speak up now. Otherwise I'll just
>>> s/prolog/prologue/g.
>>
>> Hmm, I'd prefer the bike shed to be green :-)
>>
>> How about --fpatchable-function-entry=<size-spec>?
>>
> IMHO qualifies as "better". And green is best anyway :-]
> 
>>> I've made another improvement which makes the code even more robust now.
>>> +DEF_TARGET_INSN (nop, (void))
>>> In gcc/target-insns.def. This way I can easily check whether there is a
>>> (define_insn "nop" ...) in the target md. Currently, all CPUs have it, but
>>> who knows.
>>
>> The mid-end already has direct calls to gen_nop with no guards on the
>> pattern existing,  So the compiler won't build without a NOP pattern.
> 
> Richard told me "don't do that", and we found the DEF_TARGET_INSN. So far
> I can see gen_nop only in target specifics and in cfgrtl.c -- admittedly
> I don't know what that does.
> 
> So the v6 code is basically OK?
> 
I haven't reviewed it yet.  I'm not really planning to spend any more
time on this until stage1 re-opens.

R.

> Names better than -fpatchable-function-entry anyone?
> 
> 	Torsten
>

Patch

diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index ce7fcaa..9f0f580 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -139,6 +139,7 @@  static tree handle_bnd_variable_size_attribute (tree *, tree, tree, int, bool *)
 static tree handle_bnd_legacy (tree *, tree, tree, int, bool *);
 static tree handle_bnd_instrument (tree *, tree, tree, int, bool *);
 static tree handle_fallthrough_attribute (tree *, tree, tree, int, bool *);
+static tree handle_prolog_pad_attribute (tree *, tree, tree, int, bool *);
 
 /* Table of machine-independent attributes common to all C-like languages.
 
@@ -345,6 +346,8 @@  const struct attribute_spec c_common_attribute_table[] =
 			      handle_bnd_instrument, false },
   { "fallthrough",	      0, 0, false, false, false,
 			      handle_fallthrough_attribute, false },
+  { "prolog_pad",	      1, 2, true, false, false,
+			      handle_prolog_pad_attribute, false },
   { NULL,                     0, 0, false, false, false, NULL, false }
 };
 
@@ -3173,3 +3176,10 @@  handle_fallthrough_attribute (tree *, tree name, tree, int,
   *no_add_attrs = true;
   return NULL_TREE;
 }
+
+static tree
+handle_prolog_pad_attribute (tree *, tree, tree, int, bool *)
+{
+  /* Nothing to be done here.  */
+  return NULL_TREE;
+}
diff --git a/gcc/common.opt b/gcc/common.opt
index ad6baa3..02993b1 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -163,6 +163,13 @@  bool flag_stack_usage_info = false
 Variable
 int flag_debug_asm
 
+; How many NOP insns to place before each function prologue by default
+Variable
+HOST_WIDE_INT prolog_nop_pad_size
+
+; And how far the asm entry point is into this pad
+Variable
+HOST_WIDE_INT prolog_nop_pad_entry
 
 ; Balance between GNAT encodings and standard DWARF to emit.
 Variable
@@ -2022,6 +2029,10 @@  fprofile-reorder-functions
 Common Report Var(flag_profile_reorder_functions)
 Enable function reordering that improves code placement.
 
+fprolog-pad=
+Common Joined Optimization
+Insert NOP instructions before each function prologue.
+
 frandom-seed
 Common Var(common_deferred_options) Defer
 
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 3d1546a..ef7e985 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -3076,6 +3076,23 @@  that affect more than one function.
 This attribute should be used for debugging purposes only.  It is not
 suitable in production code.
 
+@item prolog_pad
+@cindex @code{prolog_pad} function attribute
+@cindex extra NOP instructions at the function entry point
+In case the target's text segment can be made writable at run time
+by any means, padding the function entry with a number of NOPs can
+be used to provide a universal tool for instrumentation.  Usually,
+prolog padding is enabled globally using the @option{-fprolog-pad=N,M}
+command-line switch, and disabled with attribute @code{prolog_pad (0)}
+for functions that are part of the actual instrumentation framework.
+This conveniently avoids an endless recursion.
+The @code{prolog_pad} function attribute can be used to
+change the pad size to any desired value.  The two-value syntax is
+the same as for the command-line switch @option{-fprolog-pad=N,M},
+generating a NOP pad of size @var{N}, with the function entry point
+@var{M} NOP instructions into the pad.  @var{M} defaults to 0
+if omitted e.g. function entry point is before the first NOP.
+
 @item pure
 @cindex @code{pure} function attribute
 @cindex functions that have no side effects
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 56ca53f..75a7e2c 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -11370,6 +11370,31 @@  of the function name, it is considered to be a match.  For C99 and C++
 extended identifiers, the function name must be given in UTF-8, not
 using universal character names.
 
+@item -fprolog-pad=@var{N}[,@var{M}]
+@opindex fprolog-pad
+Generate a pad of @var{N} NOPs right at the beginning
+of each function, with the function entry point @var{M} NOPs into
+the pad.  If @var{M} is omitted, it defaults to @code{0} so the
+function entry points to the address just at the first NOP.
+The NOP instructions reserve extra space which can be used to patch in
+any desired instrumentation at run time, provided that the code segment
+is writable.  The amount of space is only controllable indirectly via
+the number of NOPs, so implementers are advised to use the smallest
+NOP instruction available for the current CPU mode should there be a
+choice, in order to achieve the finest granularity.
+For run-time identification, the starting addresses
+of these pads, which correspond to their respective function entries
+minus @var{M}, are additionally collected in the @code{__prolog_pads_loc}
+section of the resulting binary.
+
+Note that the value of @code{__attribute__ ((prolog_pad (N,M)))} takes
+precedence over command-line option @option{-fprolog-pad=N,M}.
+This can be used to increase the pad size or to remove it completely
+on a single function.  If @code{N=0}, no pad location is recorded.
+
+The NOP instructions are inserted at (and maybe before) the function entry
+address, even before the prologue.
+
 @end table
 
 
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 348fd68..5155d10 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -4566,6 +4566,10 @@  will select the smallest suitable mode.
 This section describes the macros that output function entry
 (@dfn{prologue}) and exit (@dfn{epilogue}) code.
 
+@deftypefn {Target Hook} void TARGET_ASM_PRINT_PROLOG_PAD (FILE *@var{file}, unsigned HOST_WIDE_INT @var{pad_size}, bool @var{record_p})
+Generate prologue pad
+@end deftypefn
+
 @deftypefn {Target Hook} void TARGET_ASM_FUNCTION_PROLOGUE (FILE *@var{file}, HOST_WIDE_INT @var{size})
 If defined, a function that outputs the assembler code for entry to a
 function.  The prologue is responsible for setting up the stack frame,
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 6cde83c..b1d9d99 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -3650,6 +3650,8 @@  will select the smallest suitable mode.
 This section describes the macros that output function entry
 (@dfn{prologue}) and exit (@dfn{epilogue}) code.
 
+@hook TARGET_ASM_PRINT_PROLOG_PAD
+
 @hook TARGET_ASM_FUNCTION_PROLOGUE
 
 @hook TARGET_ASM_FUNCTION_END_PROLOGUE
diff --git a/gcc/lto/lto-lang.c b/gcc/lto/lto-lang.c
index ca8945e..9143328 100644
--- a/gcc/lto/lto-lang.c
+++ b/gcc/lto/lto-lang.c
@@ -48,6 +48,7 @@  static tree handle_sentinel_attribute (tree *, tree, tree, int, bool *);
 static tree handle_type_generic_attribute (tree *, tree, tree, int, bool *);
 static tree handle_transaction_pure_attribute (tree *, tree, tree, int, bool *);
 static tree handle_returns_twice_attribute (tree *, tree, tree, int, bool *);
+static tree handle_prolog_pad_attribute (tree *, tree, tree, int, bool *);
 static tree ignore_attribute (tree *, tree, tree, int, bool *);
 
 static tree handle_format_attribute (tree *, tree, tree, int, bool *);
@@ -76,6 +77,8 @@  const struct attribute_spec lto_attribute_table[] =
 			      handle_nonnull_attribute, false },
   { "nothrow",                0, 0, true,  false, false,
 			      handle_nothrow_attribute, false },
+  { "prolog_pad",	      1, 2, true, false, false,
+			      handle_prolog_pad_attribute, false },
   { "returns_twice",          0, 0, true,  false, false,
 			      handle_returns_twice_attribute, false },
   { "sentinel",               0, 1, false, true, true,
@@ -473,6 +476,13 @@  handle_returns_twice_attribute (tree *node, tree ARG_UNUSED (name),
   return NULL_TREE;
 }
 
+static tree
+handle_prolog_pad_attribute (tree *, tree, tree, int, bool *)
+{
+  /* Nothing to be done here.  */
+  return NULL_TREE;
+}
+
 /* Ignore the given attribute.  Used when this attribute may be usefully
    overridden by the target, but is not used generically.  */
 
diff --git a/gcc/opts.c b/gcc/opts.c
index b38e9b4..10f751f 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -2159,6 +2159,29 @@  common_handle_option (struct gcc_options *opts,
         opts->x_flag_ipa_reference = false;
       break;
 
+    case OPT_fprolog_pad_:
+      {
+	char *pad_arg = xstrdup (arg);
+	char *comma = strchr (pad_arg, ',');
+	if (comma)
+	  {
+	    *comma = '\0';
+	    prolog_nop_pad_size = integral_argument (pad_arg);
+	    prolog_nop_pad_entry = integral_argument (comma + 1);
+	  }
+	else
+	  {
+	    prolog_nop_pad_size = integral_argument (pad_arg);
+	    prolog_nop_pad_entry = 0;
+	  }
+	if (prolog_nop_pad_size < 0
+	    || prolog_nop_pad_entry < 0
+	    || prolog_nop_pad_size < prolog_nop_pad_entry)
+	  error ("invalid arguments for %<-fprolog_pad%>");
+	free (pad_arg);
+      }
+      break;
+
     case OPT_ftree_vectorize:
       if (!opts_set->x_flag_tree_loop_vectorize)
         opts->x_flag_tree_loop_vectorize = value;
diff --git a/gcc/target.def b/gcc/target.def
index 43600ae..bdc47b4 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -288,6 +288,12 @@  hidden, protected or internal visibility as specified by @var{visibility}.",
  void, (tree decl, int visibility),
  default_assemble_visibility)
 
+DEFHOOK
+(print_prolog_pad,
+ "Generate prologue pad",
+ void, (FILE *file, unsigned HOST_WIDE_INT pad_size, bool record_p),
+ default_print_prolog_pad)
+
 /* Output the assembler code for entry to a function.  */
 DEFHOOK
 (function_prologue,
diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index 1cdec06..6729e6c 100644
--- a/gcc/targhooks.c
+++ b/gcc/targhooks.c
@@ -1609,6 +1609,52 @@  default_compare_by_pieces_branch_ratio (machine_mode)
   return 1;
 }
 
+/* Write PAD_SIZE NOPs into the asm outfile FILE before a function
+   prologue.  If RECORD_P is true, the location of the pad will be
+   recorded in a special object section called "__prolog_pads_loc".
+   This routine may be called twice per function to put NOPs before
+   and after the function entry.  */
+
+void
+default_print_prolog_pad (FILE *file, unsigned HOST_WIDE_INT pad_size,
+			  bool record_p)
+{
+  static const char *nop_templ = 0;
+
+  /* We use the template alone, relying on the (currently sane) assumption
+     that the NOP template does not have variable operands.  */
+  if (!nop_templ)
+    {
+      int code_num;
+      rtx_insn *my_nop = make_insn_raw (gen_nop ());
+
+      code_num = recog_memoized (my_nop);
+      nop_templ = get_insn_template (code_num, my_nop);
+    }
+
+  if (record_p)
+    {
+      char buf[256];
+      static int pad_number;
+      section *previous_section = in_section;
+
+      pad_number++;
+      ASM_GENERATE_INTERNAL_LABEL (buf, "LPPAD", pad_number);
+
+      switch_to_section (get_section ("__prolog_pads_loc", 0, NULL));
+      fputs (integer_asm_op (POINTER_SIZE_UNITS, false), file);
+      assemble_name_raw (file, buf);
+      fputc ('\n', file);
+
+      switch_to_section (previous_section);
+      ASM_OUTPUT_LABEL (file, buf);
+    }
+
+  unsigned i;
+  for (i = 0; i < pad_size; ++i)
+    fprintf (file, "\t%s\n", nop_templ);
+}
+
 bool
 default_profile_before_prologue (void)
 {
diff --git a/gcc/targhooks.h b/gcc/targhooks.h
index a5565f5..e302e8d 100644
--- a/gcc/targhooks.h
+++ b/gcc/targhooks.h
@@ -203,6 +203,7 @@  extern bool default_use_by_pieces_infrastructure_p (unsigned HOST_WIDE_INT,
 						    bool);
 extern int default_compare_by_pieces_branch_ratio (machine_mode);
 
+extern void default_print_prolog_pad (FILE *, unsigned HOST_WIDE_INT , bool);
 extern bool default_profile_before_prologue (void);
 extern reg_class_t default_preferred_reload_class (rtx, reg_class_t);
 extern reg_class_t default_preferred_output_reload_class (rtx, reg_class_t);
diff --git a/gcc/testsuite/c-c++-common/attribute-prolog_pad-1.c b/gcc/testsuite/c-c++-common/attribute-prolog_pad-1.c
new file mode 100644
index 0000000..2236aa8
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/attribute-prolog_pad-1.c
@@ -0,0 +1,34 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fprolog-pad=3,1" } */
+
+void f1 (void) __attribute__((prolog_pad(2,1)));
+void f2 (void) __attribute__((prolog_pad(3)));
+int f3 (void);
+
+void
+f1 (void)
+{
+  f2 ();
+}
+
+void
+f2 (void)
+{
+  f1 ();
+}
+
+/* F3 should never have a NOP pad.  */
+int
+__attribute__((prolog_pad(0)))
+__attribute__((noinline))
+f3 (void)
+{
+  return 5;
+}
+
+/* F4 should receive the command line default setting.  */
+int
+f4 (void)
+{
+  return 3*f3 ()+1;
+}
diff --git a/gcc/toplev.c b/gcc/toplev.c
index beb581a..3afda4a 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -1596,8 +1596,10 @@  process_options (void)
     }
 
  /* Do not use IPA optimizations for register allocation if profiler is active
+    or prolog pads are inserted for run-time instrumentation
     or port does not emit prologue and epilogue as RTL.  */
-  if (profile_flag || !targetm.have_prologue () || !targetm.have_epilogue ())
+  if (profile_flag || prolog_nop_pad_size
+      || !targetm.have_prologue () || !targetm.have_epilogue ())
     flag_ipa_ra = 0;
 
   /* Enable -Werror=coverage-mismatch when -Werror and -Wno-error
diff --git a/gcc/varasm.c b/gcc/varasm.c
index 11a8ac4..84c739a 100644
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -1830,6 +1830,44 @@  assemble_start_function (tree decl, const char *fnname)
   if (DECL_PRESERVE_P (decl))
     targetm.asm_out.mark_decl_preserved (fnname);
 
+  unsigned HOST_WIDE_INT pad_size = prolog_nop_pad_size;
+  unsigned HOST_WIDE_INT pad_entry = prolog_nop_pad_entry;
+
+  tree prolog_pad_attr
+    = lookup_attribute ("prolog_pad", DECL_ATTRIBUTES (decl));
+  if (prolog_pad_attr)
+    {
+      tree pp_val = TREE_VALUE (prolog_pad_attr);
+      tree prolog_pad_value1 = TREE_VALUE (pp_val);
+
+      if (tree_fits_uhwi_p (prolog_pad_value1))
+	pad_size = tree_to_uhwi (prolog_pad_value1);
+      else
+	gcc_unreachable ();
+
+      pad_entry = 0;
+      if (list_length (pp_val) > 1)
+	{
+	  tree prolog_pad_value2 = TREE_VALUE (TREE_CHAIN (pp_val));
+
+	  if (tree_fits_uhwi_p (prolog_pad_value2))
+	    pad_entry = tree_to_uhwi (prolog_pad_value2);
+	  else
+	    gcc_unreachable ();
+	}
+    }
+
+  if (pad_entry > pad_size)
+    {
+      if (pad_size > 0)
+	warning (OPT_Wattributes, "Prolog nop pad entry > size");
+      pad_entry = 0;
+    }
+
+  /* Emit the prolog padding before the entry label, if any.  */
+  if (pad_entry > 0)
+    targetm.asm_out.print_prolog_pad (asm_out_file, pad_entry, true);
+
   /* Do any machine/system dependent processing of the function name.  */
 #ifdef ASM_DECLARE_FUNCTION_NAME
   ASM_DECLARE_FUNCTION_NAME (asm_out_file, fnname, current_function_decl);
@@ -1838,6 +1876,11 @@  assemble_start_function (tree decl, const char *fnname)
   ASM_OUTPUT_FUNCTION_LABEL (asm_out_file, fnname, current_function_decl);
 #endif /* ASM_DECLARE_FUNCTION_NAME */
 
+  /* And the padding after the label.  Record it if we haven't done so yet.  */
+  if (pad_size > pad_entry)
+    targetm.asm_out.print_prolog_pad (asm_out_file, pad_size-pad_entry,
+				      pad_entry == 0);
+
   if (lookup_attribute ("no_split_stack", DECL_ATTRIBUTES (decl)))
     saw_no_split_stack = true;
 }