Patchwork [IRA] Analysis of register usage of functions for usage by IRA.

Submitter Tom de Vries
Date Jan. 25, 2013, 1:05 p.m.
Message ID <510282FE.1060809@mentor.com>
Permalink /patch/215660/
State New

Comments

Tom de Vries - Jan. 25, 2013, 1:05 p.m.
Vladimir,

this patch adds analysis of register usage of functions for usage by IRA.

The patch:
- adds analysis in pass_final to track which hard registers are set or clobbered
  by the function body, and stores that information in a struct cgraph_node.
- adds a target hook fn_other_hard_reg_usage to list hard registers that are
  set or clobbered by a call to a function, but are not listed as such in the
  function body, such as f.i. registers clobbered by veneers inserted by the
  linker.
- adds a reg-note REG_CALL_DECL, to be able to easily link call_insns to their
  corresponding declaration, even after the calls may have been split into an
  insn (set register to function address) and a call_insn (call register), which
  can happen for f.i. sh, and mips with -mabicalls.
- uses the register analysis in IRA.
- adds an option -fuse-caller-save to control the optimization, on by default
  at -Os and -O2 and higher.
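
To make the idea behind the analysis and the target hook a bit more concrete, here is a minimal sketch in C of how the clobber set of a call could be derived. It is a toy model, not the code from the patch: `reg_set`, `reg_bit`, `struct fn_info` and `call_clobber_set` are hypothetical names, a 32-bit mask stands in for HARD_REG_SET, and masking the result with the ABI call-used set is an assumption made for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model: one bit per hard register (GCC itself uses HARD_REG_SET).  */
typedef uint32_t reg_set;

#define NUM_HARD_REGS 32

static inline reg_set
reg_bit (unsigned regno)
{
  assert (regno < NUM_HARD_REGS);
  return (reg_set) 1 << regno;
}

/* Hypothetical stand-in for the per-function data kept in cgraph_node.  */
struct fn_info
{
  bool used_regs_valid;   /* Has the function body been analyzed?  */
  reg_set used_regs;      /* Hard regs set or clobbered by the body.  */
};

/* Return the registers a call may clobber.  CALLEE is NULL for indirect
   or otherwise unknown calls.  OTHER_USAGE models what a target hook
   could add, such as registers clobbered by linker-inserted veneers.  */
static reg_set
call_clobber_set (const struct fn_info *callee, reg_set abi_call_used,
                  reg_set other_usage)
{
  if (callee == NULL || !callee->used_regs_valid)
    return abi_call_used;   /* Conservative fallback: trust only the ABI.  */

  /* Assumption of this sketch: never report more than the ABI lets a
     call clobber, since call-saved registers are restored by the callee.  */
  return (callee->used_regs | other_usage) & abi_call_used;
}
```

With a callee that only sets $2, the call then clobbers just $2 instead of every call-used register, which is what gives IRA more freedom.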


The patch (original version by Radovan Obradovic) is similar to your patch
( http://gcc.gnu.org/ml/gcc-patches/2007-01/msg01625.html ) from 2007.
But this patch doesn't implement save area stack slot sharing.
( Btw, I've borrowed the struct cgraph_node field name and comment from the 2007
patch ).

[ Steven, you mentioned in this discussion
  ( http://gcc.gnu.org/ml/gcc/2012-10/msg00213.html ) that you are working on
  porting the 2007 patch to trunk. What is the status of that effort?
]


As an example of the functionality, consider foo and bar from test-case aru-1.c:
...
static int __attribute__((noinline))
bar (int x)
{
  return x + 3;
}

int __attribute__((noinline))
foo (int y)
{
  return y + bar (y);
}
...

Compiled at -O2, bar only sets register $2 (the first return register):
...
bar:
        .frame  $sp,0,$31               # vars= 0, regs= 0/0, args= 0, gp= 0
        .mask   0x00000000,0
        .fmask  0x00000000,0
        .set    noreorder
        .set    nomacro
        j       $31
        addiu   $2,$4,3
...

foo then can use register $3 (the second return register) instead of register
$16 to save the value in register $4 (the first argument register) over the
call, as demonstrated here in a -fno-use-caller-save vs. -fuse-caller-save diff:
...
foo:                                    foo:
# vars= 0, regs= 2/0, args= 16, gp= 8 | # vars= 0, regs= 1/0, args= 16, gp= 8
.frame  $sp,32,$31                      .frame  $sp,32,$31
.mask   0x80010000,-4                 | .mask   0x80000000,-4
.fmask  0x00000000,0                    .fmask  0x00000000,0
.set    noreorder                       .set    noreorder
.set    nomacro                         .set    nomacro
addiu   $sp,$sp,-32                     addiu   $sp,$sp,-32
sw      $31,28($sp)                     sw      $31,28($sp)
sw      $16,24($sp)                   <
.option pic0                            .option pic0
jal     bar                             jal     bar
.option pic2                            .option pic2
move    $16,$4                        | move    $3,$4

lw      $31,28($sp)                     lw      $31,28($sp)
addu    $2,$2,$16                     | addu    $2,$2,$3
lw      $16,24($sp)                   <
j       $31                             j       $31
addiu   $sp,$sp,32                      addiu   $sp,$sp,32
...
That way we skip the save and restore of register $16, which is not necessary
for $3. Btw, a further improvement could be to reuse $4 after the call, and
eliminate the move.
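
The choice between $16 and $3 above can be illustrated with a small sketch in C. This is only a toy model under stated assumptions: `pick_reg_across_call` and `reg_bit` are hypothetical names, a 32-bit mask stands in for HARD_REG_SET, and picking the lowest-numbered safe register is a simplification (IRA makes this decision based on costs).

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t reg_set;   /* One bit per hard register.  */

#define NUM_HARD_REGS 32

static inline reg_set
reg_bit (unsigned regno)
{
  assert (regno < NUM_HARD_REGS);
  return (reg_set) 1 << regno;
}

/* Return the lowest-numbered register in CANDIDATES that survives a
   call clobbering CALL_CLOBBERS, or -1 if there is none.  */
static int
pick_reg_across_call (reg_set candidates, reg_set call_clobbers)
{
  reg_set safe = candidates & ~call_clobbers;

  for (unsigned r = 0; r < NUM_HARD_REGS; r++)
    if (safe & reg_bit (r))
      return (int) r;

  return -1;
}
```

If the call is assumed to clobber $2..$15, the only safe candidate out of { $3, $16 } is $16, which needs a save and restore in foo. If the call is known to clobber only $2, $3 becomes usable and the save and restore disappear.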


A version of this patch on top of 4.6 ran into trouble with the epilogue on arm,
where a register was clobbered by a stack pop instruction, while that was not
visible in the rtl representation. This instruction was introduced in
arm_output_epilogue by code marked with the comment 'pop call clobbered
registers if it avoids a separate stack adjustment'.
I cannot reproduce that issue on trunk. Looking at the generated rtl, it seems
that the epilogue instructions now list all registers set by them, so
collect_fn_hard_reg_usage is able to analyze all clobbered registers.


Bootstrapped and reg-tested on x86_64, Ada inclusive. Built and reg-tested on
mips, arm, ppc and sh. No issues found. OK for stage1 trunk?

Thanks,
- Tom

2013-01-24  Radovan Obradovic  <robradovic@mips.com>
	    Tom de Vries  <tom@codesourcery.com>

	* hooks.c (hook_void_hard_reg_set_containerp): New function.
	* hooks.h (hook_void_hard_reg_set_containerp): Declare.
	* target.def (fn_other_hard_reg_usage): New DEFHOOK.
	* config/arm/arm.c (TARGET_FN_OTHER_HARD_REG_USAGE): Redefine as
	arm_fn_other_hard_reg_usage.
	(arm_fn_other_hard_reg_usage): New function.
	* doc/tm.texi.in (@node Stack and Calling): Add Miscellaneous Register
	Hooks to @menu.
	(@node Miscellaneous Register Hooks): New node.
	(@hook TARGET_FN_OTHER_HARD_REG_USAGE): New hook.
	* doc/tm.texi: Regenerate.
	* reg-notes.def (REG_NOTE (CALL_DECL)): New reg-note REG_CALL_DECL.
	* calls.c (expand_call, emit_library_call_value_1): Add REG_CALL_DECL
	reg-note.
	* combine.c (distribute_notes): Handle REG_CALL_DECL reg-note.
	* emit-rtl.c (try_split): Same.
	* rtlanal.c (find_all_hard_reg_sets): Add bool implicit parameter and
	handle.
	* rtl.h (find_all_hard_reg_sets): Add bool parameter.
	* haifa-sched.c (recompute_todo_spec, check_clobbered_conditions): Add
	new argument to find_all_hard_reg_sets call.
	cgraph.h (struct cgraph_node): Add function_used_regs,
	function_used_regs_initialized and function_used_regs_valid fields.
	* common.opt (fuse-caller-save): New option.
	* opts.c (default_options_table): Add OPT_LEVELS_2_PLUS entry with
	OPT_fuse_caller_save.
	* final.c: Move include of hard-reg-set.h to before rtl.h to declare
	find_all_hard_reg_sets.
	(collect_fn_hard_reg_usage, get_call_fndecl, get_call_cgraph_node)
	(get_call_reg_set_usage): New functions.
	(rest_of_handle_final): Use collect_fn_hard_reg_usage.
	* regs.h (get_call_reg_set_usage): Declare.
	* df-scan.c (df_get_call_refs): Use get_call_reg_set_usage.
	* caller-save.c (setup_save_areas, save_call_clobbered_regs): Use
	get_call_reg_set_usage.
	* resource.c (mark_set_resources, mark_target_live_regs): Use
	get_call_reg_set_usage.
	* ira-int.h (struct ira_allocno): Add crossed_calls_clobbered_regs
	field.
	(ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS): Define.
	* ira-lives.c (process_bb_node_lives): Use get_call_reg_set_usage.
	Calculate ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS.
	* ira-build.c (ira_create_allocno): Init
	ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS.
	(create_cap_allocno, propagate_allocno_info)
	(propagate_some_info_from_allocno)
	(copy_info_to_removed_store_destinations): Handle
	ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS.
	* ira-costs.c (ira_tune_allocno_costs): Use
	ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS to adjust costs.
	* doc/invoke.texi (@item Optimization Options): Add -fuse-caller-save to
	gccoptlist.
	(@item -fuse-caller-save): New item.

	* lib/target-supports.exp (check_effective_target_mips16)
	(check_effective_target_micromips): New procs.
	* gcc.target/mips/mips.exp: Add use-caller-save to -ffoo/-fno-foo
	options.  Add -save-temps to mips_option_groups.
	* gcc.target/mips/aru-1.c: New test.

Vladimir Makarov - Jan. 25, 2013, 3:36 p.m.
On 01/25/2013 08:05 AM, Tom de Vries wrote:
> Vladimir,
>
> this patch adds analysis of register usage of functions for usage by IRA.
>
> The patch:
> - adds analysis in pass_final to track which hard registers are set or clobbered
>    by the function body, and stores that information in a struct cgraph_node.
> - adds a target hook fn_other_hard_reg_usage to list hard registers that are
>    set or clobbered by a call to a function, but are not listed as such in the
>    function body, such as f.i. registers clobbered by veneers inserted by the
>    linker.
> - adds a reg-note REG_CALL_DECL, to be able to easily link call_insns to their
>    corresponding declaration, even after the calls may have been split into an
>    insn (set register to function address) and a call_insn (call register), which
>    can happen for f.i. sh, and mips with -mabicalls.
> - uses the register analysis in IRA.
> - adds an option -fuse-caller-save to control the optimization, on by default
>    at -Os and -O2 and higher.
>
>
> The patch (original version by Radovan Obradovic) is similar to your patch
> ( http://gcc.gnu.org/ml/gcc-patches/2007-01/msg01625.html ) from 2007.
> But this patch doesn't implement save area stack slot sharing.
> ( Btw, I've borrowed the struct cgraph_node field name and comment from the 2007
> patch ).
>
> [ Steven, you mentioned in this discussion
>    ( http://gcc.gnu.org/ml/gcc/2012-10/msg00213.html ) that you are working on
>    porting the 2007 patch to trunk. What is the status of that effort?
> ]
>
>
> As an example of the functionality, consider foo and bar from test-case aru-1.c:
> ...
> static int __attribute__((noinline))
> bar (int x)
> {
>    return x + 3;
> }
>
> int __attribute__((noinline))
> foo (int y)
> {
>    return y + bar (y);
> }
> ...
>
> Compiled at -O2, bar only sets register $2 (the first return register):
> ...
> bar:
>          .frame  $sp,0,$31               # vars= 0, regs= 0/0, args= 0, gp= 0
>          .mask   0x00000000,0
>          .fmask  0x00000000,0
>          .set    noreorder
>          .set    nomacro
>          j       $31
>          addiu   $2,$4,3
> ...
>
> foo then can use register $3 (the second return register) instead of register
> $16 to save the value in register $4 (the first argument register) over the
> call, as demonstrated here in a -fno-use-caller-save vs. -fuse-caller-save diff:
> ...
> foo:                                    foo:
> # vars= 0, regs= 2/0, args= 16, gp= 8 | # vars= 0, regs= 1/0, args= 16, gp= 8
> .frame  $sp,32,$31                      .frame  $sp,32,$31
> .mask   0x80010000,-4                 | .mask   0x80000000,-4
> .fmask  0x00000000,0                    .fmask  0x00000000,0
> .set    noreorder                       .set    noreorder
> .set    nomacro                         .set    nomacro
> addiu   $sp,$sp,-32                     addiu   $sp,$sp,-32
> sw      $31,28($sp)                     sw      $31,28($sp)
> sw      $16,24($sp)                   <
> .option pic0                            .option pic0
> jal     bar                             jal     bar
> .option pic2                            .option pic2
> move    $16,$4                        | move    $3,$4
>
> lw      $31,28($sp)                     lw      $31,28($sp)
> addu    $2,$2,$16                     | addu    $2,$2,$3
> lw      $16,24($sp)                   <
> j       $31                             j       $31
> addiu   $sp,$sp,32                      addiu   $sp,$sp,32
> ...
> That way we skip the save and restore of register $16, which is not necessary
> for $3. Btw, a further improvement could be to reuse $4 after the call, and
> eliminate the move.
>
>
> A version of this patch on top of 4.6 ran into trouble with the epilogue on arm,
> where a register was clobbered by a stack pop instruction, while that was not
> visible in the rtl representation. This instruction was introduced in
> arm_output_epilogue by code marked with the comment 'pop call clobbered
> registers if it avoids a separate stack adjustment'.
> I cannot reproduce that issue on trunk. Looking at the generated rtl, it seems
> that the epilogue instructions now list all registers set by them, so
> collect_fn_hard_reg_usage is able to analyze all clobbered registers.
>
>
> Bootstrapped and reg-tested on x86_64, Ada inclusive. Built and reg-tested on
> mips, arm, ppc and sh. No issues found. OK for stage1 trunk?
>
>
Thanks for the patch.  I'll look at it during the next week.

Right now I see that the code is based on reload which uses 
caller-save.c.  LRA does not use caller-save.c at all.  Right now we
have LRA support only for x86/x86-64 but the next version will probably 
have a few more targets based on LRA.  Fortunately, LRA modification 
will be pretty easy with all this machinery.

I am going to use the ira-improv branch for some of my future work for GCC 4.9.
And I am going to regularly (about once per month) merge trunk into it.  
So if you want you could use the branch for your work too.  But this is 
absolutely up to you.  I don't mind if you put this patch directly to 
the trunk at stage1 when the review is finished.

Tom de Vries - Feb. 7, 2013, 7:11 p.m.
Vladimir,

On 25/01/13 16:36, Vladimir Makarov wrote:
> On 01/25/2013 08:05 AM, Tom de Vries wrote:
>> Vladimir,
>>
>> this patch adds analysis of register usage of functions for usage by IRA.
>>
>> The patch:
>> - adds analysis in pass_final to track which hard registers are set or clobbered
>>    by the function body, and stores that information in a struct cgraph_node.
>> - adds a target hook fn_other_hard_reg_usage to list hard registers that are
>>    set or clobbered by a call to a function, but are not listed as such in the
>>    function body, such as f.i. registers clobbered by veneers inserted by the
>>    linker.
>> - adds a reg-note REG_CALL_DECL, to be able to easily link call_insns to their
>>    corresponding declaration, even after the calls may have been split into an
>>    insn (set register to function address) and a call_insn (call register), which
>>    can happen for f.i. sh, and mips with -mabicalls.
>> - uses the register analysis in IRA.
>> - adds an option -fuse-caller-save to control the optimization, on by default
>>    at -Os and -O2 and higher.

<SNIP>

>> Bootstrapped and reg-tested on x86_64, Ada inclusive. Built and reg-tested on
>> mips, arm, ppc and sh. No issues found. OK for stage1 trunk?
>>
>>
> Thanks for the patch.  I'll look at it during the next week.
> 

Did you get a chance to look at this?

> Right now I see that the code is based on reload which uses 
> caller-save.c.  LRA does not use caller-save.c at all.  Right now we
> have LRA support only for x86/x86-64 but the next version will probably 
> have a few more targets based on LRA.  Fortunately, LRA modification 
> will be pretty easy with all this machinery.
> 

I see, thanks for noticing that. Btw I'm now working on a testsuite construct
dg-size-compare to be able to do
  dg-size-compare "text" "-fuse-caller-save" "<" "-fno-use-caller-save"
which I could have used to create a generic testcase, which would have
demonstrated that the optimization didn't work for x86_64.

I'm also currently looking at how to use the analysis in LRA.
AFAIU, in lra-constraints.c we do a backward scan over the insns, and keep track
of how many calls we've seen (calls_num), and mark insns with that number. Then
when looking at a live-range segment consisting of a def or use insn a and a
following use insn b, we can compare the number of calls seen for each insn, and
if they're not equal there is at least one call between the 2 insns, and if the
corresponding hard register is clobbered by calls, we spill after insn a and
restore before insn b.

That is too coarse-grained to use with our analysis, since we need to know which
calls occur in between insn a and insn b, and more precisely which registers
those calls clobbered.

I wonder though if we can do something similar: we keep an array
call_clobbers_num[FIRST_PSEUDO_REGISTER], initialized to 0 when we start scanning.
When encountering a call, we increase the call_clobbers_num entries for the hard
registers clobbered by the call.
When encountering a use, we set the call_clobbers_num field of the use to
call_clobbers_num[reg_renumber[original_regno]].
And when looking at a live-range segment, we compare the call_clobbers_num field
of insn a and insn b, and if they are not equal, the hard register was clobbered
by at least one call between insn a and insn b.
Would that work? WDYT?
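
To illustrate the scheme, here is a rough sketch in C. It is a toy model and not LRA code: `struct toy_insn`, `stamp_uses` and `clobbered_between` are hypothetical names, insns live in a plain array, a 32-bit mask stands in for HARD_REG_SET, and each use refers directly to a hard register instead of going through reg_renumber.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_HARD_REGS 32
typedef uint32_t reg_set;   /* One bit per hard register.  */

enum insn_kind { INSN_USE, INSN_CALL };

struct toy_insn
{
  enum insn_kind kind;
  unsigned regno;      /* INSN_USE: hard register referenced.  */
  reg_set clobbers;    /* INSN_CALL: registers clobbered by the call.  */
  unsigned stamp;      /* INSN_USE: counter value recorded by the scan.  */
};

/* Backward scan: count, per hard register, the calls seen so far that
   clobber it, and stamp each use with the count for its register.  */
static void
stamp_uses (struct toy_insn *insns, size_t n)
{
  unsigned call_clobbers_num[NUM_HARD_REGS] = { 0 };

  for (size_t i = n; i-- > 0; )
    {
      struct toy_insn *insn = &insns[i];

      if (insn->kind == INSN_CALL)
        {
          for (unsigned r = 0; r < NUM_HARD_REGS; r++)
            if (insn->clobbers & ((reg_set) 1 << r))
              call_clobbers_num[r]++;
        }
      else
        {
          assert (insn->regno < NUM_HARD_REGS);
          insn->stamp = call_clobbers_num[insn->regno];
        }
    }
}

/* A and B reference the same hard register, A before B.  Differing
   stamps mean some call in between clobbers that register.  */
static bool
clobbered_between (const struct toy_insn *a, const struct toy_insn *b)
{
  assert (a->kind == INSN_USE && b->kind == INSN_USE);
  assert (a->regno == b->regno);
  return a->stamp != b->stamp;
}
```

A single calls_num counter would report a clobber for every register live across the call, while the per-register counters only report it for registers the call actually clobbers.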

> I am going to use the ira-improv branch for some of my future work for GCC 4.9.
> And I am going to regularly (about once per month) merge trunk into it.  
> So if you want you could use the branch for your work too.  But this is 
> absolutely up to you.  I don't mind if you put this patch directly to 
> the trunk at stage1 when the review is finished.
> 

OK, I'd say stage1 then unless during review a reason pops up why it's better to
use the ira-improv branch.

Thanks,
- Tom

Vladimir Makarov - Feb. 13, 2013, 10:35 p.m.
On 13-02-07 2:11 PM, Tom de Vries wrote:
> Vladimir,
>
> On 25/01/13 16:36, Vladimir Makarov wrote:
>> On 01/25/2013 08:05 AM, Tom de Vries wrote:
>>> Vladimir,
>>>
>>> this patch adds analysis of register usage of functions for usage by IRA.
>>>
>>> The patch:
>>> - adds analysis in pass_final to track which hard registers are set or clobbered
>>>     by the function body, and stores that information in a struct cgraph_node.
>>> - adds a target hook fn_other_hard_reg_usage to list hard registers that are
>>>     set or clobbered by a call to a function, but are not listed as such in the
>>>     function body, such as f.i. registers clobbered by veneers inserted by the
>>>     linker.
>>> - adds a reg-note REG_CALL_DECL, to be able to easily link call_insns to their
>>>     corresponding declaration, even after the calls may have been split into an
>>>     insn (set register to function address) and a call_insn (call register), which
>>>     can happen for f.i. sh, and mips with -mabicalls.
>>> - uses the register analysis in IRA.
>>> - adds an option -fuse-caller-save to control the optimization, on by default
>>>     at -Os and -O2 and higher.
> <SNIP>
>
>>> Bootstrapped and reg-tested on x86_64, Ada inclusive. Built and reg-tested on
>>> mips, arm, ppc and sh. No issues found. OK for stage1 trunk?
>>>
>>>
>> Thanks for the patch.  I'll look at it during the next week.
>>
> Did you get a chance to look at this?
Sorry for the delay with the answer.  I was and am quite busy with other 
more urgent things.  I'll work on it when I have more free time.  In any 
case, I'll do it before stage1 to have your patch ready.
>> Right now I see that the code is based on reload which uses
>> caller-save.c.  LRA does not use caller-save.c at all.  Right now we
>> have LRA support only for x86/x86-64 but the next version will probably
>> have a few more targets based on LRA.  Fortunately, LRA modification
>> will be pretty easy with all this machinery.
>>
> I see, thanks for noticing that. Btw I'm now working on a testsuite construct
> dg-size-compare to be able to do
>    dg-size-compare "text" "-fuse-caller-save" "<" "-fno-use-caller-save"
> which I could have used to create a generic testcase, which would have
> demonstrated that the optimization didn't work for x86_64.
I thought about implementing your optimization for LRA by myself. But it 
is ok if you decide to work on it.  At least, I am not going to start 
this work for a month.
> I'm also currently looking at how to use the analysis in LRA.
> AFAIU, in lra-constraints.c we do a backward scan over the insns, and keep track
> of how many calls we've seen (calls_num), and mark insns with that number. Then
> when looking at a live-range segment consisting of a def or use insn a and a
> following use insn b, we can compare the number of calls seen for each insn, and
> if they're not equal there is at least one call between the 2 insns, and if the
> corresponding hard register is clobbered by calls, we spill after insn a and
> restore before insn b.
>
> That is too coarse-grained to use with our analysis, since we need to know which
> calls occur in between insn a and insn b, and more precisely which registers
> those calls clobbered.

> I wonder though if we can do something similar: we keep an array
> call_clobbers_num[FIRST_PSEUDO_REGISTER], initialized to 0 when we start scanning.
> When encountering a call, we increase the call_clobbers_num entries for the hard
> registers clobbered by the call.
> When encountering a use, we set the call_clobbers_num field of the use to
> call_clobbers_num[reg_renumber[original_regno]].
> And when looking at a live-range segment, we compare the call_clobbers_num field
> of insn a and insn b, and if they are not equal, the hard register was clobbered
> by at least one call between insn a and insn b.
> Would that work? WDYT?
>
As I understand it, you looked at the live-range splitting code in
lra-constraints.c.  To get the necessary info you should look at ira-lives.c.
>> I am going to use the ira-improv branch for some of my future work for GCC 4.9.
>> And I am going to regularly (about once per month) merge trunk into it.
>> So if you want you could use the branch for your work too.  But this is
>> absolutely up to you.  I don't mind if you put this patch directly to
>> the trunk at stage1 when the review is finished.
>>
> OK, I'd say stage1 then unless during review a reason pops up why it's better to
> use the ira-improv branch.
>
That is ok.  Stage1 then.

Tom de Vries - March 14, 2013, 9:34 a.m.
On 13/02/13 23:35, Vladimir Makarov wrote:
> On 13-02-07 2:11 PM, Tom de Vries wrote:
>> Vladimir,
>>
>> On 25/01/13 16:36, Vladimir Makarov wrote:
>>> On 01/25/2013 08:05 AM, Tom de Vries wrote:
>>>> Vladimir,
>>>>
>>>> this patch adds analysis of register usage of functions for usage by IRA.
>>>>
>>>> The patch:
>>>> - adds analysis in pass_final to track which hard registers are set or clobbered
>>>>     by the function body, and stores that information in a struct cgraph_node.
>>>> - adds a target hook fn_other_hard_reg_usage to list hard registers that are
>>>>     set or clobbered by a call to a function, but are not listed as such in the
>>>>     function body, such as f.i. registers clobbered by veneers inserted by the
>>>>     linker.
>>>> - adds a reg-note REG_CALL_DECL, to be able to easily link call_insns to their
>>>>     corresponding declaration, even after the calls may have been split into an
>>>>     insn (set register to function address) and a call_insn (call register), which
>>>>     can happen for f.i. sh, and mips with -mabicalls.
>>>> - uses the register analysis in IRA.
>>>> - adds an option -fuse-caller-save to control the optimization, on by default
>>>>     at -Os and -O2 and higher.
>> <SNIP>
>>
>>>> Bootstrapped and reg-tested on x86_64, Ada inclusive. Built and reg-tested on
>>>> mips, arm, ppc and sh. No issues found. OK for stage1 trunk?
>>>>
>>>>
>>> Thanks for the patch.  I'll look at it during the next week.
>>>
>> Did you get a chance to look at this?
> Sorry for the delay with the answer.  I was and am quite busy with other 
> more urgent things.  I'll work on it when I have more free time.  In any 
> case, I'll do it before stage1 to have your patch ready.

Vladimir,

do you have an ETA on this review?

>>> Right now I see that the code is based on reload which uses
>>> caller-save.c.  LRA does not use caller-save.c at all.  Right now we
>>> have LRA support only for x86/x86-64 but the next version will probably
>>> have a few more targets based on LRA.  Fortunately, LRA modification
>>> will be pretty easy with all this machinery.
>>>
>> I see, thanks for noticing that. Btw I'm now working on a testsuite construct
>> dg-size-compare to be able to do
>>    dg-size-compare "text" "-fuse-caller-save" "<" "-fno-use-caller-save"
>> which I could have used to create a generic testcase, which would have
>> demonstrated that the optimization didn't work for x86_64.
> I thought about implementing your optimization for LRA by myself. But it 
> is ok if you decide to work on it.  At least, I am not going to start 
> this work for a month.
>> I'm also currently looking at how to use the analysis in LRA.
>> AFAIU, in lra-constraints.c we do a backward scan over the insns, and keep track
>> of how many calls we've seen (calls_num), and mark insns with that number. Then
>> when looking at a live-range segment consisting of a def or use insn a and a
>> following use insn b, we can compare the number of calls seen for each insn, and
>> if they're not equal there is at least one call between the 2 insns, and if the
>> corresponding hard register is clobbered by calls, we spill after insn a and
>> restore before insn b.
>>
>> That is too coarse-grained to use with our analysis, since we need to know which
>> calls occur in between insn a and insn b, and more precisely which registers
>> those calls clobbered.
> 
>> I wonder though if we can do something similar: we keep an array
>> call_clobbers_num[FIRST_PSEUDO_REGISTER], initialized to 0 when we start scanning.
>> When encountering a call, we increase the call_clobbers_num entries for the hard
>> registers clobbered by the call.
>> When encountering a use, we set the call_clobbers_num field of the use to
>> call_clobbers_num[reg_renumber[original_regno]].
>> And when looking at a live-range segment, we compare the call_clobbers_num field
>> of insn a and insn b, and if they are not equal, the hard register was clobbered
>> by at least one call between insn a and insn b.
>> Would that work? WDYT?
>>
> As I understand it, you looked at the live-range splitting code in
> lra-constraints.c.  To get the necessary info you should look at ira-lives.c.

Unfortunately I haven't been able to find time to work further on the LRA part.
So if you're still willing to pick up that part, that would be great.

Thanks,
- Tom

Vladimir Makarov - March 14, 2013, 3:11 p.m.
On 03/14/2013 05:34 AM, Tom de Vries wrote:
> On 13/02/13 23:35, Vladimir Makarov wrote:
>>
>> Sorry for the delay with the answer.  I was and am quite busy with other
>> more urgent things.  I'll work on it when I have more free time.  In any
>> case, I'll do it before stage1 to have your patch ready.
> Vladimir,
>
> do you have an ETA on this review?
>
>
Actually, I am done with it.  In general, it is ok.  Although I have
some minor comments:

In the ChangeLog, you missed '*' before cgraph.h:

     * haifa-sched.c (recompute_todo_spec, check_clobbered_conditions): Add
     new argument to find_all_hard_reg_sets call.
     cgraph.h (struct cgraph_node): Add function_used_regs,
     function_used_regs_initialized and function_used_regs_valid fields.


@@ -3391,6 +3394,7 @@ df_get_call_refs (struct df_collection_r
          }
      }
        else if (TEST_HARD_REG_BIT (regs_invalidated_by_call, i)

I'd remove the test of regs_invalidated_by_call.

+           && TEST_HARD_REG_BIT (fn_reg_set_usage, i)
             /* no clobbers for regs that are the result of the call */
             && !TEST_HARD_REG_BIT (defs_generated, i)

+static void
+collect_fn_hard_reg_usage (void)
+{
+  rtx insn;
+  int i;
+  struct cgraph_node *node;
+  struct hard_reg_set_container other_usage;
+
+  if (!flag_use_caller_save)
+    return;
+
+  node = cgraph_get_node (current_function_decl);
+  gcc_assert (node != NULL);
+
+  gcc_assert (!node->function_used_regs_initialized);
+  node->function_used_regs_initialized = 1;
+
+  for (insn = get_insns (); insn != NULL_RTX; insn = next_insn (insn))
+    {
+      HARD_REG_SET insn_used_regs;
+
+      if (!NONDEBUG_INSN_P (insn))
+        continue;
+
+      find_all_hard_reg_sets (insn, &insn_used_regs, false);
+
+      if (CALL_P (insn)
+          && !get_call_reg_set_usage (insn, &insn_used_regs,
+                                      call_used_reg_set))
+        {
+          CLEAR_HARD_REG_SET (node->function_used_regs);
+          return;
+        }
+

I'd put it before find_all_hard_reg_sets

+      IOR_HARD_REG_SET (node->function_used_regs, insn_used_regs);
+    }
+



But you can ignore my last two comments.

The patch is ok for me for trunk at stage1.  But I think you need a 
formal approval for df-scan.c, arm.c, mips.c, GCC testsuite expect files 
(lib/target-supports.exp and gcc.target/mips/mips.exp) as I am not a 
maintainer of these parts although these changes look ok for me.

Thanks for your hard work and sorry for the review delay.

I guess you need to pay attention to reported problems for some time 
after you commit the patch as it affects all targets.

Tom de Vries - March 29, 2013, 12:54 p.m.
On 14/03/13 16:11, Vladimir Makarov wrote:
> On 03/14/2013 05:34 AM, Tom de Vries wrote:
>> On 13/02/13 23:35, Vladimir Makarov wrote:
>>
> Actually, I am done with it.  In general, it is ok.  Although I have
> some minor comments:
> 

Vladimir,

Thanks for the review.

I split the patch up into 10 patches, to facilitate further review:
...
0001-Add-command-line-option.patch
0002-Add-new-reg-note-REG_CALL_DECL.patch
0003-Add-implicit-parameter-to-find_all_hard_reg_sets.patch
0004-Add-TARGET_FN_OTHER_HARD_REG_USAGE-hook.patch
0005-Implement-TARGET_FN_OTHER_HARD_REG_USAGE-hook-for-ARM.patch
0006-Collect-register-usage-information.patch
0007-Use-collected-register-usage-information.patch
0008-Enable-by-default-at-O2-and-higher.patch
0009-Add-documentation.patch
0010-Add-test-case.patch
...
I'll post these in reply to this email.

> In the ChangeLog, you missed '*' before cgraph.h:
> 
>      * haifa-sched.c (recompute_todo_spec, check_clobbered_conditions): Add
>      new argument to find_all_hard_reg_sets call.
>      cgraph.h (struct cgraph_node): Add function_used_regs,
>      function_used_regs_initialized and function_used_regs_valid fields.
> 

Fixed (in the log of 0006-Collect-register-usage-information.patch).

> 
> @@ -3391,6 +3394,7 @@ df_get_call_refs (struct df_collection_r
>           }
>       }
>         else if (TEST_HARD_REG_BIT (regs_invalidated_by_call, i)
> 
> I'd remove the test of regs_invalidated_by_call.
> 
> +           && TEST_HARD_REG_BIT (fn_reg_set_usage, i)
>              /* no clobbers for regs that are the result of the call */
>              && !TEST_HARD_REG_BIT (defs_generated, i)
> 

Fixed (in 0007-Use-collected-register-usage-information.patch).

> +static void
> +collect_fn_hard_reg_usage (void)
> +{
> +  rtx insn;
> +  int i;
> +  struct cgraph_node *node;
> +  struct hard_reg_set_container other_usage;
> +
> +  if (!flag_use_caller_save)
> +    return;
> +
> +  node = cgraph_get_node (current_function_decl);
> +  gcc_assert (node != NULL);
> +
> +  gcc_assert (!node->function_used_regs_initialized);
> +  node->function_used_regs_initialized = 1;
> +
> +  for (insn = get_insns (); insn != NULL_RTX; insn = next_insn (insn))
> +    {
> +      HARD_REG_SET insn_used_regs;
> +
> +      if (!NONDEBUG_INSN_P (insn))
> +        continue;
> +
> +      find_all_hard_reg_sets (insn, &insn_used_regs, false);
> +
> +      if (CALL_P (insn)
> +          && !get_call_reg_set_usage (insn, &insn_used_regs,
> +                                      call_used_reg_set))
> +        {
> +          CLEAR_HARD_REG_SET (node->function_used_regs);
> +          return;
> +        }
> +
> 
> I'd put it before find_all_hard_reg_sets
> 
> +      IOR_HARD_REG_SET (node->function_used_regs, insn_used_regs);
> +    }
> +
> 
> 

insn_used_regs is set by both find_all_hard_reg_sets, and by
get_call_reg_set_usage. If we move the IOR to before find_all_hard_reg_sets,
we're using an undefined value.
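
The ordering constraint can be shown with a small sketch in C. It is a toy model, not the code from the patch: `struct toy_insn` and `collect_used_regs` are hypothetical, a 32-bit mask stands in for HARD_REG_SET, and the two steps that write insn_used_regs are modelled as a plain assignment and an OR, which is an assumption about how the two helpers combine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t reg_set;   /* One bit per hard register.  */

struct toy_insn
{
  reg_set sets;          /* Hard regs set or clobbered by the insn itself.  */
  bool is_call;
  bool callee_known;     /* Is register usage of the callee available?  */
  reg_set callee_used;   /* Only meaningful if callee_known.  */
};

/* Accumulate into *USED the registers set or clobbered by the N insns.
   Return false, with *USED cleared, if a call to an unknown callee
   forces the analysis to give up.  */
static bool
collect_used_regs (const struct toy_insn *insns, size_t n, reg_set *used)
{
  assert (used != NULL);
  *used = 0;

  for (size_t i = 0; i < n; i++)
    {
      reg_set insn_used_regs;

      /* First writer: registers set by the insn itself.  */
      insn_used_regs = insns[i].sets;

      if (insns[i].is_call)
        {
          if (!insns[i].callee_known)
            {
              *used = 0;
              return false;
            }
          /* Second writer: registers used by the callee.  */
          insn_used_regs |= insns[i].callee_used;
        }

      /* Only now is insn_used_regs fully defined, so the accumulation
         has to come after both writers.  */
      *used |= insn_used_regs;
    }

  return true;
}
```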

> 
> But you can ignore my last two comments.
> 
> The patch is ok for me for trunk at stage1.  But I think you need a 
> formal approval for df-scan.c, arm.c, mips.c, GCC testsuite expect files 
> (lib/target-supports.exp and gcc.target/mips/mips.exp) as I am not a 
> maintainer of these parts although these changes look ok for me.
> 

I'm assuming you've ok'ed patches 1, 2, 3, 4, 6, 8, 9 and the non-df-scan part of 7.

I'll ask other maintainers about the other parts (5, 10 and the df-scan part of 7).

Thanks,
- Tom

Tom de Vries - March 30, 2013, 4:10 p.m.
On 29/03/13 13:54, Tom de Vries wrote:
> I split the patch up into 10 patches, to facilitate further review:
> ...
> 0001-Add-command-line-option.patch
> 0002-Add-new-reg-note-REG_CALL_DECL.patch
> 0003-Add-implicit-parameter-to-find_all_hard_reg_sets.patch
> 0004-Add-TARGET_FN_OTHER_HARD_REG_USAGE-hook.patch
> 0005-Implement-TARGET_FN_OTHER_HARD_REG_USAGE-hook-for-ARM.patch
> 0006-Collect-register-usage-information.patch
> 0007-Use-collected-register-usage-information.patch
> 0008-Enable-by-default-at-O2-and-higher.patch
> 0009-Add-documentation.patch
> 0010-Add-test-case.patch
> ...
> I'll post these in reply to this email.
> 

Something went wrong with those generated emails.

I tested the emails by sending them to my work email, where they looked fine.
I managed to reproduce the problem by sending them to my private email.
It seems the problem was inconsistent EOL format.

I've written a Python script to handle composing the email, and posted it here
using that script: http://gcc.gnu.org/ml/gcc-patches/2013-03/msg01311.html.
Given that that email looks ok, I think I've addressed the problems now.

I'll repost the patches. Sorry about the noise.

Thanks,
- Tom
Richard Earnshaw - Jan. 9, 2014, 2:41 p.m.
On 30/03/13 16:10, Tom de Vries wrote:
> On 29/03/13 13:54, Tom de Vries wrote:
>> I split the patch up into 10 patches, to facilitate further review:
>> ...
>> 0001-Add-command-line-option.patch
>> 0002-Add-new-reg-note-REG_CALL_DECL.patch
>> 0003-Add-implicit-parameter-to-find_all_hard_reg_sets.patch
>> 0004-Add-TARGET_FN_OTHER_HARD_REG_USAGE-hook.patch
>> 0005-Implement-TARGET_FN_OTHER_HARD_REG_USAGE-hook-for-ARM.patch
>> 0006-Collect-register-usage-information.patch
>> 0007-Use-collected-register-usage-information.patch
>> 0008-Enable-by-default-at-O2-and-higher.patch
>> 0009-Add-documentation.patch
>> 0010-Add-test-case.patch
>> ...
>> I'll post these in reply to this email.
>>
> 
> Something went wrong with those emails, which were generated.
> 
> I tested the emails by sending them to my work email, where they looked fine.
> I managed to reproduce the problem by sending them to my private email.
> It seems the problem was inconsistent EOL format.
> 
> I've written a python script to handle composing the email, and posted it here
> using that script: http://gcc.gnu.org/ml/gcc-patches/2013-03/msg01311.html.
> Given that that email looks ok, I think I've addressed the problems now.
> 
> I'll repost the patches. Sorry about the noise.
> 
> Thanks,
> - Tom
> 
> 

It's unfortunate that this feature doesn't fail safe when a port has not
explicitly defined what should happen.

Consequently, you'll need to add a patch for AArch64 which has two
registers clobbered by PLT-based calls.

R.
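
The fail-safe concern can be illustrated with a toy model (not GCC code; every
name below is hypothetical). In the patch the default hook adds nothing, so on
a port that has not defined it, registers clobbered outside the function body
(for instance by PLT stubs) are treated as preserved. A fail-safe default
would instead report every call-clobbered register until the port supplies its
own hook.

```c
/* Toy model; every name here is hypothetical, not part of GCC.  */
typedef unsigned int toy_reg_set;

/* Registers that a call may clobber according to the (toy) ABI.  */
#define TOY_CALL_USED_REGS 0x00ffu
/* Two registers clobbered by (toy) PLT stubs, invisible in the callee's
   own RTL.  */
#define TOY_PLT_SCRATCH 0x0030u

typedef toy_reg_set (*toy_other_usage_hook) (void);

/* Default in the style of the patch: reports no extra registers.  */
static toy_reg_set
toy_hook_empty (void)
{
  return 0;
}

/* A fail-safe default: assume everything call-used is clobbered.  */
static toy_reg_set
toy_hook_fail_safe (void)
{
  return TOY_CALL_USED_REGS;
}

/* A port-specific hook that lists exactly the PLT scratch registers.  */
static toy_reg_set
toy_hook_port (void)
{
  return TOY_PLT_SCRATCH;
}

/* Registers a caller must treat as clobbered by a call: what the callee
   body writes plus what the hook reports, limited to call-used
   registers.  */
static toy_reg_set
toy_call_clobbers (toy_reg_set body_sets, toy_other_usage_hook hook)
{
  return (body_sets | hook ()) & TOY_CALL_USED_REGS;
}
```

For a callee whose body only writes register bit 0x0004, the empty default
yields 0x0004 and misses the PLT scratch registers, the fail-safe default
yields 0x00ff, and the port-specific hook yields 0x0034.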

Patch

Index: gcc/hooks.c
===================================================================
--- gcc/hooks.c (revision 195240)
+++ gcc/hooks.c (working copy)
@@ -446,3 +446,11 @@  void
 hook_void_gcc_optionsp (struct gcc_options *opts ATTRIBUTE_UNUSED)
 {
 }
+
+/* Generic hook that takes a struct hard_reg_set_container * and returns
+   void.  */
+
+void
+hook_void_hard_reg_set_containerp (struct hard_reg_set_container *regs ATTRIBUTE_UNUSED)
+{
+}
Index: gcc/hooks.h
===================================================================
--- gcc/hooks.h (revision 195240)
+++ gcc/hooks.h (working copy)
@@ -69,6 +69,7 @@  extern void hook_void_tree (tree);
 extern void hook_void_tree_treeptr (tree, tree *);
 extern void hook_void_int_int (int, int);
 extern void hook_void_gcc_optionsp (struct gcc_options *);
+extern void hook_void_hard_reg_set_containerp (struct hard_reg_set_container *);
 
 extern int hook_int_uint_mode_1 (unsigned int, enum machine_mode);
 extern int hook_int_const_tree_0 (const_tree);
Index: gcc/target.def
===================================================================
--- gcc/target.def (revision 195240)
+++ gcc/target.def (working copy)
@@ -2859,6 +2859,17 @@  DEFHOOK
  void, (bitmap regs),
  hook_void_bitmap)
 
+/* For targets that need to mark extra registers as clobbered on entry to
+   the function, they should define this target hook and set their
+   bits in the struct hard_reg_set_container passed in.  */
+DEFHOOK
+(fn_other_hard_reg_usage,
+ "Add any hard registers to @var{regs} that are set or clobbered by a call to\
+ the function.  This hook only needs to be defined to provide registers that\
+ cannot be found by examination of the final RTL representation of a function.",
+ void, (struct hard_reg_set_container *regs),
+ hook_void_hard_reg_set_containerp)
+
 /* Fill in additional registers set up by prologue into a regset.  */
 DEFHOOK
 (set_up_by_prologue,
Index: gcc/cgraph.h
===================================================================
--- gcc/cgraph.h (revision 195240)
+++ gcc/cgraph.h (working copy)
@@ -251,6 +251,15 @@  struct GTY(()) cgraph_node {
   /* Unique id of the node.  */
   int uid;
 
+  /* Call unsaved hard registers really used by the corresponding
+     function (including ones used by functions called by the
+     function).  */
+  HARD_REG_SET function_used_regs;
+  /* Set if function_used_regs is initialized.  */
+  unsigned function_used_regs_initialized: 1;
+  /* Set if function_used_regs is valid.  */
+  unsigned function_used_regs_valid: 1;
+
   /* Set when decl is an abstract function pointed to by the
      ABSTRACT_DECL_ORIGIN of a reachable function.  */
   unsigned abstract_and_needed : 1;
Index: gcc/rtlanal.c
===================================================================
--- gcc/rtlanal.c (revision 195240)
+++ gcc/rtlanal.c (working copy)
@@ -1028,13 +1028,13 @@  record_hard_reg_sets (rtx x, const_rtx p
 /* Examine INSN, and compute the set of hard registers written by it.
    Store it in *PSET.  Should only be called after reload.  */
 void
-find_all_hard_reg_sets (const_rtx insn, HARD_REG_SET *pset)
+find_all_hard_reg_sets (const_rtx insn, HARD_REG_SET *pset, bool implicit)
 {
   rtx link;
 
   CLEAR_HARD_REG_SET (*pset);
   note_stores (PATTERN (insn), record_hard_reg_sets, pset);
-  if (CALL_P (insn))
+  if (implicit && CALL_P (insn))
     IOR_HARD_REG_SET (*pset, call_used_reg_set);
   for (link = REG_NOTES (insn); link; link = XEXP (link, 1))
     if (REG_NOTE_KIND (link) == REG_INC)
Index: gcc/final.c
===================================================================
--- gcc/final.c (revision 195240)
+++ gcc/final.c (working copy)
@@ -48,6 +48,7 @@  along with GCC; see the file COPYING3.
 #include "tm.h"
 
 #include "tree.h"
+#include "hard-reg-set.h"
 #include "rtl.h"
 #include "tm_p.h"
 #include "regs.h"
@@ -56,7 +57,6 @@  along with GCC; see the file COPYING3.
 #include "recog.h"
 #include "conditions.h"
 #include "flags.h"
-#include "hard-reg-set.h"
 #include "output.h"
 #include "except.h"
 #include "function.h"
@@ -219,6 +219,7 @@  static int alter_cond (rtx);
 static int final_addr_vec_align (rtx);
 #endif
 static int align_fuzz (rtx, rtx, int, unsigned);
+static void collect_fn_hard_reg_usage (void);
 
 /* Initialize data in final at the beginning of a compilation.  */
 
@@ -4277,6 +4278,8 @@  rest_of_handle_final (void)
   rtx x;
   const char *fnname;
 
+  collect_fn_hard_reg_usage ();
+
   /* Get the function's name, as described by its RTL.  This may be
      different from the DECL_NAME name used in the source file.  */
 
@@ -4533,3 +4536,121 @@  struct rtl_opt_pass pass_clean_state =
   0                                     /* todo_flags_finish */
  }
 };
+
+/* Collect hard register usage for the current function.  */
+
+static void
+collect_fn_hard_reg_usage (void)
+{
+  rtx insn;
+  int i;
+  struct cgraph_node *node;
+  struct hard_reg_set_container other_usage;
+
+  if (!flag_use_caller_save)
+    return;
+
+  node = cgraph_get_node (current_function_decl);
+  gcc_assert (node != NULL);
+
+  gcc_assert (!node->function_used_regs_initialized);
+  node->function_used_regs_initialized = 1;
+
+  for (insn = get_insns (); insn != NULL_RTX; insn = next_insn (insn))
+    {
+      HARD_REG_SET insn_used_regs;
+
+      if (!NONDEBUG_INSN_P (insn))
+	continue;
+
+      find_all_hard_reg_sets (insn, &insn_used_regs, false);
+
+      if (CALL_P (insn)
+	  && !get_call_reg_set_usage (insn, &insn_used_regs, call_used_reg_set))
+	{
+	  CLEAR_HARD_REG_SET (node->function_used_regs);
+	  return;
+	}
+
+      IOR_HARD_REG_SET (node->function_used_regs, insn_used_regs);
+    }
+
+  /* Be conservative - mark fixed and global registers as used.  */
+  IOR_HARD_REG_SET (node->function_used_regs, fixed_reg_set);
+  for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
+    if (global_regs[i])
+      SET_HARD_REG_BIT (node->function_used_regs, i);
+
+#ifdef STACK_REGS
+  /* Handle STACK_REGS conservatively, since the df-framework does not
+     provide accurate information for them.  */
+
+  for (i = FIRST_STACK_REG; i <= LAST_STACK_REG; i++)
+    SET_HARD_REG_BIT (node->function_used_regs, i);
+#endif
+
+  CLEAR_HARD_REG_SET (other_usage.set);
+  targetm.fn_other_hard_reg_usage (&other_usage);
+  IOR_HARD_REG_SET (node->function_used_regs, other_usage.set);
+
+  node->function_used_regs_valid = 1;
+}
+
+/* Get the declaration of the function called by INSN.  */
+
+static tree
+get_call_fndecl (rtx insn)
+{
+  rtx note, datum;
+
+  if (!flag_use_caller_save)
+    return NULL_TREE;
+
+  note = find_reg_note (insn, REG_CALL_DECL, NULL_RTX);
+  if (note == NULL_RTX)
+    return NULL_TREE;
+
+  datum = XEXP (note, 0);
+  if (datum != NULL_RTX)
+    return SYMBOL_REF_DECL (datum);
+
+  return NULL_TREE;
+}
+
+static struct cgraph_node *
+get_call_cgraph_node (rtx insn)
+{
+  tree fndecl;
+
+  if (insn == NULL_RTX)
+    return NULL;
+
+  fndecl = get_call_fndecl (insn);
+  if (fndecl == NULL_TREE
+      || !targetm.binds_local_p (fndecl))
+    return NULL;
+
+  return cgraph_get_node (fndecl);
+}
+
+/* Find hard registers used by function call instruction INSN, and return them
+   in REG_SET.  Return DEFAULT_SET in REG_SET if not found.  */
+
+bool
+get_call_reg_set_usage (rtx insn, HARD_REG_SET *reg_set,
+			HARD_REG_SET default_set)
+{
+  struct cgraph_node *node = get_call_cgraph_node (insn);
+  if (node != NULL
+      && node->function_used_regs_valid)
+    {
+      COPY_HARD_REG_SET (*reg_set, node->function_used_regs);
+      AND_HARD_REG_SET (*reg_set, default_set);
+      return true;
+    }
+  else
+    {
+      COPY_HARD_REG_SET (*reg_set, default_set);
+      return false;
+    }
+}
Index: gcc/regs.h
===================================================================
--- gcc/regs.h (revision 195240)
+++ gcc/regs.h (working copy)
@@ -419,4 +419,8 @@  range_in_hard_reg_set_p (const HARD_REG_
   return true;
 }
 
+/* Get registers used by given function call instruction.  */
+extern bool get_call_reg_set_usage (rtx insn, HARD_REG_SET *reg_set,
+				    HARD_REG_SET default_set);
+
 #endif /* GCC_REGS_H */
Index: gcc/df-scan.c
===================================================================
--- gcc/df-scan.c (revision 195240)
+++ gcc/df-scan.c (working copy)
@@ -3363,10 +3363,13 @@  df_get_call_refs (struct df_collection_r
   bool is_sibling_call;
   unsigned int i;
   HARD_REG_SET defs_generated;
+  HARD_REG_SET fn_reg_set_usage;
 
   CLEAR_HARD_REG_SET (defs_generated);
   df_find_hard_reg_defs (PATTERN (insn_info->insn), &defs_generated);
   is_sibling_call = SIBLING_CALL_P (insn_info->insn);
+  get_call_reg_set_usage (insn_info->insn, &fn_reg_set_usage,
+			  regs_invalidated_by_call);
 
   for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
     {
@@ -3391,6 +3394,7 @@  df_get_call_refs (struct df_collection_r
 	    }
 	}
       else if (TEST_HARD_REG_BIT (regs_invalidated_by_call, i)
+	       && TEST_HARD_REG_BIT (fn_reg_set_usage, i)
 	       /* no clobbers for regs that are the result of the call */
 	       && !TEST_HARD_REG_BIT (defs_generated, i)
 	       && (!is_sibling_call
Index: gcc/haifa-sched.c
===================================================================
--- gcc/haifa-sched.c (revision 195240)
+++ gcc/haifa-sched.c (working copy)
@@ -1271,7 +1271,7 @@  recompute_todo_spec (rtx next, bool for_
 	  {
 	    HARD_REG_SET t;
 
-	    find_all_hard_reg_sets (prev, &t);
+	    find_all_hard_reg_sets (prev, &t, true);
 	    if (TEST_HARD_REG_BIT (t, regno))
 	      return HARD_DEP;
 	    if (prev == pro)
@@ -3041,7 +3041,7 @@  check_clobbered_conditions (rtx insn)
   if ((current_sched_info->flags & DO_PREDICATION) == 0)
     return;
 
-  find_all_hard_reg_sets (insn, &t);
+  find_all_hard_reg_sets (insn, &t, true);
 
  restart:
   for (i = 0; i < ready.n_ready; i++)
Index: gcc/caller-save.c
===================================================================
--- gcc/caller-save.c (revision 195240)
+++ gcc/caller-save.c (working copy)
@@ -441,7 +441,7 @@  setup_save_areas (void)
       freq = REG_FREQ_FROM_BB (BLOCK_FOR_INSN (insn));
       REG_SET_TO_HARD_REG_SET (hard_regs_to_save,
 			       &chain->live_throughout);
-      COPY_HARD_REG_SET (used_regs, call_used_reg_set);
+      get_call_reg_set_usage (insn, &used_regs, call_used_reg_set);
 
       /* Record all registers set in this call insn.  These don't
 	 need to be saved.  N.B. the call insn might set a subreg
@@ -525,7 +525,7 @@  setup_save_areas (void)
 
 	  REG_SET_TO_HARD_REG_SET (hard_regs_to_save,
 				   &chain->live_throughout);
-	  COPY_HARD_REG_SET (used_regs, call_used_reg_set);
+	  get_call_reg_set_usage (insn, &used_regs, call_used_reg_set);
 
 	  /* Record all registers set in this call insn.  These don't
 	     need to be saved.  N.B. the call insn might set a subreg
@@ -804,6 +804,7 @@  save_call_clobbered_regs (void)
 	    {
 	      unsigned regno;
 	      HARD_REG_SET hard_regs_to_save;
+	      HARD_REG_SET call_def_reg_set;
 	      reg_set_iterator rsi;
 	      rtx cheap;
 
@@ -854,7 +855,9 @@  save_call_clobbered_regs (void)
 	      AND_COMPL_HARD_REG_SET (hard_regs_to_save, call_fixed_reg_set);
 	      AND_COMPL_HARD_REG_SET (hard_regs_to_save, this_insn_sets);
 	      AND_COMPL_HARD_REG_SET (hard_regs_to_save, hard_regs_saved);
-	      AND_HARD_REG_SET (hard_regs_to_save, call_used_reg_set);
+	      get_call_reg_set_usage (insn, &call_def_reg_set,
+				      call_used_reg_set);
+	      AND_HARD_REG_SET (hard_regs_to_save, call_def_reg_set);
 
 	      for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
 		if (TEST_HARD_REG_BIT (hard_regs_to_save, regno))
Index: gcc/ira-int.h
===================================================================
--- gcc/ira-int.h (revision 195240)
+++ gcc/ira-int.h (working copy)
@@ -374,6 +374,8 @@  struct ira_allocno
   /* The number of calls across which it is live, but which should not
      affect register preferences.  */
   int cheap_calls_crossed_num;
+  /* Registers clobbered by intersected calls.  */
+  HARD_REG_SET crossed_calls_clobbered_regs;
   /* Array of usage costs (accumulated and the one updated during
      coloring) for each hard register of the allocno class.  The
      member value can be NULL if all costs are the same and equal to
@@ -417,6 +419,8 @@  struct ira_allocno
 #define ALLOCNO_CALL_FREQ(A) ((A)->call_freq)
 #define ALLOCNO_CALLS_CROSSED_NUM(A) ((A)->calls_crossed_num)
 #define ALLOCNO_CHEAP_CALLS_CROSSED_NUM(A) ((A)->cheap_calls_crossed_num)
+#define ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS(A) \
+  ((A)->crossed_calls_clobbered_regs)
 #define ALLOCNO_MEM_OPTIMIZED_DEST(A) ((A)->mem_optimized_dest)
 #define ALLOCNO_MEM_OPTIMIZED_DEST_P(A) ((A)->mem_optimized_dest_p)
 #define ALLOCNO_SOMEWHERE_RENAMED_P(A) ((A)->somewhere_renamed_p)
Index: gcc/opts.c
===================================================================
--- gcc/opts.c (revision 195240)
+++ gcc/opts.c (working copy)
@@ -484,6 +484,7 @@  static const struct default_options defa
     { OPT_LEVELS_2_PLUS, OPT_ftree_tail_merge, NULL, 1 },
     { OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_foptimize_strlen, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fhoist_adjacent_loads, NULL, 1 },
+    { OPT_LEVELS_2_PLUS, OPT_fuse_caller_save, NULL, 1 },
 
     /* -O3 optimizations.  */
     { OPT_LEVELS_3_PLUS, OPT_ftree_loop_distribute_patterns, NULL, 1 },
Index: gcc/ira-lives.c
===================================================================
--- gcc/ira-lives.c (revision 195240)
+++ gcc/ira-lives.c (working copy)
@@ -1273,6 +1273,10 @@  process_bb_node_lives (ira_loop_tree_nod
 		  ira_object_t obj = ira_object_id_map[i];
 		  ira_allocno_t a = OBJECT_ALLOCNO (obj);
 		  int num = ALLOCNO_NUM (a);
+		  HARD_REG_SET this_call_used_reg_set;
+
+		  get_call_reg_set_usage (insn, &this_call_used_reg_set,
+					  call_used_reg_set);
 
 		  /* Don't allocate allocnos that cross setjmps or any
 		     call, if this function receives a nonlocal
@@ -1287,9 +1291,9 @@  process_bb_node_lives (ira_loop_tree_nod
 		  if (can_throw_internal (insn))
 		    {
 		      IOR_HARD_REG_SET (OBJECT_CONFLICT_HARD_REGS (obj),
-					call_used_reg_set);
+					this_call_used_reg_set);
 		      IOR_HARD_REG_SET (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj),
-					call_used_reg_set);
+					this_call_used_reg_set);
 		    }
 
 		  if (sparseset_bit_p (allocnos_processed, num))
@@ -1306,6 +1310,8 @@  process_bb_node_lives (ira_loop_tree_nod
 		  /* Mark it as saved at the next call.  */
 		  allocno_saved_at_call[num] = last_call_num + 1;
 		  ALLOCNO_CALLS_CROSSED_NUM (a)++;
+		  IOR_HARD_REG_SET (ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a),
+				    this_call_used_reg_set);
 		  if (cheap_reg != NULL_RTX
 		      && ALLOCNO_REGNO (a) == (int) REGNO (cheap_reg))
 		    ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a)++;
Index: gcc/ira-build.c
===================================================================
--- gcc/ira-build.c (revision 195240)
+++ gcc/ira-build.c (working copy)
@@ -506,6 +506,7 @@  ira_create_allocno (int regno, bool cap_
   ALLOCNO_CALL_FREQ (a) = 0;
   ALLOCNO_CALLS_CROSSED_NUM (a) = 0;
   ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a) = 0;
+  CLEAR_HARD_REG_SET (ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a));
 #ifdef STACK_REGS
   ALLOCNO_NO_STACK_REG_P (a) = false;
   ALLOCNO_TOTAL_NO_STACK_REG_P (a) = false;
@@ -903,6 +904,8 @@  create_cap_allocno (ira_allocno_t a)
 
   ALLOCNO_CALLS_CROSSED_NUM (cap) = ALLOCNO_CALLS_CROSSED_NUM (a);
   ALLOCNO_CHEAP_CALLS_CROSSED_NUM (cap) = ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a);
+  IOR_HARD_REG_SET (ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (cap),
+		    ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a));
   if (internal_flag_ira_verbose > 2 && ira_dump_file != NULL)
     {
       fprintf (ira_dump_file, "    Creating cap ");
@@ -1822,6 +1825,8 @@  propagate_allocno_info (void)
 	    += ALLOCNO_CALLS_CROSSED_NUM (a);
 	  ALLOCNO_CHEAP_CALLS_CROSSED_NUM (parent_a)
 	    += ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a);
+ 	  IOR_HARD_REG_SET (ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (parent_a),
+ 			    ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a));
 	  ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (parent_a)
 	    += ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (a);
 	  aclass = ALLOCNO_CLASS (a);
@@ -2202,6 +2207,9 @@  propagate_some_info_from_allocno (ira_al
   ALLOCNO_CALLS_CROSSED_NUM (a) += ALLOCNO_CALLS_CROSSED_NUM (from_a);
   ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a)
     += ALLOCNO_CHEAP_CALLS_CROSSED_NUM (from_a);
+  IOR_HARD_REG_SET (ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a),
+ 		    ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (from_a));
+
   ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (a)
     += ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (from_a);
   if (! ALLOCNO_BAD_SPILL_P (from_a))
@@ -2827,6 +2835,8 @@  copy_info_to_removed_store_destinations
 	+= ALLOCNO_CALLS_CROSSED_NUM (a);
       ALLOCNO_CHEAP_CALLS_CROSSED_NUM (parent_a)
 	+= ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a);
+      IOR_HARD_REG_SET (ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (parent_a),
+ 			ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a));
       ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (parent_a)
 	+= ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (a);
       merged_p = true;
Index: gcc/calls.c
===================================================================
--- gcc/calls.c (revision 195240)
+++ gcc/calls.c (working copy)
@@ -3158,6 +3158,19 @@  expand_call (tree exp, rtx target, int i
 		   next_arg_reg, valreg, old_inhibit_defer_pop, call_fusage,
 		   flags, args_so_far);
 
+      if (flag_use_caller_save)
+	{
+	  rtx last, datum = NULL_RTX;
+	  if (fndecl != NULL_TREE)
+	    {
+	      datum = XEXP (DECL_RTL (fndecl), 0);
+	      gcc_assert (datum != NULL_RTX
+			  && GET_CODE (datum) == SYMBOL_REF);
+	    }
+	  last = last_call_insn ();
+	  add_reg_note (last, REG_CALL_DECL, datum);
+	}
+
       /* If the call setup or the call itself overlaps with anything
 	 of the argument setup we probably clobbered our call address.
 	 In that case we can't do sibcalls.  */
@@ -4183,6 +4196,14 @@  emit_library_call_value_1 (int retval, r
 	       valreg,
 	       old_inhibit_defer_pop + 1, call_fusage, flags, args_so_far);
 
+  if (flag_use_caller_save)
+    {
+      rtx last, datum = orgfun;
+      gcc_assert (GET_CODE (datum) == SYMBOL_REF);
+      last = last_call_insn ();
+      add_reg_note (last, REG_CALL_DECL, datum);
+    }
+
   /* Right-shift returned value if necessary.  */
   if (!pcc_struct_value
       && TYPE_MODE (tfom) != BLKmode
Index: gcc/emit-rtl.c
===================================================================
--- gcc/emit-rtl.c (revision 195240)
+++ gcc/emit-rtl.c (working copy)
@@ -3517,6 +3517,7 @@  try_split (rtx pat, rtx trial, int last)
   int probability;
   rtx insn_last, insn;
   int njumps = 0;
+  rtx call_insn = NULL_RTX;
 
   /* We're not good at redistributing frame information.  */
   if (RTX_FRAME_RELATED_P (trial))
@@ -3589,6 +3590,9 @@  try_split (rtx pat, rtx trial, int last)
 	  {
 	    rtx next, *p;
 
+	    gcc_assert (call_insn == NULL_RTX);
+	    call_insn = insn;
+
 	    /* Add the old CALL_INSN_FUNCTION_USAGE to whatever the
 	       target may have explicitly specified.  */
 	    p = &CALL_INSN_FUNCTION_USAGE (insn);
@@ -3660,6 +3664,11 @@  try_split (rtx pat, rtx trial, int last)
 	  fixup_args_size_notes (NULL_RTX, insn_last, INTVAL (XEXP (note, 0)));
 	  break;
 
+	case REG_CALL_DECL:
+	  gcc_assert (call_insn != NULL_RTX);
+	  add_reg_note (call_insn, REG_NOTE_KIND (note), XEXP (note, 0));
+	  break;
+
 	default:
 	  break;
 	}
Index: gcc/common.opt
===================================================================
--- gcc/common.opt (revision 195240)
+++ gcc/common.opt (working copy)
@@ -2540,4 +2540,8 @@  Create a position independent executable
 z
 Driver Joined Separate
 
+fuse-caller-save
+Common Report Var(flag_use_caller_save) Optimization
+Use caller save registers across calls if possible
+
 ; This comment is to ensure we retain the blank line above.
Index: gcc/ira-costs.c
===================================================================
--- gcc/ira-costs.c (revision 195240)
+++ gcc/ira-costs.c (working copy)
@@ -2082,6 +2082,7 @@  ira_tune_allocno_costs (void)
   ira_allocno_object_iterator oi;
   ira_object_t obj;
   bool skip_p;
+  HARD_REG_SET *crossed_calls_clobber_regs;
 
   FOR_EACH_ALLOCNO (a, ai)
     {
@@ -2116,17 +2117,24 @@  ira_tune_allocno_costs (void)
 		continue;
 	      rclass = REGNO_REG_CLASS (regno);
 	      cost = 0;
-	      if (ira_hard_reg_set_intersection_p (regno, mode, call_used_reg_set)
-		  || HARD_REGNO_CALL_PART_CLOBBERED (regno, mode))
-		cost += (ALLOCNO_CALL_FREQ (a)
-			 * (ira_memory_move_cost[mode][rclass][0]
-			    + ira_memory_move_cost[mode][rclass][1]));
+	      crossed_calls_clobber_regs
+		= &(ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a));
+	      if (ira_hard_reg_set_intersection_p (regno, mode,
+						   *crossed_calls_clobber_regs))
+		{
+		  if (ira_hard_reg_set_intersection_p (regno, mode,
+						       call_used_reg_set)
+		      || HARD_REGNO_CALL_PART_CLOBBERED (regno, mode))
+		    cost += (ALLOCNO_CALL_FREQ (a)
+			     * (ira_memory_move_cost[mode][rclass][0]
+				+ ira_memory_move_cost[mode][rclass][1]));
 #ifdef IRA_HARD_REGNO_ADD_COST_MULTIPLIER
-	      cost += ((ira_memory_move_cost[mode][rclass][0]
-			+ ira_memory_move_cost[mode][rclass][1])
-		       * ALLOCNO_FREQ (a)
-		       * IRA_HARD_REGNO_ADD_COST_MULTIPLIER (regno) / 2);
+		  cost += ((ira_memory_move_cost[mode][rclass][0]
+			    + ira_memory_move_cost[mode][rclass][1])
+			   * ALLOCNO_FREQ (a)
+			   * IRA_HARD_REGNO_ADD_COST_MULTIPLIER (regno) / 2);
 #endif
+		}
 	      if (INT_MAX - cost < reg_costs[j])
 		reg_costs[j] = INT_MAX;
 	      else
Index: gcc/rtl.h
===================================================================
--- gcc/rtl.h (revision 195240)
+++ gcc/rtl.h (working copy)
@@ -2039,7 +2039,7 @@  extern const_rtx set_of (const_rtx, cons
 extern void record_hard_reg_sets (rtx, const_rtx, void *);
 extern void record_hard_reg_uses (rtx *, void *);
 #ifdef HARD_CONST
-extern void find_all_hard_reg_sets (const_rtx, HARD_REG_SET *);
+extern void find_all_hard_reg_sets (const_rtx, HARD_REG_SET *, bool);
 #endif
 extern void note_stores (const_rtx, void (*) (rtx, const_rtx, void *), void *);
 extern void note_uses (rtx *, void (*) (rtx *, void *), void *);
Index: gcc/combine.c
===================================================================
--- gcc/combine.c (revision 195240)
+++ gcc/combine.c (working copy)
@@ -13188,6 +13188,7 @@  distribute_notes (rtx notes, rtx from_in
 	case REG_NORETURN:
 	case REG_SETJMP:
 	case REG_TM:
+	case REG_CALL_DECL:
 	  /* These notes must remain with the call.  It should not be
 	     possible for both I2 and I3 to be a call.  */
 	  if (CALL_P (i3))
Index: gcc/resource.c
===================================================================
--- gcc/resource.c (revision 195240)
+++ gcc/resource.c (working copy)
@@ -649,10 +649,12 @@  mark_set_resources (rtx x, struct resour
       if (mark_type == MARK_SRC_DEST_CALL)
 	{
 	  rtx link;
+	  HARD_REG_SET regs;
 
 	  res->cc = res->memory = 1;
 
-	  IOR_HARD_REG_SET (res->regs, regs_invalidated_by_call);
+	  get_call_reg_set_usage (x, &regs, regs_invalidated_by_call);
+	  IOR_HARD_REG_SET (res->regs, regs);
 
 	  for (link = CALL_INSN_FUNCTION_USAGE (x);
 	       link; link = XEXP (link, 1))
@@ -998,11 +1000,15 @@  mark_target_live_regs (rtx insns, rtx ta
 
 	  if (CALL_P (real_insn))
 	    {
+	      HARD_REG_SET regs_invalidated_by_this_call;
 	      /* CALL clobbers all call-used regs that aren't fixed except
 		 sp, ap, and fp.  Do this before setting the result of the
 		 call live.  */
-	      AND_COMPL_HARD_REG_SET (current_live_regs,
+	      get_call_reg_set_usage (real_insn,
+				      &regs_invalidated_by_this_call,
 				      regs_invalidated_by_call);
+	      AND_COMPL_HARD_REG_SET (current_live_regs,
+				      regs_invalidated_by_this_call);
 
 	      /* A CALL_INSN sets any global register live, since it may
 		 have been modified by the call.  */
Index: gcc/reg-notes.def
===================================================================
--- gcc/reg-notes.def (revision 195240)
+++ gcc/reg-notes.def (working copy)
@@ -216,3 +216,8 @@  REG_NOTE (ARGS_SIZE)
    that the return value of a call can be used to reinitialize a
    pseudo reg.  */
 REG_NOTE (RETURNED)
+
+/* Used to mark a call with the function decl called by the call.
+   The decl might not be available in the call due to splitting of the call
+   insn.  This note is a SYMBOL_REF.  */
+REG_NOTE (CALL_DECL)
Index: gcc/doc/tm.texi
===================================================================
--- gcc/doc/tm.texi (revision 195418)
+++ gcc/doc/tm.texi (working copy)
@@ -3074,6 +3074,7 @@  This describes the stack layout and call
 * Profiling::
 * Tail Calls::
 * Stack Smashing Protection::
+* Miscellaneous Register Hooks::
 @end menu
 
 @node Frame Layout
@@ -4999,6 +5000,14 @@  normally defined in @file{libgcc2.c}.
 Whether this target supports splitting the stack when the options described in @var{opts} have been passed.  This is called after options have been parsed, so the target may reject splitting the stack in some configurations.  The default version of this hook returns false.  If @var{report} is true, this function may issue a warning or error; if @var{report} is false, it must simply return a value
 @end deftypefn
 
+@node Miscellaneous Register Hooks
+@subsection Miscellaneous register hooks
+@cindex miscellaneous register hooks
+
+@deftypefn {Target Hook} void TARGET_FN_OTHER_HARD_REG_USAGE (struct hard_reg_set_container *@var{regs})
+Add any hard registers to @var{regs} that are set or clobbered by a call to the function.  This hook only needs to be defined to provide registers that cannot be found by examination of the final RTL representation of a function.
+@end deftypefn
+
 @node Varargs
 @section Implementing the Varargs Macros
 @cindex varargs implementation
Index: gcc/doc/tm.texi.in
===================================================================
--- gcc/doc/tm.texi.in (revision 195418)
+++ gcc/doc/tm.texi.in (working copy)
@@ -3042,6 +3042,7 @@  This describes the stack layout and call
 * Profiling::
 * Tail Calls::
 * Stack Smashing Protection::
+* Miscellaneous Register Hooks::
 @end menu
 
 @node Frame Layout
@@ -4922,6 +4923,12 @@  normally defined in @file{libgcc2.c}.
 
 @hook TARGET_SUPPORTS_SPLIT_STACK
 
+@node Miscellaneous Register Hooks
+@subsection Miscellaneous register hooks
+@cindex miscellaneous register hooks
+
+@hook TARGET_FN_OTHER_HARD_REG_USAGE
+
 @node Varargs
 @section Implementing the Varargs Macros
 @cindex varargs implementation
Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi (revision 195418)
+++ gcc/doc/invoke.texi (working copy)
@@ -419,8 +419,8 @@  Objective-C and Objective-C++ Dialects}.
 -ftree-ter -ftree-vect-loop-version -ftree-vectorize -ftree-vrp @gol
 -funit-at-a-time -funroll-all-loops -funroll-loops @gol
 -funsafe-loop-optimizations -funsafe-math-optimizations -funswitch-loops @gol
--fvariable-expansion-in-unroller -fvect-cost-model -fvpt -fweb @gol
--fwhole-program -fwpa -fuse-ld=@var{linker} -fuse-linker-plugin @gol
+-fuse-caller-save -fvariable-expansion-in-unroller -fvect-cost-model -fvpt @gol
+-fweb -fwhole-program -fwpa -fuse-ld=@var{linker} -fuse-linker-plugin @gol
 --param @var{name}=@var{value}
 -O  -O0  -O1  -O2  -O3  -Os -Ofast -Og}
 
@@ -7355,6 +7355,14 @@  and then tries to find ways to combine t
 
 Enabled by default at @option{-O1} and higher.
 
+@item -fuse-caller-save
+Use caller-saved registers for allocation if those registers are not used by
+any called function.  In that case it is not necessary to save and restore
+them around calls.  This is only possible if the called functions are part of
+the same compilation unit as the current function and are compiled before it.
+
+Enabled at levels @option{-O2}, @option{-O3}, @option{-Os}.
+
 @item -fconserve-stack
 @opindex fconserve-stack
 Attempt to minimize stack usage.  The compiler attempts to use less
Index: gcc/config/arm/arm.c
===================================================================
--- gcc/config/arm/arm.c (revision 195240)
+++ gcc/config/arm/arm.c (working copy)
@@ -270,6 +270,7 @@  static bool arm_vectorize_vec_perm_const
 					     const unsigned char *sel);
 static void arm_canonicalize_comparison (int *code, rtx *op0, rtx *op1,
 					 bool op0_preserve_value);
+static void arm_fn_other_hard_reg_usage (struct hard_reg_set_container *);
 
 /* Table of machine attributes.  */
 static const struct attribute_spec arm_attribute_table[] =
@@ -633,6 +634,10 @@  static const struct attribute_spec arm_a
 #define TARGET_CANONICALIZE_COMPARISON \
   arm_canonicalize_comparison
 
+#undef TARGET_FN_OTHER_HARD_REG_USAGE
+#define TARGET_FN_OTHER_HARD_REG_USAGE \
+  arm_fn_other_hard_reg_usage
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 /* Obstack for minipool constant handling.  */
@@ -3695,6 +3700,19 @@  arm_canonicalize_comparison (int *code,
     }
 }
 
+/* Implement TARGET_FN_OTHER_HARD_REG_USAGE.  */
+
+static void
+arm_fn_other_hard_reg_usage (struct hard_reg_set_container *regs)
+{
+  if (TARGET_AAPCS_BASED)
+    {
+      /* For AAPCS, IP and CC can be clobbered by veneers inserted by the
+	 linker.  */
+      SET_HARD_REG_BIT (regs->set, IP_REGNUM);
+      SET_HARD_REG_BIT (regs->set, CC_REGNUM);
+    }
+}
 
 /* Define how to find the value returned by a function.  */
 
Index: gcc/testsuite/lib/target-supports.exp
===================================================================
--- gcc/testsuite/lib/target-supports.exp (revision 195240)
+++ gcc/testsuite/lib/target-supports.exp (working copy)
@@ -897,6 +897,26 @@  proc check_effective_target_mips16_attri
     } [add_options_for_mips16_attribute ""]]
 }
 
+# Return 1 if the target generates mips16 code by default.
+
+proc check_effective_target_mips16 { } {
+    return [check_no_compiler_messages mips16 assembly {
+	#if !(defined __mips16)
+	#error FOO
+	#endif
+    } ""]
+}
+
+# Return 1 if the target generates micromips code by default.
+
+proc check_effective_target_micromips { } {
+    return [check_no_compiler_messages micromips assembly {
+	#if !(defined __mips_micromips)
+	#error FOO
+	#endif
+    } ""]
+}
+
 # Return 1 if the target supports long double larger than double when
 # using the new ABI, 0 otherwise.
 
Index: gcc/testsuite/gcc.target/mips/mips.exp
===================================================================
--- gcc/testsuite/gcc.target/mips/mips.exp (revision 195240)
+++ gcc/testsuite/gcc.target/mips/mips.exp (working copy)
@@ -245,6 +245,7 @@  set mips_option_groups {
     small-data "-G[0-9]+"
     warnings "-w"
     dump "-fdump-.*"
+    save_temps "-save-temps"
 }
 
 # Add -mfoo/-mno-foo options to mips_option_groups.
@@ -301,6 +302,7 @@  foreach option {
     tree-vectorize
     unroll-all-loops
     unroll-loops
+    use-caller-save
 } {
     lappend mips_option_groups $option "-f(no-|)$option"
 }
Index: gcc/testsuite/gcc.target/mips/aru-1.c
===================================================================
--- /dev/null (new file)
+++ gcc/testsuite/gcc.target/mips/aru-1.c (revision 0)
@@ -0,0 +1,38 @@ 
+/* { dg-do run } */
+/* { dg-options "-fuse-caller-save -save-temps" } */
+/* { dg-skip-if "" { *-*-* }  { "*" } { "-Os" } } */
+/* Testing -fuse-caller-save optimization option.  */
+
+static int __attribute__((noinline))
+bar (int x)
+{
+  return x + 3;
+}
+
+int __attribute__((noinline))
+foo (int y)
+{
+  return y + bar (y);
+}
+
+int
+main (void)
+{
+  return !(foo (5) == 13);
+}
+
+/* Check that there are only 2 stack-saves: r31 in main and foo.  */
+
+/* Variant not mips16.  Check that there are only 2 sw/sd.  */
+/* { dg-final { scan-assembler-times "(?n)s\[wd\]\t\\\$.*,.*\\(\\\$sp\\)" 2 { target { ! mips16 } } } } */
+
+/* Variant not mips16, subvariant micromips.  Additionally check that there is
+   no swm.  */
+/* { dg-final { scan-assembler-times "(?n)swm\t\\\$.*,.*\\(\\\$sp\\)" 0 {target micromips } } } */
+
+/* Variant mips16.  The save can save 1 or more registers; check that only 1 is
+   saved, twice in total.  */
+/* { dg-final { scan-assembler-times "(?n)save\t\[0-9\]*,\\\$\[^,\]*\$" 2 { target mips16 } } } */
+
+/* Check that the first caller-save register is unused.  */
+/* { dg-final { scan-assembler-not "(\\\$16)" } } */