
[IRA] Analysis of register usage of functions for usage by IRA.

Message ID 510282FE.1060809@mentor.com
State New

Commit Message

Tom de Vries Jan. 25, 2013, 1:05 p.m. UTC
Vladimir,

this patch adds analysis of register usage of functions for usage by IRA.

The patch:
- adds analysis in pass_final to track which hard registers are set or clobbered
  by the function body, and stores that information in a struct cgraph_node.
- adds a target hook fn_other_hard_reg_usage to list hard registers that are
  set or clobbered by a call to a function, but are not listed as such in the
  function body, for instance registers clobbered by veneers inserted by the
  linker (a minimal backend sketch of this hook follows the list).
- adds a reg-note REG_CALL_DECL, to be able to easily link call_insns to their
  corresponding declaration, even after a call has been split into an
  insn (set register to function address) and a call_insn (call register), which
  can happen for instance on sh, and on mips with -mabicalls.
- uses the register analysis in IRA.
- adds an option -fuse-caller-save to control the optimization, on by default
  at -Os and -O2 and higher.
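
A minimal sketch of what a backend implementation of this hook could look like
(the register macro below is a placeholder, not something from this patch; the
actual ARM implementation is the arm_fn_other_hard_reg_usage function in
config/arm/arm.c):
...
/* Report registers that a call to a function in this unit may clobber even
   though no insn in the function body mentions them, e.g. a scratch register
   used by linker-inserted veneers.  */

static void
example_fn_other_hard_reg_usage (struct hard_reg_set_container *regs)
{
  /* EXAMPLE_VENEER_SCRATCH_REGNUM stands for whatever register the target's
     veneers may clobber.  */
  SET_HARD_REG_BIT (regs->set, EXAMPLE_VENEER_SCRATCH_REGNUM);
}

#undef TARGET_FN_OTHER_HARD_REG_USAGE
#define TARGET_FN_OTHER_HARD_REG_USAGE example_fn_other_hard_reg_usage
...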


The patch (original version by Radovan Obradovic) is similar to your patch
( http://gcc.gnu.org/ml/gcc-patches/2007-01/msg01625.html ) from 2007.
But this patch doesn't implement save area stack slot sharing.
( Btw, I've borrowed the struct cgraph_node field name and comment from the 2007
patch ).

[ Steven, you mentioned in this discussion
  ( http://gcc.gnu.org/ml/gcc/2012-10/msg00213.html ) that you are working on
  porting the 2007 patch to trunk. What is the status of that effort?
]


As an example of the functionality, consider foo and bar from test-case aru-1.c:
...
static int __attribute__((noinline))
bar (int x)
{
  return x + 3;
}

int __attribute__((noinline))
foo (int y)
{
  return y + bar (y);
}
...

Compiled at -O2, bar only sets register $2 (the first return register):
...
bar:
        .frame  $sp,0,$31               # vars= 0, regs= 0/0, args= 0, gp= 0
        .mask   0x00000000,0
        .fmask  0x00000000,0
        .set    noreorder
        .set    nomacro
        j       $31
        addiu   $2,$4,3
...

foo then can use register $3 (the second return register) instead of register
$16 to save the value in register $4 (the first argument register) over the
call, as demonstrated here in a -fno-use-caller-save vs. -fuse-caller-save diff:
...
foo:                                    foo:
# vars= 0, regs= 2/0, args= 16, gp= 8 | # vars= 0, regs= 1/0, args= 16, gp= 8
.frame  $sp,32,$31                      .frame  $sp,32,$31
.mask   0x80010000,-4                 | .mask   0x80000000,-4
.fmask  0x00000000,0                    .fmask  0x00000000,0
.set    noreorder                       .set    noreorder
.set    nomacro                         .set    nomacro
addiu   $sp,$sp,-32                     addiu   $sp,$sp,-32
sw      $31,28($sp)                     sw      $31,28($sp)
sw      $16,24($sp)                   <
.option pic0                            .option pic0
jal     bar                             jal     bar
.option pic2                            .option pic2
move    $16,$4                        | move    $3,$4

lw      $31,28($sp)                     lw      $31,28($sp)
addu    $2,$2,$16                     | addu    $2,$2,$3
lw      $16,24($sp)                   <
j       $31                             j       $31
addiu   $sp,$sp,32                      addiu   $sp,$sp,32
...
That way we skip the save and restore of register $16, which is not necessary
for $3. Btw, a further improvement could be to reuse $4 after the call, and
eliminate the move.


A version of this patch on top of 4.6 ran into trouble with the epilogue on arm,
where a register was clobbered by a stack pop instruction, which was not
visible in the rtl representation. This instruction was introduced in
arm_output_epilogue by code marked with the comment 'pop call clobbered
registers if it avoids a separate stack adjustment'.
I cannot reproduce that issue on trunk. Looking at the generated rtl, it seems
that the epilogue instructions now list all registers they set, so
collect_fn_hard_reg_usage is able to analyze all clobbered registers.


Bootstrapped and reg-tested on x86_64, including Ada. Built and reg-tested on
mips, arm, ppc and sh. No issues found. OK for stage1 trunk?

Thanks,
- Tom

2013-01-24  Radovan Obradovic  <robradovic@mips.com>
	    Tom de Vries  <tom@codesourcery.com>

	* hooks.c (hook_void_hard_reg_set_containerp): New function.
	* hooks.h (hook_void_hard_reg_set_containerp): Declare.
	* target.def (fn_other_hard_reg_usage): New DEFHOOK.
	* config/arm/arm.c (TARGET_FN_OTHER_HARD_REG_USAGE): Redefine as
	arm_fn_other_hard_reg_usage.
	(arm_fn_other_hard_reg_usage): New function.
	* doc/tm.texi.in (@node Stack and Calling): Add Miscellaneous Register
	Hooks to @menu.
	(@node Miscellaneous Register Hooks): New node.
	(@hook TARGET_FN_OTHER_HARD_REG_USAGE): New hook.
	* doc/tm.texi: Regenerate.
	* reg-notes.def (REG_NOTE (CALL_DECL)): New reg-note REG_CALL_DECL.
	* calls.c (expand_call, emit_library_call_value_1): Add REG_CALL_DECL
	reg-note.
	* combine.c (distribute_notes): Handle REG_CALL_DECL reg-note.
	* emit-rtl.c (try_split): Same.
	* rtlanal.c (find_all_hard_reg_sets): Add bool implicit parameter and
	handle.
	* rtl.h (find_all_hard_reg_sets): Add bool parameter.
	* haifa-sched.c (recompute_todo_spec, check_clobbered_conditions): Add
	new argument to find_all_hard_reg_sets call.
	cgraph.h (struct cgraph_node): Add function_used_regs,
	function_used_regs_initialized and function_used_regs_valid fields.
	* common.opt (fuse-caller-save): New option.
	* opts.c (default_options_table): Add OPT_LEVELS_2_PLUS entry with
	OPT_fuse_caller_save.
	* final.c: Move include of hard-reg-set.h to before rtl.h to declare
	find_all_hard_reg_sets.
	(collect_fn_hard_reg_usage, get_call_fndecl, get_call_cgraph_node)
	(get_call_reg_set_usage): New function.
	(rest_of_handle_final): Use collect_fn_hard_reg_usage.
	* regs.h (get_call_reg_set_usage): Declare.
	* df-scan.c (df_get_call_refs): Use get_call_reg_set_usage.
	* caller-save.c (setup_save_areas, save_call_clobbered_regs): Use
	get_call_reg_set_usage.
	* resource.c (mark_set_resources, mark_target_live_regs): Use
	get_call_reg_set_usage.
	* ira-int.h (struct ira_allocno): Add crossed_calls_clobbered_regs
	field.
	(ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS): Define.
	* ira-lives.c (process_bb_node_lives): Use get_call_reg_set_usage.
	Calculate ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS.
	* ira-build.c (ira_create_allocno): Init
	ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS.
	(create_cap_allocno, propagate_allocno_info)
	(propagate_some_info_from_allocno)
	(copy_info_to_removed_store_destinations): Handle
	ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS.
	* ira-costs.c (ira_tune_allocno_costs): Use
	ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS to adjust costs.
	* doc/invoke.texi (@item Optimization Options): Add -fuse-caller-save to
	gccoptlist.
	(@item -fuse-caller-save): New item.

	* lib/target-supports.exp (check_effective_target_mips16)
	(check_effective_target_micromips): New proc.
	* gcc.target/mips/mips.exp: Add use-caller-save to -ffoo/-fno-foo
	options.  Add -save-temps to mips_option_groups.
	* gcc.target/mips/aru-1.c: New test.

Comments

Vladimir Makarov Jan. 25, 2013, 3:36 p.m. UTC | #1
On 01/25/2013 08:05 AM, Tom de Vries wrote:
> Vladimir,
>
> this patch adds analysis of register usage of functions for usage by IRA.
>
> The patch:
> - adds analysis in pass_final to track which hard registers are set or clobbered
>    by the function body, and stores that information in a struct cgraph_node.
> - adds a target hook fn_other_hard_reg_usage to list hard registers that are
>    set or clobbered by a call to a function, but are not listed as such in the
>    function body, such as f.i. registers clobbered by veneers inserted by the
>    linker.
> - adds a reg-note REG_CALL_DECL, to be able to easily link call_insns to their
>    corresponding declaration, even after the calls may have been split into an
>    insn (set register to function address) and a call_insn (call register), which
>    can happen for f.i. sh, and mips with -mabi-calls.
> - uses the register analysis in IRA.
> - adds an option -fuse-caller-save to control the optimization, on by default
>    at -Os and -O2 and higher.
>
>
> The patch (original version by Radovan Obradovic) is similar to your patch
> ( http://gcc.gnu.org/ml/gcc-patches/2007-01/msg01625.html ) from 2007.
> But this patch doesn't implement save area stack slot sharing.
> ( Btw, I've borrowed the struct cgraph_node field name and comment from the 2007
> patch ).
>
> [ Steven, you mentioned in this discussion
>    ( http://gcc.gnu.org/ml/gcc/2012-10/msg00213.html ) that you are working on
>    porting the 2007 patch to trunk. What is the status of that effort?
> ]
>
>
> As an example of the functionality, consider foo and bar from test-case aru-1.c:
> ...
> static int __attribute__((noinline))
> bar (int x)
> {
>    return x + 3;
> }
>
> int __attribute__((noinline))
> foo (int y)
> {
>    return y + bar (y);
> }
> ...
>
> Compiled at -O2, bar only sets register $2 (the first return register):
> ...
> bar:
>          .frame  $sp,0,$31               # vars= 0, regs= 0/0, args= 0, gp= 0
>          .mask   0x00000000,0
>          .fmask  0x00000000,0
>          .set    noreorder
>          .set    nomacro
>          j       $31
>          addiu   $2,$4,3
> ...
>
> foo then can use register $3 (the second return register) instead of register
> $16 to save the value in register $4 (the first argument register) over the
> call, as demonstrated here in a -fno-use-caller-save vs. -fuse-caller-save diff:
> ...
> foo:                                    foo:
> # vars= 0, regs= 2/0, args= 16, gp= 8 | # vars= 0, regs= 1/0, args= 16, gp= 8
> .frame  $sp,32,$31                      .frame  $sp,32,$31
> .mask   0x80010000,-4                 | .mask   0x80000000,-4
> .fmask  0x00000000,0                    .fmask  0x00000000,0
> .set    noreorder                       .set    noreorder
> .set    nomacro                         .set    nomacro
> addiu   $sp,$sp,-32                     addiu   $sp,$sp,-32
> sw      $31,28($sp)                     sw      $31,28($sp)
> sw      $16,24($sp)                   <
> .option pic0                            .option pic0
> jal     bar                             jal     bar
> .option pic2                            .option pic2
> move    $16,$4                        | move    $3,$4
>
> lw      $31,28($sp)                     lw      $31,28($sp)
> addu    $2,$2,$16                     | addu    $2,$2,$3
> lw      $16,24($sp)                   <
> j       $31                             j       $31
> addiu   $sp,$sp,32                      addiu   $sp,$sp,32
> ...
> That way we skip the save and restore of register $16, which is not necessary
> for $3. Btw, a further improvement could be to reuse $4 after the call, and
> eliminate the move.
>
>
> A version of this patch on top of 4.6 ran into trouble with the epilogue on arm,
> where a register was clobbered by a stack pop instruction, while that was not
> visible in the rtl representation. This instruction was introduced in
> arm_output_epilogue by code marked with the comment 'pop call clobbered
> registers if it avoids a separate stack adjustment'.
> I cannot reproduce that issue on trunk. Looking at the generated rtl, it seems
> that the epilogue instructions now list all registers set by it, so
> collect_fn_hard_reg_usage is able to analyze all clobbered registers.
>
>
> Bootstrapped and reg-tested on x86_64, Ada inclusive. Build and reg-tested on
> mips, arm, ppc and sh. No issues found. OK for stage1 trunk?
>
>
Thanks for the patch.  I'll look at it during the next week.

Right now I see that the code is based on reload, which uses
caller-save.c.  LRA does not use caller-save.c at all.  At the moment we
have LRA support only for x86/x86-64, but the next version will probably
have a few more targets based on LRA.  Fortunately, the LRA modification
will be pretty easy with all this machinery.

I am going to use the ira-improv branch for some of my future work for
gcc 4.9, and I am going to merge trunk into it regularly (about once per
month).  So if you want, you could use the branch for your work too.  But
this is absolutely up to you.  I don't mind if you put this patch directly
on trunk at stage1 when the review is finished.
Tom de Vries Feb. 7, 2013, 7:11 p.m. UTC | #2
Vladimir,

On 25/01/13 16:36, Vladimir Makarov wrote:
> On 01/25/2013 08:05 AM, Tom de Vries wrote:
>> Vladimir,
>>
>> this patch adds analysis of register usage of functions for usage by IRA.
>>
>> The patch:
>> - adds analysis in pass_final to track which hard registers are set or clobbered
>>    by the function body, and stores that information in a struct cgraph_node.
>> - adds a target hook fn_other_hard_reg_usage to list hard registers that are
>>    set or clobbered by a call to a function, but are not listed as such in the
>>    function body, such as f.i. registers clobbered by veneers inserted by the
>>    linker.
>> - adds a reg-note REG_CALL_DECL, to be able to easily link call_insns to their
>>    corresponding declaration, even after the calls may have been split into an
>>    insn (set register to function address) and a call_insn (call register), which
>>    can happen for f.i. sh, and mips with -mabi-calls.
>> - uses the register analysis in IRA.
>> - adds an option -fuse-caller-save to control the optimization, on by default
>>    at -Os and -O2 and higher.

<SNIP>

>> Bootstrapped and reg-tested on x86_64, Ada inclusive. Build and reg-tested on
>> mips, arm, ppc and sh. No issues found. OK for stage1 trunk?
>>
>>
> Thanks for the patch.  I'll look at it during the next week.
> 

Did you get a chance to look at this?

> Right now I see that the code is based on reload which uses 
> caller-saves.c.  LRA does not use caller-saves.c at all.  Right now we 
> have LRA support only for x86/x86-64 but the next version will probably 
> have a few more targets based on LRA.  Fortunately, LRA modification 
> will be pretty easy with all this machinery.
> 

I see, thanks for noticing that. Btw, I'm now working on a testsuite construct
dg-size-compare, to be able to do
  dg-size-compare "text" "-fuse-caller-save" "<" "-fno-use-caller-save"
I could have used that to create a generic testcase, which would have
demonstrated that the optimization didn't work for x86_64.
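
For illustration, a generic test case using the proposed construct might look
something like the following (dg-size-compare does not exist yet, so the exact
directive placement and semantics are assumptions):
...
/* { dg-do compile } */
/* { dg-size-compare "text" "-fuse-caller-save" "<" "-fno-use-caller-save" } */

static int __attribute__((noinline))
callee (int x)
{
  return x + 3;
}

int __attribute__((noinline))
caller (int y)
{
  return y + callee (y);
}
...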

I'm also currently looking at how to use the analysis in LRA.
AFAIU, in lra-constraints.c we do a backward scan over the insns, keep track of
how many calls we've seen (calls_num), and mark insns with that number. Then,
when looking at a live-range segment consisting of a def or use insn a and a
following use insn b, we can compare the number of calls seen for each insn.
If they're not equal, there is at least one call between the two insns, and if
the corresponding hard register is clobbered by calls, we spill after insn a
and restore before insn b.

That is too coarse-grained to use with our analysis, since we need to know which
calls occur in between insn a and insn b, and more precisely which registers
those calls clobber.

I wonder though if we can do something similar: we keep an array
call_clobbers_num[FIRST_PSEUDO_REGISTER], initialized to 0 when we start
scanning.
When encountering a call, we increase the call_clobbers_num entries for the
hard registers clobbered by the call.
When encountering a use, we set the call_clobbers_num field of the use to
call_clobbers_num[reg_renumber[original_regno]].
And when looking at a live-range segment, we compare the call_clobbers_num
fields recorded for insn a and insn b; if they are not equal, the hard register
was clobbered by at least one call between insn a and insn b.
Would that work? WDYT?
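
In code, the idea would be roughly the following (the function name and the
per-use call_clobbers_num field are made up for illustration; this is not
existing LRA code):
...
static int call_clobbers_num[FIRST_PSEUDO_REGISTER];

/* During the backward scan, at a call insn bump the counter of every hard
   register the call is known to clobber.  */
static void
account_call_clobbers (rtx call_insn)
{
  HARD_REG_SET clobbered;
  int i;

  get_call_reg_set_usage (call_insn, &clobbered, call_used_reg_set);
  for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
    if (TEST_HARD_REG_BIT (clobbered, i))
      call_clobbers_num[i]++;
}

/* At a use of pseudo REGNO assigned to hard register reg_renumber[REGNO],
   snapshot the current counter:

     use->call_clobbers_num = call_clobbers_num[reg_renumber[regno]];

   For a live-range segment from insn a to insn b, the hard register was
   clobbered by at least one call in between iff the snapshots taken at a
   and b differ.  */
...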

> I am going to use ira-improv branch for some my future work for gcc4.9.  
> And I am going to regularly (about once per month) merge trunk into it.  
> So if you want you could use the branch for your work too.  But this is 
> absolutely up to you.  I don't mind if you put this patch directly to 
> the trunk at stage1 when the review is finished.
> 

OK, I'd say stage1 then unless during review a reason pops up why it's better to
use the ira-improv branch.

Thanks,
- Tom
Vladimir Makarov Feb. 13, 2013, 10:35 p.m. UTC | #3
On 13-02-07 2:11 PM, Tom de Vries wrote:
> Vladimir,
>
> On 25/01/13 16:36, Vladimir Makarov wrote:
>> On 01/25/2013 08:05 AM, Tom de Vries wrote:
>>> Vladimir,
>>>
>>> this patch adds analysis of register usage of functions for usage by IRA.
>>>
>>> The patch:
>>> - adds analysis in pass_final to track which hard registers are set or clobbered
>>>     by the function body, and stores that information in a struct cgraph_node.
>>> - adds a target hook fn_other_hard_reg_usage to list hard registers that are
>>>     set or clobbered by a call to a function, but are not listed as such in the
>>>     function body, such as f.i. registers clobbered by veneers inserted by the
>>>     linker.
>>> - adds a reg-note REG_CALL_DECL, to be able to easily link call_insns to their
>>>     corresponding declaration, even after the calls may have been split into an
>>>     insn (set register to function address) and a call_insn (call register), which
>>>     can happen for f.i. sh, and mips with -mabi-calls.
>>> - uses the register analysis in IRA.
>>> - adds an option -fuse-caller-save to control the optimization, on by default
>>>     at -Os and -O2 and higher.
> <SNIP>
>
>>> Bootstrapped and reg-tested on x86_64, Ada inclusive. Build and reg-tested on
>>> mips, arm, ppc and sh. No issues found. OK for stage1 trunk?
>>>
>>>
>> Thanks for the patch.  I'll look at it during the next week.
>>
> Did you get a chance to look at this?
Sorry for the delay with the answer.  I was and am quite busy with other 
more urgent things.  I'll work on it when I have more free time.  In any 
case, I'll do it before stage1 to have your patch ready.
>> Right now I see that the code is based on reload which uses
>> caller-saves.c.  LRA does not use caller-saves.c at all.  Right now we
>> have LRA support only for x86/x86-64 but the next version will probably
>> have a few more targets based on LRA.  Fortunately, LRA modification
>> will be pretty easy with all this machinery.
>>
> I see, thanks for noticing that. Btw I'm now working on a testsuite construct
> dg-size-compare to be able to do
>    dg-size-compare "text" "-fuse-caller-save" "<" "-fno-use-caller-save"
> which I could have used to create a generic testcase, which would have
> demonstrated that the optimization didn't work for x86_64.
I thought about implementing your optimization for LRA myself, but it is ok
if you decide to work on it.  In any case, I am not going to start this work
for a month.
> I'm also currently looking at how to use the analysis in LRA.
> AFAIU, in lra-constraints.c we do a backward scan over the insns, and keep track
> of how many calls we've seen (calls_num), and mark insns with that number. Then
> when looking at a live-range segment consisting of a def or use insn a and a
> following use insn b, we can compare the number of calls seen for each insn, and
> if they're not equal there is at least one call between the 2 insns, and if the
> corresponding hard register is clobbered by calls, we spill after insn a and
> restore before insn b.
>
> That is too coarse-grained to use with our analysis, since we need to know which
> calls occur in between insn a and insn b, and more precisely which registers
> those calls clobbered.

> I wonder though if we can do something similar: we keep an array
> call_clobbers_num[FIRST_PSEUDO_REG], initialized at 0 when we start scanning.
> When encountering a call, we increase the call_clobbers_num entries for the hard
> registers clobbered by the call.
> When encountering a use, we set the call_clobbers_num field of the use to
> call_clobbers_num[reg_renumber[original_regno]].
> And when looking at a live-range segment, we compare the clobbers_num field of
> insn a and insn b, and if it is not equal, the hard register was clobbered by at
> least one call between insn a and insn b.
> Would that work? WDYT?
>
As I understand it, you looked at the live-range splitting code in
lra-constraints.c.  To get the necessary info you should look at ira-lives.c.
>> I am going to use ira-improv branch for some my future work for gcc4.9.
>> And I am going to regularly (about once per month) merge trunk into it.
>> So if you want you could use the branch for your work too.  But this is
>> absolutely up to you.  I don't mind if you put this patch directly to
>> the trunk at stage1 when the review is finished.
>>
> OK, I'd say stage1 then unless during review a reason pops up why it's better to
> use the ira-improv branch.
>
That is ok.  Stage1 then.
Tom de Vries March 14, 2013, 9:34 a.m. UTC | #4
On 13/02/13 23:35, Vladimir Makarov wrote:
> On 13-02-07 2:11 PM, Tom de Vries wrote:
>> Vladimir,
>>
>> On 25/01/13 16:36, Vladimir Makarov wrote:
>>> On 01/25/2013 08:05 AM, Tom de Vries wrote:
>>>> Vladimir,
>>>>
>>>> this patch adds analysis of register usage of functions for usage by IRA.
>>>>
>>>> The patch:
>>>> - adds analysis in pass_final to track which hard registers are set or clobbered
>>>>     by the function body, and stores that information in a struct cgraph_node.
>>>> - adds a target hook fn_other_hard_reg_usage to list hard registers that are
>>>>     set or clobbered by a call to a function, but are not listed as such in the
>>>>     function body, such as f.i. registers clobbered by veneers inserted by the
>>>>     linker.
>>>> - adds a reg-note REG_CALL_DECL, to be able to easily link call_insns to their
>>>>     corresponding declaration, even after the calls may have been split into an
>>>>     insn (set register to function address) and a call_insn (call register), which
>>>>     can happen for f.i. sh, and mips with -mabi-calls.
>>>> - uses the register analysis in IRA.
>>>> - adds an option -fuse-caller-save to control the optimization, on by default
>>>>     at -Os and -O2 and higher.
>> <SNIP>
>>
>>>> Bootstrapped and reg-tested on x86_64, Ada inclusive. Build and reg-tested on
>>>> mips, arm, ppc and sh. No issues found. OK for stage1 trunk?
>>>>
>>>>
>>> Thanks for the patch.  I'll look at it during the next week.
>>>
>> Did you get a chance to look at this?
> Sorry for the delay with the answer.  I was and am quite busy with other 
> more urgent things.  I'll work on it when I have more free time.  In any 
> case, I'll do it before stage1 to have your patch ready.

Vladimir,

do you have an ETA on this review?

>>> Right now I see that the code is based on reload which uses
>>> caller-saves.c.  LRA does not use caller-saves.c at all.  Right now we
>>> have LRA support only for x86/x86-64 but the next version will probably
>>> have a few more targets based on LRA.  Fortunately, LRA modification
>>> will be pretty easy with all this machinery.
>>>
>> I see, thanks for noticing that. Btw I'm now working on a testsuite construct
>> dg-size-compare to be able to do
>>    dg-size-compare "text" "-fuse-caller-save" "<" "-fno-use-caller-save"
>> which I could have used to create a generic testcase, which would have
>> demonstrated that the optimization didn't work for x86_64.
> I thought about implementing your optimization for LRA by myself. But it 
> is ok if you decide to work on it.  At least, I am not going to start 
> this work for a month.
>> I'm also currently looking at how to use the analysis in LRA.
>> AFAIU, in lra-constraints.c we do a backward scan over the insns, and keep track
>> of how many calls we've seen (calls_num), and mark insns with that number. Then
>> when looking at a live-range segment consisting of a def or use insn a and a
>> following use insn b, we can compare the number of calls seen for each insn, and
>> if they're not equal there is at least one call between the 2 insns, and if the
>> corresponding hard register is clobbered by calls, we spill after insn a and
>> restore before insn b.
>>
>> That is too coarse-grained to use with our analysis, since we need to know which
>> calls occur in between insn a and insn b, and more precisely which registers
>> those calls clobbered.
> 
>> I wonder though if we can do something similar: we keep an array
>> call_clobbers_num[FIRST_PSEUDO_REG], initialized at 0 when we start scanning.
>> When encountering a call, we increase the call_clobbers_num entries for the hard
>> registers clobbered by the call.
>> When encountering a use, we set the call_clobbers_num field of the use to
>> call_clobbers_num[reg_renumber[original_regno]].
>> And when looking at a live-range segment, we compare the clobbers_num field of
>> insn a and insn b, and if it is not equal, the hard register was clobbered by at
>> least one call between insn a and insn b.
>> Would that work? WDYT?
>>
> As I understand you looked at live-range splitting code in 
> lra-constraints.c.  To get necessary info you should look at ira-lives.c.

Unfortunately I haven't been able to find time to work further on the LRA part.
So if you're still willing to pick up that part, that would be great.

Thanks,
- Tom
Vladimir Makarov March 14, 2013, 3:11 p.m. UTC | #5
On 03/14/2013 05:34 AM, Tom de Vries wrote:
> On 13/02/13 23:35, Vladimir Makarov wrote:
>>
>> Sorry for the delay with the answer.  I was and am quite busy with other
>> more urgent things.  I'll work on it when I have more free time.  In any
>> case, I'll do it before stage1 to have your patch ready.
> Vladimir,
>
> do you have an ETA on this review?
>
>
Actually, I am done with it.  In general, it is ok, although I have some
minor comments:

In the ChangeLog, you missed '*' before cgraph.h:

     * haifa-sched.c (recompute_todo_spec, check_clobbered_conditions): Add
     new argument to find_all_hard_reg_sets call.
     cgraph.h (struct cgraph_node): Add function_used_regs,
     function_used_regs_initialized and function_used_regs_valid fields.


@@ -3391,6 +3394,7 @@ df_get_call_refs (struct df_collection_r
          }
      }
        else if (TEST_HARD_REG_BIT (regs_invalidated_by_call, i)

I'd remove the test of regs_invalidated_by_call.

+           && TEST_HARD_REG_BIT (fn_reg_set_usage, i)
             /* no clobbers for regs that are the result of the call */
             && !TEST_HARD_REG_BIT (defs_generated, i)

+static void
+collect_fn_hard_reg_usage (void)
+{
+  rtx insn;
+  int i;
+  struct cgraph_node *node;
+  struct hard_reg_set_container other_usage;
+
+  if (!flag_use_caller_save)
+    return;
+
+  node = cgraph_get_node (current_function_decl);
+  gcc_assert (node != NULL);
+
+  gcc_assert (!node->function_used_regs_initialized);
+  node->function_used_regs_initialized = 1;
+
+  for (insn = get_insns (); insn != NULL_RTX; insn = next_insn (insn))
+    {
+      HARD_REG_SET insn_used_regs;
+
+      if (!NONDEBUG_INSN_P (insn))
+    continue;
+
+      find_all_hard_reg_sets (insn, &insn_used_regs, false);
+
+      if (CALL_P (insn)
+      && !get_call_reg_set_usage (insn, &insn_used_regs, call_used_reg_set))
+    {
+      CLEAR_HARD_REG_SET (node->function_used_regs);
+      return;
+    }
+

I'd put it before find_all_hard_reg_sets

+      IOR_HARD_REG_SET (node->function_used_regs, insn_used_regs);
+    }
+



But you can ignore my last two comments.

The patch is ok for me for trunk at stage1.  But I think you need a
formal approval for df-scan.c, arm.c, mips.c and the GCC testsuite expect
files (lib/target-supports.exp and gcc.target/mips/mips.exp), as I am not a
maintainer of those parts, although these changes look ok to me.

Thanks for your hard work and sorry for the review delay.

I guess you need to pay attention to reported problems for some time
after you commit the patch, as it affects all targets.
Tom de Vries March 29, 2013, 12:54 p.m. UTC | #6
On 14/03/13 16:11, Vladimir Makarov wrote:
> On 03/14/2013 05:34 AM, Tom de Vries wrote:
>> On 13/02/13 23:35, Vladimir Makarov wrote:
>>
> Actually, I am done with it.  In general, it is ok.  Although I have 
> some minors comments:
> 

Vladimir,

Thanks for the review.

I split the patch up into 10 patches, to facilitate further review:
...
0001-Add-command-line-option.patch
0002-Add-new-reg-note-REG_CALL_DECL.patch
0003-Add-implicit-parameter-to-find_all_hard_reg_sets.patch
0004-Add-TARGET_FN_OTHER_HARD_REG_USAGE-hook.patch
0005-Implement-TARGET_FN_OTHER_HARD_REG_USAGE-hook-for-ARM.patch
0006-Collect-register-usage-information.patch
0007-Use-collected-register-usage-information.patch
0008-Enable-by-default-at-O2-and-higher.patch
0009-Add-documentation.patch
0010-Add-test-case.patch
...
I'll post these in reply to this email.

> In Changelog, you missed '*" before cgraph.h:
> 
>      * haifa-sched.c (recompute_todo_spec, check_clobbered_conditions): Add
>      new argument to find_all_hard_reg_sets call.
>      cgraph.h (struct cgraph_node): Add function_used_regs,
>      function_used_regs_initialized and function_used_regs_valid fields.
> 

Fixed (in the log of 0006-Collect-register-usage-information.patch).

> 
> @@ -3391,6 +3394,7 @@ df_get_call_refs (struct df_collection_r
>           }
>       }
>         else if (TEST_HARD_REG_BIT (regs_invalidated_by_call, i)
> 
> I'd remove the test of regs_invalidated_by_call.
> 
> +           && TEST_HARD_REG_BIT (fn_reg_set_usage, i)
>              /* no clobbers for regs that are the result of the call */
>              && !TEST_HARD_REG_BIT (defs_generated, i)
> 

Fixed (in 0007-Use-collected-register-usage-information.patch).

> +static void
> +collect_fn_hard_reg_usage (void)
> +{
> +  rtx insn;
> +  int i;
> +  struct cgraph_node *node;
> +  struct hard_reg_set_container other_usage;
> +
> +  if (!flag_use_caller_save)
> +    return;
> +
> +  node = cgraph_get_node (current_function_decl);
> +  gcc_assert (node != NULL);
> +
> +  gcc_assert (!node->function_used_regs_initialized);
> +  node->function_used_regs_initialized = 1;
> +
> +  for (insn = get_insns (); insn != NULL_RTX; insn = next_insn (insn))
> +    {
> +      HARD_REG_SET insn_used_regs;
> +
> +      if (!NONDEBUG_INSN_P (insn))
> +    continue;
> +
> +      find_all_hard_reg_sets (insn, &insn_used_regs, false);
> +
> +      if (CALL_P (insn)
> +      && !get_call_reg_set_usage (insn, &insn_used_regs, call_used_reg_set))
> +    {
> +      CLEAR_HARD_REG_SET (node->function_used_regs);
> +      return;
> +    }
> +
> 
> I'd put it before find_all_hard_reg_sets
> 
> +      IOR_HARD_REG_SET (node->function_used_regs, insn_used_regs);
> +    }
> +
> 
> 

insn_used_regs is set by both find_all_hard_reg_sets and
get_call_reg_set_usage. If we move the IOR to before find_all_hard_reg_sets,
we'd be using an undefined value.
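
To make the ordering explicit, these are the relevant lines of
collect_fn_hard_reg_usage, with comments added for illustration:
...
HARD_REG_SET insn_used_regs;   /* Not yet initialized at this point.  */

/* First writer of insn_used_regs.  */
find_all_hard_reg_sets (insn, &insn_used_regs, false);

/* For calls, second writer; may overwrite insn_used_regs or bail out.  */
if (CALL_P (insn)
    && !get_call_reg_set_usage (insn, &insn_used_regs, call_used_reg_set))
  {
    CLEAR_HARD_REG_SET (node->function_used_regs);
    return;
  }

/* Only here does insn_used_regs hold a defined value, so the IOR has to stay
   below both writers.  */
IOR_HARD_REG_SET (node->function_used_regs, insn_used_regs);
...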

> 
> But you can ignore my two last 2 comments.
> 
> The patch is ok for me for trunk at stage1.  But I think you need a 
> formal approval for df-scan.c, arm.c, mips.c, GCC testsuite expect files 
> (lib/target-supports.exp and gcc.target/mips/mips.exp) as I am not a 
> maintainer of these parts although these changes look ok for me.
> 

I'm assuming you've ok'ed patches 1, 2, 3, 4, 6, 8, 9 and the non-df-scan part of 7.

I'll ask other maintainers about the other parts (5, 10 and the df-scan part of 7).

Thanks,
- Tom
Tom de Vries March 30, 2013, 4:10 p.m. UTC | #7
On 29/03/13 13:54, Tom de Vries wrote:
> I split the patch up into 10 patches, to facilitate further review:
> ...
> 0001-Add-command-line-option.patch
> 0002-Add-new-reg-note-REG_CALL_DECL.patch
> 0003-Add-implicit-parameter-to-find_all_hard_reg_sets.patch
> 0004-Add-TARGET_FN_OTHER_HARD_REG_USAGE-hook.patch
> 0005-Implement-TARGET_FN_OTHER_HARD_REG_USAGE-hook-for-ARM.patch
> 0006-Collect-register-usage-information.patch
> 0007-Use-collected-register-usage-information.patch
> 0008-Enable-by-default-at-O2-and-higher.patch
> 0009-Add-documentation.patch
> 0010-Add-test-case.patch
> ...
> I'll post these in reply to this email.
> 

Something went wrong with those emails, which were automatically generated.

I tested the emails by sending them to my work email, where they looked fine.
I managed to reproduce the problem by sending them to my private email.
It seems the problem was inconsistent EOL format.

I've written a python script to handle composing the email, and posted it here
using that script: http://gcc.gnu.org/ml/gcc-patches/2013-03/msg01311.html.
Given that that email looks ok, I think I've addressed the problems now.

I'll repost the patches. Sorry about the noise.

Thanks,
- Tom
Richard Earnshaw Jan. 9, 2014, 2:41 p.m. UTC | #8
On 30/03/13 16:10, Tom de Vries wrote:
> On 29/03/13 13:54, Tom de Vries wrote:
>> I split the patch up into 10 patches, to facilitate further review:
>> ...
>> 0001-Add-command-line-option.patch
>> 0002-Add-new-reg-note-REG_CALL_DECL.patch
>> 0003-Add-implicit-parameter-to-find_all_hard_reg_sets.patch
>> 0004-Add-TARGET_FN_OTHER_HARD_REG_USAGE-hook.patch
>> 0005-Implement-TARGET_FN_OTHER_HARD_REG_USAGE-hook-for-ARM.patch
>> 0006-Collect-register-usage-information.patch
>> 0007-Use-collected-register-usage-information.patch
>> 0008-Enable-by-default-at-O2-and-higher.patch
>> 0009-Add-documentation.patch
>> 0010-Add-test-case.patch
>> ...
>> I'll post these in reply to this email.
>>
> 
> Something went wrong with those emails, which were generated.
> 
> I tested the emails by sending them to my work email, where they looked fine.
> I managed to reproduce the problem by sending them to my private email.
> It seems the problem was inconsistent EOL format.
> 
> I've written a python script to handle composing the email, and posted it here
> using that script: http://gcc.gnu.org/ml/gcc-patches/2013-03/msg01311.html.
> Given that that email looks ok, I think I've addressed the problems now.
> 
> I'll repost the patches. Sorry about the noise.
> 
> Thanks,
> - Tom
> 
> 

It's unfortunate that this feature doesn't fail safe when a port has not
explicitly defined what should happen.

Consequently, you'll need to add a patch for AArch64 which has two
registers clobbered by PLT-based calls.

R.

Patch

Index: gcc/hooks.c
===================================================================
--- gcc/hooks.c (revision 195240)
+++ gcc/hooks.c (working copy)
@@ -446,3 +446,11 @@  void
 hook_void_gcc_optionsp (struct gcc_options *opts ATTRIBUTE_UNUSED)
 {
 }
+
+/* Generic hook that takes a struct hard_reg_set_container * and returns
+   void.  */
+
+void
+hook_void_hard_reg_set_containerp (struct hard_reg_set_container *regs ATTRIBUTE_UNUSED)
+{
+}
Index: gcc/hooks.h
===================================================================
--- gcc/hooks.h (revision 195240)
+++ gcc/hooks.h (working copy)
@@ -69,6 +69,7 @@  extern void hook_void_tree (tree);
 extern void hook_void_tree_treeptr (tree, tree *);
 extern void hook_void_int_int (int, int);
 extern void hook_void_gcc_optionsp (struct gcc_options *);
+extern void hook_void_hard_reg_set_containerp (struct hard_reg_set_container *);
 
 extern int hook_int_uint_mode_1 (unsigned int, enum machine_mode);
 extern int hook_int_const_tree_0 (const_tree);
Index: gcc/target.def
===================================================================
--- gcc/target.def (revision 195240)
+++ gcc/target.def (working copy)
@@ -2859,6 +2859,17 @@  DEFHOOK
  void, (bitmap regs),
  hook_void_bitmap)
 
+/* For targets that need to mark extra registers as clobbered on entry to
+   the function, they should define this target hook and set their
+   bits in the struct hard_reg_set_container passed in.  */
+DEFHOOK
+(fn_other_hard_reg_usage,
+ "Add any hard registers to @var{regs} that are set or clobbered by a call to\
+ the function.  This hook only needs to be defined to provide registers that\
+ cannot be found by examination of the final RTL representation of a function.",
+ void, (struct hard_reg_set_container *regs),
+ hook_void_hard_reg_set_containerp)
+
 /* Fill in additional registers set up by prologue into a regset.  */
 DEFHOOK
 (set_up_by_prologue,
Index: gcc/cgraph.h
===================================================================
--- gcc/cgraph.h (revision 195240)
+++ gcc/cgraph.h (working copy)
@@ -251,6 +251,15 @@  struct GTY(()) cgraph_node {
   /* Unique id of the node.  */
   int uid;
 
+  /* Call unsaved hard registers really used by the corresponding
+     function (including ones used by functions called by the
+     function).  */
+  HARD_REG_SET function_used_regs;
+  /* Set if function_used_regs is initialized.  */
+  unsigned function_used_regs_initialized: 1;
+  /* Set if function_used_regs is valid.  */
+  unsigned function_used_regs_valid: 1;
+
   /* Set when decl is an abstract function pointed to by the
      ABSTRACT_DECL_ORIGIN of a reachable function.  */
   unsigned abstract_and_needed : 1;
Index: gcc/rtlanal.c
===================================================================
--- gcc/rtlanal.c (revision 195240)
+++ gcc/rtlanal.c (working copy)
@@ -1028,13 +1028,13 @@  record_hard_reg_sets (rtx x, const_rtx p
 /* Examine INSN, and compute the set of hard registers written by it.
    Store it in *PSET.  Should only be called after reload.  */
 void
-find_all_hard_reg_sets (const_rtx insn, HARD_REG_SET *pset)
+find_all_hard_reg_sets (const_rtx insn, HARD_REG_SET *pset, bool implicit)
 {
   rtx link;
 
   CLEAR_HARD_REG_SET (*pset);
   note_stores (PATTERN (insn), record_hard_reg_sets, pset);
-  if (CALL_P (insn))
+  if (implicit && CALL_P (insn))
     IOR_HARD_REG_SET (*pset, call_used_reg_set);
   for (link = REG_NOTES (insn); link; link = XEXP (link, 1))
     if (REG_NOTE_KIND (link) == REG_INC)
Index: gcc/final.c
===================================================================
--- gcc/final.c (revision 195240)
+++ gcc/final.c (working copy)
@@ -48,6 +48,7 @@  along with GCC; see the file COPYING3.
 #include "tm.h"
 
 #include "tree.h"
+#include "hard-reg-set.h"
 #include "rtl.h"
 #include "tm_p.h"
 #include "regs.h"
@@ -56,7 +57,6 @@  along with GCC; see the file COPYING3.
 #include "recog.h"
 #include "conditions.h"
 #include "flags.h"
-#include "hard-reg-set.h"
 #include "output.h"
 #include "except.h"
 #include "function.h"
@@ -219,6 +219,7 @@  static int alter_cond (rtx);
 static int final_addr_vec_align (rtx);
 #endif
 static int align_fuzz (rtx, rtx, int, unsigned);
+static void collect_fn_hard_reg_usage (void);
 
 /* Initialize data in final at the beginning of a compilation.  */
 
@@ -4277,6 +4278,8 @@  rest_of_handle_final (void)
   rtx x;
   const char *fnname;
 
+  collect_fn_hard_reg_usage ();
+
   /* Get the function's name, as described by its RTL.  This may be
      different from the DECL_NAME name used in the source file.  */
 
@@ -4533,3 +4536,121 @@  struct rtl_opt_pass pass_clean_state =
   0                                     /* todo_flags_finish */
  }
 };
+
+/* Collect hard register usage for the current function.  */
+
+static void
+collect_fn_hard_reg_usage (void)
+{
+  rtx insn;
+  int i;
+  struct cgraph_node *node;
+  struct hard_reg_set_container other_usage;
+
+  if (!flag_use_caller_save)
+    return;
+
+  node = cgraph_get_node (current_function_decl);
+  gcc_assert (node != NULL);
+
+  gcc_assert (!node->function_used_regs_initialized);
+  node->function_used_regs_initialized = 1;
+
+  for (insn = get_insns (); insn != NULL_RTX; insn = next_insn (insn))
+    {
+      HARD_REG_SET insn_used_regs;
+
+      if (!NONDEBUG_INSN_P (insn))
+	continue;
+
+      find_all_hard_reg_sets (insn, &insn_used_regs, false);
+
+      if (CALL_P (insn)
+	  && !get_call_reg_set_usage (insn, &insn_used_regs, call_used_reg_set))
+	{
+	  CLEAR_HARD_REG_SET (node->function_used_regs);
+	  return;
+	}
+
+      IOR_HARD_REG_SET (node->function_used_regs, insn_used_regs);
+    }
+
+  /* Be conservative - mark fixed and global registers as used.  */
+  IOR_HARD_REG_SET (node->function_used_regs, fixed_reg_set);
+  for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
+    if (global_regs[i])
+      SET_HARD_REG_BIT (node->function_used_regs, i);
+
+#ifdef STACK_REGS
+  /* Handle STACK_REGS conservatively, since the df-framework does not
+     provide accurate information for them.  */
+
+  for (i = FIRST_STACK_REG; i <= LAST_STACK_REG; i++)
+    SET_HARD_REG_BIT (node->function_used_regs, i);
+#endif
+
+  CLEAR_HARD_REG_SET (other_usage.set);
+  targetm.fn_other_hard_reg_usage (&other_usage);
+  IOR_HARD_REG_SET (node->function_used_regs, other_usage.set);
+
+  node->function_used_regs_valid = 1;
+}
+
+/* Get the declaration of the function called by INSN.  */
+
+static tree
+get_call_fndecl (rtx insn)
+{
+  rtx note, datum;
+
+  if (!flag_use_caller_save)
+    return NULL_TREE;
+
+  note = find_reg_note (insn, REG_CALL_DECL, NULL_RTX);
+  if (note == NULL_RTX)
+    return NULL_TREE;
+
+  datum = XEXP (note, 0);
+  if (datum != NULL_RTX)
+    return SYMBOL_REF_DECL (datum);
+
+  return NULL_TREE;
+}
+
+static struct cgraph_node *
+get_call_cgraph_node (rtx insn)
+{
+  tree fndecl;
+
+  if (insn == NULL_RTX)
+    return NULL;
+
+  fndecl = get_call_fndecl (insn);
+  if (fndecl == NULL_TREE
+      || !targetm.binds_local_p (fndecl))
+    return NULL;
+
+  return cgraph_get_node (fndecl);
+}
+
+/* Find hard registers used by function call instruction INSN, and return them
+   in REG_SET.  Return DEFAULT_SET in REG_SET if not found.  */
+
+bool
+get_call_reg_set_usage (rtx insn, HARD_REG_SET *reg_set,
+			HARD_REG_SET default_set)
+{
+  struct cgraph_node *node = get_call_cgraph_node (insn);
+  if (node != NULL
+      && node->function_used_regs_valid)
+    {
+      COPY_HARD_REG_SET (*reg_set, node->function_used_regs);
+      AND_HARD_REG_SET (*reg_set, default_set);
+      return true;
+    }
+  else
+    {
+      COPY_HARD_REG_SET (*reg_set, default_set);
+      return false;
+    }
+}
Index: gcc/regs.h
===================================================================
--- gcc/regs.h (revision 195240)
+++ gcc/regs.h (working copy)
@@ -419,4 +419,8 @@  range_in_hard_reg_set_p (const HARD_REG_
   return true;
 }
 
+/* Get registers used by given function call instruction.  */
+extern bool get_call_reg_set_usage (rtx insn, HARD_REG_SET *reg_set,
+				    HARD_REG_SET default_set);
+
 #endif /* GCC_REGS_H */
Index: gcc/df-scan.c
===================================================================
--- gcc/df-scan.c (revision 195240)
+++ gcc/df-scan.c (working copy)
@@ -3363,10 +3363,13 @@  df_get_call_refs (struct df_collection_r
   bool is_sibling_call;
   unsigned int i;
   HARD_REG_SET defs_generated;
+  HARD_REG_SET fn_reg_set_usage;
 
   CLEAR_HARD_REG_SET (defs_generated);
   df_find_hard_reg_defs (PATTERN (insn_info->insn), &defs_generated);
   is_sibling_call = SIBLING_CALL_P (insn_info->insn);
+  get_call_reg_set_usage (insn_info->insn, &fn_reg_set_usage,
+			  regs_invalidated_by_call);
 
   for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
     {
@@ -3391,6 +3394,7 @@  df_get_call_refs (struct df_collection_r
 	    }
 	}
       else if (TEST_HARD_REG_BIT (regs_invalidated_by_call, i)
+	       && TEST_HARD_REG_BIT (fn_reg_set_usage, i)
 	       /* no clobbers for regs that are the result of the call */
 	       && !TEST_HARD_REG_BIT (defs_generated, i)
 	       && (!is_sibling_call
Index: gcc/haifa-sched.c
===================================================================
--- gcc/haifa-sched.c (revision 195240)
+++ gcc/haifa-sched.c (working copy)
@@ -1271,7 +1271,7 @@  recompute_todo_spec (rtx next, bool for_
 	  {
 	    HARD_REG_SET t;
 
-	    find_all_hard_reg_sets (prev, &t);
+	    find_all_hard_reg_sets (prev, &t, true);
 	    if (TEST_HARD_REG_BIT (t, regno))
 	      return HARD_DEP;
 	    if (prev == pro)
@@ -3041,7 +3041,7 @@  check_clobbered_conditions (rtx insn)
   if ((current_sched_info->flags & DO_PREDICATION) == 0)
     return;
 
-  find_all_hard_reg_sets (insn, &t);
+  find_all_hard_reg_sets (insn, &t, true);
 
  restart:
   for (i = 0; i < ready.n_ready; i++)
Index: gcc/caller-save.c
===================================================================
--- gcc/caller-save.c (revision 195240)
+++ gcc/caller-save.c (working copy)
@@ -441,7 +441,7 @@  setup_save_areas (void)
       freq = REG_FREQ_FROM_BB (BLOCK_FOR_INSN (insn));
       REG_SET_TO_HARD_REG_SET (hard_regs_to_save,
 			       &chain->live_throughout);
-      COPY_HARD_REG_SET (used_regs, call_used_reg_set);
+      get_call_reg_set_usage (insn, &used_regs, call_used_reg_set);
 
       /* Record all registers set in this call insn.  These don't
 	 need to be saved.  N.B. the call insn might set a subreg
@@ -525,7 +525,7 @@  setup_save_areas (void)
 
 	  REG_SET_TO_HARD_REG_SET (hard_regs_to_save,
 				   &chain->live_throughout);
-	  COPY_HARD_REG_SET (used_regs, call_used_reg_set);
+	  get_call_reg_set_usage (insn, &used_regs, call_used_reg_set);
 
 	  /* Record all registers set in this call insn.  These don't
 	     need to be saved.  N.B. the call insn might set a subreg
@@ -804,6 +804,7 @@  save_call_clobbered_regs (void)
 	    {
 	      unsigned regno;
 	      HARD_REG_SET hard_regs_to_save;
+	      HARD_REG_SET call_def_reg_set;
 	      reg_set_iterator rsi;
 	      rtx cheap;
 
@@ -854,7 +855,9 @@  save_call_clobbered_regs (void)
 	      AND_COMPL_HARD_REG_SET (hard_regs_to_save, call_fixed_reg_set);
 	      AND_COMPL_HARD_REG_SET (hard_regs_to_save, this_insn_sets);
 	      AND_COMPL_HARD_REG_SET (hard_regs_to_save, hard_regs_saved);
-	      AND_HARD_REG_SET (hard_regs_to_save, call_used_reg_set);
+	      get_call_reg_set_usage (insn, &call_def_reg_set,
+				      call_used_reg_set);
+	      AND_HARD_REG_SET (hard_regs_to_save, call_def_reg_set);
 
 	      for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
 		if (TEST_HARD_REG_BIT (hard_regs_to_save, regno))
Index: gcc/ira-int.h
===================================================================
--- gcc/ira-int.h (revision 195240)
+++ gcc/ira-int.h (working copy)
@@ -374,6 +374,8 @@  struct ira_allocno
   /* The number of calls across which it is live, but which should not
      affect register preferences.  */
   int cheap_calls_crossed_num;
+  /* Registers clobbered by intersected calls.  */
+   HARD_REG_SET crossed_calls_clobbered_regs;
   /* Array of usage costs (accumulated and the one updated during
      coloring) for each hard register of the allocno class.  The
      member value can be NULL if all costs are the same and equal to
@@ -417,6 +419,8 @@  struct ira_allocno
 #define ALLOCNO_CALL_FREQ(A) ((A)->call_freq)
 #define ALLOCNO_CALLS_CROSSED_NUM(A) ((A)->calls_crossed_num)
 #define ALLOCNO_CHEAP_CALLS_CROSSED_NUM(A) ((A)->cheap_calls_crossed_num)
+#define ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS(A) \
+  ((A)->crossed_calls_clobbered_regs)
 #define ALLOCNO_MEM_OPTIMIZED_DEST(A) ((A)->mem_optimized_dest)
 #define ALLOCNO_MEM_OPTIMIZED_DEST_P(A) ((A)->mem_optimized_dest_p)
 #define ALLOCNO_SOMEWHERE_RENAMED_P(A) ((A)->somewhere_renamed_p)
Index: gcc/opts.c
===================================================================
--- gcc/opts.c (revision 195240)
+++ gcc/opts.c (working copy)
@@ -484,6 +484,7 @@  static const struct default_options defa
     { OPT_LEVELS_2_PLUS, OPT_ftree_tail_merge, NULL, 1 },
     { OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_foptimize_strlen, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fhoist_adjacent_loads, NULL, 1 },
+    { OPT_LEVELS_2_PLUS, OPT_fuse_caller_save, NULL, 1 },
 
     /* -O3 optimizations.  */
     { OPT_LEVELS_3_PLUS, OPT_ftree_loop_distribute_patterns, NULL, 1 },
Index: gcc/ira-lives.c
===================================================================
--- gcc/ira-lives.c (revision 195240)
+++ gcc/ira-lives.c (working copy)
@@ -1273,6 +1273,10 @@  process_bb_node_lives (ira_loop_tree_nod
 		  ira_object_t obj = ira_object_id_map[i];
 		  ira_allocno_t a = OBJECT_ALLOCNO (obj);
 		  int num = ALLOCNO_NUM (a);
+		  HARD_REG_SET this_call_used_reg_set;
+
+		  get_call_reg_set_usage (insn, &this_call_used_reg_set,
+					  call_used_reg_set);
 
 		  /* Don't allocate allocnos that cross setjmps or any
 		     call, if this function receives a nonlocal
@@ -1287,9 +1291,9 @@  process_bb_node_lives (ira_loop_tree_nod
 		  if (can_throw_internal (insn))
 		    {
 		      IOR_HARD_REG_SET (OBJECT_CONFLICT_HARD_REGS (obj),
-					call_used_reg_set);
+					this_call_used_reg_set);
 		      IOR_HARD_REG_SET (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj),
-					call_used_reg_set);
+					this_call_used_reg_set);
 		    }
 
 		  if (sparseset_bit_p (allocnos_processed, num))
@@ -1306,6 +1310,8 @@  process_bb_node_lives (ira_loop_tree_nod
 		  /* Mark it as saved at the next call.  */
 		  allocno_saved_at_call[num] = last_call_num + 1;
 		  ALLOCNO_CALLS_CROSSED_NUM (a)++;
+		  IOR_HARD_REG_SET (ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a),
+				    this_call_used_reg_set);
 		  if (cheap_reg != NULL_RTX
 		      && ALLOCNO_REGNO (a) == (int) REGNO (cheap_reg))
 		    ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a)++;
Index: gcc/ira-build.c
===================================================================
--- gcc/ira-build.c (revision 195240)
+++ gcc/ira-build.c (working copy)
@@ -506,6 +506,7 @@  ira_create_allocno (int regno, bool cap_
   ALLOCNO_CALL_FREQ (a) = 0;
   ALLOCNO_CALLS_CROSSED_NUM (a) = 0;
   ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a) = 0;
+  CLEAR_HARD_REG_SET (ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a));
 #ifdef STACK_REGS
   ALLOCNO_NO_STACK_REG_P (a) = false;
   ALLOCNO_TOTAL_NO_STACK_REG_P (a) = false;
@@ -903,6 +904,8 @@  create_cap_allocno (ira_allocno_t a)
 
   ALLOCNO_CALLS_CROSSED_NUM (cap) = ALLOCNO_CALLS_CROSSED_NUM (a);
   ALLOCNO_CHEAP_CALLS_CROSSED_NUM (cap) = ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a);
+  IOR_HARD_REG_SET (ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (cap),
+		    ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a));
   if (internal_flag_ira_verbose > 2 && ira_dump_file != NULL)
     {
       fprintf (ira_dump_file, "    Creating cap ");
@@ -1822,6 +1825,8 @@  propagate_allocno_info (void)
 	    += ALLOCNO_CALLS_CROSSED_NUM (a);
 	  ALLOCNO_CHEAP_CALLS_CROSSED_NUM (parent_a)
 	    += ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a);
+ 	  IOR_HARD_REG_SET (ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (parent_a),
+ 			    ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a));
 	  ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (parent_a)
 	    += ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (a);
 	  aclass = ALLOCNO_CLASS (a);
@@ -2202,6 +2207,9 @@  propagate_some_info_from_allocno (ira_al
   ALLOCNO_CALLS_CROSSED_NUM (a) += ALLOCNO_CALLS_CROSSED_NUM (from_a);
   ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a)
     += ALLOCNO_CHEAP_CALLS_CROSSED_NUM (from_a);
+  IOR_HARD_REG_SET (ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a),
+ 		    ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (from_a));
+
   ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (a)
     += ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (from_a);
   if (! ALLOCNO_BAD_SPILL_P (from_a))
@@ -2827,6 +2835,8 @@  copy_info_to_removed_store_destinations
 	+= ALLOCNO_CALLS_CROSSED_NUM (a);
       ALLOCNO_CHEAP_CALLS_CROSSED_NUM (parent_a)
 	+= ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a);
+      IOR_HARD_REG_SET (ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (parent_a),
+ 			ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a));
       ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (parent_a)
 	+= ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (a);
       merged_p = true;
Index: gcc/calls.c
===================================================================
--- gcc/calls.c (revision 195240)
+++ gcc/calls.c (working copy)
@@ -3158,6 +3158,19 @@  expand_call (tree exp, rtx target, int i
 		   next_arg_reg, valreg, old_inhibit_defer_pop, call_fusage,
 		   flags, args_so_far);
 
+      if (flag_use_caller_save)
+	{
+	  rtx last, datum = NULL_RTX;
+	  if (fndecl != NULL_TREE)
+	    {
+	      datum = XEXP (DECL_RTL (fndecl), 0);
+	      gcc_assert (datum != NULL_RTX
+			  && GET_CODE (datum) == SYMBOL_REF);
+	    }
+	  last = last_call_insn ();
+	  add_reg_note (last, REG_CALL_DECL, datum);
+	}
+
       /* If the call setup or the call itself overlaps with anything
 	 of the argument setup we probably clobbered our call address.
 	 In that case we can't do sibcalls.  */
@@ -4183,6 +4196,14 @@  emit_library_call_value_1 (int retval, r
 	       valreg,
 	       old_inhibit_defer_pop + 1, call_fusage, flags, args_so_far);
 
+  if (flag_use_caller_save)
+    {
+      rtx last, datum = orgfun;
+      gcc_assert (GET_CODE (datum) == SYMBOL_REF);
+      last = last_call_insn ();
+      add_reg_note (last, REG_CALL_DECL, datum);
+    }
+
   /* Right-shift returned value if necessary.  */
   if (!pcc_struct_value
       && TYPE_MODE (tfom) != BLKmode
Index: gcc/emit-rtl.c
===================================================================
--- gcc/emit-rtl.c (revision 195240)
+++ gcc/emit-rtl.c (working copy)
@@ -3517,6 +3517,7 @@  try_split (rtx pat, rtx trial, int last)
   int probability;
   rtx insn_last, insn;
   int njumps = 0;
+  rtx call_insn = NULL_RTX;
 
   /* We're not good at redistributing frame information.  */
   if (RTX_FRAME_RELATED_P (trial))
@@ -3589,6 +3590,9 @@  try_split (rtx pat, rtx trial, int last)
 	  {
 	    rtx next, *p;
 
+	    gcc_assert (call_insn == NULL_RTX);
+	    call_insn = insn;
+
 	    /* Add the old CALL_INSN_FUNCTION_USAGE to whatever the
 	       target may have explicitly specified.  */
 	    p = &CALL_INSN_FUNCTION_USAGE (insn);
@@ -3660,6 +3664,11 @@  try_split (rtx pat, rtx trial, int last)
 	  fixup_args_size_notes (NULL_RTX, insn_last, INTVAL (XEXP (note, 0)));
 	  break;
 
+	case REG_CALL_DECL:
+	  gcc_assert (call_insn != NULL_RTX);
+	  add_reg_note (call_insn, REG_NOTE_KIND (note), XEXP (note, 0));
+	  break;
+
 	default:
 	  break;
 	}
Index: gcc/common.opt
===================================================================
--- gcc/common.opt (revision 195240)
+++ gcc/common.opt (working copy)
@@ -2540,4 +2540,8 @@  Create a position independent executable
 z
 Driver Joined Separate
 
+fuse-caller-save
+Common Report Var(flag_use_caller_save) Optimization
+Use caller save register across calls if possible
+
 ; This comment is to ensure we retain the blank line above.
Index: gcc/ira-costs.c
===================================================================
--- gcc/ira-costs.c (revision 195240)
+++ gcc/ira-costs.c (working copy)
@@ -2082,6 +2082,7 @@  ira_tune_allocno_costs (void)
   ira_allocno_object_iterator oi;
   ira_object_t obj;
   bool skip_p;
+  HARD_REG_SET *crossed_calls_clobber_regs;
 
   FOR_EACH_ALLOCNO (a, ai)
     {
@@ -2116,17 +2117,24 @@  ira_tune_allocno_costs (void)
 		continue;
 	      rclass = REGNO_REG_CLASS (regno);
 	      cost = 0;
-	      if (ira_hard_reg_set_intersection_p (regno, mode, call_used_reg_set)
-		  || HARD_REGNO_CALL_PART_CLOBBERED (regno, mode))
-		cost += (ALLOCNO_CALL_FREQ (a)
-			 * (ira_memory_move_cost[mode][rclass][0]
-			    + ira_memory_move_cost[mode][rclass][1]));
+	      crossed_calls_clobber_regs
+		= &(ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a));
+	      if (ira_hard_reg_set_intersection_p (regno, mode,
+						   *crossed_calls_clobber_regs))
+		{
+		  if (ira_hard_reg_set_intersection_p (regno, mode,
+						       call_used_reg_set)
+		      || HARD_REGNO_CALL_PART_CLOBBERED (regno, mode))
+		    cost += (ALLOCNO_CALL_FREQ (a)
+			     * (ira_memory_move_cost[mode][rclass][0]
+				+ ira_memory_move_cost[mode][rclass][1]));
 #ifdef IRA_HARD_REGNO_ADD_COST_MULTIPLIER
-	      cost += ((ira_memory_move_cost[mode][rclass][0]
-			+ ira_memory_move_cost[mode][rclass][1])
-		       * ALLOCNO_FREQ (a)
-		       * IRA_HARD_REGNO_ADD_COST_MULTIPLIER (regno) / 2);
+		  cost += ((ira_memory_move_cost[mode][rclass][0]
+			    + ira_memory_move_cost[mode][rclass][1])
+			   * ALLOCNO_FREQ (a)
+			   * IRA_HARD_REGNO_ADD_COST_MULTIPLIER (regno) / 2);
 #endif
+		}
 	      if (INT_MAX - cost < reg_costs[j])
 		reg_costs[j] = INT_MAX;
 	      else
Index: gcc/rtl.h
===================================================================
--- gcc/rtl.h (revision 195240)
+++ gcc/rtl.h (working copy)
@@ -2039,7 +2039,7 @@  extern const_rtx set_of (const_rtx, cons
 extern void record_hard_reg_sets (rtx, const_rtx, void *);
 extern void record_hard_reg_uses (rtx *, void *);
 #ifdef HARD_CONST
-extern void find_all_hard_reg_sets (const_rtx, HARD_REG_SET *);
+extern void find_all_hard_reg_sets (const_rtx, HARD_REG_SET *, bool);
 #endif
 extern void note_stores (const_rtx, void (*) (rtx, const_rtx, void *), void *);
 extern void note_uses (rtx *, void (*) (rtx *, void *), void *);
Index: gcc/combine.c
===================================================================
--- gcc/combine.c (revision 195240)
+++ gcc/combine.c (working copy)
@@ -13188,6 +13188,7 @@  distribute_notes (rtx notes, rtx from_in
 	case REG_NORETURN:
 	case REG_SETJMP:
 	case REG_TM:
+	case REG_CALL_DECL:
 	  /* These notes must remain with the call.  It should not be
 	     possible for both I2 and I3 to be a call.  */
 	  if (CALL_P (i3))
Index: gcc/resource.c
===================================================================
--- gcc/resource.c (revision 195240)
+++ gcc/resource.c (working copy)
@@ -649,10 +649,12 @@  mark_set_resources (rtx x, struct resour
       if (mark_type == MARK_SRC_DEST_CALL)
 	{
 	  rtx link;
+	  HARD_REG_SET regs;
 
 	  res->cc = res->memory = 1;
 
-	  IOR_HARD_REG_SET (res->regs, regs_invalidated_by_call);
+	  get_call_reg_set_usage (x, &regs, regs_invalidated_by_call);
+	  IOR_HARD_REG_SET (res->regs, regs);
 
 	  for (link = CALL_INSN_FUNCTION_USAGE (x);
 	       link; link = XEXP (link, 1))
@@ -998,11 +1000,15 @@  mark_target_live_regs (rtx insns, rtx ta
 
 	  if (CALL_P (real_insn))
 	    {
+	      HARD_REG_SET regs_invalidated_by_this_call;
 	      /* CALL clobbers all call-used regs that aren't fixed except
 		 sp, ap, and fp.  Do this before setting the result of the
 		 call live.  */
-	      AND_COMPL_HARD_REG_SET (current_live_regs,
+	      get_call_reg_set_usage (real_insn,
+				      &regs_invalidated_by_this_call,
 				      regs_invalidated_by_call);
+	      AND_COMPL_HARD_REG_SET (current_live_regs,
+				      regs_invalidated_by_this_call);
 
 	      /* A CALL_INSN sets any global register live, since it may
 		 have been modified by the call.  */
Index: gcc/reg-notes.def
===================================================================
--- gcc/reg-notes.def (revision 195240)
+++ gcc/reg-notes.def (working copy)
@@ -216,3 +216,8 @@  REG_NOTE (ARGS_SIZE)
    that the return value of a call can be used to reinitialize a
    pseudo reg.  */
 REG_NOTE (RETURNED)
+
+/* Attached to a call to record the decl of the called function.  The decl
+   might no longer be recoverable from the call itself after the call insn
+   has been split.  The operand of this note is a SYMBOL_REF.  */
+REG_NOTE (CALL_DECL)
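
As an illustration of how the note is meant to be consumed (a sketch only;
the helper name is hypothetical and not part of this patch), the decl of the
callee can be recovered from a possibly-split call_insn like this:
...
/* Hypothetical helper: recover the decl of the called function from a
   (possibly split) call_insn via its REG_CALL_DECL note.  The note operand
   is a SYMBOL_REF, so SYMBOL_REF_DECL yields the decl when the callee is
   known at compile time.  */
static tree
call_decl_from_reg_note (rtx call_insn)
{
  rtx note = find_reg_note (call_insn, REG_CALL_DECL, NULL_RTX);

  if (note == NULL_RTX || XEXP (note, 0) == NULL_RTX)
    return NULL_TREE;
  return SYMBOL_REF_DECL (XEXP (note, 0));
}
...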
Index: gcc/doc/tm.texi
===================================================================
--- gcc/doc/tm.texi (revision 195418)
+++ gcc/doc/tm.texi (working copy)
@@ -3074,6 +3074,7 @@  This describes the stack layout and call
 * Profiling::
 * Tail Calls::
 * Stack Smashing Protection::
+* Miscellaneous Register Hooks::
 @end menu
 
 @node Frame Layout
@@ -4999,6 +5000,14 @@  normally defined in @file{libgcc2.c}.
 Whether this target supports splitting the stack when the options described in @var{opts} have been passed.  This is called after options have been parsed, so the target may reject splitting the stack in some configurations.  The default version of this hook returns false.  If @var{report} is true, this function may issue a warning or error; if @var{report} is false, it must simply return a value
 @end deftypefn
 
+@node Miscellaneous Register Hooks
+@subsection Miscellaneous register hooks
+@cindex miscellaneous register hooks
+
+@deftypefn {Target Hook} void TARGET_FN_OTHER_HARD_REG_USAGE (struct hard_reg_set_container *@var{regs})
+Add any hard registers to @var{regs} that are set or clobbered by a call to the function.  This hook only needs to be defined to provide registers that cannot be found by examination of the final RTL representation of a function.
+@end deftypefn
+
 @node Varargs
 @section Implementing the Varargs Macros
 @cindex varargs implementation
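
For reference, here is a minimal sketch of how the hook's result might be
folded into the register-usage set computed from the function body.  The
targetm.fn_other_hard_reg_usage spelling and the function_used_regs variable
are assumptions for illustration; the real wiring lives in the final-pass
analysis part of the patch, and the ARM implementation of the hook appears
further down:
...
/* Sketch only: merge the extra clobbers reported by the target hook into
   FUNCTION_USED_REGS, the set derived from scanning the function body.  */
struct hard_reg_set_container other;

CLEAR_HARD_REG_SET (other.set);
targetm.fn_other_hard_reg_usage (&other);
IOR_HARD_REG_SET (function_used_regs, other.set);
...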
Index: gcc/doc/tm.texi.in
===================================================================
--- gcc/doc/tm.texi.in (revision 195418)
+++ gcc/doc/tm.texi.in (working copy)
@@ -3042,6 +3042,7 @@  This describes the stack layout and call
 * Profiling::
 * Tail Calls::
 * Stack Smashing Protection::
+* Miscellaneous Register Hooks::
 @end menu
 
 @node Frame Layout
@@ -4922,6 +4923,12 @@  normally defined in @file{libgcc2.c}.
 
 @hook TARGET_SUPPORTS_SPLIT_STACK
 
+@node Miscellaneous Register Hooks
+@subsection Miscellaneous register hooks
+@cindex miscellaneous register hooks
+
+@hook TARGET_FN_OTHER_HARD_REG_USAGE
+
 @node Varargs
 @section Implementing the Varargs Macros
 @cindex varargs implementation
Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi (revision 195418)
+++ gcc/doc/invoke.texi (working copy)
@@ -419,8 +419,8 @@  Objective-C and Objective-C++ Dialects}.
 -ftree-ter -ftree-vect-loop-version -ftree-vectorize -ftree-vrp @gol
 -funit-at-a-time -funroll-all-loops -funroll-loops @gol
 -funsafe-loop-optimizations -funsafe-math-optimizations -funswitch-loops @gol
--fvariable-expansion-in-unroller -fvect-cost-model -fvpt -fweb @gol
--fwhole-program -fwpa -fuse-ld=@var{linker} -fuse-linker-plugin @gol
+-fuse-caller-save -fvariable-expansion-in-unroller -fvect-cost-model -fvpt @gol
+-fweb -fwhole-program -fwpa -fuse-ld=@var{linker} -fuse-linker-plugin @gol
 --param @var{name}=@var{value}
 -O  -O0  -O1  -O2  -O3  -Os -Ofast -Og}
 
@@ -7355,6 +7355,15 @@  and then tries to find ways to combine t
 
 Enabled by default at @option{-O1} and higher.
 
+@item -fuse-caller-save
+@opindex fuse-caller-save
+Use caller save registers for allocation if those registers are not used by
+any called function.  In that case it is not necessary to save and restore
+them around calls.  This is only possible if the called functions are part of
+the same compilation unit as the current function and are compiled before it.
+
+Enabled at levels @option{-O2}, @option{-O3}, @option{-Os}.
+
 @item -fconserve-stack
 @opindex fconserve-stack
 Attempt to minimize stack usage.  The compiler attempts to use less
Index: gcc/config/arm/arm.c
===================================================================
--- gcc/config/arm/arm.c (revision 195240)
+++ gcc/config/arm/arm.c (working copy)
@@ -270,6 +270,7 @@  static bool arm_vectorize_vec_perm_const
 					     const unsigned char *sel);
 static void arm_canonicalize_comparison (int *code, rtx *op0, rtx *op1,
 					 bool op0_preserve_value);
+static void arm_fn_other_hard_reg_usage (struct hard_reg_set_container *);
 
 /* Table of machine attributes.  */
 static const struct attribute_spec arm_attribute_table[] =
@@ -633,6 +634,10 @@  static const struct attribute_spec arm_a
 #define TARGET_CANONICALIZE_COMPARISON \
   arm_canonicalize_comparison
 
+#undef TARGET_FN_OTHER_HARD_REG_USAGE
+#define TARGET_FN_OTHER_HARD_REG_USAGE \
+  arm_fn_other_hard_reg_usage
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 /* Obstack for minipool constant handling.  */
@@ -3695,6 +3700,19 @@  arm_canonicalize_comparison (int *code,
     }
 }
 
+/* Implement TARGET_FN_OTHER_HARD_REG_USAGE.  */
+
+static void
+arm_fn_other_hard_reg_usage (struct hard_reg_set_container *regs)
+{
+  if (TARGET_AAPCS_BASED)
+    {
+      /* For AAPCS, IP and CC can be clobbered by veneers inserted by the
+	 linker.  */
+      SET_HARD_REG_BIT (regs->set, IP_REGNUM);
+      SET_HARD_REG_BIT (regs->set, CC_REGNUM);
+    }
+}
 
 /* Define how to find the value returned by a function.  */
 
Index: gcc/testsuite/lib/target-supports.exp
===================================================================
--- gcc/testsuite/lib/target-supports.exp (revision 195240)
+++ gcc/testsuite/lib/target-supports.exp (working copy)
@@ -897,6 +897,26 @@  proc check_effective_target_mips16_attri
     } [add_options_for_mips16_attribute ""]]
 }
 
+# Return 1 if the target generates mips16 code by default.
+
+proc check_effective_target_mips16 { } {
+    return [check_no_compiler_messages mips16 assembly {
+	#if !(defined __mips16)
+	#error FOO
+	#endif
+    } ""]
+}
+
+# Return 1 if the target generates micromips code by default.
+
+proc check_effective_target_micromips { } {
+    return [check_no_compiler_messages micromips assembly {
+	#if !(defined __mips_micromips)
+	#error FOO
+	#endif
+    } ""]
+}
+
 # Return 1 if the target supports long double larger than double when
 # using the new ABI, 0 otherwise.
 
Index: gcc/testsuite/gcc.target/mips/mips.exp
===================================================================
--- gcc/testsuite/gcc.target/mips/mips.exp (revision 195240)
+++ gcc/testsuite/gcc.target/mips/mips.exp (working copy)
@@ -245,6 +245,7 @@  set mips_option_groups {
     small-data "-G[0-9]+"
     warnings "-w"
     dump "-fdump-.*"
+    save_temps "-save-temps"
 }
 
 # Add -mfoo/-mno-foo options to mips_option_groups.
@@ -301,6 +302,7 @@  foreach option {
     tree-vectorize
     unroll-all-loops
     unroll-loops
+    use-caller-save
 } {
     lappend mips_option_groups $option "-f(no-|)$option"
 }
Index: gcc/testsuite/gcc.target/mips/aru-1.c
===================================================================
--- /dev/null (new file)
+++ gcc/testsuite/gcc.target/mips/aru-1.c (revision 0)
@@ -0,0 +1,38 @@ 
+/* { dg-do run } */
+/* { dg-options "-fuse-caller-save -save-temps" } */
+/* { dg-skip-if "" { *-*-* }  { "*" } { "-Os" } } */
+/* Testing -fuse-caller-save optimization option.  */
+
+static int __attribute__((noinline))
+bar (int x)
+{
+  return x + 3;
+}
+
+int __attribute__((noinline))
+foo (int y)
+{
+  return y + bar (y);
+}
+
+int
+main (void)
+{
+  return !(foo (5) == 13);
+}
+
+/* Check that there are only 2 saves to the stack: $31 in main and in foo.  */
+
+/* Variant not mips16.  Check that there are only 2 sw/sd instructions.  */
+/* { dg-final { scan-assembler-times "(?n)s\[wd\]\t\\\$.*,.*\\(\\\$sp\\)" 2 { target { ! mips16 } } } } */
+
+/* Variant not mips16, subvariant micromips.  Additionally check that there
+   is no swm.  */
+/* { dg-final { scan-assembler-times "(?n)swm\t\\\$.*,.*\\(\\\$sp\\)" 0 { target micromips } } } */
+
+/* Variant mips16.  The save instruction can save one or more registers;
+   check that only one is saved, twice in total.  */
+/* { dg-final { scan-assembler-times "(?n)save\t\[0-9\]*,\\\$\[^,\]*\$" 2 { target mips16 } } } */
+
+/* Check that the first callee-saved register ($16) is not used.  */
+/* { dg-final { scan-assembler-not "(\\\$16)" } } */