
Fix estimation of array accesses

Message ID 20121029174114.GA25639@kam.mff.cuni.cz
State New

Commit Message

Jan Hubicka Oct. 29, 2012, 5:41 p.m. UTC
> > ICK ...
> > 
> > Why not sth as simple as
> > 
> >      return num_ssa_operands (stmt, SSA_OP_USE);
> > 
> > ?  a[1][2] and b[2] really have the same cost, variable length
> > objects have extra SSA operands in ARRAY_REF/COMPONENT_REF for
> > the size.  Thus, stmt cost somehow should reflect the number
> > of dependent stmts.
> > 
> > So in estimate_num_insns I'd try
> > 
> > int
> > estimate_num_insns (gimple stmt, eni_weights *weights)
> > {
> >   unsigned cost, i;
> >   enum gimple_code code = gimple_code (stmt);
> >   tree lhs;
> >   tree rhs;
> > 
> >   switch (code)
> >   {
> >    case GIMPLE_ASSIGN:
> >     /* Initialize with the number of SSA uses, one is free.  */
> >     cost = num_ssa_operands (stmt, SSA_OP_USE);
> >     if (cost > 1)
> >       --cost;
> 
> Hi,
> this is the updated patch I am testing after today's discussion.  I decided to
> drop the ipa-inline-analysis changes and do that incrementally.  So the patch
> now trashes tramp3d performance by increasing the need for early-inlining-insns,
> but that is not unexpected.
> 
> The patch also fixes the accounting of addr exprs, which previously got
> confused with loads.
> 
> Does this seem sane?
> 
> 	* tree-inline.c (estimate_operator_cost): Return 1 for non-trivial
> 	ADDR_EXPR operations.
> 	(estimate_num_insns): Do not confuse general single rhs with
> 	loads; account cost of non-trivial addr_expr for ASSIGN_EXPR,
> 	GIMPLE_RETURN and GIMPLE_ASM.

Hi,
this patch actually does not cause that much tramp3d fuzz, and there are no
testsuite differences due to unrolling changes.

The changes are the constants used when accounting loads and stores.
A typical store has two SSA uses (one for the address and one for the value to store).
Of course we lose the difference between an array offset and a pointer dereference.

A typical load/address expression has one SSA use (for the address).
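
To illustrate the constants (a made-up example, not part of the patch):

    /* Hypothetical example; GIMPLE shown informally in the comments.  */
    struct s { int a[16]; };

    int
    example (struct s *p, int i, int x)
    {
      p->a[i] = x;       /* p_1->a[i_2] = x_3;  three SSA uses: the address
                            (p_1) and the stored value (x_3) are the two
                            "typical" store uses, i_2 is the variable index.  */
      return p->a[i];    /* _4 = p_1->a[i_2];  two SSA uses: p_1 is the one
                            "typical" load use, i_2 again the variable index.  */
    }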

Bootstrapped/regtested x86_64-linux, OK?

Honza

Comments

Richard Biener Oct. 30, 2012, 11:39 a.m. UTC | #1
On Mon, 29 Oct 2012, Jan Hubicka wrote:

> > > ICK ...
> > > 
> > > Why not sth as simple as
> > > 
> > >      return num_ssa_operands (stmt, SSA_OP_USE);
> > > 
> > > ?  a[1][2] and b[2] really have the same cost, variable length
> > > objects have extra SSA operands in ARRAY_REF/COMPONENT_REF for
> > > the size.  Thus, stmt cost somehow should reflect the number
> > > of dependent stmts.
> > > 
> > > So in estimate_num_insns I'd try
> > > 
> > > int
> > > estimate_num_insns (gimple stmt, eni_weights *weights)
> > > {
> > >   unsigned cost, i;
> > >   enum gimple_code code = gimple_code (stmt);
> > >   tree lhs;
> > >   tree rhs;
> > > 
> > >   switch (code)
> > >   {
> > >    case GIMPLE_ASSIGN:
> > >     /* Initialize with the number of SSA uses, one is free.  */
> > >     cost = num_ssa_operands (stmt, SSA_OP_USE);
> > >     if (cost > 1)
> > >       --cost;
> > 
> > Hi,
> > this is the updated patch I am testing after today's discussion.  I decided to
> > drop the ipa-inline-analysis changes and do that incrementally.  So the patch
> > now trashes tramp3d performance by increasing the need for early-inlining-insns,
> > but that is not unexpected.
> > 
> > The patch also fixes the accounting of addr exprs, which previously got
> > confused with loads.
> > 
> > Does this seem sane?
> > 
> > 	* tree-inline.c (estimate_operator_cost): Return 1 for non-trivial
> > 	ADDR_EXPR operations.
> > 	(estimate_num_insns): Do not confuse general single rhs with
> > 	loads; account cost of non-trivial addr_expr for ASSIGN_EXPR,
> > 	GIMPLE_RETURN and GIMPLE_ASM.
> 
> Hi,
> this patch actually does not cause that much tramp3d fuzz, and there are no
> testsuite differences due to unrolling changes.
> 
> The changes are the constants used when accounting loads and stores.
> A typical store has two SSA uses (one for the address and one for the value to store).
> Of course we lose the difference between an array offset and a pointer dereference.
> 
> A typical load/address expression has one SSA use (for the address).
> 
> Bootstrapped/regtested x86_64-linux, OK?

ChangeLog?

> Honza
> 
> Index: tree-inline.c
> ===================================================================
> --- tree-inline.c	(revision 192945)
> +++ tree-inline.c	(working copy)
> @@ -3447,6 +3447,19 @@ estimate_operator_cost (enum tree_code c
>        if (TREE_CODE (op2) != INTEGER_CST)
>          return weights->div_mod_cost;
>        return 1;
> +    case ADDR_EXPR:
> +      { 
> +        tree addr_base;
> +        HOST_WIDE_INT addr_offset;
> +
> +	addr_base = get_addr_base_and_unit_offset (TREE_OPERAND (op1, 0),
> +						   &addr_offset);
> +	/* If the offset is variable or with non-zero offset, return 2.  */
> +	if (!addr_base || addr_offset != 0 || TREE_CODE (addr_base) != MEM_REF
> +	    || !integer_zerop (TREE_OPERAND (addr_base, 1)))
> +	  return 1;

The comment doesn't match the code.  Why the TREE_CODE (addr_base) != 
MEM_REF check?  If it isn't a MEM_REF then it is a plain decl, thus
no dereference.  So it's not clear what you want here ...?

It looks like you want to pessimize variable addresses here,
like &a.a[i]?  Before, every ADDR_EXPR had cost zero.

I'd say you want simply

        if (!addr_base || addr_offset != 0)
          return 1;

no?  Or even

        if (!addr_base
            || (addr_offset != 0 && TREE_CODE (addr_base) == MEM_REF))
          return 1;

that keeps &decl + CST as cost 0.  Or even

        if (!addr_base)
          return 1;

that even keeps ptr + CST as cost 0.  Both because they are likely
combined with some complex addressing mode later.
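
As a sketch of the second variant (a hypothetical helper, not part of the
patch), with the cases spelled out:

        /* Hypothetical sketch of the second variant above: only a variable
           offset, or a constant offset on top of a pointer dereference,
           makes the ADDR_EXPR non-free.  */
        static int
        addr_expr_cost_sketch (tree op1)
        {
          HOST_WIDE_INT addr_offset;
          tree addr_base = get_addr_base_and_unit_offset (TREE_OPERAND (op1, 0),
                                                          &addr_offset);
          /* Examples:  &a         -> base a,        offset 0 -> cost 0
                        &a.f       -> base a,        offset 4 -> cost 0 (&decl + CST)
                        &p_1->f    -> base MEM[p_1], offset 4 -> cost 1
                        &a.a[i_2]  -> base NULL (variable)    -> cost 1  */
          if (!addr_base
              || (addr_offset != 0 && TREE_CODE (addr_base) == MEM_REF))
            return 1;
          return 0;
        }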

> +      }
> +      return 0;

Inside the } please.

>  
>      default:
>        /* We expect a copy assignment with no operator.  */
> @@ -3512,14 +3525,24 @@ estimate_num_insns (gimple stmt, eni_wei
>        lhs = gimple_assign_lhs (stmt);
>        rhs = gimple_assign_rhs1 (stmt);
>  
> -      if (is_gimple_reg (lhs))
> -	cost = 0;
> -      else
> +      /* Store.  */
> +      if (gimple_vdef (stmt))
>  	cost = estimate_move_cost (TREE_TYPE (lhs));
> +      else
> +	cost = 0;

That change seems bogus.  A !is_gimple_reg lhs is always a store.

>  
> -      if (!is_gimple_reg (rhs) && !is_gimple_min_invariant (rhs))
> +      /* Load.  */
> +      if (gimple_vuse (stmt))
>  	cost += estimate_move_cost (TREE_TYPE (rhs));

Likewise.  If rhs1 is not a register or a constant, this is a load.
All stores also have a VUSE, so this way you are always accounting
a store as an aggregate copy.
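
For illustration (restating the pre-patch predicates, not a new proposal):

      if (!is_gimple_reg (lhs))                                    /* store  */
        cost = estimate_move_cost (TREE_TYPE (lhs));
      else
        cost = 0;
      if (!is_gimple_reg (rhs) && !is_gimple_min_invariant (rhs))  /* load  */
        cost += estimate_move_cost (TREE_TYPE (rhs));

With the VUSE test a store of a register value, say "a = x_1", has both a
VDEF and a VUSE, so it gets charged for a store and a load even though it
loads nothing.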

>  
> +      /* Stores, loads and address expressions may have variable array
> +	 references in them. Account these.  */
> +      if (gimple_vdef (stmt))
> +	cost += MAX (0, (int)num_ssa_operands (stmt, SSA_OP_USE) - 2);
> +      else if (gimple_vuse (stmt)
> +               || gimple_assign_rhs_code (stmt) == ADDR_EXPR)
> +	cost += MAX (0, (int)num_ssa_operands (stmt, SSA_OP_USE) - 1);
> +

Rather than doing this here in this awkward and bogus way (see above),
why not do it in estimate_operator_cost?  You added ADDR_EXPR already,
so simply add DECLs and handled-components.  Then you can do
better than estimating from num_ssa_operands of the stmt by simply
walking the handled-components looking for SSA uses in the
relevant positions.
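
Something along these lines (a hypothetical, untested sketch):

        /* Count SSA uses in the index positions of REF's handled
           components; these are the "extra" operands that make an
           access non-trivial.  */
        static int
        estimate_ref_cost_sketch (tree ref)
        {
          int cost = 0;
          while (handled_component_p (ref))
            {
              if ((TREE_CODE (ref) == ARRAY_REF
                   || TREE_CODE (ref) == ARRAY_RANGE_REF)
                  && TREE_CODE (TREE_OPERAND (ref, 1)) == SSA_NAME)
                cost++;
              ref = TREE_OPERAND (ref, 0);
            }
          return cost;
        }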

>        cost += estimate_operator_cost (gimple_assign_rhs_code (stmt), weights,
>        				      gimple_assign_rhs1 (stmt),
>  				      get_gimple_rhs_class (gimple_assign_rhs_code (stmt))
> @@ -3596,6 +3619,7 @@ estimate_num_insns (gimple stmt, eni_wei
>  
>      case GIMPLE_RETURN:
>        return weights->return_cost;
> +	     + MAX (0, (int)num_ssa_operands (stmt, SSA_OP_USE) - 1);

Well ... why not estimate_operator_cost?

>      case GIMPLE_GOTO:
>      case GIMPLE_LABEL:
> @@ -3606,7 +3630,10 @@ estimate_num_insns (gimple stmt, eni_wei
>        return 0;
>  
>      case GIMPLE_ASM:
> -      return asm_str_count (gimple_asm_string (stmt));
> +      return (asm_str_count (gimple_asm_string (stmt))
> +	      /* Account also possible non-trivial addr_exprs
> +		 in the arguments.  */
> +	      + num_ssa_operands (stmt, SSA_OP_USE));

The comment is misleading: you are counting all simple register
operands as well.  If you want to improve the ASM cost, then walk
over the inputs/outputs and use estimate_operator_cost again.
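
Roughly (a hypothetical, untested sketch):

        case GIMPLE_ASM:
          {
            int cost = asm_str_count (gimple_asm_string (stmt));
            unsigned i;
            /* Charge each input/output operand its operator cost instead
               of charging every SSA use.  */
            for (i = 0; i < gimple_asm_ninputs (stmt); i++)
              {
                tree op = TREE_VALUE (gimple_asm_input_op (stmt, i));
                cost += estimate_operator_cost (TREE_CODE (op), weights,
                                                op, NULL_TREE);
              }
            for (i = 0; i < gimple_asm_noutputs (stmt); i++)
              {
                tree op = TREE_VALUE (gimple_asm_output_op (stmt, i));
                cost += estimate_operator_cost (TREE_CODE (op), weights,
                                                op, NULL_TREE);
              }
            return cost;
          }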

Richard.

Patch

Index: tree-inline.c
===================================================================
--- tree-inline.c	(revision 192945)
+++ tree-inline.c	(working copy)
@@ -3447,6 +3447,19 @@  estimate_operator_cost (enum tree_code c
       if (TREE_CODE (op2) != INTEGER_CST)
         return weights->div_mod_cost;
       return 1;
+    case ADDR_EXPR:
+      { 
+        tree addr_base;
+        HOST_WIDE_INT addr_offset;
+
+	addr_base = get_addr_base_and_unit_offset (TREE_OPERAND (op1, 0),
+						   &addr_offset);
+	/* If the offset is variable or with non-zero offset, return 2.  */
+	if (!addr_base || addr_offset != 0 || TREE_CODE (addr_base) != MEM_REF
+	    || !integer_zerop (TREE_OPERAND (addr_base, 1)))
+	  return 1;
+      }
+      return 0;
 
     default:
       /* We expect a copy assignment with no operator.  */
@@ -3512,14 +3525,24 @@  estimate_num_insns (gimple stmt, eni_wei
       lhs = gimple_assign_lhs (stmt);
       rhs = gimple_assign_rhs1 (stmt);
 
-      if (is_gimple_reg (lhs))
-	cost = 0;
-      else
+      /* Store.  */
+      if (gimple_vdef (stmt))
 	cost = estimate_move_cost (TREE_TYPE (lhs));
+      else
+	cost = 0;
 
-      if (!is_gimple_reg (rhs) && !is_gimple_min_invariant (rhs))
+      /* Load.  */
+      if (gimple_vuse (stmt))
 	cost += estimate_move_cost (TREE_TYPE (rhs));
 
+      /* Stores, loads and address expressions may have variable array
+	 references in them. Account these.  */
+      if (gimple_vdef (stmt))
+	cost += MAX (0, (int)num_ssa_operands (stmt, SSA_OP_USE) - 2);
+      else if (gimple_vuse (stmt)
+               || gimple_assign_rhs_code (stmt) == ADDR_EXPR)
+	cost += MAX (0, (int)num_ssa_operands (stmt, SSA_OP_USE) - 1);
+
       cost += estimate_operator_cost (gimple_assign_rhs_code (stmt), weights,
       				      gimple_assign_rhs1 (stmt),
 				      get_gimple_rhs_class (gimple_assign_rhs_code (stmt))
@@ -3596,6 +3619,7 @@  estimate_num_insns (gimple stmt, eni_wei
 
     case GIMPLE_RETURN:
       return weights->return_cost;
+	     + MAX (0, (int)num_ssa_operands (stmt, SSA_OP_USE) - 1);
 
     case GIMPLE_GOTO:
     case GIMPLE_LABEL:
@@ -3606,7 +3630,10 @@  estimate_num_insns (gimple stmt, eni_wei
       return 0;
 
     case GIMPLE_ASM:
-      return asm_str_count (gimple_asm_string (stmt));
+      return (asm_str_count (gimple_asm_string (stmt))
+	      /* Account also possible non-trivial addr_exprs
+		 in the arguments.  */
+	      + num_ssa_operands (stmt, SSA_OP_USE));
 
     case GIMPLE_RESX:
       /* This is either going to be an external function call with one