diff mbox

Simplify address expression in IVOPT

Message ID 001501ceda0f$bd913da0$38b3b8e0$@arm.com
State New
Headers show

Commit Message

Bin Cheng Nov. 5, 2013, 10:13 a.m. UTC
> -----Original Message-----
> From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches-
> owner@gcc.gnu.org] On Behalf Of bin.cheng
> Sent: Monday, November 04, 2013 4:35 PM
> To: 'Richard Biener'
> Cc: GCC Patches
> Subject: RE: [PATCH GCC]Simplify address expression in IVOPT
> 
> 
> 
> > -----Original Message-----
> > From: Richard Biener [mailto:richard.guenther@gmail.com]
> > Sent: Wednesday, October 30, 2013 10:46 PM
> > To: Bin Cheng
> > Cc: GCC Patches
> > Subject: Re: [PATCH GCC]Simplify address expression in IVOPT
> >
> > On Tue, Oct 29, 2013 at 10:18 AM, bin.cheng <bin.cheng@arm.com> wrote:
> >
> > Hmm.  I think you want what get_inner_reference_aff does, using the
> > return value of get_inner_reference as starting point for
> determine_base_object.
> > And you don't want to restrict yourselves so much on what exprs to
> process,
> > but only exclude DECL_Ps.
> Did you mean I should pass all address expressions to
> get_inner_reference_aff except the one with declaration operand?
> I am a little confused here why DECL_Ps should be handled specially?
Seems
> it can be handled properly by the get_* function.  Anything important I
> missed?
> 
> > Just amend get_inner_reference_aff to return the tree base object.
> I found it's inappropriate to do this because: functions like
> get_inner_reference* only accept reference expressions, while base_object
> has to be computed for other kinds of expressions too.  Take gimple
> statement "a_1 = *ptr_1" as an example, the base passed to alloc_iv is
ptr_1;
> the base_object computed by determine_base_object is also ptr_1, which
> we can't rely on get_inner_reference* .
> 
> Also It seems the comment before determine_base_object might be
> inaccurate:
> " Returns a memory object to that EXPR points." which I think is " Returns
a
> pointer pointing to the main memory object to that EXPR points."
> Right?
> >
> > Note that this isn't really "simplifying" but rather lowering all
> addresses.
> >
> > The other changes in this patch are unrelated, right?
> Right, I should have named this message like "refine cost computation of
> expressions in IVOPT" just as the patch file.

Hi,
I updated the patch according to review comments.  Now it lowers all address
expressions except ones with DECL_P to get_inner_reference_aff, it also
computes base_object which is used later to ease determine_base_object.

Bootstrap and test on x86/x86_64/arm on going,  is it OK if tests pass?

Thanks.
bin

gcc/testsuite/ChangeLog
2013-11-05  Bin Cheng  <bin.cheng@arm.com>

	* gcc.dg/tree-ssa/loop-2.c: Refine check condition.
	* gcc.dg/tree-ssa/ivopt_infer_2.c: Ditto.
	* gcc.dg/tree-ssa/ivopt_mult_3.c: Ditto.

2013-11-05  Bin Cheng  <bin.cheng@arm.com>

	* tree-ssa-loop-ivopts.c (alloc_iv): Lower address expressions.
	(get_shiftadd_cost): Check equality using operand_equal_p.
	(force_expr_to_var_cost): Refactor the code.  Handle type
	conversion.
	(split_address_cost): Call force_expr_to_var_cost.
	* tree-affine.c (get_inner_reference_aff): Return base_addr.
	* tree-affine.h (get_inner_reference_aff): Change prototype.

Comments

Yufeng Zhang Nov. 5, 2013, 11:19 a.m. UTC | #1
On 11/05/13 10:13, bin.cheng wrote:
> Index: gcc/tree-affine.c
> ===================================================================
> --- gcc/tree-affine.c	(revision 204117)
> +++ gcc/tree-affine.c	(working copy)
> @@ -874,10 +874,11 @@ debug_aff (aff_tree *val)
>     fprintf (stderr, "\n");
>   }
>
> -/* Returns address of the reference REF in ADDR.  The size of the accessed
> -   location is stored to SIZE.  */
> +/* Computes address of the reference REF in ADDR.  The size of the accessed
> +   location is stored to SIZE.  Returns pointer to the ultimate containing
> +   object to which REF refers.  */
>
> -void
> +tree
>   get_inner_reference_aff (tree ref, aff_tree *addr, double_int *size)
>   {
>     HOST_WIDE_INT bitsize, bitpos;
> @@ -904,6 +905,8 @@ get_inner_reference_aff (tree ref, aff_tree *addr,
>     aff_combination_add (addr,&tmp);
>
>     *size = double_int::from_shwi ((bitsize + BITS_PER_UNIT - 1) / BITS_PER_UNIT);
> +
> +  return base_addr;
>   }
>

I think what Richard suggests is to return the base object rather the 
address of the base object, i.e.

   return base;

This is good in keeping the consistency in the return values between 
get_inner_reference_aff and get_inner_reference.

Yufeng
Bin.Cheng Nov. 5, 2013, 11:45 a.m. UTC | #2
On Tue, Nov 5, 2013 at 7:19 PM, Yufeng Zhang <Yufeng.Zhang@arm.com> wrote:
> On 11/05/13 10:13, bin.cheng wrote:
>>
>> Index: gcc/tree-affine.c
>> ===================================================================
>> --- gcc/tree-affine.c   (revision 204117)
>> +++ gcc/tree-affine.c   (working copy)
>> @@ -874,10 +874,11 @@ debug_aff (aff_tree *val)
>>     fprintf (stderr, "\n");
>>   }
>>
>> -/* Returns address of the reference REF in ADDR.  The size of the
>> accessed
>> -   location is stored to SIZE.  */
>> +/* Computes address of the reference REF in ADDR.  The size of the
>> accessed
>> +   location is stored to SIZE.  Returns pointer to the ultimate
>> containing
>> +   object to which REF refers.  */
>>
>> -void
>> +tree
>>   get_inner_reference_aff (tree ref, aff_tree *addr, double_int *size)
>>   {
>>     HOST_WIDE_INT bitsize, bitpos;
>> @@ -904,6 +905,8 @@ get_inner_reference_aff (tree ref, aff_tree *addr,
>>     aff_combination_add (addr,&tmp);
>>
>>     *size = double_int::from_shwi ((bitsize + BITS_PER_UNIT - 1) /
>> BITS_PER_UNIT);
>> +
>> +  return base_addr;
>>   }
>>
>
> I think what Richard suggests is to return the base object rather the
> address of the base object, i.e.
I am not sure about that.  We have to pass pointer_type expression to
function determine_base_object for address expressions, because there
is no way to tell pointer from object once we are in
determine_base_object.
Thanks.
bin
>
>   return base;
>
> This is good in keeping the consistency in the return values between
> get_inner_reference_aff and get_inner_reference.
>
> Yufeng
>
Richard Biener Nov. 5, 2013, 12:18 p.m. UTC | #3
On Tue, Nov 5, 2013 at 11:13 AM, bin.cheng <bin.cheng@arm.com> wrote:
>
>
>> -----Original Message-----
>> From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches-
>> owner@gcc.gnu.org] On Behalf Of bin.cheng
>> Sent: Monday, November 04, 2013 4:35 PM
>> To: 'Richard Biener'
>> Cc: GCC Patches
>> Subject: RE: [PATCH GCC]Simplify address expression in IVOPT
>>
>>
>>
>> > -----Original Message-----
>> > From: Richard Biener [mailto:richard.guenther@gmail.com]
>> > Sent: Wednesday, October 30, 2013 10:46 PM
>> > To: Bin Cheng
>> > Cc: GCC Patches
>> > Subject: Re: [PATCH GCC]Simplify address expression in IVOPT
>> >
>> > On Tue, Oct 29, 2013 at 10:18 AM, bin.cheng <bin.cheng@arm.com> wrote:
>> >
>> > Hmm.  I think you want what get_inner_reference_aff does, using the
>> > return value of get_inner_reference as starting point for
>> determine_base_object.
>> > And you don't want to restrict yourselves so much on what exprs to
>> process,
>> > but only exclude DECL_Ps.
>> Did you mean I should pass all address expressions to
>> get_inner_reference_aff except the one with declaration operand?
>> I am a little confused here why DECL_Ps should be handled specially?
> Seems
>> it can be handled properly by the get_* function.  Anything important I
>> missed?
>>
>> > Just amend get_inner_reference_aff to return the tree base object.
>> I found it's inappropriate to do this because: functions like
>> get_inner_reference* only accept reference expressions, while base_object
>> has to be computed for other kinds of expressions too.  Take gimple
>> statement "a_1 = *ptr_1" as an example, the base passed to alloc_iv is
> ptr_1;
>> the base_object computed by determine_base_object is also ptr_1, which
>> we can't rely on get_inner_reference* .
>>
>> Also It seems the comment before determine_base_object might be
>> inaccurate:
>> " Returns a memory object to that EXPR points." which I think is " Returns
> a
>> pointer pointing to the main memory object to that EXPR points."
>> Right?
>> >
>> > Note that this isn't really "simplifying" but rather lowering all
>> addresses.
>> >
>> > The other changes in this patch are unrelated, right?
>> Right, I should have named this message like "refine cost computation of
>> expressions in IVOPT" just as the patch file.
>
> Hi,
> I updated the patch according to review comments.  Now it lowers all address
> expressions except ones with DECL_P to get_inner_reference_aff, it also
> computes base_object which is used later to ease determine_base_object.
>
> Bootstrap and test on x86/x86_64/arm on going,  is it OK if tests pass?

 static struct iv *
 alloc_iv (tree base, tree step)
 {
+  tree expr = base;
+  tree base_object = base;
   struct iv *iv = XCNEW (struct iv);
   gcc_assert (step != NULL_TREE);

+  /* Lower all address expressions except ones with DECL_P as oeprand.
+     By doing this:
+       1) More accurate cost can be computed for address expressions;
+       2) Duplicate candidates won't be created for bases in different
+          forms, like &a[0] and &a.  */
+  STRIP_NOPS (expr);
+  if (TREE_CODE (expr) == ADDR_EXPR && !DECL_P (TREE_OPERAND (expr, 0)))
+    {
+      aff_tree comb;
+      double_int size;
+      base_object = get_inner_reference_aff (TREE_OPERAND (expr, 0),
+     &comb, &size);
+      gcc_assert (base_object != NULL_TREE);
+      base = fold_convert (TREE_TYPE (base), aff_combination_to_tree (&comb));
+    }
+
   iv->base = base;
-  iv->base_object = determine_base_object (base);
+  iv->base_object = determine_base_object (base_object);


for less confusion do not introduce 'expr' but just base_object
(s/expr/base_object/).  Also can you split out this change (and the
related tree-affine.c one) from the rest?

Ok with that change assuming it passes bootstrap & testing.

Thanks,
Richard.

> Thanks.
> bin
>
> gcc/testsuite/ChangeLog
> 2013-11-05  Bin Cheng  <bin.cheng@arm.com>
>
>         * gcc.dg/tree-ssa/loop-2.c: Refine check condition.
>         * gcc.dg/tree-ssa/ivopt_infer_2.c: Ditto.
>         * gcc.dg/tree-ssa/ivopt_mult_3.c: Ditto.
>
> 2013-11-05  Bin Cheng  <bin.cheng@arm.com>
>
>         * tree-ssa-loop-ivopts.c (alloc_iv): Lower address expressions.
>         (get_shiftadd_cost): Check equality using operand_equal_p.
>         (force_expr_to_var_cost): Refactor the code.  Handle type
>         conversion.
>         (split_address_cost): Call force_expr_to_var_cost.
>         * tree-affine.c (get_inner_reference_aff): Return base_addr.
>         * tree-affine.h (get_inner_reference_aff): Change prototype.
Yufeng Zhang Nov. 5, 2013, 12:18 p.m. UTC | #4
On 11/05/13 11:45, Bin.Cheng wrote:
> On Tue, Nov 5, 2013 at 7:19 PM, Yufeng Zhang<Yufeng.Zhang@arm.com>  wrote:
>> >  On 11/05/13 10:13, bin.cheng wrote:
>>> >>
>>> >>  Index: gcc/tree-affine.c
>>> >>  ===================================================================
>>> >>  --- gcc/tree-affine.c   (revision 204117)
>>> >>  +++ gcc/tree-affine.c   (working copy)
>>> >>  @@ -874,10 +874,11 @@ debug_aff (aff_tree *val)
>>> >>       fprintf (stderr, "\n");
>>> >>     }
>>> >>
>>> >>  -/* Returns address of the reference REF in ADDR.  The size of the
>>> >>  accessed
>>> >>  -   location is stored to SIZE.  */
>>> >>  +/* Computes address of the reference REF in ADDR.  The size of the
>>> >>  accessed
>>> >>  +   location is stored to SIZE.  Returns pointer to the ultimate
>>> >>  containing
>>> >>  +   object to which REF refers.  */
>>> >>
>>> >>  -void
>>> >>  +tree
>>> >>     get_inner_reference_aff (tree ref, aff_tree *addr, double_int *size)
>>> >>     {
>>> >>       HOST_WIDE_INT bitsize, bitpos;
>>> >>  @@ -904,6 +905,8 @@ get_inner_reference_aff (tree ref, aff_tree *addr,
>>> >>       aff_combination_add (addr,&tmp);
>>> >>
>>> >>       *size = double_int::from_shwi ((bitsize + BITS_PER_UNIT - 1) /
>>> >>  BITS_PER_UNIT);
>>> >>  +
>>> >>  +  return base_addr;
>>> >>     }
>>> >>
>> >
>> >  I think what Richard suggests is to return the base object rather the
>> >  address of the base object, i.e.
> I am not sure about that.  We have to pass pointer_type expression to
> function determine_base_object for address expressions, because there
> is no way to tell pointer from object once we are in
> determine_base_object.

I'm just concerned with the consistency in what is returned between 
get_inner_reference and get_inner_reference_aff.  If 
determine_base_object expects reference only, you can probably work 
around it with something like:

   base_object = build_fold_addr_expr (base_object);

after the get_inner_reference_aff call.

Yufeng
Richard Biener Nov. 5, 2013, 12:29 p.m. UTC | #5
On Tue, Nov 5, 2013 at 1:18 PM, Yufeng Zhang <Yufeng.Zhang@arm.com> wrote:
> On 11/05/13 11:45, Bin.Cheng wrote:
>>
>> On Tue, Nov 5, 2013 at 7:19 PM, Yufeng Zhang<Yufeng.Zhang@arm.com>  wrote:
>>>
>>> >  On 11/05/13 10:13, bin.cheng wrote:
>>>>
>>>> >>
>>>> >>  Index: gcc/tree-affine.c
>>>> >>  ===================================================================
>>>> >>  --- gcc/tree-affine.c   (revision 204117)
>>>> >>  +++ gcc/tree-affine.c   (working copy)
>>>> >>  @@ -874,10 +874,11 @@ debug_aff (aff_tree *val)
>>>> >>       fprintf (stderr, "\n");
>>>> >>     }
>>>> >>
>>>> >>  -/* Returns address of the reference REF in ADDR.  The size of the
>>>> >>  accessed
>>>> >>  -   location is stored to SIZE.  */
>>>> >>  +/* Computes address of the reference REF in ADDR.  The size of the
>>>> >>  accessed
>>>> >>  +   location is stored to SIZE.  Returns pointer to the ultimate
>>>> >>  containing
>>>> >>  +   object to which REF refers.  */
>>>> >>
>>>> >>  -void
>>>> >>  +tree
>>>> >>     get_inner_reference_aff (tree ref, aff_tree *addr, double_int
>>>> >> *size)
>>>> >>     {
>>>> >>       HOST_WIDE_INT bitsize, bitpos;
>>>> >>  @@ -904,6 +905,8 @@ get_inner_reference_aff (tree ref, aff_tree
>>>> >> *addr,
>>>> >>       aff_combination_add (addr,&tmp);
>>>> >>
>>>> >>       *size = double_int::from_shwi ((bitsize + BITS_PER_UNIT - 1) /
>>>> >>  BITS_PER_UNIT);
>>>> >>  +
>>>> >>  +  return base_addr;
>>>> >>     }
>>>> >>
>>>
>>> >
>>> >  I think what Richard suggests is to return the base object rather the
>>> >  address of the base object, i.e.
>>
>> I am not sure about that.  We have to pass pointer_type expression to
>> function determine_base_object for address expressions, because there
>> is no way to tell pointer from object once we are in
>> determine_base_object.
>
>
> I'm just concerned with the consistency in what is returned between
> get_inner_reference and get_inner_reference_aff.  If determine_base_object
> expects reference only, you can probably work around it with something like:
>
>   base_object = build_fold_addr_expr (base_object);
>
> after the get_inner_reference_aff call.

Note that get_inner_reference_aff is already different from get_inner_reference
in that it does not separate base object and offset but lumps them together
into an address represented as affine combination.

For this reason get_inner_reference_aff may not exactly be the best
name for the function ... it could be split to not do

  aff_combination_add (addr, &tmp);

but instead return 'base'.  The two existing users of the function could
be unified into aff_comb_cannot_overlap_p (giving that a better name
as well), even avoiding combining the base but instead handle
comparing the base itself.

Of course that's preexisting issues that I don't want to force Bin Cheng
to clean up and fix (given I've, too, just spent a minute thinking about it).

Thanks,
Richard.

> Yufeng
>
Bin.Cheng Nov. 5, 2013, 2:08 p.m. UTC | #6
On Tue, Nov 5, 2013 at 8:29 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Tue, Nov 5, 2013 at 1:18 PM, Yufeng Zhang <Yufeng.Zhang@arm.com> wrote:
>> On 11/05/13 11:45, Bin.Cheng wrote:
>>>
>>> On Tue, Nov 5, 2013 at 7:19 PM, Yufeng Zhang<Yufeng.Zhang@arm.com>  wrote:
>>>>
>>>> >  On 11/05/13 10:13, bin.cheng wrote:
>>>>>
>>>>> >>
>>>>> >>  Index: gcc/tree-affine.c
>>>>> >>  ===================================================================
>>>>> >>  --- gcc/tree-affine.c   (revision 204117)
>>>>> >>  +++ gcc/tree-affine.c   (working copy)
>>>>> >>  @@ -874,10 +874,11 @@ debug_aff (aff_tree *val)
>>>>> >>       fprintf (stderr, "\n");
>>>>> >>     }
>>>>> >>
>>>>> >>  -/* Returns address of the reference REF in ADDR.  The size of the
>>>>> >>  accessed
>>>>> >>  -   location is stored to SIZE.  */
>>>>> >>  +/* Computes address of the reference REF in ADDR.  The size of the
>>>>> >>  accessed
>>>>> >>  +   location is stored to SIZE.  Returns pointer to the ultimate
>>>>> >>  containing
>>>>> >>  +   object to which REF refers.  */
>>>>> >>
>>>>> >>  -void
>>>>> >>  +tree
>>>>> >>     get_inner_reference_aff (tree ref, aff_tree *addr, double_int
>>>>> >> *size)
>>>>> >>     {
>>>>> >>       HOST_WIDE_INT bitsize, bitpos;
>>>>> >>  @@ -904,6 +905,8 @@ get_inner_reference_aff (tree ref, aff_tree
>>>>> >> *addr,
>>>>> >>       aff_combination_add (addr,&tmp);
>>>>> >>
>>>>> >>       *size = double_int::from_shwi ((bitsize + BITS_PER_UNIT - 1) /
>>>>> >>  BITS_PER_UNIT);
>>>>> >>  +
>>>>> >>  +  return base_addr;
>>>>> >>     }
>>>>> >>
>>>>
>>>> >
>>>> >  I think what Richard suggests is to return the base object rather the
>>>> >  address of the base object, i.e.
>>>
>>> I am not sure about that.  We have to pass pointer_type expression to
>>> function determine_base_object for address expressions, because there
>>> is no way to tell pointer from object once we are in
>>> determine_base_object.
>>
>>
>> I'm just concerned with the consistency in what is returned between
>> get_inner_reference and get_inner_reference_aff.  If determine_base_object
>> expects reference only, you can probably work around it with something like:
>>
>>   base_object = build_fold_addr_expr (base_object);
>>
>> after the get_inner_reference_aff call.
>
> Note that get_inner_reference_aff is already different from get_inner_reference
> in that it does not separate base object and offset but lumps them together
> into an address represented as affine combination.
>
> For this reason get_inner_reference_aff may not exactly be the best
> name for the function ... it could be split to not do
>
>   aff_combination_add (addr, &tmp);
>
> but instead return 'base'.  The two existing users of the function could
> be unified into aff_comb_cannot_overlap_p (giving that a better name
> as well), even avoiding combining the base but instead handle
> comparing the base itself.
>
> Of course that's preexisting issues that I don't want to force Bin Cheng
> to clean up and fix (given I've, too, just spent a minute thinking about it).
>
I can investigate that later after finishing all my IVOPT patches.

Thanks,
bin
diff mbox

Patch

Index: gcc/testsuite/gcc.dg/tree-ssa/loop-2.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/loop-2.c	(revision 204117)
+++ gcc/testsuite/gcc.dg/tree-ssa/loop-2.c	(working copy)
@@ -27,7 +27,7 @@  void xxx(void)
 
 /* { dg-final { scan-tree-dump-times " \\* \[^\\n\\r\]*=" 0 "optimized" } } */
 /* { dg-final { scan-tree-dump-times "\[^\\n\\r\]*= \\* " 0 "optimized" } } */
-/* { dg-final { scan-tree-dump-times "MEM" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "MEM\\\[base" 1 "optimized" } } */
 
 /* 17 * iter should be strength reduced.  */
 
Index: gcc/testsuite/gcc.dg/tree-ssa/ivopt_infer_2.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/ivopt_infer_2.c	(revision 204117)
+++ gcc/testsuite/gcc.dg/tree-ssa/ivopt_infer_2.c	(working copy)
@@ -7,7 +7,8 @@ 
 
 extern char a[];
 
-/* Can not infer loop iteration from array -- exit test can not be replaced.  */
+/* Can not infer loop iteration from array -- exit test can not be
+   replaced by the array address.  */
 void foo (unsigned int i_width, TYPE dst)
 {
   unsigned long long i = 0;
@@ -21,5 +22,5 @@  void foo (unsigned int i_width, TYPE dst)
     }
 }
 
-/* { dg-final { scan-tree-dump-times "Replacing" 0 "ivopts"} } */
+/* { dg-final { scan-tree-dump-times "\[^:\]*if \\(.*j_\[0-9\]+.*\\)" 1 "ivopts"} } */
 /* { dg-final { cleanup-tree-dump "ivopts" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_3.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_3.c	(revision 204117)
+++ gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_3.c	(working copy)
@@ -18,5 +18,5 @@  long foo(long* p, long* p2, int N1, int N2)
   return s;
 }
 
-/* { dg-final { scan-tree-dump-times "Replacing" 1 "ivopts"} } */
+/* { dg-final { scan-tree-dump-times "Replacing exit test: if \\(.*p2.*\\)" 1 "ivopts"} } */
 /* { dg-final { cleanup-tree-dump "ivopts" } } */
Index: gcc/tree-ssa-loop-ivopts.c
===================================================================
--- gcc/tree-ssa-loop-ivopts.c	(revision 204117)
+++ gcc/tree-ssa-loop-ivopts.c	(working copy)
@@ -924,11 +924,29 @@  determine_base_object (tree expr)
 static struct iv *
 alloc_iv (tree base, tree step)
 {
+  tree expr = base;
+  tree base_object = base;
   struct iv *iv = XCNEW (struct iv);
   gcc_assert (step != NULL_TREE);
 
+  /* Lower all address expressions except ones with DECL_P as oeprand.
+     By doing this:
+       1) More accurate cost can be computed for address expressions;
+       2) Duplicate candidates won't be created for bases in different
+          forms, like &a[0] and &a.  */
+  STRIP_NOPS (expr);
+  if (TREE_CODE (expr) == ADDR_EXPR && !DECL_P (TREE_OPERAND (expr, 0)))
+    {
+      aff_tree comb;
+      double_int size;
+      base_object = get_inner_reference_aff (TREE_OPERAND (expr, 0),
+					     &comb, &size);
+      gcc_assert (base_object != NULL_TREE);
+      base = fold_convert (TREE_TYPE (base), aff_combination_to_tree (&comb));
+    }
+
   iv->base = base;
-  iv->base_object = determine_base_object (base);
+  iv->base_object = determine_base_object (base_object);
   iv->step = step;
   iv->biv_p = false;
   iv->have_use_for = false;
@@ -3481,17 +3499,21 @@  get_shiftadd_cost (tree expr, enum machine_mode mo
   int m = exact_log2 (int_cst_value (cst));
   int maxm = MIN (BITS_PER_WORD, GET_MODE_BITSIZE (mode));
   int sa_cost;
+  bool equal_p = false;
 
   if (!(m >= 0 && m < maxm))
     return false;
 
+  if (operand_equal_p (op1, mult, 0))
+    equal_p = true;
+
   sa_cost = (TREE_CODE (expr) != MINUS_EXPR
              ? shiftadd_cost (speed, mode, m)
-             : (mult == op1
+             : (equal_p
                 ? shiftsub1_cost (speed, mode, m)
                 : shiftsub0_cost (speed, mode, m)));
   res = new_cost (sa_cost, 0);
-  res = add_costs (res, mult == op1 ? cost0 : cost1);
+  res = add_costs (res, equal_p ? cost0 : cost1);
 
   STRIP_NOPS (multop);
   if (!is_gimple_val (multop))
@@ -3585,30 +3607,14 @@  force_expr_to_var_cost (tree expr, bool speed)
       op1 = TREE_OPERAND (expr, 1);
       STRIP_NOPS (op0);
       STRIP_NOPS (op1);
-
-      if (is_gimple_val (op0))
-	cost0 = no_cost;
-      else
-	cost0 = force_expr_to_var_cost (op0, speed);
-
-      if (is_gimple_val (op1))
-	cost1 = no_cost;
-      else
-	cost1 = force_expr_to_var_cost (op1, speed);
-
       break;
 
+    case NOP_EXPR:
+    case CONVERT_EXPR:
     case NEGATE_EXPR:
       op0 = TREE_OPERAND (expr, 0);
       STRIP_NOPS (op0);
       op1 = NULL_TREE;
-
-      if (is_gimple_val (op0))
-	cost0 = no_cost;
-      else
-	cost0 = force_expr_to_var_cost (op0, speed);
-
-      cost1 = no_cost;
       break;
 
     default:
@@ -3616,6 +3622,18 @@  force_expr_to_var_cost (tree expr, bool speed)
       return new_cost (target_spill_cost[speed], 0);
     }
 
+  if (op0 == NULL_TREE
+      || (is_gimple_val (op0) && (TREE_CODE (op0) != ADDR_EXPR)))
+    cost0 = no_cost;
+  else
+    cost0 = force_expr_to_var_cost (op0, speed);
+
+  if (op1 == NULL_TREE
+      || (is_gimple_val (op1) && (TREE_CODE (op1) != ADDR_EXPR)))
+    cost1 = no_cost;
+  else
+    cost1 = force_expr_to_var_cost (op1, speed);
+
   mode = TYPE_MODE (TREE_TYPE (expr));
   switch (TREE_CODE (expr))
     {
@@ -3639,7 +3657,18 @@  force_expr_to_var_cost (tree expr, bool speed)
                                     speed, &sa_cost))
             return sa_cost;
         }
+
       break;
+    case NOP_EXPR:
+    case CONVERT_EXPR:
+      {
+	tree inner_mode, outer_mode;
+	outer_mode = TREE_TYPE (expr);
+	inner_mode = TREE_TYPE (op0);
+	cost = new_cost (convert_cost (TYPE_MODE (outer_mode),
+				       TYPE_MODE (inner_mode), speed), 0);
+      }
+      break;
 
     case MULT_EXPR:
       if (cst_and_fits_in_hwi (op0))
@@ -3713,7 +3742,7 @@  split_address_cost (struct ivopts_data *data,
       *var_present = true;
       fd_ivopts_data = data;
       walk_tree (&addr, find_depends, depends_on, NULL);
-      return new_cost (target_spill_cost[data->speed], 0);
+      return force_expr_to_var_cost (addr, data->speed);
     }
 
   *offset += bitpos / BITS_PER_UNIT;
Index: gcc/tree-affine.c
===================================================================
--- gcc/tree-affine.c	(revision 204117)
+++ gcc/tree-affine.c	(working copy)
@@ -874,10 +874,11 @@  debug_aff (aff_tree *val)
   fprintf (stderr, "\n");
 }
 
-/* Returns address of the reference REF in ADDR.  The size of the accessed
-   location is stored to SIZE.  */
+/* Computes address of the reference REF in ADDR.  The size of the accessed
+   location is stored to SIZE.  Returns pointer to the ultimate containing
+   object to which REF refers.  */
 
-void
+tree
 get_inner_reference_aff (tree ref, aff_tree *addr, double_int *size)
 {
   HOST_WIDE_INT bitsize, bitpos;
@@ -904,6 +905,8 @@  get_inner_reference_aff (tree ref, aff_tree *addr,
   aff_combination_add (addr, &tmp);
 
   *size = double_int::from_shwi ((bitsize + BITS_PER_UNIT - 1) / BITS_PER_UNIT);
+
+  return base_addr;
 }
 
 /* Returns true if a region of size SIZE1 at position 0 and a region of
Index: gcc/tree-affine.h
===================================================================
--- gcc/tree-affine.h	(revision 204117)
+++ gcc/tree-affine.h	(working copy)
@@ -74,7 +74,7 @@  bool aff_combination_constant_multiple_p (aff_tree
 void aff_combination_expand (aff_tree *, struct pointer_map_t **);
 void tree_to_aff_combination_expand (tree, tree, aff_tree *,
 				     struct pointer_map_t **);
-void get_inner_reference_aff (tree, aff_tree *, double_int *);
+tree get_inner_reference_aff (tree, aff_tree *, double_int *);
 void free_affine_expand_cache (struct pointer_map_t **);
 bool aff_comb_cannot_overlap_p (aff_tree *, double_int, double_int);