diff mbox

Preserve pointer types in ivopts

Message ID 4F624077.8040103@codesourcery.com
State New
Headers show

Commit Message

Bernd Schmidt March 15, 2012, 7:18 p.m. UTC
Currently, tree-ssa-loop-ivopts assumes that pointers and integers can
be used interchangeably. It prefers to compute everything in unsigned
integer types rather than pointer types.
On a new target that I'm working on, this assumption is problematic;
casting a pointer to an integer and doing arithmetic on the integer
would be much too expensive to be useful. I came up with the patch
below, which makes ivopts try harder to preserve pointer types.
tree-affine is changed to keep track of base pointers, and do arithmetic
(in particular subtraction) properly if base pointers are involved.

Bootstrapped and regression tested with all languages on i686-linux. I
get FAILs in:

* gcc.dg/guality/pr43051-1.c
  A somewhat contrived testcase; gdb now produces "optimized out"
  messages which I think is acceptable? In any case I remain to be
  convinced that this testcase demonstrates any real problem.
* gcc.dg/tree-ssa/reassoc-19.c scan-tree-dump-times reassoc2 " \+ " 0
  This seems to fail because we do POINTER_PLUS (ptr, NEG offset))
  rather than (char *)((int)ptr - offset). As far as I can tell this
  is also not a real problem, but I don't know how to adjust the
  testcase.

Comments? Ok?


Bernd
* tree-affine.c (aff_combination_zero): Initialize baseptr.
	(aff_combination_add): Handle baseptrs.  Abort if both are set.
	(aff_combination_diff): New function.
	(tree_to_aff_combination_1): Renamed from tree_to_aff_combination.
	Remove code to handle pointers; changed to call itself recursively.
	(tree_to_aff_combination): New function.  Handle the pointer cases
	formerly found in the function of the same name, and use
	tree_to_aff_combination_1 to compute the offsets.
	(aff_combination_to_tree): Build a POINTER_PLUS_EXPR around the
	offset if the baseptr is nonnull.
	* tree-affine.h (struct affine_tree_combination): New member baseptr.
	(aff_combination_diff): Declare.
	* tree-predcom.c (determine_offset, valid_initialier_p): Use
	aff_combination_diff and return false if it fails.
	* tree-ssa-loop-ivopts.c (determine_base_object): If an non-pointer
	is cast to a pointer, return the cast.
	(add_candidate_1): Use sizetype for steps of a pointer-type iv.
	(add_old_iv_candidates): Only add a zero-base pointer candidate if
	the precision of pointers and sizetype is equal.
	(get_computation_aff): Don't convert steps to pointer types.
	Ensure pointers are not scaled. Use aff_combination_diff for
	subtraction.
	(ptr_difference_cost, difference_cost): Use aff_combination_diff and
	return infinite_cost if it fails.
	(get_loop_invariant_expr_id): Likewise, returning -1 on failure.
	(get_computation_cost_at): Fail if bad pointer expressions would be
	generated.
	(rewrite_use_nonlinear_expr): Use POINTER_PLUS_EXPR if necessary.
	* tree-ssa-address.c (addr_to_parts): Look for a baseptr in the
	aff_tree.
	* tree-ssa-loop-im.c (mem_refs_may_alias_p): Use aff_combination_diff.

testsuite/
	* gcc.dg/tree-ssa/loop-4.c: Scan only for real MEMs, not addresses of
	them.

Comments

Zdenek Dvorak March 15, 2012, 10:12 p.m. UTC | #1
Hi,

> Currently, tree-ssa-loop-ivopts assumes that pointers and integers can
> be used interchangeably. It prefers to compute everything in unsigned
> integer types rather than pointer types.
> On a new target that I'm working on, this assumption is problematic;
> casting a pointer to an integer and doing arithmetic on the integer
> would be much too expensive to be useful. I came up with the patch
> below, which makes ivopts try harder to preserve pointer types.
> tree-affine is changed to keep track of base pointers, and do arithmetic
> (in particular subtraction) properly if base pointers are involved.
> 
> Bootstrapped and regression tested with all languages on i686-linux. I
> get FAILs in:
> 
> * gcc.dg/guality/pr43051-1.c
>   A somewhat contrived testcase; gdb now produces "optimized out"
>   messages which I think is acceptable? In any case I remain to be
>   convinced that this testcase demonstrates any real problem.
> * gcc.dg/tree-ssa/reassoc-19.c scan-tree-dump-times reassoc2 " \+ " 0
>   This seems to fail because we do POINTER_PLUS (ptr, NEG offset))
>   rather than (char *)((int)ptr - offset). As far as I can tell this
>   is also not a real problem, but I don't know how to adjust the
>   testcase.
> 
> Comments? Ok?

the reason unsigned integer types are prefered is that possible overflows
during the computation have defined semantics.  With pointer types, the
intermediate steps of the computations could have undefined behavior, possibly
confusing further optimizations.  Is the patch with this regard?

Zdenek
Bernd Schmidt March 15, 2012, 11:03 p.m. UTC | #2
On 03/15/2012 11:12 PM, Zdenek Dvorak wrote:

> the reason unsigned integer types are prefered is that possible overflows
> during the computation have defined semantics.  With pointer types, the
> intermediate steps of the computations could have undefined behavior, possibly
> confusing further optimizations.  Is the patch with this regard?

It's trying to use sizetype for pointer offset computations. As far as I
can tell that's supposed to be an unsigned type, so it should be OK. I
think the final POINTER_PLUS_EXPRs we make can't overflow in valid programs.


Bernd
Jakub Jelinek March 15, 2012, 11:16 p.m. UTC | #3
On Fri, Mar 16, 2012 at 12:03:08AM +0100, Bernd Schmidt wrote:
> On 03/15/2012 11:12 PM, Zdenek Dvorak wrote:
> 
> > the reason unsigned integer types are prefered is that possible overflows
> > during the computation have defined semantics.  With pointer types, the
> > intermediate steps of the computations could have undefined behavior, possibly
> > confusing further optimizations.  Is the patch with this regard?
> 
> It's trying to use sizetype for pointer offset computations. As far as I
> can tell that's supposed to be an unsigned type, so it should be OK. I
> think the final POINTER_PLUS_EXPRs we make can't overflow in valid programs.

In the IL before ivopts it shouldn't for valid programs, but what ivopts
makes out of it often would, that is why it uses unsigned integers instead.

	Jakub
Bernd Schmidt March 15, 2012, 11:20 p.m. UTC | #4
On 03/16/2012 12:16 AM, Jakub Jelinek wrote:
> On Fri, Mar 16, 2012 at 12:03:08AM +0100, Bernd Schmidt wrote:
>> On 03/15/2012 11:12 PM, Zdenek Dvorak wrote:
>>
>>> the reason unsigned integer types are prefered is that possible overflows
>>> during the computation have defined semantics.  With pointer types, the
>>> intermediate steps of the computations could have undefined behavior, possibly
>>> confusing further optimizations.  Is the patch with this regard?
>>
>> It's trying to use sizetype for pointer offset computations. As far as I
>> can tell that's supposed to be an unsigned type, so it should be OK. I
>> think the final POINTER_PLUS_EXPRs we make can't overflow in valid programs.
> 
> In the IL before ivopts it shouldn't for valid programs, but what ivopts
> makes out of it often would, that is why it uses unsigned integers instead.

Well, what are our rules for whether overflow on POINTER_PLUS_EXPR is
defined or not? A quick search through the headers and docs doesn't turn
up anything. Would there be a downside to defining it as wrapping?

Can you show an example where a POINTER_PLUS_EXPR produced by ivopts
would overflow?


Bernd
Jakub Jelinek March 15, 2012, 11:44 p.m. UTC | #5
On Fri, Mar 16, 2012 at 12:20:44AM +0100, Bernd Schmidt wrote:
> On 03/16/2012 12:16 AM, Jakub Jelinek wrote:
> > On Fri, Mar 16, 2012 at 12:03:08AM +0100, Bernd Schmidt wrote:
> >> On 03/15/2012 11:12 PM, Zdenek Dvorak wrote:
> >>
> >>> the reason unsigned integer types are prefered is that possible overflows
> >>> during the computation have defined semantics.  With pointer types, the
> >>> intermediate steps of the computations could have undefined behavior, possibly
> >>> confusing further optimizations.  Is the patch with this regard?
> >>
> >> It's trying to use sizetype for pointer offset computations. As far as I
> >> can tell that's supposed to be an unsigned type, so it should be OK. I
> >> think the final POINTER_PLUS_EXPRs we make can't overflow in valid programs.
> > 
> > In the IL before ivopts it shouldn't for valid programs, but what ivopts
> > makes out of it often would, that is why it uses unsigned integers instead.
> 
> Well, what are our rules for whether overflow on POINTER_PLUS_EXPR is
> defined or not? A quick search through the headers and docs doesn't turn
> up anything. Would there be a downside to defining it as wrapping?
> 
> Can you show an example where a POINTER_PLUS_EXPR produced by ivopts
> would overflow?

Don't have a testcase right now, I've just seen IVOPTS many times in the
past initialize an IV with start of array minus some constant, end of array
plus some constant or similar (which is fine if the IV is unsigned integer
of the size of a pointer, but certainly wouldn't be fine if the IV had
pointer type).  For pointer arithmetics in the IL we assume the C
requirements, pointer arithmetics can be performed only within the same
object, so for
int a[10];
both of the following are invalid, even in the IL:
int *p = a - 1;
int *q = a + 11;

	Jakub
Zdenek Dvorak March 16, 2012, 12:02 a.m. UTC | #6
Hi,

> > Well, what are our rules for whether overflow on POINTER_PLUS_EXPR is
> > defined or not? A quick search through the headers and docs doesn't turn
> > up anything. Would there be a downside to defining it as wrapping?
> > 
> > Can you show an example where a POINTER_PLUS_EXPR produced by ivopts
> > would overflow?
> 
> Don't have a testcase right now, I've just seen IVOPTS many times in the
> past initialize an IV with start of array minus some constant, end of array
> plus some constant or similar (which is fine if the IV is unsigned integer
> of the size of a pointer, but certainly wouldn't be fine if the IV had
> pointer type).  For pointer arithmetics in the IL we assume the C
> requirements, pointer arithmetics can be performed only within the same
> object, so for
> int a[10];
> both of the following are invalid, even in the IL:
> int *p = a - 1;
> int *q = a + 11;

for example, something like this:

int a[1000];

void foo(int n)
{
  int i, *p = a;

  for (i = 8; i < n; i++)
    *p++ = 10;
}

ivopts may decide to change this to:

int a[1000];

void foo(int n)
{
  int i, *p = a - 8;

  for (i = 8; i < n; i++)
    p[i] = 10;
}

which may require one less adition, depending on the available addressing modes.
Of course, as written above this has undefined behavior; hence, the casts to
unsigned integer,

Zdenek
Bernd Schmidt March 16, 2012, 12:09 a.m. UTC | #7
On 03/16/2012 12:44 AM, Jakub Jelinek wrote:
> For pointer arithmetics in the IL we assume the C
> requirements, pointer arithmetics can be performed only within the same
> object, so for
> int a[10];
> both of the following are invalid, even in the IL:
> int *p = a - 1;
> int *q = a + 11;

Ok, so what's the solution? Add a second POINTER_PLUS_EXPR code with
defined overflow behaviour, or add a flag to it?


Bernd
Andrew Pinski March 16, 2012, 12:13 a.m. UTC | #8
On Thu, Mar 15, 2012 at 5:09 PM, Bernd Schmidt <bernds@codesourcery.com> wrote:
> On 03/16/2012 12:44 AM, Jakub Jelinek wrote:
>> For pointer arithmetics in the IL we assume the C
>> requirements, pointer arithmetics can be performed only within the same
>> object, so for
>> int a[10];
>> both of the following are invalid, even in the IL:
>> int *p = a - 1;
>> int *q = a + 11;
>
> Ok, so what's the solution? Add a second POINTER_PLUS_EXPR code with
> defined overflow behaviour, or add a flag to it?

We should have one for PLUS_EXPR also.  There was some movement on
that on a branch that Richard Guenther did but I don't know if he is
going to work on it further.  I have been noticing more and more the
need for this feature while working on my tree combiner branch, that I
might try to see if Richard's branch can be revisited.

Thanks,
Andrew Pinski
Richard Biener March 16, 2012, 10:08 a.m. UTC | #9
On Fri, Mar 16, 2012 at 1:13 AM, Andrew Pinski <pinskia@gmail.com> wrote:
> On Thu, Mar 15, 2012 at 5:09 PM, Bernd Schmidt <bernds@codesourcery.com> wrote:
>> On 03/16/2012 12:44 AM, Jakub Jelinek wrote:
>>> For pointer arithmetics in the IL we assume the C
>>> requirements, pointer arithmetics can be performed only within the same
>>> object, so for
>>> int a[10];
>>> both of the following are invalid, even in the IL:
>>> int *p = a - 1;
>>> int *q = a + 11;
>>
>> Ok, so what's the solution? Add a second POINTER_PLUS_EXPR code with
>> defined overflow behaviour, or add a flag to it?
>
> We should have one for PLUS_EXPR also.  There was some movement on
> that on a branch that Richard Guenther did but I don't know if he is
> going to work on it further.  I have been noticing more and more the
> need for this feature while working on my tree combiner branch, that I
> might try to see if Richard's branch can be revisited.

The branch was a too big task at one time, so presently I'm trying to get rid
of the speciality of "sizetypes" first and then plan to revisit the branch.  To
recap - on the branch we have an explicit notion on whether a operation
can overflow or not (where "not overflowing" is equivalent to "undefined
behavior if overflow happens").  Operations are marked either way by
choosing different tree codes.

See http://gcc.gnu.org/wiki/NoUndefinedOverflow

Unfortunately updating the branch stopped before tuplification (heh ...).
So I guess it will need to be re-done from start.  And yes, it's a massive
effort.  But in theory you can do the massaging incrementally - so, in
your case add only POINTER_PLUSNV_EXPR and make sure to
drop that to POINTER_PLUS_EXPR whenever you are no longer sure
that the operation follows the C language constraints of pointer arithmetic
(and change all folders that assume that semantic to work only on
POINTER_PLUSNV_EXPRs).  Or do it the other way around (which of
course will be less conservatively correct - something I'm no longer 100%
sure about).

Your patch as-is is not safe at the moment.  But why is casting a pointer
to an integer prohibitly expensive?  That fact should be an implementation
detail of GIMPLE.  Or is it the issue that IVOPTs chooses an integer
type that does not necessarily match the mode we'll use on RTL?
(thus, ptr_mode vs. Pmode issues, and/or sizeof (sizetype) != sizeof (void *)
issues?)

Richard.
Bernd Schmidt March 16, 2012, 12:36 p.m. UTC | #10
On 03/16/2012 11:08 AM, Richard Guenther wrote:
> 
> Your patch as-is is not safe at the moment.  But why is casting a pointer
> to an integer prohibitly expensive?  That fact should be an implementation
> detail of GIMPLE.  Or is it the issue that IVOPTs chooses an integer
> type that does not necessarily match the mode we'll use on RTL?
> (thus, ptr_mode vs. Pmode issues, and/or sizeof (sizetype) != sizeof (void *)
> issues?)

The machine is "special". Pointer addition is a different operation than
integer addition. It'll also need a new ptr_plus rtx code which takes a
Pmode and an SImode operand. Pmode is larger than SImode but fits in a
single register; intptr_t (which is what we'd need to use if we freely
cast between pointers and integers is DImode - that requires two regs
and can't be used for memory addressing.


Bernd
Richard Biener March 16, 2012, 12:55 p.m. UTC | #11
On Fri, Mar 16, 2012 at 1:36 PM, Bernd Schmidt <bernds@codesourcery.com> wrote:
> On 03/16/2012 11:08 AM, Richard Guenther wrote:
>>
>> Your patch as-is is not safe at the moment.  But why is casting a pointer
>> to an integer prohibitly expensive?  That fact should be an implementation
>> detail of GIMPLE.  Or is it the issue that IVOPTs chooses an integer
>> type that does not necessarily match the mode we'll use on RTL?
>> (thus, ptr_mode vs. Pmode issues, and/or sizeof (sizetype) != sizeof (void *)
>> issues?)
>
> The machine is "special". Pointer addition is a different operation than
> integer addition. It'll also need a new ptr_plus rtx code which takes a
> Pmode and an SImode operand. Pmode is larger than SImode but fits in a
> single register; intptr_t (which is what we'd need to use if we freely
> cast between pointers and integers is DImode - that requires two regs
> and can't be used for memory addressing.

Hm, I see.  On RTL would you get away with using REG_POINTER and
paradoxical subregs for the offset operand?  Sth like (add (reg:DI
ptr) (subreg:DI (reg:SI off)))?  Or even use a plain sign/zero_extend
which is what expansion
would generate at the moment I think.  I suppose the HW either sign- or
zero-extends that offset operand anyway.

Basically your issue on trees is that pointer information is lost and (wide)
integer operations appear?

Note that IVOPTs at the moment does not try to generate lea-style
pointer arithmetic while it could use &TARGET_MEM_REF for this.
We still assume that those addresses are within the object of
operand zero (but if you use a literal zero as that operand and put
the base somewhere else you might use that as a vehicle for
telling GCC the arithmetic is not within C language scope).

Richard.
Bernd Schmidt March 16, 2012, 1:05 p.m. UTC | #12
On 03/16/2012 01:55 PM, Richard Guenther wrote:
> On Fri, Mar 16, 2012 at 1:36 PM, Bernd Schmidt <bernds@codesourcery.com> wrote:
>>
>> The machine is "special". Pointer addition is a different operation than
>> integer addition. It'll also need a new ptr_plus rtx code which takes a
>> Pmode and an SImode operand. Pmode is larger than SImode but fits in a
>> single register; intptr_t (which is what we'd need to use if we freely
>> cast between pointers and integers is DImode - that requires two regs
>> and can't be used for memory addressing.
> 
> Hm, I see.  On RTL would you get away with using REG_POINTER and
> paradoxical subregs for the offset operand?  Sth like (add (reg:DI
> ptr) (subreg:DI (reg:SI off)))?

No, because (reg:DI ptr) is the wrong representation. Pointers only take
a single reg, and besides such subreg games would be extremely nasty.
There's also the problem that this really isn't an arithmetic plus,
since the top bits of the pointer are unaffected. Hence it doesn't
commute: lea is a different operation than add. There is no other
arithmetic that operates on Pmode, so it is impossible to use that mode
for integer types. Well, not impossible - I have existence proof in the
form of the port I inherited - but you have to lie pretty heavily to the
compiler to even just make it limp along.

> Basically your issue on trees is that pointer information is lost and (wide)
> integer operations appear?

That's the main issue, yes.

> Note that IVOPTs at the moment does not try to generate lea-style
> pointer arithmetic while it could use &TARGET_MEM_REF for this.
> We still assume that those addresses are within the object of
> operand zero (but if you use a literal zero as that operand and put
> the base somewhere else you might use that as a vehicle for
> telling GCC the arithmetic is not within C language scope).

Oh, ok. So IIUC maybe I can simply adjust the patch to still create
POINTER_PLUS_EXPR in memory addresses, but use &TARGET_MEM_REF for the
arithmetic in initializing the ivs?


Bernd
Richard Biener March 16, 2012, 1:15 p.m. UTC | #13
On Fri, Mar 16, 2012 at 2:05 PM, Bernd Schmidt <bernds@codesourcery.com> wrote:
> On 03/16/2012 01:55 PM, Richard Guenther wrote:
>> On Fri, Mar 16, 2012 at 1:36 PM, Bernd Schmidt <bernds@codesourcery.com> wrote:
>>>
>>> The machine is "special". Pointer addition is a different operation than
>>> integer addition. It'll also need a new ptr_plus rtx code which takes a
>>> Pmode and an SImode operand. Pmode is larger than SImode but fits in a
>>> single register; intptr_t (which is what we'd need to use if we freely
>>> cast between pointers and integers is DImode - that requires two regs
>>> and can't be used for memory addressing.
>>
>> Hm, I see.  On RTL would you get away with using REG_POINTER and
>> paradoxical subregs for the offset operand?  Sth like (add (reg:DI
>> ptr) (subreg:DI (reg:SI off)))?
>
> No, because (reg:DI ptr) is the wrong representation. Pointers only take
> a single reg, and besides such subreg games would be extremely nasty.
> There's also the problem that this really isn't an arithmetic plus,
> since the top bits of the pointer are unaffected. Hence it doesn't
> commute: lea is a different operation than add. There is no other
> arithmetic that operates on Pmode, so it is impossible to use that mode
> for integer types. Well, not impossible - I have existence proof in the
> form of the port I inherited - but you have to lie pretty heavily to the
> compiler to even just make it limp along.

Hmm, ok.  So to have your lea represented on RTL you cannot use
a fancy pattern using add because it would not be an add.  Hmm.
If it merely does not affect upper bits then

 (set (reg:SI ptr) (add (reg:SI ptr) (reg:SI off))

might work (if Pmode aka DImode is word_mode of course - otherwise
the upper bits would be lost ...).

Or of course an UNSPEC generated during address legitimization.
Do we even have sth like address legitimization for pointer arithmetic?
I suppose we should, when expanding POINTER_PLUS_EXPRs.
Maybe simply dispatching through a two-mode optab for expanding
POINTER_PLUS_EXPRs would suit you?

>> Basically your issue on trees is that pointer information is lost and (wide)
>> integer operations appear?
>
> That's the main issue, yes.
>
>> Note that IVOPTs at the moment does not try to generate lea-style
>> pointer arithmetic while it could use &TARGET_MEM_REF for this.
>> We still assume that those addresses are within the object of
>> operand zero (but if you use a literal zero as that operand and put
>> the base somewhere else you might use that as a vehicle for
>> telling GCC the arithmetic is not within C language scope).
>
> Oh, ok. So IIUC maybe I can simply adjust the patch to still create
> POINTER_PLUS_EXPR in memory addresses, but use &TARGET_MEM_REF for the
> arithmetic in initializing the ivs?

Yes, that could work.  Though it might end up being incredibly ugly ;)

Richard.

>
> Bernd
Bernd Schmidt March 16, 2012, 1:18 p.m. UTC | #14
On 03/16/2012 02:15 PM, Richard Guenther wrote:
> Hmm, ok.  So to have your lea represented on RTL you cannot use
> a fancy pattern using add because it would not be an add.  Hmm.
> If it merely does not affect upper bits then
> 
>  (set (reg:SI ptr) (add (reg:SI ptr) (reg:SI off))
> 
> might work (if Pmode aka DImode is word_mode of course - otherwise
> the upper bits would be lost ...).

It's not.

Really, this isn't a problem right now. I already have a patch which
makes a ptr_plus RTX code and handles it everywhere.

> Maybe simply dispatching through a two-mode optab for expanding
> POINTER_PLUS_EXPRs would suit you?

padd_optab, yes.

> Yes, that could work.  Though it might end up being incredibly ugly ;)

In the code? Should really only change a few lines in the patch I would
have thought. I'll get back to you when I have a new version - thanks
for the tip.


Bernd
Richard Biener March 16, 2012, 1:20 p.m. UTC | #15
On Fri, Mar 16, 2012 at 2:18 PM, Bernd Schmidt <bernds@codesourcery.com> wrote:
> On 03/16/2012 02:15 PM, Richard Guenther wrote:
>> Hmm, ok.  So to have your lea represented on RTL you cannot use
>> a fancy pattern using add because it would not be an add.  Hmm.
>> If it merely does not affect upper bits then
>>
>>  (set (reg:SI ptr) (add (reg:SI ptr) (reg:SI off))
>>
>> might work (if Pmode aka DImode is word_mode of course - otherwise
>> the upper bits would be lost ...).
>
> It's not.
>
> Really, this isn't a problem right now. I already have a patch which
> makes a ptr_plus RTX code and handles it everywhere.

Fair enough.

>> Maybe simply dispatching through a two-mode optab for expanding
>> POINTER_PLUS_EXPRs would suit you?
>
> padd_optab, yes.
>
>> Yes, that could work.  Though it might end up being incredibly ugly ;)
>
> In the code? Should really only change a few lines in the patch I would
> have thought. I'll get back to you when I have a new version - thanks
> for the tip.

No, in the GIMPLE IL.

Richard.

>
> Bernd
Zdenek Dvorak March 16, 2012, 1:33 p.m. UTC | #16
> >> Yes, that could work. ?Though it might end up being incredibly ugly ;)
> >
> > In the code? Should really only change a few lines in the patch I would
> > have thought. I'll get back to you when I have a new version - thanks
> > for the tip.
> 
> No, in the GIMPLE IL.
> 
> Richard.

And I can just imagine how happy would other optimizations be about such a change.
Are we speaking about something that you would like to get into mainline?
All of this looks like a lot of hacks to deal with a rather weird target.

Zdenek
Bernd Schmidt March 16, 2012, 1:37 p.m. UTC | #17
On 03/16/2012 02:33 PM, Zdenek Dvorak wrote:
>>>> Yes, that could work. ?Though it might end up being incredibly ugly ;)
>>>
>>> In the code? Should really only change a few lines in the patch I would
>>> have thought. I'll get back to you when I have a new version - thanks
>>> for the tip.
>>
>> No, in the GIMPLE IL.
>>
>> Richard.
> 
> And I can just imagine how happy would other optimizations be about such a change.
> Are we speaking about something that you would like to get into mainline?
> All of this looks like a lot of hacks to deal with a rather weird target.

The final piece - using &TARGET_MEM_REF rather than POINTER_PLUS_EXPR to
avoid overflow issues - is clearly a bit of a hack. But otherwise,
keeping around the proper types rather than casting everything to
unsigned int is IMO much less ugly and hackish than what we have now.
Certainly the gimple IL is much less cluttered after the patch than before.


Bernd
Richard Biener March 16, 2012, 1:46 p.m. UTC | #18
On Fri, Mar 16, 2012 at 2:37 PM, Bernd Schmidt <bernds@codesourcery.com> wrote:
> On 03/16/2012 02:33 PM, Zdenek Dvorak wrote:
>>>>> Yes, that could work. ?Though it might end up being incredibly ugly ;)
>>>>
>>>> In the code? Should really only change a few lines in the patch I would
>>>> have thought. I'll get back to you when I have a new version - thanks
>>>> for the tip.
>>>
>>> No, in the GIMPLE IL.
>>>
>>> Richard.
>>
>> And I can just imagine how happy would other optimizations be about such a change.
>> Are we speaking about something that you would like to get into mainline?
>> All of this looks like a lot of hacks to deal with a rather weird target.
>
> The final piece - using &TARGET_MEM_REF rather than POINTER_PLUS_EXPR to
> avoid overflow issues - is clearly a bit of a hack. But otherwise,
> keeping around the proper types rather than casting everything to
> unsigned int is IMO much less ugly and hackish than what we have now.
> Certainly the gimple IL is much less cluttered after the patch than before.

Btw, at one point I thought we might use Pmode integer types as base
"pointer" types for [TARGET_]MEM_REF.  But of course your target
shows that this might not be a good idea.

We do have three pointer offsetting operations at the moment,
POINTER_PLUS_EXPR, &MEM_REF and &TARGET_MEM_REF.
All of which expect C pointer semantics (if they can see the base
pointer, which is why I suggested to create a TARGET_MEM_REF
with a NULL pointer TMR_BASE and shove the base pointer to
TMR_INDEX2 (which incidentially needs to be an integer type ...).

In the end what we want is a POINTER_PLUS_EXPR variant
that does not make alias-analysis assume the result still points
to within the objects the pointer pointed to before the increment/decrement.

Btw, how do you handle pointer differences?  The frontend already
lowers them to (intptr_t)ptr1 - (intptr_t)ptr2.

Richard.

> Bernd
Bernd Schmidt March 16, 2012, 1:52 p.m. UTC | #19
On 03/16/2012 02:46 PM, Richard Guenther wrote:
> We do have three pointer offsetting operations at the moment,
> POINTER_PLUS_EXPR, &MEM_REF and &TARGET_MEM_REF.
> All of which expect C pointer semantics (if they can see the base
> pointer, which is why I suggested to create a TARGET_MEM_REF
> with a NULL pointer TMR_BASE and shove the base pointer to
> TMR_INDEX2 (which incidentially needs to be an integer type ...).

Oh; looks like misunderstood your suggestion about TARGET_MEM_REF. Maybe
I'll just have to bite the bullet and make a new tree code.

> Btw, how do you handle pointer differences?  The frontend already
> lowers them to (intptr_t)ptr1 - (intptr_t)ptr2.

More new stuff: targetm.pointer_mode_real_precision. Also, modifications
in tree-ssa-loop-niter to use the new baseptr capability in tree-affine
and subtract just the offsets in sizetype if possible.


Bernd
Bernd Schmidt March 16, 2012, 3:32 p.m. UTC | #20
On 03/16/2012 02:46 PM, Richard Guenther wrote:
> In the end what we want is a POINTER_PLUS_EXPR variant
> that does not make alias-analysis assume the result still points
> to within the objects the pointer pointed to before the increment/decrement.

Hold on, is alias analysis really affected by this? Sure, we create
temporary pointers that point outside their base object, but we don't
dereference them. Anything value that ends up in a MEM_REF can only
point into that object again.

I would have thought that it's mainly in fold-const where a
POINTER_PLUSV would be handled differently from POINTER_PLUS (e.g. in
pointer comparisons).


Bernd
Zdenek Dvorak March 16, 2012, 3:49 p.m. UTC | #21
Hello,

> On 03/16/2012 02:46 PM, Richard Guenther wrote:
> > In the end what we want is a POINTER_PLUS_EXPR variant
> > that does not make alias-analysis assume the result still points
> > to within the objects the pointer pointed to before the increment/decrement.
> 
> Hold on, is alias analysis really affected by this? Sure, we create
> temporary pointers that point outside their base object, but we don't
> dereference them. Anything value that ends up in a MEM_REF can only
> point into that object again.

it can be affected; by standard pointer arithmetics rules, you cannot
create a pointer that would be outside of an object (unless you convert
it from an integer).  Thus, alias analysis may deduce that if we have
something like

int a[100];
int *p = a + 10;
int *q = p - i;
*(q + 5) = 1;
a[1] = 2;

then q points inside a, and consequently q + 5 >= a + 5, hence the
assignments do not alias.  Ivopts may however produce this (in an equivalent
form with casts to unsigned) even if i > 10.

Zdenek
Richard Biener March 16, 2012, 4:34 p.m. UTC | #22
On Fri, Mar 16, 2012 at 4:49 PM, Zdenek Dvorak <rakdver@iuuk.mff.cuni.cz> wrote:
> Hello,
>
>> On 03/16/2012 02:46 PM, Richard Guenther wrote:
>> > In the end what we want is a POINTER_PLUS_EXPR variant
>> > that does not make alias-analysis assume the result still points
>> > to within the objects the pointer pointed to before the increment/decrement.
>>
>> Hold on, is alias analysis really affected by this? Sure, we create
>> temporary pointers that point outside their base object, but we don't
>> dereference them. Anything value that ends up in a MEM_REF can only
>> point into that object again.
>
> it can be affected; by standard pointer arithmetics rules, you cannot
> create a pointer that would be outside of an object (unless you convert
> it from an integer).  Thus, alias analysis may deduce that if we have
> something like
>
> int a[100];
> int *p = a + 10;
> int *q = p - i;
> *(q + 5) = 1;
> a[1] = 2;
>
> then q points inside a, and consequently q + 5 >= a + 5, hence the
> assignments do not alias.  Ivopts may however produce this (in an equivalent
> form with casts to unsigned) even if i > 10.

See also the various invalid PRs that essentially do

foo (int *q)
{
 int i;
 int *p = &i;
 long x = q - p;
 *(p + x) = 1;
}

thus, cleverly encode address-space differences relative to an unrelated
object (seen in Boost, for cross-shmem pointers for example).  C obviously
does not allow the "q - p" operation, and points-to analysis thinks that
p + x points to i and thus we remove the unused store to the local variable.

Another alias assumption is in offset-based analysis which assumes you
cannot have a pointer to before the object, so *(T *)(p + 1) may not
alias a global of type 'T' (because by seeing p + 1 and the access of type
T we conclude that the object pointed to is at least of size sizeof (T) + 1).

Richard.

> Zdenek
Richard Henderson March 16, 2012, 8:49 p.m. UTC | #23
On 03/16/12 05:36, Bernd Schmidt wrote:
> The machine is "special". Pointer addition is a different operation than
> integer addition. It'll also need a new ptr_plus rtx code which takes a
> Pmode and an SImode operand. Pmode is larger than SImode but fits in a
> single register; intptr_t (which is what we'd need to use if we freely
> cast between pointers and integers is DImode - that requires two regs
> and can't be used for memory addressing.

Surely the least amount of work is to not use sizetype/intptr_t, but a
custom type that has the same bit-width as a pointer?


r~
Bernd Schmidt March 16, 2012, 11:37 p.m. UTC | #24
On 03/16/2012 09:49 PM, Richard Henderson wrote:
> On 03/16/12 05:36, Bernd Schmidt wrote:
>> The machine is "special". Pointer addition is a different operation than
>> integer addition. It'll also need a new ptr_plus rtx code which takes a
>> Pmode and an SImode operand. Pmode is larger than SImode but fits in a
>> single register; intptr_t (which is what we'd need to use if we freely
>> cast between pointers and integers is DImode - that requires two regs
>> and can't be used for memory addressing.
> 
> Surely the least amount of work is to not use sizetype/intptr_t, but a
> custom type that has the same bit-width as a pointer?

See some of the later mails in this thread for more details - the
pointer add instruction isn't commutative. There is no true Pmode add,
only a Pmode lea; in fact there is no normal arithmetic in Pmode at all.
So you can't afford to lose information about which operand is the
pointer. Making Pmode integer types would mean lying to the compiler
very heavily about what operations are available, and it becomes way too
hackish very quickly.


Bernd
diff mbox

Patch

Index: gcc/tree-ssa-loop-im.c
===================================================================
--- gcc/tree-ssa-loop-im.c	(revision 184938)
+++ gcc/tree-ssa-loop-im.c	(working copy)
@@ -1772,8 +1772,8 @@  mem_refs_may_alias_p (tree mem1, tree me
   get_inner_reference_aff (mem2, &off2, &size2);
   aff_combination_expand (&off1, ttae_cache);
   aff_combination_expand (&off2, ttae_cache);
-  aff_combination_scale (&off1, double_int_minus_one);
-  aff_combination_add (&off2, &off1);
+  if (!aff_combination_diff (&off2, &off1))
+    return true;
 
   if (aff_comb_cannot_overlap_p (&off2, size1, size2))
     return false;
Index: gcc/testsuite/gcc.dg/tree-ssa/loop-4.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/loop-4.c	(revision 184938)
+++ gcc/testsuite/gcc.dg/tree-ssa/loop-4.c	(working copy)
@@ -37,7 +37,7 @@  void xxx(void)
 
 /* { dg-final { scan-tree-dump-times " \\* \[^\\n\\r\]*=" 0 "optimized" } } */
 /* { dg-final { scan-tree-dump-times "\[^\\n\\r\]*= \\* " 0 "optimized" } } */
-/* { dg-final { scan-tree-dump-times "MEM" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\[^&\]MEM" 1 "optimized" } } */
 
 /* And the original induction variable should be eliminated.  */
 
Index: gcc/tree-ssa-loop-ivopts.c
===================================================================
--- gcc/tree-ssa-loop-ivopts.c	(revision 184938)
+++ gcc/tree-ssa-loop-ivopts.c	(working copy)
@@ -879,15 +879,21 @@  static tree
 determine_base_object (tree expr)
 {
   enum tree_code code = TREE_CODE (expr);
+  tree type = TREE_TYPE (expr);
   tree base, obj;
 
   /* If this is a pointer casted to any type, we need to determine
      the base object for the pointer; so handle conversions before
      throwing away non-pointer expressions.  */
   if (CONVERT_EXPR_P (expr))
-    return determine_base_object (TREE_OPERAND (expr, 0));
+    {
+      tree op = TREE_OPERAND (expr, 0);
+      if (POINTER_TYPE_P (type) && !POINTER_TYPE_P (TREE_TYPE (op)))
+	return expr;
+      return determine_base_object (op);
+    }
 
-  if (!POINTER_TYPE_P (TREE_TYPE (expr)))
+  if (!POINTER_TYPE_P (type))
     return NULL_TREE;
 
   switch (code)
@@ -2225,10 +2231,17 @@  add_candidate_1 (struct ivopts_data *dat
     {
       orig_type = TREE_TYPE (base);
       type = generic_type_for (orig_type);
-      if (type != orig_type)
+      if (TREE_CODE (orig_type) == POINTER_TYPE)
 	{
-	  base = fold_convert (type, base);
-	  step = fold_convert (type, step);
+	  if (TREE_TYPE (step) != sizetype)
+	    step = fold_convert (sizetype, step);
+	}
+      else
+	{
+	  if (type != TREE_TYPE (step))
+	    step = fold_convert (type, step);
+	  if (type != orig_type)
+	    base = fold_convert (type, base);
 	}
     }
 
@@ -2436,16 +2449,20 @@  static void
 add_old_iv_candidates (struct ivopts_data *data, struct iv *iv)
 {
   gimple phi;
-  tree def;
+  tree def, basetype;
   struct iv_cand *cand;
 
   add_candidate (data, iv->base, iv->step, true, NULL);
 
   /* The same, but with initial value zero.  */
-  if (POINTER_TYPE_P (TREE_TYPE (iv->base)))
-    add_candidate (data, size_int (0), iv->step, true, NULL);
+  basetype = TREE_TYPE (iv->base);
+  if (POINTER_TYPE_P (basetype))
+    {
+      if (TYPE_PRECISION (sizetype) == TYPE_PRECISION (basetype))
+	add_candidate (data, size_int (0), iv->step, true, NULL);
+    }
   else
-    add_candidate (data, build_int_cst (TREE_TYPE (iv->base), 0),
+    add_candidate (data, build_int_cst (basetype, 0),
 		   iv->step, true, NULL);
 
   phi = SSA_NAME_DEF_STMT (iv->ssa_name);
@@ -2979,10 +2996,12 @@  get_computation_aff (struct loop *loop,
      overflows, as all the arithmetics will in the end be performed in UUTYPE
      anyway.  */
   common_type = determine_common_wider_type (&ubase, &cbase);
+  if (!POINTER_TYPE_P (common_type) || POINTER_TYPE_P (ctype))
+    ctype = common_type;
 
   /* use = ubase - ratio * cbase + ratio * var.  */
   tree_to_aff_combination (ubase, common_type, aff);
-  tree_to_aff_combination (cbase, common_type, &cbase_aff);
+  tree_to_aff_combination (cbase, ctype, &cbase_aff);
   tree_to_aff_combination (var, uutype, &var_aff);
 
   /* We need to shift the value if we are after the increment.  */
@@ -2990,21 +3009,33 @@  get_computation_aff (struct loop *loop,
     {
       aff_tree cstep_aff;
 
-      if (common_type != uutype)
-	cstep_common = fold_convert (common_type, cstep);
-      else
-	cstep_common = cstep;
+      cstep_common = cstep;
+      if (POINTER_TYPE_P (ctype))
+	{
+	  if (TREE_TYPE (cstep) != sizetype)
+	    cstep_common = fold_convert (sizetype, cstep);
+	}
+      else if (cbase_aff.type != TREE_TYPE (cstep))
+	cstep_common = fold_convert (ctype, cstep);
 
-      tree_to_aff_combination (cstep_common, common_type, &cstep_aff);
+      tree_to_aff_combination (cstep_common, ctype, &cstep_aff);
       aff_combination_add (&cbase_aff, &cstep_aff);
     }
 
-  aff_combination_scale (&cbase_aff, double_int_neg (rat));
-  aff_combination_add (aff, &cbase_aff);
+  if (!double_int_one_p (rat))
+    {
+      if (cbase_aff.baseptr != NULL_TREE
+	  || var_aff.baseptr != NULL_TREE)
+	return false;
+      aff_combination_scale (&cbase_aff, rat);
+      aff_combination_scale (&var_aff, rat);
+    }
+  if (!aff_combination_diff (aff, &cbase_aff))
+    return false;
+
   if (common_type != uutype)
     aff_combination_convert (aff, uutype);
 
-  aff_combination_scale (&var_aff, rat);
   aff_combination_add (aff, &var_aff);
 
   return true;
@@ -3799,8 +3830,8 @@  ptr_difference_cost (struct ivopts_data
   type = signed_type_for (TREE_TYPE (e1));
   tree_to_aff_combination (e1, type, &aff_e1);
   tree_to_aff_combination (e2, type, &aff_e2);
-  aff_combination_scale (&aff_e2, double_int_minus_one);
-  aff_combination_add (&aff_e1, &aff_e2);
+  if (!aff_combination_diff (&aff_e1, &aff_e2))
+    return infinite_cost;
 
   return force_var_cost (data, aff_combination_to_tree (&aff_e1), depends_on);
 }
@@ -3854,8 +3885,8 @@  difference_cost (struct ivopts_data *dat
   type = signed_type_for (TREE_TYPE (e1));
   tree_to_aff_combination (e1, type, &aff_e1);
   tree_to_aff_combination (e2, type, &aff_e2);
-  aff_combination_scale (&aff_e2, double_int_minus_one);
-  aff_combination_add (&aff_e1, &aff_e2);
+  if (!aff_combination_diff (&aff_e1, &aff_e2))
+    return infinite_cost;
 
   return force_var_cost (data, aff_combination_to_tree (&aff_e1), depends_on);
 }
@@ -3999,8 +4030,9 @@  get_loop_invariant_expr_id (struct ivopt
   tree_to_aff_combination (ub, TREE_TYPE (ub), &ubase_aff);
   tree_to_aff_combination (cb, TREE_TYPE (cb), &cbase_aff);
 
-  aff_combination_scale (&cbase_aff, shwi_to_double_int (-1 * ratio));
-  aff_combination_add (&ubase_aff, &cbase_aff);
+  aff_combination_scale (&cbase_aff, shwi_to_double_int (ratio));
+  if (!aff_combination_diff (&ubase_aff, &cbase_aff))
+    return -1;
   expr = aff_combination_to_tree (&ubase_aff);
   return get_expr_id (data, expr);
 }
@@ -4064,6 +4096,17 @@  get_computation_cost_at (struct ivopts_d
 	  && !operand_equal_p (use->iv->base_object, cand->iv->base_object, 0))
 	return infinite_cost;
     }
+  /* Eliminate a few other odd cases where we'd end up adding pointers
+     together.  Again, this is unlikely to be useful and should only occur
+     in contrived testcases.  */
+  if (POINTER_TYPE_P (TREE_TYPE (cand->iv->base))
+      && POINTER_TYPE_P (TREE_TYPE (use->iv->base))
+      && (TREE_CODE (use->iv->base) == INTEGER_CST
+	  || TREE_CODE (cand->iv->base) == INTEGER_CST))
+    return infinite_cost;
+  if (POINTER_TYPE_P (TREE_TYPE (cand->iv->base))
+      && !POINTER_TYPE_P (TREE_TYPE (use->iv->base)))
+    return infinite_cost;
 
   if (TYPE_PRECISION (utype) < TYPE_PRECISION (ctype))
     {
@@ -6179,7 +6222,9 @@  rewrite_use_nonlinear_expr (struct ivopt
       step = cand->iv->step;
       ctype = TREE_TYPE (step);
       utype = TREE_TYPE (cand->var_after);
-      if (TREE_CODE (step) == NEGATE_EXPR)
+      if (TREE_CODE (TREE_TYPE (cand->iv->base)) == POINTER_TYPE)
+	incr_code = POINTER_PLUS_EXPR;
+      else if (TREE_CODE (step) == NEGATE_EXPR)
 	{
 	  incr_code = MINUS_EXPR;
 	  step = TREE_OPERAND (step, 0);
@@ -6213,10 +6258,14 @@  rewrite_use_nonlinear_expr (struct ivopt
 
       /* Otherwise, add the necessary computations to express
 	 the iv.  */
-      op = fold_convert (ctype, cand->var_before);
-      comp = fold_convert (utype,
-			   build2 (incr_code, ctype, op,
-				   unshare_expr (step)));
+      op = cand->var_before;
+      if (incr_code == POINTER_PLUS_EXPR)
+	comp = build2 (incr_code, utype, op, unshare_expr (step));
+      else
+	comp = fold_convert (utype,
+			     build2 (incr_code, ctype,
+				     fold_convert (ctype, op),
+				     unshare_expr (step)));
     }
   else
     {
Index: gcc/tree-ssa-address.c
===================================================================
--- gcc/tree-ssa-address.c	(revision 184938)
+++ gcc/tree-ssa-address.c	(working copy)
@@ -648,7 +648,9 @@  addr_to_parts (tree type, aff_tree *addr
   /* Try to find a base of the reference.  Since at the moment
      there is no reliable way how to distinguish between pointer and its
      offset, this is just a guess.  */
-  if (!parts->symbol && base_hint)
+  if (addr->baseptr != NULL_TREE)
+    parts->base = addr->baseptr;
+  if (!parts->symbol && base_hint && !parts->base)
     move_hint_to_base (type, parts, base_hint, addr);
   if (!parts->symbol && !parts->base)
     move_pointer_to_base (parts, addr);
Index: gcc/tree-data-ref.c
===================================================================
--- gcc/tree-data-ref.c	(revision 184938)
+++ gcc/tree-data-ref.c	(working copy)
@@ -1362,8 +1362,8 @@  dr_may_alias_p (const struct data_refere
       double_int size1, size2;
       get_inner_reference_aff (DR_REF (a), &off1, &size1);
       get_inner_reference_aff (DR_REF (b), &off2, &size2);
-      aff_combination_scale (&off1, double_int_minus_one);
-      aff_combination_add (&off2, &off1);
+      if (!aff_combination_diff (&off2, &off1))
+	return true;
       if (aff_comb_cannot_overlap_p (&off2, size1, size2))
 	return false;
     }
Index: gcc/tree-affine.c
===================================================================
--- gcc/tree-affine.c	(revision 184938)
+++ gcc/tree-affine.c	(working copy)
@@ -46,6 +46,7 @@  aff_combination_zero (aff_tree *comb, tr
   comb->offset = double_int_zero;
   comb->n = 0;
   comb->rest = NULL_TREE;
+  comb->baseptr = NULL_TREE;
 }
 
 /* Sets COMB to CST.  */
@@ -202,6 +203,14 @@  aff_combination_add (aff_tree *comb1, af
 {
   unsigned i;
 
+  if (comb2->baseptr != NULL_TREE)
+    {
+      if (comb1->baseptr == NULL_TREE)
+	comb1->baseptr = comb2->baseptr;
+      else
+	gcc_unreachable ();
+    }
+
   aff_combination_add_cst (comb1, comb2->offset);
   for (i = 0; i < comb2->n; i++)
     aff_combination_add_elt (comb1, comb2->elts[i].val, comb2->elts[i].coef);
@@ -209,6 +218,44 @@  aff_combination_add (aff_tree *comb1, af
     aff_combination_add_elt (comb1, comb2->rest, double_int_one);
 }
 
+/* Subtracts COMB2 from COMB1, storing in COMB1.  Returns true on success,
+   false if a failure is encountered (i.e. mismatching base pointers).
+   If FORCE_SIZETYPE is true, we will add the result to a pointer;
+   make sure any base pointers are squashed and the result computed in
+   sizetype.  */
+
+bool
+aff_combination_diff (aff_tree *comb1, aff_tree *comb2)
+{
+  tree c1_base = comb1->baseptr;
+  tree c2_base = comb2->baseptr;
+  unsigned i;
+
+  if (c2_base != NULL_TREE)
+    {
+      if (c1_base != NULL_TREE && operand_equal_p (c1_base, c2_base, 0))
+	{
+	  comb1->baseptr = NULL_TREE;
+	  comb1->type = sizetype;
+	  c2_base = c1_base = NULL_TREE;
+	}
+      else
+	{
+	  tree val;
+	  val = fold_convert (sizetype, c2_base);
+	  aff_combination_add_elt (comb1, val, double_int_minus_one);
+	}
+    }
+
+  aff_combination_add_cst (comb1, double_int_neg (comb2->offset));
+  for (i = 0; i < comb2->n; i++)
+    aff_combination_add_elt (comb1, comb2->elts[i].val,
+			     double_int_neg (comb2->elts[i].coef));
+  if (comb2->rest)
+    aff_combination_add_elt (comb1, comb2->rest, double_int_minus_one);
+  return true;
+}
+
 /* Converts affine combination COMB to TYPE.  */
 
 void
@@ -252,17 +299,12 @@  aff_combination_convert (aff_tree *comb,
     }
 }
 
-/* Splits EXPR into an affine combination of parts.  */
-
-void
-tree_to_aff_combination (tree expr, tree type, aff_tree *comb)
+static void
+tree_to_aff_combination_1 (tree expr, tree type, aff_tree *comb)
 {
   aff_tree tmp;
   enum tree_code code;
-  tree cst, core, toffset;
-  HOST_WIDE_INT bitpos, bitsize;
-  enum machine_mode mode;
-  int unsignedp, volatilep;
+  tree cst;
 
   STRIP_NOPS (expr);
 
@@ -273,16 +315,10 @@  tree_to_aff_combination (tree expr, tree
       aff_combination_const (comb, type, tree_to_double_int (expr));
       return;
 
-    case POINTER_PLUS_EXPR:
-      tree_to_aff_combination (TREE_OPERAND (expr, 0), type, comb);
-      tree_to_aff_combination (TREE_OPERAND (expr, 1), sizetype, &tmp);
-      aff_combination_add (comb, &tmp);
-      return;
-
     case PLUS_EXPR:
     case MINUS_EXPR:
-      tree_to_aff_combination (TREE_OPERAND (expr, 0), type, comb);
-      tree_to_aff_combination (TREE_OPERAND (expr, 1), type, &tmp);
+      tree_to_aff_combination_1 (TREE_OPERAND (expr, 0), type, comb);
+      tree_to_aff_combination_1 (TREE_OPERAND (expr, 1), type, &tmp);
       if (code == MINUS_EXPR)
 	aff_combination_scale (&tmp, double_int_minus_one);
       aff_combination_add (comb, &tmp);
@@ -292,54 +328,22 @@  tree_to_aff_combination (tree expr, tree
       cst = TREE_OPERAND (expr, 1);
       if (TREE_CODE (cst) != INTEGER_CST)
 	break;
-      tree_to_aff_combination (TREE_OPERAND (expr, 0), type, comb);
+      tree_to_aff_combination_1 (TREE_OPERAND (expr, 0), type, comb);
       aff_combination_scale (comb, tree_to_double_int (cst));
       return;
 
     case NEGATE_EXPR:
-      tree_to_aff_combination (TREE_OPERAND (expr, 0), type, comb);
+      tree_to_aff_combination_1 (TREE_OPERAND (expr, 0), type, comb);
       aff_combination_scale (comb, double_int_minus_one);
       return;
 
     case BIT_NOT_EXPR:
       /* ~x = -x - 1 */
-      tree_to_aff_combination (TREE_OPERAND (expr, 0), type, comb);
+      tree_to_aff_combination_1 (TREE_OPERAND (expr, 0), type, comb);
       aff_combination_scale (comb, double_int_minus_one);
       aff_combination_add_cst (comb, double_int_minus_one);
       return;
 
-    case ADDR_EXPR:
-      /* Handle &MEM[ptr + CST] which is equivalent to POINTER_PLUS_EXPR.  */
-      if (TREE_CODE (TREE_OPERAND (expr, 0)) == MEM_REF)
-	{
-	  expr = TREE_OPERAND (expr, 0);
-	  tree_to_aff_combination (TREE_OPERAND (expr, 0), type, comb);
-	  tree_to_aff_combination (TREE_OPERAND (expr, 1), sizetype, &tmp);
-	  aff_combination_add (comb, &tmp);
-	  return;
-	}
-      core = get_inner_reference (TREE_OPERAND (expr, 0), &bitsize, &bitpos,
-				  &toffset, &mode, &unsignedp, &volatilep,
-				  false);
-      if (bitpos % BITS_PER_UNIT != 0)
-	break;
-      aff_combination_const (comb, type,
-			     uhwi_to_double_int (bitpos / BITS_PER_UNIT));
-      core = build_fold_addr_expr (core);
-      if (TREE_CODE (core) == ADDR_EXPR)
-	aff_combination_add_elt (comb, core, double_int_one);
-      else
-	{
-	  tree_to_aff_combination (core, type, &tmp);
-	  aff_combination_add (comb, &tmp);
-	}
-      if (toffset)
-	{
-	  tree_to_aff_combination (toffset, type, &tmp);
-	  aff_combination_add (comb, &tmp);
-	}
-      return;
-
     case MEM_REF:
       if (TREE_CODE (TREE_OPERAND (expr, 0)) == ADDR_EXPR)
 	tree_to_aff_combination (TREE_OPERAND (TREE_OPERAND (expr, 0), 0),
@@ -366,6 +370,88 @@  tree_to_aff_combination (tree expr, tree
   aff_combination_elt (comb, type, expr);
 }
 
+/* Splits EXPR into an affine combination of parts.  */
+
+void
+tree_to_aff_combination (tree expr, tree type, aff_tree *comb)
+{
+  aff_tree tmp;
+  enum tree_code code;
+  tree orig_expr = expr;
+  tree expr_type = TREE_TYPE (expr);
+
+  STRIP_NOPS (expr);
+
+  if (!POINTER_TYPE_P (expr_type)
+      || TREE_CODE (expr) == INTEGER_CST)
+    {
+      tree_to_aff_combination_1 (expr, type, comb);
+      return;
+    }
+
+  aff_combination_const (comb, sizetype, double_int_zero);
+  comb->type = expr_type;
+  if (!POINTER_TYPE_P (TREE_TYPE (expr)))
+    {
+      comb->baseptr = orig_expr;
+      return;
+    }
+
+  for (;;)
+    {
+      orig_expr = expr;
+      STRIP_NOPS (expr);
+      code = TREE_CODE (expr);
+      if (code == POINTER_PLUS_EXPR)
+ 	{
+	  tree_to_aff_combination_1 (TREE_OPERAND (expr, 1), sizetype, &tmp);
+	  expr = TREE_OPERAND (expr, 0);
+ 	  aff_combination_add (comb, &tmp);
+ 	}
+      else if (code == ADDR_EXPR)
+ 	{
+	  tree core, toffset;
+	  HOST_WIDE_INT bitpos, bitsize;
+	  enum machine_mode mode;
+	  int unsignedp, volatilep;
+
+	  /* Handle &MEM[ptr + CST] which is equivalent to POINTER_PLUS_EXPR.  */
+	  if (TREE_CODE (TREE_OPERAND (expr, 0)) == MEM_REF)
+	    {
+	      expr = TREE_OPERAND (expr, 0);
+	      tree_to_aff_combination_1 (TREE_OPERAND (expr, 1), sizetype, &tmp);
+	      expr = TREE_OPERAND (expr, 0);
+	      aff_combination_add (comb, &tmp);
+	      continue;
+	    }
+	  core = get_inner_reference (TREE_OPERAND (expr, 0), &bitsize,
+				      &bitpos, &toffset, &mode, &unsignedp,
+				      &volatilep, false);
+	  if (bitpos % BITS_PER_UNIT != 0)
+	    break;
+	  aff_combination_const (&tmp, sizetype,
+				 uhwi_to_double_int (bitpos / BITS_PER_UNIT));
+ 	  aff_combination_add (comb, &tmp);
+ 
+	  expr = build_fold_addr_expr (core);
+	  if (toffset)
+	    {
+	      tree_to_aff_combination_1 (toffset, sizetype, &tmp);
+	      aff_combination_add (comb, &tmp);
+	    }
+	  if (TREE_CODE (expr) == ADDR_EXPR
+	      && TREE_CODE (TREE_OPERAND (expr, 0)) != MEM_REF)
+	    {
+	      orig_expr = expr;
+	      break;
+	    }
+	}
+      else
+	break;
+     }
+  comb->baseptr = orig_expr;
+}
+
 /* Creates EXPR + ELT * SCALE in TYPE.  EXPR is taken from affine
    combination COMB.  */
 
@@ -375,6 +461,7 @@  add_elt_to_tree (tree expr, tree type, t
 {
   enum tree_code code;
   tree type1 = type;
+
   if (POINTER_TYPE_P (type))
     type1 = sizetype;
 
@@ -435,6 +522,7 @@  aff_combination_to_tree (aff_tree *comb)
 {
   tree type = comb->type;
   tree expr = NULL_TREE;
+  tree baseptr = comb->baseptr;
   unsigned i;
   double_int off, sgn;
   tree type1 = type;
@@ -444,11 +532,11 @@  aff_combination_to_tree (aff_tree *comb)
   gcc_assert (comb->n == MAX_AFF_ELTS || comb->rest == NULL_TREE);
 
   for (i = 0; i < comb->n; i++)
-    expr = add_elt_to_tree (expr, type, comb->elts[i].val, comb->elts[i].coef,
+    expr = add_elt_to_tree (expr, type1, comb->elts[i].val, comb->elts[i].coef,
 			    comb);
 
   if (comb->rest)
-    expr = add_elt_to_tree (expr, type, comb->rest, double_int_one, comb);
+    expr = add_elt_to_tree (expr, type1, comb->rest, double_int_one, comb);
 
   /* Ensure that we get x - 1, not x + (-1) or x + 0xff..f if x is
      unsigned.  */
@@ -462,8 +550,14 @@  aff_combination_to_tree (aff_tree *comb)
       off = comb->offset;
       sgn = double_int_one;
     }
-  return add_elt_to_tree (expr, type, double_int_to_tree (type1, off), sgn,
+  expr = add_elt_to_tree (expr, type1, double_int_to_tree (type1, off), sgn,
 			  comb);
+  if (baseptr != NULL_TREE)
+    expr = fold_build2 (POINTER_PLUS_EXPR, TREE_TYPE (baseptr),
+			baseptr, expr);
+  else
+    expr = fold_convert (type, expr);
+  return expr;
 }
 
 /* Copies the tree elements of COMB to ensure that they are not shared.  */
Index: gcc/tree-affine.h
===================================================================
--- gcc/tree-affine.h	(revision 184938)
+++ gcc/tree-affine.h	(working copy)
@@ -35,6 +35,9 @@  struct aff_comb_elt
 
 typedef struct affine_tree_combination
 {
+  /* A base pointer, added in an outermost POINTER_PLUS_EXPR.  */
+  tree baseptr;
+
   /* Type of the result of the combination.  */
   tree type;
 
@@ -64,6 +67,7 @@  void aff_combination_elt (aff_tree *, tr
 void aff_combination_scale (aff_tree *, double_int);
 void aff_combination_mult (aff_tree *, aff_tree *, aff_tree *);
 void aff_combination_add (aff_tree *, aff_tree *);
+bool aff_combination_diff (aff_tree *, aff_tree *);
 void aff_combination_add_elt (aff_tree *, tree, double_int);
 void aff_combination_remove_elt (aff_tree *, unsigned);
 void aff_combination_convert (aff_tree *, tree);
Index: gcc/tree-predcom.c
===================================================================
--- gcc/tree-predcom.c	(revision 184938)
+++ gcc/tree-predcom.c	(working copy)
@@ -665,8 +665,8 @@  determine_offset (struct data_reference
      is a multiple of step.  */
   aff_combination_dr_offset (a, &diff);
   aff_combination_dr_offset (b, &baseb);
-  aff_combination_scale (&baseb, double_int_minus_one);
-  aff_combination_add (&diff, &baseb);
+  if (!aff_combination_diff (&diff, &baseb))
+    return false;
 
   tree_to_aff_combination_expand (DR_STEP (a), TREE_TYPE (DR_STEP (a)),
 				  &step, &name_expansions);
@@ -1048,8 +1048,8 @@  valid_initializer_p (struct data_referen
      -DISTANCE-th iteration.  */
   aff_combination_dr_offset (root, &diff);
   aff_combination_dr_offset (ref, &base);
-  aff_combination_scale (&base, double_int_minus_one);
-  aff_combination_add (&diff, &base);
+  if (!aff_combination_diff (&diff, &base))
+    return false;
 
   tree_to_aff_combination_expand (DR_STEP (root), TREE_TYPE (DR_STEP (root)),
 				  &step, &name_expansions);