diff mbox

RFA (fold): PATCH for c++/49290 (folding *(T*)(ar+10))

Message ID 4DEDB98F.6010508@redhat.com
State New
Headers show

Commit Message

Jason Merrill June 7, 2011, 5:39 a.m. UTC
In the testcase, fold_indirect_ref_1 won't fold *(T*)(s1+10) to an 
ARRAY_REF because T != unsigned.  Even if it were just a typedef to 
unsigned, that isn't close enough, but in this case it's a typedef to 
const unsigned.

I'm not sure what the type coherence rules are for ARRAY_REF.  Is it 
really necessary that the type of the ARRAY_REF match exactly the 
element type of the array?

In any case, constexpr expansion can be more flexible about type 
coherence because it is just trying to get a constant value; if that 
doesn't work out, we throw it away and fall back on the original 
expression.  We already handle some cases in cxx_eval_indirect_ref that 
aren't appropriate for fold_indirect_ref_1, but this testcase 
demonstrates that we also want to adjust the cases that are handled by 
that function.

So my options would seem to be either duplicating the whole of 
fold_indirect_ref_1, which tempts undesirable divergence, or adding a 
flag to that function to enable the more permissive type checking that 
the constexpr code wants.

Does this seem like a reasonable thing to do?

Comments

Richard Biener June 7, 2011, 10:19 a.m. UTC | #1
On Tue, Jun 7, 2011 at 7:39 AM, Jason Merrill <jason@redhat.com> wrote:
> In the testcase, fold_indirect_ref_1 won't fold *(T*)(s1+10) to an ARRAY_REF
> because T != unsigned.  Even if it were just a typedef to unsigned, that
> isn't close enough, but in this case it's a typedef to const unsigned.
>
> I'm not sure what the type coherence rules are for ARRAY_REF.  Is it really
> necessary that the type of the ARRAY_REF match exactly the element type of
> the array?

I _think_ that you can unconditionally change the code to do

  TYPE_MAIN_VARIANT (t1) == TYPE_MAIN_VARIANT (t2)
  && TYPE_QUALS (t1) == TYPE_QUALS (t2)

now, I'm not sure if for the testcase T and unsigned differ in qualifiers.
Do they?

> In any case, constexpr expansion can be more flexible about type coherence
> because it is just trying to get a constant value; if that doesn't work out,
> we throw it away and fall back on the original expression.  We already
> handle some cases in cxx_eval_indirect_ref that aren't appropriate for
> fold_indirect_ref_1, but this testcase demonstrates that we also want to
> adjust the cases that are handled by that function.
>
> So my options would seem to be either duplicating the whole of
> fold_indirect_ref_1, which tempts undesirable divergence, or adding a flag
> to that function to enable the more permissive type checking that the
> constexpr code wants.
>
> Does this seem like a reasonable thing to do?

So, I'd rather go the above way if it works for this case.

Thanks,
Richard.

>
Jakub Jelinek June 7, 2011, 10:27 a.m. UTC | #2
On Tue, Jun 07, 2011 at 12:19:59PM +0200, Richard Guenther wrote:
> On Tue, Jun 7, 2011 at 7:39 AM, Jason Merrill <jason@redhat.com> wrote:
> > In the testcase, fold_indirect_ref_1 won't fold *(T*)(s1+10) to an ARRAY_REF
> > because T != unsigned.  Even if it were just a typedef to unsigned, that
> > isn't close enough, but in this case it's a typedef to const unsigned.
> >
> > I'm not sure what the type coherence rules are for ARRAY_REF.  Is it really
> > necessary that the type of the ARRAY_REF match exactly the element type of
> > the array?
> 
> I _think_ that you can unconditionally change the code to do
> 
>   TYPE_MAIN_VARIANT (t1) == TYPE_MAIN_VARIANT (t2)
>   && TYPE_QUALS (t1) == TYPE_QUALS (t2)
> 
> now, I'm not sure if for the testcase T and unsigned differ in qualifiers.

I guess folding into array_ref that way is fine, but you should in the end
fold_convert_loc it to the expected type, while the middle-end has the
notion of useless type conversions, fold-const.c is also used by FEs and
I think it is expected to have the types exactly matching.
So (T)s1[10] instead of s1[10] in this case.

	Jakub
Richard Biener June 7, 2011, 12:03 p.m. UTC | #3
On Tue, Jun 7, 2011 at 12:27 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Tue, Jun 07, 2011 at 12:19:59PM +0200, Richard Guenther wrote:
>> On Tue, Jun 7, 2011 at 7:39 AM, Jason Merrill <jason@redhat.com> wrote:
>> > In the testcase, fold_indirect_ref_1 won't fold *(T*)(s1+10) to an ARRAY_REF
>> > because T != unsigned.  Even if it were just a typedef to unsigned, that
>> > isn't close enough, but in this case it's a typedef to const unsigned.
>> >
>> > I'm not sure what the type coherence rules are for ARRAY_REF.  Is it really
>> > necessary that the type of the ARRAY_REF match exactly the element type of
>> > the array?
>>
>> I _think_ that you can unconditionally change the code to do
>>
>>   TYPE_MAIN_VARIANT (t1) == TYPE_MAIN_VARIANT (t2)
>>   && TYPE_QUALS (t1) == TYPE_QUALS (t2)
>>
>> now, I'm not sure if for the testcase T and unsigned differ in qualifiers.
>
> I guess folding into array_ref that way is fine, but you should in the end
> fold_convert_loc it to the expected type, while the middle-end has the
> notion of useless type conversions, fold-const.c is also used by FEs and
> I think it is expected to have the types exactly matching.
> So (T)s1[10] instead of s1[10] in this case.

I'm not sure that's a good idea if the caller wants an lvalue.

Richard.

>        Jakub
>
Richard Biener June 7, 2011, 12:22 p.m. UTC | #4
On Tue, 7 Jun 2011, Richard Guenther wrote:

> On Tue, Jun 7, 2011 at 12:27 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> > On Tue, Jun 07, 2011 at 12:19:59PM +0200, Richard Guenther wrote:
> >> On Tue, Jun 7, 2011 at 7:39 AM, Jason Merrill <jason@redhat.com> wrote:
> >> > In the testcase, fold_indirect_ref_1 won't fold *(T*)(s1+10) to an ARRAY_REF
> >> > because T != unsigned.  Even if it were just a typedef to unsigned, that
> >> > isn't close enough, but in this case it's a typedef to const unsigned.
> >> >
> >> > I'm not sure what the type coherence rules are for ARRAY_REF.  Is it really
> >> > necessary that the type of the ARRAY_REF match exactly the element type of
> >> > the array?
> >>
> >> I _think_ that you can unconditionally change the code to do
> >>
> >>   TYPE_MAIN_VARIANT (t1) == TYPE_MAIN_VARIANT (t2)
> >>   && TYPE_QUALS (t1) == TYPE_QUALS (t2)
> >>
> >> now, I'm not sure if for the testcase T and unsigned differ in qualifiers.
> >
> > I guess folding into array_ref that way is fine, but you should in the end
> > fold_convert_loc it to the expected type, while the middle-end has the
> > notion of useless type conversions, fold-const.c is also used by FEs and
> > I think it is expected to have the types exactly matching.
> > So (T)s1[10] instead of s1[10] in this case.
> 
> I'm not sure that's a good idea if the caller wants an lvalue.

Rather build the array-ref with type T directly (thus, with a mismatch
between the type of the array-ref and the element type).

Richard.
Michael Matz June 7, 2011, 1:45 p.m. UTC | #5
Hi,

On Tue, 7 Jun 2011, Richard Guenther wrote:

> > > fold_convert_loc it to the expected type, while the middle-end has 
> > > the notion of useless type conversions, fold-const.c is also used by 
> > > FEs and I think it is expected to have the types exactly matching. 
> > > So (T)s1[10] instead of s1[10] in this case.
> > 
> > I'm not sure that's a good idea if the caller wants an lvalue.
> 
> Rather build the array-ref with type T directly (thus, with a mismatch 
> between the type of the array-ref and the element type).

Ick.  Sooner or later such inconsistency will bite us.  It always does.


Ciao,
Michael.
Richard Biener June 7, 2011, 1:49 p.m. UTC | #6
On Tue, 7 Jun 2011, Michael Matz wrote:

> Hi,
> 
> On Tue, 7 Jun 2011, Richard Guenther wrote:
> 
> > > > fold_convert_loc it to the expected type, while the middle-end has 
> > > > the notion of useless type conversions, fold-const.c is also used by 
> > > > FEs and I think it is expected to have the types exactly matching. 
> > > > So (T)s1[10] instead of s1[10] in this case.
> > > 
> > > I'm not sure that's a good idea if the caller wants an lvalue.
> > 
> > Rather build the array-ref with type T directly (thus, with a mismatch 
> > between the type of the array-ref and the element type).
> 
> Ick.  Sooner or later such inconsistency will bite us.  It always does.

There are already some similar "weak" type matchings done in fold-const.c.

But yeah ... I'm sure adding strong type checks to the tree checker
will detect may pre-existing inconsistencies...

Richard.
Jason Merrill June 7, 2011, 1:55 p.m. UTC | #7
On 06/07/2011 06:19 AM, Richard Guenther wrote:
> I _think_ that you can unconditionally change the code to do
>
>    TYPE_MAIN_VARIANT (t1) == TYPE_MAIN_VARIANT (t2)
>    &&  TYPE_QUALS (t1) == TYPE_QUALS (t2)
>
> now, I'm not sure if for the testcase T and unsigned differ in qualifiers.
> Do they?

Hmm, I think with the changes I made to the testcase they end up with 
the same qualifiers.  But for constexpr I need to handle them having 
different qualifiers, too.

Jason
Richard Biener June 7, 2011, 2:05 p.m. UTC | #8
On Tue, Jun 7, 2011 at 3:55 PM, Jason Merrill <jason@redhat.com> wrote:
> On 06/07/2011 06:19 AM, Richard Guenther wrote:
>>
>> I _think_ that you can unconditionally change the code to do
>>
>>   TYPE_MAIN_VARIANT (t1) == TYPE_MAIN_VARIANT (t2)
>>   &&  TYPE_QUALS (t1) == TYPE_QUALS (t2)
>>
>> now, I'm not sure if for the testcase T and unsigned differ in qualifiers.
>> Do they?
>
> Hmm, I think with the changes I made to the testcase they end up with the
> same qualifiers.  But for constexpr I need to handle them having different
> qualifiers, too.

In that case you could do what Jakub suggested - but only for rvalues
of course.  I'm not sure if we already avoid calling the folding where we
require lvalues.

Can't you instead adjust the type you feed to fold_indirect_ref_1 in
the caller in the C++ FE?

Richard.

> Jason
>
Jason Merrill June 7, 2011, 2:24 p.m. UTC | #9
On 06/07/2011 10:05 AM, Richard Guenther wrote:
> In that case you could do what Jakub suggested - but only for rvalues
> of course.

Right, and I need to handle lvalues as well.

> I'm not sure if we already avoid calling the folding where we
> require lvalues.

No, we don't.

> Can't you instead adjust the type you feed to fold_indirect_ref_1 in
> the caller in the C++ FE?

Not without digging down into the operand to see what type it wants.  At 
that point I might as well just copy the whole function into the FE.

Jason
Jason Merrill June 9, 2011, 7:32 p.m. UTC | #10
On 06/07/2011 10:24 AM, Jason Merrill wrote:
> On 06/07/2011 10:05 AM, Richard Guenther wrote:
>> In that case you could do what Jakub suggested - but only for rvalues
>> of course.
>
> Right, and I need to handle lvalues as well.
>
>> Can't you instead adjust the type you feed to fold_indirect_ref_1 in
>> the caller in the C++ FE?
>
> Not without digging down into the operand to see what type it wants. At
> that point I might as well just copy the whole function into the FE.

Ping?

Jason
Richard Biener June 10, 2011, 8:35 a.m. UTC | #11
On Thu, 9 Jun 2011, Jason Merrill wrote:

> On 06/07/2011 10:24 AM, Jason Merrill wrote:
> > On 06/07/2011 10:05 AM, Richard Guenther wrote:
> > > In that case you could do what Jakub suggested - but only for rvalues
> > > of course.
> > 
> > Right, and I need to handle lvalues as well.
> > 
> > > Can't you instead adjust the type you feed to fold_indirect_ref_1 in
> > > the caller in the C++ FE?
> > 
> > Not without digging down into the operand to see what type it wants. At
> > that point I might as well just copy the whole function into the FE.
> 
> Ping?

I'm out of good suggestions ;)  You can do the same-qualifier matching
and simply have a mismatched array element vs. array-ref type.  We
could also argue that whoever calls fold_indirect_ref_1 with TYPE
that doesn't even have TYPE_MAIN_VARIANT (TREE_TYPE (TREE_TYPE (op0 (!)))
== TYPE_MAIN_VARIANT (type) is broken.  Thus we could argue that
even ignoring qualifiers is ok - but I'd be worried about folding
*((volatile int *)&a[0] + 1) to a[1] with lost volatile qualification.

Richard.
Jason Merrill June 10, 2011, 2:01 p.m. UTC | #12
On 06/10/2011 04:35 AM, Richard Guenther wrote:
> I'm out of good suggestions ;)  You can do the same-qualifier matching
> and simply have a mismatched array element vs. array-ref type.

But I need to allow different qualifiers, too.

> We could also argue that whoever calls fold_indirect_ref_1 with TYPE
> that doesn't even have TYPE_MAIN_VARIANT (TREE_TYPE (TREE_TYPE (op0 (!)))
> == TYPE_MAIN_VARIANT (type) is broken.

Right, I only want to fold if the main variants match.

> Thus we could argue that
> even ignoring qualifiers is ok - but I'd be worried about folding
> *((volatile int *)&a[0] + 1) to a[1] with lost volatile qualification.

Right.

It would be correct to fold it to

VIEW_CONVERT_EXPR<volatile int,a[1]>

but I'm not sure how well front ends would deal with that.  Maybe I'll 
try it and see.

Jason
Richard Biener June 10, 2011, 2:03 p.m. UTC | #13
On Fri, 10 Jun 2011, Jason Merrill wrote:

> On 06/10/2011 04:35 AM, Richard Guenther wrote:
> > I'm out of good suggestions ;)  You can do the same-qualifier matching
> > and simply have a mismatched array element vs. array-ref type.
> 
> But I need to allow different qualifiers, too.
> 
> > We could also argue that whoever calls fold_indirect_ref_1 with TYPE
> > that doesn't even have TYPE_MAIN_VARIANT (TREE_TYPE (TREE_TYPE (op0 (!)))
> > == TYPE_MAIN_VARIANT (type) is broken.
> 
> Right, I only want to fold if the main variants match.
> 
> > Thus we could argue that
> > even ignoring qualifiers is ok - but I'd be worried about folding
> > *((volatile int *)&a[0] + 1) to a[1] with lost volatile qualification.
> 
> Right.
> 
> It would be correct to fold it to
> 
> VIEW_CONVERT_EXPR<volatile int,a[1]>

No, it wouldn't be correct.  It isn't correct to fold it to an array-ref
that isn't volatile.

Richard.
Jason Merrill June 10, 2011, 2:12 p.m. UTC | #14
On 06/10/2011 10:03 AM, Richard Guenther wrote:
>>> *((volatile int *)&a[0] + 1)
>>
>> It would be correct to fold it to
>>
>> VIEW_CONVERT_EXPR<volatile int,a[1]>
>
> No, it wouldn't be correct.  It isn't correct to fold it to an array-ref
> that isn't volatile.

Hmm?  The C expression produces a volatile int lvalue referring to the 
second element of a, as does the VIEW_CONVERT_EXPR.  They seem 
equivalent to me.

Jason
Richard Biener June 10, 2011, 2:20 p.m. UTC | #15
On Fri, 10 Jun 2011, Jason Merrill wrote:

> On 06/10/2011 10:03 AM, Richard Guenther wrote:
> > > > *((volatile int *)&a[0] + 1)
> > > 
> > > It would be correct to fold it to
> > > 
> > > VIEW_CONVERT_EXPR<volatile int,a[1]>
> > 
> > No, it wouldn't be correct.  It isn't correct to fold it to an array-ref
> > that isn't volatile.
> 
> Hmm?  The C expression produces a volatile int lvalue referring to the second
> element of a, as does the VIEW_CONVERT_EXPR.  They seem equivalent to me.

no, a VIEW_CONVERT_EXPR is generally not an lvalue (fold for example
would turn the above to (volatile int) a[1]).

Richard.
Jason Merrill June 10, 2011, 2:32 p.m. UTC | #16
On 06/10/2011 10:20 AM, Richard Guenther wrote:
> no, a VIEW_CONVERT_EXPR is generally not an lvalue (fold for example
> would turn the above to (volatile int) a[1]).

Really?  Seems like it would be a lot more useful if it were an lvalue.

I guess I'll just copy the whole function into the front end.

Jason
Mike Stump June 11, 2011, 5:45 p.m. UTC | #17
On Jun 10, 2011, at 7:20 AM, Richard Guenther wrote:
> On Fri, 10 Jun 2011, Jason Merrill wrote:
> 
>> On 06/10/2011 10:03 AM, Richard Guenther wrote:
>>>>> *((volatile int *)&a[0] + 1)
>>>> 
>>>> It would be correct to fold it to
>>>> 
>>>> VIEW_CONVERT_EXPR<volatile int,a[1]>
>>> 
>>> No, it wouldn't be correct.  It isn't correct to fold it to an array-ref
>>> that isn't volatile.
>> 
>> Hmm?  The C expression produces a volatile int lvalue referring to the second
>> element of a, as does the VIEW_CONVERT_EXPR.  They seem equivalent to me.
> 
> no, a VIEW_CONVERT_EXPR is generally not an lvalue (fold for example
> would turn the above to (volatile int) a[1]).

Curious.  We have built up a built-in infrastructure that allows for lvalue register references.  I noticed that for vector types, vectors with different type names but the same in every other respect come out different, and a VIEW_CONVERT_EXPR is placed on it to get the types to match.  Presently I'm treating VIEW_CONVERT_EXPR as an lvalue.  For me not to, I'd need either for the same type to be used, or, for another conversion node to be used that can preserve the lvalueness of registers.

Now, if people want to know why, lvalue for registers, it is to support in/out and output only parameters to built-ins.

Thoughts?
Jason Merrill June 11, 2011, 8:01 p.m. UTC | #18
On 06/10/2011 10:20 AM, Richard Guenther wrote:
> no, a VIEW_CONVERT_EXPR is generally not an lvalue (fold for example
> would turn the above to (volatile int) a[1]).

The gimplifier seems to consider it an lvalue: gimplify_expr uses 
gimplify_compound_lval for it, and gimplify_addr_expr handles taking its 
address.  And get_inner_reference handles it.  So I think fold should be 
changed, and we should clarify that VIEW_CONVERT_EXPR is an lvalue.

If not, we need a new tree code for treating an lvalue as an lvalue of a 
different type without having to take its address; that's what I thought 
VIEW_CONVERT_EXPR was for.

Jason
Richard Biener June 12, 2011, 10:55 a.m. UTC | #19
On Sat, Jun 11, 2011 at 7:45 PM, Mike Stump <mikestump@comcast.net> wrote:
> On Jun 10, 2011, at 7:20 AM, Richard Guenther wrote:
>> On Fri, 10 Jun 2011, Jason Merrill wrote:
>>
>>> On 06/10/2011 10:03 AM, Richard Guenther wrote:
>>>>>> *((volatile int *)&a[0] + 1)
>>>>>
>>>>> It would be correct to fold it to
>>>>>
>>>>> VIEW_CONVERT_EXPR<volatile int,a[1]>
>>>>
>>>> No, it wouldn't be correct.  It isn't correct to fold it to an array-ref
>>>> that isn't volatile.
>>>
>>> Hmm?  The C expression produces a volatile int lvalue referring to the second
>>> element of a, as does the VIEW_CONVERT_EXPR.  They seem equivalent to me.
>>
>> no, a VIEW_CONVERT_EXPR is generally not an lvalue (fold for example
>> would turn the above to (volatile int) a[1]).
>
> Curious.  We have built up a built-in infrastructure that allows for lvalue register references.  I noticed that for vector types, vectors with different type names but the same in every other respect come out different, and a VIEW_CONVERT_EXPR is placed on it to get the types to match.  Presently I'm treating VIEW_CONVERT_EXPR as an lvalue.  For me not to, I'd need either for the same type to be used, or, for another conversion node to be used that can preserve the lvalueness of registers.
>
> Now, if people want to know why, lvalue for registers, it is to support in/out and output only parameters to built-ins.
>
> Thoughts?

In almost all cases(*) the need for a lvalue VIEW_CONVERT_EXPR can be avoided
by moving the VIEW_CONVERT_EXPR to the rvalue assigned too it.  Remember that
VIEW_CONVERT_EXPR always conver the full object and are not allowed to
change sizes.

So, do you have an example?

Richard.

(*) Ada uses lvalue component-refs on VIEW_CONVERT_EXPRs of aggregate types.
While I don't like it too much it's probably not too convenient (even
if it is always
possible) to move these to the RHS of assignments.
Richard Biener June 12, 2011, 10:59 a.m. UTC | #20
On Sat, Jun 11, 2011 at 10:01 PM, Jason Merrill <jason@redhat.com> wrote:
> On 06/10/2011 10:20 AM, Richard Guenther wrote:
>>
>> no, a VIEW_CONVERT_EXPR is generally not an lvalue (fold for example
>> would turn the above to (volatile int) a[1]).
>
> The gimplifier seems to consider it an lvalue: gimplify_expr uses
> gimplify_compound_lval for it, and gimplify_addr_expr handles taking its
> address.  And get_inner_reference handles it.  So I think fold should be
> changed, and we should clarify that VIEW_CONVERT_EXPR is an lvalue.
>
> If not, we need a new tree code for treating an lvalue as an lvalue of a
> different type without having to take its address; that's what I thought
> VIEW_CONVERT_EXPR was for.

The please provide a specification on what a VIEW_CONVERT_EXPR does
to type-based alias analysis.  We are trying to avoid that by the rvalue rule.
Also you can always avoid VIEW_CONVERT_EXPRs for lvalues by simply
moving the conversion to the rvalue side.

Yes, we do handle lvalue VIEW_CONVERT_EXPRs, but that is for Ada
which uses it for aggregates.  I don't want us to add more lvalue
VIEW_CONVERT_EXPR
cases, especially not for register types.

Richard.

> Jason
>
Richard Biener June 12, 2011, 11:03 a.m. UTC | #21
On Sun, Jun 12, 2011 at 12:59 PM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Sat, Jun 11, 2011 at 10:01 PM, Jason Merrill <jason@redhat.com> wrote:
>> On 06/10/2011 10:20 AM, Richard Guenther wrote:
>>>
>>> no, a VIEW_CONVERT_EXPR is generally not an lvalue (fold for example
>>> would turn the above to (volatile int) a[1]).
>>
>> The gimplifier seems to consider it an lvalue: gimplify_expr uses
>> gimplify_compound_lval for it, and gimplify_addr_expr handles taking its
>> address.  And get_inner_reference handles it.  So I think fold should be
>> changed, and we should clarify that VIEW_CONVERT_EXPR is an lvalue.
>>
>> If not, we need a new tree code for treating an lvalue as an lvalue of a
>> different type without having to take its address; that's what I thought
>> VIEW_CONVERT_EXPR was for.

Btw, see tree.def which says

/* Represents viewing something of one type as being of a second type.
   This corresponds to an "Unchecked Conversion" in Ada and roughly to
   the idiom *(type2 *)&X in C.  The only operand is the value to be
   viewed as being of another type.  It is undefined if the type of the
   input and of the expression have different sizes.

   This code may also be used within the LHS of a MODIFY_EXPR, in which
   case no actual data motion may occur.  TREE_ADDRESSABLE will be set in
   this case and GCC must abort if it could not do the operation without
   generating insns.  */
DEFTREECODE (VIEW_CONVERT_EXPR, "view_convert_expr", tcc_reference, 1)

> The please provide a specification on what a VIEW_CONVERT_EXPR does
> to type-based alias analysis.  We are trying to avoid that by the rvalue rule.
> Also you can always avoid VIEW_CONVERT_EXPRs for lvalues by simply
> moving the conversion to the rvalue side.
>
> Yes, we do handle lvalue VIEW_CONVERT_EXPRs, but that is for Ada
> which uses it for aggregates.  I don't want us to add more lvalue
> VIEW_CONVERT_EXPR
> cases, especially not for register types.
>
> Richard.
>
>> Jason
>>
>
Jason Merrill June 12, 2011, 10:10 p.m. UTC | #22
On 06/12/2011 06:59 AM, Richard Guenther wrote:
> The please provide a specification on what a VIEW_CONVERT_EXPR does
> to type-based alias analysis.

If the alias set of the VIEW_CONVERT_EXPR type the same as the set for 
the operand, ignore it; if it's a subset, handle it like a 
COMPONENT_REF; otherwise ignore the operand for TBAA.

It seems like get_alias_set currently gets this backwards; it's ignoring 
outer COMPONENT_REFs instead of the inner structure.

> Yes, we do handle lvalue VIEW_CONVERT_EXPRs, but that is for Ada
> which uses it for aggregates.

It also seems to be widely used for vectors, but perhaps that's only for 
rvalues.

> I don't want us to add more lvalue
> VIEW_CONVERT_EXPR cases, especially not for register types.

Then how do we convert an int lvalue to a volatile int lvalue?

> /* Represents viewing something of one type as being of a second type.
>    This corresponds to an "Unchecked Conversion" in Ada and roughly to
>    the idiom *(type2 *)&X in C.

Right, that's why I thought it was an lvalue.

>    This code may also be used within the LHS of a MODIFY_EXPR

And this.

Jason
Mike Stump June 13, 2011, 2:47 a.m. UTC | #23
On Jun 12, 2011, at 4:03 AM, Richard Guenther wrote:
> Btw, see tree.def which says
> 
> /* Represents viewing something of one type as being of a second type.
>   This corresponds to an "Unchecked Conversion" in Ada and roughly to
>   the idiom *(type2 *)&X in C.  The only operand is the value to be
>   viewed as being of another type.  It is undefined if the type of the
>   input and of the expression have different sizes.
> 
>   This code may also be used within the LHS of a MODIFY_EXPR, in which
>   case no actual data motion may occur.  TREE_ADDRESSABLE will be set in
>   this case and GCC must abort if it could not do the operation without
>   generating insns.  */

I wasn't able to follow what this was trying to say.  :-(  No actual data motion may occur?  The wording is weasely.  Does it mean: Data motion does not occur when used on the LHS of a MODIFY_EXPR?  If so, it should just directly state it.
Mike Stump June 13, 2011, 3:17 a.m. UTC | #24
On Jun 12, 2011, at 3:55 AM, Richard Guenther wrote:
> In almost all cases(*) the need for a lvalue VIEW_CONVERT_EXPR can be avoided
> by moving the VIEW_CONVERT_EXPR to the rvalue assigned too it.  Remember that
> VIEW_CONVERT_EXPR always conver the full object and are not allowed to
> change sizes.
> 
> So, do you have an example?

Sure, take a divmod type of instruction.[1]  It has two outputs, a div and a mod.  The instruction operates on registers, and produces two completely independent values.  But really, it is a general features of the built-in mechanism that Kenny and I are working on that some people would like to reuse for other ports.  The general feature is that one can declare any argument to a built-in to be input only, output only or input/output.  This nicely matches what I think quite of lot of machines do for assembly language semantics.  The support isn't to match my machine, but rather to support building a port, in which 1 or more instructions have such parameters.  Requiring memory is a non-starter, and in fact, we already have a scheme for memory_operand type things for the instructions that take memory.  The scheme used for them is borrowed from C++, where we just declare that the built-in takes a reference type.  This reuses most of the code paths from C++ and it seems to work nicely.  I'd be tempting to use it for register references, but, in my current scheme for registers, I support data flow in, out and in/out at the tree level and at the rtl level.  We believe this is nice from an optimizing perspective, and probably required to get the warnings about using uninitialized variables correct.
Richard Biener June 13, 2011, 10:51 a.m. UTC | #25
On Mon, Jun 13, 2011 at 12:10 AM, Jason Merrill <jason@redhat.com> wrote:
> On 06/12/2011 06:59 AM, Richard Guenther wrote:
>>
>> The please provide a specification on what a VIEW_CONVERT_EXPR does
>> to type-based alias analysis.
>
> If the alias set of the VIEW_CONVERT_EXPR type the same as the set for the
> operand, ignore it; if it's a subset, handle it like a COMPONENT_REF;
> otherwise ignore the operand for TBAA.

Handling it as a component-ref would mean adding subsets to the
view-convert-expr
base.

> It seems like get_alias_set currently gets this backwards; it's ignoring
> outer COMPONENT_REFs instead of the inner structure.

Right, it treats it as if it were an rvalue conversion around an inner lvalue -
everything TBAA related happens on the lvalue, so if you load a
view-convert-expr<int>(float_var) the alias set is that of the float.

>> Yes, we do handle lvalue VIEW_CONVERT_EXPRs, but that is for Ada
>> which uses it for aggregates.
>
> It also seems to be widely used for vectors, but perhaps that's only for
> rvalues.

Yes.

>> I don't want us to add more lvalue
>> VIEW_CONVERT_EXPR cases, especially not for register types.
>
> Then how do we convert an int lvalue to a volatile int lvalue?

By doing *(volatile int *)&int_var.  Alternatively you can use a MEM_REF
(which sofar isn't created by frontends but only during gimplification) of
the form MEM[&int_var, (int *)0] with TREE_THIS_VOLATILE set and
a type of volatile int.  This is btw. what you can do for your array-ref
case as well - just you wouldn't have an array-ref then, but a MEM_REF
with a non-zero (but constant) offset (which might be a limitation to you).

But I suppose you want the array-ref be folded to a constant eventually?
Or why are you not happy with the *(T *)&foo tree?

>> /* Represents viewing something of one type as being of a second type.
>>   This corresponds to an "Unchecked Conversion" in Ada and roughly to
>>   the idiom *(type2 *)&X in C.
>
> Right, that's why I thought it was an lvalue.

"roughly" ;)  It's equally roughly (type2)X in C, but that doesn't "work"
for aggregate types type2.  So we don't have a 1:1 C equivalent of
what VIEW_CONVERT_EXPR does.

>>   This code may also be used within the LHS of a MODIFY_EXPR
>
> And this.

And the following weasel-wording about "no data motion" for this
case.

Now, the middle-end shouldn't have an issue dealing with lvalue
VIEW_CONVERT_EXPRs, as MEM_REFs are exposing all
issues VIEW_CONVERT_EXPR lvalues would do.  There is
at least the folding to NOP_EXPR and eventually other frontend
specific code that might be confused though.  And of course
there is the TBAA issue, so you have to be careful to not
create wrong code - for example folding *(float *)&int_var[5]
to VIEW_CONVERT_EXPR<float>(int_var[5]) is wrong.

Richard.

> Jason
>
Richard Biener June 13, 2011, 10:57 a.m. UTC | #26
On Mon, Jun 13, 2011 at 5:17 AM, Mike Stump <mikestump@comcast.net> wrote:
> On Jun 12, 2011, at 3:55 AM, Richard Guenther wrote:
>> In almost all cases(*) the need for a lvalue VIEW_CONVERT_EXPR can be avoided
>> by moving the VIEW_CONVERT_EXPR to the rvalue assigned too it.  Remember that
>> VIEW_CONVERT_EXPR always conver the full object and are not allowed to
>> change sizes.
>>
>> So, do you have an example?
>
> Sure, take a divmod type of instruction.[1]  It has two outputs, a div and a mod.  The instruction operates on registers, and produces two completely independent values.  But really, it is a general features of the built-in mechanism that Kenny and I are working on that some people would like to reuse for other ports.  The general feature is that one can declare any argument to a built-in to be input only, output only or input/output.  This nicely matches what I think quite of lot of machines do for assembly language semantics.  The support isn't to match my machine, but rather to support building a port, in which 1 or more instructions have such parameters.  Requiring memory is a non-starter, and in fact, we already have a scheme for memory_operand type things for the instructions that take memory.  The scheme used for them is borrowed from C++, where we just declare that the built-in takes a reference type.  This reuses most of the code paths from C++ and it seems to work nicely.  I'd be tempting to use it for register references, but, in my current scheme for registers, I support data flow in, out and in/out at the tree level and at the rtl level.  We believe this is nice from an optimizing perspective, and probably required to get the warnings about using uninitialized variables correct.

That's not exactly an example - I can't think of how you want or need
to use VIEW_CONVERT_EXPRs to implement said divmod instruction or why
you would need
anything special for the _argument_ of said instruction.  An
instruction or call with multiple outputs would simply be something
like

{ div_1, mod_2 } = __builtin_divmod (arg_3);

with two SSA defs. A nice representation for the tree for { div_1,
mod_2 } remains to be found (if it should be a single tree at all, or
possibly
multiple ones).

We already play tricks for sincos for example via

tem_1 = __builtin_cexpi (arg_2);
sin_3 = REALPART_EXPR <tem_1>;
cos_4 = IMAGPART_EXPR <tem_1>;

which avoids the two defs by using a single def which is then decomposed.

So, can you elaborate a bit more on what you want to do with special
argument kinds?  Elaborate with an actual example, not words.

Thanks,
Richard.
Jason Merrill June 13, 2011, 5:51 p.m. UTC | #27
On 06/13/2011 06:51 AM, Richard Guenther wrote:
> But I suppose you want the array-ref be folded to a constant eventually?

Right.

I'm not going to keep arguing about VIEW_CONVERT_EXPR, but that brings 
me back to my original question: is it OK to add a permissive mode to 
the function, or should I copy the whole thing into the front end?

Jason
Richard Biener June 14, 2011, 8:31 a.m. UTC | #28
On Mon, 13 Jun 2011, Jason Merrill wrote:

> On 06/13/2011 06:51 AM, Richard Guenther wrote:
> > But I suppose you want the array-ref be folded to a constant eventually?
> 
> Right.
> 
> I'm not going to keep arguing about VIEW_CONVERT_EXPR, but that brings me back
> to my original question: is it OK to add a permissive mode to the function, or
> should I copy the whole thing into the front end?

I think you should copy the whole thing into the front end for now.

Note that we want to arrive at a point where our constant folding
can handle the MEM_REF case for arbitrary constant constructors.
See fold_const_aggregate_ref in gimple-fold.c - probably not usable
from the frontend directly though.  And it doesn't yet handle
non-array constructors without having a component-ref tree.
But if we eventually have all the code in that routine you might
switch to it instead.

Richard.
Mike Stump June 14, 2011, 8:13 p.m. UTC | #29
On Jun 13, 2011, at 3:57 AM, Richard Guenther wrote:
> That's not exactly an example - I can't think of how you want or need
> to use VIEW_CONVERT_EXPRs to implement said divmod instruction or why
> you would need anything special for the _argument_ of said instruction.

Oh, I completely misunderstood your question.  In my case, as I previously stated, was with a vector type that was identical, save the name of the type:

	mod = a%b

where mod didn't have the type of the expression (a%b), so someone created the VIEW_CONVERT_EXPR on the mod.  The person creating it _thought_ it would be a rvalue context, but ultimately, it was an lvalue context.  We discover the lvalue/rvalue state of the expression at target_fold_builtin time.  The actual code looks more like:

  __builtin_divmod (div, mod, a, b);

In fold_builtin, we do all the processing to handle the semantics.

> An
> instruction or call with multiple outputs would simply be something
> like
> 
> { div_1, mod_2 } = __builtin_divmod (arg_3);
> 
> with two SSA defs. A nice representation for the tree for { div_1,
> mod_2 } remains to be found (if it should be a single tree at all, or
> possibly multiple ones).

At target_fold_builtin time we regenerate it as:

s = builtin_divmod_final (a, b);
div_1 = s.div
mod_2 = s.mod

and generate a type { div, mod } on the fly.  We expect the optimizer to handle extra moves reasonably, and we want to keep the one instruction as one unit.

> We already play tricks for sincos for example via
> 
> tem_1 = __builtin_cexpi (arg_2);
> sin_3 = REALPART_EXPR <tem_1>;
> cos_4 = IMAGPART_EXPR <tem_1>;
> 
> which avoids the two defs by using a single def which is then decomposed.
> 
> So, can you elaborate a bit more on what you want to do with special
> argument kinds?  Elaborate with an actual example, not words.

We support tagging any parameter to a builtin as define_outputs, define_inputs or define_in_outs in a part of the .md file that describes the builtins for the machine, the actual divmod builtin for example is:

(define_builtin "divmod<T_ALL_DI:sign_u>" "divmod<T_ALL_DI:sign_u>_<type>"
  [
    (define_outputs [(var_operand:T_ALL_DI 0)    ;;dividend                      
                     (var_operand:T_ALL_DI 1)])  ;;mod                           
    (define_inputs  [(var_operand:T_ALL_DI 2)
                     (var_operand:T_ALL_DI 3)])
    (define_rtl_pattern "<T_ALL_DI:sign_u>divmod<m_mode>4" [0 1 2 3])
    (attributes [pure])
  ]
)

that's the actual code.  The testcase looks like:

  t_v4udi_0 = divmodu_t_v4udi (t_v4udi_1, t_v4udi_2, t_v4udi_3);


The VIEW_CONVERT_EXPR looks like:

<view_convert_expr 0x7ffff5a872d0
    type <vector_type 0x7ffff7f4b930 __attribute__((vector_size(32))) unsigned long
        type <integer_type 0x7ffff7e8c690 long unsigned int public unsigned DI
            size <integer_cst 0x7ffff7e76730 constant 64>
            unit size <integer_cst 0x7ffff7e76758 constant 8>
            align 64 symtab 0 alias set -1 canonical type 0x7ffff7e8c690 precision 64 min <integer_cst 0x7ffff7e76780 0> max\
 <integer_cst 0x7ffff7e76708 18446744073709551615>
            pointer_to_this <pointer_type 0x7ffff7f50738> reference_to_this <reference_type 0x7ffff7f513f0>>
        unsigned V4DI
        size <integer_cst 0x7ffff7f43de8 constant 256>
        unit size <integer_cst 0x7ffff7e76348 constant 32>
        align 256 symtab 0 alias set -1 canonical type 0x7ffff7f4b930 nunits 4 reference_to_this <reference_type 0x7ffff7f51\
540>>

    arg 0 <var_decl 0x7ffff59d9640 t_v4udi_1
        type <vector_type 0x7ffff5ac3888 type <integer_type 0x7ffff7e8c690 long unsigned int>
            unsigned V4DI size <integer_cst 0x7ffff7f43de8 256> unit size <integer_cst 0x7ffff7e76348 32>
            align 256 symtab 0 alias set -1 canonical type 0x7ffff5ac3888 nunits 4>
        used public static unsigned V4DI defer-output file t22.c line 262 col 48 size <integer_cst 0x7ffff7f43de8 256> unit \
size <integer_cst 0x7ffff7e76348 32>
        align 256>>

Hopefully, somewhere about is an example of what you wanted to see, if not, let me know what you'd like to see.
Richard Biener June 15, 2011, 9:04 a.m. UTC | #30
On Tue, 14 Jun 2011, Mike Stump wrote:

> On Jun 13, 2011, at 3:57 AM, Richard Guenther wrote:
> > That's not exactly an example - I can't think of how you want or need
> > to use VIEW_CONVERT_EXPRs to implement said divmod instruction or why
> > you would need anything special for the _argument_ of said instruction.
> 
> Oh, I completely misunderstood your question.  In my case, as I previously stated, was with a vector type that was identical, save the name of the type:
> 
> 	mod = a%b
> 
> where mod didn't have the type of the expression (a%b), so someone created the VIEW_CONVERT_EXPR on the mod.  The person creating it _thought_ it would be a rvalue context, but ultimately, it was an lvalue context.  We discover the lvalue/rvalue state of the expression at target_fold_builtin time.  The actual code looks more like:
> 
>   __builtin_divmod (div, mod, a, b);
> 
> In fold_builtin, we do all the processing to handle the semantics.
> 
> > An
> > instruction or call with multiple outputs would simply be something
> > like
> > 
> > { div_1, mod_2 } = __builtin_divmod (arg_3);
> > 
> > with two SSA defs. A nice representation for the tree for { div_1,
> > mod_2 } remains to be found (if it should be a single tree at all, or
> > possibly multiple ones).
> 
> At target_fold_builtin time we regenerate it as:
> 
> s = builtin_divmod_final (a, b);
> div_1 = s.div
> mod_2 = s.mod
> 
> and generate a type { div, mod } on the fly.  We expect the optimizer to handle extra moves reasonably, and we want to keep the one instruction as one unit.
> 
> > We already play tricks for sincos for example via
> > 
> > tem_1 = __builtin_cexpi (arg_2);
> > sin_3 = REALPART_EXPR <tem_1>;
> > cos_4 = IMAGPART_EXPR <tem_1>;
> > 
> > which avoids the two defs by using a single def which is then decomposed.
> > 
> > So, can you elaborate a bit more on what you want to do with special
> > argument kinds?  Elaborate with an actual example, not words.
> 
> We support tagging any parameter to a builtin as define_outputs, define_inputs or define_in_outs in a part of the .md file that describes the builtins for the machine, the actual divmod builtin for example is:
> 
> (define_builtin "divmod<T_ALL_DI:sign_u>" "divmod<T_ALL_DI:sign_u>_<type>"
>   [
>     (define_outputs [(var_operand:T_ALL_DI 0)    ;;dividend                      
>                      (var_operand:T_ALL_DI 1)])  ;;mod                           
>     (define_inputs  [(var_operand:T_ALL_DI 2)
>                      (var_operand:T_ALL_DI 3)])
>     (define_rtl_pattern "<T_ALL_DI:sign_u>divmod<m_mode>4" [0 1 2 3])
>     (attributes [pure])
>   ]
> )
> 
> that's the actual code.  The testcase looks like:
> 
>   t_v4udi_0 = divmodu_t_v4udi (t_v4udi_1, t_v4udi_2, t_v4udi_3);

I see.  So it's the C side of the representation that needs the
special operands.

> The VIEW_CONVERT_EXPR looks like:
> 
> <view_convert_expr 0x7ffff5a872d0
>     type <vector_type 0x7ffff7f4b930 __attribute__((vector_size(32))) unsigned long
>         type <integer_type 0x7ffff7e8c690 long unsigned int public unsigned DI
>             size <integer_cst 0x7ffff7e76730 constant 64>
>             unit size <integer_cst 0x7ffff7e76758 constant 8>
>             align 64 symtab 0 alias set -1 canonical type 0x7ffff7e8c690 precision 64 min <integer_cst 0x7ffff7e76780 0> max\
>  <integer_cst 0x7ffff7e76708 18446744073709551615>
>             pointer_to_this <pointer_type 0x7ffff7f50738> reference_to_this <reference_type 0x7ffff7f513f0>>
>         unsigned V4DI
>         size <integer_cst 0x7ffff7f43de8 constant 256>
>         unit size <integer_cst 0x7ffff7e76348 constant 32>
>         align 256 symtab 0 alias set -1 canonical type 0x7ffff7f4b930 nunits 4 reference_to_this <reference_type 0x7ffff7f51\
> 540>>
> 
>     arg 0 <var_decl 0x7ffff59d9640 t_v4udi_1
>         type <vector_type 0x7ffff5ac3888 type <integer_type 0x7ffff7e8c690 long unsigned int>
>             unsigned V4DI size <integer_cst 0x7ffff7f43de8 256> unit size <integer_cst 0x7ffff7e76348 32>
>             align 256 symtab 0 alias set -1 canonical type 0x7ffff5ac3888 nunits 4>
>         used public static unsigned V4DI defer-output file t22.c line 262 col 48 size <integer_cst 0x7ffff7f43de8 256> unit \
> size <integer_cst 0x7ffff7e76348 32>
>         align 256>>

This VIEW_CONVERT_EXPR looks useless - in fact useless_type_conversion_p
will tell you that, so you can omit it.

Richard.
Mike Stump June 15, 2011, 5:46 p.m. UTC | #31
On Jun 15, 2011, at 2:04 AM, Richard Guenther wrote:
> This VIEW_CONVERT_EXPR looks useless - in fact useless_type_conversion_p
> will tell you that, so you can omit it.

So, I tracked down who created it:

tree
convert_to_vector (tree type, tree expr)
{
  switch (TREE_CODE (TREE_TYPE (expr)))
    {
    case INTEGER_TYPE:
    case VECTOR_TYPE:
      if (!tree_int_cst_equal (TYPE_SIZE (type), TYPE_SIZE (TREE_TYPE (expr))))
        {
          error ("can%'t convert between vector values of different size");
          return error_mark_node;
        }
=>    return build1 (VIEW_CONVERT_EXPR, type, expr);

If people want to not create useless conversions in the first place, though, I suspect there are lots of places that create useless conversions in the compiler.
Richard Biener June 16, 2011, 7:36 a.m. UTC | #32
On Wed, Jun 15, 2011 at 7:46 PM, Mike Stump <mikestump@comcast.net> wrote:
> On Jun 15, 2011, at 2:04 AM, Richard Guenther wrote:
>> This VIEW_CONVERT_EXPR looks useless - in fact useless_type_conversion_p
>> will tell you that, so you can omit it.
>
> So, I tracked down who created it:
>
> tree
> convert_to_vector (tree type, tree expr)
> {
>  switch (TREE_CODE (TREE_TYPE (expr)))
>    {
>    case INTEGER_TYPE:
>    case VECTOR_TYPE:
>      if (!tree_int_cst_equal (TYPE_SIZE (type), TYPE_SIZE (TREE_TYPE (expr))))
>        {
>          error ("can%'t convert between vector values of different size");
>          return error_mark_node;
>        }
> =>    return build1 (VIEW_CONVERT_EXPR, type, expr);
>
> If people want to not create useless conversions in the first place, though, I suspect there are lots of places that create useless conversions in the compiler.

Yeah, the above looks it comes from the frontends - gimplification
should strip this conversion, if it doesn't that is a bug ;)

Richard.
Michael Matz June 17, 2011, 1:57 p.m. UTC | #33
Hi,

On Thu, 16 Jun 2011, Richard Guenther wrote:

> > If people want to not create useless conversions in the first place, 
> > though, I suspect there are lots of places that create useless 
> > conversions in the compiler.
> 
> Yeah, the above looks it comes from the frontends - gimplification 
> should strip this conversion, if it doesn't that is a bug ;)

Well, Mike said this view_convert_expr actually is on the LHS of an 
assignment.  I wouldn't be surprised if nobody strips "useless" 
conversions of LHSes, exactly because such things on a LHS are no 
conversions.


Ciao,
Michael.
Richard Biener June 20, 2011, 9:19 a.m. UTC | #34
On Fri, 17 Jun 2011, Michael Matz wrote:

> Hi,
> 
> On Thu, 16 Jun 2011, Richard Guenther wrote:
> 
> > > If people want to not create useless conversions in the first place, 
> > > though, I suspect there are lots of places that create useless 
> > > conversions in the compiler.
> > 
> > Yeah, the above looks it comes from the frontends - gimplification 
> > should strip this conversion, if it doesn't that is a bug ;)
> 
> Well, Mike said this view_convert_expr actually is on the LHS of an 
> assignment.  I wouldn't be surprised if nobody strips "useless" 
> conversions of LHSes, exactly because such things on a LHS are no 
> conversions.

For vectors we can move them to the rhs instead.

Richard.
diff mbox

Patch

commit 7881846ebf44868b0a27f727064d84c271127c65
Author: Jason Merrill <jason@redhat.com>
Date:   Tue Jun 7 00:00:36 2011 -0400

    	PR c++/49290
    gcc/
    	* fold-const.c (fold_indirect_ref_maybe_permissive): Split out from...
    	(fold_indirect_ref_1): ...here.
    	* tree.h: Declare fold_indirect_ref_maybe_permissive.
    gcc/cp/
    	* semantics.c (cxx_eval_indirect_ref): Use
    	fold_indirect_ref_maybe_permissive.

diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index f005f2f..d979c18 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -6765,6 +6765,7 @@  cxx_eval_indirect_ref (const constexpr_call *call, tree t,
 					   /*addr*/false, non_constant_p);
   tree type, sub, subtype, r;
   bool empty_base;
+  location_t loc = EXPR_LOCATION (t);
 
   /* Don't VERIFY_CONSTANT here.  */
   if (*non_constant_p)
@@ -6843,8 +6844,8 @@  cxx_eval_indirect_ref (const constexpr_call *call, tree t,
 
   /* Let build_fold_indirect_ref handle the cases it does fine with.  */
   if (r == NULL_TREE)
-    r = build_fold_indirect_ref (op0);
-  if (TREE_CODE (r) != INDIRECT_REF)
+    r = fold_indirect_ref_maybe_permissive (loc, type, op0, true);
+  if (r)
     r = cxx_eval_constant_expression (call, r, allow_non_constant,
 				      addr, non_constant_p);
   else if (TREE_CODE (sub) == ADDR_EXPR
@@ -6871,7 +6872,7 @@  cxx_eval_indirect_ref (const constexpr_call *call, tree t,
       TREE_CONSTANT (r) = true;
     }
 
-  if (TREE_CODE (r) == INDIRECT_REF && TREE_OPERAND (r, 0) == orig_op0)
+  if (r == NULL_TREE)
     return t;
   return r;
 }
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 9a3f8cb..ef880fc 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -15555,14 +15555,19 @@  fold_build_cleanup_point_expr (tree type, tree expr)
 
 /* Given a pointer value OP0 and a type TYPE, return a simplified version
    of an indirection through OP0, or NULL_TREE if no simplification is
-   possible.  */
+   possible.  If PERMISSIVE is true, allow some variation in types.  */
 
 tree
-fold_indirect_ref_1 (location_t loc, tree type, tree op0)
+fold_indirect_ref_maybe_permissive (location_t loc, tree type, tree op0,
+				    bool permissive)
 {
   tree sub = op0;
   tree subtype;
 
+#define SAME_TYPE(X,Y) (permissive					\
+			? TYPE_MAIN_VARIANT (X) == TYPE_MAIN_VARIANT (Y) \
+			: (X) == (Y))
+
   STRIP_NOPS (sub);
   subtype = TREE_TYPE (sub);
   if (!POINTER_TYPE_P (subtype))
@@ -15586,7 +15591,7 @@  fold_indirect_ref_1 (location_t loc, tree type, tree op0)
 	}
       /* *(foo *)&fooarray => fooarray[0] */
       else if (TREE_CODE (optype) == ARRAY_TYPE
-	       && type == TREE_TYPE (optype)
+	       && SAME_TYPE (type, TREE_TYPE (optype))
 	       && (!in_gimple_form
 		   || TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST))
 	{
@@ -15602,11 +15607,11 @@  fold_indirect_ref_1 (location_t loc, tree type, tree op0)
 	}
       /* *(foo *)&complexfoo => __real__ complexfoo */
       else if (TREE_CODE (optype) == COMPLEX_TYPE
-	       && type == TREE_TYPE (optype))
+	       && SAME_TYPE (type, TREE_TYPE (optype)))
 	return fold_build1_loc (loc, REALPART_EXPR, type, op);
       /* *(foo *)&vectorfoo => BIT_FIELD_REF<vectorfoo,...> */
       else if (TREE_CODE (optype) == VECTOR_TYPE
-	       && type == TREE_TYPE (optype))
+	       && SAME_TYPE (type, TREE_TYPE (optype)))
 	{
 	  tree part_width = TYPE_SIZE (type);
 	  tree index = bitsize_int (0);
@@ -15629,7 +15634,7 @@  fold_indirect_ref_1 (location_t loc, tree type, tree op0)
 
 	  /* ((foo*)&vectorfoo)[1] => BIT_FIELD_REF<vectorfoo,...> */
 	  if (TREE_CODE (op00type) == VECTOR_TYPE
-	      && type == TREE_TYPE (op00type))
+	      && SAME_TYPE (type, TREE_TYPE (op00type)))
 	    {
 	      HOST_WIDE_INT offset = tree_low_cst (op01, 0);
 	      tree part_width = TYPE_SIZE (type);
@@ -15645,7 +15650,7 @@  fold_indirect_ref_1 (location_t loc, tree type, tree op0)
 	    }
 	  /* ((foo*)&complexfoo)[1] => __imag__ complexfoo */
 	  else if (TREE_CODE (op00type) == COMPLEX_TYPE
-		   && type == TREE_TYPE (op00type))
+		   && SAME_TYPE (type, TREE_TYPE (op00type)))
 	    {
 	      tree size = TYPE_SIZE_UNIT (type);
 	      if (tree_int_cst_equal (size, op01))
@@ -15653,7 +15658,7 @@  fold_indirect_ref_1 (location_t loc, tree type, tree op0)
 	    }
 	  /* ((foo *)&fooarray)[1] => fooarray[1] */
 	  else if (TREE_CODE (op00type) == ARRAY_TYPE
-		   && type == TREE_TYPE (op00type))
+		   && SAME_TYPE (type, TREE_TYPE (op00type)))
 	    {
 	      tree type_domain = TYPE_DOMAIN (op00type);
 	      tree min_val = size_zero_node;
@@ -15670,7 +15675,7 @@  fold_indirect_ref_1 (location_t loc, tree type, tree op0)
 
   /* *(foo *)fooarrptr => (*fooarrptr)[0] */
   if (TREE_CODE (TREE_TYPE (subtype)) == ARRAY_TYPE
-      && type == TREE_TYPE (TREE_TYPE (subtype))
+      && SAME_TYPE (type, TREE_TYPE (TREE_TYPE (subtype)))
       && (!in_gimple_form
 	  || TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST))
     {
@@ -15690,6 +15695,12 @@  fold_indirect_ref_1 (location_t loc, tree type, tree op0)
   return NULL_TREE;
 }
 
+tree
+fold_indirect_ref_1 (location_t loc, tree type, tree op0)
+{
+  return fold_indirect_ref_maybe_permissive (loc, type, op0, false);
+}
+
 /* Builds an expression for an indirection through T, simplifying some
    cases.  */
 
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-array-ptr7.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-array-ptr7.C
new file mode 100644
index 0000000..75e6f43
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-array-ptr7.C
@@ -0,0 +1,20 @@ 
+// PR c++/49290
+// { dg-options -std=c++0x }
+
+typedef const unsigned T;
+struct S
+{
+  constexpr T foo (void);
+  unsigned s1[16];
+};
+
+constexpr T
+S::foo ()
+{
+  return *(T *) (s1 + 10);
+}
+
+constexpr S s = { 0,1,2,3,4,5,6,7,8,9,10 };
+
+#define SA(X) static_assert ((X), #X)
+SA(s.foo() == 10);
diff --git a/gcc/tree.h b/gcc/tree.h
index c7338ba..5b20a07 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -5172,6 +5172,7 @@  extern tree fold_single_bit_test (location_t, enum tree_code, tree, tree, tree);
 extern tree fold_ignored_result (tree);
 extern tree fold_abs_const (tree, tree);
 extern tree fold_indirect_ref_1 (location_t, tree, tree);
+extern tree fold_indirect_ref_maybe_permissive (location_t, tree, tree, bool);
 extern void fold_defer_overflow_warnings (void);
 extern void fold_undefer_overflow_warnings (bool, const_gimple, int);
 extern void fold_undefer_and_ignore_overflow_warnings (void);