diff mbox

[RFC] PR 55415 : Pessimistic misalignment from eipa_sra pass

Message ID 50ABBCC4.5070405@redhat.com
State New
Headers show

Commit Message

Richard Henderson Nov. 20, 2012, 5:24 p.m. UTC
The get_pointer_alignment function can indicate that it does not know
what the alignment should be, and it always fills in worst-case values
for that case.  We should not use these worst-case values to "optimize"
the interface of a function.

At minimum I think something like the following would be good.  But
I'm unsure why we would *ever* want to lower the alignment at a function
interface.  It seems to me that we'd simply want the caller to handle
copying the data to an aligned location?

What was the use case of this code in the first place?



r~

Comments

Martin Jambor Nov. 21, 2012, 4:58 p.m. UTC | #1
Hi,

On Tue, Nov 20, 2012 at 09:24:20AM -0800, Richard Henderson wrote:
> The get_pointer_alignment function can indicate that it does not know
> what the alignment should be, and it always fills in worst-case values
> for that case.  We should not use these worst-case values to "optimize"
> the interface of a function.
> 
> At minimum I think something like the following would be good.  But
> I'm unsure why we would *ever* want to lower the alignment at a function
> interface.  It seems to me that we'd simply want the caller to handle
> copying the data to an aligned location?
> 
> What was the use case of this code in the first place?

PR 52402, and not surprisingly, that testcase fails on x86_64-linux
with your patch.  Furthermore, misalignment due to MEM_REF offset has
to be accounted for whether we know the alignment of base or not.

I was hoping that we could do something along the lines of
build_ref_for_offset tree-sra.c but that would not work for cases like
the one from PR 52890, comment 7 when there is no dereference in the
caller, we tranform:

  D.2537 = &this->surfaces;
  sPtr.0 = ggTrain<mrSurface*>::_ZNK7ggTrainIP9mrSurfaceEixEi.isra.0 (D.2537, 1);

into 

  D.2537 = &this->surfaces;
  D.2554 = MEM[(const struct ggTrain *)D.2537];
  sPtr.0 = ggTrain<mrSurface*>::_ZNK7ggTrainIP9mrSurfaceEixEi.isra.0 (D.2537, 1);

At the moment I hope I'll be able to:

1) Bring back tree_non_aligned_mem_p from 4.6 to be used in
   access_precludes_ipa_sra_p.  This way, any knowingly misaligned
   load in the callee should will not be transferred to the caller, if
   there is none.

2) In ipa_modify_call_arguments, be optimistic in like in your patch
   except for the case when we are looking at a dereference (i.e. we
   are turning a BLK dereference into a smaller scalar type one).  In
   that case, use its alignment like in build_ref_for_offset.

But I'd like to think about this a bit more.  I believe we should ask
for Richi's approval when he comes back and recovers anyway.  But this
is now my highest priority (finally).

Thanks,

Martin


> 
> 
> 
> r~

> diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
> index 3150bd6..d117389 100644
> --- a/gcc/ipa-prop.c
> +++ b/gcc/ipa-prop.c
> @@ -2956,15 +2956,17 @@ ipa_modify_call_arguments (struct cgraph_edge *cs, gimple stmt,
>  	      unsigned int align;
>  	      unsigned HOST_WIDE_INT misalign;
>  
> -	      get_pointer_alignment_1 (base, &align, &misalign);
> -	      misalign += (tree_to_double_int (off)
> -			   .sext (TYPE_PRECISION (TREE_TYPE (off))).low
> -			   * BITS_PER_UNIT);
> -	      misalign = misalign & (align - 1);
> -	      if (misalign != 0)
> -		align = (misalign & -misalign);
> -	      if (align < TYPE_ALIGN (type))
> -		type = build_aligned_type (type, align);
> +	      if (get_pointer_alignment_1 (base, &align, &misalign))
> +		{
> +		  misalign += (tree_to_double_int (off)
> +			       .sext (TYPE_PRECISION (TREE_TYPE (off))).low
> +			       * BITS_PER_UNIT);
> +	          misalign = misalign & (align - 1);
> +	          if (misalign != 0)
> +		    align = (misalign & -misalign);
> +		  if (align < TYPE_ALIGN (type))
> +		    type = build_aligned_type (type, align);
> +		}
>  	      expr = fold_build2_loc (loc, MEM_REF, type, base, off);
>  	    }
>  	  else
Richard Biener Nov. 27, 2012, 1:02 p.m. UTC | #2
On Wed, Nov 21, 2012 at 5:58 PM, Martin Jambor <mjambor@suse.cz> wrote:
> Hi,
>
> On Tue, Nov 20, 2012 at 09:24:20AM -0800, Richard Henderson wrote:
>> The get_pointer_alignment function can indicate that it does not know
>> what the alignment should be, and it always fills in worst-case values
>> for that case.  We should not use these worst-case values to "optimize"
>> the interface of a function.
>>
>> At minimum I think something like the following would be good.  But
>> I'm unsure why we would *ever* want to lower the alignment at a function
>> interface.  It seems to me that we'd simply want the caller to handle
>> copying the data to an aligned location?
>>
>> What was the use case of this code in the first place?
>
> PR 52402, and not surprisingly, that testcase fails on x86_64-linux
> with your patch.  Furthermore, misalignment due to MEM_REF offset has
> to be accounted for whether we know the alignment of base or not.
>
> I was hoping that we could do something along the lines of
> build_ref_for_offset tree-sra.c but that would not work for cases like
> the one from PR 52890, comment 7 when there is no dereference in the
> caller, we tranform:
>
>   D.2537 = &this->surfaces;
>   sPtr.0 = ggTrain<mrSurface*>::_ZNK7ggTrainIP9mrSurfaceEixEi.isra.0 (D.2537, 1);
>
> into
>
>   D.2537 = &this->surfaces;
>   D.2554 = MEM[(const struct ggTrain *)D.2537];
>   sPtr.0 = ggTrain<mrSurface*>::_ZNK7ggTrainIP9mrSurfaceEixEi.isra.0 (D.2537, 1);
>
> At the moment I hope I'll be able to:
>
> 1) Bring back tree_non_aligned_mem_p from 4.6 to be used in
>    access_precludes_ipa_sra_p.  This way, any knowingly misaligned
>    load in the callee should will not be transferred to the caller, if
>    there is none.
>
> 2) In ipa_modify_call_arguments, be optimistic in like in your patch
>    except for the case when we are looking at a dereference (i.e. we
>    are turning a BLK dereference into a smaller scalar type one).  In
>    that case, use its alignment like in build_ref_for_offset.
>
> But I'd like to think about this a bit more.  I believe we should ask
> for Richi's approval when he comes back and recovers anyway.  But this
> is now my highest priority (finally).

The issue here is that IPA SRA, when seeing

foo (T *addr)
{
  *addr;
}

 foo (&mem);

wanting to transform it to

foo' (T value)
{
  value;
}

 value = *(T *)mem;
 foo (value);

has to use the _same_ alignment for the access to mem as it was used
inside foo.  That's because of the fact that alignment information in GIMPLE
is solely present at memory accesses and _not_ in any way associated
with pointer types (as opposed to more strict rules imposed by some languages
such as C, often enough violated by users, *(char *)(int *)p is an access
aligned to at least 4 bytes in C).

IPA SRA can use bigger alignment if it knows that is safe (thus
get_pointer_alignment returns something larger than what the actual
access in foo was using).  What IPA SRA at the moment cannot do
is "propagate" larger alignment used in foo to the call site (AFAIK).
Thus, technically IPA SRA can use MAX (get_pointer_alignment (call site),
get_object_alignment (original dereference site)).

For fixing pessimizations caused by IPA SRA I suggest to simply punt
(not do the transform) if get_pointer_alignment_1 returns false for the
actual call argument.  Or implement the propagation.

Richard.

> Thanks,
>
> Martin
>
>
>>
>>
>>
>> r~
>
>> diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
>> index 3150bd6..d117389 100644
>> --- a/gcc/ipa-prop.c
>> +++ b/gcc/ipa-prop.c
>> @@ -2956,15 +2956,17 @@ ipa_modify_call_arguments (struct cgraph_edge *cs, gimple stmt,
>>             unsigned int align;
>>             unsigned HOST_WIDE_INT misalign;
>>
>> -           get_pointer_alignment_1 (base, &align, &misalign);
>> -           misalign += (tree_to_double_int (off)
>> -                        .sext (TYPE_PRECISION (TREE_TYPE (off))).low
>> -                        * BITS_PER_UNIT);
>> -           misalign = misalign & (align - 1);
>> -           if (misalign != 0)
>> -             align = (misalign & -misalign);
>> -           if (align < TYPE_ALIGN (type))
>> -             type = build_aligned_type (type, align);
>> +           if (get_pointer_alignment_1 (base, &align, &misalign))
>> +             {
>> +               misalign += (tree_to_double_int (off)
>> +                            .sext (TYPE_PRECISION (TREE_TYPE (off))).low
>> +                            * BITS_PER_UNIT);
>> +               misalign = misalign & (align - 1);
>> +               if (misalign != 0)
>> +                 align = (misalign & -misalign);
>> +               if (align < TYPE_ALIGN (type))
>> +                 type = build_aligned_type (type, align);
>> +             }
>>             expr = fold_build2_loc (loc, MEM_REF, type, base, off);
>>           }
>>         else
>
Martin Jambor Nov. 27, 2012, 8:37 p.m. UTC | #3
Hi,

On Tue, Nov 27, 2012 at 02:02:42PM +0100, Richard Biener wrote:
> On Wed, Nov 21, 2012 at 5:58 PM, Martin Jambor <mjambor@suse.cz> wrote:
> > Hi,
> >
> > On Tue, Nov 20, 2012 at 09:24:20AM -0800, Richard Henderson wrote:
> >> The get_pointer_alignment function can indicate that it does not know
> >> what the alignment should be, and it always fills in worst-case values
> >> for that case.  We should not use these worst-case values to "optimize"
> >> the interface of a function.
> >>
> >> At minimum I think something like the following would be good.  But
> >> I'm unsure why we would *ever* want to lower the alignment at a function
> >> interface.  It seems to me that we'd simply want the caller to handle
> >> copying the data to an aligned location?
> >>
> >> What was the use case of this code in the first place?
> >
> > PR 52402, and not surprisingly, that testcase fails on x86_64-linux
> > with your patch.  Furthermore, misalignment due to MEM_REF offset has
> > to be accounted for whether we know the alignment of base or not.
> >
> > I was hoping that we could do something along the lines of
> > build_ref_for_offset tree-sra.c but that would not work for cases like
> > the one from PR 52890, comment 7 when there is no dereference in the
> > caller, we tranform:
> >
> >   D.2537 = &this->surfaces;
> >   sPtr.0 = ggTrain<mrSurface*>::_ZNK7ggTrainIP9mrSurfaceEixEi.isra.0 (D.2537, 1);
> >
> > into
> >
> >   D.2537 = &this->surfaces;
> >   D.2554 = MEM[(const struct ggTrain *)D.2537];
> >   sPtr.0 = ggTrain<mrSurface*>::_ZNK7ggTrainIP9mrSurfaceEixEi.isra.0 (D.2537, 1);
> >
> > At the moment I hope I'll be able to:
> >
> > 1) Bring back tree_non_aligned_mem_p from 4.6 to be used in
> >    access_precludes_ipa_sra_p.  This way, any knowingly misaligned
> >    load in the callee should will not be transferred to the caller, if
> >    there is none.
> >
> > 2) In ipa_modify_call_arguments, be optimistic in like in your patch
> >    except for the case when we are looking at a dereference (i.e. we
> >    are turning a BLK dereference into a smaller scalar type one).  In
> >    that case, use its alignment like in build_ref_for_offset.
> >
> > But I'd like to think about this a bit more.  I believe we should ask
> > for Richi's approval when he comes back and recovers anyway.  But this
> > is now my highest priority (finally).
> 
> The issue here is that IPA SRA, when seeing
> 
> foo (T *addr)
> {
>   *addr;
> }
> 
>  foo (&mem);
> 
> wanting to transform it to
> 
> foo' (T value)
> {
>   value;
> }
> 
>  value = *(T *)mem;
>  foo (value);
> 
> has to use the _same_ alignment for the access to mem as it was used
> inside foo.  That's because of the fact that alignment information in GIMPLE
> is solely present at memory accesses and _not_ in any way associated
> with pointer types (as opposed to more strict rules imposed by some languages
> such as C, often enough violated by users, *(char *)(int *)p is an access
> aligned to at least 4 bytes in C).
> 
> IPA SRA can use bigger alignment if it knows that is safe (thus
> get_pointer_alignment returns something larger than what the actual
> access in foo was using).  What IPA SRA at the moment cannot do
> is "propagate" larger alignment used in foo to the call site (AFAIK).
> Thus, technically IPA SRA can use MAX (get_pointer_alignment (call site),
> get_object_alignment (original dereference site)).
> 
> For fixing pessimizations caused by IPA SRA I suggest to simply punt
> (not do the transform) if get_pointer_alignment_1 returns false for the
> actual call argument.  Or implement the propagation.

the patch below punts if get_object_alignment on any dereference in
the callee returns something smaller than the alignment of the type of
the would-be replacement (and the type of the load we would introduce
in the caller).  This is essentially an implicit propagation of
alignment from callees to callers.  I have added a new testcase that
checks this which, when not handled properly, fails on sparc64 and
produces wrong code on arm (which I however only checked by looking at
cross-compiler assembly output).

Nevertheless, there is another case that must be taken care of (you
already added a x86_64 testcase: gcc.dg/torture/pr52402.c) when there
already is a dereference in the caller but it is a BLK-mode one and
IPA-SRA changes it to a scalar load - this happens when an aggregate
by-value parameter is turned into a scalar one.  In that case, we
should look at the dereference in the caller and derive alignment from
there.

I do not do MAX, I assume that when get_pointer_alignment returns
true, it can be trusted.  If it returns some smaller alignment than
the alignment of the type (propagated from the callee), then I suppose
there is an alignment attribute missing in the callee or some such
mistake in the original program.

So far I have successfully bootstrapped and tested this on
x86_64-linux and have run parts of the testsuite on sparc64-linux and
hppa-linux.  Full bootstrap on sparc64-linux and ppc64-linux are
underway.

I'm going to add a testcase for PR 55448 that will scan x86_64
assembly output.

What do you think?

Thanks,

Martin


2012-11-27  Martin Jambor  <mjambor@suse.cz>

	PR middle-end/52890
	PR tree-optimization/55415
	PR tree-optimization/54386
	PR target/55448
	* ipa-prop.c (ipa_modify_call_arguments): Be optimistic when
	get_pointer_alignment_1 returns false and the base was not a
	dereference.
	* tree-sra.c (access_precludes_ipa_sra_p): New parameter req_align,
	added check for required alignment.  Update the user.

	* testsuite/gcc.dg/ipa/ipa-sra-7.c: New test.

*** /tmp/5XB8rd_ipa-prop.c	Tue Nov 27 21:34:54 2012
--- gcc/ipa-prop.c	Tue Nov 27 17:04:20 2012
*************** ipa_modify_call_arguments (struct cgraph
*** 2888,2893 ****
--- 2888,2894 ----
  	{
  	  tree expr, base, off;
  	  location_t loc;
+ 	  tree prev_base = NULL_TREE;
  
  	  /* We create a new parameter out of the value of the old one, we can
  	     do the following kind of transformations:
*************** ipa_modify_call_arguments (struct cgraph
*** 2919,2925 ****
  	  else
  	    {
  	      HOST_WIDE_INT base_offset;
- 	      tree prev_base;
  
  	      if (TREE_CODE (base) == ADDR_EXPR)
  		base = TREE_OPERAND (base, 0);
--- 2920,2925 ----
*************** ipa_modify_call_arguments (struct cgraph
*** 2956,2962 ****
  	      unsigned int align;
  	      unsigned HOST_WIDE_INT misalign;
  
! 	      get_pointer_alignment_1 (base, &align, &misalign);
  	      misalign += (tree_to_double_int (off)
  			   .sext (TYPE_PRECISION (TREE_TYPE (off))).low
  			   * BITS_PER_UNIT);
--- 2956,2973 ----
  	      unsigned int align;
  	      unsigned HOST_WIDE_INT misalign;
  
! 	      if (!get_pointer_alignment_1 (base, &align, &misalign))
! 		{
! 		  if (prev_base
! 		      && (TREE_CODE (prev_base) == MEM_REF
! 			  || TREE_CODE (prev_base) == TARGET_MEM_REF))
! 		    {
! 		      gcc_assert (misalign == 0);
! 		      align = TYPE_ALIGN (TREE_TYPE (prev_base));
! 		    }
! 		  else
! 		    align = TYPE_ALIGN (type);
! 		}
  	      misalign += (tree_to_double_int (off)
  			   .sext (TYPE_PRECISION (TREE_TYPE (off))).low
  			   * BITS_PER_UNIT);
*** /dev/null	Thu Oct 25 13:49:30 2012
--- gcc/testsuite/gcc.dg/ipa/ipa-sra-7.c	Tue Nov 27 18:48:45 2012
***************
*** 0 ****
--- 1,42 ----
+ /* { dg-do run } */
+ /* { dg-options "-O3" } */
+ 
+ typedef unsigned int myint __attribute__((aligned(1)));
+ 
+ typedef struct __attribute__((packed)) S {
+   unsigned a, b, c;
+ } SS;
+ 
+ typedef SS __attribute__((aligned(1))) SSS;
+ 
+ 
+ static unsigned int __attribute__ ((noinline))
+ get_a (SSS *p)
+ {
+   return p->a;
+ };
+ 
+ static int __attribute__ ((noinline, noclone))
+ foo (SS *p)
+ {
+   int r = (int) get_a(p) + 2;
+   return r;
+ }
+ 
+ char buf[512];
+ 
+ static SSS * __attribute__ ((noinline, noclone))
+ get_sss (void)
+ {
+   return (SSS *)(buf + 1);
+ }
+ 
+ 
+ int
+ main(int argc, char *argv[])
+ {
+   SSS *p = get_sss();
+   if (foo(p) != 2)
+     __builtin_abort ();
+   return 0;
+ }
*** /tmp/Hp0Wyc_tree-sra.c	Tue Nov 27 21:34:54 2012
--- gcc/tree-sra.c	Tue Nov 27 21:28:53 2012
*************** unmodified_by_ref_scalar_representative
*** 3891,3902 ****
    return repr;
  }
  
! /* Return true iff this access precludes IPA-SRA of the parameter it is
!    associated with. */
  
  static bool
! access_precludes_ipa_sra_p (struct access *access)
  {
    /* Avoid issues such as the second simple testcase in PR 42025.  The problem
       is incompatible assign in a call statement (and possibly even in asm
       statements).  This can be relaxed by using a new temporary but only for
--- 3891,3903 ----
    return repr;
  }
  
! /* Return true iff this ACCESS precludes IPA-SRA of the parameter it is
!    associated with.  REQ_ALIGN is the minimum required alignment.  */
  
  static bool
! access_precludes_ipa_sra_p (struct access *access, unsigned int req_align)
  {
+   unsigned int exp_align;
    /* Avoid issues such as the second simple testcase in PR 42025.  The problem
       is incompatible assign in a call statement (and possibly even in asm
       statements).  This can be relaxed by using a new temporary but only for
*************** access_precludes_ipa_sra_p (struct acces
*** 3908,3913 ****
--- 3909,3918 ----
  	  || gimple_code (access->stmt) == GIMPLE_ASM))
      return true;
  
+   exp_align = get_object_alignment (access->expr);
+   if (exp_align < req_align)
+     return true;
+ 
    return false;
  }
  
*************** splice_param_accesses (tree parm, bool *
*** 3943,3949 ****
        tree a1_alias_type;
        access = (*access_vec)[i];
        modification = access->write;
!       if (access_precludes_ipa_sra_p (access))
  	return NULL;
        a1_alias_type = reference_alias_ptr_type (access->expr);
  
--- 3948,3954 ----
        tree a1_alias_type;
        access = (*access_vec)[i];
        modification = access->write;
!       if (access_precludes_ipa_sra_p (access, TYPE_ALIGN (access->type)))
  	return NULL;
        a1_alias_type = reference_alias_ptr_type (access->expr);
  
*************** splice_param_accesses (tree parm, bool *
*** 3966,3972 ****
  	  else if (ac2->size != access->size)
  	    return NULL;
  
! 	  if (access_precludes_ipa_sra_p (ac2)
  	      || (ac2->type != access->type
  		  && (TREE_ADDRESSABLE (ac2->type)
  		      || TREE_ADDRESSABLE (access->type)))
--- 3971,3977 ----
  	  else if (ac2->size != access->size)
  	    return NULL;
  
! 	  if (access_precludes_ipa_sra_p (ac2, TYPE_ALIGN (access->type))
  	      || (ac2->type != access->type
  		  && (TREE_ADDRESSABLE (ac2->type)
  		      || TREE_ADDRESSABLE (access->type)))
Richard Biener Nov. 28, 2012, 1:11 p.m. UTC | #4
On Tue, Nov 27, 2012 at 9:37 PM, Martin Jambor <mjambor@suse.cz> wrote:
> Hi,
>
> On Tue, Nov 27, 2012 at 02:02:42PM +0100, Richard Biener wrote:
>> On Wed, Nov 21, 2012 at 5:58 PM, Martin Jambor <mjambor@suse.cz> wrote:
>> > Hi,
>> >
>> > On Tue, Nov 20, 2012 at 09:24:20AM -0800, Richard Henderson wrote:
>> >> The get_pointer_alignment function can indicate that it does not know
>> >> what the alignment should be, and it always fills in worst-case values
>> >> for that case.  We should not use these worst-case values to "optimize"
>> >> the interface of a function.
>> >>
>> >> At minimum I think something like the following would be good.  But
>> >> I'm unsure why we would *ever* want to lower the alignment at a function
>> >> interface.  It seems to me that we'd simply want the caller to handle
>> >> copying the data to an aligned location?
>> >>
>> >> What was the use case of this code in the first place?
>> >
>> > PR 52402, and not surprisingly, that testcase fails on x86_64-linux
>> > with your patch.  Furthermore, misalignment due to MEM_REF offset has
>> > to be accounted for whether we know the alignment of base or not.
>> >
>> > I was hoping that we could do something along the lines of
>> > build_ref_for_offset tree-sra.c but that would not work for cases like
>> > the one from PR 52890, comment 7 when there is no dereference in the
>> > caller, we tranform:
>> >
>> >   D.2537 = &this->surfaces;
>> >   sPtr.0 = ggTrain<mrSurface*>::_ZNK7ggTrainIP9mrSurfaceEixEi.isra.0 (D.2537, 1);
>> >
>> > into
>> >
>> >   D.2537 = &this->surfaces;
>> >   D.2554 = MEM[(const struct ggTrain *)D.2537];
>> >   sPtr.0 = ggTrain<mrSurface*>::_ZNK7ggTrainIP9mrSurfaceEixEi.isra.0 (D.2537, 1);
>> >
>> > At the moment I hope I'll be able to:
>> >
>> > 1) Bring back tree_non_aligned_mem_p from 4.6 to be used in
>> >    access_precludes_ipa_sra_p.  This way, any knowingly misaligned
>> >    load in the callee should will not be transferred to the caller, if
>> >    there is none.
>> >
>> > 2) In ipa_modify_call_arguments, be optimistic in like in your patch
>> >    except for the case when we are looking at a dereference (i.e. we
>> >    are turning a BLK dereference into a smaller scalar type one).  In
>> >    that case, use its alignment like in build_ref_for_offset.
>> >
>> > But I'd like to think about this a bit more.  I believe we should ask
>> > for Richi's approval when he comes back and recovers anyway.  But this
>> > is now my highest priority (finally).
>>
>> The issue here is that IPA SRA, when seeing
>>
>> foo (T *addr)
>> {
>>   *addr;
>> }
>>
>>  foo (&mem);
>>
>> wanting to transform it to
>>
>> foo' (T value)
>> {
>>   value;
>> }
>>
>>  value = *(T *)mem;
>>  foo (value);
>>
>> has to use the _same_ alignment for the access to mem as it was used
>> inside foo.  That's because of the fact that alignment information in GIMPLE
>> is solely present at memory accesses and _not_ in any way associated
>> with pointer types (as opposed to more strict rules imposed by some languages
>> such as C, often enough violated by users, *(char *)(int *)p is an access
>> aligned to at least 4 bytes in C).
>>
>> IPA SRA can use bigger alignment if it knows that is safe (thus
>> get_pointer_alignment returns something larger than what the actual
>> access in foo was using).  What IPA SRA at the moment cannot do
>> is "propagate" larger alignment used in foo to the call site (AFAIK).
>> Thus, technically IPA SRA can use MAX (get_pointer_alignment (call site),
>> get_object_alignment (original dereference site)).
>>
>> For fixing pessimizations caused by IPA SRA I suggest to simply punt
>> (not do the transform) if get_pointer_alignment_1 returns false for the
>> actual call argument.  Or implement the propagation.
>
> the patch below punts if get_object_alignment on any dereference in
> the callee returns something smaller than the alignment of the type of
> the would-be replacement (and the type of the load we would introduce
> in the caller).  This is essentially an implicit propagation of
> alignment from callees to callers.  I have added a new testcase that
> checks this which, when not handled properly, fails on sparc64 and
> produces wrong code on arm (which I however only checked by looking at
> cross-compiler assembly output).
>
> Nevertheless, there is another case that must be taken care of (you
> already added a x86_64 testcase: gcc.dg/torture/pr52402.c) when there
> already is a dereference in the caller but it is a BLK-mode one and
> IPA-SRA changes it to a scalar load - this happens when an aggregate
> by-value parameter is turned into a scalar one.  In that case, we
> should look at the dereference in the caller and derive alignment from
> there.

Well, you can derive alignment from all memory accesses that are
always executed on the path the call stmt is on and take the maximum
(of course paying attention to properly translate what you get to the
alignment of the base object at offset zero).

Such analysis of course doesn't fit the flow-insensitiveness of SRA
very well ...

> I do not do MAX, I assume that when get_pointer_alignment returns
> true, it can be trusted.  If it returns some smaller alignment than
> the alignment of the type (propagated from the callee), then I suppose
> there is an alignment attribute missing in the callee or some such
> mistake in the original program.

Well, it can return true if you for example have

  ptr = ptr & ~3;
   ... *ptr

but then some other access may tell you that ptr was even more aligned.
But yes, the way you do it should be conservatively correct at least.

> So far I have successfully bootstrapped and tested this on
> x86_64-linux and have run parts of the testsuite on sparc64-linux and
> hppa-linux.  Full bootstrap on sparc64-linux and ppc64-linux are
> underway.
>
> I'm going to add a testcase for PR 55448 that will scan x86_64
> assembly output.
>
> What do you think?

See below - there is some confusion on my side regarding what memory
access alignment you query and why you have the two prev_base cases
(you constrain on TYPE_ALIGN, so you should always use TYPE_ALIGN?)

Richard.

> Thanks,
>
> Martin
>
>
> 2012-11-27  Martin Jambor  <mjambor@suse.cz>
>
>         PR middle-end/52890
>         PR tree-optimization/55415
>         PR tree-optimization/54386
>         PR target/55448
>         * ipa-prop.c (ipa_modify_call_arguments): Be optimistic when
>         get_pointer_alignment_1 returns false and the base was not a
>         dereference.
>         * tree-sra.c (access_precludes_ipa_sra_p): New parameter req_align,
>         added check for required alignment.  Update the user.
>
>         * testsuite/gcc.dg/ipa/ipa-sra-7.c: New test.
>
> *** /tmp/5XB8rd_ipa-prop.c      Tue Nov 27 21:34:54 2012
> --- gcc/ipa-prop.c      Tue Nov 27 17:04:20 2012
> *************** ipa_modify_call_arguments (struct cgraph
> *** 2888,2893 ****
> --- 2888,2894 ----
>         {
>           tree expr, base, off;
>           location_t loc;
> +         tree prev_base = NULL_TREE;
>
>           /* We create a new parameter out of the value of the old one, we can
>              do the following kind of transformations:
> *************** ipa_modify_call_arguments (struct cgraph
> *** 2919,2925 ****
>           else
>             {
>               HOST_WIDE_INT base_offset;
> -             tree prev_base;
>
>               if (TREE_CODE (base) == ADDR_EXPR)
>                 base = TREE_OPERAND (base, 0);
> --- 2920,2925 ----
> *************** ipa_modify_call_arguments (struct cgraph
> *** 2956,2962 ****
>               unsigned int align;
>               unsigned HOST_WIDE_INT misalign;
>
> !             get_pointer_alignment_1 (base, &align, &misalign);
>               misalign += (tree_to_double_int (off)
>                            .sext (TYPE_PRECISION (TREE_TYPE (off))).low
>                            * BITS_PER_UNIT);
> --- 2956,2973 ----
>               unsigned int align;
>               unsigned HOST_WIDE_INT misalign;
>
> !             if (!get_pointer_alignment_1 (base, &align, &misalign))
> !               {
> !                 if (prev_base
> !                     && (TREE_CODE (prev_base) == MEM_REF
> !                         || TREE_CODE (prev_base) == TARGET_MEM_REF))
> !                   {
> !                     gcc_assert (misalign == 0);
> !                     align = TYPE_ALIGN (TREE_TYPE (prev_base));
> !                   }
> !                 else
> !                   align = TYPE_ALIGN (type);

I'm quite sure that misalign is wrong after this.

What memory access is 'base', what memory access is 'prev_base'
(prev_base from the original memory reference in the callee that is
going to be replaced?  base the one that IPA SRA "constructed"
based on the pointer passed at the call site?)

> !               }
>               misalign += (tree_to_double_int (off)
>                            .sext (TYPE_PRECISION (TREE_TYPE (off))).low
>                            * BITS_PER_UNIT);
> *** /dev/null   Thu Oct 25 13:49:30 2012
> --- gcc/testsuite/gcc.dg/ipa/ipa-sra-7.c        Tue Nov 27 18:48:45 2012
> ***************
> *** 0 ****
> --- 1,42 ----
> + /* { dg-do run } */
> + /* { dg-options "-O3" } */
> +
> + typedef unsigned int myint __attribute__((aligned(1)));
> +
> + typedef struct __attribute__((packed)) S {
> +   unsigned a, b, c;
> + } SS;
> +
> + typedef SS __attribute__((aligned(1))) SSS;
> +
> +
> + static unsigned int __attribute__ ((noinline))
> + get_a (SSS *p)
> + {
> +   return p->a;
> + };
> +
> + static int __attribute__ ((noinline, noclone))
> + foo (SS *p)
> + {
> +   int r = (int) get_a(p) + 2;
> +   return r;
> + }
> +
> + char buf[512];
> +
> + static SSS * __attribute__ ((noinline, noclone))
> + get_sss (void)
> + {
> +   return (SSS *)(buf + 1);
> + }
> +
> +
> + int
> + main(int argc, char *argv[])
> + {
> +   SSS *p = get_sss();
> +   if (foo(p) != 2)
> +     __builtin_abort ();
> +   return 0;
> + }
> *** /tmp/Hp0Wyc_tree-sra.c      Tue Nov 27 21:34:54 2012
> --- gcc/tree-sra.c      Tue Nov 27 21:28:53 2012
> *************** unmodified_by_ref_scalar_representative
> *** 3891,3902 ****
>     return repr;
>   }
>
> ! /* Return true iff this access precludes IPA-SRA of the parameter it is
> !    associated with. */
>
>   static bool
> ! access_precludes_ipa_sra_p (struct access *access)
>   {
>     /* Avoid issues such as the second simple testcase in PR 42025.  The problem
>        is incompatible assign in a call statement (and possibly even in asm
>        statements).  This can be relaxed by using a new temporary but only for
> --- 3891,3903 ----
>     return repr;
>   }
>
> ! /* Return true iff this ACCESS precludes IPA-SRA of the parameter it is
> !    associated with.  REQ_ALIGN is the minimum required alignment.  */
>
>   static bool
> ! access_precludes_ipa_sra_p (struct access *access, unsigned int req_align)
>   {
> +   unsigned int exp_align;
>     /* Avoid issues such as the second simple testcase in PR 42025.  The problem
>        is incompatible assign in a call statement (and possibly even in asm
>        statements).  This can be relaxed by using a new temporary but only for
> *************** access_precludes_ipa_sra_p (struct acces
> *** 3908,3913 ****
> --- 3909,3918 ----
>           || gimple_code (access->stmt) == GIMPLE_ASM))
>       return true;
>
> +   exp_align = get_object_alignment (access->expr);

Is access->expr from the callee or the caller?

> +   if (exp_align < req_align)
> +     return true;
> +
>     return false;
>   }
>
> *************** splice_param_accesses (tree parm, bool *
> *** 3943,3949 ****
>         tree a1_alias_type;
>         access = (*access_vec)[i];
>         modification = access->write;
> !       if (access_precludes_ipa_sra_p (access))
>         return NULL;
>         a1_alias_type = reference_alias_ptr_type (access->expr);
>
> --- 3948,3954 ----
>         tree a1_alias_type;
>         access = (*access_vec)[i];
>         modification = access->write;
> !       if (access_precludes_ipa_sra_p (access, TYPE_ALIGN (access->type)))

access->type is not the type of the base object, right?  Which means it
is not correct here - the alignment of the access is that what
get_object_alignment returns, which looks at the base as to whether to
lower the alignment compared to TYPE_ALIGN of the access type itself.

This is from inside the callee, right?  So what you want to check here is
that if you end up optimistically using the alignment of the access type
that this is at most equal but not larger than the alignment of the access.

>         return NULL;
>         a1_alias_type = reference_alias_ptr_type (access->expr);
>
> *************** splice_param_accesses (tree parm, bool *
> *** 3966,3972 ****
>           else if (ac2->size != access->size)
>             return NULL;
>
> !         if (access_precludes_ipa_sra_p (ac2)
>               || (ac2->type != access->type
>                   && (TREE_ADDRESSABLE (ac2->type)
>                       || TREE_ADDRESSABLE (access->type)))
> --- 3971,3977 ----
>           else if (ac2->size != access->size)
>             return NULL;
>
> !         if (access_precludes_ipa_sra_p (ac2, TYPE_ALIGN (access->type))
>               || (ac2->type != access->type
>                   && (TREE_ADDRESSABLE (ac2->type)
>                       || TREE_ADDRESSABLE (access->type)))
Martin Jambor Nov. 29, 2012, 10:06 a.m. UTC | #5
Hi,

thanks for the review.  When writing a reply I realized I indeed made
a mistake or two in the part concerning prev_base and the code was not
what it intended to be.  I'll re-write it today.

Nevertheless, I also have a question regarding a different place of
the patch:

On Wed, Nov 28, 2012 at 02:11:51PM +0100, Richard Biener wrote:
> On Tue, Nov 27, 2012 at 9:37 PM, Martin Jambor <mjambor@suse.cz> wrote:
> > *** /tmp/Hp0Wyc_tree-sra.c      Tue Nov 27 21:34:54 2012
> > --- gcc/tree-sra.c      Tue Nov 27 21:28:53 2012
> > *************** unmodified_by_ref_scalar_representative
> > *** 3891,3902 ****
> >     return repr;
> >   }
> >
> > ! /* Return true iff this access precludes IPA-SRA of the parameter it is
> > !    associated with. */
> >
> >   static bool
> > ! access_precludes_ipa_sra_p (struct access *access)
> >   {
> >     /* Avoid issues such as the second simple testcase in PR 42025.  The problem
> >        is incompatible assign in a call statement (and possibly even in asm
> >        statements).  This can be relaxed by using a new temporary but only for
> > --- 3891,3903 ----
> >     return repr;
> >   }
> >
> > ! /* Return true iff this ACCESS precludes IPA-SRA of the parameter it is
> > !    associated with.  REQ_ALIGN is the minimum required alignment.  */
> >
> >   static bool
> > ! access_precludes_ipa_sra_p (struct access *access, unsigned int req_align)
> >   {
> > +   unsigned int exp_align;
> >     /* Avoid issues such as the second simple testcase in PR 42025.  The problem
> >        is incompatible assign in a call statement (and possibly even in asm
> >        statements).  This can be relaxed by using a new temporary but only for
> > *************** access_precludes_ipa_sra_p (struct acces
> > *** 3908,3913 ****
> > --- 3909,3918 ----
> >           || gimple_code (access->stmt) == GIMPLE_ASM))
> >       return true;
> >
> > +   exp_align = get_object_alignment (access->expr);
> 
> Is access->expr from the callee or the caller?

The callee.

> 
> > +   if (exp_align < req_align)
> > +     return true;
> > +
> >     return false;
> >   }
> >
> > *************** splice_param_accesses (tree parm, bool *
> > *** 3943,3949 ****
> >         tree a1_alias_type;
> >         access = (*access_vec)[i];
> >         modification = access->write;
> > !       if (access_precludes_ipa_sra_p (access))
> >         return NULL;
> >         a1_alias_type = reference_alias_ptr_type (access->expr);
> >
> > --- 3948,3954 ----
> >         tree a1_alias_type;
> >         access = (*access_vec)[i];
> >         modification = access->write;
> > !       if (access_precludes_ipa_sra_p (access, TYPE_ALIGN (access->type)))
> 
> access->type is not the type of the base object, right?

No, it is the actual type of the scalar memory access.

>  Which means it is not correct here - the alignment of the access is that what
> get_object_alignment returns, which looks at the base as to whether to
> lower the alignment compared to TYPE_ALIGN of the access type itself.

Oh, so I have to do something like (or even reuse) may_be_unaligned_p
from tree-ssa-loop-ivopts.c?  (If so, I'm wondering whether a few other
uses of get_object_alignment get this right...)

Thanks,

Martin
Richard Biener Nov. 29, 2012, 11:54 a.m. UTC | #6
On Thu, Nov 29, 2012 at 11:06 AM, Martin Jambor <mjambor@suse.cz> wrote:
> Hi,
>
> thanks for the review.  When writing a reply I realized I indeed made
> a mistake or two in the part concerning prev_base and the code was not
> what it intended to be.  I'll re-write it today.
>
> Nevertheless, I also have a question regarding a different place of
> the patch:
>
> On Wed, Nov 28, 2012 at 02:11:51PM +0100, Richard Biener wrote:
>> On Tue, Nov 27, 2012 at 9:37 PM, Martin Jambor <mjambor@suse.cz> wrote:
>> > *** /tmp/Hp0Wyc_tree-sra.c      Tue Nov 27 21:34:54 2012
>> > --- gcc/tree-sra.c      Tue Nov 27 21:28:53 2012
>> > *************** unmodified_by_ref_scalar_representative
>> > *** 3891,3902 ****
>> >     return repr;
>> >   }
>> >
>> > ! /* Return true iff this access precludes IPA-SRA of the parameter it is
>> > !    associated with. */
>> >
>> >   static bool
>> > ! access_precludes_ipa_sra_p (struct access *access)
>> >   {
>> >     /* Avoid issues such as the second simple testcase in PR 42025.  The problem
>> >        is incompatible assign in a call statement (and possibly even in asm
>> >        statements).  This can be relaxed by using a new temporary but only for
>> > --- 3891,3903 ----
>> >     return repr;
>> >   }
>> >
>> > ! /* Return true iff this ACCESS precludes IPA-SRA of the parameter it is
>> > !    associated with.  REQ_ALIGN is the minimum required alignment.  */
>> >
>> >   static bool
>> > ! access_precludes_ipa_sra_p (struct access *access, unsigned int req_align)
>> >   {
>> > +   unsigned int exp_align;
>> >     /* Avoid issues such as the second simple testcase in PR 42025.  The problem
>> >        is incompatible assign in a call statement (and possibly even in asm
>> >        statements).  This can be relaxed by using a new temporary but only for
>> > *************** access_precludes_ipa_sra_p (struct acces
>> > *** 3908,3913 ****
>> > --- 3909,3918 ----
>> >           || gimple_code (access->stmt) == GIMPLE_ASM))
>> >       return true;
>> >
>> > +   exp_align = get_object_alignment (access->expr);
>>
>> Is access->expr from the callee or the caller?
>
> The callee.
>
>>
>> > +   if (exp_align < req_align)
>> > +     return true;
>> > +
>> >     return false;
>> >   }
>> >
>> > *************** splice_param_accesses (tree parm, bool *
>> > *** 3943,3949 ****
>> >         tree a1_alias_type;
>> >         access = (*access_vec)[i];
>> >         modification = access->write;
>> > !       if (access_precludes_ipa_sra_p (access))
>> >         return NULL;
>> >         a1_alias_type = reference_alias_ptr_type (access->expr);
>> >
>> > --- 3948,3954 ----
>> >         tree a1_alias_type;
>> >         access = (*access_vec)[i];
>> >         modification = access->write;
>> > !       if (access_precludes_ipa_sra_p (access, TYPE_ALIGN (access->type)))
>>
>> access->type is not the type of the base object, right?
>
> No, it is the actual type of the scalar memory access.
>
>>  Which means it is not correct here - the alignment of the access is that what
>> get_object_alignment returns, which looks at the base as to whether to
>> lower the alignment compared to TYPE_ALIGN of the access type itself.
>
> Oh, so I have to do something like (or even reuse) may_be_unaligned_p
> from tree-ssa-loop-ivopts.c?  (If so, I'm wondering whether a few other
> uses of get_object_alignment get this right...)

Well, I think you should use get_object_alignment here, too, and punt if
that is less than TYPE_ALIGN (access->type) if you will end up using that.

get_object_alignment (ref) is the authorative answer to alignment questions.
Whether TYPE_ALIGN (TREE_TYPE (ref)) can be conservatively trusted
depends on whether get_object_alignment (ref) returns the same or bigger
alignment.  But if the type is all you can propagate in this case (and not
an explicit alignment value) you have to punt if the information from the
type isn't conservative.

Richard.

> Thanks,
>
> Martin
Martin Jambor Nov. 29, 2012, 12:33 p.m. UTC | #7
Hi,

On Thu, Nov 29, 2012 at 12:54:20PM +0100, Richard Biener wrote:
> On Thu, Nov 29, 2012 at 11:06 AM, Martin Jambor <mjambor@suse.cz> wrote:
> > Hi,
> >
> > thanks for the review.  When writing a reply I realized I indeed made
> > a mistake or two in the part concerning prev_base and the code was not
> > what it intended to be.  I'll re-write it today.
> >
> > Nevertheless, I also have a question regarding a different place of
> > the patch:
> >
> > On Wed, Nov 28, 2012 at 02:11:51PM +0100, Richard Biener wrote:
> >> On Tue, Nov 27, 2012 at 9:37 PM, Martin Jambor <mjambor@suse.cz> wrote:
> >> > *** /tmp/Hp0Wyc_tree-sra.c      Tue Nov 27 21:34:54 2012
> >> > --- gcc/tree-sra.c      Tue Nov 27 21:28:53 2012
> >> > *************** unmodified_by_ref_scalar_representative
> >> > *** 3891,3902 ****
> >> >     return repr;
> >> >   }
> >> >
> >> > ! /* Return true iff this access precludes IPA-SRA of the parameter it is
> >> > !    associated with. */
> >> >
> >> >   static bool
> >> > ! access_precludes_ipa_sra_p (struct access *access)
> >> >   {
> >> >     /* Avoid issues such as the second simple testcase in PR 42025.  The problem
> >> >        is incompatible assign in a call statement (and possibly even in asm
> >> >        statements).  This can be relaxed by using a new temporary but only for
> >> > --- 3891,3903 ----
> >> >     return repr;
> >> >   }
> >> >
> >> > ! /* Return true iff this ACCESS precludes IPA-SRA of the parameter it is
> >> > !    associated with.  REQ_ALIGN is the minimum required alignment.  */
> >> >
> >> >   static bool
> >> > ! access_precludes_ipa_sra_p (struct access *access, unsigned int req_align)
> >> >   {
> >> > +   unsigned int exp_align;
> >> >     /* Avoid issues such as the second simple testcase in PR 42025.  The problem
> >> >        is incompatible assign in a call statement (and possibly even in asm
> >> >        statements).  This can be relaxed by using a new temporary but only for
> >> > *************** access_precludes_ipa_sra_p (struct acces
> >> > *** 3908,3913 ****
> >> > --- 3909,3918 ----
> >> >           || gimple_code (access->stmt) == GIMPLE_ASM))
> >> >       return true;
> >> >
> >> > +   exp_align = get_object_alignment (access->expr);
> >>
> >> Is access->expr from the callee or the caller?
> >
> > The callee.
> >
> >>
> >> > +   if (exp_align < req_align)
> >> > +     return true;
> >> > +
> >> >     return false;
> >> >   }
> >> >
> >> > *************** splice_param_accesses (tree parm, bool *
> >> > *** 3943,3949 ****
> >> >         tree a1_alias_type;
> >> >         access = (*access_vec)[i];
> >> >         modification = access->write;
> >> > !       if (access_precludes_ipa_sra_p (access))
> >> >         return NULL;
> >> >         a1_alias_type = reference_alias_ptr_type (access->expr);
> >> >
> >> > --- 3948,3954 ----
> >> >         tree a1_alias_type;
> >> >         access = (*access_vec)[i];
> >> >         modification = access->write;
> >> > !       if (access_precludes_ipa_sra_p (access, TYPE_ALIGN (access->type)))
> >>
> >> access->type is not the type of the base object, right?
> >
> > No, it is the actual type of the scalar memory access.
> >
> >>  Which means it is not correct here - the alignment of the access is that what
> >> get_object_alignment returns, which looks at the base as to whether to
> >> lower the alignment compared to TYPE_ALIGN of the access type itself.
> >
> > Oh, so I have to do something like (or even reuse) may_be_unaligned_p
> > from tree-ssa-loop-ivopts.c?  (If so, I'm wondering whether a few other
> > uses of get_object_alignment get this right...)
> 
> Well, I think you should use get_object_alignment here, too, and punt if
> that is less than TYPE_ALIGN (access->type) if you will end up using that.
> 
> get_object_alignment (ref) is the authorative answer to alignment questions.
> Whether TYPE_ALIGN (TREE_TYPE (ref)) can be conservatively trusted
> depends on whether get_object_alignment (ref) returns the same or bigger
> alignment.  But if the type is all you can propagate in this case (and not
> an explicit alignment value) you have to punt if the information from the
> type isn't conservative.
> 

This must be some sort of misunderstanding, that is exactly what the
code is doing.  access_precludes_ipa_sra_p called get_object_alignment
on the reference and then returned false (i.e. prevented the
transformation) if the result was smaller than the TYPE_ALIGN of the
type that was going to be propagated to the callee.

Thanks,

Martin
Richard Biener Nov. 29, 2012, 2:36 p.m. UTC | #8
On Thu, Nov 29, 2012 at 1:33 PM, Martin Jambor <mjambor@suse.cz> wrote:
> Hi,
>
> On Thu, Nov 29, 2012 at 12:54:20PM +0100, Richard Biener wrote:
>> On Thu, Nov 29, 2012 at 11:06 AM, Martin Jambor <mjambor@suse.cz> wrote:
>> > Hi,
>> >
>> > thanks for the review.  When writing a reply I realized I indeed made
>> > a mistake or two in the part concerning prev_base and the code was not
>> > what it intended to be.  I'll re-write it today.
>> >
>> > Nevertheless, I also have a question regarding a different place of
>> > the patch:
>> >
>> > On Wed, Nov 28, 2012 at 02:11:51PM +0100, Richard Biener wrote:
>> >> On Tue, Nov 27, 2012 at 9:37 PM, Martin Jambor <mjambor@suse.cz> wrote:
>> >> > *** /tmp/Hp0Wyc_tree-sra.c      Tue Nov 27 21:34:54 2012
>> >> > --- gcc/tree-sra.c      Tue Nov 27 21:28:53 2012
>> >> > *************** unmodified_by_ref_scalar_representative
>> >> > *** 3891,3902 ****
>> >> >     return repr;
>> >> >   }
>> >> >
>> >> > ! /* Return true iff this access precludes IPA-SRA of the parameter it is
>> >> > !    associated with. */
>> >> >
>> >> >   static bool
>> >> > ! access_precludes_ipa_sra_p (struct access *access)
>> >> >   {
>> >> >     /* Avoid issues such as the second simple testcase in PR 42025.  The problem
>> >> >        is incompatible assign in a call statement (and possibly even in asm
>> >> >        statements).  This can be relaxed by using a new temporary but only for
>> >> > --- 3891,3903 ----
>> >> >     return repr;
>> >> >   }
>> >> >
>> >> > ! /* Return true iff this ACCESS precludes IPA-SRA of the parameter it is
>> >> > !    associated with.  REQ_ALIGN is the minimum required alignment.  */
>> >> >
>> >> >   static bool
>> >> > ! access_precludes_ipa_sra_p (struct access *access, unsigned int req_align)
>> >> >   {
>> >> > +   unsigned int exp_align;
>> >> >     /* Avoid issues such as the second simple testcase in PR 42025.  The problem
>> >> >        is incompatible assign in a call statement (and possibly even in asm
>> >> >        statements).  This can be relaxed by using a new temporary but only for
>> >> > *************** access_precludes_ipa_sra_p (struct acces
>> >> > *** 3908,3913 ****
>> >> > --- 3909,3918 ----
>> >> >           || gimple_code (access->stmt) == GIMPLE_ASM))
>> >> >       return true;
>> >> >
>> >> > +   exp_align = get_object_alignment (access->expr);
>> >>
>> >> Is access->expr from the callee or the caller?
>> >
>> > The callee.
>> >
>> >>
>> >> > +   if (exp_align < req_align)
>> >> > +     return true;
>> >> > +
>> >> >     return false;
>> >> >   }
>> >> >
>> >> > *************** splice_param_accesses (tree parm, bool *
>> >> > *** 3943,3949 ****
>> >> >         tree a1_alias_type;
>> >> >         access = (*access_vec)[i];
>> >> >         modification = access->write;
>> >> > !       if (access_precludes_ipa_sra_p (access))
>> >> >         return NULL;
>> >> >         a1_alias_type = reference_alias_ptr_type (access->expr);
>> >> >
>> >> > --- 3948,3954 ----
>> >> >         tree a1_alias_type;
>> >> >         access = (*access_vec)[i];
>> >> >         modification = access->write;
>> >> > !       if (access_precludes_ipa_sra_p (access, TYPE_ALIGN (access->type)))
>> >>
>> >> access->type is not the type of the base object, right?
>> >
>> > No, it is the actual type of the scalar memory access.
>> >
>> >>  Which means it is not correct here - the alignment of the access is that what
>> >> get_object_alignment returns, which looks at the base as to whether to
>> >> lower the alignment compared to TYPE_ALIGN of the access type itself.
>> >
>> > Oh, so I have to do something like (or even reuse) may_be_unaligned_p
>> > from tree-ssa-loop-ivopts.c?  (If so, I'm wondering whether a few other
>> > uses of get_object_alignment get this right...)
>>
>> Well, I think you should use get_object_alignment here, too, and punt if
>> that is less than TYPE_ALIGN (access->type) if you will end up using that.
>>
>> get_object_alignment (ref) is the authorative answer to alignment questions.
>> Whether TYPE_ALIGN (TREE_TYPE (ref)) can be conservatively trusted
>> depends on whether get_object_alignment (ref) returns the same or bigger
>> alignment.  But if the type is all you can propagate in this case (and not
>> an explicit alignment value) you have to punt if the information from the
>> type isn't conservative.
>>
>
> This must be some sort of misunderstanding, that is exactly what the
> code is doing.  access_precludes_ipa_sra_p called get_object_alignment
> on the reference and then returned false (i.e. prevented the
> transformation) if the result was smaller than the TYPE_ALIGN of the
> type that was going to be propagated to the callee.

Hm, ok, then this was a confusion on my part.

Richard.

> Thanks,
>
> Martin
>
Martin Jambor Nov. 30, 2012, 1:02 p.m. UTC | #9
On Thu, Nov 29, 2012 at 11:06:41AM +0100, Martin Jambor wrote:
> Hi,
> 
> thanks for the review.  When writing a reply I realized I indeed made
> a mistake or two in the part concerning prev_base and the code was not
> what it intended to be.  I'll re-write it today.

OK, this is it.  The part in tree-sra.c, in which I make IPA-SRA punt
when the alignment of the loads in the callee are not as big as
alignment of the required types of the new arguments, remained the
same.

In ipa-prop.h, I now use maximum of the alignment from
get_pointer_alignment_1 and the type alignment propagated from the
callee.  I now also recognize cases where the dereference is buried
below a COMPONENT_REF or ARRAY_REF (this is checked by new testcase
ipa-sra-9.c which fails at least on on sparc64 when things go wrong).

The patch has passed bootstrap and testing on x86_64-linux,
ppc64-linux and sparc64-linux.

Thanks,

Martin


2012-11-29  Martin Jambor  <mjambor@suse.cz>

	PR middle-end/52890
	PR tree-optimization/55415
	PR tree-optimization/54386
	PR target/55448
	* ipa-prop.c (ipa_modify_call_arguments): Be optimistic when
	get_pointer_alignment_1 returns false and the base was not a
	dereference.
	* tree-sra.c (access_precludes_ipa_sra_p): New parameter req_align,
	added check for required alignment.  Update the user.

	* testsuite/gcc.dg/ipa/ipa-sra-7.c: New test.
	* testsuite/gcc.dg/ipa/ipa-sra-8.c: Likewise.
	* testsuite/gcc.dg/ipa/ipa-sra-9.c: Likewise.
	* testsuite/gcc.target/i386/pr55448.c: Likewise.


*** /tmp/cY007b_ipa-prop.c	Fri Nov 30 12:26:26 2012
--- gcc/ipa-prop.c	Thu Nov 29 16:05:04 2012
*************** ipa_modify_call_arguments (struct cgraph
*** 2888,2893 ****
--- 2888,2895 ----
  	{
  	  tree expr, base, off;
  	  location_t loc;
+ 	  unsigned int deref_align;
+ 	  bool deref_base = false;
  
  	  /* We create a new parameter out of the value of the old one, we can
  	     do the following kind of transformations:
*************** ipa_modify_call_arguments (struct cgraph
*** 2920,2928 ****
  	    {
  	      HOST_WIDE_INT base_offset;
  	      tree prev_base;
  
  	      if (TREE_CODE (base) == ADDR_EXPR)
! 		base = TREE_OPERAND (base, 0);
  	      prev_base = base;
  	      base = get_addr_base_and_unit_offset (base, &base_offset);
  	      /* Aggregate arguments can have non-invariant addresses.  */
--- 2922,2936 ----
  	    {
  	      HOST_WIDE_INT base_offset;
  	      tree prev_base;
+ 	      bool addrof;
  
  	      if (TREE_CODE (base) == ADDR_EXPR)
! 		{
! 		  base = TREE_OPERAND (base, 0);
! 		  addrof = true;
! 		}
! 	      else
! 		addrof = false;
  	      prev_base = base;
  	      base = get_addr_base_and_unit_offset (base, &base_offset);
  	      /* Aggregate arguments can have non-invariant addresses.  */
*************** ipa_modify_call_arguments (struct cgraph
*** 2934,2939 ****
--- 2942,2952 ----
  		}
  	      else if (TREE_CODE (base) == MEM_REF)
  		{
+ 		  if (!addrof)
+ 		    {
+ 		      deref_base = true;
+ 		      deref_align = TYPE_ALIGN (TREE_TYPE (base));
+ 		    }
  		  off = build_int_cst (adj->alias_ptr_type,
  				       base_offset
  				       + adj->offset / BITS_PER_UNIT);
*************** ipa_modify_call_arguments (struct cgraph
*** 2956,2962 ****
  	      unsigned int align;
  	      unsigned HOST_WIDE_INT misalign;
  
! 	      get_pointer_alignment_1 (base, &align, &misalign);
  	      misalign += (tree_to_double_int (off)
  			   .sext (TYPE_PRECISION (TREE_TYPE (off))).low
  			   * BITS_PER_UNIT);
--- 2969,2985 ----
  	      unsigned int align;
  	      unsigned HOST_WIDE_INT misalign;
  
! 	      if (deref_base)
! 		{
! 		  align = deref_align;
! 		  misalign = 0;
! 		}
! 	      else
! 		{
! 		  get_pointer_alignment_1 (base, &align, &misalign);
! 		  if (TYPE_ALIGN (type) > align)
! 		    align = TYPE_ALIGN (type);
! 		}
  	      misalign += (tree_to_double_int (off)
  			   .sext (TYPE_PRECISION (TREE_TYPE (off))).low
  			   * BITS_PER_UNIT);
*** /dev/null	Thu Oct 25 13:49:30 2012
--- gcc/testsuite/gcc.dg/ipa/ipa-sra-7.c	Wed Nov 28 14:19:30 2012
***************
*** 0 ****
--- 1,42 ----
+ /* { dg-do run } */
+ /* { dg-options "-O3" } */
+ 
+ typedef unsigned int myint __attribute__((aligned(1)));
+ 
+ typedef struct __attribute__((packed)) S {
+   unsigned a, b, c;
+ } SS;
+ 
+ typedef SS __attribute__((aligned(1))) SSS;
+ 
+ 
+ static unsigned int __attribute__ ((noinline))
+ get_a (SSS *p)
+ {
+   return p->a;
+ };
+ 
+ static int __attribute__ ((noinline, noclone))
+ foo (SS *p)
+ {
+   int r = (int) get_a(p) + 2;
+   return r;
+ }
+ 
+ char buf[512];
+ 
+ static SSS * __attribute__ ((noinline, noclone))
+ get_sss (void)
+ {
+   return (SSS *)(buf + 1);
+ }
+ 
+ 
+ int
+ main(int argc, char *argv[])
+ {
+   SSS *p = get_sss();
+   if (foo(p) != 2)
+     __builtin_abort ();
+   return 0;
+ }
*** /dev/null	Thu Oct 25 13:49:30 2012
--- gcc/testsuite/gcc.dg/ipa/ipa-sra-8.c	Fri Nov 30 00:52:03 2012
***************
*** 0 ****
--- 1,41 ----
+ /* { dg-do run } */
+ /* { dg-options "-O3" } */
+ 
+ typedef unsigned int myint __attribute__((aligned(1)));
+ 
+ typedef struct S {
+   unsigned a, b, c;
+ } SS;
+ 
+ typedef SS __attribute__((aligned(1))) SSS;
+ 
+ 
+ static unsigned int __attribute__ ((noinline))
+ get_a (SS s)
+ {
+   return s.a;
+ };
+ 
+ static int __attribute__ ((noinline, noclone))
+ foo (SSS *p)
+ {
+   int r = (int) get_a(*p) + 2;
+   return r;
+ }
+ 
+ char buf[512];
+ 
+ static SSS * __attribute__ ((noinline, noclone))
+ get_sss (void)
+ {
+   return (SSS *)(buf + 1);
+ }
+ 
+ int
+ main(int argc, char *argv[])
+ {
+   SSS *p = get_sss();
+   if (foo(p) != 2)
+     __builtin_abort ();
+   return 0;
+ }
*** /dev/null	Thu Oct 25 13:49:30 2012
--- gcc/testsuite/gcc.dg/ipa/ipa-sra-9.c	Fri Nov 30 00:52:12 2012
***************
*** 0 ****
--- 1,44 ----
+ /* { dg-do run } */
+ /* { dg-options "-O3" } */
+ 
+ typedef unsigned int myint __attribute__((aligned(1)));
+ 
+ typedef struct S {
+   unsigned a, b, c;
+ } SS;
+ 
+ typedef struct U {
+   SS s[2];
+ } UU;
+ 
+ typedef UU __attribute__((aligned(1))) UUU;
+ 
+ static unsigned int __attribute__ ((noinline))
+ get_a (SS s)
+ {
+   return s.a;
+ };
+ 
+ static int __attribute__ ((noinline, noclone))
+ foo (UUU *p)
+ {
+   int r = (int) get_a(p->s[0]) + 2;
+   return r;
+ }
+ 
+ char buf[512];
+ 
+ static UUU * __attribute__ ((noinline, noclone))
+ get_uuu (void)
+ {
+   return (UUU *)(buf + 1);
+ }
+ 
+ int
+ main(int argc, char *argv[])
+ {
+   UUU *p = get_uuu();
+   if (foo(p) != 2)
+     __builtin_abort ();
+   return 0;
+ }
*** /dev/null	Thu Oct 25 13:49:30 2012
--- gcc/testsuite/gcc.target/i386/pr55448.c	Thu Nov 29 15:42:16 2012
***************
*** 0 ****
--- 1,26 ----
+ /* { dg-do compile } */
+ /* { dg-options "-O2 -mavx" } */
+ 
+ #include <immintrin.h>
+ 
+ static inline __m256 add1(const __m256 *a, const __m256 *b)
+ {
+   return _mm256_add_ps(*a, *b);
+ }
+ 
+ void foo1(__m256 *a, const __m256 b)
+ {
+   *a = add1(a, &b);
+ }
+ 
+ static inline __m128 add2(const __m128 *a, const __m128 *b)
+ {
+   return _mm_add_ps(*a, *b);
+ }
+ 
+ void foo2(__m128 *a, const __m128 b)
+ {
+   *a = add2(a, &b);
+ }
+ 
+ /* { dg-final { scan-assembler-not "vmovups" } } */
*** /tmp/yTaOsa_tree-sra.c	Fri Nov 30 12:26:26 2012
--- gcc/tree-sra.c	Wed Nov 28 14:19:30 2012
*************** unmodified_by_ref_scalar_representative
*** 3891,3902 ****
    return repr;
  }
  
! /* Return true iff this access precludes IPA-SRA of the parameter it is
!    associated with. */
  
  static bool
! access_precludes_ipa_sra_p (struct access *access)
  {
    /* Avoid issues such as the second simple testcase in PR 42025.  The problem
       is incompatible assign in a call statement (and possibly even in asm
       statements).  This can be relaxed by using a new temporary but only for
--- 3891,3903 ----
    return repr;
  }
  
! /* Return true iff this ACCESS precludes IPA-SRA of the parameter it is
!    associated with.  REQ_ALIGN is the minimum required alignment.  */
  
  static bool
! access_precludes_ipa_sra_p (struct access *access, unsigned int req_align)
  {
+   unsigned int exp_align;
    /* Avoid issues such as the second simple testcase in PR 42025.  The problem
       is incompatible assign in a call statement (and possibly even in asm
       statements).  This can be relaxed by using a new temporary but only for
*************** access_precludes_ipa_sra_p (struct acces
*** 3908,3913 ****
--- 3909,3918 ----
  	  || gimple_code (access->stmt) == GIMPLE_ASM))
      return true;
  
+   exp_align = get_object_alignment (access->expr);
+   if (exp_align < req_align)
+     return true;
+ 
    return false;
  }
  
*************** splice_param_accesses (tree parm, bool *
*** 3943,3949 ****
        tree a1_alias_type;
        access = (*access_vec)[i];
        modification = access->write;
!       if (access_precludes_ipa_sra_p (access))
  	return NULL;
        a1_alias_type = reference_alias_ptr_type (access->expr);
  
--- 3948,3954 ----
        tree a1_alias_type;
        access = (*access_vec)[i];
        modification = access->write;
!       if (access_precludes_ipa_sra_p (access, TYPE_ALIGN (access->type)))
  	return NULL;
        a1_alias_type = reference_alias_ptr_type (access->expr);
  
*************** splice_param_accesses (tree parm, bool *
*** 3966,3972 ****
  	  else if (ac2->size != access->size)
  	    return NULL;
  
! 	  if (access_precludes_ipa_sra_p (ac2)
  	      || (ac2->type != access->type
  		  && (TREE_ADDRESSABLE (ac2->type)
  		      || TREE_ADDRESSABLE (access->type)))
--- 3971,3977 ----
  	  else if (ac2->size != access->size)
  	    return NULL;
  
! 	  if (access_precludes_ipa_sra_p (ac2, TYPE_ALIGN (access->type))
  	      || (ac2->type != access->type
  		  && (TREE_ADDRESSABLE (ac2->type)
  		      || TREE_ADDRESSABLE (access->type)))
Richard Biener Nov. 30, 2012, 1:25 p.m. UTC | #10
On Fri, Nov 30, 2012 at 2:02 PM, Martin Jambor <mjambor@suse.cz> wrote:
> On Thu, Nov 29, 2012 at 11:06:41AM +0100, Martin Jambor wrote:
>> Hi,
>>
>> thanks for the review.  When writing a reply I realized I indeed made
>> a mistake or two in the part concerning prev_base and the code was not
>> what it intended to be.  I'll re-write it today.
>
> OK, this is it.  The part in tree-sra.c, in which I make IPA-SRA punt
> when the alignment of the loads in the callee are not as big as
> alignment of the required types of the new arguments, remained the
> same.
>
> In ipa-prop.h, I now use maximum of the alignment from
> get_pointer_alignment_1 and the type alignment propagated from the
> callee.  I now also recognize cases where the dereference is buried
> below a COMPONENT_REF or ARRAY_REF (this is checked by new testcase
> ipa-sra-9.c which fails at least on on sparc64 when things go wrong).
>
> The patch has passed bootstrap and testing on x86_64-linux,
> ppc64-linux and sparc64-linux.

Works for me.

Thanks,
Richard.

> Thanks,
>
> Martin
>
>
> 2012-11-29  Martin Jambor  <mjambor@suse.cz>
>
>         PR middle-end/52890
>         PR tree-optimization/55415
>         PR tree-optimization/54386
>         PR target/55448
>         * ipa-prop.c (ipa_modify_call_arguments): Be optimistic when
>         get_pointer_alignment_1 returns false and the base was not a
>         dereference.
>         * tree-sra.c (access_precludes_ipa_sra_p): New parameter req_align,
>         added check for required alignment.  Update the user.
>
>         * testsuite/gcc.dg/ipa/ipa-sra-7.c: New test.
>         * testsuite/gcc.dg/ipa/ipa-sra-8.c: Likewise.
>         * testsuite/gcc.dg/ipa/ipa-sra-9.c: Likewise.
>         * testsuite/gcc.target/i386/pr55448.c: Likewise.
>
>
> *** /tmp/cY007b_ipa-prop.c      Fri Nov 30 12:26:26 2012
> --- gcc/ipa-prop.c      Thu Nov 29 16:05:04 2012
> *************** ipa_modify_call_arguments (struct cgraph
> *** 2888,2893 ****
> --- 2888,2895 ----
>         {
>           tree expr, base, off;
>           location_t loc;
> +         unsigned int deref_align;
> +         bool deref_base = false;
>
>           /* We create a new parameter out of the value of the old one, we can
>              do the following kind of transformations:
> *************** ipa_modify_call_arguments (struct cgraph
> *** 2920,2928 ****
>             {
>               HOST_WIDE_INT base_offset;
>               tree prev_base;
>
>               if (TREE_CODE (base) == ADDR_EXPR)
> !               base = TREE_OPERAND (base, 0);
>               prev_base = base;
>               base = get_addr_base_and_unit_offset (base, &base_offset);
>               /* Aggregate arguments can have non-invariant addresses.  */
> --- 2922,2936 ----
>             {
>               HOST_WIDE_INT base_offset;
>               tree prev_base;
> +             bool addrof;
>
>               if (TREE_CODE (base) == ADDR_EXPR)
> !               {
> !                 base = TREE_OPERAND (base, 0);
> !                 addrof = true;
> !               }
> !             else
> !               addrof = false;
>               prev_base = base;
>               base = get_addr_base_and_unit_offset (base, &base_offset);
>               /* Aggregate arguments can have non-invariant addresses.  */
> *************** ipa_modify_call_arguments (struct cgraph
> *** 2934,2939 ****
> --- 2942,2952 ----
>                 }
>               else if (TREE_CODE (base) == MEM_REF)
>                 {
> +                 if (!addrof)
> +                   {
> +                     deref_base = true;
> +                     deref_align = TYPE_ALIGN (TREE_TYPE (base));
> +                   }
>                   off = build_int_cst (adj->alias_ptr_type,
>                                        base_offset
>                                        + adj->offset / BITS_PER_UNIT);
> *************** ipa_modify_call_arguments (struct cgraph
> *** 2956,2962 ****
>               unsigned int align;
>               unsigned HOST_WIDE_INT misalign;
>
> !             get_pointer_alignment_1 (base, &align, &misalign);
>               misalign += (tree_to_double_int (off)
>                            .sext (TYPE_PRECISION (TREE_TYPE (off))).low
>                            * BITS_PER_UNIT);
> --- 2969,2985 ----
>               unsigned int align;
>               unsigned HOST_WIDE_INT misalign;
>
> !             if (deref_base)
> !               {
> !                 align = deref_align;
> !                 misalign = 0;
> !               }
> !             else
> !               {
> !                 get_pointer_alignment_1 (base, &align, &misalign);
> !                 if (TYPE_ALIGN (type) > align)
> !                   align = TYPE_ALIGN (type);
> !               }
>               misalign += (tree_to_double_int (off)
>                            .sext (TYPE_PRECISION (TREE_TYPE (off))).low
>                            * BITS_PER_UNIT);
> *** /dev/null   Thu Oct 25 13:49:30 2012
> --- gcc/testsuite/gcc.dg/ipa/ipa-sra-7.c        Wed Nov 28 14:19:30 2012
> ***************
> *** 0 ****
> --- 1,42 ----
> + /* { dg-do run } */
> + /* { dg-options "-O3" } */
> +
> + typedef unsigned int myint __attribute__((aligned(1)));
> +
> + typedef struct __attribute__((packed)) S {
> +   unsigned a, b, c;
> + } SS;
> +
> + typedef SS __attribute__((aligned(1))) SSS;
> +
> +
> + static unsigned int __attribute__ ((noinline))
> + get_a (SSS *p)
> + {
> +   return p->a;
> + };
> +
> + static int __attribute__ ((noinline, noclone))
> + foo (SS *p)
> + {
> +   int r = (int) get_a(p) + 2;
> +   return r;
> + }
> +
> + char buf[512];
> +
> + static SSS * __attribute__ ((noinline, noclone))
> + get_sss (void)
> + {
> +   return (SSS *)(buf + 1);
> + }
> +
> +
> + int
> + main(int argc, char *argv[])
> + {
> +   SSS *p = get_sss();
> +   if (foo(p) != 2)
> +     __builtin_abort ();
> +   return 0;
> + }
> *** /dev/null   Thu Oct 25 13:49:30 2012
> --- gcc/testsuite/gcc.dg/ipa/ipa-sra-8.c        Fri Nov 30 00:52:03 2012
> ***************
> *** 0 ****
> --- 1,41 ----
> + /* { dg-do run } */
> + /* { dg-options "-O3" } */
> +
> + typedef unsigned int myint __attribute__((aligned(1)));
> +
> + typedef struct S {
> +   unsigned a, b, c;
> + } SS;
> +
> + typedef SS __attribute__((aligned(1))) SSS;
> +
> +
> + static unsigned int __attribute__ ((noinline))
> + get_a (SS s)
> + {
> +   return s.a;
> + };
> +
> + static int __attribute__ ((noinline, noclone))
> + foo (SSS *p)
> + {
> +   int r = (int) get_a(*p) + 2;
> +   return r;
> + }
> +
> + char buf[512];
> +
> + static SSS * __attribute__ ((noinline, noclone))
> + get_sss (void)
> + {
> +   return (SSS *)(buf + 1);
> + }
> +
> + int
> + main(int argc, char *argv[])
> + {
> +   SSS *p = get_sss();
> +   if (foo(p) != 2)
> +     __builtin_abort ();
> +   return 0;
> + }
> *** /dev/null   Thu Oct 25 13:49:30 2012
> --- gcc/testsuite/gcc.dg/ipa/ipa-sra-9.c        Fri Nov 30 00:52:12 2012
> ***************
> *** 0 ****
> --- 1,44 ----
> + /* { dg-do run } */
> + /* { dg-options "-O3" } */
> +
> + typedef unsigned int myint __attribute__((aligned(1)));
> +
> + typedef struct S {
> +   unsigned a, b, c;
> + } SS;
> +
> + typedef struct U {
> +   SS s[2];
> + } UU;
> +
> + typedef UU __attribute__((aligned(1))) UUU;
> +
> + static unsigned int __attribute__ ((noinline))
> + get_a (SS s)
> + {
> +   return s.a;
> + };
> +
> + static int __attribute__ ((noinline, noclone))
> + foo (UUU *p)
> + {
> +   int r = (int) get_a(p->s[0]) + 2;
> +   return r;
> + }
> +
> + char buf[512];
> +
> + static UUU * __attribute__ ((noinline, noclone))
> + get_uuu (void)
> + {
> +   return (UUU *)(buf + 1);
> + }
> +
> + int
> + main(int argc, char *argv[])
> + {
> +   UUU *p = get_uuu();
> +   if (foo(p) != 2)
> +     __builtin_abort ();
> +   return 0;
> + }
> *** /dev/null   Thu Oct 25 13:49:30 2012
> --- gcc/testsuite/gcc.target/i386/pr55448.c     Thu Nov 29 15:42:16 2012
> ***************
> *** 0 ****
> --- 1,26 ----
> + /* { dg-do compile } */
> + /* { dg-options "-O2 -mavx" } */
> +
> + #include <immintrin.h>
> +
> + static inline __m256 add1(const __m256 *a, const __m256 *b)
> + {
> +   return _mm256_add_ps(*a, *b);
> + }
> +
> + void foo1(__m256 *a, const __m256 b)
> + {
> +   *a = add1(a, &b);
> + }
> +
> + static inline __m128 add2(const __m128 *a, const __m128 *b)
> + {
> +   return _mm_add_ps(*a, *b);
> + }
> +
> + void foo2(__m128 *a, const __m128 b)
> + {
> +   *a = add2(a, &b);
> + }
> +
> + /* { dg-final { scan-assembler-not "vmovups" } } */
> *** /tmp/yTaOsa_tree-sra.c      Fri Nov 30 12:26:26 2012
> --- gcc/tree-sra.c      Wed Nov 28 14:19:30 2012
> *************** unmodified_by_ref_scalar_representative
> *** 3891,3902 ****
>     return repr;
>   }
>
> ! /* Return true iff this access precludes IPA-SRA of the parameter it is
> !    associated with. */
>
>   static bool
> ! access_precludes_ipa_sra_p (struct access *access)
>   {
>     /* Avoid issues such as the second simple testcase in PR 42025.  The problem
>        is incompatible assign in a call statement (and possibly even in asm
>        statements).  This can be relaxed by using a new temporary but only for
> --- 3891,3903 ----
>     return repr;
>   }
>
> ! /* Return true iff this ACCESS precludes IPA-SRA of the parameter it is
> !    associated with.  REQ_ALIGN is the minimum required alignment.  */
>
>   static bool
> ! access_precludes_ipa_sra_p (struct access *access, unsigned int req_align)
>   {
> +   unsigned int exp_align;
>     /* Avoid issues such as the second simple testcase in PR 42025.  The problem
>        is incompatible assign in a call statement (and possibly even in asm
>        statements).  This can be relaxed by using a new temporary but only for
> *************** access_precludes_ipa_sra_p (struct acces
> *** 3908,3913 ****
> --- 3909,3918 ----
>           || gimple_code (access->stmt) == GIMPLE_ASM))
>       return true;
>
> +   exp_align = get_object_alignment (access->expr);
> +   if (exp_align < req_align)
> +     return true;
> +
>     return false;
>   }
>
> *************** splice_param_accesses (tree parm, bool *
> *** 3943,3949 ****
>         tree a1_alias_type;
>         access = (*access_vec)[i];
>         modification = access->write;
> !       if (access_precludes_ipa_sra_p (access))
>         return NULL;
>         a1_alias_type = reference_alias_ptr_type (access->expr);
>
> --- 3948,3954 ----
>         tree a1_alias_type;
>         access = (*access_vec)[i];
>         modification = access->write;
> !       if (access_precludes_ipa_sra_p (access, TYPE_ALIGN (access->type)))
>         return NULL;
>         a1_alias_type = reference_alias_ptr_type (access->expr);
>
> *************** splice_param_accesses (tree parm, bool *
> *** 3966,3972 ****
>           else if (ac2->size != access->size)
>             return NULL;
>
> !         if (access_precludes_ipa_sra_p (ac2)
>               || (ac2->type != access->type
>                   && (TREE_ADDRESSABLE (ac2->type)
>                       || TREE_ADDRESSABLE (access->type)))
> --- 3971,3977 ----
>           else if (ac2->size != access->size)
>             return NULL;
>
> !         if (access_precludes_ipa_sra_p (ac2, TYPE_ALIGN (access->type))
>               || (ac2->type != access->type
>                   && (TREE_ADDRESSABLE (ac2->type)
>                       || TREE_ADDRESSABLE (access->type)))
diff mbox

Patch

diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
index 3150bd6..d117389 100644
--- a/gcc/ipa-prop.c
+++ b/gcc/ipa-prop.c
@@ -2956,15 +2956,17 @@  ipa_modify_call_arguments (struct cgraph_edge *cs, gimple stmt,
 	      unsigned int align;
 	      unsigned HOST_WIDE_INT misalign;
 
-	      get_pointer_alignment_1 (base, &align, &misalign);
-	      misalign += (tree_to_double_int (off)
-			   .sext (TYPE_PRECISION (TREE_TYPE (off))).low
-			   * BITS_PER_UNIT);
-	      misalign = misalign & (align - 1);
-	      if (misalign != 0)
-		align = (misalign & -misalign);
-	      if (align < TYPE_ALIGN (type))
-		type = build_aligned_type (type, align);
+	      if (get_pointer_alignment_1 (base, &align, &misalign))
+		{
+		  misalign += (tree_to_double_int (off)
+			       .sext (TYPE_PRECISION (TREE_TYPE (off))).low
+			       * BITS_PER_UNIT);
+	          misalign = misalign & (align - 1);
+	          if (misalign != 0)
+		    align = (misalign & -misalign);
+		  if (align < TYPE_ALIGN (type))
+		    type = build_aligned_type (type, align);
+		}
 	      expr = fold_build2_loc (loc, MEM_REF, type, base, off);
 	    }
 	  else