Message ID | 50ABBCC4.5070405@redhat.com |
---|---|
State | New |
Headers | show |
Hi, On Tue, Nov 20, 2012 at 09:24:20AM -0800, Richard Henderson wrote: > The get_pointer_alignment function can indicate that it does not know > what the alignment should be, and it always fills in worst-case values > for that case. We should not use these worst-case values to "optimize" > the interface of a function. > > At minimum I think something like the following would be good. But > I'm unsure why we would *ever* want to lower the alignment at a function > interface. It seems to me that we'd simply want the caller to handle > copying the data to an aligned location? > > What was the use case of this code in the first place? PR 52402, and not surprisingly, that testcase fails on x86_64-linux with your patch. Furthermore, misalignment due to MEM_REF offset has to be accounted for whether we know the alignment of base or not. I was hoping that we could do something along the lines of build_ref_for_offset tree-sra.c but that would not work for cases like the one from PR 52890, comment 7 when there is no dereference in the caller, we tranform: D.2537 = &this->surfaces; sPtr.0 = ggTrain<mrSurface*>::_ZNK7ggTrainIP9mrSurfaceEixEi.isra.0 (D.2537, 1); into D.2537 = &this->surfaces; D.2554 = MEM[(const struct ggTrain *)D.2537]; sPtr.0 = ggTrain<mrSurface*>::_ZNK7ggTrainIP9mrSurfaceEixEi.isra.0 (D.2537, 1); At the moment I hope I'll be able to: 1) Bring back tree_non_aligned_mem_p from 4.6 to be used in access_precludes_ipa_sra_p. This way, any knowingly misaligned load in the callee should will not be transferred to the caller, if there is none. 2) In ipa_modify_call_arguments, be optimistic in like in your patch except for the case when we are looking at a dereference (i.e. we are turning a BLK dereference into a smaller scalar type one). In that case, use its alignment like in build_ref_for_offset. But I'd like to think about this a bit more. I believe we should ask for Richi's approval when he comes back and recovers anyway. But this is now my highest priority (finally). Thanks, Martin > > > > r~ > diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c > index 3150bd6..d117389 100644 > --- a/gcc/ipa-prop.c > +++ b/gcc/ipa-prop.c > @@ -2956,15 +2956,17 @@ ipa_modify_call_arguments (struct cgraph_edge *cs, gimple stmt, > unsigned int align; > unsigned HOST_WIDE_INT misalign; > > - get_pointer_alignment_1 (base, &align, &misalign); > - misalign += (tree_to_double_int (off) > - .sext (TYPE_PRECISION (TREE_TYPE (off))).low > - * BITS_PER_UNIT); > - misalign = misalign & (align - 1); > - if (misalign != 0) > - align = (misalign & -misalign); > - if (align < TYPE_ALIGN (type)) > - type = build_aligned_type (type, align); > + if (get_pointer_alignment_1 (base, &align, &misalign)) > + { > + misalign += (tree_to_double_int (off) > + .sext (TYPE_PRECISION (TREE_TYPE (off))).low > + * BITS_PER_UNIT); > + misalign = misalign & (align - 1); > + if (misalign != 0) > + align = (misalign & -misalign); > + if (align < TYPE_ALIGN (type)) > + type = build_aligned_type (type, align); > + } > expr = fold_build2_loc (loc, MEM_REF, type, base, off); > } > else
On Wed, Nov 21, 2012 at 5:58 PM, Martin Jambor <mjambor@suse.cz> wrote: > Hi, > > On Tue, Nov 20, 2012 at 09:24:20AM -0800, Richard Henderson wrote: >> The get_pointer_alignment function can indicate that it does not know >> what the alignment should be, and it always fills in worst-case values >> for that case. We should not use these worst-case values to "optimize" >> the interface of a function. >> >> At minimum I think something like the following would be good. But >> I'm unsure why we would *ever* want to lower the alignment at a function >> interface. It seems to me that we'd simply want the caller to handle >> copying the data to an aligned location? >> >> What was the use case of this code in the first place? > > PR 52402, and not surprisingly, that testcase fails on x86_64-linux > with your patch. Furthermore, misalignment due to MEM_REF offset has > to be accounted for whether we know the alignment of base or not. > > I was hoping that we could do something along the lines of > build_ref_for_offset tree-sra.c but that would not work for cases like > the one from PR 52890, comment 7 when there is no dereference in the > caller, we tranform: > > D.2537 = &this->surfaces; > sPtr.0 = ggTrain<mrSurface*>::_ZNK7ggTrainIP9mrSurfaceEixEi.isra.0 (D.2537, 1); > > into > > D.2537 = &this->surfaces; > D.2554 = MEM[(const struct ggTrain *)D.2537]; > sPtr.0 = ggTrain<mrSurface*>::_ZNK7ggTrainIP9mrSurfaceEixEi.isra.0 (D.2537, 1); > > At the moment I hope I'll be able to: > > 1) Bring back tree_non_aligned_mem_p from 4.6 to be used in > access_precludes_ipa_sra_p. This way, any knowingly misaligned > load in the callee should will not be transferred to the caller, if > there is none. > > 2) In ipa_modify_call_arguments, be optimistic in like in your patch > except for the case when we are looking at a dereference (i.e. we > are turning a BLK dereference into a smaller scalar type one). In > that case, use its alignment like in build_ref_for_offset. > > But I'd like to think about this a bit more. I believe we should ask > for Richi's approval when he comes back and recovers anyway. But this > is now my highest priority (finally). The issue here is that IPA SRA, when seeing foo (T *addr) { *addr; } foo (&mem); wanting to transform it to foo' (T value) { value; } value = *(T *)mem; foo (value); has to use the _same_ alignment for the access to mem as it was used inside foo. That's because of the fact that alignment information in GIMPLE is solely present at memory accesses and _not_ in any way associated with pointer types (as opposed to more strict rules imposed by some languages such as C, often enough violated by users, *(char *)(int *)p is an access aligned to at least 4 bytes in C). IPA SRA can use bigger alignment if it knows that is safe (thus get_pointer_alignment returns something larger than what the actual access in foo was using). What IPA SRA at the moment cannot do is "propagate" larger alignment used in foo to the call site (AFAIK). Thus, technically IPA SRA can use MAX (get_pointer_alignment (call site), get_object_alignment (original dereference site)). For fixing pessimizations caused by IPA SRA I suggest to simply punt (not do the transform) if get_pointer_alignment_1 returns false for the actual call argument. Or implement the propagation. Richard. > Thanks, > > Martin > > >> >> >> >> r~ > >> diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c >> index 3150bd6..d117389 100644 >> --- a/gcc/ipa-prop.c >> +++ b/gcc/ipa-prop.c >> @@ -2956,15 +2956,17 @@ ipa_modify_call_arguments (struct cgraph_edge *cs, gimple stmt, >> unsigned int align; >> unsigned HOST_WIDE_INT misalign; >> >> - get_pointer_alignment_1 (base, &align, &misalign); >> - misalign += (tree_to_double_int (off) >> - .sext (TYPE_PRECISION (TREE_TYPE (off))).low >> - * BITS_PER_UNIT); >> - misalign = misalign & (align - 1); >> - if (misalign != 0) >> - align = (misalign & -misalign); >> - if (align < TYPE_ALIGN (type)) >> - type = build_aligned_type (type, align); >> + if (get_pointer_alignment_1 (base, &align, &misalign)) >> + { >> + misalign += (tree_to_double_int (off) >> + .sext (TYPE_PRECISION (TREE_TYPE (off))).low >> + * BITS_PER_UNIT); >> + misalign = misalign & (align - 1); >> + if (misalign != 0) >> + align = (misalign & -misalign); >> + if (align < TYPE_ALIGN (type)) >> + type = build_aligned_type (type, align); >> + } >> expr = fold_build2_loc (loc, MEM_REF, type, base, off); >> } >> else >
Hi, On Tue, Nov 27, 2012 at 02:02:42PM +0100, Richard Biener wrote: > On Wed, Nov 21, 2012 at 5:58 PM, Martin Jambor <mjambor@suse.cz> wrote: > > Hi, > > > > On Tue, Nov 20, 2012 at 09:24:20AM -0800, Richard Henderson wrote: > >> The get_pointer_alignment function can indicate that it does not know > >> what the alignment should be, and it always fills in worst-case values > >> for that case. We should not use these worst-case values to "optimize" > >> the interface of a function. > >> > >> At minimum I think something like the following would be good. But > >> I'm unsure why we would *ever* want to lower the alignment at a function > >> interface. It seems to me that we'd simply want the caller to handle > >> copying the data to an aligned location? > >> > >> What was the use case of this code in the first place? > > > > PR 52402, and not surprisingly, that testcase fails on x86_64-linux > > with your patch. Furthermore, misalignment due to MEM_REF offset has > > to be accounted for whether we know the alignment of base or not. > > > > I was hoping that we could do something along the lines of > > build_ref_for_offset tree-sra.c but that would not work for cases like > > the one from PR 52890, comment 7 when there is no dereference in the > > caller, we tranform: > > > > D.2537 = &this->surfaces; > > sPtr.0 = ggTrain<mrSurface*>::_ZNK7ggTrainIP9mrSurfaceEixEi.isra.0 (D.2537, 1); > > > > into > > > > D.2537 = &this->surfaces; > > D.2554 = MEM[(const struct ggTrain *)D.2537]; > > sPtr.0 = ggTrain<mrSurface*>::_ZNK7ggTrainIP9mrSurfaceEixEi.isra.0 (D.2537, 1); > > > > At the moment I hope I'll be able to: > > > > 1) Bring back tree_non_aligned_mem_p from 4.6 to be used in > > access_precludes_ipa_sra_p. This way, any knowingly misaligned > > load in the callee should will not be transferred to the caller, if > > there is none. > > > > 2) In ipa_modify_call_arguments, be optimistic in like in your patch > > except for the case when we are looking at a dereference (i.e. we > > are turning a BLK dereference into a smaller scalar type one). In > > that case, use its alignment like in build_ref_for_offset. > > > > But I'd like to think about this a bit more. I believe we should ask > > for Richi's approval when he comes back and recovers anyway. But this > > is now my highest priority (finally). > > The issue here is that IPA SRA, when seeing > > foo (T *addr) > { > *addr; > } > > foo (&mem); > > wanting to transform it to > > foo' (T value) > { > value; > } > > value = *(T *)mem; > foo (value); > > has to use the _same_ alignment for the access to mem as it was used > inside foo. That's because of the fact that alignment information in GIMPLE > is solely present at memory accesses and _not_ in any way associated > with pointer types (as opposed to more strict rules imposed by some languages > such as C, often enough violated by users, *(char *)(int *)p is an access > aligned to at least 4 bytes in C). > > IPA SRA can use bigger alignment if it knows that is safe (thus > get_pointer_alignment returns something larger than what the actual > access in foo was using). What IPA SRA at the moment cannot do > is "propagate" larger alignment used in foo to the call site (AFAIK). > Thus, technically IPA SRA can use MAX (get_pointer_alignment (call site), > get_object_alignment (original dereference site)). > > For fixing pessimizations caused by IPA SRA I suggest to simply punt > (not do the transform) if get_pointer_alignment_1 returns false for the > actual call argument. Or implement the propagation. the patch below punts if get_object_alignment on any dereference in the callee returns something smaller than the alignment of the type of the would-be replacement (and the type of the load we would introduce in the caller). This is essentially an implicit propagation of alignment from callees to callers. I have added a new testcase that checks this which, when not handled properly, fails on sparc64 and produces wrong code on arm (which I however only checked by looking at cross-compiler assembly output). Nevertheless, there is another case that must be taken care of (you already added a x86_64 testcase: gcc.dg/torture/pr52402.c) when there already is a dereference in the caller but it is a BLK-mode one and IPA-SRA changes it to a scalar load - this happens when an aggregate by-value parameter is turned into a scalar one. In that case, we should look at the dereference in the caller and derive alignment from there. I do not do MAX, I assume that when get_pointer_alignment returns true, it can be trusted. If it returns some smaller alignment than the alignment of the type (propagated from the callee), then I suppose there is an alignment attribute missing in the callee or some such mistake in the original program. So far I have successfully bootstrapped and tested this on x86_64-linux and have run parts of the testsuite on sparc64-linux and hppa-linux. Full bootstrap on sparc64-linux and ppc64-linux are underway. I'm going to add a testcase for PR 55448 that will scan x86_64 assembly output. What do you think? Thanks, Martin 2012-11-27 Martin Jambor <mjambor@suse.cz> PR middle-end/52890 PR tree-optimization/55415 PR tree-optimization/54386 PR target/55448 * ipa-prop.c (ipa_modify_call_arguments): Be optimistic when get_pointer_alignment_1 returns false and the base was not a dereference. * tree-sra.c (access_precludes_ipa_sra_p): New parameter req_align, added check for required alignment. Update the user. * testsuite/gcc.dg/ipa/ipa-sra-7.c: New test. *** /tmp/5XB8rd_ipa-prop.c Tue Nov 27 21:34:54 2012 --- gcc/ipa-prop.c Tue Nov 27 17:04:20 2012 *************** ipa_modify_call_arguments (struct cgraph *** 2888,2893 **** --- 2888,2894 ---- { tree expr, base, off; location_t loc; + tree prev_base = NULL_TREE; /* We create a new parameter out of the value of the old one, we can do the following kind of transformations: *************** ipa_modify_call_arguments (struct cgraph *** 2919,2925 **** else { HOST_WIDE_INT base_offset; - tree prev_base; if (TREE_CODE (base) == ADDR_EXPR) base = TREE_OPERAND (base, 0); --- 2920,2925 ---- *************** ipa_modify_call_arguments (struct cgraph *** 2956,2962 **** unsigned int align; unsigned HOST_WIDE_INT misalign; ! get_pointer_alignment_1 (base, &align, &misalign); misalign += (tree_to_double_int (off) .sext (TYPE_PRECISION (TREE_TYPE (off))).low * BITS_PER_UNIT); --- 2956,2973 ---- unsigned int align; unsigned HOST_WIDE_INT misalign; ! if (!get_pointer_alignment_1 (base, &align, &misalign)) ! { ! if (prev_base ! && (TREE_CODE (prev_base) == MEM_REF ! || TREE_CODE (prev_base) == TARGET_MEM_REF)) ! { ! gcc_assert (misalign == 0); ! align = TYPE_ALIGN (TREE_TYPE (prev_base)); ! } ! else ! align = TYPE_ALIGN (type); ! } misalign += (tree_to_double_int (off) .sext (TYPE_PRECISION (TREE_TYPE (off))).low * BITS_PER_UNIT); *** /dev/null Thu Oct 25 13:49:30 2012 --- gcc/testsuite/gcc.dg/ipa/ipa-sra-7.c Tue Nov 27 18:48:45 2012 *************** *** 0 **** --- 1,42 ---- + /* { dg-do run } */ + /* { dg-options "-O3" } */ + + typedef unsigned int myint __attribute__((aligned(1))); + + typedef struct __attribute__((packed)) S { + unsigned a, b, c; + } SS; + + typedef SS __attribute__((aligned(1))) SSS; + + + static unsigned int __attribute__ ((noinline)) + get_a (SSS *p) + { + return p->a; + }; + + static int __attribute__ ((noinline, noclone)) + foo (SS *p) + { + int r = (int) get_a(p) + 2; + return r; + } + + char buf[512]; + + static SSS * __attribute__ ((noinline, noclone)) + get_sss (void) + { + return (SSS *)(buf + 1); + } + + + int + main(int argc, char *argv[]) + { + SSS *p = get_sss(); + if (foo(p) != 2) + __builtin_abort (); + return 0; + } *** /tmp/Hp0Wyc_tree-sra.c Tue Nov 27 21:34:54 2012 --- gcc/tree-sra.c Tue Nov 27 21:28:53 2012 *************** unmodified_by_ref_scalar_representative *** 3891,3902 **** return repr; } ! /* Return true iff this access precludes IPA-SRA of the parameter it is ! associated with. */ static bool ! access_precludes_ipa_sra_p (struct access *access) { /* Avoid issues such as the second simple testcase in PR 42025. The problem is incompatible assign in a call statement (and possibly even in asm statements). This can be relaxed by using a new temporary but only for --- 3891,3903 ---- return repr; } ! /* Return true iff this ACCESS precludes IPA-SRA of the parameter it is ! associated with. REQ_ALIGN is the minimum required alignment. */ static bool ! access_precludes_ipa_sra_p (struct access *access, unsigned int req_align) { + unsigned int exp_align; /* Avoid issues such as the second simple testcase in PR 42025. The problem is incompatible assign in a call statement (and possibly even in asm statements). This can be relaxed by using a new temporary but only for *************** access_precludes_ipa_sra_p (struct acces *** 3908,3913 **** --- 3909,3918 ---- || gimple_code (access->stmt) == GIMPLE_ASM)) return true; + exp_align = get_object_alignment (access->expr); + if (exp_align < req_align) + return true; + return false; } *************** splice_param_accesses (tree parm, bool * *** 3943,3949 **** tree a1_alias_type; access = (*access_vec)[i]; modification = access->write; ! if (access_precludes_ipa_sra_p (access)) return NULL; a1_alias_type = reference_alias_ptr_type (access->expr); --- 3948,3954 ---- tree a1_alias_type; access = (*access_vec)[i]; modification = access->write; ! if (access_precludes_ipa_sra_p (access, TYPE_ALIGN (access->type))) return NULL; a1_alias_type = reference_alias_ptr_type (access->expr); *************** splice_param_accesses (tree parm, bool * *** 3966,3972 **** else if (ac2->size != access->size) return NULL; ! if (access_precludes_ipa_sra_p (ac2) || (ac2->type != access->type && (TREE_ADDRESSABLE (ac2->type) || TREE_ADDRESSABLE (access->type))) --- 3971,3977 ---- else if (ac2->size != access->size) return NULL; ! if (access_precludes_ipa_sra_p (ac2, TYPE_ALIGN (access->type)) || (ac2->type != access->type && (TREE_ADDRESSABLE (ac2->type) || TREE_ADDRESSABLE (access->type)))
On Tue, Nov 27, 2012 at 9:37 PM, Martin Jambor <mjambor@suse.cz> wrote: > Hi, > > On Tue, Nov 27, 2012 at 02:02:42PM +0100, Richard Biener wrote: >> On Wed, Nov 21, 2012 at 5:58 PM, Martin Jambor <mjambor@suse.cz> wrote: >> > Hi, >> > >> > On Tue, Nov 20, 2012 at 09:24:20AM -0800, Richard Henderson wrote: >> >> The get_pointer_alignment function can indicate that it does not know >> >> what the alignment should be, and it always fills in worst-case values >> >> for that case. We should not use these worst-case values to "optimize" >> >> the interface of a function. >> >> >> >> At minimum I think something like the following would be good. But >> >> I'm unsure why we would *ever* want to lower the alignment at a function >> >> interface. It seems to me that we'd simply want the caller to handle >> >> copying the data to an aligned location? >> >> >> >> What was the use case of this code in the first place? >> > >> > PR 52402, and not surprisingly, that testcase fails on x86_64-linux >> > with your patch. Furthermore, misalignment due to MEM_REF offset has >> > to be accounted for whether we know the alignment of base or not. >> > >> > I was hoping that we could do something along the lines of >> > build_ref_for_offset tree-sra.c but that would not work for cases like >> > the one from PR 52890, comment 7 when there is no dereference in the >> > caller, we tranform: >> > >> > D.2537 = &this->surfaces; >> > sPtr.0 = ggTrain<mrSurface*>::_ZNK7ggTrainIP9mrSurfaceEixEi.isra.0 (D.2537, 1); >> > >> > into >> > >> > D.2537 = &this->surfaces; >> > D.2554 = MEM[(const struct ggTrain *)D.2537]; >> > sPtr.0 = ggTrain<mrSurface*>::_ZNK7ggTrainIP9mrSurfaceEixEi.isra.0 (D.2537, 1); >> > >> > At the moment I hope I'll be able to: >> > >> > 1) Bring back tree_non_aligned_mem_p from 4.6 to be used in >> > access_precludes_ipa_sra_p. This way, any knowingly misaligned >> > load in the callee should will not be transferred to the caller, if >> > there is none. >> > >> > 2) In ipa_modify_call_arguments, be optimistic in like in your patch >> > except for the case when we are looking at a dereference (i.e. we >> > are turning a BLK dereference into a smaller scalar type one). In >> > that case, use its alignment like in build_ref_for_offset. >> > >> > But I'd like to think about this a bit more. I believe we should ask >> > for Richi's approval when he comes back and recovers anyway. But this >> > is now my highest priority (finally). >> >> The issue here is that IPA SRA, when seeing >> >> foo (T *addr) >> { >> *addr; >> } >> >> foo (&mem); >> >> wanting to transform it to >> >> foo' (T value) >> { >> value; >> } >> >> value = *(T *)mem; >> foo (value); >> >> has to use the _same_ alignment for the access to mem as it was used >> inside foo. That's because of the fact that alignment information in GIMPLE >> is solely present at memory accesses and _not_ in any way associated >> with pointer types (as opposed to more strict rules imposed by some languages >> such as C, often enough violated by users, *(char *)(int *)p is an access >> aligned to at least 4 bytes in C). >> >> IPA SRA can use bigger alignment if it knows that is safe (thus >> get_pointer_alignment returns something larger than what the actual >> access in foo was using). What IPA SRA at the moment cannot do >> is "propagate" larger alignment used in foo to the call site (AFAIK). >> Thus, technically IPA SRA can use MAX (get_pointer_alignment (call site), >> get_object_alignment (original dereference site)). >> >> For fixing pessimizations caused by IPA SRA I suggest to simply punt >> (not do the transform) if get_pointer_alignment_1 returns false for the >> actual call argument. Or implement the propagation. > > the patch below punts if get_object_alignment on any dereference in > the callee returns something smaller than the alignment of the type of > the would-be replacement (and the type of the load we would introduce > in the caller). This is essentially an implicit propagation of > alignment from callees to callers. I have added a new testcase that > checks this which, when not handled properly, fails on sparc64 and > produces wrong code on arm (which I however only checked by looking at > cross-compiler assembly output). > > Nevertheless, there is another case that must be taken care of (you > already added a x86_64 testcase: gcc.dg/torture/pr52402.c) when there > already is a dereference in the caller but it is a BLK-mode one and > IPA-SRA changes it to a scalar load - this happens when an aggregate > by-value parameter is turned into a scalar one. In that case, we > should look at the dereference in the caller and derive alignment from > there. Well, you can derive alignment from all memory accesses that are always executed on the path the call stmt is on and take the maximum (of course paying attention to properly translate what you get to the alignment of the base object at offset zero). Such analysis of course doesn't fit the flow-insensitiveness of SRA very well ... > I do not do MAX, I assume that when get_pointer_alignment returns > true, it can be trusted. If it returns some smaller alignment than > the alignment of the type (propagated from the callee), then I suppose > there is an alignment attribute missing in the callee or some such > mistake in the original program. Well, it can return true if you for example have ptr = ptr & ~3; ... *ptr but then some other access may tell you that ptr was even more aligned. But yes, the way you do it should be conservatively correct at least. > So far I have successfully bootstrapped and tested this on > x86_64-linux and have run parts of the testsuite on sparc64-linux and > hppa-linux. Full bootstrap on sparc64-linux and ppc64-linux are > underway. > > I'm going to add a testcase for PR 55448 that will scan x86_64 > assembly output. > > What do you think? See below - there is some confusion on my side regarding what memory access alignment you query and why you have the two prev_base cases (you constrain on TYPE_ALIGN, so you should always use TYPE_ALIGN?) Richard. > Thanks, > > Martin > > > 2012-11-27 Martin Jambor <mjambor@suse.cz> > > PR middle-end/52890 > PR tree-optimization/55415 > PR tree-optimization/54386 > PR target/55448 > * ipa-prop.c (ipa_modify_call_arguments): Be optimistic when > get_pointer_alignment_1 returns false and the base was not a > dereference. > * tree-sra.c (access_precludes_ipa_sra_p): New parameter req_align, > added check for required alignment. Update the user. > > * testsuite/gcc.dg/ipa/ipa-sra-7.c: New test. > > *** /tmp/5XB8rd_ipa-prop.c Tue Nov 27 21:34:54 2012 > --- gcc/ipa-prop.c Tue Nov 27 17:04:20 2012 > *************** ipa_modify_call_arguments (struct cgraph > *** 2888,2893 **** > --- 2888,2894 ---- > { > tree expr, base, off; > location_t loc; > + tree prev_base = NULL_TREE; > > /* We create a new parameter out of the value of the old one, we can > do the following kind of transformations: > *************** ipa_modify_call_arguments (struct cgraph > *** 2919,2925 **** > else > { > HOST_WIDE_INT base_offset; > - tree prev_base; > > if (TREE_CODE (base) == ADDR_EXPR) > base = TREE_OPERAND (base, 0); > --- 2920,2925 ---- > *************** ipa_modify_call_arguments (struct cgraph > *** 2956,2962 **** > unsigned int align; > unsigned HOST_WIDE_INT misalign; > > ! get_pointer_alignment_1 (base, &align, &misalign); > misalign += (tree_to_double_int (off) > .sext (TYPE_PRECISION (TREE_TYPE (off))).low > * BITS_PER_UNIT); > --- 2956,2973 ---- > unsigned int align; > unsigned HOST_WIDE_INT misalign; > > ! if (!get_pointer_alignment_1 (base, &align, &misalign)) > ! { > ! if (prev_base > ! && (TREE_CODE (prev_base) == MEM_REF > ! || TREE_CODE (prev_base) == TARGET_MEM_REF)) > ! { > ! gcc_assert (misalign == 0); > ! align = TYPE_ALIGN (TREE_TYPE (prev_base)); > ! } > ! else > ! align = TYPE_ALIGN (type); I'm quite sure that misalign is wrong after this. What memory access is 'base', what memory access is 'prev_base' (prev_base from the original memory reference in the callee that is going to be replaced? base the one that IPA SRA "constructed" based on the pointer passed at the call site?) > ! } > misalign += (tree_to_double_int (off) > .sext (TYPE_PRECISION (TREE_TYPE (off))).low > * BITS_PER_UNIT); > *** /dev/null Thu Oct 25 13:49:30 2012 > --- gcc/testsuite/gcc.dg/ipa/ipa-sra-7.c Tue Nov 27 18:48:45 2012 > *************** > *** 0 **** > --- 1,42 ---- > + /* { dg-do run } */ > + /* { dg-options "-O3" } */ > + > + typedef unsigned int myint __attribute__((aligned(1))); > + > + typedef struct __attribute__((packed)) S { > + unsigned a, b, c; > + } SS; > + > + typedef SS __attribute__((aligned(1))) SSS; > + > + > + static unsigned int __attribute__ ((noinline)) > + get_a (SSS *p) > + { > + return p->a; > + }; > + > + static int __attribute__ ((noinline, noclone)) > + foo (SS *p) > + { > + int r = (int) get_a(p) + 2; > + return r; > + } > + > + char buf[512]; > + > + static SSS * __attribute__ ((noinline, noclone)) > + get_sss (void) > + { > + return (SSS *)(buf + 1); > + } > + > + > + int > + main(int argc, char *argv[]) > + { > + SSS *p = get_sss(); > + if (foo(p) != 2) > + __builtin_abort (); > + return 0; > + } > *** /tmp/Hp0Wyc_tree-sra.c Tue Nov 27 21:34:54 2012 > --- gcc/tree-sra.c Tue Nov 27 21:28:53 2012 > *************** unmodified_by_ref_scalar_representative > *** 3891,3902 **** > return repr; > } > > ! /* Return true iff this access precludes IPA-SRA of the parameter it is > ! associated with. */ > > static bool > ! access_precludes_ipa_sra_p (struct access *access) > { > /* Avoid issues such as the second simple testcase in PR 42025. The problem > is incompatible assign in a call statement (and possibly even in asm > statements). This can be relaxed by using a new temporary but only for > --- 3891,3903 ---- > return repr; > } > > ! /* Return true iff this ACCESS precludes IPA-SRA of the parameter it is > ! associated with. REQ_ALIGN is the minimum required alignment. */ > > static bool > ! access_precludes_ipa_sra_p (struct access *access, unsigned int req_align) > { > + unsigned int exp_align; > /* Avoid issues such as the second simple testcase in PR 42025. The problem > is incompatible assign in a call statement (and possibly even in asm > statements). This can be relaxed by using a new temporary but only for > *************** access_precludes_ipa_sra_p (struct acces > *** 3908,3913 **** > --- 3909,3918 ---- > || gimple_code (access->stmt) == GIMPLE_ASM)) > return true; > > + exp_align = get_object_alignment (access->expr); Is access->expr from the callee or the caller? > + if (exp_align < req_align) > + return true; > + > return false; > } > > *************** splice_param_accesses (tree parm, bool * > *** 3943,3949 **** > tree a1_alias_type; > access = (*access_vec)[i]; > modification = access->write; > ! if (access_precludes_ipa_sra_p (access)) > return NULL; > a1_alias_type = reference_alias_ptr_type (access->expr); > > --- 3948,3954 ---- > tree a1_alias_type; > access = (*access_vec)[i]; > modification = access->write; > ! if (access_precludes_ipa_sra_p (access, TYPE_ALIGN (access->type))) access->type is not the type of the base object, right? Which means it is not correct here - the alignment of the access is that what get_object_alignment returns, which looks at the base as to whether to lower the alignment compared to TYPE_ALIGN of the access type itself. This is from inside the callee, right? So what you want to check here is that if you end up optimistically using the alignment of the access type that this is at most equal but not larger than the alignment of the access. > return NULL; > a1_alias_type = reference_alias_ptr_type (access->expr); > > *************** splice_param_accesses (tree parm, bool * > *** 3966,3972 **** > else if (ac2->size != access->size) > return NULL; > > ! if (access_precludes_ipa_sra_p (ac2) > || (ac2->type != access->type > && (TREE_ADDRESSABLE (ac2->type) > || TREE_ADDRESSABLE (access->type))) > --- 3971,3977 ---- > else if (ac2->size != access->size) > return NULL; > > ! if (access_precludes_ipa_sra_p (ac2, TYPE_ALIGN (access->type)) > || (ac2->type != access->type > && (TREE_ADDRESSABLE (ac2->type) > || TREE_ADDRESSABLE (access->type)))
Hi, thanks for the review. When writing a reply I realized I indeed made a mistake or two in the part concerning prev_base and the code was not what it intended to be. I'll re-write it today. Nevertheless, I also have a question regarding a different place of the patch: On Wed, Nov 28, 2012 at 02:11:51PM +0100, Richard Biener wrote: > On Tue, Nov 27, 2012 at 9:37 PM, Martin Jambor <mjambor@suse.cz> wrote: > > *** /tmp/Hp0Wyc_tree-sra.c Tue Nov 27 21:34:54 2012 > > --- gcc/tree-sra.c Tue Nov 27 21:28:53 2012 > > *************** unmodified_by_ref_scalar_representative > > *** 3891,3902 **** > > return repr; > > } > > > > ! /* Return true iff this access precludes IPA-SRA of the parameter it is > > ! associated with. */ > > > > static bool > > ! access_precludes_ipa_sra_p (struct access *access) > > { > > /* Avoid issues such as the second simple testcase in PR 42025. The problem > > is incompatible assign in a call statement (and possibly even in asm > > statements). This can be relaxed by using a new temporary but only for > > --- 3891,3903 ---- > > return repr; > > } > > > > ! /* Return true iff this ACCESS precludes IPA-SRA of the parameter it is > > ! associated with. REQ_ALIGN is the minimum required alignment. */ > > > > static bool > > ! access_precludes_ipa_sra_p (struct access *access, unsigned int req_align) > > { > > + unsigned int exp_align; > > /* Avoid issues such as the second simple testcase in PR 42025. The problem > > is incompatible assign in a call statement (and possibly even in asm > > statements). This can be relaxed by using a new temporary but only for > > *************** access_precludes_ipa_sra_p (struct acces > > *** 3908,3913 **** > > --- 3909,3918 ---- > > || gimple_code (access->stmt) == GIMPLE_ASM)) > > return true; > > > > + exp_align = get_object_alignment (access->expr); > > Is access->expr from the callee or the caller? The callee. > > > + if (exp_align < req_align) > > + return true; > > + > > return false; > > } > > > > *************** splice_param_accesses (tree parm, bool * > > *** 3943,3949 **** > > tree a1_alias_type; > > access = (*access_vec)[i]; > > modification = access->write; > > ! if (access_precludes_ipa_sra_p (access)) > > return NULL; > > a1_alias_type = reference_alias_ptr_type (access->expr); > > > > --- 3948,3954 ---- > > tree a1_alias_type; > > access = (*access_vec)[i]; > > modification = access->write; > > ! if (access_precludes_ipa_sra_p (access, TYPE_ALIGN (access->type))) > > access->type is not the type of the base object, right? No, it is the actual type of the scalar memory access. > Which means it is not correct here - the alignment of the access is that what > get_object_alignment returns, which looks at the base as to whether to > lower the alignment compared to TYPE_ALIGN of the access type itself. Oh, so I have to do something like (or even reuse) may_be_unaligned_p from tree-ssa-loop-ivopts.c? (If so, I'm wondering whether a few other uses of get_object_alignment get this right...) Thanks, Martin
On Thu, Nov 29, 2012 at 11:06 AM, Martin Jambor <mjambor@suse.cz> wrote: > Hi, > > thanks for the review. When writing a reply I realized I indeed made > a mistake or two in the part concerning prev_base and the code was not > what it intended to be. I'll re-write it today. > > Nevertheless, I also have a question regarding a different place of > the patch: > > On Wed, Nov 28, 2012 at 02:11:51PM +0100, Richard Biener wrote: >> On Tue, Nov 27, 2012 at 9:37 PM, Martin Jambor <mjambor@suse.cz> wrote: >> > *** /tmp/Hp0Wyc_tree-sra.c Tue Nov 27 21:34:54 2012 >> > --- gcc/tree-sra.c Tue Nov 27 21:28:53 2012 >> > *************** unmodified_by_ref_scalar_representative >> > *** 3891,3902 **** >> > return repr; >> > } >> > >> > ! /* Return true iff this access precludes IPA-SRA of the parameter it is >> > ! associated with. */ >> > >> > static bool >> > ! access_precludes_ipa_sra_p (struct access *access) >> > { >> > /* Avoid issues such as the second simple testcase in PR 42025. The problem >> > is incompatible assign in a call statement (and possibly even in asm >> > statements). This can be relaxed by using a new temporary but only for >> > --- 3891,3903 ---- >> > return repr; >> > } >> > >> > ! /* Return true iff this ACCESS precludes IPA-SRA of the parameter it is >> > ! associated with. REQ_ALIGN is the minimum required alignment. */ >> > >> > static bool >> > ! access_precludes_ipa_sra_p (struct access *access, unsigned int req_align) >> > { >> > + unsigned int exp_align; >> > /* Avoid issues such as the second simple testcase in PR 42025. The problem >> > is incompatible assign in a call statement (and possibly even in asm >> > statements). This can be relaxed by using a new temporary but only for >> > *************** access_precludes_ipa_sra_p (struct acces >> > *** 3908,3913 **** >> > --- 3909,3918 ---- >> > || gimple_code (access->stmt) == GIMPLE_ASM)) >> > return true; >> > >> > + exp_align = get_object_alignment (access->expr); >> >> Is access->expr from the callee or the caller? > > The callee. > >> >> > + if (exp_align < req_align) >> > + return true; >> > + >> > return false; >> > } >> > >> > *************** splice_param_accesses (tree parm, bool * >> > *** 3943,3949 **** >> > tree a1_alias_type; >> > access = (*access_vec)[i]; >> > modification = access->write; >> > ! if (access_precludes_ipa_sra_p (access)) >> > return NULL; >> > a1_alias_type = reference_alias_ptr_type (access->expr); >> > >> > --- 3948,3954 ---- >> > tree a1_alias_type; >> > access = (*access_vec)[i]; >> > modification = access->write; >> > ! if (access_precludes_ipa_sra_p (access, TYPE_ALIGN (access->type))) >> >> access->type is not the type of the base object, right? > > No, it is the actual type of the scalar memory access. > >> Which means it is not correct here - the alignment of the access is that what >> get_object_alignment returns, which looks at the base as to whether to >> lower the alignment compared to TYPE_ALIGN of the access type itself. > > Oh, so I have to do something like (or even reuse) may_be_unaligned_p > from tree-ssa-loop-ivopts.c? (If so, I'm wondering whether a few other > uses of get_object_alignment get this right...) Well, I think you should use get_object_alignment here, too, and punt if that is less than TYPE_ALIGN (access->type) if you will end up using that. get_object_alignment (ref) is the authorative answer to alignment questions. Whether TYPE_ALIGN (TREE_TYPE (ref)) can be conservatively trusted depends on whether get_object_alignment (ref) returns the same or bigger alignment. But if the type is all you can propagate in this case (and not an explicit alignment value) you have to punt if the information from the type isn't conservative. Richard. > Thanks, > > Martin
Hi, On Thu, Nov 29, 2012 at 12:54:20PM +0100, Richard Biener wrote: > On Thu, Nov 29, 2012 at 11:06 AM, Martin Jambor <mjambor@suse.cz> wrote: > > Hi, > > > > thanks for the review. When writing a reply I realized I indeed made > > a mistake or two in the part concerning prev_base and the code was not > > what it intended to be. I'll re-write it today. > > > > Nevertheless, I also have a question regarding a different place of > > the patch: > > > > On Wed, Nov 28, 2012 at 02:11:51PM +0100, Richard Biener wrote: > >> On Tue, Nov 27, 2012 at 9:37 PM, Martin Jambor <mjambor@suse.cz> wrote: > >> > *** /tmp/Hp0Wyc_tree-sra.c Tue Nov 27 21:34:54 2012 > >> > --- gcc/tree-sra.c Tue Nov 27 21:28:53 2012 > >> > *************** unmodified_by_ref_scalar_representative > >> > *** 3891,3902 **** > >> > return repr; > >> > } > >> > > >> > ! /* Return true iff this access precludes IPA-SRA of the parameter it is > >> > ! associated with. */ > >> > > >> > static bool > >> > ! access_precludes_ipa_sra_p (struct access *access) > >> > { > >> > /* Avoid issues such as the second simple testcase in PR 42025. The problem > >> > is incompatible assign in a call statement (and possibly even in asm > >> > statements). This can be relaxed by using a new temporary but only for > >> > --- 3891,3903 ---- > >> > return repr; > >> > } > >> > > >> > ! /* Return true iff this ACCESS precludes IPA-SRA of the parameter it is > >> > ! associated with. REQ_ALIGN is the minimum required alignment. */ > >> > > >> > static bool > >> > ! access_precludes_ipa_sra_p (struct access *access, unsigned int req_align) > >> > { > >> > + unsigned int exp_align; > >> > /* Avoid issues such as the second simple testcase in PR 42025. The problem > >> > is incompatible assign in a call statement (and possibly even in asm > >> > statements). This can be relaxed by using a new temporary but only for > >> > *************** access_precludes_ipa_sra_p (struct acces > >> > *** 3908,3913 **** > >> > --- 3909,3918 ---- > >> > || gimple_code (access->stmt) == GIMPLE_ASM)) > >> > return true; > >> > > >> > + exp_align = get_object_alignment (access->expr); > >> > >> Is access->expr from the callee or the caller? > > > > The callee. > > > >> > >> > + if (exp_align < req_align) > >> > + return true; > >> > + > >> > return false; > >> > } > >> > > >> > *************** splice_param_accesses (tree parm, bool * > >> > *** 3943,3949 **** > >> > tree a1_alias_type; > >> > access = (*access_vec)[i]; > >> > modification = access->write; > >> > ! if (access_precludes_ipa_sra_p (access)) > >> > return NULL; > >> > a1_alias_type = reference_alias_ptr_type (access->expr); > >> > > >> > --- 3948,3954 ---- > >> > tree a1_alias_type; > >> > access = (*access_vec)[i]; > >> > modification = access->write; > >> > ! if (access_precludes_ipa_sra_p (access, TYPE_ALIGN (access->type))) > >> > >> access->type is not the type of the base object, right? > > > > No, it is the actual type of the scalar memory access. > > > >> Which means it is not correct here - the alignment of the access is that what > >> get_object_alignment returns, which looks at the base as to whether to > >> lower the alignment compared to TYPE_ALIGN of the access type itself. > > > > Oh, so I have to do something like (or even reuse) may_be_unaligned_p > > from tree-ssa-loop-ivopts.c? (If so, I'm wondering whether a few other > > uses of get_object_alignment get this right...) > > Well, I think you should use get_object_alignment here, too, and punt if > that is less than TYPE_ALIGN (access->type) if you will end up using that. > > get_object_alignment (ref) is the authorative answer to alignment questions. > Whether TYPE_ALIGN (TREE_TYPE (ref)) can be conservatively trusted > depends on whether get_object_alignment (ref) returns the same or bigger > alignment. But if the type is all you can propagate in this case (and not > an explicit alignment value) you have to punt if the information from the > type isn't conservative. > This must be some sort of misunderstanding, that is exactly what the code is doing. access_precludes_ipa_sra_p called get_object_alignment on the reference and then returned false (i.e. prevented the transformation) if the result was smaller than the TYPE_ALIGN of the type that was going to be propagated to the callee. Thanks, Martin
On Thu, Nov 29, 2012 at 1:33 PM, Martin Jambor <mjambor@suse.cz> wrote: > Hi, > > On Thu, Nov 29, 2012 at 12:54:20PM +0100, Richard Biener wrote: >> On Thu, Nov 29, 2012 at 11:06 AM, Martin Jambor <mjambor@suse.cz> wrote: >> > Hi, >> > >> > thanks for the review. When writing a reply I realized I indeed made >> > a mistake or two in the part concerning prev_base and the code was not >> > what it intended to be. I'll re-write it today. >> > >> > Nevertheless, I also have a question regarding a different place of >> > the patch: >> > >> > On Wed, Nov 28, 2012 at 02:11:51PM +0100, Richard Biener wrote: >> >> On Tue, Nov 27, 2012 at 9:37 PM, Martin Jambor <mjambor@suse.cz> wrote: >> >> > *** /tmp/Hp0Wyc_tree-sra.c Tue Nov 27 21:34:54 2012 >> >> > --- gcc/tree-sra.c Tue Nov 27 21:28:53 2012 >> >> > *************** unmodified_by_ref_scalar_representative >> >> > *** 3891,3902 **** >> >> > return repr; >> >> > } >> >> > >> >> > ! /* Return true iff this access precludes IPA-SRA of the parameter it is >> >> > ! associated with. */ >> >> > >> >> > static bool >> >> > ! access_precludes_ipa_sra_p (struct access *access) >> >> > { >> >> > /* Avoid issues such as the second simple testcase in PR 42025. The problem >> >> > is incompatible assign in a call statement (and possibly even in asm >> >> > statements). This can be relaxed by using a new temporary but only for >> >> > --- 3891,3903 ---- >> >> > return repr; >> >> > } >> >> > >> >> > ! /* Return true iff this ACCESS precludes IPA-SRA of the parameter it is >> >> > ! associated with. REQ_ALIGN is the minimum required alignment. */ >> >> > >> >> > static bool >> >> > ! access_precludes_ipa_sra_p (struct access *access, unsigned int req_align) >> >> > { >> >> > + unsigned int exp_align; >> >> > /* Avoid issues such as the second simple testcase in PR 42025. The problem >> >> > is incompatible assign in a call statement (and possibly even in asm >> >> > statements). This can be relaxed by using a new temporary but only for >> >> > *************** access_precludes_ipa_sra_p (struct acces >> >> > *** 3908,3913 **** >> >> > --- 3909,3918 ---- >> >> > || gimple_code (access->stmt) == GIMPLE_ASM)) >> >> > return true; >> >> > >> >> > + exp_align = get_object_alignment (access->expr); >> >> >> >> Is access->expr from the callee or the caller? >> > >> > The callee. >> > >> >> >> >> > + if (exp_align < req_align) >> >> > + return true; >> >> > + >> >> > return false; >> >> > } >> >> > >> >> > *************** splice_param_accesses (tree parm, bool * >> >> > *** 3943,3949 **** >> >> > tree a1_alias_type; >> >> > access = (*access_vec)[i]; >> >> > modification = access->write; >> >> > ! if (access_precludes_ipa_sra_p (access)) >> >> > return NULL; >> >> > a1_alias_type = reference_alias_ptr_type (access->expr); >> >> > >> >> > --- 3948,3954 ---- >> >> > tree a1_alias_type; >> >> > access = (*access_vec)[i]; >> >> > modification = access->write; >> >> > ! if (access_precludes_ipa_sra_p (access, TYPE_ALIGN (access->type))) >> >> >> >> access->type is not the type of the base object, right? >> > >> > No, it is the actual type of the scalar memory access. >> > >> >> Which means it is not correct here - the alignment of the access is that what >> >> get_object_alignment returns, which looks at the base as to whether to >> >> lower the alignment compared to TYPE_ALIGN of the access type itself. >> > >> > Oh, so I have to do something like (or even reuse) may_be_unaligned_p >> > from tree-ssa-loop-ivopts.c? (If so, I'm wondering whether a few other >> > uses of get_object_alignment get this right...) >> >> Well, I think you should use get_object_alignment here, too, and punt if >> that is less than TYPE_ALIGN (access->type) if you will end up using that. >> >> get_object_alignment (ref) is the authorative answer to alignment questions. >> Whether TYPE_ALIGN (TREE_TYPE (ref)) can be conservatively trusted >> depends on whether get_object_alignment (ref) returns the same or bigger >> alignment. But if the type is all you can propagate in this case (and not >> an explicit alignment value) you have to punt if the information from the >> type isn't conservative. >> > > This must be some sort of misunderstanding, that is exactly what the > code is doing. access_precludes_ipa_sra_p called get_object_alignment > on the reference and then returned false (i.e. prevented the > transformation) if the result was smaller than the TYPE_ALIGN of the > type that was going to be propagated to the callee. Hm, ok, then this was a confusion on my part. Richard. > Thanks, > > Martin >
On Thu, Nov 29, 2012 at 11:06:41AM +0100, Martin Jambor wrote: > Hi, > > thanks for the review. When writing a reply I realized I indeed made > a mistake or two in the part concerning prev_base and the code was not > what it intended to be. I'll re-write it today. OK, this is it. The part in tree-sra.c, in which I make IPA-SRA punt when the alignment of the loads in the callee are not as big as alignment of the required types of the new arguments, remained the same. In ipa-prop.h, I now use maximum of the alignment from get_pointer_alignment_1 and the type alignment propagated from the callee. I now also recognize cases where the dereference is buried below a COMPONENT_REF or ARRAY_REF (this is checked by new testcase ipa-sra-9.c which fails at least on on sparc64 when things go wrong). The patch has passed bootstrap and testing on x86_64-linux, ppc64-linux and sparc64-linux. Thanks, Martin 2012-11-29 Martin Jambor <mjambor@suse.cz> PR middle-end/52890 PR tree-optimization/55415 PR tree-optimization/54386 PR target/55448 * ipa-prop.c (ipa_modify_call_arguments): Be optimistic when get_pointer_alignment_1 returns false and the base was not a dereference. * tree-sra.c (access_precludes_ipa_sra_p): New parameter req_align, added check for required alignment. Update the user. * testsuite/gcc.dg/ipa/ipa-sra-7.c: New test. * testsuite/gcc.dg/ipa/ipa-sra-8.c: Likewise. * testsuite/gcc.dg/ipa/ipa-sra-9.c: Likewise. * testsuite/gcc.target/i386/pr55448.c: Likewise. *** /tmp/cY007b_ipa-prop.c Fri Nov 30 12:26:26 2012 --- gcc/ipa-prop.c Thu Nov 29 16:05:04 2012 *************** ipa_modify_call_arguments (struct cgraph *** 2888,2893 **** --- 2888,2895 ---- { tree expr, base, off; location_t loc; + unsigned int deref_align; + bool deref_base = false; /* We create a new parameter out of the value of the old one, we can do the following kind of transformations: *************** ipa_modify_call_arguments (struct cgraph *** 2920,2928 **** { HOST_WIDE_INT base_offset; tree prev_base; if (TREE_CODE (base) == ADDR_EXPR) ! base = TREE_OPERAND (base, 0); prev_base = base; base = get_addr_base_and_unit_offset (base, &base_offset); /* Aggregate arguments can have non-invariant addresses. */ --- 2922,2936 ---- { HOST_WIDE_INT base_offset; tree prev_base; + bool addrof; if (TREE_CODE (base) == ADDR_EXPR) ! { ! base = TREE_OPERAND (base, 0); ! addrof = true; ! } ! else ! addrof = false; prev_base = base; base = get_addr_base_and_unit_offset (base, &base_offset); /* Aggregate arguments can have non-invariant addresses. */ *************** ipa_modify_call_arguments (struct cgraph *** 2934,2939 **** --- 2942,2952 ---- } else if (TREE_CODE (base) == MEM_REF) { + if (!addrof) + { + deref_base = true; + deref_align = TYPE_ALIGN (TREE_TYPE (base)); + } off = build_int_cst (adj->alias_ptr_type, base_offset + adj->offset / BITS_PER_UNIT); *************** ipa_modify_call_arguments (struct cgraph *** 2956,2962 **** unsigned int align; unsigned HOST_WIDE_INT misalign; ! get_pointer_alignment_1 (base, &align, &misalign); misalign += (tree_to_double_int (off) .sext (TYPE_PRECISION (TREE_TYPE (off))).low * BITS_PER_UNIT); --- 2969,2985 ---- unsigned int align; unsigned HOST_WIDE_INT misalign; ! if (deref_base) ! { ! align = deref_align; ! misalign = 0; ! } ! else ! { ! get_pointer_alignment_1 (base, &align, &misalign); ! if (TYPE_ALIGN (type) > align) ! align = TYPE_ALIGN (type); ! } misalign += (tree_to_double_int (off) .sext (TYPE_PRECISION (TREE_TYPE (off))).low * BITS_PER_UNIT); *** /dev/null Thu Oct 25 13:49:30 2012 --- gcc/testsuite/gcc.dg/ipa/ipa-sra-7.c Wed Nov 28 14:19:30 2012 *************** *** 0 **** --- 1,42 ---- + /* { dg-do run } */ + /* { dg-options "-O3" } */ + + typedef unsigned int myint __attribute__((aligned(1))); + + typedef struct __attribute__((packed)) S { + unsigned a, b, c; + } SS; + + typedef SS __attribute__((aligned(1))) SSS; + + + static unsigned int __attribute__ ((noinline)) + get_a (SSS *p) + { + return p->a; + }; + + static int __attribute__ ((noinline, noclone)) + foo (SS *p) + { + int r = (int) get_a(p) + 2; + return r; + } + + char buf[512]; + + static SSS * __attribute__ ((noinline, noclone)) + get_sss (void) + { + return (SSS *)(buf + 1); + } + + + int + main(int argc, char *argv[]) + { + SSS *p = get_sss(); + if (foo(p) != 2) + __builtin_abort (); + return 0; + } *** /dev/null Thu Oct 25 13:49:30 2012 --- gcc/testsuite/gcc.dg/ipa/ipa-sra-8.c Fri Nov 30 00:52:03 2012 *************** *** 0 **** --- 1,41 ---- + /* { dg-do run } */ + /* { dg-options "-O3" } */ + + typedef unsigned int myint __attribute__((aligned(1))); + + typedef struct S { + unsigned a, b, c; + } SS; + + typedef SS __attribute__((aligned(1))) SSS; + + + static unsigned int __attribute__ ((noinline)) + get_a (SS s) + { + return s.a; + }; + + static int __attribute__ ((noinline, noclone)) + foo (SSS *p) + { + int r = (int) get_a(*p) + 2; + return r; + } + + char buf[512]; + + static SSS * __attribute__ ((noinline, noclone)) + get_sss (void) + { + return (SSS *)(buf + 1); + } + + int + main(int argc, char *argv[]) + { + SSS *p = get_sss(); + if (foo(p) != 2) + __builtin_abort (); + return 0; + } *** /dev/null Thu Oct 25 13:49:30 2012 --- gcc/testsuite/gcc.dg/ipa/ipa-sra-9.c Fri Nov 30 00:52:12 2012 *************** *** 0 **** --- 1,44 ---- + /* { dg-do run } */ + /* { dg-options "-O3" } */ + + typedef unsigned int myint __attribute__((aligned(1))); + + typedef struct S { + unsigned a, b, c; + } SS; + + typedef struct U { + SS s[2]; + } UU; + + typedef UU __attribute__((aligned(1))) UUU; + + static unsigned int __attribute__ ((noinline)) + get_a (SS s) + { + return s.a; + }; + + static int __attribute__ ((noinline, noclone)) + foo (UUU *p) + { + int r = (int) get_a(p->s[0]) + 2; + return r; + } + + char buf[512]; + + static UUU * __attribute__ ((noinline, noclone)) + get_uuu (void) + { + return (UUU *)(buf + 1); + } + + int + main(int argc, char *argv[]) + { + UUU *p = get_uuu(); + if (foo(p) != 2) + __builtin_abort (); + return 0; + } *** /dev/null Thu Oct 25 13:49:30 2012 --- gcc/testsuite/gcc.target/i386/pr55448.c Thu Nov 29 15:42:16 2012 *************** *** 0 **** --- 1,26 ---- + /* { dg-do compile } */ + /* { dg-options "-O2 -mavx" } */ + + #include <immintrin.h> + + static inline __m256 add1(const __m256 *a, const __m256 *b) + { + return _mm256_add_ps(*a, *b); + } + + void foo1(__m256 *a, const __m256 b) + { + *a = add1(a, &b); + } + + static inline __m128 add2(const __m128 *a, const __m128 *b) + { + return _mm_add_ps(*a, *b); + } + + void foo2(__m128 *a, const __m128 b) + { + *a = add2(a, &b); + } + + /* { dg-final { scan-assembler-not "vmovups" } } */ *** /tmp/yTaOsa_tree-sra.c Fri Nov 30 12:26:26 2012 --- gcc/tree-sra.c Wed Nov 28 14:19:30 2012 *************** unmodified_by_ref_scalar_representative *** 3891,3902 **** return repr; } ! /* Return true iff this access precludes IPA-SRA of the parameter it is ! associated with. */ static bool ! access_precludes_ipa_sra_p (struct access *access) { /* Avoid issues such as the second simple testcase in PR 42025. The problem is incompatible assign in a call statement (and possibly even in asm statements). This can be relaxed by using a new temporary but only for --- 3891,3903 ---- return repr; } ! /* Return true iff this ACCESS precludes IPA-SRA of the parameter it is ! associated with. REQ_ALIGN is the minimum required alignment. */ static bool ! access_precludes_ipa_sra_p (struct access *access, unsigned int req_align) { + unsigned int exp_align; /* Avoid issues such as the second simple testcase in PR 42025. The problem is incompatible assign in a call statement (and possibly even in asm statements). This can be relaxed by using a new temporary but only for *************** access_precludes_ipa_sra_p (struct acces *** 3908,3913 **** --- 3909,3918 ---- || gimple_code (access->stmt) == GIMPLE_ASM)) return true; + exp_align = get_object_alignment (access->expr); + if (exp_align < req_align) + return true; + return false; } *************** splice_param_accesses (tree parm, bool * *** 3943,3949 **** tree a1_alias_type; access = (*access_vec)[i]; modification = access->write; ! if (access_precludes_ipa_sra_p (access)) return NULL; a1_alias_type = reference_alias_ptr_type (access->expr); --- 3948,3954 ---- tree a1_alias_type; access = (*access_vec)[i]; modification = access->write; ! if (access_precludes_ipa_sra_p (access, TYPE_ALIGN (access->type))) return NULL; a1_alias_type = reference_alias_ptr_type (access->expr); *************** splice_param_accesses (tree parm, bool * *** 3966,3972 **** else if (ac2->size != access->size) return NULL; ! if (access_precludes_ipa_sra_p (ac2) || (ac2->type != access->type && (TREE_ADDRESSABLE (ac2->type) || TREE_ADDRESSABLE (access->type))) --- 3971,3977 ---- else if (ac2->size != access->size) return NULL; ! if (access_precludes_ipa_sra_p (ac2, TYPE_ALIGN (access->type)) || (ac2->type != access->type && (TREE_ADDRESSABLE (ac2->type) || TREE_ADDRESSABLE (access->type)))
On Fri, Nov 30, 2012 at 2:02 PM, Martin Jambor <mjambor@suse.cz> wrote: > On Thu, Nov 29, 2012 at 11:06:41AM +0100, Martin Jambor wrote: >> Hi, >> >> thanks for the review. When writing a reply I realized I indeed made >> a mistake or two in the part concerning prev_base and the code was not >> what it intended to be. I'll re-write it today. > > OK, this is it. The part in tree-sra.c, in which I make IPA-SRA punt > when the alignment of the loads in the callee are not as big as > alignment of the required types of the new arguments, remained the > same. > > In ipa-prop.h, I now use maximum of the alignment from > get_pointer_alignment_1 and the type alignment propagated from the > callee. I now also recognize cases where the dereference is buried > below a COMPONENT_REF or ARRAY_REF (this is checked by new testcase > ipa-sra-9.c which fails at least on on sparc64 when things go wrong). > > The patch has passed bootstrap and testing on x86_64-linux, > ppc64-linux and sparc64-linux. Works for me. Thanks, Richard. > Thanks, > > Martin > > > 2012-11-29 Martin Jambor <mjambor@suse.cz> > > PR middle-end/52890 > PR tree-optimization/55415 > PR tree-optimization/54386 > PR target/55448 > * ipa-prop.c (ipa_modify_call_arguments): Be optimistic when > get_pointer_alignment_1 returns false and the base was not a > dereference. > * tree-sra.c (access_precludes_ipa_sra_p): New parameter req_align, > added check for required alignment. Update the user. > > * testsuite/gcc.dg/ipa/ipa-sra-7.c: New test. > * testsuite/gcc.dg/ipa/ipa-sra-8.c: Likewise. > * testsuite/gcc.dg/ipa/ipa-sra-9.c: Likewise. > * testsuite/gcc.target/i386/pr55448.c: Likewise. > > > *** /tmp/cY007b_ipa-prop.c Fri Nov 30 12:26:26 2012 > --- gcc/ipa-prop.c Thu Nov 29 16:05:04 2012 > *************** ipa_modify_call_arguments (struct cgraph > *** 2888,2893 **** > --- 2888,2895 ---- > { > tree expr, base, off; > location_t loc; > + unsigned int deref_align; > + bool deref_base = false; > > /* We create a new parameter out of the value of the old one, we can > do the following kind of transformations: > *************** ipa_modify_call_arguments (struct cgraph > *** 2920,2928 **** > { > HOST_WIDE_INT base_offset; > tree prev_base; > > if (TREE_CODE (base) == ADDR_EXPR) > ! base = TREE_OPERAND (base, 0); > prev_base = base; > base = get_addr_base_and_unit_offset (base, &base_offset); > /* Aggregate arguments can have non-invariant addresses. */ > --- 2922,2936 ---- > { > HOST_WIDE_INT base_offset; > tree prev_base; > + bool addrof; > > if (TREE_CODE (base) == ADDR_EXPR) > ! { > ! base = TREE_OPERAND (base, 0); > ! addrof = true; > ! } > ! else > ! addrof = false; > prev_base = base; > base = get_addr_base_and_unit_offset (base, &base_offset); > /* Aggregate arguments can have non-invariant addresses. */ > *************** ipa_modify_call_arguments (struct cgraph > *** 2934,2939 **** > --- 2942,2952 ---- > } > else if (TREE_CODE (base) == MEM_REF) > { > + if (!addrof) > + { > + deref_base = true; > + deref_align = TYPE_ALIGN (TREE_TYPE (base)); > + } > off = build_int_cst (adj->alias_ptr_type, > base_offset > + adj->offset / BITS_PER_UNIT); > *************** ipa_modify_call_arguments (struct cgraph > *** 2956,2962 **** > unsigned int align; > unsigned HOST_WIDE_INT misalign; > > ! get_pointer_alignment_1 (base, &align, &misalign); > misalign += (tree_to_double_int (off) > .sext (TYPE_PRECISION (TREE_TYPE (off))).low > * BITS_PER_UNIT); > --- 2969,2985 ---- > unsigned int align; > unsigned HOST_WIDE_INT misalign; > > ! if (deref_base) > ! { > ! align = deref_align; > ! misalign = 0; > ! } > ! else > ! { > ! get_pointer_alignment_1 (base, &align, &misalign); > ! if (TYPE_ALIGN (type) > align) > ! align = TYPE_ALIGN (type); > ! } > misalign += (tree_to_double_int (off) > .sext (TYPE_PRECISION (TREE_TYPE (off))).low > * BITS_PER_UNIT); > *** /dev/null Thu Oct 25 13:49:30 2012 > --- gcc/testsuite/gcc.dg/ipa/ipa-sra-7.c Wed Nov 28 14:19:30 2012 > *************** > *** 0 **** > --- 1,42 ---- > + /* { dg-do run } */ > + /* { dg-options "-O3" } */ > + > + typedef unsigned int myint __attribute__((aligned(1))); > + > + typedef struct __attribute__((packed)) S { > + unsigned a, b, c; > + } SS; > + > + typedef SS __attribute__((aligned(1))) SSS; > + > + > + static unsigned int __attribute__ ((noinline)) > + get_a (SSS *p) > + { > + return p->a; > + }; > + > + static int __attribute__ ((noinline, noclone)) > + foo (SS *p) > + { > + int r = (int) get_a(p) + 2; > + return r; > + } > + > + char buf[512]; > + > + static SSS * __attribute__ ((noinline, noclone)) > + get_sss (void) > + { > + return (SSS *)(buf + 1); > + } > + > + > + int > + main(int argc, char *argv[]) > + { > + SSS *p = get_sss(); > + if (foo(p) != 2) > + __builtin_abort (); > + return 0; > + } > *** /dev/null Thu Oct 25 13:49:30 2012 > --- gcc/testsuite/gcc.dg/ipa/ipa-sra-8.c Fri Nov 30 00:52:03 2012 > *************** > *** 0 **** > --- 1,41 ---- > + /* { dg-do run } */ > + /* { dg-options "-O3" } */ > + > + typedef unsigned int myint __attribute__((aligned(1))); > + > + typedef struct S { > + unsigned a, b, c; > + } SS; > + > + typedef SS __attribute__((aligned(1))) SSS; > + > + > + static unsigned int __attribute__ ((noinline)) > + get_a (SS s) > + { > + return s.a; > + }; > + > + static int __attribute__ ((noinline, noclone)) > + foo (SSS *p) > + { > + int r = (int) get_a(*p) + 2; > + return r; > + } > + > + char buf[512]; > + > + static SSS * __attribute__ ((noinline, noclone)) > + get_sss (void) > + { > + return (SSS *)(buf + 1); > + } > + > + int > + main(int argc, char *argv[]) > + { > + SSS *p = get_sss(); > + if (foo(p) != 2) > + __builtin_abort (); > + return 0; > + } > *** /dev/null Thu Oct 25 13:49:30 2012 > --- gcc/testsuite/gcc.dg/ipa/ipa-sra-9.c Fri Nov 30 00:52:12 2012 > *************** > *** 0 **** > --- 1,44 ---- > + /* { dg-do run } */ > + /* { dg-options "-O3" } */ > + > + typedef unsigned int myint __attribute__((aligned(1))); > + > + typedef struct S { > + unsigned a, b, c; > + } SS; > + > + typedef struct U { > + SS s[2]; > + } UU; > + > + typedef UU __attribute__((aligned(1))) UUU; > + > + static unsigned int __attribute__ ((noinline)) > + get_a (SS s) > + { > + return s.a; > + }; > + > + static int __attribute__ ((noinline, noclone)) > + foo (UUU *p) > + { > + int r = (int) get_a(p->s[0]) + 2; > + return r; > + } > + > + char buf[512]; > + > + static UUU * __attribute__ ((noinline, noclone)) > + get_uuu (void) > + { > + return (UUU *)(buf + 1); > + } > + > + int > + main(int argc, char *argv[]) > + { > + UUU *p = get_uuu(); > + if (foo(p) != 2) > + __builtin_abort (); > + return 0; > + } > *** /dev/null Thu Oct 25 13:49:30 2012 > --- gcc/testsuite/gcc.target/i386/pr55448.c Thu Nov 29 15:42:16 2012 > *************** > *** 0 **** > --- 1,26 ---- > + /* { dg-do compile } */ > + /* { dg-options "-O2 -mavx" } */ > + > + #include <immintrin.h> > + > + static inline __m256 add1(const __m256 *a, const __m256 *b) > + { > + return _mm256_add_ps(*a, *b); > + } > + > + void foo1(__m256 *a, const __m256 b) > + { > + *a = add1(a, &b); > + } > + > + static inline __m128 add2(const __m128 *a, const __m128 *b) > + { > + return _mm_add_ps(*a, *b); > + } > + > + void foo2(__m128 *a, const __m128 b) > + { > + *a = add2(a, &b); > + } > + > + /* { dg-final { scan-assembler-not "vmovups" } } */ > *** /tmp/yTaOsa_tree-sra.c Fri Nov 30 12:26:26 2012 > --- gcc/tree-sra.c Wed Nov 28 14:19:30 2012 > *************** unmodified_by_ref_scalar_representative > *** 3891,3902 **** > return repr; > } > > ! /* Return true iff this access precludes IPA-SRA of the parameter it is > ! associated with. */ > > static bool > ! access_precludes_ipa_sra_p (struct access *access) > { > /* Avoid issues such as the second simple testcase in PR 42025. The problem > is incompatible assign in a call statement (and possibly even in asm > statements). This can be relaxed by using a new temporary but only for > --- 3891,3903 ---- > return repr; > } > > ! /* Return true iff this ACCESS precludes IPA-SRA of the parameter it is > ! associated with. REQ_ALIGN is the minimum required alignment. */ > > static bool > ! access_precludes_ipa_sra_p (struct access *access, unsigned int req_align) > { > + unsigned int exp_align; > /* Avoid issues such as the second simple testcase in PR 42025. The problem > is incompatible assign in a call statement (and possibly even in asm > statements). This can be relaxed by using a new temporary but only for > *************** access_precludes_ipa_sra_p (struct acces > *** 3908,3913 **** > --- 3909,3918 ---- > || gimple_code (access->stmt) == GIMPLE_ASM)) > return true; > > + exp_align = get_object_alignment (access->expr); > + if (exp_align < req_align) > + return true; > + > return false; > } > > *************** splice_param_accesses (tree parm, bool * > *** 3943,3949 **** > tree a1_alias_type; > access = (*access_vec)[i]; > modification = access->write; > ! if (access_precludes_ipa_sra_p (access)) > return NULL; > a1_alias_type = reference_alias_ptr_type (access->expr); > > --- 3948,3954 ---- > tree a1_alias_type; > access = (*access_vec)[i]; > modification = access->write; > ! if (access_precludes_ipa_sra_p (access, TYPE_ALIGN (access->type))) > return NULL; > a1_alias_type = reference_alias_ptr_type (access->expr); > > *************** splice_param_accesses (tree parm, bool * > *** 3966,3972 **** > else if (ac2->size != access->size) > return NULL; > > ! if (access_precludes_ipa_sra_p (ac2) > || (ac2->type != access->type > && (TREE_ADDRESSABLE (ac2->type) > || TREE_ADDRESSABLE (access->type))) > --- 3971,3977 ---- > else if (ac2->size != access->size) > return NULL; > > ! if (access_precludes_ipa_sra_p (ac2, TYPE_ALIGN (access->type)) > || (ac2->type != access->type > && (TREE_ADDRESSABLE (ac2->type) > || TREE_ADDRESSABLE (access->type)))
diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c index 3150bd6..d117389 100644 --- a/gcc/ipa-prop.c +++ b/gcc/ipa-prop.c @@ -2956,15 +2956,17 @@ ipa_modify_call_arguments (struct cgraph_edge *cs, gimple stmt, unsigned int align; unsigned HOST_WIDE_INT misalign; - get_pointer_alignment_1 (base, &align, &misalign); - misalign += (tree_to_double_int (off) - .sext (TYPE_PRECISION (TREE_TYPE (off))).low - * BITS_PER_UNIT); - misalign = misalign & (align - 1); - if (misalign != 0) - align = (misalign & -misalign); - if (align < TYPE_ALIGN (type)) - type = build_aligned_type (type, align); + if (get_pointer_alignment_1 (base, &align, &misalign)) + { + misalign += (tree_to_double_int (off) + .sext (TYPE_PRECISION (TREE_TYPE (off))).low + * BITS_PER_UNIT); + misalign = misalign & (align - 1); + if (misalign != 0) + align = (misalign & -misalign); + if (align < TYPE_ALIGN (type)) + type = build_aligned_type (type, align); + } expr = fold_build2_loc (loc, MEM_REF, type, base, off); } else