Message ID | 20100905021216.GJ16898@codesourcery.com |
---|---|
State | New |
Headers | show |
On Sun, Sep 5, 2010 at 4:12 AM, Nathan Froyd <froydnj@codesourcery.com> wrote: > The functions in the following testcase (PPC-specific, but > generalizable) do not compile to the same code: > > #define ASMSTR "stw%U0%X0 %1,%0" > > struct Foo > { > unsigned pad[10]; > unsigned cs0_bnds; > }; > > static inline > void Out (unsigned *addr) > { > __asm__ __volatile__(ASMSTR : "=m" (*addr) : "r" (0x1f)); > } > > void AsmFunction (void) > { > struct Foo *ddr = (struct Foo *)0xffe02000; > > Out (&ddr->cs0_bnds); > Out (&ddr->pad[0]); > } > > void AsmDirect (void) > { > struct Foo *ddr = (struct Foo *)0xffe02000; > > __asm__ __volatile__(ASMSTR : "=m" (ddr->cs0_bnds) : "r" (0x1f)); > __asm__ __volatile__(ASMSTR : "=m" (ddr->pad[0]) : "r" (0x1f)); > } > > The results are (cleaned up to remove #APP bits): > > AsmFunction: > lis 9,0xffe0 > li 0,31 > ori 9,9,8232 > stw 0,0(9) > lis 9,0xffe0 > ori 9,9,8192 > stw 0,0(9) > blr > > AsmDirect: > lis 9,0xffe0 > li 0,31 > ori 9,9,8192 > stw 0,40(9) > li 0,28 > stw 0,0(9) > blr > > Notice that in AsmDirect, we've CSE'd the address of `ddr', and we > haven't done so in AsmFunction. (We failed to move the lowpart of the > address into the stores, but AFAICT that is a problem with the PPC > backend and is not relevant here. Compiling the example above for MIPS > demonstrates the same problem without the distraction of bad code.) > > We used to do this CSE'ing in 4.3. When faced with this code: > > D.2017_2 = &4292878336B->cs0_bnds; > __asm__ __volatile__("stw%U0%X0 %1,%0" : "=m" *D.2017_2 : "r" 31); > > in ccp, we'd propagate D.2017_2 into its use and wind up with identical > trees between the two functions heading into expand. In 4.4 and 4.5, > however, we don't, so we've never given an opportunity to CSE the > address of `ddr'. > > This affects applications that are size-conscious; if you're doing > several calls to 'Out' in the same function, your code size can increase > quite a bit in 4.4 and 4.5. > > So far as I can tell, this is because is_gimple_invariant_address fails > to consider that we might have an INTEGER_CST base address. The patch > below fixes the testcase; with it applied, AsmFunction and AsmDirect > compile to the same code. What I'm unsure of is: > > - Whether I should also check for decl_address_invariant_p, perhaps by: > > if (TREE_CODE (op) == INDIRECT_REF) > op = TREE_OPERAND (op, 0); > > return CONSTANT_CLASS_P (op) || decl_address_invariant_p (op); > > - Whether the same treatment should be applied to > is_gimple_ip_invariant_address. > > My most recent powerpc compiler is somewhat old, but it is post-MEM_REF > merge and it also does not do the desired propagation. It looks to me > like the relevant case in handled in is_gimple_invariant_address, but > something is not working; the result &4292878336B->FOO is marked > VARYING. > > Anyway, the below patch is against 4.5 and I'm asking for approval to > 4.4 and 4.5 and possibly mainline (with modifications, obviously) once I > verify with an up-to-date compiler. Answers to the questions above > would be helpful. > > -Nathan > > * gimple.c (is_gimple_invariant_address): Check for a constant > operand of an INDIRECT_REF. > > Index: gcc/ChangeLog > =================================================================== > Index: gcc/gimple.c > =================================================================== > --- gcc/gimple.c (revision 163868) > +++ gcc/gimple.c (working copy) > @@ -2591,7 +2591,13 @@ is_gimple_invariant_address (const_tree > > op = strip_invariant_refs (TREE_OPERAND (t, 0)); > > - return op && (CONSTANT_CLASS_P (op) || decl_address_invariant_p (op)); > + if (!op) > + return false; > + > + if (TREE_CODE (op) == INDIRECT_REF) > + return CONSTANT_CLASS_P (TREE_OPERAND (op, 0)); > + > + return (CONSTANT_CLASS_P (op) || decl_address_invariant_p (op)); > } This doesn't make sense. If we are asking for &CST->x then we should have folded it to a plain CST. So - you have to look elsewhere for why that does not happen. (and of course such a change wouldn't be appropriate for the release branches, you basically change the GIMPLE IL) Richard. > /* Return true if T is a gimple invariant address at IPA level >
On Sun, Sep 05, 2010 at 10:36:16AM +0200, Richard Guenther wrote: > > Index: gcc/ChangeLog > > =================================================================== > > Index: gcc/gimple.c > > =================================================================== > > --- gcc/gimple.c (revision 163868) > > +++ gcc/gimple.c (working copy) > > @@ -2591,7 +2591,13 @@ is_gimple_invariant_address (const_tree > > > > op = strip_invariant_refs (TREE_OPERAND (t, 0)); > > > > - return op && (CONSTANT_CLASS_P (op) || decl_address_invariant_p (op)); > > + if (!op) > > + return false; > > + > > + if (TREE_CODE (op) == INDIRECT_REF) > > + return CONSTANT_CLASS_P (TREE_OPERAND (op, 0)); > > + > > + return (CONSTANT_CLASS_P (op) || decl_address_invariant_p (op)); > > } > > This doesn't make sense. If we are asking for &CST->x then we should have > folded it to a plain CST. I don't understand what you mean by this. Could you explain? > (and of course such a change wouldn't be appropriate for the release branches, > you basically change the GIMPLE IL) I'm happy to fix it however you like on mainline, but I'd also like to find a fix that works for release branches. -Nathan
On Sun, Sep 5, 2010 at 9:52 PM, Nathan Froyd <froydnj@codesourcery.com> wrote: > On Sun, Sep 05, 2010 at 10:36:16AM +0200, Richard Guenther wrote: >> > Index: gcc/ChangeLog >> > =================================================================== >> > Index: gcc/gimple.c >> > =================================================================== >> > --- gcc/gimple.c (revision 163868) >> > +++ gcc/gimple.c (working copy) >> > @@ -2591,7 +2591,13 @@ is_gimple_invariant_address (const_tree >> > >> > op = strip_invariant_refs (TREE_OPERAND (t, 0)); >> > >> > - return op && (CONSTANT_CLASS_P (op) || decl_address_invariant_p (op)); >> > + if (!op) >> > + return false; >> > + >> > + if (TREE_CODE (op) == INDIRECT_REF) >> > + return CONSTANT_CLASS_P (TREE_OPERAND (op, 0)); >> > + >> > + return (CONSTANT_CLASS_P (op) || decl_address_invariant_p (op)); >> > } >> >> This doesn't make sense. If we are asking for &CST->x then we should have >> folded it to a plain CST. > > I don't understand what you mean by this. Could you explain? &CST->x is a constant, we should fold it to it instead of keeping the address-of expression (if it wouldn't be a constant it wouldn't and shouldn't be is_gimple_min_invariant either). >> (and of course such a change wouldn't be appropriate for the release branches, >> you basically change the GIMPLE IL) > > I'm happy to fix it however you like on mainline, but I'd also like to > find a fix that works for release branches. A fix that folds &CST->x to a constant would be appropriate. And I think we do such folding somewhere, just maybe not in the right place(s). Richard. > -Nathan >
On Sun, Sep 05, 2010 at 09:56:39PM +0200, Richard Guenther wrote: > >> This doesn't make sense. If we are asking for &CST->x then we should have > >> folded it to a plain CST. > > > > I don't understand what you mean by this. Could you explain? > > &CST->x is a constant, we should fold it to it instead of keeping the address-of > expression (if it wouldn't be a constant it wouldn't and shouldn't be > is_gimple_min_invariant either). Hm, I don't know. I can see this being worse. Consider doing that folding to: *(&CST->x) = foo; *(&CST->y) = bar; we'll have: *(CST1) = foo; *(CST2) = bar; We might expand that to: t1 = CST1 [t1] = foo t2 = CST2 [t2] = bar Here we can't CSE CST1 and CST2. But if we'd gone with the original, we'd have something like: t1 = CST [t1+offsetof(x)] = foo t2 = CST [t2+offsetof(y)] = bar and here we can CSE CST easily. Of course, some backends might expand the folded example to: t1 = high(CST1) [t1+low(CST1)] = foo t2 = high(CST2) [t2+low(CST2)] = bar and high(CST1) is almost certainly equal to high(CST2), so we can CSE those. But they'd also expand the unfolded example to that same code, so I don't see the folding as winning you anything. > >> (and of course such a change wouldn't be appropriate for the release branches, > >> you basically change the GIMPLE IL) > > > > I'm happy to fix it however you like on mainline, but I'd also like to > > find a fix that works for release branches. > > A fix that folds &CST->x to a constant would be appropriate. And I think > we do such folding somewhere, just maybe not in the right place(s). I don't see an obvious case in fold-const.c. Part of the reason we don't do it, even if the code existed, is due to this bit in tree-ssa-ccp: tree ret, save = *base; tree new_base; new_base = fold_build2 (MEM_REF, TREE_TYPE (*base), unshare_expr (val), TREE_OPERAND (*base, 1)); /* We need to return a new tree, not modify the IL or share parts of it. So play some tricks to avoid manually building it. */ *base = new_base; ret = unshare_expr (rhs); recompute_tree_invariant_for_addr_expr (ret); *base = save; return ret; so when we propagate CST into ADDR_EXPRs, we're just copying things rather than folding. 4.3 marked the whole ADDR_EXPR as TREE_INVARIANT, and TREE_INVARIANT disappeared in 4.4. Trying to tweak is_gimple_min_invariant to something closer to 4.3's behavior seems like a more reasonable thing to do--at least for release branches. -Nathan
On Sun, Sep 5, 2010 at 10:47 PM, Nathan Froyd <froydnj@codesourcery.com> wrote: > On Sun, Sep 05, 2010 at 09:56:39PM +0200, Richard Guenther wrote: >> >> This doesn't make sense. If we are asking for &CST->x then we should have >> >> folded it to a plain CST. >> > >> > I don't understand what you mean by this. Could you explain? >> >> &CST->x is a constant, we should fold it to it instead of keeping the address-of >> expression (if it wouldn't be a constant it wouldn't and shouldn't be >> is_gimple_min_invariant either). > > Hm, I don't know. I can see this being worse. Consider doing that > folding to: > > *(&CST->x) = foo; > *(&CST->y) = bar; > > we'll have: > > *(CST1) = foo; > *(CST2) = bar; > > We might expand that to: > > t1 = CST1 > [t1] = foo > t2 = CST2 > [t2] = bar > > Here we can't CSE CST1 and CST2. But if we'd gone with the original, > we'd have something like: > > t1 = CST > [t1+offsetof(x)] = foo > t2 = CST > [t2+offsetof(y)] = bar > > and here we can CSE CST easily. Of course, some backends might expand > the folded example to: > > t1 = high(CST1) > [t1+low(CST1)] = foo > t2 = high(CST2) > [t2+low(CST2)] = bar > > and high(CST1) is almost certainly equal to high(CST2), so we can CSE > those. But they'd also expand the unfolded example to that same code, > so I don't see the folding as winning you anything. It's IL consistency. It's not GIMPLEs business to create target friendly constants. >> >> (and of course such a change wouldn't be appropriate for the release branches, >> >> you basically change the GIMPLE IL) >> > >> > I'm happy to fix it however you like on mainline, but I'd also like to >> > find a fix that works for release branches. >> >> A fix that folds &CST->x to a constant would be appropriate. And I think >> we do such folding somewhere, just maybe not in the right place(s). > > I don't see an obvious case in fold-const.c. Part of the reason we > don't do it, even if the code existed, is due to this bit in > tree-ssa-ccp: > > tree ret, save = *base; > tree new_base; > new_base = fold_build2 (MEM_REF, TREE_TYPE (*base), > unshare_expr (val), > TREE_OPERAND (*base, 1)); > /* We need to return a new tree, not modify the IL > or share parts of it. So play some tricks to > avoid manually building it. */ > *base = new_base; > ret = unshare_expr (rhs); > recompute_tree_invariant_for_addr_expr (ret); > *base = save; > return ret; > > so when we propagate CST into ADDR_EXPRs, we're just copying things > rather than folding. > > 4.3 marked the whole ADDR_EXPR as TREE_INVARIANT, and TREE_INVARIANT > disappeared in 4.4. Trying to tweak is_gimple_min_invariant to > something closer to 4.3's behavior seems like a more reasonable thing to > do--at least for release branches. No. I can look at the folding - is there a PR for this? Thanks, Richard. > -Nathan >
On Sun, Sep 05, 2010 at 11:26:56PM +0200, Richard Guenther wrote: > It's IL consistency. It's not GIMPLEs business to create target > friendly constants. *shrug* It's not about creating "friendly" constants, it's about creating something that the RTL optimizers can deal with appropriately. > I can look at the folding - is there a PR for this? Not one that I'm aware of. I can file one if you like. -Nathan
Index: gcc/ChangeLog =================================================================== Index: gcc/gimple.c =================================================================== --- gcc/gimple.c (revision 163868) +++ gcc/gimple.c (working copy) @@ -2591,7 +2591,13 @@ is_gimple_invariant_address (const_tree op = strip_invariant_refs (TREE_OPERAND (t, 0)); - return op && (CONSTANT_CLASS_P (op) || decl_address_invariant_p (op)); + if (!op) + return false; + + if (TREE_CODE (op) == INDIRECT_REF) + return CONSTANT_CLASS_P (TREE_OPERAND (op, 0)); + + return (CONSTANT_CLASS_P (op) || decl_address_invariant_p (op)); } /* Return true if T is a gimple invariant address at IPA level