Message ID | 213485283eede9da12b217737d95fc8f5c4be442.1463428211.git.segher@kernel.crashing.org |
---|---|
State | New |
Headers | show |
> We should do CLEANUP_EXPENSIVE after shrink-wrapping, because shrink- > wrapping creates constructs that CLEANUP_EXPENSIVE can optimise, and > nothing runs CLEANUP_EXPENSIVE later. We don't need cleanup_cfg before > shrink-wrapping, nothing in shrink-wrapping (or the other *logue insertion > code) cares at all. Are you sure of that? I agree that CLEANUP_EXPENSIVE might be overkill, but can't cleanup_cfg expose more opportunities to have multiple epilogues?
On Tue, May 17, 2016 at 10:05:57AM +0200, Eric Botcazou wrote: > > We should do CLEANUP_EXPENSIVE after shrink-wrapping, because shrink- > > wrapping creates constructs that CLEANUP_EXPENSIVE can optimise, and > > nothing runs CLEANUP_EXPENSIVE later. We don't need cleanup_cfg before > > shrink-wrapping, nothing in shrink-wrapping (or the other *logue insertion > > code) cares at all. > > Are you sure of that? I agree that CLEANUP_EXPENSIVE might be overkill, but > can't cleanup_cfg expose more opportunities to have multiple epilogues? How would it? The shrink-wrapping algorithms do not much care how you write your control flow. The only things I can think of are drastic things like removing some dead code, or converting a switch to a direct jump, but those had better be done for the immediately preceding passes already (register allocation). I can put back a cleanup_cfg (0) in front if that seems less tricky (or just safer)? Segher
> How would it? The shrink-wrapping algorithms do not much care how you > write your control flow. The only things I can think of are drastic > things like removing some dead code, or converting a switch to a direct > jump, but those had better be done for the immediately preceding passes > already (register allocation). But the compiler didn't wait until after shrink-wrapping to emit multiple epilogues and can still do that w/o shrink-wrapping. > I can put back a cleanup_cfg (0) in front if that seems less tricky > (or just safer)? I think you need to evaluate the effects of the change on a set of sources.
On Tue, May 17, 2016 at 11:08:53AM +0200, Eric Botcazou wrote: > > How would it? The shrink-wrapping algorithms do not much care how you > > write your control flow. The only things I can think of are drastic > > things like removing some dead code, or converting a switch to a direct > > jump, but those had better be done for the immediately preceding passes > > already (register allocation). > > But the compiler didn't wait until after shrink-wrapping to emit multiple > epilogues and can still do that w/o shrink-wrapping. It will only ever generate a single epilogue (unless you also count sibcall epilogues), and that is done after shrink-wrapping. Or you mean something else and I just don't see it. > > I can put back a cleanup_cfg (0) in front if that seems less tricky > > (or just safer)? > > I think you need to evaluate the effects of the change on a set of sources. Yeah I'll do that, thanks for the idea. Segher
On Tue, May 17, 2016 at 04:17:58AM -0500, Segher Boessenkool wrote: > On Tue, May 17, 2016 at 11:08:53AM +0200, Eric Botcazou wrote: > > > How would it? The shrink-wrapping algorithms do not much care how you > > > write your control flow. The only things I can think of are drastic > > > things like removing some dead code, or converting a switch to a direct > > > jump, but those had better be done for the immediately preceding passes > > > already (register allocation). > > > > But the compiler didn't wait until after shrink-wrapping to emit multiple > > epilogues and can still do that w/o shrink-wrapping. > > It will only ever generate a single epilogue (unless you also count > sibcall epilogues), and that is done after shrink-wrapping. Or you mean > something else and I just don't see it. > > > > I can put back a cleanup_cfg (0) in front if that seems less tricky > > > (or just safer)? > > > > I think you need to evaluate the effects of the change on a set of sources. > > Yeah I'll do that, thanks for the idea. I built cross-compilers for 30 targets, and built Linux with that. 6 of those failed for unrelated reasons. Of the 24 that do build, five show a few insns difference between having a cleanup_cfg before shrink-wrapping or not (CLEANUP_EXPENSIVE made no difference there). These targets are s390, blackfin, m68k, mn10300, nios2. It turns out that prepare_shrink_wrap *does* care about block structure: namely, it only moves insns from the "head" block to a successor. It then makes a difference when the cleanup_cfg can merge two successor blocks (say, the first is a forwarder block). This happens quite seldomly. So, shall I put a cleanup_cfg back before shrink-wrapping? [ I'm now also looking at what patch #3 (and #2) change; also small stuff ]. Segher
> I built cross-compilers for 30 targets, and built Linux with that. > 6 of those failed for unrelated reasons. Of the 24 that do build, > five show a few insns difference between having a cleanup_cfg before > shrink-wrapping or not (CLEANUP_EXPENSIVE made no difference there). > These targets are s390, blackfin, m68k, mn10300, nios2. Interesting, thanks for experimenting. > It turns out that prepare_shrink_wrap *does* care about block structure: > namely, it only moves insns from the "head" block to a successor. It > then makes a difference when the cleanup_cfg can merge two successor blocks > (say, the first is a forwarder block). This happens quite seldomly. > > So, shall I put a cleanup_cfg back before shrink-wrapping? Yes, I'd only swap CLEANUP_EXPENSIVE with 0, i.e. leave the calls themselves and add a comment on the first one saying that this helps shrink-wrapping too.
diff --git a/gcc/function.c b/gcc/function.c index 70584b9..b9a6338 100644 --- a/gcc/function.c +++ b/gcc/function.c @@ -6369,9 +6369,6 @@ make_pass_leaf_regs (gcc::context *ctxt) static unsigned int rest_of_handle_thread_prologue_and_epilogue (void) { - if (optimize) - cleanup_cfg (CLEANUP_EXPENSIVE); - /* On some machines, the prologue and epilogue code, or parts thereof, can be represented as RTL. Doing so lets us schedule insns between it and the rest of the code and also allows delayed branch @@ -6384,7 +6381,7 @@ rest_of_handle_thread_prologue_and_epilogue (void) /* Shrink-wrapping can result in unreachable edges in the epilogue, see PR57320. */ - cleanup_cfg (0); + cleanup_cfg (optimize ? CLEANUP_EXPENSIVE : 0); /* The stack usage info is finalized during prologue expansion. */ if (flag_stack_usage_info)