Message ID | 554CD2AE.7010304@mentor.com |
---|---|
State | New |
Headers | show |
On Fri, May 8, 2015 at 5:13 PM, Tom de Vries <Tom_deVries@mentor.com> wrote: > Hi, > > this patch fixes PR66013. > > > I. > > Consider this test-case, with a va_list passed from f2 to f2_1: > ... > #include <stdarg.h> > > inline int __attribute__((always_inline)) > f2_1 (va_list ap) > { > return va_arg (ap, int); > } > > int > f2 (int i, ...) > { > int res; > va_list ap; > > va_start (ap, i); > res = f2_1 (ap); > va_end (ap); > > return res; > } > ... > > When compiling at -O2 with -m32, before pass_stdarg we see that va_start and > va_arg (in the same function after inlining) use different aps (with the > same value though): > ... > # .MEM_2 = VDEF <.MEM_1(D)> > # USE = nonlocal escaped > # CLB = nonlocal escaped { D.1809 } > __builtin_va_startD.1021 (&apD.1809, 0); > > # VUSE <.MEM_2> > # PT = nonlocal > ap.0_3 = apD.1809; > > # .MEM_4 = VDEF <.MEM_2> > apD.1820 = ap.0_3; > > # .MEM_8 = VDEF <.MEM_4> > # USE = nonlocal null { D.1820 } (escaped) > # CLB = nonlocal null { D.1820 } (escaped) > _7 = VA_ARG (&apD.1820, 0B, 1); > ... > > After expand_ifn_va_arg_1, we have this representation, and that's the one > the pass_stdarg optimization operates on: > ... > # .MEM_2 = VDEF <.MEM_1(D)> > # USE = nonlocal escaped > # CLB = nonlocal escaped { D.1809 } > __builtin_va_startD.1021 (&apD.1809, 0); > > # VUSE <.MEM_2> > # PT = nonlocal > ap.0_3 = apD.1809; > > # .MEM_4 = VDEF <.MEM_2> > apD.1820 = ap.0_3; > > # VUSE <.MEM_4> > ap.4_9 = apD.1820; > > ap.5_10 = ap.4_9 + 4; > > # .MEM_11 = VDEF <.MEM_4> > apD.1820 = ap.5_10; > > # VUSE <.MEM_11> > _7 = MEM[(intD.1 *)ap.4_9]; > ... > > The optimization in pass_stdarg fails: > ... > f2: va_list escapes 1, needs to save all GPR units and all FPR units. > ... > > The optimization fails because this assignment makes the va_list escape: > ... > va_list escapes in # .MEM_4 = VDEF <.MEM_2> > apD.1820 = ap.0_3; > ... > > > II. > > By recalculating address_taken after expanding the ifn_va_arg, we get > instead: > ... > # .MEM_2 = VDEF <.MEM_1(D)> > # USE = nonlocal escaped > # CLB = nonlocal escaped { D.1809 } > __builtin_va_startD.1021 (&apD.1809, 0); > > # VUSE <.MEM_2> > # PT = nonlocal > ap.0_3 = apD.1809; > > ap_11 = ap.0_3; > > ap.4_9 = ap_11; > > ap.5_10 = ap.4_9 + 4; > > ap_4 = ap.5_10; > > # VUSE <.MEM_2> > _7 = MEM[(intD.1 *)ap.4_9]; > ... > > and the pass_stdarg optimization succeeds now: > ... > f2: va_list escapes 0, needs to save 4 GPR units and all FPR units. > ... > > Bootstrapped and reg-tested on x86_64 with and without -m32. > > OK for trunk? As noted in one of the PRs I think that it is the proper time to re-implement the stdarg optimization on the un-lowered form which should also fix this. Thanks, Richard. > > Thanks, > - Tom
On 08-05-15 17:31, Richard Biener wrote: >> >OK for trunk? > As noted in one of the PRs I think that it is the proper time to > re-implement the stdarg optimization on the un-lowered form which > should also fix this. AFAIU, the implementation of the stdarg optimization on the un-lowered form should contain an independent fix for this problem, and this fix won't be functional any more then, so indeed, it's a patch with (hopefully) a limited lifespan. But I thought that the patch is simple and low-risk enough to accept for trunk now, hence the submission. Also, I suppose it's not bad to start developing from a situation where more tests are passing. Thanks, - Tom
On Fri, May 8, 2015 at 9:16 PM, Tom de Vries <Tom_deVries@mentor.com> wrote: > On 08-05-15 17:31, Richard Biener wrote: >>> >>> >OK for trunk? >> >> As noted in one of the PRs I think that it is the proper time to >> re-implement the stdarg optimization on the un-lowered form which >> should also fix this. > > > AFAIU, the implementation of the stdarg optimization on the un-lowered form > should contain an independent fix for this problem, and this fix won't be > functional any more then, so indeed, it's a patch with (hopefully) a limited > lifespan. > > But I thought that the patch is simple and low-risk enough to accept for > trunk now, hence the submission. > > Also, I suppose it's not bad to start developing from a situation where more > tests are passing. Sure - but this adds a pass over the IL plus an SSA update to the compile. Sth which may also enable va-unrelated optimization and thus may cause regressions if you remove it later... Otherwise I agree of course. Thanks, Richard. > Thanks, > - Tom
Update address_taken after ifn_va_arg expansion 2015-05-08 Tom de Vries <tom@codesourcery.com> PR tree-optimization/66013 * tree-stdarg.c: Include tree-ssa.h. (expand_ifn_va_arg_1): Call execute_update_addresses_taken after TODO_update_ssa. * gcc.dg/tree-ssa/stdarg-2.c: Add ia32 scan for 'va_list escapes 0'. --- gcc/testsuite/gcc.dg/tree-ssa/stdarg-2.c | 2 ++ gcc/tree-stdarg.c | 2 ++ 2 files changed, 4 insertions(+) diff --git a/gcc/testsuite/gcc.dg/tree-ssa/stdarg-2.c b/gcc/testsuite/gcc.dg/tree-ssa/stdarg-2.c index f09b5de..3b1bc2c 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/stdarg-2.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/stdarg-2.c @@ -300,6 +300,8 @@ f15 (int i, ...) /* We may be able to improve upon this after fixing PR66010/PR66013. */ /* { dg-final { scan-tree-dump "f15: va_list escapes 1, needs to save all GPR units and all FPR units" "stdarg" { target alpha*-*-linux* } } } */ +/* { dg-final { scan-tree-dump "f15: va_list escapes 0" "stdarg" { target { { i?86-*-* x86_64-*-* } && ia32 } } } } */ + /* { dg-final { scan-tree-dump-not "f15: va_list escapes 0, needs to save 0 GPR units" "stdarg" { target { { i?86-*-* x86_64-*-* } && ia32 } } } } */ /* { dg-final { scan-tree-dump-not "f15: va_list escapes 0, needs to save 0 GPR units" "stdarg" { target ia64-*-* } } } */ /* { dg-final { scan-tree-dump-not "f15: va_list escapes 0, needs to save 0 GPR units" "stdarg" { target { powerpc*-*-* && lp64 } } } } */ diff --git a/gcc/tree-stdarg.c b/gcc/tree-stdarg.c index 1356374..64e6224 100644 --- a/gcc/tree-stdarg.c +++ b/gcc/tree-stdarg.c @@ -62,6 +62,7 @@ along with GCC; see the file COPYING3. If not see #include "tree-cfg.h" #include "tree-pass.h" #include "tree-stdarg.h" +#include "tree-ssa.h" /* A simple pass that attempts to optimize stdarg functions on architectures that need to save register arguments to stack on entry to stdarg functions. @@ -1108,6 +1109,7 @@ expand_ifn_va_arg_1 (function *fun) free_dominance_info (CDI_DOMINATORS); update_ssa (TODO_update_ssa); + execute_update_addresses_taken (); } /* Expand IFN_VA_ARGs in FUN, if necessary. */ -- 1.9.1