Message ID | fb8eff07-3445-9a76-5d03-fe02e2966ef3@linux.alibaba.com |
---|---|
State | New |
Headers | show |
Series | [PR90106] Builtin call transformation changes in cdce pass | expand |
On Wed, May 8, 2019 at 12:09 PM JunMa <JunMa@linux.alibaba.com> wrote: > > Hi > > As PR90106 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90106), > when gcc meets builtin function call like: > > y = sqrt (x); > > The cdce pass tries to transform the call into an internal function > call and conditionally executes call with a simple range check on the > arguments which can detect most cases and the errno does not need > to be set. It looks like: > > y = IFN_SQRT (x); > if (__builtin_isless (x, 0)) > sqrt (x); > > However, If the call is in tail position, for example: > > y = sqrt (x); > return y; > > will become: > > y = IFN_SQRT (x); > if (__builtin_isless (x, 0)) > sqrt (x); > return y; > > This transformation breaks tailcall pattern, and prevents > later tailcall optimizations. > > So This patch transform builtin call with return value into > if-then-else part, which looks like: > > y = sqrt (x); > ==> > if (__builtin_isless (x, 0)) > y = sqrt (x); > else > y = IFN_SQRT (x); > > BTW, y = sqrt (x) can also transform like: > > y = IFN_SQRT (x); > if (__builtin_isless (x, 0)) > y = sqrt (x); > > We don‘t choose this pattern because it emits worse assemble > code(more move instruction and use more registers) in x86_64. > > Bootstrapped/regtested on x86_64-linux, ok for trunk? OK. Thanks, Richard. > Regards > JunMa > > > gcc/ChangeLog > > 2019-05-07 Jun Ma <JunMa@linux.alibaba.com> > > PR tree-optimization/90106 > * tree-call-cdce.c (shrink_wrap_one_built_in_call_with_conds): Add > new parameter as new internal function call, also move it to new > basic block. > (use_internal_fn): Pass internal function call to > shrink_wrap_one_built_in_call_with_conds. > > gcc/testsuite/ChangeLog > > 2019-05-07 Jun Ma <JunMa@linux.alibaba.com> > > PR tree-optimization/90106 > * gcc.dg/cdce1.c: Check tailcall code generation after cdce pass. > * gcc.dg/cdce2.c: Likewise. >
On 5/8/19 6:28 AM, Richard Biener wrote: > On Wed, May 8, 2019 at 12:09 PM JunMa <JunMa@linux.alibaba.com> wrote: >> >> Hi >> >> As PR90106 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90106), >> when gcc meets builtin function call like: >> >> y = sqrt (x); >> >> The cdce pass tries to transform the call into an internal function >> call and conditionally executes call with a simple range check on the >> arguments which can detect most cases and the errno does not need >> to be set. It looks like: >> >> y = IFN_SQRT (x); >> if (__builtin_isless (x, 0)) >> sqrt (x); >> >> However, If the call is in tail position, for example: >> >> y = sqrt (x); >> return y; >> >> will become: >> >> y = IFN_SQRT (x); >> if (__builtin_isless (x, 0)) >> sqrt (x); >> return y; >> >> This transformation breaks tailcall pattern, and prevents >> later tailcall optimizations. >> >> So This patch transform builtin call with return value into >> if-then-else part, which looks like: >> >> y = sqrt (x); >> ==> >> if (__builtin_isless (x, 0)) >> y = sqrt (x); >> else >> y = IFN_SQRT (x); >> >> BTW, y = sqrt (x) can also transform like: >> >> y = IFN_SQRT (x); >> if (__builtin_isless (x, 0)) >> y = sqrt (x); >> >> We don‘t choose this pattern because it emits worse assemble >> code(more move instruction and use more registers) in x86_64. >> >> Bootstrapped/regtested on x86_64-linux, ok for trunk? > > OK. JunMa -- do you have a copyright assignment on file and write access to the repository? If not we should take care of that before proceeding further. Jeff
On Thu, May 9, 2019 at 5:31 AM Jeff Law <law@redhat.com> wrote: > > On 5/8/19 6:28 AM, Richard Biener wrote: > > On Wed, May 8, 2019 at 12:09 PM JunMa <JunMa@linux.alibaba.com> wrote: > >> > >> Hi > >> > >> As PR90106 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90106), > >> when gcc meets builtin function call like: > >> > >> y = sqrt (x); > >> > >> The cdce pass tries to transform the call into an internal function > >> call and conditionally executes call with a simple range check on the > >> arguments which can detect most cases and the errno does not need > >> to be set. It looks like: > >> > >> y = IFN_SQRT (x); > >> if (__builtin_isless (x, 0)) > >> sqrt (x); > >> > >> However, If the call is in tail position, for example: > >> > >> y = sqrt (x); > >> return y; > >> > >> will become: > >> > >> y = IFN_SQRT (x); > >> if (__builtin_isless (x, 0)) > >> sqrt (x); > >> return y; > >> > >> This transformation breaks tailcall pattern, and prevents > >> later tailcall optimizations. > >> > >> So This patch transform builtin call with return value into > >> if-then-else part, which looks like: > >> > >> y = sqrt (x); > >> ==> > >> if (__builtin_isless (x, 0)) > >> y = sqrt (x); > >> else > >> y = IFN_SQRT (x); > >> > >> BTW, y = sqrt (x) can also transform like: > >> > >> y = IFN_SQRT (x); > >> if (__builtin_isless (x, 0)) > >> y = sqrt (x); > >> > >> We don‘t choose this pattern because it emits worse assemble > >> code(more move instruction and use more registers) in x86_64. > >> > >> Bootstrapped/regtested on x86_64-linux, ok for trunk? > > > > OK. > JunMa -- do you have a copyright assignment on file and write access to > the repository? If not we should take care of that before proceeding > further. Hi Jeff, Thanks very much for helping. Yes, he is under Alibaba's copyright assignment. What else should we do other than noticing in this mailing list message? BTW, I think JunMa needs to apply write-access though. Thanks, bin > > Jeff
在 2019/5/9 上午9:20, Bin.Cheng 写道: > On Thu, May 9, 2019 at 5:31 AM Jeff Law <law@redhat.com> wrote: >> On 5/8/19 6:28 AM, Richard Biener wrote: >>> On Wed, May 8, 2019 at 12:09 PM JunMa <JunMa@linux.alibaba.com> wrote: >>>> Hi >>>> >>>> As PR90106 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90106), >>>> when gcc meets builtin function call like: >>>> >>>> y = sqrt (x); >>>> >>>> The cdce pass tries to transform the call into an internal function >>>> call and conditionally executes call with a simple range check on the >>>> arguments which can detect most cases and the errno does not need >>>> to be set. It looks like: >>>> >>>> y = IFN_SQRT (x); >>>> if (__builtin_isless (x, 0)) >>>> sqrt (x); >>>> >>>> However, If the call is in tail position, for example: >>>> >>>> y = sqrt (x); >>>> return y; >>>> >>>> will become: >>>> >>>> y = IFN_SQRT (x); >>>> if (__builtin_isless (x, 0)) >>>> sqrt (x); >>>> return y; >>>> >>>> This transformation breaks tailcall pattern, and prevents >>>> later tailcall optimizations. >>>> >>>> So This patch transform builtin call with return value into >>>> if-then-else part, which looks like: >>>> >>>> y = sqrt (x); >>>> ==> >>>> if (__builtin_isless (x, 0)) >>>> y = sqrt (x); >>>> else >>>> y = IFN_SQRT (x); >>>> >>>> BTW, y = sqrt (x) can also transform like: >>>> >>>> y = IFN_SQRT (x); >>>> if (__builtin_isless (x, 0)) >>>> y = sqrt (x); >>>> >>>> We don‘t choose this pattern because it emits worse assemble >>>> code(more move instruction and use more registers) in x86_64. >>>> >>>> Bootstrapped/regtested on x86_64-linux, ok for trunk? >>> OK. >> JunMa -- do you have a copyright assignment on file and write access to >> the repository? If not we should take care of that before proceeding >> further. > Hi Jeff, > Thanks very much for helping. > Yes, he is under Alibaba's copyright assignment. What else should we > do other than noticing in this mailing list message? > BTW, I think JunMa needs to apply write-access though. Yes, I do not have write access, FYI. Thanks JunMa > Thanks, > bin >> Jeff
On 5/8/19 7:20 PM, Bin.Cheng wrote: > On Thu, May 9, 2019 at 5:31 AM Jeff Law <law@redhat.com> wrote: >> >> On 5/8/19 6:28 AM, Richard Biener wrote: >>> On Wed, May 8, 2019 at 12:09 PM JunMa <JunMa@linux.alibaba.com> wrote: >>>> >>>> Hi >>>> >>>> As PR90106 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90106), >>>> when gcc meets builtin function call like: >>>> >>>> y = sqrt (x); >>>> >>>> The cdce pass tries to transform the call into an internal function >>>> call and conditionally executes call with a simple range check on the >>>> arguments which can detect most cases and the errno does not need >>>> to be set. It looks like: >>>> >>>> y = IFN_SQRT (x); >>>> if (__builtin_isless (x, 0)) >>>> sqrt (x); >>>> >>>> However, If the call is in tail position, for example: >>>> >>>> y = sqrt (x); >>>> return y; >>>> >>>> will become: >>>> >>>> y = IFN_SQRT (x); >>>> if (__builtin_isless (x, 0)) >>>> sqrt (x); >>>> return y; >>>> >>>> This transformation breaks tailcall pattern, and prevents >>>> later tailcall optimizations. >>>> >>>> So This patch transform builtin call with return value into >>>> if-then-else part, which looks like: >>>> >>>> y = sqrt (x); >>>> ==> >>>> if (__builtin_isless (x, 0)) >>>> y = sqrt (x); >>>> else >>>> y = IFN_SQRT (x); >>>> >>>> BTW, y = sqrt (x) can also transform like: >>>> >>>> y = IFN_SQRT (x); >>>> if (__builtin_isless (x, 0)) >>>> y = sqrt (x); >>>> >>>> We don‘t choose this pattern because it emits worse assemble >>>> code(more move instruction and use more registers) in x86_64. >>>> >>>> Bootstrapped/regtested on x86_64-linux, ok for trunk? >>> >>> OK. >> JunMa -- do you have a copyright assignment on file and write access to >> the repository? If not we should take care of that before proceeding >> further. > Hi Jeff, > Thanks very much for helping. > Yes, he is under Alibaba's copyright assignment. What else should we > do other than noticing in this mailing list message? > BTW, I think JunMa needs to apply write-access though. If he's covered under a corporate assignment, then we're fine from that standpoint. For write access, we just need to go through the usual procedure. JunMa just needs to fill out this form, I'm happy to sponsor him. https://sourceware.org/cgi-bin/pdw/ps_form.cgi Jeff
On 5/8/19 8:25 PM, JunMa wrote: > 在 2019/5/9 上午9:20, Bin.Cheng 写道: >> On Thu, May 9, 2019 at 5:31 AM Jeff Law <law@redhat.com> wrote: >>> On 5/8/19 6:28 AM, Richard Biener wrote: >>>> On Wed, May 8, 2019 at 12:09 PM JunMa <JunMa@linux.alibaba.com> wrote: >>>>> Hi >>>>> >>>>> As PR90106 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90106), >>>>> when gcc meets builtin function call like: >>>>> >>>>> y = sqrt (x); >>>>> >>>>> The cdce pass tries to transform the call into an internal function >>>>> call and conditionally executes call with a simple range check on the >>>>> arguments which can detect most cases and the errno does not need >>>>> to be set. It looks like: >>>>> >>>>> y = IFN_SQRT (x); >>>>> if (__builtin_isless (x, 0)) >>>>> sqrt (x); >>>>> >>>>> However, If the call is in tail position, for example: >>>>> >>>>> y = sqrt (x); >>>>> return y; >>>>> >>>>> will become: >>>>> >>>>> y = IFN_SQRT (x); >>>>> if (__builtin_isless (x, 0)) >>>>> sqrt (x); >>>>> return y; >>>>> >>>>> This transformation breaks tailcall pattern, and prevents >>>>> later tailcall optimizations. >>>>> >>>>> So This patch transform builtin call with return value into >>>>> if-then-else part, which looks like: >>>>> >>>>> y = sqrt (x); >>>>> ==> >>>>> if (__builtin_isless (x, 0)) >>>>> y = sqrt (x); >>>>> else >>>>> y = IFN_SQRT (x); >>>>> >>>>> BTW, y = sqrt (x) can also transform like: >>>>> >>>>> y = IFN_SQRT (x); >>>>> if (__builtin_isless (x, 0)) >>>>> y = sqrt (x); >>>>> >>>>> We don‘t choose this pattern because it emits worse assemble >>>>> code(more move instruction and use more registers) in x86_64. >>>>> >>>>> Bootstrapped/regtested on x86_64-linux, ok for trunk? >>>> OK. >>> JunMa -- do you have a copyright assignment on file and write access to >>> the repository? If not we should take care of that before proceeding >>> further. >> Hi Jeff, >> Thanks very much for helping. >> Yes, he is under Alibaba's copyright assignment. What else should we >> do other than noticing in this mailing list message? >> BTW, I think JunMa needs to apply write-access though. > > Yes, I do not have write access, FYI. OK. So in my response to Bin I gave you a link to the form to fill out to get write access. List me as your sponsor. Once that's all taken care of and working, go ahead and commit this change to the trunk. Jeff
在 2019/5/10 上午4:00, Jeff Law 写道: > On 5/8/19 8:25 PM, JunMa wrote: >> 在 2019/5/9 上午9:20, Bin.Cheng 写道: >>> On Thu, May 9, 2019 at 5:31 AM Jeff Law <law@redhat.com> wrote: >>>> On 5/8/19 6:28 AM, Richard Biener wrote: >>>>> On Wed, May 8, 2019 at 12:09 PM JunMa <JunMa@linux.alibaba.com> wrote: >>>>>> Hi >>>>>> >>>>>> As PR90106 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90106), >>>>>> when gcc meets builtin function call like: >>>>>> >>>>>> y = sqrt (x); >>>>>> >>>>>> The cdce pass tries to transform the call into an internal function >>>>>> call and conditionally executes call with a simple range check on the >>>>>> arguments which can detect most cases and the errno does not need >>>>>> to be set. It looks like: >>>>>> >>>>>> y = IFN_SQRT (x); >>>>>> if (__builtin_isless (x, 0)) >>>>>> sqrt (x); >>>>>> >>>>>> However, If the call is in tail position, for example: >>>>>> >>>>>> y = sqrt (x); >>>>>> return y; >>>>>> >>>>>> will become: >>>>>> >>>>>> y = IFN_SQRT (x); >>>>>> if (__builtin_isless (x, 0)) >>>>>> sqrt (x); >>>>>> return y; >>>>>> >>>>>> This transformation breaks tailcall pattern, and prevents >>>>>> later tailcall optimizations. >>>>>> >>>>>> So This patch transform builtin call with return value into >>>>>> if-then-else part, which looks like: >>>>>> >>>>>> y = sqrt (x); >>>>>> ==> >>>>>> if (__builtin_isless (x, 0)) >>>>>> y = sqrt (x); >>>>>> else >>>>>> y = IFN_SQRT (x); >>>>>> >>>>>> BTW, y = sqrt (x) can also transform like: >>>>>> >>>>>> y = IFN_SQRT (x); >>>>>> if (__builtin_isless (x, 0)) >>>>>> y = sqrt (x); >>>>>> >>>>>> We don‘t choose this pattern because it emits worse assemble >>>>>> code(more move instruction and use more registers) in x86_64. >>>>>> >>>>>> Bootstrapped/regtested on x86_64-linux, ok for trunk? >>>>> OK. >>>> JunMa -- do you have a copyright assignment on file and write access to >>>> the repository? If not we should take care of that before proceeding >>>> further. >>> Hi Jeff, >>> Thanks very much for helping. >>> Yes, he is under Alibaba's copyright assignment. What else should we >>> do other than noticing in this mailing list message? >>> BTW, I think JunMa needs to apply write-access though. >> Yes, I do not have write access, FYI. > OK. So in my response to Bin I gave you a link to the form to fill out > to get write access. List me as your sponsor. Once that's all taken > care of and working, go ahead and commit this change to the trunk. Thanks! JunMa > Jeff
On Wed, May 08, 2019 at 06:09:06PM +0800, JunMa wrote: > 2019-05-07 Jun Ma <JunMa@linux.alibaba.com> > > PR tree-optimization/90106 > * gcc.dg/cdce1.c: Check tailcall code generation after cdce pass. > * gcc.dg/cdce2.c: Likewise. This is wrong and results in UNSUPPORTED failures. Both tests are dg-do run, so you can't scan-assembler for them, assembly isn't generated at all (it could be if -save-temps is in dg-options). Furthermore, I don't any target actually spells the tailcall in the assembly as jmp space pow/log, rather than say on x86_64/i686 jmp tab pow/log, but on most other targets completely differently (and some targets don't support tail calls at all, or have various limitations for them etc.). One possibility is to add -fdump-tree-optimized and scan for /* { dg-final { scan-tree-dump "pow \\(\[^\n\r]*\\); \\\[tail call\\\]" "optimized" } } */ resp. /* { dg-final { scan-tree-dump "log \\(\[^\n\r]*\\); \\\[tail call\\\]" "optimized" } } */ > diff --git a/gcc/testsuite/gcc.dg/cdce1.c b/gcc/testsuite/gcc.dg/cdce1.c > index b23ad63..424d80f 100644 > --- a/gcc/testsuite/gcc.dg/cdce1.c > +++ b/gcc/testsuite/gcc.dg/cdce1.c > @@ -1,7 +1,8 @@ > /* { dg-do run } */ > /* { dg-options "-O2 -fmath-errno -fdump-tree-cdce-details -lm" } */ > /* { dg-require-effective-target int32plus } */ > -/* { dg-final { scan-tree-dump "cdce1.c:16: .* function call is shrink-wrapped into error conditions\." "cdce" } } */ > +/* { dg-final { scan-tree-dump "cdce1.c:17: .* function call is shrink-wrapped into error conditions\." "cdce" } } */ > +/* { dg-final { scan-assembler "jmp pow" } } */ > /* { dg-require-effective-target large_double } */ > > #include <stdlib.h> > diff --git a/gcc/testsuite/gcc.dg/cdce2.c b/gcc/testsuite/gcc.dg/cdce2.c > index 30e7cb1..2af2893 100644 > --- a/gcc/testsuite/gcc.dg/cdce2.c > +++ b/gcc/testsuite/gcc.dg/cdce2.c > @@ -1,7 +1,8 @@ > /* { dg-do run } */ > /* { dg-skip-if "doubles are floats" { "avr-*-*" } } */ > /* { dg-options "-O2 -fmath-errno -fdump-tree-cdce-details -lm" } */ > -/* { dg-final { scan-tree-dump "cdce2.c:15: .* function call is shrink-wrapped into error conditions\." "cdce" } } */ > +/* { dg-final { scan-tree-dump "cdce2.c:16: .* function call is shrink-wrapped into error conditions\." "cdce" } } */ > +/* { dg-final { scan-assembler "jmp log" } } */ > > #include <stdlib.h> > #include <math.h> Jakub
On Thu, May 16, 2019 at 11:39:38PM +0200, Jakub Jelinek wrote: > One possibility is to add -fdump-tree-optimized and scan for > /* { dg-final { scan-tree-dump "pow \\(\[^\n\r]*\\); \\\[tail call\\\]" "optimized" } } */ > resp. > /* { dg-final { scan-tree-dump "log \\(\[^\n\r]*\\); \\\[tail call\\\]" "optimized" } } */ Here it is in patch form. That said, I'm not convinced your patch does what you wanted, because comparing a month old trunk with today's trunk generates the same assembly except for .ident, generates as many [tail call] lines in *.optimized dump as before, emits the same number of jmp\tpow and jmp\tlog instructions as before (one in a separate routine). But at least the tests aren't UNSUPPORTED anymore. 2019-05-16 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/90106 * gcc.dg/cdce1.c: Don't scan-assembler, instead -fdump-tree-optimized and scan-tree-dump for tail call. * gcc.dg/cdce2.c: Likewise. --- gcc/testsuite/gcc.dg/cdce1.c.jj 2019-05-16 11:28:22.750177582 +0200 +++ gcc/testsuite/gcc.dg/cdce1.c 2019-05-16 23:50:23.618450891 +0200 @@ -1,9 +1,9 @@ -/* { dg-do run } */ -/* { dg-options "-O2 -fmath-errno -fdump-tree-cdce-details -lm" } */ +/* { dg-do run } */ +/* { dg-options "-O2 -fmath-errno -fdump-tree-cdce-details -fdump-tree-optimized -lm" } */ /* { dg-require-effective-target int32plus } */ -/* { dg-final { scan-tree-dump "cdce1.c:17: .* function call is shrink-wrapped into error conditions\." "cdce" } } */ -/* { dg-final { scan-assembler "jmp pow" } } */ /* { dg-require-effective-target large_double } */ +/* { dg-final { scan-tree-dump "cdce1.c:17: .* function call is shrink-wrapped into error conditions\." "cdce" } } */ +/* { dg-final { scan-tree-dump "pow \\(\[^\n\r]*\\); \\\[tail call\\\]" "optimized" } } */ #include <stdlib.h> #include <math.h> --- gcc/testsuite/gcc.dg/cdce2.c.jj 2019-05-16 11:28:22.781177075 +0200 +++ gcc/testsuite/gcc.dg/cdce2.c 2019-05-16 23:50:58.505880845 +0200 @@ -1,8 +1,8 @@ -/* { dg-do run } */ +/* { dg-do run } */ /* { dg-skip-if "doubles are floats" { "avr-*-*" } } */ -/* { dg-options "-O2 -fmath-errno -fdump-tree-cdce-details -lm" } */ -/* { dg-final { scan-tree-dump "cdce2.c:16: .* function call is shrink-wrapped into error conditions\." "cdce" } } */ -/* { dg-final { scan-assembler "jmp log" } } */ +/* { dg-options "-O2 -fmath-errno -fdump-tree-cdce-details -fdump-tree-optimized -lm" } */ +/* { dg-final { scan-tree-dump "cdce2.c:16: .* function call is shrink-wrapped into error conditions\." "cdce" } } */ +/* { dg-final { scan-tree-dump "log \\(\[^\n\r]*\\); \\\[tail call\\\]" "optimized" } } */ #include <stdlib.h> #include <math.h> Jakub
在 2019/5/17 上午6:04, Jakub Jelinek 写道: > On Thu, May 16, 2019 at 11:39:38PM +0200, Jakub Jelinek wrote: >> One possibility is to add -fdump-tree-optimized and scan for >> /* { dg-final { scan-tree-dump "pow \\(\[^\n\r]*\\); \\\[tail call\\\]" "optimized" } } */ >> resp. >> /* { dg-final { scan-tree-dump "log \\(\[^\n\r]*\\); \\\[tail call\\\]" "optimized" } } */ > Here it is in patch form. > > That said, I'm not convinced your patch does what you wanted, because > comparing a month old trunk with today's trunk generates the same assembly > except for .ident, generates as many [tail call] lines in *.optimized dump > as before, emits the same number of jmp\tpow and jmp\tlog instructions as > before (one in a separate routine). Thanks for point out the mistake and fix it. For these two tests, cdce pass doesn't transform the builtin math functions in foo with or without the patch because they cannot use internal functions. I'll add another testcase to verify the patch. Regards Jun > But at least the tests aren't UNSUPPORTED anymore. > > 2019-05-16 Jakub Jelinek <jakub@redhat.com> > > PR tree-optimization/90106 > * gcc.dg/cdce1.c: Don't scan-assembler, instead -fdump-tree-optimized > and scan-tree-dump for tail call. > * gcc.dg/cdce2.c: Likewise. > > --- gcc/testsuite/gcc.dg/cdce1.c.jj 2019-05-16 11:28:22.750177582 +0200 > +++ gcc/testsuite/gcc.dg/cdce1.c 2019-05-16 23:50:23.618450891 +0200 > @@ -1,9 +1,9 @@ > -/* { dg-do run } */ > -/* { dg-options "-O2 -fmath-errno -fdump-tree-cdce-details -lm" } */ > +/* { dg-do run } */ > +/* { dg-options "-O2 -fmath-errno -fdump-tree-cdce-details -fdump-tree-optimized -lm" } */ > /* { dg-require-effective-target int32plus } */ > -/* { dg-final { scan-tree-dump "cdce1.c:17: .* function call is shrink-wrapped into error conditions\." "cdce" } } */ > -/* { dg-final { scan-assembler "jmp pow" } } */ > /* { dg-require-effective-target large_double } */ > +/* { dg-final { scan-tree-dump "cdce1.c:17: .* function call is shrink-wrapped into error conditions\." "cdce" } } */ > +/* { dg-final { scan-tree-dump "pow \\(\[^\n\r]*\\); \\\[tail call\\\]" "optimized" } } */ > > #include <stdlib.h> > #include <math.h> > --- gcc/testsuite/gcc.dg/cdce2.c.jj 2019-05-16 11:28:22.781177075 +0200 > +++ gcc/testsuite/gcc.dg/cdce2.c 2019-05-16 23:50:58.505880845 +0200 > @@ -1,8 +1,8 @@ > -/* { dg-do run } */ > +/* { dg-do run } */ > /* { dg-skip-if "doubles are floats" { "avr-*-*" } } */ > -/* { dg-options "-O2 -fmath-errno -fdump-tree-cdce-details -lm" } */ > -/* { dg-final { scan-tree-dump "cdce2.c:16: .* function call is shrink-wrapped into error conditions\." "cdce" } } */ > -/* { dg-final { scan-assembler "jmp log" } } */ > +/* { dg-options "-O2 -fmath-errno -fdump-tree-cdce-details -fdump-tree-optimized -lm" } */ > +/* { dg-final { scan-tree-dump "cdce2.c:16: .* function call is shrink-wrapped into error conditions\." "cdce" } } */ > +/* { dg-final { scan-tree-dump "log \\(\[^\n\r]*\\); \\\[tail call\\\]" "optimized" } } */ > > #include <stdlib.h> > #include <math.h> > > > Jakub
在 2019/5/17 上午11:09, JunMa 写道: > 在 2019/5/17 上午6:04, Jakub Jelinek 写道: >> On Thu, May 16, 2019 at 11:39:38PM +0200, Jakub Jelinek wrote: >>> One possibility is to add -fdump-tree-optimized and scan for >>> /* { dg-final { scan-tree-dump "pow \\(\[^\n\r]*\\); \\\[tail >>> call\\\]" "optimized" } } */ >>> resp. >>> /* { dg-final { scan-tree-dump "log \\(\[^\n\r]*\\); \\\[tail >>> call\\\]" "optimized" } } */ >> Here it is in patch form. >> >> That said, I'm not convinced your patch does what you wanted, because >> comparing a month old trunk with today's trunk generates the same >> assembly >> except for .ident, generates as many [tail call] lines in *.optimized >> dump >> as before, emits the same number of jmp\tpow and jmp\tlog >> instructions as >> before (one in a separate routine). > > > Thanks for point out the mistake and fix it. > > For these two tests, cdce pass doesn't transform the builtin math > functions in foo > with or without the patch because they cannot use internal functions. > > I'll add another testcase to verify the patch. Here is the new testcase. The sqrtf function call keeps as tailcall with the patch but not without the patch. For more details, you can read https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90106. Bootstrapped/regtested on x86_64-linux, ok for trunk? Regards Jun gcc/testsuite/ChangeLog 2019-05-17 Jun Ma <JunMa@linux.alibaba.com> PR tree-optimization/90106 * gcc.dg/cdce3.c: New test. > > Regards > Jun > > >> But at least the tests aren't UNSUPPORTED anymore. >> >> 2019-05-16 Jakub Jelinek <jakub@redhat.com> >> >> PR tree-optimization/90106 >> * gcc.dg/cdce1.c: Don't scan-assembler, instead >> -fdump-tree-optimized >> and scan-tree-dump for tail call. >> * gcc.dg/cdce2.c: Likewise. >> >> --- gcc/testsuite/gcc.dg/cdce1.c.jj 2019-05-16 11:28:22.750177582 >> +0200 >> +++ gcc/testsuite/gcc.dg/cdce1.c 2019-05-16 23:50:23.618450891 +0200 >> @@ -1,9 +1,9 @@ >> -/* { dg-do run } */ >> -/* { dg-options "-O2 -fmath-errno -fdump-tree-cdce-details -lm" } */ >> +/* { dg-do run } */ >> +/* { dg-options "-O2 -fmath-errno -fdump-tree-cdce-details >> -fdump-tree-optimized -lm" } */ >> /* { dg-require-effective-target int32plus } */ >> -/* { dg-final { scan-tree-dump "cdce1.c:17: .* function call is >> shrink-wrapped into error conditions\." "cdce" } } */ >> -/* { dg-final { scan-assembler "jmp pow" } } */ >> /* { dg-require-effective-target large_double } */ >> +/* { dg-final { scan-tree-dump "cdce1.c:17: .* function call is >> shrink-wrapped into error conditions\." "cdce" } } */ >> +/* { dg-final { scan-tree-dump "pow \\(\[^\n\r]*\\); \\\[tail >> call\\\]" "optimized" } } */ >> #include <stdlib.h> >> #include <math.h> >> --- gcc/testsuite/gcc.dg/cdce2.c.jj 2019-05-16 11:28:22.781177075 >> +0200 >> +++ gcc/testsuite/gcc.dg/cdce2.c 2019-05-16 23:50:58.505880845 +0200 >> @@ -1,8 +1,8 @@ >> -/* { dg-do run } */ >> +/* { dg-do run } */ >> /* { dg-skip-if "doubles are floats" { "avr-*-*" } } */ >> -/* { dg-options "-O2 -fmath-errno -fdump-tree-cdce-details -lm" } */ >> -/* { dg-final { scan-tree-dump "cdce2.c:16: .* function call is >> shrink-wrapped into error conditions\." "cdce" } } */ >> -/* { dg-final { scan-assembler "jmp log" } } */ >> +/* { dg-options "-O2 -fmath-errno -fdump-tree-cdce-details >> -fdump-tree-optimized -lm" } */ >> +/* { dg-final { scan-tree-dump "cdce2.c:16: .* function call is >> shrink-wrapped into error conditions\." "cdce" } } */ >> +/* { dg-final { scan-tree-dump "log \\(\[^\n\r]*\\); \\\[tail >> call\\\]" "optimized" } } */ >> #include <stdlib.h> >> #include <math.h> >> >> >> Jakub > --- gcc/testsuite/gcc.dg/cdce3.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/cdce3.c diff --git a/gcc/testsuite/gcc.dg/cdce3.c b/gcc/testsuite/gcc.dg/cdce3.c new file mode 100644 index 0000000..0062c4f --- /dev/null +++ b/gcc/testsuite/gcc.dg/cdce3.c @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fmath-errno -fdump-tree-cdce-details -fdump-tree-optimized -lm" } */ +/* { dg-final { scan-tree-dump "cdce3.c:10: .* function call is shrink-wrapped into error conditions\." "cdce" } } */ +/* { dg-final { scan-tree-dump "sqrtf \\(\[^\n\r]*\\); \\\[tail call\\\]" "optimized" } } */ + +#include <math.h> + +float foo ( float x ) +{ + return sqrtf( x ); +} +
On Fri, May 17, 2019 at 02:24:22PM +0800, JunMa wrote: > 2019-05-17 Jun Ma <JunMa@linux.alibaba.com> Two spaces before < rather than one. > PR tree-optimization/90106 > * gcc.dg/cdce3.c: New test. > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/cdce3.c > @@ -0,0 +1,12 @@ > +/* { dg-do compile } */ Just use one space instead of two. > +/* { dg-options "-O2 -fmath-errno -fdump-tree-cdce-details -fdump-tree-optimized -lm" } */ For compile time test, no need to add " -lm" (well, no need to add it even for link/run tests). > +/* { dg-final { scan-tree-dump "cdce3.c:10: .* function call is shrink-wrapped into error conditions\." "cdce" } } */ Please use \[^\n\r]* instead of .*, you don't want newlines matched in there. > +/* { dg-final { scan-tree-dump "sqrtf \\(\[^\n\r]*\\); \\\[tail call\\\]" "optimized" } } */ > + > +#include <math.h> Wouldn't it be better to just declare it yourself: float sqrtf (float); ? You really don't know what the target math.h includes. > + > +float foo ( float x ) > +{ > + return sqrtf( x ); > +} > + > -- > 1.8.3.1 > Jakub
在 2019/5/17 下午2:34, Jakub Jelinek 写道: > On Fri, May 17, 2019 at 02:24:22PM +0800, JunMa wrote: >> 2019-05-17 Jun Ma <JunMa@linux.alibaba.com> > Two spaces before < rather than one. > >> PR tree-optimization/90106 >> * gcc.dg/cdce3.c: New test. >> --- /dev/null >> +++ b/gcc/testsuite/gcc.dg/cdce3.c >> @@ -0,0 +1,12 @@ >> +/* { dg-do compile } */ > Just use one space instead of two. > >> +/* { dg-options "-O2 -fmath-errno -fdump-tree-cdce-details -fdump-tree-optimized -lm" } */ > For compile time test, no need to add " -lm" (well, no need to add it even > for link/run tests). > >> +/* { dg-final { scan-tree-dump "cdce3.c:10: .* function call is shrink-wrapped into error conditions\." "cdce" } } */ > Please use \[^\n\r]* instead of .*, you don't want newlines matched in > there. > >> +/* { dg-final { scan-tree-dump "sqrtf \\(\[^\n\r]*\\); \\\[tail call\\\]" "optimized" } } */ >> + >> +#include <math.h> > Wouldn't it be better to just declare it yourself: > float sqrtf (float); > ? > You really don't know what the target math.h includes. > >> + >> +float foo ( float x ) >> +{ >> + return sqrtf( x ); >> +} >> + >> -- >> 1.8.3.1 >> Thanks for review so carefully. Update the patch based on your suggestion. Regards Jun gcc/testsuite/ChangeLog 2019-05-17 Jun Ma <JunMa@linux.alibaba.com> PR tree-optimization/90106 * gcc.dg/cdce3.c: New test. > Jakub --- gcc/testsuite/gcc.dg/cdce3.c | 11 +++++++++++ 1 file changed, 11 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/cdce3.c diff --git a/gcc/testsuite/gcc.dg/cdce3.c b/gcc/testsuite/gcc.dg/cdce3.c new file mode 100644 index 0000000..8a74379 --- /dev/null +++ b/gcc/testsuite/gcc.dg/cdce3.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fmath-errno -fdump-tree-cdce-details -fdump-tree-optimized" } */ +/* { dg-final { scan-tree-dump "cdce3.c:9: \[^\n\r]* function call is shrink-wrapped into error conditions\." "cdce" } } */ +/* { dg-final { scan-tree-dump "sqrtf \\(\[^\n\r]*\\); \\\[tail call\\\]" "optimized" } } */ + +float sqrtf (float); +float foo (float x) +{ + return sqrtf (x); +} +
On Fri, May 17, 2019 at 03:02:51PM +0800, JunMa wrote: > 2019-05-17 Jun Ma <JunMa@linux.alibaba.com> > > PR tree-optimization/90106 > * gcc.dg/cdce3.c: New test. Ok, thanks. Jakub
On Fri, May 17, 2019 at 12:04:16AM +0200, Jakub Jelinek wrote: > On Thu, May 16, 2019 at 11:39:38PM +0200, Jakub Jelinek wrote: > > One possibility is to add -fdump-tree-optimized and scan for > > /* { dg-final { scan-tree-dump "pow \\(\[^\n\r]*\\); \\\[tail call\\\]" "optimized" } } */ > > resp. > > /* { dg-final { scan-tree-dump "log \\(\[^\n\r]*\\); \\\[tail call\\\]" "optimized" } } */ > > Here it is in patch form. > > That said, I'm not convinced your patch does what you wanted, because > comparing a month old trunk with today's trunk generates the same assembly > except for .ident, generates as many [tail call] lines in *.optimized dump > as before, emits the same number of jmp\tpow and jmp\tlog instructions as > before (one in a separate routine). > > But at least the tests aren't UNSUPPORTED anymore. > > 2019-05-16 Jakub Jelinek <jakub@redhat.com> > > PR tree-optimization/90106 > * gcc.dg/cdce1.c: Don't scan-assembler, instead -fdump-tree-optimized > and scan-tree-dump for tail call. > * gcc.dg/cdce2.c: Likewise. I'd like to ping this patch. A new cdce3.c testcase has been added in the mean time, so the above "That said," paragraph is resolved through that. > --- gcc/testsuite/gcc.dg/cdce1.c.jj 2019-05-16 11:28:22.750177582 +0200 > +++ gcc/testsuite/gcc.dg/cdce1.c 2019-05-16 23:50:23.618450891 +0200 > @@ -1,9 +1,9 @@ > -/* { dg-do run } */ > -/* { dg-options "-O2 -fmath-errno -fdump-tree-cdce-details -lm" } */ > +/* { dg-do run } */ > +/* { dg-options "-O2 -fmath-errno -fdump-tree-cdce-details -fdump-tree-optimized -lm" } */ > /* { dg-require-effective-target int32plus } */ > -/* { dg-final { scan-tree-dump "cdce1.c:17: .* function call is shrink-wrapped into error conditions\." "cdce" } } */ > -/* { dg-final { scan-assembler "jmp pow" } } */ > /* { dg-require-effective-target large_double } */ > +/* { dg-final { scan-tree-dump "cdce1.c:17: .* function call is shrink-wrapped into error conditions\." "cdce" } } */ > +/* { dg-final { scan-tree-dump "pow \\(\[^\n\r]*\\); \\\[tail call\\\]" "optimized" } } */ > > #include <stdlib.h> > #include <math.h> > --- gcc/testsuite/gcc.dg/cdce2.c.jj 2019-05-16 11:28:22.781177075 +0200 > +++ gcc/testsuite/gcc.dg/cdce2.c 2019-05-16 23:50:58.505880845 +0200 > @@ -1,8 +1,8 @@ > -/* { dg-do run } */ > +/* { dg-do run } */ > /* { dg-skip-if "doubles are floats" { "avr-*-*" } } */ > -/* { dg-options "-O2 -fmath-errno -fdump-tree-cdce-details -lm" } */ > -/* { dg-final { scan-tree-dump "cdce2.c:16: .* function call is shrink-wrapped into error conditions\." "cdce" } } */ > -/* { dg-final { scan-assembler "jmp log" } } */ > +/* { dg-options "-O2 -fmath-errno -fdump-tree-cdce-details -fdump-tree-optimized -lm" } */ > +/* { dg-final { scan-tree-dump "cdce2.c:16: .* function call is shrink-wrapped into error conditions\." "cdce" } } */ > +/* { dg-final { scan-tree-dump "log \\(\[^\n\r]*\\); \\\[tail call\\\]" "optimized" } } */ > > #include <stdlib.h> > #include <math.h> > Jakub
On Fri, 24 May 2019, Jakub Jelinek wrote: > On Fri, May 17, 2019 at 12:04:16AM +0200, Jakub Jelinek wrote: > > On Thu, May 16, 2019 at 11:39:38PM +0200, Jakub Jelinek wrote: > > > One possibility is to add -fdump-tree-optimized and scan for > > > /* { dg-final { scan-tree-dump "pow \\(\[^\n\r]*\\); \\\[tail call\\\]" "optimized" } } */ > > > resp. > > > /* { dg-final { scan-tree-dump "log \\(\[^\n\r]*\\); \\\[tail call\\\]" "optimized" } } */ > > > > Here it is in patch form. > > > > That said, I'm not convinced your patch does what you wanted, because > > comparing a month old trunk with today's trunk generates the same assembly > > except for .ident, generates as many [tail call] lines in *.optimized dump > > as before, emits the same number of jmp\tpow and jmp\tlog instructions as > > before (one in a separate routine). > > > > But at least the tests aren't UNSUPPORTED anymore. > > > > 2019-05-16 Jakub Jelinek <jakub@redhat.com> > > > > PR tree-optimization/90106 > > * gcc.dg/cdce1.c: Don't scan-assembler, instead -fdump-tree-optimized > > and scan-tree-dump for tail call. > > * gcc.dg/cdce2.c: Likewise. > > I'd like to ping this patch. A new cdce3.c testcase has been added in the > mean time, so the above "That said," paragraph is resolved through that. OK > > --- gcc/testsuite/gcc.dg/cdce1.c.jj 2019-05-16 11:28:22.750177582 +0200 > > +++ gcc/testsuite/gcc.dg/cdce1.c 2019-05-16 23:50:23.618450891 +0200 > > @@ -1,9 +1,9 @@ > > -/* { dg-do run } */ > > -/* { dg-options "-O2 -fmath-errno -fdump-tree-cdce-details -lm" } */ > > +/* { dg-do run } */ > > +/* { dg-options "-O2 -fmath-errno -fdump-tree-cdce-details -fdump-tree-optimized -lm" } */ > > /* { dg-require-effective-target int32plus } */ > > -/* { dg-final { scan-tree-dump "cdce1.c:17: .* function call is shrink-wrapped into error conditions\." "cdce" } } */ > > -/* { dg-final { scan-assembler "jmp pow" } } */ > > /* { dg-require-effective-target large_double } */ > > +/* { dg-final { scan-tree-dump "cdce1.c:17: .* function call is shrink-wrapped into error conditions\." "cdce" } } */ > > +/* { dg-final { scan-tree-dump "pow \\(\[^\n\r]*\\); \\\[tail call\\\]" "optimized" } } */ > > > > #include <stdlib.h> > > #include <math.h> > > --- gcc/testsuite/gcc.dg/cdce2.c.jj 2019-05-16 11:28:22.781177075 +0200 > > +++ gcc/testsuite/gcc.dg/cdce2.c 2019-05-16 23:50:58.505880845 +0200 > > @@ -1,8 +1,8 @@ > > -/* { dg-do run } */ > > +/* { dg-do run } */ > > /* { dg-skip-if "doubles are floats" { "avr-*-*" } } */ > > -/* { dg-options "-O2 -fmath-errno -fdump-tree-cdce-details -lm" } */ > > -/* { dg-final { scan-tree-dump "cdce2.c:16: .* function call is shrink-wrapped into error conditions\." "cdce" } } */ > > -/* { dg-final { scan-assembler "jmp log" } } */ > > +/* { dg-options "-O2 -fmath-errno -fdump-tree-cdce-details -fdump-tree-optimized -lm" } */ > > +/* { dg-final { scan-tree-dump "cdce2.c:16: .* function call is shrink-wrapped into error conditions\." "cdce" } } */ > > +/* { dg-final { scan-tree-dump "log \\(\[^\n\r]*\\); \\\[tail call\\\]" "optimized" } } */ > > > > #include <stdlib.h> > > #include <math.h> > > > > Jakub >
diff --git a/gcc/testsuite/gcc.dg/cdce1.c b/gcc/testsuite/gcc.dg/cdce1.c index b23ad63..424d80f 100644 --- a/gcc/testsuite/gcc.dg/cdce1.c +++ b/gcc/testsuite/gcc.dg/cdce1.c @@ -1,7 +1,8 @@ /* { dg-do run } */ /* { dg-options "-O2 -fmath-errno -fdump-tree-cdce-details -lm" } */ /* { dg-require-effective-target int32plus } */ -/* { dg-final { scan-tree-dump "cdce1.c:16: .* function call is shrink-wrapped into error conditions\." "cdce" } } */ +/* { dg-final { scan-tree-dump "cdce1.c:17: .* function call is shrink-wrapped into error conditions\." "cdce" } } */ +/* { dg-final { scan-assembler "jmp pow" } } */ /* { dg-require-effective-target large_double } */ #include <stdlib.h> diff --git a/gcc/testsuite/gcc.dg/cdce2.c b/gcc/testsuite/gcc.dg/cdce2.c index 30e7cb1..2af2893 100644 --- a/gcc/testsuite/gcc.dg/cdce2.c +++ b/gcc/testsuite/gcc.dg/cdce2.c @@ -1,7 +1,8 @@ /* { dg-do run } */ /* { dg-skip-if "doubles are floats" { "avr-*-*" } } */ /* { dg-options "-O2 -fmath-errno -fdump-tree-cdce-details -lm" } */ -/* { dg-final { scan-tree-dump "cdce2.c:15: .* function call is shrink-wrapped into error conditions\." "cdce" } } */ +/* { dg-final { scan-tree-dump "cdce2.c:16: .* function call is shrink-wrapped into error conditions\." "cdce" } } */ +/* { dg-final { scan-assembler "jmp log" } } */ #include <stdlib.h> #include <math.h> diff --git a/gcc/tree-call-cdce.c b/gcc/tree-call-cdce.c index 2e482b3..9e3372f 100644 --- a/gcc/tree-call-cdce.c +++ b/gcc/tree-call-cdce.c @@ -93,10 +93,10 @@ along with GCC; see the file COPYING3. If not see y = sqrt (x); ==> - y = IFN_SQRT (x); if (__builtin_isless (x, 0)) - sqrt (x); - + y = sqrt (x); + else + y = IFN_SQRT (x); In the vast majority of cases we should then never need to call sqrt. Note that library functions are not supposed to clear errno to zero without @@ -793,14 +793,16 @@ gen_shrink_wrap_conditions (gcall *bi_call, vec<gimple *> conds, } /* Shrink-wrap BI_CALL so that it is only called when one of the NCONDS - conditions in CONDS is false. */ + conditions in CONDS is false. Also move BI_NEWCALL to a new basic block + when it is non-null, it is called while all of the CONDS are true. */ static void shrink_wrap_one_built_in_call_with_conds (gcall *bi_call, vec <gimple *> conds, - unsigned int nconds) + unsigned int nconds, + gcall *bi_newcall = NULL) { gimple_stmt_iterator bi_call_bsi; - basic_block bi_call_bb, join_tgt_bb, guard_bb; + basic_block bi_call_bb, bi_newcall_bb, join_tgt_bb, guard_bb; edge join_tgt_in_edge_from_call, join_tgt_in_edge_fall_thru; edge bi_call_in_edge0, guard_bb_in_edge; unsigned tn_cond_stmts; @@ -809,27 +811,26 @@ shrink_wrap_one_built_in_call_with_conds (gcall *bi_call, vec <gimple *> conds, gimple *cond_expr_start; /* The cfg we want to create looks like this: - - [guard n-1] <- guard_bb (old block) - | \ - | [guard n-2] } - | / \ } - | / ... } new blocks - | / [guard 0] } - | / / | } - [ call ] | <- bi_call_bb } - | \ | - | \ | - | [ join ] <- join_tgt_bb (old iff call must end bb) - | + [guard n-1] <- guard_bb (old block) + | \ + | [guard n-2] } + | / \ } + | / ... } new blocks + | / [guard 0] } + | / / | } + [call] | <- bi_call_bb } + \ [newcall] <-bi_newcall_bb} + \ | + [join] <- join_tgt_bb (old iff call must end bb) possible EH edges (only if [join] is old) When [join] is new, the immediate dominators for these blocks are: 1. [guard n-1]: unchanged 2. [call]: [guard n-1] - 3. [guard m]: [guard m+1] for 0 <= m <= n-2 - 4. [join]: [guard n-1] + 3. [newcall]: [guard 0] + 4. [guard m]: [guard m+1] for 0 <= m <= n-2 + 5. [join]: [guard n-1] We punt for the more complex case case of [join] being old and simply free the dominance info. We also punt on postdominators, @@ -927,6 +928,47 @@ shrink_wrap_one_built_in_call_with_conds (gcall *bi_call, vec <gimple *> conds, edges.quick_push (edge_pair (bi_call_in_edge, guard_bb_in_edge)); } + /* Move BI_NEWCALL to new basic block when it is non-null. */ + if (bi_newcall) + { + /* Get bi_newcall_bb by split join_tgt_in_edge_fall_thru edge, + and move BI_NEWCALL to bi_newcall_bb. */ + bi_newcall_bb = split_edge (join_tgt_in_edge_fall_thru); + gimple_stmt_iterator to_gsi = gsi_start_bb (bi_newcall_bb); + gimple_stmt_iterator from_gsi = gsi_for_stmt (bi_newcall); + gsi_move_before (&from_gsi, &to_gsi); + join_tgt_in_edge_fall_thru = EDGE_SUCC (bi_newcall_bb, 0); + join_tgt_bb = join_tgt_in_edge_fall_thru->dest; + + tree bi_newcall_lhs = gimple_call_lhs (bi_newcall); + tree bi_call_lhs = gimple_call_lhs (bi_call); + if (!bi_call_lhs) + { + bi_call_lhs = copy_ssa_name (bi_newcall_lhs); + gimple_call_set_lhs (bi_call, bi_call_lhs); + SSA_NAME_DEF_STMT (bi_call_lhs) = bi_call; + } + + /* Create phi node for lhs of BI_CALL and BI_NEWCALL. */ + gphi *new_phi = create_phi_node (copy_ssa_name (bi_newcall_lhs), + join_tgt_bb); + SSA_NAME_OCCURS_IN_ABNORMAL_PHI (PHI_RESULT (new_phi)) + = SSA_NAME_OCCURS_IN_ABNORMAL_PHI (bi_newcall_lhs); + add_phi_arg (new_phi, bi_call_lhs, join_tgt_in_edge_from_call, + gimple_location (bi_call)); + add_phi_arg (new_phi, bi_newcall_lhs, join_tgt_in_edge_fall_thru, + gimple_location (bi_newcall)); + + /* Replace all use of original return value with result of phi node. */ + use_operand_p use_p; + gimple *use_stmt; + imm_use_iterator iterator; + FOR_EACH_IMM_USE_STMT (use_stmt, iterator, bi_newcall_lhs) + if (use_stmt != new_phi) + FOR_EACH_IMM_USE_ON_STMT (use_p, iterator) + SET_USE (use_p, PHI_RESULT (new_phi)); + } + /* Now update the probability and profile information, processing the guards in order of execution. @@ -1030,9 +1072,11 @@ use_internal_fn (gcall *call) unsigned nconds = 0; auto_vec<gimple *, 12> conds; + bool is_arg_conds = false; if (can_test_argument_range (call)) { gen_shrink_wrap_conditions (call, conds, &nconds); + is_arg_conds = true; gcc_assert (nconds != 0); } else @@ -1082,8 +1126,8 @@ use_internal_fn (gcall *call) call = new_call; } } - - shrink_wrap_one_built_in_call_with_conds (call, conds, nconds); + shrink_wrap_one_built_in_call_with_conds (call, conds, nconds, + is_arg_conds ? new_call : NULL); } /* The top level function for conditional dead code shrink