diff mbox series

Remove empty loop with assumed finiteness (PR tree-optimization/89713)

Message ID BYAPR01MB48692B7F7F8DEAF7DAD8C586F70B0@BYAPR01MB4869.prod.exchangelabs.com
State New
Headers show
Series Remove empty loop with assumed finiteness (PR tree-optimization/89713) | expand

Commit Message

Feng Xue OS May 17, 2019, 4:17 a.m. UTC
This patch is meant to give user a way to optimize away those empty loops which are impossible to be recognized by compiler, such as C++ STL container-based loop,

    void f (std::map<int, int> &m)
    {
        for (auto it = m.begin (); it != m.end (); ++it);
    }
 
An option "-ffinite-loop" is added to tell compiler about finiteness of loops so that compiler can apply the optimization.

Feng

Comments

Jeff Law May 17, 2019, 4:47 p.m. UTC | #1
On 5/16/19 10:17 PM, Feng Xue OS wrote:
> This patch is meant to give user a way to optimize away those empty loops which are impossible to be recognized by compiler, such as C++ STL container-based loop,
> 
>     void f (std::map<int, int> &m)
>     {
>         for (auto it = m.begin (); it != m.end (); ++it);
>     }
>  
> An option "-ffinite-loop" is added to tell compiler about finiteness of loops so that compiler can apply the optimization.
> 
> Feng
> 
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index d8bed3a..c55f2e6 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,18 @@
> +2019-05-16  Feng Xue  <fxue@os.amperecomputing.com>
> +
> +        PR tree-optimization/89713
> +        * doc/invoke.texi (-ffinite-loop): Document new option.
> +        * common.opt (-ffinite-loop): New option.
> +        * passes.def: Replace pass_cd_dce with pass_cd_dce2.
> +        * tree-pass.h (pass_cd_dce2): New declaration.
> +        * tree-ssa-dce.c (loop_has_true_exits): New function.
> +        (find_obviously_necessary_stmts): Add aggressive_loop_removal
> +        parameter.
> +        (perform_tree_ssa_dce, tree_ssa_cd_dce): Likewise.
> +        (class pass_cd_dce): Add new member aggressive_loop_removal.
> +        (pass_cd_dce::pass_cd_dce: Add alr parameter.
> +        (make_pass_cd_dce2): New function.
I'm not a big fan of this patch.  I'd rather be looking at either
improving our analysis or adding attributes to the loops to help the
analysis bits prove termination.

Jeff
Richard Biener May 17, 2019, 6:50 p.m. UTC | #2
On May 17, 2019 6:47:00 PM GMT+02:00, Jeff Law <law@redhat.com> wrote:
>On 5/16/19 10:17 PM, Feng Xue OS wrote:
>> This patch is meant to give user a way to optimize away those empty
>loops which are impossible to be recognized by compiler, such as C++
>STL container-based loop,
>> 
>>     void f (std::map<int, int> &m)
>>     {
>>         for (auto it = m.begin (); it != m.end (); ++it);
>>     }
>>  
>> An option "-ffinite-loop" is added to tell compiler about finiteness
>of loops so that compiler can apply the optimization.
>> 
>> Feng
>> 
>> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
>> index d8bed3a..c55f2e6 100644
>> --- a/gcc/ChangeLog
>> +++ b/gcc/ChangeLog
>> @@ -1,3 +1,18 @@
>> +2019-05-16  Feng Xue  <fxue@os.amperecomputing.com>
>> +
>> +        PR tree-optimization/89713
>> +        * doc/invoke.texi (-ffinite-loop): Document new option.
>> +        * common.opt (-ffinite-loop): New option.
>> +        * passes.def: Replace pass_cd_dce with pass_cd_dce2.
>> +        * tree-pass.h (pass_cd_dce2): New declaration.
>> +        * tree-ssa-dce.c (loop_has_true_exits): New function.
>> +        (find_obviously_necessary_stmts): Add
>aggressive_loop_removal
>> +        parameter.
>> +        (perform_tree_ssa_dce, tree_ssa_cd_dce): Likewise.
>> +        (class pass_cd_dce): Add new member aggressive_loop_removal.
>> +        (pass_cd_dce::pass_cd_dce: Add alr parameter.
>> +        (make_pass_cd_dce2): New function.
>I'm not a big fan of this patch.  I'd rather be looking at either
>improving our analysis or adding attributes to the loops to help the
>analysis bits prove termination.

And we had sth similar in the past and ended up removing it. -funsafe-loop-optimizations IIRC. 

Richard. 

>Jeff
Marc Glisse May 18, 2019, 1:59 p.m. UTC | #3
(@Feng Xue, it is better to include testcases in your patches)

>> I'm not a big fan of this patch.  I'd rather be looking at either
>> improving our analysis

Better analysis cannot hurt.

>> or adding attributes to the loops to help the analysis bits prove 
>> termination.

There may not be a loop in the source code, it may be a recursive function 
call that gcc turned into a loop. Plus, I know that it applies to all 
loops in my program.

We could have a function to compute the length of a linked list:
struct A { A*p; };
unsigned l(A*a){return a?l(a->p)+1:0;}

and because of other optimizations, it turns out that we do not actually 
use this length
void g(A*a){l(a);}

wouldn't it be nice if gcc could remove all that useless code, instead of 
keeping a useless loop on the off chance that it might be infinite?

> And we had sth similar in the past and ended up removing it. -funsafe-loop-optimizations IIRC.

IIUC that was slightly different: "This option tells the loop optimizer to 
assume that loop indices do not overflow, and that loops with nontrivial 
exit condition are not infinite."

The assumption on indices looks unsafe indeed if it applied to unsigned 
indices in non-empty loops.

But the C++ standard went to the trouble of banning infinite loops without 
side effects specifically to enable DCE on this type of code... Actually, 
an infinite loop with a trivial exit condition looks like a great 
opportunity to use __builtin_unreachable() to me ;-) (I have been wanting 
a -fmain-does-return -fno-funny-business for years, since I looked at 
replacing some malloc with stack allocations, but that's all out of scope 
for this patch)

Why exactly are we trying so hard to preserve no-side-effect, infinite 
loops? What are they good for? Note that reading an atomic or volatile 
variable counts as a side effect for this purpose. It looks like some kind 
of busy waiting, but without checking a flag, so it can only be stopped by 
some external action (say a signal), so if the OS has any notion of sleep 
for a thread, blocking would be better. Or maybe it is running through a 
circular list, ensuring that it stays in RAM? Anyway it seems specific 
enough that that strange code should be the one needing an annotation.
Richard Biener May 20, 2019, 7:50 a.m. UTC | #4
On Sat, May 18, 2019 at 4:00 PM Marc Glisse <marc.glisse@inria.fr> wrote:
>
> (@Feng Xue, it is better to include testcases in your patches)
>
> >> I'm not a big fan of this patch.  I'd rather be looking at either
> >> improving our analysis
>
> Better analysis cannot hurt.
>
> >> or adding attributes to the loops to help the analysis bits prove
> >> termination.
>
> There may not be a loop in the source code, it may be a recursive function
> call that gcc turned into a loop. Plus, I know that it applies to all
> loops in my program.
>
> We could have a function to compute the length of a linked list:
> struct A { A*p; };
> unsigned l(A*a){return a?l(a->p)+1:0;}
>
> and because of other optimizations, it turns out that we do not actually
> use this length
> void g(A*a){l(a);}
>
> wouldn't it be nice if gcc could remove all that useless code, instead of
> keeping a useless loop on the off chance that it might be infinite?
>
> > And we had sth similar in the past and ended up removing it. -funsafe-loop-optimizations IIRC.
>
> IIUC that was slightly different: "This option tells the loop optimizer to
> assume that loop indices do not overflow, and that loops with nontrivial
> exit condition are not infinite."
>
> The assumption on indices looks unsafe indeed if it applied to unsigned
> indices in non-empty loops.

The question is of couse what a "nontrivial exit condition" is.  Indeed
the general handling of unsigned wrapping was what made the option
useless in practice.

But we thoroughly need to specify "nontrivial exit condition", if going
as far as non-constant exit condition, that is, only do {} while (1) is required
to be detected as infinite then this breaks down to unsigned wrapping IVs
not be infinite.  Otherwise it requires the compiler to be able to correctly
analyze all unsigned IVs which I know we do not at the moment (SCEV
has limits).

So - any suggestion as to how define "nontrivial exit condition"?

> But the C++ standard went to the trouble of banning infinite loops without
> side effects specifically to enable DCE on this type of code... Actually,
> an infinite loop with a trivial exit condition looks like a great
> opportunity to use __builtin_unreachable() to me ;-) (I have been wanting
> a -fmain-does-return -fno-funny-business for years, since I looked at
> replacing some malloc with stack allocations, but that's all out of scope
> for this patch)
>
> Why exactly are we trying so hard to preserve no-side-effect, infinite
> loops? What are they good for? Note that reading an atomic or volatile
> variable counts as a side effect for this purpose. It looks like some kind
> of busy waiting, but without checking a flag, so it can only be stopped by
> some external action (say a signal), so if the OS has any notion of sleep
> for a thread, blocking would be better. Or maybe it is running through a
> circular list, ensuring that it stays in RAM? Anyway it seems specific
> enough that that strange code should be the one needing an annotation.

I guess we preserve them because we have to?

I suppose we could add a flag that allows us to elide
loops with no side-effect and non-constant exit condition
(so only preserve do{}while (1))?

Not sure where that would fit best though - certainly not
in niter / IV analysis but in CD-DCE then?

Richard.

> --
> Marc Glisse
Feng Xue OS May 20, 2019, 8:27 a.m. UTC | #5
>>
>> IIUC that was slightly different: "This option tells the loop optimizer to
>> assume that loop indices do not overflow, and that loops with nontrivial
>> exit condition are not infinite."
>>
>> The assumption on indices looks unsafe indeed if it applied to unsigned
>> indices in non-empty loops.

> The question is of couse what a "nontrivial exit condition" is.  Indeed
> the general handling of unsigned wrapping was what made the option
> useless in practice.

> But we thoroughly need to specify "nontrivial exit condition", if going
> as far as non-constant exit condition, that is, only do {} while (1) is required
> to be detected as infinite then this breaks down to unsigned wrapping IVs
> not be infinite.  Otherwise it requires the compiler to be able to correctly
> analyze all unsigned IVs which I know we do not at the moment (SCEV
> has limits).

> So - any suggestion as to how define "nontrivial exit condition"?

>>
>> Why exactly are we trying so hard to preserve no-side-effect, infinite
>> loops? What are they good for? Note that reading an atomic or volatile
>> variable counts as a side effect for this purpose. It looks like some kind
>> of busy waiting, but without checking a flag, so it can only be stopped by
>> some external action (say a signal), so if the OS has any notion of sleep
>> for a thread, blocking would be better. Or maybe it is running through a
>> circular list, ensuring that it stays in RAM? Anyway it seems specific
>> enough that that strange code should be the one needing an annotation.

> I guess we preserve them because we have to?

> I suppose we could add a flag that allows us to elide
> loops with no side-effect and non-constant exit condition
> (so only preserve do{}while (1))?

> Not sure where that would fit best though - certainly not
> in niter / IV analysis but in CD-DCE then?

Now finiteness assertion is only used in a very late CD-DCE, which is located after all loop optimizations are done. And we can even place it as late as just before RTL-expansion. This might be safe enough to let hidden infinite loops exposed.

Moreover, in CD-DCE, if a loop contains side-effect statements, w/o finiteness assertion does not trigger loop removal.

Feng
Richard Biener May 20, 2019, 9:19 a.m. UTC | #6
On Mon, May 20, 2019 at 10:27 AM Feng Xue OS
<fxue@os.amperecomputing.com> wrote:
>
>
> >>
> >> IIUC that was slightly different: "This option tells the loop optimizer to
> >> assume that loop indices do not overflow, and that loops with nontrivial
> >> exit condition are not infinite."
> >>
> >> The assumption on indices looks unsafe indeed if it applied to unsigned
> >> indices in non-empty loops.
>
> > The question is of couse what a "nontrivial exit condition" is.  Indeed
> > the general handling of unsigned wrapping was what made the option
> > useless in practice.
>
> > But we thoroughly need to specify "nontrivial exit condition", if going
> > as far as non-constant exit condition, that is, only do {} while (1) is required
> > to be detected as infinite then this breaks down to unsigned wrapping IVs
> > not be infinite.  Otherwise it requires the compiler to be able to correctly
> > analyze all unsigned IVs which I know we do not at the moment (SCEV
> > has limits).
>
> > So - any suggestion as to how define "nontrivial exit condition"?
>
> >>
> >> Why exactly are we trying so hard to preserve no-side-effect, infinite
> >> loops? What are they good for? Note that reading an atomic or volatile
> >> variable counts as a side effect for this purpose. It looks like some kind
> >> of busy waiting, but without checking a flag, so it can only be stopped by
> >> some external action (say a signal), so if the OS has any notion of sleep
> >> for a thread, blocking would be better. Or maybe it is running through a
> >> circular list, ensuring that it stays in RAM? Anyway it seems specific
> >> enough that that strange code should be the one needing an annotation.
>
> > I guess we preserve them because we have to?
>
> > I suppose we could add a flag that allows us to elide
> > loops with no side-effect and non-constant exit condition
> > (so only preserve do{}while (1))?
>
> > Not sure where that would fit best though - certainly not
> > in niter / IV analysis but in CD-DCE then?
>
> Now finiteness assertion is only used in a very late CD-DCE, which is located after all loop optimizations are done. And we can even place it as late as just before RTL-expansion. This might be safe enough to let hidden infinite loops exposed.

Is that so?  The early pipeline contains a CD-DCE pass as well.  Note we also
have pure/const discovery affected by this.

> Moreover, in CD-DCE, if a loop contains side-effect statements, w/o finiteness assertion does not trigger loop removal.

That's of course true.

Now we still need to define "non-trivial exit condition" and a way to
actually test for that.

Richard.

> Feng
>
Feng Xue OS May 20, 2019, 9:48 a.m. UTC | #7
>> Now finiteness assertion is only used in a very late CD-DCE, which is located after all loop optimizations are done. And we can even place it as late as just before RTL-expansion. This might be safe enough to let hidden infinite loops exposed.

> Is that so?  The early pipeline contains a CD-DCE pass as well.  Note we also
> have pure/const discovery affected by this.

I specialized a CD-DCE pass, named CD-DCE2, and only in this pass, loop removal based on assumed finteness is performed. Please check the patch.

>> Now we still need to define "non-trivial exit condition" and a way to
actually test for that.

Still not very clear on "non-trival exit condition". I think if a loop contains non-EH exit, it is terminable.

Feng
Richard Biener May 20, 2019, 11:54 a.m. UTC | #8
On Mon, May 20, 2019 at 11:48 AM Feng Xue OS
<fxue@os.amperecomputing.com> wrote:
>
> >> Now finiteness assertion is only used in a very late CD-DCE, which is located after all loop optimizations are done. And we can even place it as late as just before RTL-expansion. This might be safe enough to let hidden infinite loops exposed.
>
>
> > Is that so?  The early pipeline contains a CD-DCE pass as well.  Note we also
> > have pure/const discovery affected by this.
>
> I specialized a CD-DCE pass, named CD-DCE2, and only in this pass, loop removal based on assumed finteness is performed. Please check the patch.

Oh, but why not generally do this?

> >> Now we still need to define "non-trivial exit condition" and a way to
> actually test for that.
>
> Still not very clear on "non-trival exit condition". I think if a loop contains non-EH exit, it is terminable.

Possibly, but that's not terminology for user-facing documentation ;)
I'd even say if a loop contains an exit
it is terminable (remove the non-EH qualification).

Richard.

> Feng
>
>
Marc Glisse May 20, 2019, 1:04 p.m. UTC | #9
On Mon, 20 May 2019, Richard Biener wrote:

> On Sat, May 18, 2019 at 4:00 PM Marc Glisse <marc.glisse@inria.fr> wrote:
>>
>> (@Feng Xue, it is better to include testcases in your patches)
>>
>>>> I'm not a big fan of this patch.  I'd rather be looking at either
>>>> improving our analysis
>>
>> Better analysis cannot hurt.
>>
>>>> or adding attributes to the loops to help the analysis bits prove
>>>> termination.
>>
>> There may not be a loop in the source code, it may be a recursive function
>> call that gcc turned into a loop. Plus, I know that it applies to all
>> loops in my program.
>>
>> We could have a function to compute the length of a linked list:
>> struct A { A*p; };
>> unsigned l(A*a){return a?l(a->p)+1:0;}
>>
>> and because of other optimizations, it turns out that we do not actually
>> use this length
>> void g(A*a){l(a);}
>>
>> wouldn't it be nice if gcc could remove all that useless code, instead of
>> keeping a useless loop on the off chance that it might be infinite?
>>
>>> And we had sth similar in the past and ended up removing it. -funsafe-loop-optimizations IIRC.
>>
>> IIUC that was slightly different: "This option tells the loop optimizer to
>> assume that loop indices do not overflow, and that loops with nontrivial
>> exit condition are not infinite."
>>
>> The assumption on indices looks unsafe indeed if it applied to unsigned
>> indices in non-empty loops.
>
> The question is of couse what a "nontrivial exit condition" is.  Indeed
> the general handling of unsigned wrapping was what made the option
> useless in practice.
>
> But we thoroughly need to specify "nontrivial exit condition", if going
> as far as non-constant exit condition, that is, only do {} while (1) is required
> to be detected as infinite then this breaks down to unsigned wrapping IVs
> not be infinite.  Otherwise it requires the compiler to be able to correctly
> analyze all unsigned IVs which I know we do not at the moment (SCEV
> has limits).

We also want to handle pointer-chasing loops (lists, trees), not 
specifically unsigned IV.

> So - any suggestion as to how define "nontrivial exit condition"?
>
>> But the C++ standard went to the trouble of banning infinite loops without
>> side effects specifically to enable DCE on this type of code... Actually,
>> an infinite loop with a trivial exit condition looks like a great
>> opportunity to use __builtin_unreachable() to me ;-) (I have been wanting
>> a -fmain-does-return -fno-funny-business for years, since I looked at
>> replacing some malloc with stack allocations, but that's all out of scope
>> for this patch)
>>
>> Why exactly are we trying so hard to preserve no-side-effect, infinite
>> loops? What are they good for? Note that reading an atomic or volatile
>> variable counts as a side effect for this purpose. It looks like some kind
>> of busy waiting, but without checking a flag, so it can only be stopped by
>> some external action (say a signal), so if the OS has any notion of sleep
>> for a thread, blocking would be better. Or maybe it is running through a
>> circular list, ensuring that it stays in RAM? Anyway it seems specific
>> enough that that strange code should be the one needing an annotation.
>
> I guess we preserve them because we have to?
>
> I suppose we could add a flag that allows us to elide
> loops with no side-effect and non-constant exit condition
> (so only preserve do{}while (1))?

The C++ standard says that do{}while(1) is __builtin_unreachable(), we 
don't have to preserve it. There is no mention of anything like a 
"nontrivial exit condition". Other languages may have a different opinion 
though, so it would probably need a flag indeed... But I am curious what 
the point of such a loop is.
Richard Biener May 20, 2019, 1:25 p.m. UTC | #10
On Mon, May 20, 2019 at 3:04 PM Marc Glisse <marc.glisse@inria.fr> wrote:
>
> On Mon, 20 May 2019, Richard Biener wrote:
>
> > On Sat, May 18, 2019 at 4:00 PM Marc Glisse <marc.glisse@inria.fr> wrote:
> >>
> >> (@Feng Xue, it is better to include testcases in your patches)
> >>
> >>>> I'm not a big fan of this patch.  I'd rather be looking at either
> >>>> improving our analysis
> >>
> >> Better analysis cannot hurt.
> >>
> >>>> or adding attributes to the loops to help the analysis bits prove
> >>>> termination.
> >>
> >> There may not be a loop in the source code, it may be a recursive function
> >> call that gcc turned into a loop. Plus, I know that it applies to all
> >> loops in my program.
> >>
> >> We could have a function to compute the length of a linked list:
> >> struct A { A*p; };
> >> unsigned l(A*a){return a?l(a->p)+1:0;}
> >>
> >> and because of other optimizations, it turns out that we do not actually
> >> use this length
> >> void g(A*a){l(a);}
> >>
> >> wouldn't it be nice if gcc could remove all that useless code, instead of
> >> keeping a useless loop on the off chance that it might be infinite?
> >>
> >>> And we had sth similar in the past and ended up removing it. -funsafe-loop-optimizations IIRC.
> >>
> >> IIUC that was slightly different: "This option tells the loop optimizer to
> >> assume that loop indices do not overflow, and that loops with nontrivial
> >> exit condition are not infinite."
> >>
> >> The assumption on indices looks unsafe indeed if it applied to unsigned
> >> indices in non-empty loops.
> >
> > The question is of couse what a "nontrivial exit condition" is.  Indeed
> > the general handling of unsigned wrapping was what made the option
> > useless in practice.
> >
> > But we thoroughly need to specify "nontrivial exit condition", if going
> > as far as non-constant exit condition, that is, only do {} while (1) is required
> > to be detected as infinite then this breaks down to unsigned wrapping IVs
> > not be infinite.  Otherwise it requires the compiler to be able to correctly
> > analyze all unsigned IVs which I know we do not at the moment (SCEV
> > has limits).
>
> We also want to handle pointer-chasing loops (lists, trees), not
> specifically unsigned IV.
>
> > So - any suggestion as to how define "nontrivial exit condition"?
> >
> >> But the C++ standard went to the trouble of banning infinite loops without
> >> side effects specifically to enable DCE on this type of code... Actually,
> >> an infinite loop with a trivial exit condition looks like a great
> >> opportunity to use __builtin_unreachable() to me ;-) (I have been wanting
> >> a -fmain-does-return -fno-funny-business for years, since I looked at
> >> replacing some malloc with stack allocations, but that's all out of scope
> >> for this patch)
> >>
> >> Why exactly are we trying so hard to preserve no-side-effect, infinite
> >> loops? What are they good for? Note that reading an atomic or volatile
> >> variable counts as a side effect for this purpose. It looks like some kind
> >> of busy waiting, but without checking a flag, so it can only be stopped by
> >> some external action (say a signal), so if the OS has any notion of sleep
> >> for a thread, blocking would be better. Or maybe it is running through a
> >> circular list, ensuring that it stays in RAM? Anyway it seems specific
> >> enough that that strange code should be the one needing an annotation.
> >
> > I guess we preserve them because we have to?
> >
> > I suppose we could add a flag that allows us to elide
> > loops with no side-effect and non-constant exit condition
> > (so only preserve do{}while (1))?
>
> The C++ standard says that do{}while(1) is __builtin_unreachable(), we
> don't have to preserve it. There is no mention of anything like a
> "nontrivial exit condition". Other languages may have a different opinion
> though, so it would probably need a flag indeed... But I am curious what
> the point of such a loop is.

busy wait until wakeup by signal or interrupt.

Richard.

> --
> Marc Glisse
Feng Xue OS May 20, 2019, 1:59 p.m. UTC | #11
>> I specialized a CD-DCE pass, named CD-DCE2, and only in this pass, loop removal based on assumed finteness is performed. Please check the patch.

> Oh, but why not generally do this?

It is nature that real finiteness of loop should override assumed one. As you said, CD-DCE could be invocated as an early pass. At that time, we might make mistake deduction on finiteness for a loop whose bound is in form of variable, but could be transformed to constant by later optimizations. So it is not "safe" to generally do this in CD-DCE. We need one late and specific pass.

Feng
Richard Biener May 20, 2019, 2:04 p.m. UTC | #12
On Mon, May 20, 2019 at 4:00 PM Feng Xue OS <fxue@os.amperecomputing.com> wrote:
>
> >> I specialized a CD-DCE pass, named CD-DCE2, and only in this pass, loop removal based on assumed finteness is performed. Please check the patch.
>
>
> > Oh, but why not generally do this?
>
> It is nature that real finiteness of loop should override assumed one. As you said, CD-DCE could be invocated as an early pass. At that time, we might make mistake deduction on finiteness for a loop whose bound is in form of variable, but could be transformed to constant by later optimizations. So it is not "safe" to generally do this in CD-DCE. We need one late and specific pass.

I don't see how it is safe in a late pass when it is not safe in an
earlier one.  Optimization is imperfect - we could fail to remove
an "obvious" never taken exit and still have a loop that appears to be
finite according to our definition.  The only way
to define it would be if there was, at any point, an exit from the
loop (and there it _may_ be exclude EH edges) then
the loop is assumed to be finite.

Richard.

> Feng
Michael Matz May 20, 2019, 2:49 p.m. UTC | #13
Hi,

On Mon, 20 May 2019, Richard Biener wrote:

> > The C++ standard says that do{}while(1) is __builtin_unreachable(), we 
> > don't have to preserve it. There is no mention of anything like a 
> > "nontrivial exit condition". Other languages may have a different 
> > opinion though, so it would probably need a flag indeed... But I am 
> > curious what the point of such a loop is.
> 
> busy wait until wakeup by signal or interrupt.

I'd actually turn it around from what C++ says.  If the user wrote, as 
is, "do{}while(1);" or "while(1);" or "for(;;);" then we can assume 
something funky going on and not remove the loop.  For any other loop we 
assume that they are finite.  I.e. we mark loops as to-be-preserved (which 
we set on a few known patterns), and just remove all other loops when they 
contain no observable side effects after optimization.

And of course we'd still have to determine what acceptable side effects 
are.  E.g. in a pointer chasing loop containing no body, is the 
segfault when the pointer chain is not in fact circular, a side effect we 
should retain, or should we be allowed to remove the loop?  I'd say we 
should remove the loop, of course.

(And yes, I've always found our obsession with preserving infinite loops, 
outside of the above "obvious" cases, overly anal as well)


Ciao,
Michael.
Feng Xue OS May 20, 2019, 2:51 p.m. UTC | #14
> I don't see how it is safe in a late pass when it is not safe in an

> earlier one.  Optimization is imperfect - we could fail to remove
> an "obvious" never taken exit and still have a loop that appears to be
> finite according to our definition.

Yes. it is. This is somewhat similar to strict-alias option/loop dep pragma.
Compiler tries to do something based on hint you tell it, but does not ensure correctness.

> The only way
> to define it would be if there was, at any point, an exit from the
> loop (and there it _may_ be exclude EH edges) then
> the loop is assumed to be finite.

No catch your point. If we treat an infinite loop as finite, it's bad because the loop might be removed.

Suppose we have a function:

    void foo(int bound)
     { for (int i = 0; i <= bound; i++); }

 In an early CD-DCE pass, "bound" is represented as a variable, and loop has a exit, so it is assumed to finite, and is removed.

But in a late pass, this function is inlined into another one, and "bound" has value of INT_MAX, this loop is infinite, and here we can know it should not be removed.

This is why I suggest doing the optimization as late as possible.

Feng
Marc Glisse May 21, 2019, 8:05 a.m. UTC | #15
On Mon, 20 May 2019, Michael Matz wrote:

> On Mon, 20 May 2019, Richard Biener wrote:
>
>>> The C++ standard says that do{}while(1) is __builtin_unreachable(), we
>>> don't have to preserve it. There is no mention of anything like a
>>> "nontrivial exit condition". Other languages may have a different
>>> opinion though, so it would probably need a flag indeed... But I am
>>> curious what the point of such a loop is.
>>
>> busy wait until wakeup by signal or interrupt.
>
> I'd actually turn it around from what C++ says.  If the user wrote, as
> is, "do{}while(1);" or "while(1);" or "for(;;);" then we can assume
> something funky going on and not remove the loop.  For any other loop we
> assume that they are finite.  I.e. we mark loops as to-be-preserved (which
> we set on a few known patterns), and just remove all other loops when they
> contain no observable side effects after optimization.

Seems sensible, although marking the trivial infinite loops in gimple 
seems simpler than doing it in the front-ends, and a good enough 
approximation unless we are willing to replace some other infinite loops 
with unreachable (or trap).

> And of course we'd still have to determine what acceptable side effects
> are.  E.g. in a pointer chasing loop containing no body, is the
> segfault when the pointer chain is not in fact circular, a side effect we
> should retain, or should we be allowed to remove the loop?  I'd say we
> should remove the loop, of course.

That may depend on flags like -fnon-call-exceptions (maybe not the right 
one) I guess, although I would also want the removal to happen in as many 
cases as possible. We do usually remove memory reads if the value read is 
unused.

> (And yes, I've always found our obsession with preserving infinite loops,
> outside of the above "obvious" cases, overly anal as well)
Richard Biener May 21, 2019, 10:12 a.m. UTC | #16
On Mon, May 20, 2019 at 4:51 PM Feng Xue OS <fxue@os.amperecomputing.com> wrote:
>
> > I don't see how it is safe in a late pass when it is not safe in an
>
> > earlier one.  Optimization is imperfect - we could fail to remove
> > an "obvious" never taken exit and still have a loop that appears to be
> > finite according to our definition.
>
> Yes. it is. This is somewhat similar to strict-alias option/loop dep pragma.
> Compiler tries to do something based on hint you tell it, but does not ensure correctness.
>
> > The only way
> > to define it would be if there was, at any point, an exit from the
> > loop (and there it _may_ be exclude EH edges) then
> > the loop is assumed to be finite.
>
> No catch your point. If we treat an infinite loop as finite, it's bad because the loop might be removed.
>
> Suppose we have a function:
>
>     void foo(int bound)
>      { for (int i = 0; i <= bound; i++); }
>
>  In an early CD-DCE pass, "bound" is represented as a variable, and loop has a exit, so it is assumed to finite, and is removed.
>
> But in a late pass, this function is inlined into another one, and "bound" has value of INT_MAX, this loop is infinite, and here we can know it should not be removed.

But if "bound" is always INT_MAX but that's not visible to the
compiler we will still remove the
loop so I see no difference with removing it always.

> This is why I suggest doing the optimization as late as possible.

But this will defeat the purpose of allowing followup optimizations.

IMHO the only "sensible" thing is to do

Index: gcc/tree-ssa-dce.c
===================================================================
--- gcc/tree-ssa-dce.c  (revision 271415)
+++ gcc/tree-ssa-dce.c  (working copy)
@@ -417,7 +417,7 @@ find_obviously_necessary_stmts (bool agg
          }

       FOR_EACH_LOOP (loop, 0)
-       if (!finite_loop_p (loop))
+       if (!loop_has_exit_edges (loop))
          {
            if (dump_file)
              fprintf (dump_file, "cannot prove finiteness of loop
%i\n", loop->num);

that also has the obvious advantage that we don't need to replace the loop
with a trap() but have a place to forward control flow to.  The loop in the
following testcase is then successfully removed:

int main(int argc, char **argv)
{
  unsigned i = argc;
  while (i+=2);
  return 0;
}

Likewise is the loop

void **q;
int main(int argc, char **argv)
{
  void **p = q;
  while (p = (void **)*p);
  return 0;
}

(that's the pointer-chasing).  Not with -fnon-call-exceptions
-fexceptions though.

Richard.

> Feng
>
Richard Biener May 21, 2019, 2:24 p.m. UTC | #17
On Tue, May 21, 2019 at 12:12 PM Richard Biener
<richard.guenther@gmail.com> wrote:
>
> On Mon, May 20, 2019 at 4:51 PM Feng Xue OS <fxue@os.amperecomputing.com> wrote:
> >
> > > I don't see how it is safe in a late pass when it is not safe in an
> >
> > > earlier one.  Optimization is imperfect - we could fail to remove
> > > an "obvious" never taken exit and still have a loop that appears to be
> > > finite according to our definition.
> >
> > Yes. it is. This is somewhat similar to strict-alias option/loop dep pragma.
> > Compiler tries to do something based on hint you tell it, but does not ensure correctness.
> >
> > > The only way
> > > to define it would be if there was, at any point, an exit from the
> > > loop (and there it _may_ be exclude EH edges) then
> > > the loop is assumed to be finite.
> >
> > No catch your point. If we treat an infinite loop as finite, it's bad because the loop might be removed.
> >
> > Suppose we have a function:
> >
> >     void foo(int bound)
> >      { for (int i = 0; i <= bound; i++); }
> >
> >  In an early CD-DCE pass, "bound" is represented as a variable, and loop has a exit, so it is assumed to finite, and is removed.
> >
> > But in a late pass, this function is inlined into another one, and "bound" has value of INT_MAX, this loop is infinite, and here we can know it should not be removed.
>
> But if "bound" is always INT_MAX but that's not visible to the
> compiler we will still remove the
> loop so I see no difference with removing it always.
>
> > This is why I suggest doing the optimization as late as possible.
>
> But this will defeat the purpose of allowing followup optimizations.
>
> IMHO the only "sensible" thing is to do
>
> Index: gcc/tree-ssa-dce.c
> ===================================================================
> --- gcc/tree-ssa-dce.c  (revision 271415)
> +++ gcc/tree-ssa-dce.c  (working copy)
> @@ -417,7 +417,7 @@ find_obviously_necessary_stmts (bool agg
>           }
>
>        FOR_EACH_LOOP (loop, 0)
> -       if (!finite_loop_p (loop))
> +       if (!loop_has_exit_edges (loop))
>           {
>             if (dump_file)
>               fprintf (dump_file, "cannot prove finiteness of loop
> %i\n", loop->num);

Bootstrapped / tested on x86_64-unknown-linux-gnu.  Fallout:

FAIL: gcc.dg/loop-unswitch-1.c scan-tree-dump unswitch ";; Unswitching loop"
FAIL: gcc.dg/predict-9.c scan-tree-dump-times profile_estimate "first
match heuristics: 2.20%" 3
FAIL: gcc.dg/predict-9.c scan-tree-dump-times profile_estimate "first
match heuristics: 5.50%" 1
FAIL: gcc.dg/uninit-28-gimple.c  (test for bogus messages, line 9)
FAIL: gcc.dg/graphite/scop-19.c scan-tree-dump-times graphite "number
of SCoPs: 0" 2
...
UNRESOLVED: gcc.dg/tree-ssa/20040211-1.c scan-tree-dump cddce2 "if "
FAIL: gcc.dg/tree-ssa/loop-10.c scan-tree-dump-times optimized "if " 3
FAIL: gcc.dg/tree-ssa/pr84648.c scan-tree-dump-times cddce1 "Found
loop 1 to be finite: upper bound found" 1
FAIL: gcc.dg/tree-ssa/split-path-6.c scan-tree-dump-times split-paths
"Duplicating join block" 3
FAIL: gcc.dg/tree-ssa/ssa-thread-12.c scan-tree-dump thread2 "FSM"
FAIL: gcc.dg/tree-ssa/ssa-thread-12.c scan-tree-dump thread3 "FSM"

I didn't look if the testcases are sensible for loop removal (or what
actually happens).

Richard.

> that also has the obvious advantage that we don't need to replace the loop
> with a trap() but have a place to forward control flow to.  The loop in the
> following testcase is then successfully removed:
>
> int main(int argc, char **argv)
> {
>   unsigned i = argc;
>   while (i+=2);
>   return 0;
> }
>
> Likewise is the loop
>
> void **q;
> int main(int argc, char **argv)
> {
>   void **p = q;
>   while (p = (void **)*p);
>   return 0;
> }
>
> (that's the pointer-chasing).  Not with -fnon-call-exceptions
> -fexceptions though.
>
> Richard.
>
> > Feng
> >
Michael Matz May 22, 2019, 1:44 p.m. UTC | #18
Hi,

On Tue, 21 May 2019, Richard Biener wrote:

> > Index: gcc/tree-ssa-dce.c
> > ===================================================================
> > --- gcc/tree-ssa-dce.c  (revision 271415)
> > +++ gcc/tree-ssa-dce.c  (working copy)
> > @@ -417,7 +417,7 @@ find_obviously_necessary_stmts (bool agg
> >           }
> >
> >        FOR_EACH_LOOP (loop, 0)
> > -       if (!finite_loop_p (loop))
> > +       if (!loop_has_exit_edges (loop))
> >           {
> >             if (dump_file)
> >               fprintf (dump_file, "cannot prove finiteness of loop
> > %i\n", loop->num);
> 
> Bootstrapped / tested on x86_64-unknown-linux-gnu.  Fallout:
> 
> FAIL: gcc.dg/loop-unswitch-1.c scan-tree-dump unswitch ";; Unswitching loop"
> FAIL: gcc.dg/predict-9.c scan-tree-dump-times profile_estimate "first
> match heuristics: 2.20%" 3
> FAIL: gcc.dg/predict-9.c scan-tree-dump-times profile_estimate "first
> match heuristics: 5.50%" 1

These contain infinite loops without other sideeffects.

> FAIL: gcc.dg/graphite/scop-19.c scan-tree-dump-times graphite "number
> of SCoPs: 0" 2

Loop without sideeffects.

> ...
> UNRESOLVED: gcc.dg/tree-ssa/20040211-1.c scan-tree-dump cddce2 "if "

why unresolved?  Anyway, conditionally infinite loop, but without side 
effects.

> FAIL: gcc.dg/tree-ssa/loop-10.c scan-tree-dump-times optimized "if " 3

Probably removes one more loop, which is conditionally infinite, but no 
side-effects.

> FAIL: gcc.dg/tree-ssa/pr84648.c scan-tree-dump-times cddce1 "Found
> loop 1 to be finite: upper bound found" 1

finite, no side-effect.

> FAIL: gcc.dg/tree-ssa/split-path-6.c scan-tree-dump-times split-paths
> "Duplicating join block" 3

AFAICS all loops therein contain side-effects, though the one in 
lookharder() only an invalid one, which might be optimized away, so that 
might be it.  But this would need to be looked at.

> FAIL: gcc.dg/tree-ssa/ssa-thread-12.c scan-tree-dump thread2 "FSM"
> FAIL: gcc.dg/tree-ssa/ssa-thread-12.c scan-tree-dump thread3 "FSM"

If everything is properly inlined, this contains two nested 
side-effect-free loops.

> FAIL: gcc.dg/uninit-28-gimple.c  (test for bogus messages, line 9)

But this one doesn't contain a loop at all!?

> I didn't look if the testcases are sensible for loop removal (or what
> actually happens).

IMHO most testcases above (perhaps except gcc.dg/uninit-28-gimple.c and 
tree-ssa/split-path-6.c) are sensible for loop removal if mere memory 
accesses and possible infiniteness don't count as side-effects.


Ciao,
Michael.
diff mbox series

Patch

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index d8bed3a..c55f2e6 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,18 @@ 
+2019-05-16  Feng Xue  <fxue@os.amperecomputing.com>
+
+        PR tree-optimization/89713
+        * doc/invoke.texi (-ffinite-loop): Document new option.
+        * common.opt (-ffinite-loop): New option.
+        * passes.def: Replace pass_cd_dce with pass_cd_dce2.
+        * tree-pass.h (pass_cd_dce2): New declaration.
+        * tree-ssa-dce.c (loop_has_true_exits): New function.
+        (find_obviously_necessary_stmts): Add aggressive_loop_removal
+        parameter.
+        (perform_tree_ssa_dce, tree_ssa_cd_dce): Likewise.
+        (class pass_cd_dce): Add new member aggressive_loop_removal.
+        (pass_cd_dce::pass_cd_dce: Add alr parameter.
+        (make_pass_cd_dce2): New function.
+
 2019-05-16  Jakub Jelinek  <jakub@redhat.com>
 
 	PR c++/90484
diff --git a/gcc/common.opt b/gcc/common.opt
index d342c4f..e98a34d 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1437,6 +1437,10 @@  ffinite-math-only
 Common Report Var(flag_finite_math_only) Optimization SetByCombined
 Assume no NaNs or infinities are generated.
 
+ffinite-loop
+Common Report Var(flag_finite_loop) Optimization
+Assume loops are finite if can not be analytically determined.
+
 ffixed-
 Common Joined RejectNegative Var(common_deferred_options) Defer
 -ffixed-<register>	Mark <register> as being unavailable to the compiler.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 5e3e887..9a3882c 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -412,6 +412,7 @@  Objective-C and Objective-C++ Dialects}.
 -fdevirtualize-at-ltrans  -fdse @gol
 -fearly-inlining  -fipa-sra  -fexpensive-optimizations  -ffat-lto-objects @gol
 -ffast-math  -ffinite-math-only  -ffloat-store  -fexcess-precision=@var{style} @gol
+-ffinite-loop @gol
 -fforward-propagate  -ffp-contract=@var{style}  -ffunction-sections @gol
 -fgcse  -fgcse-after-reload  -fgcse-las  -fgcse-lm  -fgraphite-identity @gol
 -fgcse-sm  -fhoist-adjacent-loads  -fif-conversion @gol
@@ -9501,6 +9502,20 @@  that may set @code{errno} but are otherwise free of side effects.  This flag is
 enabled by default at @option{-O2} and higher if @option{-Os} is not also
 specified.
 
+@item -ffinite-loop
+@opindex ffinite-loop
+@opindex fno-finite-loop
+Allow the compiler to assume that if finiteness of a loop can not be
+analytically determined, the loop must be finite. With the assumption, some
+aggressive transformation could be possible, such as removal of this kind
+of empty loop by dead code elimination (DCE).
+
+This option is not turned on by any @option{-O} option since it might result
+in incorrect behaviour for programs that contain seemly finite, but actually
+infinite loop.
+
+The default is @option{-fno-finite-loop}.
+
 @item -ftree-dominator-opts
 @opindex ftree-dominator-opts
 Perform a variety of simple scalar cleanups (constant/copy
diff --git a/gcc/passes.def b/gcc/passes.def
index ad2efab..b84ee34 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -322,7 +322,7 @@  along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_copy_prop);
       NEXT_PASS (pass_warn_restrict);
       NEXT_PASS (pass_dse);
-      NEXT_PASS (pass_cd_dce);
+      NEXT_PASS (pass_cd_dce2);
       NEXT_PASS (pass_forwprop);
       NEXT_PASS (pass_phiopt, false /* early_p */);
       NEXT_PASS (pass_fold_builtins);
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 3a0b380..2392bc5 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -395,6 +395,7 @@  extern gimple_opt_pass *make_pass_build_ealias (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_dominator (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_dce (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_cd_dce (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_cd_dce2 (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_call_cdce (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_merge_phi (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_thread_jumps (gcc::context *ctxt);
diff --git a/gcc/tree-ssa-dce.c b/gcc/tree-ssa-dce.c
index 2478219..d4659df 100644
--- a/gcc/tree-ssa-dce.c
+++ b/gcc/tree-ssa-dce.c
@@ -356,6 +356,27 @@  mark_control_dependent_edges_necessary (basic_block bb, bool ignore_self)
     bitmap_set_bit (visited_control_parents, bb->index);
 }
 
+/* Check whether a loop has any non-EH exit. */
+
+static bool
+loop_has_true_exits (const struct loop *loop)
+{
+  vec<edge> exits = get_loop_exit_edges (loop);
+  bool found = false;
+  edge e;
+  unsigned i;
+
+  FOR_EACH_VEC_ELT (exits, i, e)
+    if (!(e->flags & EDGE_EH))
+      {
+        found = true;
+        break;
+      }
+
+  exits.release ();
+  return found;
+}
+
 
 /* Find obviously necessary statements.  These are things like most function
    calls, and stores to file level variables.
@@ -365,7 +386,7 @@  mark_control_dependent_edges_necessary (basic_block bb, bool ignore_self)
    dependence analysis.  */
 
 static void
-find_obviously_necessary_stmts (bool aggressive)
+find_obviously_necessary_stmts (bool aggressive, bool aggressive_loop_removal)
 {
   basic_block bb;
   gimple_stmt_iterator gsi;
@@ -417,7 +438,8 @@  find_obviously_necessary_stmts (bool aggressive)
 	  }
 
       FOR_EACH_LOOP (loop, 0)
-	if (!finite_loop_p (loop))
+	if (!finite_loop_p (loop)
+	    && (!aggressive_loop_removal || !loop_has_true_exits (loop)))
 	  {
 	    if (dump_file)
 	      fprintf (dump_file, "cannot prove finiteness of loop %i\n", loop->num);
@@ -1532,6 +1554,8 @@  tree_dce_done (bool aggressive)
 /* Main routine to eliminate dead code.
 
    AGGRESSIVE controls the aggressiveness of the algorithm.
+   If aggressive mode is on, AGGRESSIVE_LOOP_REMOVAL enables more aggressive
+   removal of empty loop whose finiteness can not be statically determined.
    In conservative mode, we ignore control dependence and simply declare
    all but the most trivially dead branches necessary.  This mode is fast.
    In aggressive mode, control dependences are taken into account, which
@@ -1544,7 +1568,7 @@  tree_dce_done (bool aggressive)
 	  start experimenting with pass ordering.  */
 
 static unsigned int
-perform_tree_ssa_dce (bool aggressive)
+perform_tree_ssa_dce (bool aggressive, bool aggressive_loop_removal = false)
 {
   bool something_changed = 0;
 
@@ -1576,7 +1600,7 @@  perform_tree_ssa_dce (bool aggressive)
       mark_dfs_back_edges ();
     }
 
-  find_obviously_necessary_stmts (aggressive);
+  find_obviously_necessary_stmts (aggressive, aggressive_loop_removal);
 
   if (aggressive && ! in_loop_pipeline)
     {
@@ -1631,9 +1655,10 @@  tree_ssa_dce (void)
 }
 
 static unsigned int
-tree_ssa_cd_dce (void)
+tree_ssa_cd_dce (bool aggressive_loop_removal)
 {
-  return perform_tree_ssa_dce (/*aggressive=*/optimize >= 2);
+  return perform_tree_ssa_dce (/*aggressive=*/optimize >= 2,
+                               aggressive_loop_removal && flag_finite_loop);
 }
 
 namespace {
@@ -1690,15 +1715,20 @@  const pass_data pass_data_cd_dce =
 
 class pass_cd_dce : public gimple_opt_pass
 {
+  bool aggressive_loop_removal;
 public:
-  pass_cd_dce (gcc::context *ctxt)
+  pass_cd_dce (gcc::context *ctxt, bool alr = false)
     : gimple_opt_pass (pass_data_cd_dce, ctxt)
+    , aggressive_loop_removal (alr)
   {}
 
   /* opt_pass methods: */
   opt_pass * clone () { return new pass_cd_dce (m_ctxt); }
   virtual bool gate (function *) { return flag_tree_dce != 0; }
-  virtual unsigned int execute (function *) { return tree_ssa_cd_dce (); }
+  virtual unsigned int execute (function *)
+  {
+    return tree_ssa_cd_dce (aggressive_loop_removal);
+  }
 
 }; // class pass_cd_dce
 
@@ -1710,6 +1740,12 @@  make_pass_cd_dce (gcc::context *ctxt)
   return new pass_cd_dce (ctxt);
 }
 
+gimple_opt_pass *
+make_pass_cd_dce2 (gcc::context *ctxt)
+{
+  return new pass_cd_dce (ctxt, true);
+}
+
 
 /* A cheap DCE interface.  WORKLIST is a list of possibly dead stmts and
    is consumed by this function.  The function has linear complexity in