Remove some restrictions on loop shape in tree-if-conv.c

Alan Lawrence wrote:
> Tree if-conversion currently bails out for loops that (a) contain nested loops; 
> (b) have more than one exit; (c) where the exit block (source of the exit edge) 
> does not dominate the loop latch; (d) where the exit block is the loop header, 
> or there are statements after the exit.
> 
> This patch removes restrictions (c) and (d). The intuition is that, for (c), "if 
> (P) {... if (Q) break;}" is equivalent to "if (P) {...}; if (P&&Q) break;" and 
> this is mostly handled by existing code for propagating conditions. For (d), "if 
> (P) break; stmts" is equivalent to "if (!P) stmts; if (P) break;" - this 
> requires inserting the predicated stmts before the branch rather than after.
> 
> Mostly thus this patch is just removing assumptions about when we do/don't need 
> to store predicates. One 'gotcha' was in some test cases the latch block passed 
> into if-conversion is non-empty; in such cases, if-conversion will now restore 
> "good form" by moving the statement into the exit block (predicated with 
> !exit-condition).
> 
> The condition on dominance in add_to_predicate_list, I haven't quite managed to 
> convince myself is right; we _do_ want to store a predicate for the latch block 
> to handle the above case, but I'm not totally sure of the postdominance 
> condition - I think it may store conditions in cases where we don't really need 
> to (e.g. "for (;;) { ... if (P) { for (;;) ; } }" which might look nested but 
> isn't, and has no route to the function exit). However, storing conditions when 
> we don't need to, is OK, unlike failing to store when we do need to ;).
> 
> A simple example of the patch at work:
> 
> int
> foo ()
> {
>    for (int i = 0; i < N ; i++)
>    {
>      int m = (a[i] & i) ? 5 : 4;
>      b[i] = a[i] * m;
>    }
> }
> 
> compiled at -O3, -fdump-tree-ivcanon shows this immediately before 
> tree-if-conversion:
> 
> ...function entry, variables, etc...
>    <bb 2>:
>    _10 = a[0];
>    goto <bb 6>;
> 
>    <bb 3>:
>    _5 = a[i_9];
>    _6 = _5 & i_9;
>    if (_6 != 0)
>      goto <bb 5>;
>    else
>      goto <bb 4>;
> 
>    <bb 4>:
> 
>    <bb 5>:
>    # m_14 = PHI <5(3), 4(4)>
> 
>    <bb 6>:
>    # m_2 = PHI <m_14(5), 4(2)>
>    # _15 = PHI <_5(5), _10(2)>
>    # i_16 = PHI <i_9(5), 0(2)>
>    # ivtmp_13 = PHI <ivtmp_3(5), 32(2)>
>    _7 = m_2 * _15;
>    b[i_16] = _7;
>    i_9 = i_16 + 1;
>    ivtmp_3 = ivtmp_13 - 1;
>    if (ivtmp_3 != 0)
>      goto <bb 3>;
>    else
>      goto <bb 7>;
> 
> which previously was not if-converted. With this patch:
> 
>    <bb 2>:
>    _10 = a[0];
>    goto <bb 4>;
> 
>    <bb 3>:
> 
>    <bb 4>:
>    # m_2 = PHI <m_14(3), 4(2)>
>    # _15 = PHI <_5(3), _10(2)>
>    # i_16 = PHI <i_9(3), 0(2)>
>    # ivtmp_13 = PHI <ivtmp_3(3), 32(2)>
>    _7 = m_2 * _15;
>    b[i_16] = _7;
>    i_9 = i_16 + 1;
>    ivtmp_3 = ivtmp_13 - 1;
>    _5 = a[i_9];
>    _6 = _5 & i_9;
>    m_14 = _6 != 0 ? 5 : 4;
>    if (ivtmp_3 != 0)
>      goto <bb 3>;
>    else
>      goto <bb 5>;
> 
>    <bb 5>:
>    return;
> 
> (Unfortunately the vectorizer still doesn't handle this loop either, but that's 
> another issue/patch...)
> 
> Bootstrapped + check-gcc on x86_64-unknown-linux-gnu and aarch64-none-linux-gnu.
> Cross-tested check-gcc on aarch64-none-elf.
> I'm investigating impact on benchmarks - on AArch64 Spec2k6, this touches a 
> number of object files, leading to an overall slight decrease in the number of 
> instructions, but no change that looks significant (specifically, no more or 
> less vectorization).
> 
> Is this OK for trunk?
> 
> Cheers, Alan

Attached!

--Alan

Remove some restrictions on loop shape in tree-if-conv.c

Commit Message

Patch