Message ID | CAC1BbcSjSYHd2j==dSwrjuMTrDvgnwrJJ2941k89aLEqnt49xg@mail.gmail.com |
---|---|
State | New |
Headers | show |
On 11 July 2015 at 01:00, Bernhard Reutner-Fischer <rep.dot.nop@gmail.com> wrote: > On 10 July 2015 at 14:31, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote: > PS: no -mbranch-cost and, a tad more seriously, no --param branch-cost either ;) err, arm and aarch64 have no -mbranch-cost, a couple of prominent other arches do..
Hi Bernhard, On 11/07/15 00:00, Bernhard Reutner-Fischer wrote: > On 10 July 2015 at 14:31, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote: >> Hi all, >> >> This patch makes if-conversion more aggressive when handling code of the >> form: >> if (test) >> x := a //THEN >> else >> x := b //ELSE >> The current code adds the costs of both the THEN and ELSE blocks and proceeds if they don't >> exceed the branch cost. I don't think that's quite a right calculation. >> We're going to be executing at least one of the basic blocks anyway. >> This patch we instead check the *maximum* of the two blocks against the branch cost. >> This should still catch cases where a high latency instruction appears in one or both of >> the paths. > Shouldn't this maximum also take probability into account? Or maybe > not, would have to think about it tomorrow. The branch cost that we test against (recorded in if_info earlier in the ifcvt.c call chain) is either the predictable branch cost or the unpredictable branch cost, depending on what the predictable_edge_p machinery returned. I think checking against that should be enough, but it's an easy thing to experiment with, so I'm open to arguments in any direction. > > $ contrib/check_GNU_style.sh rtl-ifcvt.00.patch > > Blocks of 8 spaces should be replaced with tabs. > 783:+ return FALSE; > > > Generally ifcvt.c (resp. the whole tree) could use a > sed -i -e "s/\([[:space:]]\)FALSE/\1false/g" gcc/ifcvt.c > Maybe some of the int predicates could then become bools. Ok, will go over the style in the patch. > > > +/* Return iff the registers that the insns in BB_A set do not > + get used in BB_B. */ > > Return true iff I tried to be too formal here ;) https://en.wikipedia.org/wiki/If_and_only_if I'll use a normal if here. > > > Did you include go in your testing? > I see: > Unexpected results in this build (new failures) > FAIL: encoding/json > FAIL: go/printer > FAIL: go/scanner > FAIL: html/template > FAIL: log > FAIL: net/http > FAIL: net/http/cgi > FAIL: net/http/cookiejar > FAIL: os > FAIL: text/template Hmmm. I don't see these failures. I double checked right now and they appear as PASS in my configuration. I tested make check-go on x86_64-unknown-linux-gnu configured with --without-isl --disable-multilib --enable-languages=c,c++,fortran,go. Are you sure this is not some other issue in your tree? > > bbs_ok_for_cmove_arith() looks costly but i guess you looked if > there's some pre-existing cleverness you could have used instead? I did have a look, but couldn't find any. The bbs_ok_for_cmove_arith is done after the costs check so I'd hope that the costs check would already discard really long basic-blocks. > > noce_emit_bb() could use a better comment. Likewise insn_valid_noce_process_p(). > > insn_rtx_cost() should return an unsigned int, then_cost, else_cost > should thus be unsigned int too. > > copy_of_a versus copy_of_insn_b; I'd shorten the latter. Thanks, good suggestions. > > bb_valid_for_noce_process_p() suggests that there is a JOIN_BB param > but there is none? > Also should document the return value (and should not clobber the OUT > params upon failure, no?). bah, I forgot to update the comment once I modified the function during development of the patch. I'll fix those. > > As for the testcases, it would be nice to have at least a tiny bit for > x86_64, too. I can put the testcases in gcc.dg and enable them for x86 as well, but I think a couple of the already pass as is because x86 doesn't need to do an extra zero_extend inside the basic-block. > > PS: no -mbranch-cost and, a tad more seriously, no --param branch-cost either ;) > PPS: attached meant to illustrate comments above. Untested. Thanks a lot! This is all very helpful. I'll respin the patch. Thanks, Kyrill > > cheers,
On 13/07/15 10:45, Kyrill Tkachov wrote: > >> >> +/* Return iff the registers that the insns in BB_A set do not >> + get used in BB_B. */ >> >> Return true iff > I tried to be too formal here ;) https://en.wikipedia.org/wiki/If_and_only_if > I'll use a normal if here. Err, of course you were talking about the missing 'true'. Sorry, early morning. Kyrill
On July 13, 2015 11:45:55 AM GMT+02:00, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote: >Hi Bernhard, > >> Did you include go in your testing? >> I see: >> Unexpected results in this build (new failures) >> FAIL: encoding/json >> FAIL: go/printer >> FAIL: go/scanner >> FAIL: html/template >> FAIL: log >> FAIL: net/http >> FAIL: net/http/cgi >> FAIL: net/http/cookiejar >> FAIL: os >> FAIL: text/template > >Hmmm. I don't see these failures. I double checked right now and they >appear as PASS in my configuration. > >I tested make check-go on x86_64-unknown-linux-gnu configured with >--without-isl --disable-multilib --enable-languages=c,c++,fortran,go. > >Are you sure this is not some other issue in your tree? I have ISL enabled. I do have a couple of local stuff but that tested fine before your patch and should not really have impact on the parts your patch touches. So maybe it's ISL. Thanks,
On 13/07/15 11:48, Bernhard Reutner-Fischer wrote: > On July 13, 2015 11:45:55 AM GMT+02:00, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote: >> Hi Bernhard, >> >>> Did you include go in your testing? >>> I see: >>> Unexpected results in this build (new failures) >>> FAIL: encoding/json >>> FAIL: go/printer >>> FAIL: go/scanner >>> FAIL: html/template >>> FAIL: log >>> FAIL: net/http >>> FAIL: net/http/cgi >>> FAIL: net/http/cookiejar >>> FAIL: os >>> FAIL: text/template >> Hmmm. I don't see these failures. I double checked right now and they >> appear as PASS in my configuration. >> >> I tested make check-go on x86_64-unknown-linux-gnu configured with >> --without-isl --disable-multilib --enable-languages=c,c++,fortran,go. >> >> Are you sure this is not some other issue in your tree? > I have ISL enabled. I do have a couple of local stuff but that tested fine before your patch and should not really have impact on the parts your patch touches. So maybe it's ISL. I've rebuilt with ISL from scratch and the tests still pass for me. I'm testing Ubuntu on a Haswell machine, don't know if that's relevant. Kyrill > > Thanks, >
From 1b7d8f9b61eb538cc4338e2073d04a66518f13c2 Mon Sep 17 00:00:00 2001 From: Bernhard Reutner-Fischer <rep.dot.nop@gmail.com> Date: Fri, 10 Jul 2015 21:25:30 +0200 Subject: [PATCH] rtl-ifcvt typos fix some typos on top of Kyrill's patch. should mv the testcases to common ground. --- gcc/ifcvt.c | 49 ++++++++++------------- gcc/testsuite/gcc.target/aarch64/ifcvt_csel_1.c | 2 +- gcc/testsuite/gcc.target/aarch64/ifcvt_csel_2.c | 2 +- gcc/testsuite/gcc.target/aarch64/ifcvt_csel_3.c | 7 ++-- 4 files changed, 27 insertions(+), 33 deletions(-) diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c index 3d324257..0bf6645 100644 --- a/gcc/ifcvt.c +++ b/gcc/ifcvt.c @@ -1784,8 +1784,7 @@ noce_try_cmove_arith (struct noce_if_info *if_info) /* We're going to execute one of the basic blocks anyway, so bail out if the most expensive of the two blocks is unacceptable. */ - if (MAX (then_cost, else_cost) - > COSTS_N_INSNS (if_info->branch_cost)) + if (MAX (then_cost, else_cost) > COSTS_N_INSNS (if_info->branch_cost)) return FALSE; /* Possibly rearrange operands to make things come out more natural. */ @@ -2730,28 +2729,26 @@ bb_valid_for_noce_process_p (basic_block test_bb, rtx cond, rtx cc = cc_in_cond (cond); if (!insn_valid_noce_process_p (last_insn, cc)) - return FALSE; + return false; last_set = single_set (last_insn); rtx x = SET_DEST (last_set); - rtx_insn *first_insn = first_active_insn (test_bb); rtx first_set = single_set (first_insn); - bool speed_p = optimize_bb_for_speed_p (test_bb); - *cost = insn_rtx_cost (last_set, speed_p); if (!first_set) return false; + /* We have a single simple set, that's okay. */ - else if (first_insn == last_insn) + bool speed_p = optimize_bb_for_speed_p (test_bb); + + if (first_insn == last_insn) { *simple_p = noce_operand_ok (SET_DEST (first_set)); *cost = insn_rtx_cost (first_set, speed_p); return *simple_p; } - *simple_p = false; - rtx_insn *prev_last_insn = PREV_INSN (last_insn); gcc_assert (prev_last_insn); @@ -2764,6 +2761,7 @@ bb_valid_for_noce_process_p (basic_block test_bb, rtx cond, /* The regs that are live out of test_bb. */ bitmap test_bb_live_out = df_get_live_out (test_bb); + int potential_cost = insn_rtx_cost (last_set, speed_p); rtx_insn *insn; FOR_BB_INSNS (test_bb, insn) { @@ -2781,7 +2779,7 @@ bb_valid_for_noce_process_p (basic_block test_bb, rtx cond, if (MEM_P (SET_SRC (sset)) || MEM_P (SET_DEST (sset))) goto free_bitmap_and_fail; - *cost += insn_rtx_cost (sset, speed_p); + potential_cost += insn_rtx_cost (sset, speed_p); bitmap_set_bit (test_bb_temps, REGNO (SET_DEST (sset))); } } @@ -2792,11 +2790,13 @@ bb_valid_for_noce_process_p (basic_block test_bb, rtx cond, goto free_bitmap_and_fail; BITMAP_FREE (test_bb_temps); + *cost = potential_cost; + *simple_p = false; return true; - free_bitmap_and_fail: - BITMAP_FREE (test_bb_temps); - return false; + free_bitmap_and_fail: + BITMAP_FREE (test_bb_temps); + return false; } /* Given a simple IF-THEN-JOIN or IF-THEN-ELSE-JOIN block, attempt to convert @@ -2828,21 +2828,14 @@ noce_process_if_block (struct noce_if_info *if_info) to calculate a value for x. ??? For future expansion, look for multiple X in such patterns. */ - bool then_bb_valid - = bb_valid_for_noce_process_p (then_bb, cond, &if_info->then_cost, - &if_info->then_simple); - - bool else_bb_valid = false; - if (else_bb) - else_bb_valid - = bb_valid_for_noce_process_p (else_bb, cond, &if_info->else_cost, - &if_info->else_simple); - - if (!then_bb_valid) - return FALSE; + if (! bb_valid_for_noce_process_p (then_bb, cond, &if_info->then_cost, + &if_info->then_simple)) + return false; - if (else_bb && !else_bb_valid) - return FALSE; + if (else_bb + && ! bb_valid_for_noce_process_p (else_bb, cond, &if_info->else_cost, + &if_info->else_simple)) + return false; insn_a = last_active_insn (then_bb, FALSE); set_a = single_set (insn_a); @@ -2866,7 +2859,7 @@ noce_process_if_block (struct noce_if_info *if_info) gcc_assert (set_b); if (!rtx_interchangeable_p (x, SET_DEST (set_b))) - return FALSE; + return FALSE; } else { diff --git a/gcc/testsuite/gcc.target/aarch64/ifcvt_csel_1.c b/gcc/testsuite/gcc.target/aarch64/ifcvt_csel_1.c index 1836f57..92bc17a 100644 --- a/gcc/testsuite/gcc.target/aarch64/ifcvt_csel_1.c +++ b/gcc/testsuite/gcc.target/aarch64/ifcvt_csel_1.c @@ -1,4 +1,4 @@ -/* { dg-do compile } */ +/* { dg-do compile { target aarch64*-*-* x86_64-*-* } } */ /* { dg-options "-fdump-rtl-ce1 -O2" } */ int diff --git a/gcc/testsuite/gcc.target/aarch64/ifcvt_csel_2.c b/gcc/testsuite/gcc.target/aarch64/ifcvt_csel_2.c index 8c48270..e0e1728 100644 --- a/gcc/testsuite/gcc.target/aarch64/ifcvt_csel_2.c +++ b/gcc/testsuite/gcc.target/aarch64/ifcvt_csel_2.c @@ -1,4 +1,4 @@ -/* { dg-do compile } */ +/* { dg-do compile { target aarch64*-*-* x86_64-*-* } } */ /* { dg-options "-fdump-rtl-ce1 -O2" } */ diff --git a/gcc/testsuite/gcc.target/aarch64/ifcvt_csel_3.c b/gcc/testsuite/gcc.target/aarch64/ifcvt_csel_3.c index 1aecbc9..0441357 100644 --- a/gcc/testsuite/gcc.target/aarch64/ifcvt_csel_3.c +++ b/gcc/testsuite/gcc.target/aarch64/ifcvt_csel_3.c @@ -1,5 +1,5 @@ -/* { dg-do compile } */ -/* { dg-options "-save-temps -O2" } */ +/* { dg-do assemble { target aarch64*-*-* x86_64-*-* } } */ +/* { dg-options "-fdump-rtl-ce1 -O2" } */ typedef long long s64; @@ -16,4 +16,5 @@ foo (s64 a, s64 b, s64 c) } /* This test can be reduced to just return a + c; */ -/* { dg-final { scan-assembler-not "sub\.*\tx\[0-9\]+, x\[0-9\]+, x\[0-9\]+\.*" } } */ +/* { dg-final { scan-rtl-dump "3 true changes made" "ce1" } } */ +/* { dg-final { scan-assembler-not "sub\.*\tx\[0-9\]+, x\[0-9\]+, x\[0-9\]+\.*" { target { aarch64*-*-* } } } } */ -- 1.7.10.4