Message ID | DB5PR0801MB27423F8E089681655CD3BB34E78D0@DB5PR0801MB2742.eurprd08.prod.outlook.com |
---|---|
State | New |
Headers | show |
On Tue, Aug 15, 2017 at 11:28 AM, Bin Cheng <Bin.Cheng@arm.com> wrote: > Hi, > This patch fixes PR81832. Root cause for the ICE is: > 1) Loop has distributed inner loop. > 2) The guarding function call IFN_LOOP_DIST_CALL happens to be in loop's header. > 3) IFN_LOOP_DIST_CALL (int loop's header) is duplicated by pass_ch_vect thus > not eliminated. > > Given pass_ch_vect copies loop header to enable more vectorization, we should > skip loop in this case because distributed inner loop means this loop can not > be vectorized anyway. One point to mention is name inner_loop_distributed_p > is a little misleading. The name indicates that each basic block is checked, > but the patch only checks loop's header for simplicity/efficiency's purpose. > Any comment? My comment would be to question pass_ch_vect placement -- what was the reason to place it so late? I also see GRAPHITE runs inbetween loop distribution and vectorization -- what prevents GRAPHITE from messing up things here? Or autopar? The patch itself shows should_duplicate_loop_header_p should handle this IFN specially (somehow all IFNs are considered "inexpensive"). So can you please adjust should_duplicate_loop_header_p instead and/or gimple_inexpensive_call_p? Since we have IFNs for stuff like EXP10 I'm not sure if that default is so good. Thanks, Richard. > Bootstrap and test on x86_64. > > Thanks, > bin > 2017-08-15 Bin Cheng <bin.cheng@arm.com> > > PR tree-optimization/81832 > * tree-ssa-loop-ch.c (inner_loop_distributed_p): New function. > (pass_ch_vect::process_loop_p): Call above function. > > gcc/testsuite > 2017-08-15 Bin Cheng <bin.cheng@arm.com> > > PR tree-optimization/81832 > * gcc.dg/tree-ssa/pr81832.c: New test.
Richard Biener <richard.guenther@gmail.com> writes: > On Tue, Aug 15, 2017 at 11:28 AM, Bin Cheng <Bin.Cheng@arm.com> wrote: >> Hi, >> This patch fixes PR81832. Root cause for the ICE is: >> 1) Loop has distributed inner loop. >> 2) The guarding function call IFN_LOOP_DIST_CALL happens to be in loop's header. >> 3) IFN_LOOP_DIST_CALL (int loop's header) is duplicated by pass_ch_vect thus >> not eliminated. >> >> Given pass_ch_vect copies loop header to enable more vectorization, we should >> skip loop in this case because distributed inner loop means this loop can not >> be vectorized anyway. One point to mention is name inner_loop_distributed_p >> is a little misleading. The name indicates that each basic block is checked, >> but the patch only checks loop's header for simplicity/efficiency's purpose. >> Any comment? > > My comment would be to question pass_ch_vect placement -- what was the > reason to place it so late? > > I also see GRAPHITE runs inbetween loop distribution and vectorization -- > what prevents GRAPHITE from messing up things here? Or autopar? > > The patch itself shows should_duplicate_loop_header_p should > handle this IFN specially (somehow all IFNs are considered "inexpensive"). > > So can you please adjust should_duplicate_loop_header_p instead and/or > gimple_inexpensive_call_p? Since we have IFNs for stuff like EXP10 I'm not > sure if that default is so good. I think the default itself is OK: we only use IFNs for libm functions like exp10 if an underlying optab exists (unlike __builtin_exp10, which is useful as the non-errno setting version of exp10), so the target must have something that it thinks is worth open-coding. Also, we currently treat all MD built-ins as "simple" and thus "inexpensive", so the IFN handling is consistent with that. Maybe there are some IFNs that are worth special-casing as expensive, but IMO doing that to solve this problem would be a bit hacky. It seems like "inexpensive" should be more of a cost thing than a correctness thing. Thanks, Richard > Thanks, > Richard. > >> Bootstrap and test on x86_64. >> >> Thanks, >> bin >> 2017-08-15 Bin Cheng <bin.cheng@arm.com> >> >> PR tree-optimization/81832 >> * tree-ssa-loop-ch.c (inner_loop_distributed_p): New function. >> (pass_ch_vect::process_loop_p): Call above function. >> >> gcc/testsuite >> 2017-08-15 Bin Cheng <bin.cheng@arm.com> >> >> PR tree-optimization/81832 >> * gcc.dg/tree-ssa/pr81832.c: New test.
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr81832.c b/gcc/testsuite/gcc.dg/tree-ssa/pr81832.c new file mode 100644 index 0000000..893124e --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr81832.c @@ -0,0 +1,22 @@ +/* { dg-do compile } */ +/* { dg-options "-O3" } */ + +int a, b, *c; +void d(void) +{ + int **e; + for(;;) + for(int f = 1; f <= 6; f++) + { + b = 0; + if(a) +g: + while(a++); + if (**e); + else + { + *c = a; + goto g; + } + } +} diff --git a/gcc/tree-ssa-loop-ch.c b/gcc/tree-ssa-loop-ch.c index 14cc6d8d..3c217d4 100644 --- a/gcc/tree-ssa-loop-ch.c +++ b/gcc/tree-ssa-loop-ch.c @@ -143,6 +143,27 @@ should_duplicate_loop_header_p (basic_block header, struct loop *loop, return true; } +/* Return TRUE if LOOP's inner loop is versioned by loop distribution and + the guarding internal function call happens to be in LOOP's header. + Given loop distribution is placed between pass_ch and pass_ch_vect, + this function only returns true in pass_ch_vect. When it returns TRUE, + it's known that copying LOOP's header is meaningless. */ + +static bool +inner_loop_distributed_p (struct loop *loop) +{ + gimple *stmt = last_stmt (loop->header); + if (stmt == NULL || gimple_code (stmt) != GIMPLE_COND) + return false; + + gimple_stmt_iterator gsi = gsi_for_stmt (stmt); + gsi_prev (&gsi); + if (gsi_end_p (gsi)) + return false; + + return (gimple_call_internal_p (gsi_stmt (gsi), IFN_LOOP_DIST_ALIAS)); +} + /* Checks whether LOOP is a do-while style loop. */ static bool @@ -442,6 +463,9 @@ pass_ch_vect::process_loop_p (struct loop *loop) if (loop->dont_vectorize) return false; + if (inner_loop_distributed_p (loop)) + return false; + if (!do_while_loop_p (loop)) return true;