Message ID | 000901cef552$0a49b890$1edd29b0$@arm.com |
---|---|
State | New |
Headers | show |
On 12/09/13 19:46, bin.cheng wrote: >>>> IVOPT adds original cand and tries to keep it the original IV is >>>> because it does have an updating statement. For IVs picked up by >>>> this patch, it doesn't have stepping statement, so no need/way to >>>> leaving it untouched. It will be represented by candidates for IVs >>>> from which it is peeled. >>> >>> Understood. Thanks for clarifying. The patch is fine for the trunk. >> Thanks very much for reviewing. Since Sebastian hasn't added any >> comments, I will change the patch as suggested by Richard before, then >> retest it, and check in the new version if it's ok. >> > Hi, > This is the new version patch computing the difference in tree affine then > comparing against integer_zero_node. > Bootstrap and test on current upstream. I will commit it if there is no > other objection. Please install. Thanks, Jeff
On Mon, Dec 9, 2013 at 6:46 PM, bin.cheng <bin.cheng@arm.com> wrote: > >> > Hi, > This is the new version patch computing the difference in tree affine then > comparing against integer_zero_node. > Bootstrap and test on current upstream. I will commit it if there is no > other objection. > This caused: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59445
On Tue, Dec 10, 2013 at 10:46:32AM +0800, bin.cheng wrote: > This is the new version patch computing the difference in tree affine then > comparing against integer_zero_node. > Bootstrap and test on current upstream. I will commit it if there is no > other objection. This breaks bootstrap on x86_64-linux for me too, though in a different pass. Anyway, the bugs I'm seeing here: 1) other passes that use a cache for tree-affine computations don't initialize it by pointer_map_create (), tree-affine.c does that for you 2) the cache shouldn't be destroyed by pointer_map_destroy, but instead by free_affine_expand_cache that will actually make sure it doesn't leak memory, etc. 3) the actual issue why this breaks bootstrap is that you've added the pointer_map_destroy (should be free_affine_expand_cache per 2) ) to scev_finalize, but scev_initialize is performed at the start of the loop passes and scev_finalize several passes later, and in between passes ggc_collect is called. As the cache isn't registered with GC, if you are unlucky and ggc_collect actually collects in between scev_initialize and scev_finalize and garbage collects some trees only mentioned in the affine cache peeled_chrec_map, things can break completely. So, IMHO please revert this patch for now and try to find some better place for the free_affine_expand_cache (&peeled_chrec_map); call so that the cache is never non-NULL when ggc_collect may be called. Alternatively you'd need to either tell GC about it (but the structures are heap allocated, not GC), or say instruct through some GTY magic that during collection free_affine_expand_cache (&peeled_chrec_map); should be called. Jakub
On Wed, Dec 11, 2013 at 6:50 AM, Jakub Jelinek <jakub@redhat.com> wrote: > On Tue, Dec 10, 2013 at 10:46:32AM +0800, bin.cheng wrote: >> This is the new version patch computing the difference in tree affine then >> comparing against integer_zero_node. >> Bootstrap and test on current upstream. I will commit it if there is no >> other objection. > > This breaks bootstrap on x86_64-linux for me too, though in a different > pass. > Anyway, the bugs I'm seeing here: > 1) other passes that use a cache for tree-affine computations > don't initialize it by pointer_map_create (), tree-affine.c does that > for you > 2) the cache shouldn't be destroyed by pointer_map_destroy, but > instead by free_affine_expand_cache that will actually make sure > it doesn't leak memory, etc. > 3) the actual issue why this breaks bootstrap is that you've added > the pointer_map_destroy (should be free_affine_expand_cache per 2) ) > to scev_finalize, but scev_initialize is performed at the start > of the loop passes and scev_finalize several passes later, and > in between passes ggc_collect is called. As the cache isn't registered > with GC, if you are unlucky and ggc_collect actually collects > in between scev_initialize and scev_finalize and garbage collects > some trees only mentioned in the affine cache peeled_chrec_map, > things can break completely. > > So, IMHO please revert this patch for now and try to find some > better place for the free_affine_expand_cache (&peeled_chrec_map); > call so that the cache is never non-NULL when ggc_collect may be called. > Alternatively you'd need to either tell GC about it (but the structures > are heap allocated, not GC), or say instruct through some GTY magic > that during collection free_affine_expand_cache (&peeled_chrec_map); > should be called. Sorry for bothering, I have reverted the patch. Will investigate it. Thanks, bin > > Jakub
On Wed, Dec 11, 2013 at 4:31 AM, H.J. Lu <hjl.tools@gmail.com> wrote: > On Mon, Dec 9, 2013 at 6:46 PM, bin.cheng <bin.cheng@arm.com> wrote: >> >>> >> Hi, >> This is the new version patch computing the difference in tree affine then >> comparing against integer_zero_node. >> Bootstrap and test on current upstream. I will commit it if there is no >> other objection. >> > > This caused: > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59445 I found out the cause and addressed it in the bug entry. Thanks, bin > > -- > H.J.
On Wed, Dec 11, 2013 at 6:50 AM, Jakub Jelinek <jakub@redhat.com> wrote: > On Tue, Dec 10, 2013 at 10:46:32AM +0800, bin.cheng wrote: >> This is the new version patch computing the difference in tree affine then >> comparing against integer_zero_node. >> Bootstrap and test on current upstream. I will commit it if there is no >> other objection. > > This breaks bootstrap on x86_64-linux for me too, though in a different > pass. > Anyway, the bugs I'm seeing here: > 1) other passes that use a cache for tree-affine computations > don't initialize it by pointer_map_create (), tree-affine.c does that > for you > 2) the cache shouldn't be destroyed by pointer_map_destroy, but > instead by free_affine_expand_cache that will actually make sure > it doesn't leak memory, etc. > 3) the actual issue why this breaks bootstrap is that you've added > the pointer_map_destroy (should be free_affine_expand_cache per 2) ) > to scev_finalize, but scev_initialize is performed at the start > of the loop passes and scev_finalize several passes later, and > in between passes ggc_collect is called. As the cache isn't registered > with GC, if you are unlucky and ggc_collect actually collects > in between scev_initialize and scev_finalize and garbage collects > some trees only mentioned in the affine cache peeled_chrec_map, > things can break completely. > > So, IMHO please revert this patch for now and try to find some > better place for the free_affine_expand_cache (&peeled_chrec_map); > call so that the cache is never non-NULL when ggc_collect may be called. > Alternatively you'd need to either tell GC about it (but the structures > are heap allocated, not GC), or say instruct through some GTY magic > that during collection free_affine_expand_cache (&peeled_chrec_map); > should be called. Thanks for helping on the issue, I shouldn't use the affine facility before fully understanding it. I know little about GC, so when ggc_collect may be called (between two passes)? If so, I have to call free_affine_expand_cache just after the calling to tree_to_affine_combination_expand in SCEV because it's an analyzer and as you pointed out, the analyzing result is used by different optimizers. The pointer map would be maintained between different passes if it's not instantly freed. Thanks, bin > > Jakub
On 12/10/13 23:35, Bin.Cheng wrote: > > I know little about GC, so when ggc_collect may be called (between two > passes)? If so, I have to call free_affine_expand_cache just after the > calling to tree_to_affine_combination_expand in SCEV because it's an > analyzer and as you pointed out, the analyzing result is used by > different optimizers. The pointer map would be maintained between > different passes if it's not instantly freed. The garbage collector only runs between passes. If you have roots in static storage, then you'll need to decorate them so the garbage collector scans them. There's a fairly extensive discussion of the garbage collector in the GCC internals manual. jeff
On Tue, Dec 10, 2013 at 11:59:16PM -0700, Jeff Law wrote: > On 12/10/13 23:35, Bin.Cheng wrote: > >I know little about GC, so when ggc_collect may be called (between two > >passes)? If so, I have to call free_affine_expand_cache just after the > >calling to tree_to_affine_combination_expand in SCEV because it's an > >analyzer and as you pointed out, the analyzing result is used by > >different optimizers. The pointer map would be maintained between > >different passes if it's not instantly freed. > The garbage collector only runs between passes. If you have roots > in static storage, then you'll need to decorate them so the garbage > collector scans them. > > There's a fairly extensive discussion of the garbage collector in > the GCC internals manual. It isn't that easy, pointer_map isn't a data structure recognized by the garbage collector at all (plus, as I've said before, the code is inserting XNEW allocated structures into the pointer_map, which isn't GC friendly either, but making them GC allocated might increase garbage unnecessarily, as all the structs are short lived). I've looked through scev_{initialize,finalize} callers, and almost all passes initialize it and finalize within the same pass, it is only the loop optimizations: NEXT_PASS (pass_tree_loop_init); NEXT_PASS (pass_lim); NEXT_PASS (pass_copy_prop); NEXT_PASS (pass_dce_loop); NEXT_PASS (pass_tree_unswitch); NEXT_PASS (pass_scev_cprop); NEXT_PASS (pass_record_bounds); NEXT_PASS (pass_check_data_deps); NEXT_PASS (pass_loop_distribution); NEXT_PASS (pass_copy_prop); NEXT_PASS (pass_graphite); PUSH_INSERT_PASSES_WITHIN (pass_graphite) NEXT_PASS (pass_graphite_transforms); NEXT_PASS (pass_lim); NEXT_PASS (pass_copy_prop); NEXT_PASS (pass_dce_loop); POP_INSERT_PASSES () NEXT_PASS (pass_iv_canon); NEXT_PASS (pass_parallelize_loops); NEXT_PASS (pass_if_conversion); /* pass_vectorize must immediately follow pass_if_conversion. Please do not add any other passes in between. */ NEXT_PASS (pass_vectorize); PUSH_INSERT_PASSES_WITHIN (pass_vectorize) NEXT_PASS (pass_dce_loop); POP_INSERT_PASSES () NEXT_PASS (pass_predcom); NEXT_PASS (pass_complete_unroll); NEXT_PASS (pass_slp_vectorize); NEXT_PASS (pass_loop_prefetch); NEXT_PASS (pass_iv_optimize); NEXT_PASS (pass_lim); NEXT_PASS (pass_tree_loop_done); where scev is live in between the passes, and unfortunately there is no call that each pass calls that you could easily add the free_affine.* call to (except for execute_function_todo but you'd need to invent another TODO_* flag for it and the question is how you'd ensure it will be handled for the non-loop specific passes that are just run in between these passes (say pass_copy_prop, would we need pass_copy_prop_loop?). Perhaps you could replace in tree-affine.c and all it's users {,struct} pointer_map_t * with struct GTY((user)) affine_expand_cache { pointer_map_t *cache; }; and write custom gt_ggc_mx (affine_expand_cache *) function that would would either pointer_map_traverse the cache and for each cache entry call gt_ggc_mx on the embedded trees in the structures, or drop the cache (dunno if the cache is really just a cache and it doesn't matter if it is dropped any time, or if it can affect code generation). There are also PCH related functions that need to be defined, though I guess you could just assert there that the cache is NULL and not walk anything. Jakub
On Wed, Dec 11, 2013 at 4:17 PM, Jakub Jelinek <jakub@redhat.com> wrote: > On Tue, Dec 10, 2013 at 11:59:16PM -0700, Jeff Law wrote: >> On 12/10/13 23:35, Bin.Cheng wrote: >> >I know little about GC, so when ggc_collect may be called (between two >> >passes)? If so, I have to call free_affine_expand_cache just after the >> >calling to tree_to_affine_combination_expand in SCEV because it's an >> >analyzer and as you pointed out, the analyzing result is used by >> >different optimizers. The pointer map would be maintained between >> >different passes if it's not instantly freed. >> The garbage collector only runs between passes. If you have roots >> in static storage, then you'll need to decorate them so the garbage >> collector scans them. >> >> There's a fairly extensive discussion of the garbage collector in >> the GCC internals manual. > > It isn't that easy, pointer_map isn't a data structure recognized by the > garbage collector at all (plus, as I've said before, the code is inserting > XNEW allocated structures into the pointer_map, which isn't GC friendly > either, but making them GC allocated might increase garbage unnecessarily, > as all the structs are short lived). > > I've looked through scev_{initialize,finalize} callers, and almost all > passes initialize it and finalize within the same pass, it is only the loop > optimizations: > NEXT_PASS (pass_tree_loop_init); > NEXT_PASS (pass_lim); > NEXT_PASS (pass_copy_prop); > NEXT_PASS (pass_dce_loop); > NEXT_PASS (pass_tree_unswitch); > NEXT_PASS (pass_scev_cprop); > NEXT_PASS (pass_record_bounds); > NEXT_PASS (pass_check_data_deps); > NEXT_PASS (pass_loop_distribution); > NEXT_PASS (pass_copy_prop); > NEXT_PASS (pass_graphite); > PUSH_INSERT_PASSES_WITHIN (pass_graphite) > NEXT_PASS (pass_graphite_transforms); > NEXT_PASS (pass_lim); > NEXT_PASS (pass_copy_prop); > NEXT_PASS (pass_dce_loop); > POP_INSERT_PASSES () > NEXT_PASS (pass_iv_canon); > NEXT_PASS (pass_parallelize_loops); > NEXT_PASS (pass_if_conversion); > /* pass_vectorize must immediately follow pass_if_conversion. > Please do not add any other passes in between. */ > NEXT_PASS (pass_vectorize); > PUSH_INSERT_PASSES_WITHIN (pass_vectorize) > NEXT_PASS (pass_dce_loop); > POP_INSERT_PASSES () > NEXT_PASS (pass_predcom); > NEXT_PASS (pass_complete_unroll); > NEXT_PASS (pass_slp_vectorize); > NEXT_PASS (pass_loop_prefetch); > NEXT_PASS (pass_iv_optimize); > NEXT_PASS (pass_lim); > NEXT_PASS (pass_tree_loop_done); > where scev is live in between the passes, and unfortunately there > is no call that each pass calls that you could easily add the free_affine.* > call to (except for execute_function_todo but you'd need to invent another > TODO_* flag for it and the question is how you'd ensure it will be handled > for the non-loop specific passes that are just run in between these > passes (say pass_copy_prop, would we need pass_copy_prop_loop?). > > Perhaps you could replace in tree-affine.c and all it's users > {,struct} pointer_map_t * with > struct GTY((user)) affine_expand_cache { > pointer_map_t *cache; > }; > and write custom gt_ggc_mx (affine_expand_cache *) function that would > would either pointer_map_traverse the cache and for each cache entry > call gt_ggc_mx on the embedded trees in the structures, or drop the cache > (dunno if the cache is really just a cache and it doesn't matter if it is > dropped any time, or if it can affect code generation). There are also PCH > related functions that need to be defined, though I guess you could just > assert there that the cache is NULL and not walk anything. Thank both of you for being patient on this patch. I went through the documentation quickly and realized that I have to modify pointer-map structure to make it recognized by GC (maybe more work suggested by Jakub). It seems I shouldn't include that task in this patch at this stage 3, I am thinking just call free_affine* function in the place it is created for SCEV. Of course, that makes it more expensive. I just found that the patch also fixes PR58296 on IVOPT and I do want to include it in 4.9 if we are ok with it. So what's your opinion? Thanks, bin > > Jakub
On Wed, Dec 11, 2013 at 04:31:55PM +0800, Bin.Cheng wrote: > Thank both of you for being patient on this patch. > I went through the documentation quickly and realized that I have to > modify pointer-map structure to make it recognized by GC (maybe more Changing pointer_map at this point is IMHO not appropriate. > work suggested by Jakub). It seems I shouldn't include that task in > this patch at this stage 3, I am thinking just call free_affine* > function in the place it is created for SCEV. Of course, that makes > it more expensive. Perhaps you could call free_affine_expand_cache say from analyze_scalar_evolution, that is the innermost enclosing exported API that indirectly calls your new code, but it seems it is also called recursively from analyze_scalar_evolution_1 or functions it calls. So maybe you'd need to rename analyze_scalar_evolution to analyze_scalar_evolution_2 and adjust some calls to analyze_scalar_evolution to the latter in tree-scalar-evolution.c, and add analyze_scalar_evolution wrapper that calls analyze_scalar_evolution_2 and free_affine_expand_cache. Whether this would work depends on detailed analysis of the tree-scalar-evolution.c callgraph, there are tons of functions, most of them static, and the question is if there is a single non-static (or at most a few of them) function that cover all calls to your new code in the static functions it (indirectly) calls, i.e. if there isn't any other external entry point that might call your new code without doing free_affine_expand_cache afterwards. If you can find that, I'd say it would be the safest thing to do. But let's see what say Richard thinks about it too. Jakub
On Wed, Dec 11, 2013 at 9:50 AM, Jakub Jelinek <jakub@redhat.com> wrote: > On Wed, Dec 11, 2013 at 04:31:55PM +0800, Bin.Cheng wrote: >> Thank both of you for being patient on this patch. >> I went through the documentation quickly and realized that I have to >> modify pointer-map structure to make it recognized by GC (maybe more > > Changing pointer_map at this point is IMHO not appropriate. > >> work suggested by Jakub). It seems I shouldn't include that task in >> this patch at this stage 3, I am thinking just call free_affine* >> function in the place it is created for SCEV. Of course, that makes >> it more expensive. > > Perhaps you could call free_affine_expand_cache say from > analyze_scalar_evolution, that is the innermost enclosing exported > API that indirectly calls your new code, but it seems it is also called > recursively from analyze_scalar_evolution_1 or functions it calls. > So maybe you'd need to rename analyze_scalar_evolution to > analyze_scalar_evolution_2 and adjust some calls to analyze_scalar_evolution > to the latter in tree-scalar-evolution.c, and add analyze_scalar_evolution > wrapper that calls analyze_scalar_evolution_2 and free_affine_expand_cache. > Whether this would work depends on detailed analysis of the > tree-scalar-evolution.c callgraph, there are tons of functions, most of > them static, and the question is if there is a single non-static (or at most > a few of them) function that cover all calls to your new code in the > static functions it (indirectly) calls, i.e. if there isn't any other > external entry point that might call your new code without doing > free_affine_expand_cache afterwards. > > If you can find that, I'd say it would be the safest thing to do. > But let's see what say Richard thinks about it too. I think keeping the SCEV cache live over multiple passes is fragile (we do re-use SSA names and passes do not appropriately call scev_reset[_htab] in all cases). With fixing that the easiest approach is to associate a affine-expand cache with the scev info. Without that there isn't a very good solution other than discarding the cache after each expansion at the point you use it. The SCEV cache should effectively make most of the affine expansion caching moot (but you can try instrumenting that to verify it). That is, I suggest to try freeing the cache right after the second tree_to_aff_combination_expand. Btw, + /* Try harder to check if they are equal. */ + tree_to_aff_combination_expand (left, type, &aff1, &peeled_chrec_map); + tree_to_aff_combination_expand (step_val, type, &aff2, &peeled_chrec_map); + aff_combination_scale (&aff2, double_int_minus_one); + aff_combination_add (&aff1, &aff2); + left = fold_convert (type, aff_combination_to_tree (&aff1)); + + /* Transform (init, {left, right}_LOOP)_LOOP to {init, right}_LOOP + if "left" equals to "init + right". */ + if (operand_equal_p (left, integer_zero_node, 0)) you don't need aff_combination_to_tree, simply check aff1.n == 0 && aff1.offset.is_zero () (you may want to add aff_combination_zero_p or even aff_combination_equal_p) Thanks, Richard. > Jakub
Index: gcc/tree-scalar-evolution.c =================================================================== --- gcc/tree-scalar-evolution.c (revision 205688) +++ gcc/tree-scalar-evolution.c (working copy) @@ -280,6 +280,8 @@ along with GCC; see the file COPYING3. If not see #include "tree-ssa.h" #include "cfgloop.h" #include "tree-chrec.h" +#include "pointer-set.h" +#include "tree-affine.h" #include "tree-scalar-evolution.h" #include "dumpfile.h" #include "params.h" @@ -1380,7 +1382,67 @@ follow_ssa_edge (struct loop *loop, gimple def, gi } +/* Pointer map used when simplifying PEELED_CHREC into POLYNOMIAL_CHREC. */ +static pointer_map_t *peeled_chrec_map; +/* Simplify PEELED_CHREC represented by (init_cond, arg) in LOOP. + Handle below case and return the corresponding POLYNOMIAL_CHREC: + + # i_17 = PHI <i_13(5), 0(3)> + # _20 = PHI <_5(5), start_4(D)(3)> + ... + i_13 = i_17 + 1; + _5 = start_4(D) + i_13; + + Though variable _20 appears as a PEELED_CHREC in the form of + (start_4, _5)_LOOP, it's a POLYNOMIAL_CHREC like {start_4, 1}_LOOP. + + See PR41488. */ + +static tree +simplify_peeled_chrec (struct loop *loop, tree arg, tree init_cond) +{ + aff_tree aff1, aff2; + tree ev, left, right, type, step_val; + + ev = instantiate_parameters (loop, analyze_scalar_evolution (loop, arg)); + if (ev == NULL_TREE || TREE_CODE (ev) != POLYNOMIAL_CHREC) + return chrec_dont_know; + + left = CHREC_LEFT (ev); + right = CHREC_RIGHT (ev); + type = TREE_TYPE (left); + step_val = chrec_fold_plus (type, init_cond, right); + + /* Transform (init, {left, right}_LOOP)_LOOP to {init, right}_LOOP + if "left" equals to "init + right". */ + if (operand_equal_p (left, step_val, 0)) + { + if (dump_file && (dump_flags & TDF_SCEV)) + fprintf (dump_file, "Simplify PEELED_CHREC into POLYNOMIAL_CHREC.\n"); + + return build_polynomial_chrec (loop->num, init_cond, right); + } + + /* Try harder to check if they are equal. */ + tree_to_aff_combination_expand (left, type, &aff1, &peeled_chrec_map); + tree_to_aff_combination_expand (step_val, type, &aff2, &peeled_chrec_map); + aff_combination_scale (&aff2, double_int_minus_one); + aff_combination_add (&aff1, &aff2); + left = fold_convert (type, aff_combination_to_tree (&aff1)); + + /* Transform (init, {left, right}_LOOP)_LOOP to {init, right}_LOOP + if "left" equals to "init + right". */ + if (operand_equal_p (left, integer_zero_node, 0)) + { + if (dump_file && (dump_flags & TDF_SCEV)) + fprintf (dump_file, "Simplify PEELED_CHREC into POLYNOMIAL_CHREC.\n"); + + return build_polynomial_chrec (loop->num, init_cond, right); + } + return chrec_dont_know; +} + /* Given a LOOP_PHI_NODE, this function determines the evolution function from LOOP_PHI_NODE to LOOP_PHI_NODE in the loop. */ @@ -1392,6 +1454,7 @@ analyze_evolution_in_loop (gimple loop_phi_node, tree evolution_function = chrec_not_analyzed_yet; struct loop *loop = loop_containing_stmt (loop_phi_node); basic_block bb; + static bool simplify_peeled_chrec_p = true; if (dump_file && (dump_flags & TDF_SCEV)) { @@ -1442,7 +1505,19 @@ analyze_evolution_in_loop (gimple loop_phi_node, all the other iterations it has the value of ARG. For the moment, PEELED_CHREC nodes are not built. */ if (res != t_true) - ev_fn = chrec_dont_know; + { + ev_fn = chrec_dont_know; + /* Try to recognize POLYNOMIAL_CHREC which appears in + the form of PEELED_CHREC, but guard the process with + a bool variable to keep the analyzer from infinite + recurrence for real PEELED_RECs. */ + if (simplify_peeled_chrec_p && TREE_CODE (arg) == SSA_NAME) + { + simplify_peeled_chrec_p = false; + ev_fn = simplify_peeled_chrec (loop, arg, init_cond); + simplify_peeled_chrec_p = true; + } + } /* When there are multiple back edges of the loop (which in fact never happens currently, but nevertheless), merge their evolutions. */ @@ -3086,6 +3161,8 @@ scev_initialize (void) initialize_scalar_evolutions_analyzer (); + peeled_chrec_map = pointer_map_create (); + FOR_EACH_LOOP (loop, 0) { loop->nb_iterations = NULL_TREE; @@ -3122,6 +3199,12 @@ scev_reset (void) scev_reset_htab (); + if (peeled_chrec_map) + { + pointer_map_destroy (peeled_chrec_map); + peeled_chrec_map = NULL; + } + if (!current_loops) return; @@ -3209,6 +3292,11 @@ scev_finalize (void) return; htab_delete (scalar_evolution_info); scalar_evolution_info = NULL; + if (peeled_chrec_map) + { + pointer_map_destroy (peeled_chrec_map); + peeled_chrec_map = NULL; + } } /* Returns true if the expression EXPR is considered to be too expensive Index: gcc/testsuite/gcc.dg/tree-ssa/scev-7.c =================================================================== --- gcc/testsuite/gcc.dg/tree-ssa/scev-7.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/scev-7.c (revision 0) @@ -0,0 +1,18 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-sccp-scev" } */ + +struct struct_t +{ + int* data; +}; + +void foo (struct struct_t* sp, int start, int end) +{ + int i; + + for (i = 1000; i+start > end; i--) + sp->data[i+start] = 0; +} + +/* { dg-final { scan-tree-dump-times "Simplify PEELED_CHREC into POLYNOMIAL_CHREC" 1 "sccp" } } */ +/* { dg-final { cleanup-tree-dump "sccp" } } */ Index: gcc/testsuite/gcc.dg/pr41488.c =================================================================== --- gcc/testsuite/gcc.dg/pr41488.c (revision 0) +++ gcc/testsuite/gcc.dg/pr41488.c (revision 0) @@ -0,0 +1,18 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-sccp-scev" } */ + +struct struct_t +{ + int* data; +}; + +void foo (struct struct_t* sp, int start, int end) +{ + int i; + + for (i = 0; i+start < end; i++) + sp->data[i+start] = 0; +} + +/* { dg-final { scan-tree-dump-times "Simplify PEELED_CHREC into POLYNOMIAL_CHREC" 1 "sccp" } } */ +/* { dg-final { cleanup-tree-dump "sccp" } } */ Index: gcc/tree-ssa-loop-ivopts.c =================================================================== --- gcc/tree-ssa-loop-ivopts.c (revision 205688) +++ gcc/tree-ssa-loop-ivopts.c (working copy) @@ -2526,11 +2526,19 @@ add_old_iv_candidates (struct ivopts_data *data, s /* Additionally record the possibility of leaving the original iv untouched. */ def = PHI_ARG_DEF_FROM_EDGE (phi, loop_latch_edge (data->current_loop)); - cand = add_candidate_1 (data, - iv->base, iv->step, true, IP_ORIGINAL, NULL, - SSA_NAME_DEF_STMT (def)); - cand->var_before = iv->ssa_name; - cand->var_after = def; + /* Don't add candidate if it's from another PHI node because + it's an affine iv appearing in the form of PEELED_CHREC. */ + phi = SSA_NAME_DEF_STMT (def); + if (gimple_code (phi) != GIMPLE_PHI) + { + cand = add_candidate_1 (data, + iv->base, iv->step, true, IP_ORIGINAL, NULL, + SSA_NAME_DEF_STMT (def)); + cand->var_before = iv->ssa_name; + cand->var_after = def; + } + else + gcc_assert (gimple_bb (phi) == data->current_loop->header); } }