From patchwork Wed Oct 11 14:43:09 2017
X-Patchwork-Submitter: Richard Biener
X-Patchwork-Id: 824466
Date: Wed, 11 Oct 2017 16:43:09 +0200 (CEST)
From: Richard Biener
To: gcc-patches@gcc.gnu.org
cc: sebpop@gmail.com
Subject: [PATCH][GRAPHITE] Fix PR82451 (and PR82355 in a different way)

For PR82355 I introduced a fake dimension to ISL to allow CHRECs having
an evolution in a loop that isn't fully part of the SESE region we are
processing.  That was easier than fending off those CHRECs (without
simply giving up on SESE regions with those).

But it didn't fully solve the issue, as PR82451 shows: we eventually
have to code-gen those evolutions and thus in theory need a canonical
IV of that containing loop.

So I decided (after Micha pressuring me a bit...) to revisit the
original issue and make SCEV analysis "properly" handle SE regions.
It turns out that it is mostly instantiate_scev lacking proper support,
plus the necessary interfacing change (really just cosmetic in some
sense) from an instantiate_before basic-block to an instantiate_before
edge.

The data-ref interfaces have been adjusted similarly, here changing the
"loop nest" loop parameter to the entry edge for the SE region and
passing that down accordingly.
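As an illustration of the edge-based interface (a toy model, not GCC
code): whether a definition needs instantiating can be decided by asking
whether its block is dominated by the destination of the region entry
edge, which mirrors the dominated_by_p (CDI_DOMINATORS, def_bb,
instantiate_below->dest) check in the patch.  The CFG, the idom table
and the names toy_edge, dominated_by and defined_in_region are
assumptions made up for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy CFG: ENTRY -> A -> REGION_HEAD -> B -> C.  Blocks are small
   integers; all names here are hypothetical, not GCC identifiers.  */
enum { ENTRY = 0, A = 1, REGION_HEAD = 2, B = 3, C = 4, NBLOCKS = 5 };

/* idom[b] is the immediate dominator of block b; the entry block is
   its own immediate dominator.  Order: ENTRY, A, REGION_HEAD, B, C.  */
static const int idom[NBLOCKS] = { ENTRY, ENTRY, A, REGION_HEAD, B };

/* A CFG edge, standing in for the entry edge of an SE region.  */
struct toy_edge { int src, dest; };

/* Return true if block BB is dominated by block DOM, by walking the
   immediate-dominator chain from BB up to the entry block.  */
static bool
dominated_by (int bb, int dom)
{
  assert (bb >= 0 && bb < NBLOCKS && dom >= 0 && dom < NBLOCKS);
  for (;;)
    {
      if (bb == dom)
        return true;
      if (idom[bb] == bb)   /* Reached the entry block.  */
        return false;
      bb = idom[bb];
    }
}

/* A definition in DEF_BB lies inside the region entered through
   ENTRY_EDGE iff DEF_BB is dominated by the edge destination.
   Definitions outside the region act as parameters.  */
static bool
defined_in_region (int def_bb, struct toy_edge entry_edge)
{
  return dominated_by (def_bb, entry_edge.dest);
}
```

With the entry edge A -> REGION_HEAD, defined_in_region reports
REGION_HEAD, B and C as inside the region and ENTRY and A as outside;
names defined outside would be left symbolic rather than instantiated.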
For now I've tried to keep the other high-level loop-based interfaces
the same by simply using the loop preheader edge as entry where
appropriate (which, for simplicity, requires loop_preheader_edge to
cope with the loop tree root).

In the process I ran into issues with us being overly aggressive in
instantiating random trees, and thus I cut those down.  That part
doesn't successfully test separately (when I remove the strange
ARRAY_REF instantiation), so it's part of this patch.  I've also run
into an SSA verification failure (the id-27.f90 testcase) which shows
we _do_ need to clear the SCEV cache after introducing the versioned
CFG (and I added a comment before it).

On the previously failing testcases I've verified we produce sensible
instantiations for those pesky refs residing in "no" loop in the SCOP
and that we get away with the result in terms of optimizing.

SPEC 2k6 testing shows

  loop nest optimized: 311
  loop nest not optimized, code generation error: 0
  loop nest not optimized, optimized schedule is identical to original schedule: 173
  loop nest not optimized, optimization timed out: 59
  loop nest not optimized, ISL signalled an error: 9
  loop nest: 552

for SPEC 2k6 and -floop-nest-optimize, while adding -fgraphite-identity
still reveals some codegen errors:

  loop nest optimized: 437
  loop nest not optimized, code generation error: 25
  loop nest not optimized, optimized schedule is identical to original schedule: 169
  loop nest not optimized, optimization timed out: 60
  loop nest not optimized, ISL signalled an error: 9
  loop nest: 700

Bootstrap and regtest in progress on x86_64-unknown-linux-gnu
(with and without -fgraphite-identity -floop-nest-optimize).

Ok?

Thanks,
Richard.

2017-10-11  Richard Biener

	PR tree-optimization/82451
	Revert
	2017-10-02  Richard Biener

	PR tree-optimization/82355
	* graphite-isl-ast-to-gimple.c (build_iv_mapping): Also build
	a mapping for the enclosing loop but avoid generating one for
	the loop tree root.
	(copy_bb_and_scalar_dependences): Remove premature codegen
	error on PHIs in blocks duplicated into multiple places.
	* graphite-scop-detection.c
	(scop_detection::stmt_has_simple_data_refs_p): For a loop not
	in the region use it as loop and nest to analyze the DR in.
	(try_generate_gimple_bb): Likewise.
	* graphite-sese-to-poly.c (extract_affine_chrec): Adjust.
	(add_loop_constraints): For blocks in a loop not in the region
	create a dimension with a single iteration.
	* sese.h (gbb_loop_at_index): Remove assert.

	* cfgloop.c (loop_preheader_edge): For the loop tree root
	return the single successor of the entry block.
	* graphite-isl-ast-to-gimple.c (graphite_regenerate_ast_isl):
	Reset the SCEV hashtable and niters.
	* graphite-scop-detection.c
	(scop_detection::graphite_can_represent_scev): Add SCOP parameter,
	assert that we only have POLYNOMIAL_CHREC that vary in loops
	contained in the region.
	(scop_detection::graphite_can_represent_expr): Adjust.
	(scop_detection::stmt_has_simple_data_refs_p): For loops
	not in the region set loop to NULL.  The nest is now the
	entry edge to the region.
	(try_generate_gimple_bb): Likewise.
	* sese.c (scalar_evolution_in_region): Adjust for
	instantiate_scev change.
	* tree-data-ref.h (graphite_find_data_references_in_stmt):
	Make nest parameter the edge into the region.
	(create_data_ref): Likewise.
	* tree-data-ref.c (dr_analyze_indices): Make nest parameter an
	entry edge into a region and adjust instantiate_scev calls.
	(create_data_ref): Likewise.
	(graphite_find_data_references_in_stmt): Likewise.
	(find_data_references_in_stmt): Pass the loop preheader edge
	from the nest argument.
	* tree-scalar-evolution.h (instantiate_scev): Make instantiate_below
	parameter the edge into the region.
	(instantiate_parameters): Use the loop preheader edge as entry.
	* tree-scalar-evolution.c (analyze_scalar_evolution): Handle
	NULL loop.
	(get_instantiated_value_entry): Make instantiate_below parameter
	the edge into the region.
	(instantiate_scev_name): Likewise.
Adjust dominance checks, when we cannot use loop-based instantiation instantiate by walking use-def chains. (instantiate_scev_poly): Adjust. (instantiate_scev_binary): Likewise. (instantiate_scev_convert): Likewise. (instantiate_scev_not): Likewise. (instantiate_array_ref): Remove. (instantiate_scev_3): Likewise. (instantiate_scev_2): Likewise. (instantiate_scev_1): Likewise. (instantiate_scev_r): Do not blindly handle N-operand trees. Do not instantiate array-refs. Handle all constants and invariants. (instantiate_scev): Make instantiate_below parameter the edge into the region. (resolve_mixers): Use the loop preheader edge for the region parameter to instantiate_scev_r. * tree-ssa-loop-prefetch.c (determine_loop_nest_reuse): Adjust. * gcc.dg/graphite/pr82451.c: New testcase. * gfortran.dg/graphite/id-27.f90: Likewise. * gfortran.dg/graphite/pr82451.f: Likewise. Index: gcc/cfgloop.c =================================================================== --- gcc/cfgloop.c (revision 253645) +++ gcc/cfgloop.c (working copy) @@ -1713,12 +1713,19 @@ loop_preheader_edge (const struct loop * edge e; edge_iterator ei; - gcc_assert (loops_state_satisfies_p (LOOPS_HAVE_PREHEADERS)); + gcc_assert (loops_state_satisfies_p (LOOPS_HAVE_PREHEADERS) + && ! loops_state_satisfies_p (LOOPS_MAY_HAVE_MULTIPLE_LATCHES)); FOR_EACH_EDGE (e, ei, loop->header->preds) if (e->src != loop->latch) break; + if (! e) + { + gcc_assert (! loop_outer (loop)); + return single_succ_edge (ENTRY_BLOCK_PTR_FOR_FN (cfun)); + } + return e; } Index: gcc/graphite-isl-ast-to-gimple.c =================================================================== --- gcc/graphite-isl-ast-to-gimple.c (revision 253645) +++ gcc/graphite-isl-ast-to-gimple.c (working copy) @@ -749,10 +749,8 @@ build_iv_mapping (vec iv_map, gimp if (codegen_error_p ()) t = integer_zero_node; - loop_p old_loop = gbb_loop_at_index (gbb, region, i - 2); - /* Record sth only for real loops. 
*/ - if (loop_in_sese_p (old_loop, region)) - iv_map[old_loop->num] = t; + loop_p old_loop = gbb_loop_at_index (gbb, region, i - 1); + iv_map[old_loop->num] = t; } } @@ -1570,6 +1568,12 @@ graphite_regenerate_ast_isl (scop_p scop update_ssa (TODO_update_ssa); checking_verify_ssa (true, true); rewrite_into_loop_closed_ssa (NULL, 0); + /* We analyzed evolutions of all SCOPs during SCOP detection + which cached evolutions. Now we've introduced PHIs for + liveouts which causes those cached solutions to be invalid + for code-generation purposes given we'd insert references + to SSA names not dominating their new use. */ + scev_reset (); } if (t.codegen_error_p ()) Index: gcc/graphite-scop-detection.c =================================================================== --- gcc/graphite-scop-detection.c (revision 253645) +++ gcc/graphite-scop-detection.c (working copy) @@ -403,7 +403,7 @@ public: Something like "i * n" or "n * m" is not allowed. */ - static bool graphite_can_represent_scev (tree scev); + static bool graphite_can_represent_scev (sese_l scop, tree scev); /* Return true when EXPR can be represented in the polyhedral model. @@ -963,7 +963,7 @@ scop_detection::graphite_can_represent_i Something like "i * n" or "n * m" is not allowed. 
*/ bool -scop_detection::graphite_can_represent_scev (tree scev) +scop_detection::graphite_can_represent_scev (sese_l scop, tree scev) { if (chrec_contains_undetermined (scev)) return false; @@ -982,13 +982,13 @@ scop_detection::graphite_can_represent_s case BIT_NOT_EXPR: CASE_CONVERT: case NON_LVALUE_EXPR: - return graphite_can_represent_scev (TREE_OPERAND (scev, 0)); + return graphite_can_represent_scev (scop, TREE_OPERAND (scev, 0)); case PLUS_EXPR: case POINTER_PLUS_EXPR: case MINUS_EXPR: - return graphite_can_represent_scev (TREE_OPERAND (scev, 0)) - && graphite_can_represent_scev (TREE_OPERAND (scev, 1)); + return graphite_can_represent_scev (scop, TREE_OPERAND (scev, 0)) + && graphite_can_represent_scev (scop, TREE_OPERAND (scev, 1)); case MULT_EXPR: return !CONVERT_EXPR_CODE_P (TREE_CODE (TREE_OPERAND (scev, 0))) @@ -996,18 +996,20 @@ scop_detection::graphite_can_represent_s && !(chrec_contains_symbols (TREE_OPERAND (scev, 0)) && chrec_contains_symbols (TREE_OPERAND (scev, 1))) && graphite_can_represent_init (scev) - && graphite_can_represent_scev (TREE_OPERAND (scev, 0)) - && graphite_can_represent_scev (TREE_OPERAND (scev, 1)); + && graphite_can_represent_scev (scop, TREE_OPERAND (scev, 0)) + && graphite_can_represent_scev (scop, TREE_OPERAND (scev, 1)); case POLYNOMIAL_CHREC: /* Check for constant strides. With a non constant stride of 'n' we would have a value of 'iv * n'. Also check that the initial value can represented: for example 'n * m' cannot be represented. 
*/ + gcc_assert (loop_in_sese_p (get_loop (cfun, + CHREC_VARIABLE (scev)), scop)); if (!evolution_function_right_is_integer_cst (scev) || !graphite_can_represent_init (scev)) return false; - return graphite_can_represent_scev (CHREC_LEFT (scev)); + return graphite_can_represent_scev (scop, CHREC_LEFT (scev)); default: break; @@ -1031,7 +1033,7 @@ scop_detection::graphite_can_represent_e tree expr) { tree scev = scalar_evolution_in_region (scop, loop, expr); - return graphite_can_represent_scev (scev); + return graphite_can_represent_scev (scop, scev); } /* Return true if the data references of STMT can be represented by Graphite. @@ -1040,12 +1042,15 @@ scop_detection::graphite_can_represent_e bool scop_detection::stmt_has_simple_data_refs_p (sese_l scop, gimple *stmt) { - loop_p nest; + edge nest; loop_p loop = loop_containing_stmt (stmt); if (!loop_in_sese_p (loop, scop)) - nest = loop; + { + nest = scop.entry; + loop = NULL; + } else - nest = outermost_loop_in_sese (scop, gimple_bb (stmt)); + nest = loop_preheader_edge (outermost_loop_in_sese (scop, gimple_bb (stmt))); auto_vec drs; if (! graphite_find_data_references_in_stmt (nest, loop, stmt, &drs)) @@ -1056,7 +1061,7 @@ scop_detection::stmt_has_simple_data_ref FOR_EACH_VEC_ELT (drs, j, dr) { for (unsigned i = 0; i < DR_NUM_DIMENSIONS (dr); ++i) - if (! graphite_can_represent_scev (DR_ACCESS_FN (dr, i))) + if (! 
graphite_can_represent_scev (scop, DR_ACCESS_FN (dr, i))) return false; } @@ -1413,12 +1418,15 @@ try_generate_gimple_bb (scop_p scop, bas vec reads = vNULL; sese_l region = scop->scop_info->region; - loop_p nest; + edge nest; loop_p loop = bb->loop_father; if (!loop_in_sese_p (loop, region)) - nest = loop; + { + nest = region.entry; + loop = NULL; + } else - nest = outermost_loop_in_sese (region, bb); + nest = loop_preheader_edge (outermost_loop_in_sese (region, bb)); for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) Index: gcc/graphite-sese-to-poly.c =================================================================== --- gcc/graphite-sese-to-poly.c (revision 253645) +++ gcc/graphite-sese-to-poly.c (working copy) @@ -86,7 +86,7 @@ extract_affine_chrec (scop_p s, tree e, isl_pw_aff *lhs = extract_affine (s, CHREC_LEFT (e), isl_space_copy (space)); isl_pw_aff *rhs = extract_affine (s, CHREC_RIGHT (e), isl_space_copy (space)); isl_local_space *ls = isl_local_space_from_space (space); - unsigned pos = sese_loop_depth (s->scop_info->region, get_chrec_loop (e)); + unsigned pos = sese_loop_depth (s->scop_info->region, get_chrec_loop (e)) - 1; isl_aff *loop = isl_aff_set_coefficient_si (isl_aff_zero_on_domain (ls), isl_dim_in, pos, 1); isl_pw_aff *l = isl_pw_aff_from_aff (loop); @@ -763,10 +763,10 @@ add_loop_constraints (scop_p scop, __isl return domain; const sese_l ®ion = scop->scop_info->region; if (!loop_in_sese_p (loop, region)) - ; - else - /* Recursion all the way up to the context loop. */ - domain = add_loop_constraints (scop, domain, loop_outer (loop), context); + return domain; + + /* Recursion all the way up to the context loop. */ + domain = add_loop_constraints (scop, domain, loop_outer (loop), context); /* Then, build constraints over the loop in post-order: outer to inner. 
*/ @@ -777,21 +777,6 @@ add_loop_constraints (scop_p scop, __isl domain = add_iter_domain_dimension (domain, loop, scop); isl_space *space = isl_set_get_space (domain); - if (!loop_in_sese_p (loop, region)) - { - /* 0 == loop_i */ - isl_local_space *ls = isl_local_space_from_space (space); - isl_constraint *c = isl_equality_alloc (ls); - c = isl_constraint_set_coefficient_si (c, isl_dim_set, loop_index, 1); - if (dump_file) - { - fprintf (dump_file, "[sese-to-poly] adding constraint to the domain: "); - print_isl_constraint (dump_file, c); - } - domain = isl_set_add_constraint (domain, c); - return domain; - } - /* 0 <= loop_i */ isl_local_space *ls = isl_local_space_from_space (isl_space_copy (space)); isl_constraint *c = isl_inequality_alloc (ls); Index: gcc/sese.c =================================================================== --- gcc/sese.c (revision 253645) +++ gcc/sese.c (working copy) @@ -461,7 +461,6 @@ scalar_evolution_in_region (const sese_l { gimple *def; struct loop *def_loop; - basic_block before = region.entry->src; /* SCOP parameters. */ if (TREE_CODE (t) == SSA_NAME @@ -472,7 +471,7 @@ scalar_evolution_in_region (const sese_l || loop_in_sese_p (loop, region)) /* FIXME: we would need instantiate SCEV to work on a region, and be more flexible wrt. memory loads that may be invariant in the region. */ - return instantiate_scev (before, loop, + return instantiate_scev (region.entry, loop, analyze_scalar_evolution (loop, t)); def = SSA_NAME_DEF_STMT (t); @@ -494,7 +493,7 @@ scalar_evolution_in_region (const sese_l if (has_vdefs) return chrec_dont_know; - return instantiate_scev (before, loop, t); + return instantiate_scev (region.entry, loop, t); } /* Return true if BB is empty, contains only DEBUG_INSNs. 
*/ Index: gcc/sese.h =================================================================== --- gcc/sese.h (revision 253645) +++ gcc/sese.h (working copy) @@ -334,6 +334,8 @@ gbb_loop_at_index (gimple_poly_bb_p gbb, while (--depth > index) loop = loop_outer (loop); + gcc_assert (loop_in_sese_p (loop, region)); + return loop; } Index: gcc/testsuite/gcc.dg/graphite/fuse-1.c =================================================================== --- gcc/testsuite/gcc.dg/graphite/fuse-1.c (revision 253645) +++ gcc/testsuite/gcc.dg/graphite/fuse-1.c (working copy) @@ -1,15 +1,15 @@ /* Check that the two loops are fused and that we manage to fold the two xor operations. */ -/* { dg-options "-O2 -floop-nest-optimize -fdump-tree-forwprop4-all -fdump-tree-graphite-all" } */ +/* { dg-options "-O2 -floop-nest-optimize -fdump-tree-forwprop-all -fdump-tree-graphite-all" } */ /* Make sure we fuse the loops like this: AST generated by isl: for (int c0 = 0; c0 <= 99; c0 += 1) { - S_3(0, c0); - S_6(0, c0); - S_9(0, c0); + S_3(c0); + S_6(c0); + S_9(c0); } */ -/* { dg-final { scan-tree-dump-times "AST generated by isl:.*for \\(int c0 = 0; c0 <= 99; c0 \\+= 1\\) \\{.*S_.*\\(0, c0\\);.*S_.*\\(0, c0\\);.*S_.*\\(0, c0\\);.*\\}" 1 "graphite" } } */ +/* { dg-final { scan-tree-dump-times "AST generated by isl:.*for \\(int c0 = 0; c0 <= 99; c0 \\+= 1\\) \\{.*S_.*\\(c0\\);.*S_.*\\(c0\\);.*S_.*\\(c0\\);.*\\}" 1 "graphite" } } */ /* Check that after fusing the loops, the scalar computation is also fused. 
*/ /* { dg-final { scan-tree-dump-times "gimple_simplified to\[^\\n\]*\\^ 12" 1 "forwprop4" } } */ Index: gcc/testsuite/gcc.dg/graphite/fuse-2.c =================================================================== --- gcc/testsuite/gcc.dg/graphite/fuse-2.c (revision 253645) +++ gcc/testsuite/gcc.dg/graphite/fuse-2.c (working copy) @@ -3,13 +3,13 @@ /* Make sure we fuse the loops like this: AST generated by isl: for (int c0 = 0; c0 <= 99; c0 += 1) { - S_3(0, c0); - S_6(0, c0); - S_9(0, c0); + S_3(c0); + S_6(c0); + S_9(c0); } */ -/* { dg-final { scan-tree-dump-times "AST generated by isl:.*for \\(int c0 = 0; c0 <= 99; c0 \\+= 1\\) \\{.*S_.*\\(0, c0\\);.*S_.*\\(0, c0\\);.*S_.*\\(0, c0\\);.*\\}" 1 "graphite" } } */ +/* { dg-final { scan-tree-dump-times "AST generated by isl:.*for \\(int c0 = 0; c0 <= 99; c0 \\+= 1\\) \\{.*S_.*\\(c0\\);.*S_.*\\(c0\\);.*S_.*\\(c0\\);.*\\}" 1 "graphite" } } */ #define MAX 100 int A[MAX], B[MAX], C[MAX]; Index: gcc/testsuite/gcc.dg/graphite/pr82451.c =================================================================== --- gcc/testsuite/gcc.dg/graphite/pr82451.c (nonexistent) +++ gcc/testsuite/gcc.dg/graphite/pr82451.c (working copy) @@ -0,0 +1,21 @@ +/* { dg-do compile } */ +/* { dg-options "-O -floop-parallelize-all" } */ + +static int a[]; +int b[1]; +int c; +static void +d (int *f, int *g) +{ + int e; + for (e = 0; e < 2; e++) + g[e] = 1; + for (e = 0; e < 2; e++) + g[e] = f[e] + f[e + 1]; +} +void +h () +{ + for (;; c += 8) + d (&a[c], b); +} Index: gcc/testsuite/gfortran.dg/graphite/id-27.f90 =================================================================== --- gcc/testsuite/gfortran.dg/graphite/id-27.f90 (nonexistent) +++ gcc/testsuite/gfortran.dg/graphite/id-27.f90 (working copy) @@ -0,0 +1,40 @@ +! 
{ dg-additional-options "-Ofast" } +MODULE module_ra_gfdleta + INTEGER, PARAMETER :: NBLY=15 + REAL , SAVE :: EM1(28,180),EM1WDE(28,180),TABLE1(28,180), & + TABLE2(28,180),TABLE3(28,180),EM3(28,180), & + SOURCE(28,NBLY), DSRCE(28,NBLY) +CONTAINS + SUBROUTINE TABLE + INTEGER, PARAMETER :: NBLX=47 + INTEGER , PARAMETER:: NBLW = 163 + REAL :: & + SUM(28,180),PERTSM(28,180),SUM3(28,180), & + SUMWDE(28,180),SRCWD(28,NBLX),SRC1NB(28,NBLW), & + DBDTNB(28,NBLW) + REAL :: & + ZMASS(181),ZROOT(181),SC(28),DSC(28),XTEMV(28), & + TFOUR(28),FORTCU(28),X(28),X1(28),X2(180),SRCS(28), & + R1T(28),R2(28),S2(28),T3(28),R1WD(28) + REAL :: EXPO(180),FAC(180) + I = 0 + DO 417 J=121,180 + FAC(J)=ZMASS(J)*(ONE-(ONE+X2(J))*EXPO(J))/(X2(J)*X2(J)) +417 CONTINUE + DO 421 J=121,180 + SUM3(I,J)=SUM3(I,J)+DBDTNB(I,N)*FAC(J) +421 CONTINUE + IF (CENT.GT.160. .AND. CENT.LT.560.) THEN + DO 420 J=1,180 + DO 420 I=1,28 + SUMWDE(I,J)=SUMWDE(I,J)+SRC1NB(I,N)*EXPO(J) +420 CONTINUE + ENDIF + DO 433 J=121,180 + EM3(I,J)=SUM3(I,J)/FORTCU(I) +433 CONTINUE + DO 501 I=1,28 + EM1WDE(I,J)=SUMWDE(I,J)/TFOUR(I) +501 CONTINUE + END SUBROUTINE TABLE + END MODULE module_RA_GFDLETA Index: gcc/testsuite/gfortran.dg/graphite/pr82451.f =================================================================== --- gcc/testsuite/gfortran.dg/graphite/pr82451.f (nonexistent) +++ gcc/testsuite/gfortran.dg/graphite/pr82451.f (working copy) @@ -0,0 +1,39 @@ +! { dg-do compile } +! 
{ dg-options "-O2 -floop-nest-optimize" } + MODULE LES3D_DATA + PARAMETER ( NSCHEME = 4, ICHEM = 0, ISGSK = 0, IVISC = 1 ) + DOUBLE PRECISION DT, TIME, STATTIME, CFL, RELNO, TSTND, ALREF + INTEGER IDYN, IMAX, JMAX, KMAX + PARAMETER( RUNIV = 8.3145D3, + > TPRANDLT = 0.91D0) + DOUBLE PRECISION,ALLOCATABLE,DIMENSION(:,:,:) :: + > U, V, W, P, T, H, EK, + > UAV, VAV, WAV, PAV, TAV, HAV, EKAV + DOUBLE PRECISION,ALLOCATABLE,DIMENSION(:,:,:,:) :: + > CONC, HF, QAV, COAV, HFAV, DU + DOUBLE PRECISION,ALLOCATABLE,DIMENSION(:,:,:,:,:) :: + > Q + END MODULE LES3D_DATA + SUBROUTINE FLUXJ() + USE LES3D_DATA + ALLOCATABLE QS(:), FSJ(:,:,:) + ALLOCATABLE DWDX(:),DWDY(:),DWDZ(:) + ALLOCATABLE DHDY(:), DKDY(:) + PARAMETER ( R12I = 1.0D0 / 12.0D0, + > TWO3 = 2.0D0 / 3.0D0 ) + ALLOCATE( QS(IMAX-1), FSJ(IMAX-1,0:JMAX-1,ND)) + ALLOCATE( DWDX(IMAX-1),DWDY(IMAX-1),DWDZ(IMAX-1)) + I1 = 1 + DO K = K1,K2 + DO J = J1,J2 + DO I = I1, I2 + FSJ(I,J,5) = FSJ(I,J,5) + PAV(I,J,K) * QS(I) + END DO + DO I = I1, I2 + DWDX(I) = DXI * R12I * (WAV(I-2,J,K) - WAV(I+2,J,K) + + > 8.0D0 * (WAV(I+1,J,K) - WAV(I-1,J,K))) + END DO + END DO + END DO + DEALLOCATE( QS, FSJ, DHDY, DKDY) + END Index: gcc/tree-data-ref.c =================================================================== --- gcc/tree-data-ref.c (revision 253645) +++ gcc/tree-data-ref.c (working copy) @@ -957,15 +957,14 @@ access_fn_component_p (tree op) } /* Determines the base object and the list of indices of memory reference - DR, analyzed in LOOP and instantiated in loop nest NEST. */ + DR, analyzed in LOOP and instantiated before NEST. */ static void -dr_analyze_indices (struct data_reference *dr, loop_p nest, loop_p loop) +dr_analyze_indices (struct data_reference *dr, edge nest, loop_p loop) { vec access_fns = vNULL; tree ref, op; tree base, off, access_fn; - basic_block before_loop; /* If analyzing a basic-block there are no indices to analyze and thus no access functions. 
*/ @@ -977,7 +976,6 @@ dr_analyze_indices (struct data_referenc } ref = DR_REF (dr); - before_loop = block_before_loop (nest); /* REALPART_EXPR and IMAGPART_EXPR can be handled like accesses into a two element array with a constant index. The base is @@ -1002,7 +1000,7 @@ dr_analyze_indices (struct data_referenc { op = TREE_OPERAND (ref, 1); access_fn = analyze_scalar_evolution (loop, op); - access_fn = instantiate_scev (before_loop, loop, access_fn); + access_fn = instantiate_scev (nest, loop, access_fn); access_fns.safe_push (access_fn); } else if (TREE_CODE (ref) == COMPONENT_REF @@ -1034,7 +1032,7 @@ dr_analyze_indices (struct data_referenc { op = TREE_OPERAND (ref, 0); access_fn = analyze_scalar_evolution (loop, op); - access_fn = instantiate_scev (before_loop, loop, access_fn); + access_fn = instantiate_scev (nest, loop, access_fn); if (TREE_CODE (access_fn) == POLYNOMIAL_CHREC) { tree orig_type; @@ -1139,7 +1137,7 @@ free_data_ref (data_reference_p dr) in which the data reference should be analyzed. */ struct data_reference * -create_data_ref (loop_p nest, loop_p loop, tree memref, gimple *stmt, +create_data_ref (edge nest, loop_p loop, tree memref, gimple *stmt, bool is_read, bool is_conditional_in_stmt) { struct data_reference *dr; @@ -4970,7 +4968,8 @@ find_data_references_in_stmt (struct loo FOR_EACH_VEC_ELT (references, i, ref) { - dr = create_data_ref (nest, loop_containing_stmt (stmt), ref->ref, + dr = create_data_ref (nest ? loop_preheader_edge (nest) : NULL, + loop_containing_stmt (stmt), ref->ref, stmt, ref->is_read, ref->is_conditional_in_stmt); gcc_assert (dr != NULL); datarefs->safe_push (dr); @@ -4986,7 +4985,7 @@ find_data_references_in_stmt (struct loo should be analyzed. 
*/ bool -graphite_find_data_references_in_stmt (loop_p nest, loop_p loop, gimple *stmt, +graphite_find_data_references_in_stmt (edge nest, loop_p loop, gimple *stmt, vec *datarefs) { unsigned i; Index: gcc/tree-data-ref.h =================================================================== --- gcc/tree-data-ref.h (revision 253645) +++ gcc/tree-data-ref.h (working copy) @@ -436,11 +436,11 @@ extern void free_data_ref (data_referenc extern void free_data_refs (vec ); extern bool find_data_references_in_stmt (struct loop *, gimple *, vec *); -extern bool graphite_find_data_references_in_stmt (loop_p, loop_p, gimple *, +extern bool graphite_find_data_references_in_stmt (edge, loop_p, gimple *, vec *); tree find_data_references_in_loop (struct loop *, vec *); bool loop_nest_has_data_refs (loop_p loop); -struct data_reference *create_data_ref (loop_p, loop_p, tree, gimple *, bool, +struct data_reference *create_data_ref (edge, loop_p, tree, gimple *, bool, bool); extern bool find_loop_nest (struct loop *, vec *); extern struct data_dependence_relation *initialize_data_dependence_relation Index: gcc/tree-scalar-evolution.c =================================================================== --- gcc/tree-scalar-evolution.c (revision 253645) +++ gcc/tree-scalar-evolution.c (working copy) @@ -2095,6 +2095,10 @@ analyze_scalar_evolution (struct loop *l { tree res; + /* ??? Fix callers. */ + if (! 
loop) + return var; + if (dump_file && (dump_flags & TDF_SCEV)) { fprintf (dump_file, "(analyze_scalar_evolution \n"); @@ -2271,7 +2275,7 @@ eq_idx_scev_info (const void *e1, const static unsigned get_instantiated_value_entry (instantiate_cache_type &cache, - tree name, basic_block instantiate_below) + tree name, edge instantiate_below) { if (!cache.map) { @@ -2281,7 +2285,7 @@ get_instantiated_value_entry (instantiat scev_info_str e; e.name_version = SSA_NAME_VERSION (name); - e.instantiated_below = instantiate_below->index; + e.instantiated_below = instantiate_below->dest->index; void **slot = htab_find_slot_with_hash (cache.map, &e, scev_info_hasher::hash (&e), INSERT); if (!*slot) @@ -2325,7 +2329,7 @@ loop_closed_phi_def (tree var) return NULL_TREE; } -static tree instantiate_scev_r (basic_block, struct loop *, struct loop *, +static tree instantiate_scev_r (edge, struct loop *, struct loop *, tree, bool *, int); /* Analyze all the parameters of the chrec, between INSTANTIATE_BELOW @@ -2344,7 +2348,7 @@ static tree instantiate_scev_r (basic_bl instantiated, and to stop if it exceeds some limit. */ static tree -instantiate_scev_name (basic_block instantiate_below, +instantiate_scev_name (edge instantiate_below, struct loop *evolution_loop, struct loop *inner_loop, tree chrec, bool *fold_conversions, @@ -2358,7 +2362,7 @@ instantiate_scev_name (basic_block insta evolutions in outer loops), nothing to do. */ if (!def_bb || loop_depth (def_bb->loop_father) == 0 - || dominated_by_p (CDI_DOMINATORS, instantiate_below, def_bb)) + || ! dominated_by_p (CDI_DOMINATORS, def_bb, instantiate_below->dest)) return chrec; /* We cache the value of instantiated variable to avoid exponential @@ -2380,6 +2384,51 @@ instantiate_scev_name (basic_block insta def_loop = find_common_loop (evolution_loop, def_bb->loop_father); + if (! 
dominated_by_p (CDI_DOMINATORS, + def_loop->header, instantiate_below->dest)) + { + gimple *def = SSA_NAME_DEF_STMT (chrec); + if (gassign *ass = dyn_cast (def)) + { + switch (gimple_assign_rhs_class (ass)) + { + case GIMPLE_UNARY_RHS: + { + tree op0 = instantiate_scev_r (instantiate_below, evolution_loop, + inner_loop, gimple_assign_rhs1 (ass), + fold_conversions, size_expr); + if (op0 == chrec_dont_know) + return chrec_dont_know; + res = fold_build1 (gimple_assign_rhs_code (ass), + TREE_TYPE (chrec), op0); + break; + } + case GIMPLE_BINARY_RHS: + { + tree op0 = instantiate_scev_r (instantiate_below, evolution_loop, + inner_loop, gimple_assign_rhs1 (ass), + fold_conversions, size_expr); + if (op0 == chrec_dont_know) + return chrec_dont_know; + tree op1 = instantiate_scev_r (instantiate_below, evolution_loop, + inner_loop, gimple_assign_rhs2 (ass), + fold_conversions, size_expr); + if (op1 == chrec_dont_know) + return chrec_dont_know; + res = fold_build2 (gimple_assign_rhs_code (ass), + TREE_TYPE (chrec), op0, op1); + break; + } + default: + res = chrec_dont_know; + } + } + else + res = chrec_dont_know; + global_cache->set (si, res); + return res; + } + /* If the analysis yields a parametric chrec, instantiate the result again. */ res = analyze_scalar_evolution (def_loop, chrec); @@ -2411,8 +2460,9 @@ instantiate_scev_name (basic_block insta inner_loop, res, fold_conversions, size_expr); } - else if (!dominated_by_p (CDI_DOMINATORS, instantiate_below, - gimple_bb (SSA_NAME_DEF_STMT (res)))) + else if (dominated_by_p (CDI_DOMINATORS, + gimple_bb (SSA_NAME_DEF_STMT (res)), + instantiate_below->dest)) res = chrec_dont_know; } @@ -2450,7 +2500,7 @@ instantiate_scev_name (basic_block insta instantiated, and to stop if it exceeds some limit. 
*/ static tree -instantiate_scev_poly (basic_block instantiate_below, +instantiate_scev_poly (edge instantiate_below, struct loop *evolution_loop, struct loop *, tree chrec, bool *fold_conversions, int size_expr) { @@ -2495,7 +2545,7 @@ instantiate_scev_poly (basic_block insta instantiated, and to stop if it exceeds some limit. */ static tree -instantiate_scev_binary (basic_block instantiate_below, +instantiate_scev_binary (edge instantiate_below, struct loop *evolution_loop, struct loop *inner_loop, tree chrec, enum tree_code code, tree type, tree c0, tree c1, @@ -2541,43 +2591,6 @@ instantiate_scev_binary (basic_block ins /* Analyze all the parameters of the chrec, between INSTANTIATE_BELOW and EVOLUTION_LOOP, that were left under a symbolic form. - "CHREC" is an array reference to be instantiated. - - CACHE is the cache of already instantiated values. - - Variable pointed by FOLD_CONVERSIONS is set to TRUE when the - conversions that may wrap in signed/pointer type are folded, as long - as the value of the chrec is preserved. If FOLD_CONVERSIONS is NULL - then we don't do such fold. - - SIZE_EXPR is used for computing the size of the expression to be - instantiated, and to stop if it exceeds some limit. */ - -static tree -instantiate_array_ref (basic_block instantiate_below, - struct loop *evolution_loop, struct loop *inner_loop, - tree chrec, bool *fold_conversions, int size_expr) -{ - tree res; - tree index = TREE_OPERAND (chrec, 1); - tree op1 = instantiate_scev_r (instantiate_below, evolution_loop, - inner_loop, index, - fold_conversions, size_expr); - - if (op1 == chrec_dont_know) - return chrec_dont_know; - - if (chrec && op1 == index) - return chrec; - - res = unshare_expr (chrec); - TREE_OPERAND (res, 1) = op1; - return res; -} - -/* Analyze all the parameters of the chrec, between INSTANTIATE_BELOW - and EVOLUTION_LOOP, that were left under a symbolic form. - "CHREC" that stands for a convert expression "(TYPE) OP" is to be instantiated. 
@@ -2592,7 +2605,7 @@ instantiate_array_ref (basic_block insta
    instantiated, and to stop if it exceeds some limit.  */

 static tree
-instantiate_scev_convert (basic_block instantiate_below,
+instantiate_scev_convert (edge instantiate_below,
                           struct loop *evolution_loop, struct loop *inner_loop,
                           tree chrec, tree type, tree op,
                           bool *fold_conversions, int size_expr)
@@ -2643,7 +2656,7 @@ instantiate_scev_convert (basic_block in
    instantiated, and to stop if it exceeds some limit.  */

 static tree
-instantiate_scev_not (basic_block instantiate_below,
+instantiate_scev_not (edge instantiate_below,
                       struct loop *evolution_loop, struct loop *inner_loop,
                       tree chrec,
                       enum tree_code code, tree type, tree op,
@@ -2681,130 +2694,6 @@ instantiate_scev_not (basic_block instan
 /* Analyze all the parameters of the chrec, between INSTANTIATE_BELOW
    and EVOLUTION_LOOP, that were left under a symbolic form.

-   CHREC is an expression with 3 operands to be instantiated.
-
-   CACHE is the cache of already instantiated values.
-
-   Variable pointed by FOLD_CONVERSIONS is set to TRUE when the
-   conversions that may wrap in signed/pointer type are folded, as long
-   as the value of the chrec is preserved.  If FOLD_CONVERSIONS is NULL
-   then we don't do such fold.
-
-   SIZE_EXPR is used for computing the size of the expression to be
-   instantiated, and to stop if it exceeds some limit.  */
-
-static tree
-instantiate_scev_3 (basic_block instantiate_below,
-                    struct loop *evolution_loop, struct loop *inner_loop,
-                    tree chrec,
-                    bool *fold_conversions, int size_expr)
-{
-  tree op1, op2;
-  tree op0 = instantiate_scev_r (instantiate_below, evolution_loop,
-                                 inner_loop, TREE_OPERAND (chrec, 0),
-                                 fold_conversions, size_expr);
-  if (op0 == chrec_dont_know)
-    return chrec_dont_know;
-
-  op1 = instantiate_scev_r (instantiate_below, evolution_loop,
-                            inner_loop, TREE_OPERAND (chrec, 1),
-                            fold_conversions, size_expr);
-  if (op1 == chrec_dont_know)
-    return chrec_dont_know;
-
-  op2 = instantiate_scev_r (instantiate_below, evolution_loop,
-                            inner_loop, TREE_OPERAND (chrec, 2),
-                            fold_conversions, size_expr);
-  if (op2 == chrec_dont_know)
-    return chrec_dont_know;
-
-  if (op0 == TREE_OPERAND (chrec, 0)
-      && op1 == TREE_OPERAND (chrec, 1)
-      && op2 == TREE_OPERAND (chrec, 2))
-    return chrec;
-
-  return fold_build3 (TREE_CODE (chrec),
-                      TREE_TYPE (chrec), op0, op1, op2);
-}
-
-/* Analyze all the parameters of the chrec, between INSTANTIATE_BELOW
-   and EVOLUTION_LOOP, that were left under a symbolic form.
-
-   CHREC is an expression with 2 operands to be instantiated.
-
-   CACHE is the cache of already instantiated values.
-
-   Variable pointed by FOLD_CONVERSIONS is set to TRUE when the
-   conversions that may wrap in signed/pointer type are folded, as long
-   as the value of the chrec is preserved.  If FOLD_CONVERSIONS is NULL
-   then we don't do such fold.
-
-   SIZE_EXPR is used for computing the size of the expression to be
-   instantiated, and to stop if it exceeds some limit.  */
-
-static tree
-instantiate_scev_2 (basic_block instantiate_below,
-                    struct loop *evolution_loop, struct loop *inner_loop,
-                    tree chrec,
-                    bool *fold_conversions, int size_expr)
-{
-  tree op1;
-  tree op0 = instantiate_scev_r (instantiate_below, evolution_loop,
-                                 inner_loop, TREE_OPERAND (chrec, 0),
-                                 fold_conversions, size_expr);
-  if (op0 == chrec_dont_know)
-    return chrec_dont_know;
-
-  op1 = instantiate_scev_r (instantiate_below, evolution_loop,
-                            inner_loop, TREE_OPERAND (chrec, 1),
-                            fold_conversions, size_expr);
-  if (op1 == chrec_dont_know)
-    return chrec_dont_know;
-
-  if (op0 == TREE_OPERAND (chrec, 0)
-      && op1 == TREE_OPERAND (chrec, 1))
-    return chrec;
-
-  return fold_build2 (TREE_CODE (chrec), TREE_TYPE (chrec), op0, op1);
-}
-
-/* Analyze all the parameters of the chrec, between INSTANTIATE_BELOW
-   and EVOLUTION_LOOP, that were left under a symbolic form.
-
-   CHREC is an expression with 2 operands to be instantiated.
-
-   CACHE is the cache of already instantiated values.
-
-   Variable pointed by FOLD_CONVERSIONS is set to TRUE when the
-   conversions that may wrap in signed/pointer type are folded, as long
-   as the value of the chrec is preserved.  If FOLD_CONVERSIONS is NULL
-   then we don't do such fold.
-
-   SIZE_EXPR is used for computing the size of the expression to be
-   instantiated, and to stop if it exceeds some limit.  */
-
-static tree
-instantiate_scev_1 (basic_block instantiate_below,
-                    struct loop *evolution_loop, struct loop *inner_loop,
-                    tree chrec,
-                    bool *fold_conversions, int size_expr)
-{
-  tree op0 = instantiate_scev_r (instantiate_below, evolution_loop,
-                                 inner_loop, TREE_OPERAND (chrec, 0),
-                                 fold_conversions, size_expr);
-
-  if (op0 == chrec_dont_know)
-    return chrec_dont_know;
-
-  if (op0 == TREE_OPERAND (chrec, 0))
-    return chrec;
-
-  return fold_build1 (TREE_CODE (chrec), TREE_TYPE (chrec), op0);
-}
-
-/* Analyze all the parameters of the chrec, between INSTANTIATE_BELOW
-   and EVOLUTION_LOOP, that were left under a symbolic form.
-
    CHREC is the scalar evolution to instantiate.

    CACHE is the cache of already instantiated values.
@@ -2818,7 +2707,7 @@ instantiate_scev_1 (basic_block instanti
    instantiated, and to stop if it exceeds some limit.  */

 static tree
-instantiate_scev_r (basic_block instantiate_below,
+instantiate_scev_r (edge instantiate_below,
                     struct loop *evolution_loop, struct loop *inner_loop,
                     tree chrec,
                     bool *fold_conversions, int size_expr)
@@ -2870,50 +2759,20 @@ instantiate_scev_r (basic_block instanti
                                    fold_conversions, size_expr);

     case ADDR_EXPR:
+      if (is_gimple_min_invariant (chrec))
+        return chrec;
+      /* Fallthru.  */
     case SCEV_NOT_KNOWN:
       return chrec_dont_know;

     case SCEV_KNOWN:
       return chrec_known;

-    case ARRAY_REF:
-      return instantiate_array_ref (instantiate_below, evolution_loop,
-                                    inner_loop, chrec,
-                                    fold_conversions, size_expr);
-
-    default:
-      break;
-    }
-
-  if (VL_EXP_CLASS_P (chrec))
-    return chrec_dont_know;
-
-  switch (TREE_CODE_LENGTH (TREE_CODE (chrec)))
-    {
-    case 3:
-      return instantiate_scev_3 (instantiate_below, evolution_loop,
-                                 inner_loop, chrec,
-                                 fold_conversions, size_expr);
-
-    case 2:
-      return instantiate_scev_2 (instantiate_below, evolution_loop,
-                                 inner_loop, chrec,
-                                 fold_conversions, size_expr);
-
-    case 1:
-      return instantiate_scev_1 (instantiate_below, evolution_loop,
-                                 inner_loop, chrec,
-                                 fold_conversions, size_expr);
-
-    case 0:
-      return chrec;
-
     default:
-      break;
+      if (CONSTANT_CLASS_P (chrec))
+        return chrec;
+      return chrec_dont_know;
     }
-
-  /* Too complicated to handle.  */
-  return chrec_dont_know;
 }

 /* Analyze all the parameters of the chrec that were left under a
@@ -2923,7 +2782,7 @@ instantiate_scev_r (basic_block instanti
    a function parameter.  */

 tree
-instantiate_scev (basic_block instantiate_below, struct loop *evolution_loop,
+instantiate_scev (edge instantiate_below, struct loop *evolution_loop,
                   tree chrec)
 {
   tree res;
@@ -2931,8 +2790,10 @@ instantiate_scev (basic_block instantiat
   if (dump_file && (dump_flags & TDF_SCEV))
     {
       fprintf (dump_file, "(instantiate_scev \n");
-      fprintf (dump_file, "  (instantiate_below = %d)\n", instantiate_below->index);
-      fprintf (dump_file, "  (evolution_loop = %d)\n", evolution_loop->num);
+      fprintf (dump_file, "  (instantiate_below = %d -> %d)\n",
+               instantiate_below->src->index, instantiate_below->dest->index);
+      if (evolution_loop)
+        fprintf (dump_file, "  (evolution_loop = %d)\n", evolution_loop->num);
       fprintf (dump_file, "  (chrec = ");
       print_generic_expr (dump_file, chrec);
       fprintf (dump_file, ")\n");
@@ -2980,7 +2841,7 @@ resolve_mixers (struct loop *loop, tree
       destr = true;
     }

-  tree ret = instantiate_scev_r (block_before_loop (loop), loop, NULL,
+  tree ret = instantiate_scev_r (loop_preheader_edge (loop), loop, NULL,
                                  chrec, &fold_conversions, 0);

   if (folded_casts && !*folded_casts)
Index: gcc/tree-scalar-evolution.h
===================================================================
--- gcc/tree-scalar-evolution.h (revision 253645)
+++ gcc/tree-scalar-evolution.h (working copy)
@@ -30,7 +30,7 @@ extern void scev_reset (void);
 extern void scev_reset_htab (void);
 extern void scev_finalize (void);
 extern tree analyze_scalar_evolution (struct loop *, tree);
-extern tree instantiate_scev (basic_block, struct loop *, tree);
+extern tree instantiate_scev (edge, struct loop *, tree);
 extern tree resolve_mixers (struct loop *, tree, bool *);
 extern void gather_stats_on_scev_database (void);
 extern void final_value_replacement_loop (struct loop *);
@@ -60,7 +60,7 @@ block_before_loop (loop_p loop)
 static inline tree
 instantiate_parameters (struct loop *loop, tree chrec)
 {
-  return instantiate_scev (block_before_loop (loop), loop, chrec);
+  return instantiate_scev (loop_preheader_edge (loop), loop, chrec);
 }

 /* Returns the loop of the polynomial chrec CHREC.  */
Index: gcc/tree-ssa-loop-prefetch.c
===================================================================
--- gcc/tree-ssa-loop-prefetch.c (revision 253645)
+++ gcc/tree-ssa-loop-prefetch.c (working copy)
@@ -1632,7 +1632,8 @@ determine_loop_nest_reuse (struct loop *
   for (gr = refs; gr; gr = gr->next)
     for (ref = gr->refs; ref; ref = ref->next)
       {
-        dr = create_data_ref (nest, loop_containing_stmt (ref->stmt),
+        dr = create_data_ref (loop_preheader_edge (nest),
+                              loop_containing_stmt (ref->stmt),
                               ref->mem, ref->stmt, !ref->write_p, false);

         if (dr)