From patchwork Thu Feb 22 12:08:50 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tom de Vries X-Patchwork-Id: 876632 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-473704-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="mxJ+vJio"; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3znCnf5Gwcz9s5R for ; Thu, 22 Feb 2018 23:09:08 +1100 (AEDT) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:to:cc :from:subject:message-id:date:mime-version:content-type; q=dns; s=default; b=pgUH+eSXwYnLx20Q1hZUzwkemn8J6m7VJCMdXRCgiGwevO6VDy hmkJKGN+FjahRJSj6lM9gL3YA0YbNwrNZ0OVs0F4fw711NvwSyKyx4RDL15jMflB 1ub5fZbAploJO4BgkqKk1mtXTPjPilUbmFJsLqX6iF7j4y6tmDUiLYg0E= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:to:cc :from:subject:message-id:date:mime-version:content-type; s= default; bh=99dZIUi2j7Pjyw6FQK9vPw2Vka8=; b=mxJ+vJior+KQVxyZ5uDF H71ZtFJKFaVC1JDOYu+QtddwwzJSehCq6pmzHxPmWr79j0WGtfaFxmaEI2B0m2uh 4/wm7pCFXzVY8dcENYDUienoD5m8KTOpW8N8rw0NLQi3Ssv1jkR6/yfGMPOYdxHO d2U7HVtoYr+vDnHmD3V6uJg= Received: (qmail 118952 invoked by alias); 22 Feb 2018 12:09:00 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 118942 invoked by uid 89); 22 Feb 2018 12:09:00 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-24.8 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, RCVD_IN_DNSWL_NONE, SPF_PASS, URIBL_RED autolearn=ham version=3.3.2 spammy= X-HELO: relay1.mentorg.com Received: from relay1.mentorg.com (HELO relay1.mentorg.com) (192.94.38.131) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 22 Feb 2018 12:08:58 +0000 Received: from nat-ies.mentorg.com ([192.94.31.2] helo=SVR-IES-MBX-04.mgc.mentorg.com) by relay1.mentorg.com with esmtps (TLSv1.2:ECDHE-RSA-AES256-SHA384:256) id 1eopgF-0003UZ-ID from Tom_deVries@mentor.com ; Thu, 22 Feb 2018 04:08:55 -0800 Received: from [172.30.72.140] (137.202.0.87) by SVR-IES-MBX-04.mgc.mentorg.com (139.181.222.4) with Microsoft SMTP Server (TLS) id 15.0.1320.4; Thu, 22 Feb 2018 12:08:51 +0000 To: Richard Biener CC: GCC Patches From: Tom de Vries Subject: [parloops, PR83126], Use cached affine_ivs canonicalize_loop_ivs Message-ID: <72456add-0339-35cf-c9d6-6aade4af6050@mentor.com> Date: Thu, 22 Feb 2018 13:08:50 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0 MIME-Version: 1.0 X-ClientProxiedBy: svr-ies-mbx-02.mgc.mentorg.com (139.181.222.2) To SVR-IES-MBX-04.mgc.mentorg.com (139.181.222.4) Hi, this patch fixes an ICE in the parloops pass. The ICE (when compiling the test-case in attached patch) follows from the fact that here in gen_parallel_loop the call to canonicalize_loop_ivs fails to "base all the induction variables in LOOP on a single control one": ... /* Base all the induction variables in LOOP on a single control one.*/ canonicalize_loop_ivs (loop, &nit, true); ... This is caused by the fact that for one of the induction variables, simple_iv no longer returns true (as was the case at the start of gen_parallel_loop). This seems to be triggered by the fact that during loop_version we call scev_reset (although I'm not sure why when recalculating scev info we're not reaching the same conclusion as before). The patch fixes the ICE by caching the affine_ivs as calculated by simple_iv before calling loop_version, and reusing those in canonicalize_loop_ivs (instead of calling simple_iv again). Bootstrapped and reg-tested on x86_64. OK for stage1? I'm not sure if this is minimal/low-impact enough for stage4. If not, I can rework the bit of the patch that adds an assert after canonicalize_loop_ivs into a patch that aborts the optimization instead of ICE-ing. Thanks, - Tom [parloops] Use cached affine_ivs canonicalize_loop_ivs 2018-02-22 Tom de Vries PR tree-optimization/83126 * tree-ssa-loop-manip.c (rewrite_phi_with_iv): Add and handle piv parameter. (rewrite_all_phi_nodes_with_iv, canonicalize_loop_ivs): Add and handle simple_ivs parameter. * tree-ssa-loop-manip.h (canonicalize_loop_ivs): Declare simple_ivs parameter and default to NULL. * tree-parloops.c (num_phis): New function. (gen_parallel_loop): Add simple_ivs argument to canonicalize_loop_ivs call. --- gcc/testsuite/gcc.dg/graphite/pr83126.c | 19 +++++++++++++++ gcc/tree-parloops.c | 43 +++++++++++++++++++++++++++++++-- gcc/tree-ssa-loop-manip.c | 41 ++++++++++++++++++++++++------- gcc/tree-ssa-loop-manip.h | 6 ++++- 4 files changed, 97 insertions(+), 12 deletions(-) diff --git a/gcc/testsuite/gcc.dg/graphite/pr83126.c b/gcc/testsuite/gcc.dg/graphite/pr83126.c new file mode 100644 index 0000000..663d059 --- /dev/null +++ b/gcc/testsuite/gcc.dg/graphite/pr83126.c @@ -0,0 +1,19 @@ +/* { dg-additional-options "-w -ftree-parallelize-loops=2 -floop-parallelize-all -O1" } */ + +void +ew (unsigned short int c9, int stuff) +{ + int e1; + + for (;;) + { + unsigned int *by = &e1; + int *fd = &stuff; + + *fd = c9; + fd = *fd; + if (*fd != 0) + for (*by = 0; *by < 2; ++*by) + c9 *= e1; + } +} diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c index e44ad5e..5971680 100644 --- a/gcc/tree-parloops.c +++ b/gcc/tree-parloops.c @@ -2235,6 +2235,25 @@ create_parallel_loop (struct loop *loop, tree loop_fn, tree data, calculate_dominance_info (CDI_DOMINATORS); } +/* Return number of phis in bb. If COUNT_VIRTUAL_P is false, don't count the + virtual phi. */ + +static unsigned int +num_phis (basic_block bb, bool count_virtual_p) +{ + unsigned int nr_phis = 0; + gphi_iterator gsi; + for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi)) + { + if (!count_virtual_p && virtual_operand_p (PHI_RESULT (gsi.phi ()))) + continue; + + nr_phis++; + } + + return nr_phis; +} + /* Generates code to execute the iterations of LOOP in N_THREADS threads in parallel, which can be 0 if that number is to be determined later. @@ -2326,6 +2345,24 @@ gen_parallel_loop (struct loop *loop, if (stmts) gsi_insert_seq_on_edge_immediate (loop_preheader_edge (loop), stmts); + /* The loop_version call below calls gimple_duplicate_loop_to_header_edge, + which calls scev_reset. So a loop header phi node that classifies as + simple_iv here, may not classify as such anymore after loop_version, which + will make canonicalize_loop_ivs fail to "base all the induction variables + in LOOP on a single control one". We fix this by storing the affine_iv + result of the simple_iv call in a map from phi node result to affine_iv, + and passing that map as an argument to canonicalize_loop_ivs. */ + std::map simple_ivs; + gphi_iterator psi; + for (psi = gsi_start_phis (loop->header); !gsi_end_p (psi); gsi_next (&psi)) + { + gphi *phi = psi.phi (); + affine_iv iv; + tree res = PHI_RESULT (phi); + if (simple_iv (loop, loop, res, &iv, true)) + simple_ivs.insert (std::pair (res, iv)); + } + if (!oacc_kernels_p) { if (loop->inner) @@ -2364,12 +2401,14 @@ gen_parallel_loop (struct loop *loop, profile_probability::unlikely (), profile_probability::likely (), profile_probability::unlikely (), true); - update_ssa (TODO_update_ssa); free_original_copy_tables (); } /* Base all the induction variables in LOOP on a single control one. */ - canonicalize_loop_ivs (loop, &nit, true); + canonicalize_loop_ivs (loop, &nit, true, &simple_ivs); + gcc_assert (num_phis (loop->header, false) + == reduction_list->elements () + 1); + update_ssa (TODO_update_ssa); /* Ensure that the exit condition is the first statement in the loop. The common case is that latch of the loop is empty (apart from the diff --git a/gcc/tree-ssa-loop-manip.c b/gcc/tree-ssa-loop-manip.c index bf425af..1054ef3 100644 --- a/gcc/tree-ssa-loop-manip.c +++ b/gcc/tree-ssa-loop-manip.c @@ -43,6 +43,7 @@ along with GCC; see the file COPYING3. If not see #include "tree-scalar-evolution.h" #include "params.h" #include "tree-inline.h" +#include /* All bitmaps for rewriting into loop-closed SSA go on this obstack, so that we can free them all at once. */ @@ -1427,13 +1428,14 @@ tree_unroll_loop (struct loop *loop, unsigned factor, } /* Rewrite the phi node at position PSI in function of the main - induction variable MAIN_IV and insert the generated code at GSI. */ + induction variable MAIN_IV and insert the generated code at GSI. PIV + optionally contains the associcated affine_iv. */ static void rewrite_phi_with_iv (loop_p loop, gphi_iterator *psi, gimple_stmt_iterator *gsi, - tree main_iv) + tree main_iv, affine_iv *piv) { affine_iv iv; gassign *stmt; @@ -1446,7 +1448,9 @@ rewrite_phi_with_iv (loop_p loop, return; } - if (!simple_iv (loop, loop, res, &iv, true)) + if (piv) + iv = *piv; + else if (!simple_iv (loop, loop, res, &iv, true)) { gsi_next (psi); return; @@ -1468,10 +1472,12 @@ rewrite_phi_with_iv (loop_p loop, } /* Rewrite all the phi nodes of LOOP in function of the main induction - variable MAIN_IV. */ + variable MAIN_IV. SIMPLE_IVS contains an optional map from phi node result + to affive_iv. */ static void -rewrite_all_phi_nodes_with_iv (loop_p loop, tree main_iv) +rewrite_all_phi_nodes_with_iv (loop_p loop, tree main_iv, + std::map *simple_ivs) { unsigned i; basic_block *bbs = get_loop_body_in_dom_order (loop); @@ -1486,7 +1492,22 @@ rewrite_all_phi_nodes_with_iv (loop_p loop, tree main_iv) continue; for (psi = gsi_start_phis (bb); !gsi_end_p (psi); ) - rewrite_phi_with_iv (loop, &psi, &gsi, main_iv); + { + affine_iv *piv = NULL; + affine_iv iv; + gphi *phi = psi.phi (); + if (simple_ivs) + { + std::map::iterator search + = simple_ivs->find (PHI_RESULT (phi)); + if (search != simple_ivs->end ()) + { + iv = search->second; + piv = &iv; + } + } + rewrite_phi_with_iv (loop, &psi, &gsi, main_iv, piv); + } } free (bbs); @@ -1498,11 +1519,13 @@ rewrite_all_phi_nodes_with_iv (loop_p loop, tree main_iv) converted to the larger type, the conversion code is inserted before the loop, and *NIT is updated to the new definition. When BUMP_IN_LATCH is true, the induction variable is incremented in the loop latch, otherwise it is - incremented in the loop header. Return the induction variable that was + incremented in the loop header. SIMPLE_IVS contains an optional map from + phi node result to affive_iv. Return the induction variable that was created. */ tree -canonicalize_loop_ivs (struct loop *loop, tree *nit, bool bump_in_latch) +canonicalize_loop_ivs (struct loop *loop, tree *nit, bool bump_in_latch, + std::map *simple_ivs) { unsigned precision = TYPE_PRECISION (TREE_TYPE (*nit)); unsigned original_precision = precision; @@ -1557,7 +1580,7 @@ canonicalize_loop_ivs (struct loop *loop, tree *nit, bool bump_in_latch) create_iv (build_int_cst_type (type, 0), build_int_cst (type, 1), NULL_TREE, loop, &gsi, bump_in_latch, &var_before, NULL); - rewrite_all_phi_nodes_with_iv (loop, var_before); + rewrite_all_phi_nodes_with_iv (loop, var_before, simple_ivs); stmt = as_a (last_stmt (exit->src)); /* Make the loop exit if the control condition is not satisfied. */ diff --git a/gcc/tree-ssa-loop-manip.h b/gcc/tree-ssa-loop-manip.h index 390ac6f..a96ccb7 100644 --- a/gcc/tree-ssa-loop-manip.h +++ b/gcc/tree-ssa-loop-manip.h @@ -20,6 +20,9 @@ along with GCC; see the file COPYING3. If not see #ifndef GCC_TREE_SSA_LOOP_MANIP_H #define GCC_TREE_SSA_LOOP_MANIP_H +#include +#include "tree-ssa-loop.h" + typedef void (*transform_callback)(struct loop *, void *); extern void create_iv (tree, tree, tree, struct loop *, gimple_stmt_iterator *, @@ -54,7 +57,8 @@ extern void tree_transform_and_unroll_loop (struct loop *, unsigned, transform_callback, void *); extern void tree_unroll_loop (struct loop *, unsigned, edge, struct tree_niter_desc *); -extern tree canonicalize_loop_ivs (struct loop *, tree *, bool); +extern tree canonicalize_loop_ivs (struct loop *, tree *, bool, + std::map *simple_ivs = NULL);