From patchwork Fri Sep 22 13:03:10 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Biener X-Patchwork-Id: 817498 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-462777-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="dV/7k01s"; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3xzDFD2Cn5z9s9Y for ; Fri, 22 Sep 2017 23:03:40 +1000 (AEST) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:subject:message-id:mime-version:content-type; q=dns; s= default; b=uduE5ze6YUzTk4qQO8kPb2KDSVh3OWWOWSbyDDRbrb+FlSxWcf1b9 hy/qzZIjgQSkEoVz3WyFIagHInkm7ZIDzn3Gjq9YGyRq1wY7cwcafotTWNECQRof I+LS4MhJMZoh/U535/OGa5cAW21Nh3QuOzNR58yVaZwbrlz5WZFDlU= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:subject:message-id:mime-version:content-type; s= default; bh=L1/xH1EK2J0jPwkfcvbSYus5LBc=; b=dV/7k01sGr6V4Qo50F1Q gsD+cBqHpsPXbSGzB93XDmv7+PkT/DoEd4Rkmy77ZiXYNBqbrXiDJyCMbHCxjzOy vN2Ch9lNtwnwuW50eEvLeZhE/tecTwFRhVEtfjK10Z7qsRz3InbK6gZplFiZF0TS enOmFbhxZ32dQMXUVdGmkMU= Received: (qmail 94683 invoked by alias); 22 Sep 2017 13:03:16 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 94633 invoked by uid 89); 22 Sep 2017 13:03:16 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-11.1 required=5.0 tests=BAYES_00, GIT_PATCH_2, GIT_PATCH_3, KAM_ASCII_DIVIDERS, RP_MATCHES_RCVD, SPF_PASS autolearn=ham version=3.3.2 spammy= X-HELO: mx1.suse.de Received: from mx2.suse.de (HELO mx1.suse.de) (195.135.220.15) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 22 Sep 2017 13:03:13 +0000 Received: from relay2.suse.de (charybdis-ext.suse.de [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 7F1B0AC45 for ; Fri, 22 Sep 2017 13:03:10 +0000 (UTC) Date: Fri, 22 Sep 2017 15:03:10 +0200 (CEST) From: Richard Biener To: gcc-patches@gcc.gnu.org Subject: [PATCH][GRAPHITE] More TLC Message-ID: User-Agent: Alpine 2.20 (LSU 67 2015-01-07) MIME-Version: 1.0 This simplifies canonicalize_loop_closed_ssa and does other minimal TLC. It also adds a testcase I reduced from a stupid mistake I made when reworking canonicalize_loop_closed_ssa. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk. SPEC CPU 2006 is happy with it, current statistics on x86_64 with -Ofast -march=haswell -floop-nest-optimize are 61 loop nests "optimized" 45 loop nest transforms cancelled because of code generation issues 21 loop nest optimizations timed out the 350000 ISL "operations" we allow I say "optimized" because the usual transform I've seen is static tiling as enforced by GRAPHITE according to --param loop-block-tile-size. There's no way to automagically figure what kind of transform ISL did (usually none with the schedule identical check confused by FILTER stuff positioning). This is also the issue with most GRAPHITE testcases. We can't really verify easily whether we performed loop interchange or not. We can probably tell whether we applied loop fusion or splitting (by counting loops). I'm not aware of any remaining ICEs / wrong-code issues with GRAPHITE. I'm aware that the current "black-box" granularity hinders scheduling freedom (each GIMPLE BB is mapped to a ISL stmt, this is too coarse to schedule say two writes in a BB independently from each other). Quick experiments could be done by simply splitting gimple BBs at some points. I'm aware that the SCOP detection algorithm assumes that it can walk loop->next and find loops "in order" -- but while that's true for the initial flow_loops_find result (DFS walk) it isn't true for any later created / discovered loops. Sorting of loop siblings in DFS order should be easy (and a general cfgloopanal helper). Richard. 2017-09-22 Richard Biener * graphite-isl-ast-to-gimple.c (graphite_verify): Inline into single caller. (graphite_regenerate_ast_isl): Do not reset SCEV. Move debug print of no dependency loops ... * graphite.c (graphite_transform_loops): ... here. (canonicalize_loop_closed_ssa_form): Work from inner to outer loops. (same_close_phi_node, remove_duplicate_close_phi, make_close_phi_nodes_unique, defined_in_loop_p): Fold into ... (canonicalize_loop_closed_ssa): ... here and simplify. * graphite-optimize-isl.c: Include tree-vectorizer.h. (optimize_isl): Use dump_printf_loc to tell when we stopped optimizing because of an ISL timeout. * gcc.dg/graphite/scop-24.c: New testcase. Index: gcc/graphite-isl-ast-to-gimple.c =================================================================== --- gcc/graphite-isl-ast-to-gimple.c (revision 253091) +++ gcc/graphite-isl-ast-to-gimple.c (working copy) @@ -73,15 +73,6 @@ struct ast_build_info bool is_parallelizable; }; -/* Verifies properties that GRAPHITE should maintain during translation. */ - -static inline void -graphite_verify (void) -{ - checking_verify_loop_structure (); - checking_verify_loop_closed_ssa (true); -} - /* IVS_PARAMS maps isl's scattering and parameter identifiers to corresponding trees. */ @@ -2997,8 +2988,9 @@ graphite_regenerate_ast_isl (scop_p scop delete_loop (loop); } - graphite_verify (); - scev_reset (); + /* Verifies properties that GRAPHITE should maintain during translation. */ + checking_verify_loop_structure (); + checking_verify_loop_closed_ssa (true); free (if_region->true_region); free (if_region->region); @@ -3008,19 +3000,6 @@ graphite_regenerate_ast_isl (scop_p scop isl_ast_node_free (root_node); timevar_pop (TV_GRAPHITE_CODE_GEN); - if (dump_file && (dump_flags & TDF_DETAILS)) - { - loop_p loop; - int num_no_dependency = 0; - - FOR_EACH_LOOP (loop, 0) - if (loop->can_be_parallel) - num_no_dependency++; - - fprintf (dump_file, "%d loops carried no dependency.\n", - num_no_dependency); - } - return !t.codegen_error_p (); } Index: gcc/graphite-optimize-isl.c =================================================================== --- gcc/graphite-optimize-isl.c (revision 253091) +++ gcc/graphite-optimize-isl.c (working copy) @@ -37,6 +37,7 @@ along with GCC; see the file COPYING3. #include "tree-data-ref.h" #include "params.h" #include "dumpfile.h" +#include "tree-vectorizer.h" #include "graphite.h" @@ -156,9 +157,12 @@ optimize_isl (scop_p scop) if (!scop->transformed_schedule || isl_ctx_last_error (scop->isl_context) == isl_error_quota) { - if (dump_file && dump_flags) - fprintf (dump_file, "isl timed out --param max-isl-operations=%d\n", - max_operations); + location_t loc = find_loop_location + (scop->scop_info->region.entry->dest->loop_father); + dump_printf_loc (MSG_MISSED_OPTIMIZATION, loc, + "loop nest not optimized, optimization timed out " + "after %d operations [--param max-isl-operations]\n", + max_operations); return false; } Index: gcc/graphite.c =================================================================== --- gcc/graphite.c (revision 253091) +++ gcc/graphite.c (working copy) @@ -293,86 +293,6 @@ free_scops (vec scops) scops.release (); } -/* Returns true when P1 and P2 are close phis with the same - argument. */ - -static inline bool -same_close_phi_node (gphi *p1, gphi *p2) -{ - return (types_compatible_p (TREE_TYPE (gimple_phi_result (p1)), - TREE_TYPE (gimple_phi_result (p2))) - && operand_equal_p (gimple_phi_arg_def (p1, 0), - gimple_phi_arg_def (p2, 0), 0)); -} - -static void make_close_phi_nodes_unique (basic_block bb); - -/* Remove the close phi node at GSI and replace its rhs with the rhs - of PHI. */ - -static void -remove_duplicate_close_phi (gphi *phi, gphi_iterator *gsi) -{ - gimple *use_stmt; - use_operand_p use_p; - imm_use_iterator imm_iter; - tree res = gimple_phi_result (phi); - tree def = gimple_phi_result (gsi->phi ()); - - gcc_assert (same_close_phi_node (phi, gsi->phi ())); - - FOR_EACH_IMM_USE_STMT (use_stmt, imm_iter, def) - { - FOR_EACH_IMM_USE_ON_STMT (use_p, imm_iter) - SET_USE (use_p, res); - - update_stmt (use_stmt); - - /* It is possible that we just created a duplicate close-phi - for an already-processed containing loop. Check for this - case and clean it up. */ - if (gimple_code (use_stmt) == GIMPLE_PHI - && gimple_phi_num_args (use_stmt) == 1) - make_close_phi_nodes_unique (gimple_bb (use_stmt)); - } - - remove_phi_node (gsi, true); -} - -/* Removes all the close phi duplicates from BB. */ - -static void -make_close_phi_nodes_unique (basic_block bb) -{ - gphi_iterator psi; - - for (psi = gsi_start_phis (bb); !gsi_end_p (psi); gsi_next (&psi)) - { - gphi_iterator gsi = psi; - gphi *phi = psi.phi (); - - /* At this point, PHI should be a close phi in normal form. */ - gcc_assert (gimple_phi_num_args (phi) == 1); - - /* Iterate over the next phis and remove duplicates. */ - gsi_next (&gsi); - while (!gsi_end_p (gsi)) - if (same_close_phi_node (phi, gsi.phi ())) - remove_duplicate_close_phi (phi, &gsi); - else - gsi_next (&gsi); - } -} - -/* Return true when NAME is defined in LOOP. */ - -static bool -defined_in_loop_p (tree name, loop_p loop) -{ - gcc_assert (TREE_CODE (name) == SSA_NAME); - return loop == loop_containing_stmt (SSA_NAME_DEF_STMT (name)); -} - /* Transforms LOOP to the canonical loop closed SSA form. */ static void @@ -380,20 +300,22 @@ canonicalize_loop_closed_ssa (loop_p loo { edge e = single_exit (loop); basic_block bb; + gphi_iterator psi; if (!e || (e->flags & EDGE_COMPLEX)) return; bb = e->dest; + /* Make the loop-close PHI node BB contain only PHIs and have a + single predecessor. */ if (single_pred_p (bb)) { e = split_block_after_labels (bb); - make_close_phi_nodes_unique (e->src); + bb = e->src; } else { - gphi_iterator psi; basic_block close = split_edge (e); e = single_succ_edge (close); for (psi = gsi_start_phis (bb); !gsi_end_p (psi); gsi_next (&psi)) @@ -404,7 +326,7 @@ canonicalize_loop_closed_ssa (loop_p loo /* Only add close phi nodes for SSA_NAMEs defined in LOOP. */ if (TREE_CODE (arg) != SSA_NAME - || !defined_in_loop_p (arg, loop)) + || loop_containing_stmt (SSA_NAME_DEF_STMT (arg)) != loop) continue; tree res = copy_ssa_name (arg); @@ -413,8 +335,30 @@ canonicalize_loop_closed_ssa (loop_p loo UNKNOWN_LOCATION); SET_USE (use_p, res); } + bb = close; + } - make_close_phi_nodes_unique (close); + /* Eliminate duplicates. This relies on processing loops from + innermost to outer. */ + for (psi = gsi_start_phis (bb); !gsi_end_p (psi); gsi_next (&psi)) + { + gphi_iterator gsi = psi; + gphi *phi = psi.phi (); + + /* At this point, PHI should be a close phi in normal form. */ + gcc_assert (gimple_phi_num_args (phi) == 1); + + /* Iterate over the next phis and remove duplicates. */ + gsi_next (&gsi); + while (!gsi_end_p (gsi)) + if (gimple_phi_arg_def (phi, 0) == gimple_phi_arg_def (gsi.phi (), 0)) + { + replace_uses_by (gimple_phi_result (gsi.phi ()), + gimple_phi_result (phi)); + remove_phi_node (&gsi, true); + } + else + gsi_next (&gsi); } } @@ -443,7 +387,7 @@ static void canonicalize_loop_closed_ssa_form (void) { loop_p loop; - FOR_EACH_LOOP (loop, 0) + FOR_EACH_LOOP (loop, LI_FROM_INNERMOST) canonicalize_loop_closed_ssa (loop); checking_verify_loop_closed_ssa (true); @@ -509,6 +453,19 @@ graphite_transform_loops (void) "loop nest optimized\n"); } + if (dump_file && (dump_flags & TDF_DETAILS)) + { + loop_p loop; + int num_no_dependency = 0; + + FOR_EACH_LOOP (loop, 0) + if (loop->can_be_parallel) + num_no_dependency++; + + fprintf (dump_file, "%d loops carried no dependency.\n", + num_no_dependency); + } + free_scops (scops); graphite_finalize (need_cfg_cleanup_p); the_isl_ctx = NULL; Index: gcc/testsuite/gcc.dg/graphite/scop-24.c =================================================================== --- gcc/testsuite/gcc.dg/graphite/scop-24.c (nonexistent) +++ gcc/testsuite/gcc.dg/graphite/scop-24.c (working copy) @@ -0,0 +1,29 @@ +/* { dg-do compile } */ +/* { dg-options "-Ofast -floop-nest-optimize" } */ + +typedef struct _IO_FILE FILE; +extern struct _IO_FILE *stderr; +typedef float real; +typedef real rvec[3]; +int rgbset (int); +void ps_box (int, int); +void plot_phi(char *fn,rvec box,int natoms,rvec x[],real phi[]) +{ + real phi_max,rr,gg,bb,fac,dx,x0,y0; + int i; + for(i=0; (i __builtin_fabs(phi[i])) + ? phi_max : __builtin_fabs(phi[i])); + if (__builtin_fabs(phi_max)<1.2e-38) + __builtin_fprintf(stderr, "X"); + ps_box((real)(fac*box[0]-1),(real)(fac*box[1]-1)); + for(i=0; (i