From patchwork Fri Mar 27 05:36:46 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jan Hubicka X-Patchwork-Id: 455316 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 8C6E01400A0 for ; Fri, 27 Mar 2015 16:37:58 +1100 (AEDT) Authentication-Results: ozlabs.org; dkim=pass reason="1024-bit key; unprotected key" header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=u8/tfeNC; dkim-adsp=none (unprotected policy); dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:cc:subject:message-id:references:mime-version :content-type:in-reply-to; q=dns; s=default; b=FoY0F1arjv8yXf/cd 0HICbTzi5HfsHZ6tRZY4p6NAEKKBFv3/XQ3P5n7OvC7WNL5Tm5tkblevuHrH+Pj8 V2HYF2G1ZmbBS40K/C1vokn2pvkfTTIDUqx+cfadr3tAXpliJQqAW65iHaE06jVk uG4mazBGOwq5Y/xi0osHJw5CZA= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:cc:subject:message-id:references:mime-version :content-type:in-reply-to; s=default; bh=u1bFCP8+DsCsGwPTKyK2x7k t3Xk=; b=u8/tfeNCzzaaOsd+LrvXL2UXb+l+F5BvTluZq+ACipXU88CtqzEKDkq TCIeOkv3e0ii5l4caZ9RpqqVBUgYyiRhzd/+dw4FOoLYrWr20nMsVYd9VGtWRzFG DBPpC/67W87MxJSgPiiHAeajMI19qb8m9zsqs1KNwpT5JFKqOfK4= Received: (qmail 86036 invoked by alias); 27 Mar 2015 05:37:42 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 85480 invoked by uid 89); 27 Mar 2015 05:36:54 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.0 required=5.0 tests=AWL, BAYES_00, T_RP_MATCHES_RCVD autolearn=ham version=3.3.2 X-Spam-User: qpsmtpd, 2 recipients X-HELO: nikam.ms.mff.cuni.cz Received: from nikam.ms.mff.cuni.cz (HELO nikam.ms.mff.cuni.cz) (195.113.20.16) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-GCM-SHA384 encrypted) ESMTPS; Fri, 27 Mar 2015 05:36:51 +0000 Received: by nikam.ms.mff.cuni.cz (Postfix, from userid 16202) id 1EFB3546A0B; Fri, 27 Mar 2015 06:36:46 +0100 (CET) Date: Fri, 27 Mar 2015 06:36:46 +0100 From: Jan Hubicka To: Jan Hubicka Cc: Richard Biener , gcc-patches@gcc.gnu.org, manu@gcc.gnu.org Subject: Re: Optimize lto location stremaing Message-ID: <20150327053646.GA5017@kam.mff.cuni.cz> References: <20150325225448.GC85969@kam.mff.cuni.cz> <20150326143748.GF8802@atrey.karlin.mff.cuni.cz> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20150326143748.GF8802@atrey.karlin.mff.cuni.cz> User-Agent: Mutt/1.5.21 (2010-09-15) Hello, here is updated patch I intend to commit after bootstrap/regtest on x86_64-linux and some additional testing on Chromium/libreoffice (it works on Firefox). I turned the cache into class. I tried to avoid global variable, but ended up with the pointer to current cache because the use in ipa-devirt is bit deep in the code. We can cleanup it next stage1. Otherwise I think it is stil win to not polute global vars more and not dangle the location cache vector. Honza * lto-streamer.h (class lto_location_cache): New. (struct data_in): Add location_cache. (lto_input_location): Update prototype. (stream_input_location_now): New. * streamer-hooks.h (struct streamer_hooks): Make input_location to take pointer to location. (stream_input_location): Update. * ipa-devirt.c: Include streamer-hooks.h and lto-streamer.h (warn_odr): Apply location cache before warning. (lto_input_location): Update prototype. * gimple-streamer-in.c (input_phi, input_gimple_stmt): Use stream_input_location_now. * lto/lto.c (unify_scc): Revert location cache when unification suceeded. (lto_read_decls): Accept location cache after sucess; apply location cache before calling debug hooks. * lto-streamer-in.c (lto_location_cache::current_cache): New static variable. (lto_location_cache::cmp_loc): New function. (lto_location_cache::apply_location_cache): New function. (lto_location_cache::accept_location_cache): New function. (lto_location_cache::revert_location_cache): New function. (lto_location_cache::input_location): New function. (lto_input_location): Do location caching. (stream_input_location_now): New function. (input_eh_region, input_struct_function_base): Use stream_input_location_now. (lto_data_in_create): use new. (lto_data_in_delete): Use delete. * tree-streamer-in.c (unpack_ts_block_value_fields, unpack_ts_omp_clause_value_fields, streamer_read_tree_bitfields, lto_input_ts_exp_tree_pointers): Update for cached location api. Index: gimple-streamer-in.c =================================================================== --- gimple-streamer-in.c (revision 221703) +++ gimple-streamer-in.c (working copy) @@ -87,7 +87,9 @@ input_phi (struct lto_input_block *ib, b tree def = stream_read_tree (ib, data_in); int src_index = streamer_read_uhwi (ib); bitpack_d bp = streamer_read_bitpack (ib); - location_t arg_loc = stream_input_location (&bp, data_in); + /* Do not cache a location - we do not have API to get pointer to the + location in PHI statement and we may trigger reallocation. */ + location_t arg_loc = stream_input_location_now (&bp, data_in); basic_block sbb = BASIC_BLOCK_FOR_FN (fn, src_index); edge e = NULL; @@ -134,8 +136,9 @@ input_gimple_stmt (struct lto_input_bloc has_hist = bp_unpack_value (&bp, 1); stmt->subcode = bp_unpack_var_len_unsigned (&bp); - /* Read location information. */ - gimple_set_location (stmt, stream_input_location (&bp, data_in)); + /* Read location information. Caching here makes no sense until streamer + cache can handle the following gimple_set_block. */ + gimple_set_location (stmt, stream_input_location_now (&bp, data_in)); /* Read lexical block reference. */ gimple_set_block (stmt, stream_read_tree (ib, data_in)); Index: lto-streamer.h =================================================================== --- lto-streamer.h (revision 221703) +++ lto-streamer.h (working copy) @@ -307,6 +307,71 @@ typedef void (lto_free_section_data_f) ( const char *, size_t); +/* The location cache holds expanded locations for streamed in trees. + This is done to reduce memory usage of libcpp linemap that strongly preffers + locations to be inserted in the soruce order. */ + +class lto_location_cache +{ +public: + /* Apply all changes in location cache. Add locations into linemap and patch + trees. */ + bool apply_location_cache (); + /* Tree merging did not suceed; mark all changes in the cache as accepted. */ + void accept_location_cache (); + /* Tree merging did suceed; throw away recent changes. */ + void revert_location_cache (); + void input_location (location_t *loc, struct bitpack_d *bp, + struct data_in *data_in); + lto_location_cache () + : loc_cache (), accepted_length (0), current_file (NULL), current_line (0), + current_col (0), current_loc (UNKNOWN_LOCATION) + { + gcc_assert (!current_cache); + current_cache = this; + } + ~lto_location_cache () + { + apply_location_cache (); + gcc_assert (current_cache == this); + current_cache = NULL; + } + + /* There can be at most one instance of location cache (combining multiple + would bring it out of sync with libcpp linemap); point to current + one. */ + static lto_location_cache *current_cache; + +private: + static int cmp_loc (const void *pa, const void *pb); + + struct cached_location + { + const char *file; + location_t *loc; + int line, col; + }; + + /* The location cache. */ + + vec loc_cache; + + /* Accepted entries are ones used by trees that are known to be not unified + by tree merging. */ + + int accepted_length; + + /* Bookkeeping to remember state in between calls to lto_apply_location_cache + When streaming gimple, the location cache is not used and thus + lto_apply_location_cache happens per location basis. It is then + useful to avoid redundant calls of linemap API. */ + + const char *current_file; + int current_line; + int current_col; + location_t current_loc; +}; + /* Structure used as buffer for reading an LTO file. */ class lto_input_block { @@ -680,6 +745,9 @@ struct data_in /* Cache of pickled nodes. */ struct streamer_tree_cache_d *reader_cache; + + /* Cache of source code location. */ + lto_location_cache location_cache; }; @@ -788,7 +856,9 @@ extern struct data_in *lto_data_in_creat vec ); extern void lto_data_in_delete (struct data_in *); extern void lto_input_data_block (struct lto_input_block *, void *, size_t); -location_t lto_input_location (struct bitpack_d *, struct data_in *); +void lto_input_location (location_t *, struct bitpack_d *, struct data_in *); +location_t stream_input_location_now (struct bitpack_d *bp, + struct data_in *data); tree lto_input_tree_ref (struct lto_input_block *, struct data_in *, struct function *, enum LTO_tags); void lto_tag_check_set (enum LTO_tags, int, ...); Index: lto-streamer-in.c =================================================================== --- lto-streamer-in.c (revision 221703) +++ lto-streamer-in.c (working copy) @@ -172,47 +172,174 @@ canon_file_name (const char *string) } } +/* Pointer to currently alive instance of lto_location_cache. */ -/* Read a location bitpack from input block IB. */ +lto_location_cache *lto_location_cache::current_cache; -location_t -lto_input_location (struct bitpack_d *bp, struct data_in *data_in) +/* Sort locations in source order. Start with file from last application. */ + +int +lto_location_cache::cmp_loc (const void *pa, const void *pb) +{ + const cached_location *a = ((const cached_location *)pa); + const cached_location *b = ((const cached_location *)pb); + const char *current_file = current_cache->current_file; + int current_line = current_cache->current_line; + + if (a->file == current_file && b->file != current_file) + return -1; + if (a->file != current_file && b->file == current_file) + return 1; + if (a->file == current_file && b->file == current_file) + { + if (a->line == current_line && b->line != current_line) + return -1; + if (a->line != current_line && b->line == current_line) + return 1; + } + if (a->file != b->file) + return strcmp (a->file, b->file); + if (a->line != b->line) + return a->line - b->line; + return a->col - b->col; +} + +/* Apply all changes in location cache. Add locations into linemap and patch + trees. */ + +bool +lto_location_cache::apply_location_cache () +{ + static const char *prev_file; + if (!loc_cache.length ()) + return false; + if (loc_cache.length () > 1) + loc_cache.qsort (cmp_loc); + + for (unsigned int i = 0; i < loc_cache.length (); i++) + { + struct cached_location loc = loc_cache[i]; + + if (current_file != loc.file) + linemap_add (line_table, prev_file ? LC_RENAME : LC_ENTER, + false, loc.file, loc.line); + else if (current_line != loc.line) + { + int max = loc.col; + + for (unsigned int j = i + 1; j < loc_cache.length (); j++) + if (loc.file != loc_cache[j].file + || loc.line != loc_cache[j].line) + break; + else if (max < loc_cache[j].col) + max = loc_cache[j].col; + linemap_line_start (line_table, loc.line, max + 1); + } + gcc_assert (*loc.loc == BUILTINS_LOCATION + 1); + if (current_file == loc.file && current_line == loc.line + && current_col == loc.col) + *loc.loc = current_loc; + else + current_loc = *loc.loc = linemap_position_for_column (line_table, + loc.col); + current_line = loc.line; + prev_file = current_file = loc.file; + current_col = loc.col; + } + loc_cache.truncate (0); + accepted_length = 0; + return true; +} + +/* Tree merging did not suceed; mark all changes in the cache as accepted. */ + +void +lto_location_cache::accept_location_cache () { - static const char *current_file; - static int current_line; - static int current_col; + gcc_assert (current_cache == this); + accepted_length = loc_cache.length (); +} + +/* Tree merging did suceed; throw away recent changes. */ + +void +lto_location_cache::revert_location_cache () +{ + loc_cache.truncate (accepted_length); +} + +/* Read a location bitpack from input block IB and either update *LOC directly + or add it to the location cache. + It is neccesary to call apply_location_cache to get *LOC updated. */ + +void +lto_location_cache::input_location (location_t *loc, struct bitpack_d *bp, + struct data_in *data_in) +{ + static const char *stream_file; + static int stream_line; + static int stream_col; bool file_change, line_change, column_change; - bool prev_file = current_file != NULL; + + gcc_assert (current_cache == this); if (bp_unpack_value (bp, 1)) - return UNKNOWN_LOCATION; + { + *loc = UNKNOWN_LOCATION; + return; + } + *loc = BUILTINS_LOCATION + 1; file_change = bp_unpack_value (bp, 1); line_change = bp_unpack_value (bp, 1); column_change = bp_unpack_value (bp, 1); if (file_change) - current_file = canon_file_name (bp_unpack_string (data_in, bp)); + stream_file = canon_file_name (bp_unpack_string (data_in, bp)); if (line_change) - current_line = bp_unpack_var_len_unsigned (bp); + stream_line = bp_unpack_var_len_unsigned (bp); if (column_change) - current_col = bp_unpack_var_len_unsigned (bp); + stream_col = bp_unpack_var_len_unsigned (bp); - if (file_change) + /* This optimization saves location cache operations druing gimple + streaming. */ + + if (current_file == stream_file && current_line == stream_line + && current_col == stream_col) { - if (prev_file) - linemap_add (line_table, LC_LEAVE, false, NULL, 0); - - linemap_add (line_table, LC_ENTER, false, current_file, current_line); + *loc = current_loc; + return; } - else if (line_change) - linemap_line_start (line_table, current_line, current_col); - return linemap_position_for_column (line_table, current_col); + struct cached_location entry = {stream_file, loc, stream_line, stream_col}; + loc_cache.safe_push (entry); } +/* Read a location bitpack from input block IB and either update *LOC directly + or add it to the location cache. + It is neccesary to call apply_location_cache to get *LOC updated. */ + +void +lto_input_location (location_t *loc, struct bitpack_d *bp, + struct data_in *data_in) +{ + data_in->location_cache.input_location (loc, bp, data_in); +} + +/* Read location and return it instead of going through location caching. + This should be used only when the resulting location is not going to be + discarded. */ + +location_t +stream_input_location_now (struct bitpack_d *bp, struct data_in *data_in) +{ + location_t loc; + stream_input_location (&loc, bp, data_in); + data_in->location_cache.apply_location_cache (); + return loc; +} /* Read a reference to a tree node from DATA_IN using input block IB. TAG is the expected node that should be found in IB, if TAG belongs @@ -390,7 +517,7 @@ input_eh_region (struct lto_input_block r->u.must_not_throw.failure_decl = stream_read_tree (ib, data_in); bitpack_d bp = streamer_read_bitpack (ib); r->u.must_not_throw.failure_loc - = stream_input_location (&bp, data_in); + = stream_input_location_now (&bp, data_in); } break; @@ -927,8 +1054,8 @@ input_struct_function_base (struct funct fn->last_clique = bp_unpack_value (&bp, sizeof (short) * 8); /* Input the function start and end loci. */ - fn->function_start_locus = stream_input_location (&bp, data_in); - fn->function_end_locus = stream_input_location (&bp, data_in); + fn->function_start_locus = stream_input_location_now (&bp, data_in); + fn->function_end_locus = stream_input_location_now (&bp, data_in); } @@ -1126,6 +1253,7 @@ lto_read_body_or_constructor (struct lto } else input_constructor (fn_decl, data_in, &ib_main); + data_in->location_cache.apply_location_cache (); /* And fixup types we streamed locally. */ { struct streamer_tree_cache_d *cache = data_in->reader_cache; @@ -1543,7 +1671,7 @@ lto_data_in_create (struct lto_file_decl unsigned len, vec resolutions) { - struct data_in *data_in = XCNEW (struct data_in); + struct data_in *data_in = new (struct data_in); data_in->file_data = file_data; data_in->strings = strings; data_in->strings_len = len; @@ -1560,5 +1688,5 @@ lto_data_in_delete (struct data_in *data { data_in->globals_resolution.release (); streamer_tree_cache_delete (data_in->reader_cache); - free (data_in); + delete data_in; } Index: streamer-hooks.h =================================================================== --- streamer-hooks.h (revision 221703) +++ streamer-hooks.h (working copy) @@ -52,7 +52,7 @@ struct streamer_hooks { tree (*read_tree) (struct lto_input_block *, struct data_in *); /* [REQ] Called by every streaming routine that needs to read a location. */ - location_t (*input_location) (struct bitpack_d *, struct data_in *); + void (*input_location) (location_t *, struct bitpack_d *, struct data_in *); /* [REQ] Called by every streaming routine that needs to write a location. */ void (*output_location) (struct output_block *, struct bitpack_d *, location_t); @@ -67,8 +67,8 @@ struct streamer_hooks { #define stream_read_tree(IB, DATA_IN) \ streamer_hooks.read_tree (IB, DATA_IN) -#define stream_input_location(BP, DATA_IN) \ - streamer_hooks.input_location (BP, DATA_IN) +#define stream_input_location(LOCPTR, BP, DATA_IN) \ + streamer_hooks.input_location (LOCPTR, BP, DATA_IN) #define stream_output_location(OB, BP, LOC) \ streamer_hooks.output_location (OB, BP, LOC) Index: tree-streamer-in.c =================================================================== --- tree-streamer-in.c (revision 221703) +++ tree-streamer-in.c (working copy) @@ -411,7 +411,7 @@ unpack_ts_block_value_fields (struct dat { BLOCK_ABSTRACT (expr) = (unsigned) bp_unpack_value (bp, 1); /* BLOCK_NUMBER is recomputed. */ - BLOCK_SOURCE_LOCATION (expr) = stream_input_location (bp, data_in); + stream_input_location (&BLOCK_SOURCE_LOCATION (expr), bp, data_in); } /* Unpack all the non-pointer fields of the TS_TRANSLATION_UNIT_DECL @@ -433,7 +433,7 @@ static void unpack_ts_omp_clause_value_fields (struct data_in *data_in, struct bitpack_d *bp, tree expr) { - OMP_CLAUSE_LOCATION (expr) = stream_input_location (bp, data_in); + stream_input_location (&OMP_CLAUSE_LOCATION (expr), bp, data_in); switch (OMP_CLAUSE_CODE (expr)) { case OMP_CLAUSE_DEFAULT: @@ -503,7 +503,7 @@ streamer_read_tree_bitfields (struct lto unpack_ts_fixed_cst_value_fields (&bp, expr); if (CODE_CONTAINS_STRUCT (code, TS_DECL_MINIMAL)) - DECL_SOURCE_LOCATION (expr) = stream_input_location (&bp, data_in); + stream_input_location (&DECL_SOURCE_LOCATION (expr), &bp, data_in); if (CODE_CONTAINS_STRUCT (code, TS_DECL_COMMON)) unpack_ts_decl_common_value_fields (&bp, expr); @@ -522,7 +522,7 @@ streamer_read_tree_bitfields (struct lto if (CODE_CONTAINS_STRUCT (code, TS_EXP)) { - SET_EXPR_LOCATION (expr, stream_input_location (&bp, data_in)); + stream_input_location (&EXPR_CHECK (expr)->exp.locus, &bp, data_in); if (code == MEM_REF || code == TARGET_MEM_REF) { @@ -905,11 +905,20 @@ lto_input_ts_exp_tree_pointers (struct l struct data_in *data_in, tree expr) { int i; + tree block; for (i = 0; i < TREE_OPERAND_LENGTH (expr); i++) TREE_OPERAND (expr, i) = stream_read_tree (ib, data_in); - TREE_SET_BLOCK (expr, stream_read_tree (ib, data_in)); + block = stream_read_tree (ib, data_in); + + /* TODO: Block is stored in the locus information. It may make more sense to + to make it go via the location cache. */ + if (block) + { + data_in->location_cache.apply_location_cache (); + TREE_SET_BLOCK (expr, block); + } } Index: ipa-devirt.c =================================================================== --- ipa-devirt.c (revision 221703) +++ ipa-devirt.c (working copy) @@ -166,6 +166,8 @@ along with GCC; see the file COPYING3. #include "gimple-pretty-print.h" #include "stor-layout.h" #include "intl.h" +#include "streamer-hooks.h" +#include "lto-streamer.h" /* Hash based set of pairs of types. */ typedef struct @@ -935,6 +937,10 @@ warn_odr (tree t1, tree t2, tree st1, tr if (!warn || !TYPE_NAME(t1)) return; + /* ODR warnings are output druing LTO streaming; we must apply location + cache for potential warnings to be output correctly. */ + lto_location_cache::current_cache->apply_location_cache (); + if (!warning_at (DECL_SOURCE_LOCATION (TYPE_NAME (t1)), OPT_Wodr, "type %qT violates one definition rule", t1)) Index: lto/lto.c =================================================================== --- lto/lto.c (revision 221703) +++ lto/lto.c (working copy) @@ -1734,10 +1734,11 @@ cmp_tree (const void *p1_, const void *p that was successful, otherwise return false. */ static bool -unify_scc (struct streamer_tree_cache_d *cache, unsigned from, +unify_scc (struct data_in *data_in, unsigned from, unsigned len, unsigned scc_entry_len, hashval_t scc_hash) { bool unified_p = false; + struct streamer_tree_cache_d *cache = data_in->reader_cache; tree_scc *scc = (tree_scc *) alloca (sizeof (tree_scc) + (len - 1) * sizeof (tree)); scc->next = NULL; @@ -1827,6 +1828,7 @@ unify_scc (struct streamer_tree_cache_d } /* Free the tree nodes from the read SCC. */ + data_in->location_cache.revert_location_cache (); for (unsigned i = 0; i < len; ++i) { enum tree_code code; @@ -1920,10 +1922,14 @@ lto_read_decls (struct lto_file_decl_dat /* Try to unify the SCC with already existing ones. */ if (!flag_ltrans - && unify_scc (data_in->reader_cache, from, + && unify_scc (data_in, from, len, scc_entry_len, scc_hash)) continue; + /* Tree merging failed, mark entries in location cache as + permanent. */ + data_in->location_cache.accept_location_cache (); + bool seen_type = false; for (unsigned i = 0; i < len; ++i) { @@ -1953,7 +1959,13 @@ lto_read_decls (struct lto_file_decl_dat /* Register TYPE_DECLs with the debuginfo machinery. */ if (!flag_wpa && TREE_CODE (t) == TYPE_DECL) - debug_hooks->type_decl (t, !DECL_FILE_SCOPE_P (t)); + { + /* Dwarf2out needs location information. + TODO: Moving this out of the streamer loop may noticealy + improve ltrans linemap memory use. */ + data_in->location_cache.apply_location_cache (); + debug_hooks->type_decl (t, !DECL_FILE_SCOPE_P (t)); + } if (!flag_ltrans) { /* Register variables and functions with the @@ -1979,6 +1991,7 @@ lto_read_decls (struct lto_file_decl_dat gcc_assert (t && data_in->reader_cache->nodes.length () == from); } } + data_in->location_cache.apply_location_cache (); /* Read in lto_in_decl_state objects. */ data_ptr = (const uint32_t *) ((const char*) data + decl_offset);