From patchwork Thu Nov 29 04:11:16 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Teresa Johnson
X-Patchwork-Id: 202654
To: reply@codereview.appspotmail.com,hubicka@ucw.cz,gcc-patches@gcc.gnu.org
Subject: [PATCH] Stream profile summary histogram through LTO files (issue6782131)
Message-Id: <20121129041116.DB81E61422@tjsboxrox.mtv.corp.google.com>
Date: Wed, 28 Nov 2012 20:11:16 -0800 (PST)
From: tejohnson@google.com (Teresa Johnson)
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org

This patch ensures that the histograms from the profile summary are
streamed through the LTO files so that the working set can be computed
for use in downstream optimizations.

Bootstrapped and tested on x86_64-unknown-linux-gnu. Ok for trunk?

Thanks,
Teresa

2012-11-28  Teresa Johnson  <tejohnson@google.com>

	* lto-cgraph.c (output_profile_summary): Stream out sum_all
	and histogram.
	(input_profile_summary): Stream in sum_all and histogram.
	(merge_profile_summaries): Merge sum_all and histogram.
	(input_symtab): Call compute_working_sets after merging
	summaries.
	* gcov-io.c (gcov_histo_index): Make extern for compiler.
	* gcov-io.h (gcov_histo_index): Ditto.
	* profile.c (compute_working_sets): Remove static keyword.
	* profile.h (compute_working_sets): Ditto.

---
This patch is available for review at http://codereview.appspot.com/6782131

Index: lto-cgraph.c
===================================================================
--- lto-cgraph.c	(revision 193909)
+++ lto-cgraph.c	(working copy)
@@ -46,6 +46,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-streamer.h"
 #include "gcov-io.h"
 #include "tree-pass.h"
+#include "profile.h"
 
 static void output_cgraph_opt_summary (void);
 static void input_cgraph_opt_summary (vec nodes);
@@ -593,14 +594,48 @@ lto_output_ref (struct lto_simple_output_block *ob
 static void
 output_profile_summary (struct lto_simple_output_block *ob)
 {
+  unsigned h_ix, bv_ix, h_cnt = 0;
+  unsigned histo_bitvector[GCOV_HISTOGRAM_BITVECTOR_SIZE];
+
   if (profile_info)
     {
-      /* We do not output num, sum_all and run_max, they are not used by
-         GCC profile feedback and they are difficult to merge from multiple
-         units.  */
+      /* We do not output num and run_max, they are not used by
+         GCC profile feedback and they are difficult to merge from multiple
+         units.  */
       gcc_assert (profile_info->runs);
       streamer_write_uhwi_stream (ob->main_stream, profile_info->runs);
       streamer_write_uhwi_stream (ob->main_stream, profile_info->sum_max);
+
+      /* sum_all is needed for computing the working set with the
+         histogram.  */
+      streamer_write_uhwi_stream (ob->main_stream, profile_info->sum_all);
+
+      /* Count number of non-zero histogram entries, and fill in a bit vector
+         of non-zero indices.  */
+      for (bv_ix = 0; bv_ix < GCOV_HISTOGRAM_BITVECTOR_SIZE; bv_ix++)
+        histo_bitvector[bv_ix] = 0;
+      for (h_ix = 0; h_ix < GCOV_HISTOGRAM_SIZE; h_ix++)
+        {
+          if (profile_info->histogram[h_ix].num_counters > 0)
+            {
+              histo_bitvector[h_ix / 32] |= 1U << (h_ix % 32);
+              h_cnt++;
+            }
+        }
+      /* Output the bitvector and the non-zero entries.  */
+      for (bv_ix = 0; bv_ix < GCOV_HISTOGRAM_BITVECTOR_SIZE; bv_ix++)
+        streamer_write_uhwi_stream (ob->main_stream, histo_bitvector[bv_ix]);
+      for (h_ix = 0; h_ix < GCOV_HISTOGRAM_SIZE; h_ix++)
+        {
+          if (!profile_info->histogram[h_ix].num_counters)
+            continue;
+          streamer_write_uhwi_stream (ob->main_stream,
+                                      profile_info->histogram[h_ix].num_counters);
+          streamer_write_uhwi_stream (ob->main_stream,
+                                      profile_info->histogram[h_ix].min_value);
+          streamer_write_uhwi_stream (ob->main_stream,
+                                      profile_info->histogram[h_ix].cum_value);
+        }
     }
   else
     streamer_write_uhwi_stream (ob->main_stream, 0);
@@ -1227,11 +1262,58 @@ static void
 input_profile_summary (struct lto_input_block *ib,
                        struct lto_file_decl_data *file_data)
 {
+  unsigned h_ix, bv_ix, h_cnt = 0;
+  unsigned histo_bitvector[GCOV_HISTOGRAM_BITVECTOR_SIZE];
+  unsigned cur_bitvector;
   unsigned int runs = streamer_read_uhwi (ib);
   if (runs)
     {
       file_data->profile_info.runs = runs;
       file_data->profile_info.sum_max = streamer_read_uhwi (ib);
+      file_data->profile_info.sum_all = streamer_read_uhwi (ib);
+
+      memset (file_data->profile_info.histogram, 0,
+              sizeof (gcov_bucket_type) * GCOV_HISTOGRAM_SIZE);
+      /* Input the bitvector of non-zero histogram indices.  */
+      for (bv_ix = 0; bv_ix < GCOV_HISTOGRAM_BITVECTOR_SIZE; bv_ix++)
+        {
+          histo_bitvector[bv_ix] = streamer_read_uhwi (ib);
+          h_cnt += __builtin_popcountll (histo_bitvector[bv_ix]);
+        }
+      bv_ix = 0;
+      h_ix = 0;
+      cur_bitvector = 0;
+      while (h_cnt--)
+        {
+          /* Find the index corresponding to the next entry we will read in.
+             First find the next non-zero bitvector and re-initialize
+             the histogram index accordingly, then right shift and increment
+             the index until we find a set bit.  */
+          while (!cur_bitvector)
+            {
+              h_ix = bv_ix * 32;
+              gcc_assert (bv_ix < GCOV_HISTOGRAM_BITVECTOR_SIZE);
+              cur_bitvector = histo_bitvector[bv_ix++];
+            }
+          while (!(cur_bitvector & 0x1))
+            {
+              h_ix++;
+              cur_bitvector >>= 1;
+            }
+          gcc_assert (h_ix < GCOV_HISTOGRAM_SIZE);
+
+          file_data->profile_info.histogram[h_ix].num_counters
+              = streamer_read_uhwi (ib);
+          file_data->profile_info.histogram[h_ix].min_value
+              = streamer_read_uhwi (ib);
+          file_data->profile_info.histogram[h_ix].cum_value
+              = streamer_read_uhwi (ib);
+
+          /* Shift off the index we are done with and increment to the
+             corresponding next histogram entry.  */
+          cur_bitvector >>= 1;
+          h_ix++;
+        }
     }
 }
 
@@ -1242,10 +1324,13 @@ static void
 merge_profile_summaries (struct lto_file_decl_data **file_data_vec)
 {
   struct lto_file_decl_data *file_data;
-  unsigned int j;
+  unsigned int j, h_ix;
   gcov_unsigned_t max_runs = 0;
   struct cgraph_node *node;
   struct cgraph_edge *edge;
+  gcov_type saved_sum_all = 0;
+  gcov_ctr_summary *saved_profile_info = 0;
+  int saved_scale = 0;
 
   /* Find unit with maximal number of runs.  If we ever get serious about
      roundoff errors, we might also consider computing smallest common
@@ -1269,6 +1354,8 @@ merge_profile_summaries (struct lto_file_decl_data
   profile_info = &lto_gcov_summary;
   lto_gcov_summary.runs = max_runs;
   lto_gcov_summary.sum_max = 0;
+  memset (lto_gcov_summary.histogram, 0,
+          sizeof (gcov_bucket_type) * GCOV_HISTOGRAM_SIZE);
 
   /* Rescale all units to the maximal number of runs.
      sum_max can not be easily merged, as we have no idea what files come from
@@ -1284,8 +1371,46 @@ merge_profile_summaries (struct lto_file_decl_data
                                          * scale
                                          + REG_BR_PROB_BASE / 2)
                                         / REG_BR_PROB_BASE);
+        lto_gcov_summary.sum_all = MAX (lto_gcov_summary.sum_all,
+                                        (file_data->profile_info.sum_all
+                                         * scale
+                                         + REG_BR_PROB_BASE / 2)
+                                        / REG_BR_PROB_BASE);
+        /* Save a pointer to the profile_info with the largest
+           scaled sum_all and the scale for use in merging the
+           histogram.  */
+        if (lto_gcov_summary.sum_all > saved_sum_all)
+          {
+            saved_profile_info = &file_data->profile_info;
+            saved_sum_all = lto_gcov_summary.sum_all;
+            saved_scale = scale;
+          }
       }
 
+  gcc_assert (saved_profile_info);
+
+  /* Scale up the histogram from the profile that had the largest
+     scaled sum_all above.  */
+  for (h_ix = 0; h_ix < GCOV_HISTOGRAM_SIZE; h_ix++)
+    {
+      /* Scale up the min value as we did the corresponding sum_all
+         above.  Use that to find the new histogram index.  */
+      gcov_type scaled_min = (saved_profile_info->histogram[h_ix].min_value
+                              * saved_scale + REG_BR_PROB_BASE / 2)
+                             / REG_BR_PROB_BASE;
+      unsigned new_ix = gcov_histo_index (scaled_min);
+      lto_gcov_summary.histogram[new_ix].min_value = scaled_min;
+      /* Some of the scaled counter values would ostensibly need to be placed
+         into different (larger) histogram buckets, but we keep things simple
+         here and place the scaled cumulative counter value in the bucket
+         corresponding to the scaled minimum counter value.  */
+      lto_gcov_summary.histogram[new_ix].cum_value
+          = (saved_profile_info->histogram[h_ix].cum_value
+             * saved_scale + REG_BR_PROB_BASE / 2) / REG_BR_PROB_BASE;
+      lto_gcov_summary.histogram[new_ix].num_counters
+          = saved_profile_info->histogram[h_ix].num_counters;
+    }
+
   /* Watch roundoff errors.  */
   if (lto_gcov_summary.sum_max < max_runs)
     lto_gcov_summary.sum_max = max_runs;
@@ -1365,7 +1490,8 @@ input_symtab (void)
     }
 
   merge_profile_summaries (file_data_vec);
+  compute_working_sets ();
 
   /* Clear out the aux field that was used to store enough state to
      tell which nodes should be overwritten.  */
   FOR_EACH_FUNCTION (node)
Index: gcov-io.c
===================================================================
--- gcov-io.c	(revision 193909)
+++ gcov-io.c	(working copy)
@@ -622,10 +622,10 @@ gcov_time (void)
 }
 #endif /* IN_GCOV */
 
-#if IN_LIBGCOV || !IN_GCOV
+#if !IN_GCOV
 /* Determine the index into histogram for VALUE. */
 
-static unsigned
+GCOV_LINKAGE unsigned
 gcov_histo_index(gcov_type value)
 {
   gcov_type_unsigned v = (gcov_type_unsigned)value;
@@ -801,4 +801,4 @@ static void gcov_histogram_merge(gcov_bucket_type
   /* Finally, copy the merged histogram into tgt_histo. */
   memcpy(tgt_histo, tmp_histo, sizeof (gcov_bucket_type) * GCOV_HISTOGRAM_SIZE);
 }
-#endif /* IN_LIBGCOV || !IN_GCOV */
+#endif /* !IN_GCOV */
Index: gcov-io.h
===================================================================
--- gcov-io.h	(revision 193909)
+++ gcov-io.h	(working copy)
@@ -608,6 +608,7 @@ GCOV_LINKAGE void gcov_sync (gcov_position_t /*bas
 #if !IN_GCOV
 /* Available outside gcov */
 GCOV_LINKAGE void gcov_write_unsigned (gcov_unsigned_t) ATTRIBUTE_HIDDEN;
+GCOV_LINKAGE unsigned gcov_histo_index (gcov_type value);
 #endif
 
 #if !IN_GCOV && !IN_LIBGCOV
Index: profile.c
===================================================================
--- profile.c	(revision 193909)
+++ profile.c	(working copy)
@@ -207,7 +207,7 @@ instrument_values (histogram_values values)
    the number of counters required to cover that working set percentage and
    the minimum counter value in that working set.  */
 
-static void
+void
 compute_working_sets (void)
 {
   gcov_type working_set_cum_values[NUM_GCOV_WORKING_SETS];
Index: profile.h
===================================================================
--- profile.h	(revision 193909)
+++ profile.h	(working copy)
@@ -47,4 +47,6 @@ extern gcov_type sum_edge_counts (vec
 extern void init_node_map (void);
 extern void del_node_map (void);
 
+extern void compute_working_sets (void);
+
 #endif /* PROFILE_H */
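
For readers who want to try the sparse histogram encoding outside of GCC, here is a minimal standalone sketch of the scheme used by output_profile_summary and input_profile_summary: a fixed-size bit vector marks the non-empty buckets, and only those buckets are written. Everything in it is a stand-in chosen for illustration and is not GCC code: the `bucket` and `stream` types, the `put`/`get` helpers, and the `HISTO_SIZE` and `BITVECTOR_SIZE` constants are all hypothetical. The reader also tests each bit directly instead of using the shift loop from the patch, which should be equivalent for this purpose.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative stand-ins for the GCOV_HISTOGRAM_* constants (assumed values).  */
#define HISTO_SIZE 252
#define BITVECTOR_SIZE ((HISTO_SIZE + 31) / 32)
#define STREAM_MAX (BITVECTOR_SIZE + 3 * HISTO_SIZE)

/* Hypothetical bucket type modelled on the three fields the patch streams.  */
typedef struct
{
  uint32_t num_counters;
  int64_t min_value;
  int64_t cum_value;
} bucket;

/* Hypothetical flat word stream standing in for the LTO streamer.  */
typedef struct
{
  uint64_t words[STREAM_MAX];
  size_t len; /* Number of words written so far.  */
  size_t pos; /* Index of the next word to read.  */
} stream;

static void
put (stream *s, uint64_t w)
{
  assert (s->len < STREAM_MAX);
  s->words[s->len++] = w;
}

static uint64_t
get (stream *s)
{
  assert (s->pos < s->len);
  return s->words[s->pos++];
}

/* Write a bit vector marking the non-empty buckets, then only those
   buckets, three words each.  */
static void
write_histogram (stream *s, const bucket *histo)
{
  uint32_t bitvector[BITVECTOR_SIZE] = { 0 };
  unsigned ix;

  for (ix = 0; ix < HISTO_SIZE; ix++)
    if (histo[ix].num_counters > 0)
      bitvector[ix / 32] |= UINT32_C (1) << (ix % 32);

  for (ix = 0; ix < BITVECTOR_SIZE; ix++)
    put (s, bitvector[ix]);
  for (ix = 0; ix < HISTO_SIZE; ix++)
    {
      if (!histo[ix].num_counters)
        continue;
      put (s, histo[ix].num_counters);
      put (s, (uint64_t) histo[ix].min_value);
      put (s, (uint64_t) histo[ix].cum_value);
    }
}

/* Read the bit vector back, clear the histogram, and fill in exactly
   the buckets whose bits are set.  */
static void
read_histogram (stream *s, bucket *histo)
{
  uint32_t bitvector[BITVECTOR_SIZE];
  unsigned ix;

  memset (histo, 0, sizeof (bucket) * HISTO_SIZE);
  for (ix = 0; ix < BITVECTOR_SIZE; ix++)
    bitvector[ix] = (uint32_t) get (s);
  for (ix = 0; ix < HISTO_SIZE; ix++)
    {
      if (!(bitvector[ix / 32] & (UINT32_C (1) << (ix % 32))))
        continue;
      histo[ix].num_counters = (uint32_t) get (s);
      histo[ix].min_value = (int64_t) get (s);
      histo[ix].cum_value = (int64_t) get (s);
    }
}
```

With a zero-initialized `stream`, calling `write_histogram` on a histogram with three non-empty buckets should emit `BITVECTOR_SIZE + 9` words, and `read_histogram` should then reproduce the same buckets.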