From patchwork Thu Jun 23 17:59:34 2011 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sharad Singhai X-Patchwork-Id: 101660 In-Reply-To: <20cf30563863d3fd6104a6648929@google.com> References: <20cf30563863d3fd6104a6648929@google.com> From: =?UTF-8?B?U2hhcmFkIFNpbmdoYWkgKOCktuCksOCkpiDgpLjgpL/gpILgpJjgpIgp?= Date: Thu, 23 Jun 2011 10:59:34 -0700 Subject: Re: add PMU profiling support (issue4638047) To: singhai@google.com, davidxl@google.com,
xur@google.com, gcc-patches@gcc.gnu.org, reply@codereview.appspotmail.com Hi Rong, Thanks for your comments and for testing kernel FDO. I have updated the patch with your suggested fixes for gi_filename_up in the kernel FDO path. Okay for google/main? Sharad On Thu, Jun 23, 2011 at 10:37 AM, wrote: > > I verified the patch works for kernel fdo once gi_filename_up is fixed. > > OK for google-main after fixing this issue. > > Thanks, > > -Rong > > > http://codereview.appspot.com/4638047/diff/5001/gcc/libgcov.c > File gcc/libgcov.c (left): > > http://codereview.appspot.com/4638047/diff/5001/gcc/libgcov.c#oldcode1093 > gcc/libgcov.c:1093: gi_filename_up = gi_filename + prefix_length; > This line needs to move to kernel version of gcov_alloc_filename() to > initialize gi_filename_up properly. > > http://codereview.appspot.com/4638047/ 2011-06-23 Sharad Singhai * libgcc/Makefile.in: Add pmu-profile.o to libgcov. * gcc/doc/invoke.texi: Document new pmu profile related options. * gcc/doc/gcov.texi: Document new options -m and -q. * gcc/gcc.c: Link libgcov for -fpmu-profile-generate option. * gcc/gcov.c (filter_pmu_data_lines): New function. (output_pmu_data_header): Ditto. (output_pmu_data): Ditto. (output_load_latency_line): Ditto. (output_branch_mispredict_line): Ditto. (process_pmu_profile): Ditto. * gcc/gcov-io.c (gcov_canonical_filename): New function. (gcov_read_pmu_load_latency_info): Ditto. (gcov_read_pmu_branch_mispredict_info): Ditto. (gcov_read_pmu_tool_header): Ditto. (gcov_string_length): Ditto. (convert_unsigned_to_pct): Ditto. (print_load_latency_line): Ditto. (print_branch_mispredict_line): Ditto. (print_pmu_tool_header): Ditto. (destroy_pmu_tool_header): Ditto. 
(gcov_read_string): Make it unconditionally available. * gcc/gcov-io.h (struct gcov_pmu_info): New structure. * gcc/opts.c: New option -fpmu-profile-generate. * gcc/pmu-profile.c (enum pmu_tool_type): New enum. (enum pmu_event_type): Ditto. (enum pmu_state): Ditto. (enum cpu_vendor_signature): Ditto. (struct pmu_tool_info): New structure. (get_x86cpu_vendor): New function. (parse_pmu_profile_options): Ditto. (start_addr2line_symbolizer): Ditto. (reset_symbolizer_parent_pipes): Ditto. (reset_symbolizer_child_pipes): Ditto. (end_addr2line_symbolizer): Ditto. (symbolize_addr2line): Ditto. (start_pfmon_module): Ditto. (convert_pct_to_unsigned): Ditto. (parse_load_latency_line): Ditto. (parse_branch_mispredict_line): Ditto. (destroy_load_latency_infos): Ditto. (destroy_branch_mispredict_infos): Ditto. (parse_pfmon_load_latency): Ditto. (parse_pfmon_tool_header): Ditto. (parse_pfmon_branch_mispredicts): Ditto. (pmu_start): Ditto. (init_pmu_load_latency): Ditto. (init_pmu_branch_mispredict): Ditto. (init_pmu_tool): Ditto. (__gcov_init_pmu_profiler): Ditto. (__gcov_start_pmu_profiler): Ditto. (__gcov_stop_pmu_profiler): Ditto. (gcov_write_ll_line): Ditto. (gcov_write_branch_mispredict_line): Ditto. (gcov_write_load_latency_infos): Ditto. (gcov_write_branch_mispredict_infos): Ditto. (gcov_tag_pmu_tool_header_length): Ditto. (gcov_write_tool_header): Ditto. (__gcov_end_pmu_profiler): Ditto. * gcc/coverage.c (get_const_string_type): New function. (create_coverage): Do the coverage processing even if only flag_pmu_profile_generate is specified. (coverage_init): Call gimple_init_instrumentation_sampling from here instead of from tree-profile.c:gimple_init_edge_profiler. (profiling_enabled_p): New function. (init_pmu_profiling): Ditto. (check_pmu_profile_options): Ditto. * gcc/coverage.h (check_pmu_profile_options): Declaration. (tree_init_instrumentation_sampling): Declaration. * gcc/common.opt: Add new options -fpmu-profile-generate and -fpmu-profile-use. 
* gcc/tree-profile.c (gimple_init_instrumentation_sampling): Make extern. Move the call from gimple_init_edge_profiler to coverage.c:coverage_init. * gcc/libgcov.c (gcov_alloc_filename) [__GCOV_KERNEL__]: Moved earlier. (gcov_alloc_filename) [!__GCOV_KERNEL__]: Initialize gi_filename_up. (pmu_profile_stop): New function. (gcov_dump_module_info): Replace gcov_strip_leading_dirs with a macro. (__gcov_init): Add initialization of PMU profiler. (gcov_exit): Add finalization of PMU profiler. (gcov_get_filename): Cleanup whitespaces. * gcc/params.def: New parameter pmu_profile_n_addresses. * gcc/gcov-dump.c (tag_pmu_load_latency_info): New function. (tag_pmu_branch_mispredict_info): Ditto. (tag_pmu_tool_header): Ditto. Index: libgcc/Makefile.in =================================================================== --- libgcc/Makefile.in (revision 175346) +++ libgcc/Makefile.in (working copy) @@ -747,10 +747,13 @@ dyn-ipa.o: %$(objext): $(gcc_srcdir)/libgcov.c $(gcc_compile) -c $(gcc_srcdir)/dyn-ipa.c +pmu-profile.o: %$(objext): $(gcc_srcdir)/libgcov.c + $(gcc_compile) -c $(gcc_srcdir)/pmu-profile.c + # Static libraries. 
libgcc.a: $(libgcc-objects) -libgcov.a: $(libgcov-objects) dyn-ipa$(objext) +libgcov.a: $(libgcov-objects) dyn-ipa$(objext) pmu-profile$(objext) libunwind.a: $(libunwind-objects) libgcc_eh.a: $(libgcc-eh-objects) Index: gcc/doc/invoke.texi =================================================================== --- gcc/doc/invoke.texi (revision 175346) +++ gcc/doc/invoke.texi (working copy) @@ -388,6 +388,8 @@ -fprofile-correction -fprofile-dir=@var{path} -fprofile-generate @gol -fprofile-generate=@var{path} -fprofile-generate-sampling @gol -fprofile-use -fprofile-use=@var{path} -fprofile-values @gol +-fpmu-profile-generate=@var{pmuoption} @gol +-fpmu-profile-use=@var{pmuoption} @gol -freciprocal-math -fregmove -frename-registers -freorder-blocks @gol -freorder-blocks-and-partition -freorder-functions @gol -frerun-cse-after-loop -freschedule-modulo-scheduled-loops @gol @@ -8088,6 +8090,26 @@ If @var{path} is specified, GCC will look at the @var{path} to find the profile feedback data files. See @option{-fprofile-dir}. +@item -fpmu-profile-generate=@var{pmuoption} +@opindex fpmu-profile-generate + +Enable performance monitoring unit (PMU) profiling. This collects +hardware counter data corresponding to @var{pmuoption}. Currently +only @var{load-latency} and @var{branch-mispredict} are supported +using the pfmon tool. You must use @option{-fpmu-profile-generate} both +when compiling and when linking your program. This PMU profile data +may later be used by the compiler during optimization and can also be +displayed using the coverage tool gcov. The parameter +"pmu_profile_n_addresses" can be used to restrict PMU data collection +to at most that many addresses. + +@item -fpmu-profile-use=@var{pmuoption} +@opindex fpmu-profile-use + +Enable performance monitoring unit (PMU) profiling based +optimizations. Currently only @var{load-latency} and +@var{branch-mispredict} are supported. + @item -fripa @opindex fripa Perform dynamic inter-procedural analysis. 
This is used in conjunction with Index: gcc/doc/gcov.texi =================================================================== --- gcc/doc/gcov.texi (revision 175346) +++ gcc/doc/gcov.texi (working copy) @@ -124,9 +124,11 @@ [@option{-a}|@option{--all-blocks}] [@option{-b}|@option{--branch-probabilities}] [@option{-c}|@option{--branch-counts}] + [@option{-m}|@option{--pmu-profile}] [@option{-n}|@option{--no-output}] [@option{-l}|@option{--long-file-names}] [@option{-p}|@option{--preserve-paths}] + [@option{-q}|@option{--pmu_profile-path}] [@option{-f}|@option{--function-summaries}] [@option{-o}|@option{--object-directory} @var{directory|file}] @var{sourcefiles} [@option{-u}|@option{--unconditional-branches}] @@ -169,6 +171,14 @@ Write branch frequencies as the number of branches taken, rather than the percentage of branches taken. +@item -m +@itemx --pmu-profile +Output the additional PMU profile information if available. + +@item -q +@itemx --pmu_profile-path +PMU profile path (default @file{pmuprofile.gcda}). + @item -n @itemx --no-output Do not create the @command{gcov} output file. Index: gcc/gcc.c =================================================================== --- gcc/gcc.c (revision 175346) +++ gcc/gcc.c (working copy) @@ -662,7 +662,7 @@ %{static:} %{L*} %(mfwrap) %(link_libgcc) %o\ %{fopenmp|ftree-parallelize-loops=*:%:include(libgomp.spec)%(link_gomp)}\ %(mflib) " STACK_SPLIT_SPEC "\ - %{fprofile-arcs|fprofile-generate*|coverage:-lgcov}\ + %{fprofile-arcs|fprofile-generate*|fpmu-profile-generate*|coverage:-lgcov}\ %{!nostdlib:%{!nodefaultlibs:%(link_ssp) %(link_gcc_c_sequence)}}\ %{!nostdlib:%{!nostartfiles:%E}} %{T*} }}}}}}" #endif Index: gcc/gcov.c =================================================================== --- gcc/gcov.c (revision 175346) +++ gcc/gcov.c (working copy) @@ -209,6 +209,15 @@ char *name; } coverage_t; +/* Describes PMU profile data for either one source file or for the + entire program. 
*/ + +typedef struct pmu_data +{ + ll_infos_t ll_infos; + brm_infos_t brm_infos; +} pmu_data_t; + /* Describes a single line of source. Contains a chain of basic blocks with code on it. */ @@ -242,6 +251,8 @@ coverage_t coverage; + pmu_data_t *pmu_data; /* PMU profile information for this file. */ + /* Functions in this source file. These are in ascending line number order. */ function_t *functions; @@ -301,6 +312,10 @@ /* Show unconditional branches too. */ static int flag_unconditional = 0; +/* Output performance monitoring unit (PMU) data, if available. */ + +static int flag_pmu_profile = 0; + /* Output a gcov file if this is true. This is on by default, and can be turned off by the -n option. */ @@ -345,6 +360,18 @@ static int flag_counts = 0; +/* PMU profile default filename. */ + +static char pmu_profile_default_filename[] = "pmuprofile.gcda"; + +/* PMU profile filename where the PMU profile data is read from. */ + +static char *pmu_profile_filename = 0; + +/* PMU data for the entire program. */ + +static pmu_data_t pmu_global_info; + /* Forward declarations. */ static void fnotice (FILE *, const char *, ...) 
ATTRIBUTE_PRINTF_2; static int process_args (int, char **); @@ -366,6 +393,17 @@ static void output_lines (FILE *, const source_t *); static char *make_gcov_file_name (const char *, const char *); static void release_structures (void); +static void process_pmu_profile (void); +static void filter_pmu_data_lines (source_t *src); +static void output_pmu_data_header (FILE *gcov_file, pmu_data_t *pmu_data); +static void output_pmu_data (FILE *gcov_file, const source_t *src, + const unsigned line_num); +static void output_load_latency_line (FILE *fp, + const gcov_pmu_ll_info_t *ll_info, + gcov_pmu_tool_header_t *tool_header); +static void output_branch_mispredict_line (FILE *fp, + const gcov_pmu_brm_info_t *brm_info); + extern int main (int, char **); int @@ -389,6 +427,15 @@ if (argc - argno > 1) multiple_files = 1; + /* We read pmu profile first because we later filter + src:line_numbers for each source. */ + if (flag_pmu_profile) + { + if (!pmu_profile_filename) + pmu_profile_filename = pmu_profile_default_filename; + process_pmu_profile (); + } + first_arg = argno; for (; argno != argc; argno++) @@ -433,12 +480,14 @@ fnotice (file, " -b, --branch-probabilities Include branch probabilities in output\n"); fnotice (file, " -c, --branch-counts Given counts of branches taken\n\ rather than percentages\n"); + fnotice (file, " -m, --pmu-profile Output PMU profile data if available\n"); fnotice (file, " -n, --no-output Do not create an output file\n"); fnotice (file, " -l, --long-file-names Use long output file names for included\n\ source files\n"); fnotice (file, " -f, --function-summaries Output summaries for each function\n"); fnotice (file, " -o, --object-directory DIR|FILE Search for object files in DIR or called FILE\n"); fnotice (file, " -p, --preserve-paths Preserve all pathname components\n"); + fnotice (file, " -q, --pmu_profile-path Path for PMU profile (default pmuprofile.gcda)\n"); fnotice (file, " -u, --unconditional-branches Show unconditional branch counts 
too\n"); fnotice (file, " -i, --intermediate-format Output .gcov file in an intermediate text\n\ format that can be used by 'lcov' or other\n\ @@ -473,6 +522,7 @@ { "all-blocks", no_argument, NULL, 'a' }, { "branch-probabilities", no_argument, NULL, 'b' }, { "branch-counts", no_argument, NULL, 'c' }, + { "pmu-profile", no_argument, NULL, 'm' }, { "no-output", no_argument, NULL, 'n' }, { "long-file-names", no_argument, NULL, 'l' }, { "function-summaries", no_argument, NULL, 'f' }, @@ -480,6 +530,7 @@ { "object-directory", required_argument, NULL, 'o' }, { "object-file", required_argument, NULL, 'o' }, { "unconditional-branches", no_argument, NULL, 'u' }, + { "pmu_profile-path", required_argument, NULL, 'q' }, { "display-progress", no_argument, NULL, 'd' }, { "intermediate-format", no_argument, NULL, 'i' }, { 0, 0, 0, 0 } @@ -492,7 +543,7 @@ { int opt; - while ((opt = getopt_long (argc, argv, "abcdfhilno:puv", options, NULL)) != + while ((opt = getopt_long (argc, argv, "abcdfhilno:pq:uv", options, NULL)) != -1) { switch (opt) @@ -515,6 +566,9 @@ case 'l': flag_long_names = 1; break; + case 'm': + flag_pmu_profile = 1; + break; case 'n': flag_gcov_file = 0; break; @@ -524,6 +578,9 @@ case 'p': flag_preserve_paths = 1; break; + case 'q': + pmu_profile_filename = optarg; + break; case 'u': flag_unconditional = 1; break; @@ -766,6 +823,8 @@ { function_t *fn; source_t *src; + ll_infos_t *ll_infos = &pmu_global_info.ll_infos; + brm_infos_t *brm_infos = &pmu_global_info.brm_infos; while ((src = sources)) { @@ -773,6 +832,14 @@ free (src->name); free (src->lines); + if (src->pmu_data) + { + if (src->pmu_data->ll_infos.ll_array) + free (src->pmu_data->ll_infos.ll_array); + if (src->pmu_data->brm_infos.brm_array) + free (src->pmu_data->brm_infos.brm_array); + free (src->pmu_data); + } } while ((fn = functions)) @@ -794,6 +861,42 @@ free (fn->blocks); free (fn->counts); } + + /* Cleanup PMU load latency info. 
*/ + if (ll_infos->ll_count) + { + unsigned i; + + /* delete each element */ + for (i = 0; i < ll_infos->ll_count; ++i) + { + if (ll_infos->ll_array[i]->filename) + XDELETE (ll_infos->ll_array[i]->filename); + XDELETE (ll_infos->ll_array[i]); + } + /* delete the array itself */ + XDELETE (ll_infos->ll_array); + ll_infos->ll_array = NULL; + ll_infos->ll_count = 0; + } + + /* Cleanup PMU branch mispredict info. */ + if (brm_infos->brm_count) + { + unsigned i; + + /* delete each element */ + for (i = 0; i < brm_infos->brm_count; ++i) + { + if (brm_infos->brm_array[i]->filename) + XDELETE (brm_infos->brm_array[i]->filename); + XDELETE (brm_infos->brm_array[i]); + } + /* delete the array itself */ + XDELETE (brm_infos->brm_array); + brm_infos->brm_array = NULL; + brm_infos->brm_count = 0; + } } /* Generate the names of the graph and data files. If OBJECT_DIRECTORY @@ -890,6 +993,7 @@ src->coverage.name = src->name; src->index = source_index++; src->next = sources; + src->pmu_data = 0; sources = src; if (!stat (file_name, &status)) @@ -1806,6 +1910,140 @@ fnotice (stderr, "%s:no lines for '%s'\n", bbg_file_name, fn->name); } +/* Filter PMU profile global data for lines for SRC. Save PMU info + matching the source file and sort them by line number for later + line by line processing. */ + +static void +filter_pmu_data_lines (source_t *src) +{ + unsigned i; + int changed; + ll_infos_t *ll_infos; /* load latency information for this source */ + brm_infos_t *brm_infos; /* branch mispredict information for this source */ + + if (pmu_global_info.ll_infos.ll_count == 0 && + pmu_global_info.brm_infos.brm_count == 0) + /* If there are no global entries, there is nothing to filter. 
*/ + return; + + src->pmu_data = XCNEW (pmu_data_t); + ll_infos = &src->pmu_data->ll_infos; + brm_infos = &src->pmu_data->brm_infos; + ll_infos->pmu_tool_header = pmu_global_info.ll_infos.pmu_tool_header; + brm_infos->pmu_tool_header = pmu_global_info.brm_infos.pmu_tool_header; + ll_infos->ll_array = 0; + brm_infos->brm_array = 0; + + /* Go over all the load latency entries and save the ones + corresponding to this source file. */ + for (i = 0; i < pmu_global_info.ll_infos.ll_count; ++i) + { + gcov_pmu_ll_info_t *ll_info = pmu_global_info.ll_infos.ll_array[i]; + if (0 == strcmp (src->name, ll_info->filename)) + { + if (!ll_infos->ll_array) + { + ll_infos->ll_count = 0; + ll_infos->alloc_ll_count = 64; + ll_infos->ll_array = XCNEWVEC (gcov_pmu_ll_info_t *, + ll_infos->alloc_ll_count); + } + /* Found a matching entry, save it. */ + ll_infos->ll_count++; + if (ll_infos->ll_count >= ll_infos->alloc_ll_count) + { + /* Double the capacity; XRESIZEVEC scales by the element size. */ + ll_infos->alloc_ll_count *= 2; + ll_infos->ll_array = XRESIZEVEC (gcov_pmu_ll_info_t *, ll_infos->ll_array, ll_infos->alloc_ll_count); + } + ll_infos->ll_array[ll_infos->ll_count - 1] = ll_info; + } + } + + /* Go over all the branch mispredict entries and save the ones + corresponding to this source file. */ + for (i = 0; i < pmu_global_info.brm_infos.brm_count; ++i) + { + gcov_pmu_brm_info_t *brm_info = pmu_global_info.brm_infos.brm_array[i]; + if (0 == strcmp (src->name, brm_info->filename)) + { + if (!brm_infos->brm_array) + { + brm_infos->brm_count = 0; + brm_infos->alloc_brm_count = 64; + brm_infos->brm_array = XCNEWVEC (gcov_pmu_brm_info_t *, + brm_infos->alloc_brm_count); + } + /* Found a matching entry, save it. 
*/ + brm_infos->brm_count++; + if (brm_infos->brm_count >= brm_infos->alloc_brm_count) + { + /* Double the capacity; XRESIZEVEC scales by the element size. */ + brm_infos->alloc_brm_count *= 2; + brm_infos->brm_array = XRESIZEVEC (gcov_pmu_brm_info_t *, brm_infos->brm_array, brm_infos->alloc_brm_count); + } + brm_infos->brm_array[brm_infos->brm_count - 1] = brm_info; + } + } + + /* Sort the load latency data according to the line numbers because + we later iterate over sources in line number order. Normally we + expect the PMU tool to provide sorted data, but a few entries can + be out of order. Thus we use a very simple bubble sort here. */ + if (ll_infos->ll_count > 1) + { + changed = 1; + while (changed) + { + changed = 0; + for (i = 0; i < ll_infos->ll_count - 1; ++i) + { + gcov_pmu_ll_info_t *item1 = ll_infos->ll_array[i]; + gcov_pmu_ll_info_t *item2 = ll_infos->ll_array[i+1]; + if (item1->line > item2->line) + { + /* swap */ + gcov_pmu_ll_info_t *tmp = ll_infos->ll_array[i]; + ll_infos->ll_array[i] = ll_infos->ll_array[i+1]; + ll_infos->ll_array[i+1] = tmp; + changed = 1; + } + } + } + } + + /* Similarly, sort branch mispredict info as well. */ + if (brm_infos->brm_count > 1) + { + changed = 1; + while (changed) + { + changed = 0; + for (i = 0; i < brm_infos->brm_count - 1; ++i) + { + gcov_pmu_brm_info_t *item1 = brm_infos->brm_array[i]; + gcov_pmu_brm_info_t *item2 = brm_infos->brm_array[i+1]; + if (item1->line > item2->line) + { + /* swap */ + gcov_pmu_brm_info_t *tmp = brm_infos->brm_array[i]; + brm_infos->brm_array[i] = brm_infos->brm_array[i+1]; + brm_infos->brm_array[i+1] = tmp; + changed = 1; + } + } + } + } + + /* If no matching PMU info was found, release the structures. */ + if (!brm_infos->brm_array && !ll_infos->ll_array) + { + free (src->pmu_data); + src->pmu_data = 0; + } +} + /* Accumulate the line counts of a file. */ static void @@ -1815,6 +2053,10 @@ function_t *fn, *fn_p, *fn_n; unsigned ix; + if (flag_pmu_profile) + /* Filter PMU profile by source files and save into matching line(s). 
*/ + filter_pmu_data_lines (src); + /* Reverse the function order. */ for (fn = src->functions, fn_p = NULL; fn; fn_p = fn, fn = fn_n) @@ -2062,6 +2304,9 @@ else if (src->file_time == 0) fprintf (gcov_file, "%9s:%5d:Source is newer than graph\n", "-", 0); + if (src->pmu_data) + output_pmu_data_header (gcov_file, src->pmu_data); + if (flag_branches) fn = src->functions; @@ -2139,6 +2384,10 @@ for (ix = 0, arc = line->u.branches; arc; arc = arc->line_next) ix += output_branch_count (gcov_file, ix, arc); } + + /* Output PMU profile info if available. */ + if (flag_pmu_profile) + output_pmu_data (gcov_file, src, line_num); } /* Handle all remaining source lines. There may be lines after the @@ -2162,3 +2411,236 @@ if (source_file) fclose (source_file); } + +/* Print an explanatory header for PMU_DATA into GCOV_FILE. */ + +static void +output_pmu_data_header (FILE *gcov_file, pmu_data_t *pmu_data) +{ + /* Print header for the applicable PMU events. */ + fprintf (gcov_file, "%9s:%5d\n", "-", 0); + if (pmu_data->ll_infos.ll_count) + { + char *text = pmu_data->ll_infos.pmu_tool_header->column_description; + char c; + fprintf (gcov_file, "%9s:%5u: %s", "PMU_LL", 0, + pmu_data->ll_infos.pmu_tool_header->column_header); + /* The column description is multiline text and we want to print + each line separately after formatting it. */ + fprintf (gcov_file, "%9s:%5u: ", "PMU_LL", 0); + while ((c = *text++)) + { + fprintf (gcov_file, "%c", c); + /* Do not print a new header on trailing newline. 
*/ + if (c == '\n' && *text) + fprintf (gcov_file, "%9s:%5u: ", "PMU_LL", 0); + } + fprintf (gcov_file, "%9s:%5d\n", "-", 0); + } + + if (pmu_data->brm_infos.brm_count) + { + + fprintf (gcov_file, "%9s:%5d:PMU BRM: line: %s %s %s\n", + "-", 0, "count", "self", "address"); + fprintf (gcov_file, "%9s:%5d: " + "count: number of branch mispredicts sampled at this address\n", + "-", 0); + fprintf (gcov_file, "%9s:%5d: " + "self: branch mispredicts as percentage of the entire program\n", + "-", 0); + fprintf (gcov_file, "%9s:%5d\n", "-", 0); + } +} + +/* Output pmu data corresponding to SRC and LINE_NUM into GCOV_FILE. */ + +static void +output_pmu_data (FILE *gcov_file, const source_t *src, const unsigned line_num) +{ + unsigned i; + ll_infos_t *ll_infos; + brm_infos_t *brm_infos; + gcov_pmu_tool_header_t *tool_header; + + if (!src->pmu_data) + return; + + ll_infos = &src->pmu_data->ll_infos; + brm_infos = &src->pmu_data->brm_infos; + + if (ll_infos->ll_array) + { + tool_header = src->pmu_data->ll_infos.pmu_tool_header; + + /* Search PMU load latency data for the matching line + numbers. There could be multiple entries with the same line + number. We use the fact that line numbers are sorted in + ll_array. */ + for (i = 0; i < ll_infos->ll_count && + ll_infos->ll_array[i]->line <= line_num; ++i) + { + gcov_pmu_ll_info_t *ll_info = ll_infos->ll_array[i]; + if (ll_info->line == line_num) + output_load_latency_line (gcov_file, ll_info, tool_header); + } + } + + if (brm_infos->brm_array) + { + tool_header = src->pmu_data->brm_infos.pmu_tool_header; + + /* Search PMU branch mispredict data for the matching line + numbers. There could be multiple entries with the same line + number. We use the fact that line numbers are sorted in + brm_array. 
*/ + for (i = 0; i < brm_infos->brm_count && + brm_infos->brm_array[i]->line <= line_num; ++i) + { + gcov_pmu_brm_info_t *brm_info = brm_infos->brm_array[i]; + if (brm_info->line == line_num) + output_branch_mispredict_line (gcov_file, brm_info); + } + } +} + + +/* Output formatted load latency info pointed to by LL_INFO into the + open file FP. TOOL_HEADER contains additional explanation of + fields. */ + +static void +output_load_latency_line (FILE *fp, const gcov_pmu_ll_info_t *ll_info, + gcov_pmu_tool_header_t *tool_header ATTRIBUTE_UNUSED) +{ + fprintf (fp, "%9s:%5u: ", "PMU_LL", ll_info->line); + fprintf (fp, " %u %.2f%% %.2f%% %.2f%% %.2f%% %.2f%% %.2f%% %.2f%% " + "%.2f%% %.2f%% " HOST_WIDEST_INT_PRINT_HEX "\n", + ll_info->counts, + convert_unsigned_to_pct (ll_info->self), + convert_unsigned_to_pct (ll_info->cum), + convert_unsigned_to_pct (ll_info->lt_10), + convert_unsigned_to_pct (ll_info->lt_32), + convert_unsigned_to_pct (ll_info->lt_64), + convert_unsigned_to_pct (ll_info->lt_256), + convert_unsigned_to_pct (ll_info->lt_1024), + convert_unsigned_to_pct (ll_info->gt_1024), + convert_unsigned_to_pct (ll_info->wself), + ll_info->code_addr); +} + + +/* Output formatted branch mispredict info pointed to by BRM_INFO into + the open file FP. */ + +static void +output_branch_mispredict_line (FILE *fp, + const gcov_pmu_brm_info_t *brm_info) +{ + fprintf (fp, "%9s:%5u: count: %u self: %.2f%% addr: " + HOST_WIDEST_INT_PRINT_HEX "\n", + "PMU BRM", + brm_info->line, + brm_info->counts, + convert_unsigned_to_pct (brm_info->self), + brm_info->code_addr); +} + +/* Read in the PMU profile information from the global PMU profile file. */ + +static void process_pmu_profile (void) +{ + unsigned tag; + unsigned version; + int error = 0; + ll_infos_t *ll_infos = &pmu_global_info.ll_infos; + brm_infos_t *brm_infos = &pmu_global_info.brm_infos; + + /* Construct path for pmuprofile.gcda filename. 
*/ + create_file_names (pmu_profile_filename); + if (!gcov_open (da_file_name, 1)) + { + fnotice (stderr, "%s:cannot open pmu profile file\n", + pmu_profile_filename); + return; + } + if (!gcov_magic (gcov_read_unsigned (), GCOV_DATA_MAGIC)) + { + fnotice (stderr, "%s:not a gcov data file\n", da_file_name); + cleanup:; + gcov_close (); + return; + } + version = gcov_read_unsigned (); + if (version != GCOV_VERSION) + { + char v[4], e[4]; + + GCOV_UNSIGNED2STRING (v, version); + GCOV_UNSIGNED2STRING (e, GCOV_VERSION); + fnotice (stderr, "%s:version '%.4s', prefer version '%.4s'\n", + da_file_name, v, e); + } + /* read stamp */ + tag = gcov_read_unsigned (); + + /* Initialize PMU data fields. */ + ll_infos->ll_count = 0; + ll_infos->alloc_ll_count = 64; + ll_infos->ll_array = XCNEWVEC (gcov_pmu_ll_info_t *, ll_infos->alloc_ll_count); + + brm_infos->brm_count = 0; + brm_infos->alloc_brm_count = 64; + brm_infos->brm_array = XCNEWVEC (gcov_pmu_brm_info_t *, + brm_infos->alloc_brm_count); + + while ((tag = gcov_read_unsigned ())) + { + unsigned length = gcov_read_unsigned (); + unsigned long base = gcov_position (); + + if (tag == GCOV_TAG_PMU_LOAD_LATENCY_INFO) + { + gcov_pmu_ll_info_t *ll_info = XCNEW (gcov_pmu_ll_info_t); + gcov_read_pmu_load_latency_info (ll_info, length); + ll_infos->ll_count++; + if (ll_infos->ll_count >= ll_infos->alloc_ll_count) + { + /* Double the capacity; XRESIZEVEC scales by the element size. */ + ll_infos->alloc_ll_count *= 2; + ll_infos->ll_array = XRESIZEVEC (gcov_pmu_ll_info_t *, ll_infos->ll_array, ll_infos->alloc_ll_count); + } + ll_infos->ll_array[ll_infos->ll_count - 1] = ll_info; + } + else if (tag == GCOV_TAG_PMU_BRANCH_MISPREDICT_INFO) + { + gcov_pmu_brm_info_t *brm_info = XCNEW (gcov_pmu_brm_info_t); + gcov_read_pmu_branch_mispredict_info (brm_info, length); + brm_infos->brm_count++; + if (brm_infos->brm_count >= brm_infos->alloc_brm_count) + { + /* Double the capacity; XRESIZEVEC scales by the element size. */ + brm_infos->alloc_brm_count *= 2; + brm_infos->brm_array = XRESIZEVEC (gcov_pmu_brm_info_t *, brm_infos->brm_array, brm_infos->alloc_brm_count); + } + 
brm_infos->brm_array[brm_infos->brm_count - 1] = brm_info; + } + else if (tag == GCOV_TAG_PMU_TOOL_HEADER) + { + gcov_pmu_tool_header_t *tool_header = XCNEW (gcov_pmu_tool_header_t); + gcov_read_pmu_tool_header (tool_header, length); + ll_infos->pmu_tool_header = tool_header; + brm_infos->pmu_tool_header = tool_header; + } + + gcov_sync (base, length); + if ((error = gcov_is_error ())) + { + fnotice (stderr, error < 0 ? "%s:overflowed\n" : "%s:corrupted\n", + da_file_name); + goto cleanup; + } + } + + gcov_close (); +} Index: gcc/gcov-io.c =================================================================== --- gcc/gcov-io.c (revision 175346) +++ gcc/gcov-io.c (working copy) @@ -23,6 +23,12 @@ /* Routines declared in gcov-io.h. This file should be #included by another source file, after having #included gcov-io.h. */ +/* Redefine these here, rather than using the ones in system.h since + * including system.h leads to conflicting definitions of other + * symbols and macros. */ +#undef MIN +#define MIN(X,Y) ((X) < (Y) ? (X) : (Y)) + #if !IN_GCOV static void gcov_write_block (unsigned); static gcov_unsigned_t *gcov_write_words (unsigned); @@ -197,6 +203,104 @@ } #if !IN_LIBGCOV +/* Modify FILENAME to a canonical form after stripping known prefixes + in place. It removes '/proc/self/cwd' and '/proc/self/cwd/.'. + Returns the in-place modified filename. 
*/ + +GCOV_LINKAGE char * +gcov_canonical_filename (char *filename) +{ + static char cwd_dot_str[] = "/proc/self/cwd/./"; + int cwd_dot_len = strlen (cwd_dot_str); + int cwd_len = cwd_dot_len - 2; /* without trailing './' */ + int filename_len = strlen (filename); + /* delete the longer prefix first */ + if (0 == strncmp (filename, cwd_dot_str, cwd_dot_len)) + { + memmove (filename, filename + cwd_dot_len, filename_len - cwd_dot_len); + filename[filename_len - cwd_dot_len] = '\0'; + return filename; + } + + if (0 == strncmp (filename, cwd_dot_str, cwd_len)) + { + memmove (filename, filename + cwd_len, filename_len - cwd_len); + filename[filename_len - cwd_len] = '\0'; + return filename; + } + return filename; +} + +/* Read LEN words and construct load latency info LL_INFO. */ + +GCOV_LINKAGE void +gcov_read_pmu_load_latency_info (gcov_pmu_ll_info_t *ll_info, + gcov_unsigned_t len ATTRIBUTE_UNUSED) +{ + const char *filename; + ll_info->counts = gcov_read_unsigned (); + ll_info->self = gcov_read_unsigned (); + ll_info->cum = gcov_read_unsigned (); + ll_info->lt_10 = gcov_read_unsigned (); + ll_info->lt_32 = gcov_read_unsigned (); + ll_info->lt_64 = gcov_read_unsigned (); + ll_info->lt_256 = gcov_read_unsigned (); + ll_info->lt_1024 = gcov_read_unsigned (); + ll_info->gt_1024 = gcov_read_unsigned (); + ll_info->wself = gcov_read_unsigned (); + ll_info->code_addr = gcov_read_counter (); + ll_info->line = gcov_read_unsigned (); + ll_info->discriminator = gcov_read_unsigned (); + filename = gcov_read_string (); + if (filename) + ll_info->filename = gcov_canonical_filename (xstrdup (filename)); + else + ll_info->filename = 0; +} + +/* Read LEN words and construct branch mispredict info BRM_INFO. 
*/ + +GCOV_LINKAGE void +gcov_read_pmu_branch_mispredict_info (gcov_pmu_brm_info_t *brm_info, + gcov_unsigned_t len ATTRIBUTE_UNUSED) +{ + const char *filename; + brm_info->counts = gcov_read_unsigned (); + brm_info->self = gcov_read_unsigned (); + brm_info->cum = gcov_read_unsigned (); + brm_info->code_addr = gcov_read_counter (); + brm_info->line = gcov_read_unsigned (); + brm_info->discriminator = gcov_read_unsigned (); + filename = gcov_read_string (); + if (filename) + brm_info->filename = gcov_canonical_filename (xstrdup (filename)); + else + brm_info->filename = 0; +} + +/* Read LEN words from an open gcov file and construct data into pmu + tool header TOOL_HEADER. */ + +GCOV_LINKAGE void gcov_read_pmu_tool_header (gcov_pmu_tool_header_t *header, + gcov_unsigned_t len ATTRIBUTE_UNUSED) +{ + const char *str; + str = gcov_read_string (); + header->host_cpu = str ? xstrdup (str) : 0; + str = gcov_read_string (); + header->hostname = str ? xstrdup (str) : 0; + str = gcov_read_string (); + header->kernel_version = str ? xstrdup (str) : 0; + str = gcov_read_string (); + header->column_header = str ? xstrdup (str) : 0; + str = gcov_read_string (); + header->column_description = str ? xstrdup (str) : 0; + str = gcov_read_string (); + header->full_header = str ? xstrdup (str) : 0; +} +#endif + +#if !IN_LIBGCOV /* Check if MAGIC is EXPECTED. Use it to determine endianness of the file. Returns +1 for same endian, -1 for other endian and zero for not EXPECTED. */ @@ -245,6 +349,24 @@ gcov_var.offset -= size; } +#if IN_LIBGCOV +/* Return the number of words STRING would need including the length + field in the output stream itself. This should be identical to + "alloc" calculation in gcov_write_string(). */ + +GCOV_LINKAGE gcov_unsigned_t +gcov_string_length (const char *string) +{ + gcov_unsigned_t len = (string) ? strlen (string) : 0; + /* + 1 because of the length field. 
*/ + gcov_unsigned_t alloc = 1 + ((len + 4) >> 2); + + /* Cannot yet write a string bigger than GCOV_BLOCK_SIZE. */ + gcc_assert (alloc < GCOV_BLOCK_SIZE); + return alloc; +} +#endif + /* Allocate space to write BYTES bytes to the gcov file. Return a pointer to those bytes, or NULL on failure. */ @@ -255,13 +377,15 @@ gcc_assert (gcov_var.mode < 0); #if IN_LIBGCOV - if (gcov_var.offset >= GCOV_BLOCK_SIZE) + if (gcov_var.offset + words >= GCOV_BLOCK_SIZE) { - gcov_write_block (GCOV_BLOCK_SIZE); + gcov_write_block (MIN (gcov_var.offset, GCOV_BLOCK_SIZE)); if (gcov_var.offset) { - gcc_assert (gcov_var.offset == 1); - memcpy (gcov_var.buffer, gcov_var.buffer + GCOV_BLOCK_SIZE, 4); + gcc_assert (gcov_var.offset < GCOV_BLOCK_SIZE); + memcpy (gcov_var.buffer, + gcov_var.buffer + GCOV_BLOCK_SIZE, + gcov_var.offset << 2); } } #else @@ -302,7 +426,6 @@ } #endif /* IN_LIBGCOV */ -#if !IN_LIBGCOV /* Write STRING to coverage file. Sets error flag on file error, overflow flag on overflow */ @@ -325,7 +448,6 @@ buffer[alloc] = 0; memcpy (&buffer[1], string, length); } -#endif #if !IN_LIBGCOV /* Write a tag TAG and reserve space for the record length.
Return a @@ -413,14 +535,15 @@ unsigned excess = gcov_var.length - gcov_var.offset; gcc_assert (gcov_var.mode > 0); + gcc_assert (words < GCOV_BLOCK_SIZE); if (excess < words) { gcov_var.start += gcov_var.offset; #if IN_LIBGCOV if (excess) { - gcc_assert (excess == 1); - memcpy (gcov_var.buffer, gcov_var.buffer + gcov_var.offset, 4); + gcc_assert (excess < GCOV_BLOCK_SIZE); + memmove (gcov_var.buffer, gcov_var.buffer + gcov_var.offset, excess * 4); } #else memmove (gcov_var.buffer, gcov_var.buffer + gcov_var.offset, excess * 4); @@ -428,8 +551,7 @@ gcov_var.offset = 0; gcov_var.length = excess; #if IN_LIBGCOV - gcc_assert (!gcov_var.length || gcov_var.length == 1); - excess = GCOV_BLOCK_SIZE; + excess = (sizeof (gcov_var.buffer) / sizeof (gcov_var.buffer[0])) - gcov_var.length; #else if (gcov_var.length + words > gcov_var.alloc) gcov_allocate (gcov_var.length + words); @@ -489,7 +611,6 @@ buffer, or NULL on empty string. You must copy the string before calling another gcov function. */ -#if !IN_LIBGCOV GCOV_LINKAGE const char * gcov_read_string (void) { @@ -500,7 +621,6 @@ return (const char *) gcov_read_words (length); } -#endif GCOV_LINKAGE void gcov_read_summary (struct gcov_summary *summary) @@ -629,6 +749,89 @@ } #endif +#ifndef __GCOV_KERNEL__ +/* Convert an unsigned NUMBER to a percentage after dividing by + 100. */ + +GCOV_LINKAGE float +convert_unsigned_to_pct (const unsigned number) +{ + return (float)number / 100.0f; +} +#endif + +#if !IN_LIBGCOV && IN_GCOV != 1 +/* Print load latency information given by LL_INFO in a human readable + format into an open output file pointed by FP. NEWLINE specifies + whether or not to print a trailing newline. 
*/ + +GCOV_LINKAGE void +print_load_latency_line (FILE *fp, const gcov_pmu_ll_info_t *ll_info, + const enum print_newline newline) +{ + if (!ll_info) + return; + fprintf (fp, " %u %.2f%% %.2f%% %.2f%% %.2f%% %.2f%% %.2f%% %.2f%% " + "%.2f%% %.2f%% " HOST_WIDEST_INT_PRINT_HEX " %s %d %d", + ll_info->counts, + convert_unsigned_to_pct (ll_info->self), + convert_unsigned_to_pct (ll_info->cum), + convert_unsigned_to_pct (ll_info->lt_10), + convert_unsigned_to_pct (ll_info->lt_32), + convert_unsigned_to_pct (ll_info->lt_64), + convert_unsigned_to_pct (ll_info->lt_256), + convert_unsigned_to_pct (ll_info->lt_1024), + convert_unsigned_to_pct (ll_info->gt_1024), + convert_unsigned_to_pct (ll_info->wself), + ll_info->code_addr, + ll_info->filename, + ll_info->line, + ll_info->discriminator); + if (newline == add_newline) + fprintf (fp, "\n"); +} + +/* Print BRM_INFO into the file pointed by FP. NEWLINE specifies + whether or not to print a trailing newline. */ + +GCOV_LINKAGE void +print_branch_mispredict_line (FILE *fp, const gcov_pmu_brm_info_t *brm_info, + const enum print_newline newline) +{ + if (!brm_info) + return; + fprintf (fp, " %u %.2f%% %.2f%% " HOST_WIDEST_INT_PRINT_HEX " %s %d %d", + brm_info->counts, + convert_unsigned_to_pct (brm_info->self), + convert_unsigned_to_pct (brm_info->cum), + brm_info->code_addr, + brm_info->filename, + brm_info->line, + brm_info->discriminator); + if (newline == add_newline) + fprintf (fp, "\n"); +} + +/* Print TOOL_HEADER into the file pointed by FP. NEWLINE specifies + whether or not to print a trailing newline. 
*/ + +GCOV_LINKAGE void +print_pmu_tool_header (FILE *fp, gcov_pmu_tool_header_t *tool_header, + const enum print_newline newline) +{ + if (!tool_header) + return; + fprintf (fp, "\nhost_cpu: %s\n", tool_header->host_cpu); + fprintf (fp, "hostname: %s\n", tool_header->hostname); + fprintf (fp, "kernel_version: %s\n", tool_header->kernel_version); + fprintf (fp, "column_header: %s\n", tool_header->column_header); + fprintf (fp, "column_description: %s\n", tool_header->column_description); + fprintf (fp, "full_header: %s\n", tool_header->full_header); + if (newline == add_newline) + fprintf (fp, "\n"); +} +#endif + #if IN_GCOV > 0 /* Return the modification time of the current gcov file. */ @@ -715,7 +918,7 @@ if (vsize <= vpos) { printk (KERN_ERR - "GCOV_KERNEL: something wrong: vbuf=%p vsize=%u vpos=%u\n", + "GCOV_KERNEL: something wrong: vbuf=%p vsize=%u vpos=%u\n", vbuf, vsize, vpos); return 0; } @@ -744,4 +947,29 @@ gcc_assert (0); /* should not reach here */ return 0; } +#else /* __GCOV_KERNEL__ */ + +#if IN_GCOV != 1 +/* Delete pmu tool header TOOL_HEADER. 
*/ + +GCOV_LINKAGE void +destroy_pmu_tool_header (gcov_pmu_tool_header_t *tool_header) +{ + if (!tool_header) + return; + if (tool_header->host_cpu) + free (tool_header->host_cpu); + if (tool_header->hostname) + free (tool_header->hostname); + if (tool_header->kernel_version) + free (tool_header->kernel_version); + if (tool_header->column_header) + free (tool_header->column_header); + if (tool_header->column_description) + free (tool_header->column_description); + if (tool_header->full_header) + free (tool_header->full_header); +} +#endif + #endif /* GCOV_KERNEL */ Index: gcc/gcov-io.h =================================================================== --- gcc/gcov-io.h (revision 175346) +++ gcc/gcov-io.h (working copy) @@ -313,6 +313,7 @@ typedef unsigned gcov_unsigned_t; typedef unsigned gcov_position_t; + /* gcov_type is typedef'd elsewhere for the compiler */ #if IN_GCOV #define GCOV_LINKAGE static @@ -363,15 +364,24 @@ #define gcov_write_counter __gcov_write_counter #define gcov_write_summary __gcov_write_summary #define gcov_write_module_info __gcov_write_module_info +#define gcov_write_string __gcov_write_string +#define gcov_string_length __gcov_string_length #define gcov_read_unsigned __gcov_read_unsigned #define gcov_read_counter __gcov_read_counter +#define gcov_read_string __gcov_read_string #define gcov_read_summary __gcov_read_summary #define gcov_read_module_info __gcov_read_module_info #define gcov_sort_n_vals __gcov_sort_n_vals +#define gcov_canonical_filename __gcov_canonical_filename +#define gcov_read_pmu_load_latency_info __gcov_read_pmu_load_latency_info +#define gcov_read_pmu_branch_mispredict_info __gcov_read_pmu_branch_mispredict_info +#define gcov_read_pmu_tool_header __gcov_read_pmu_tool_header +#define destroy_pmu_tool_header __destroy_pmu_tool_header + /* Poison these, so they don't accidentally slip in.
*/ -#pragma GCC poison gcov_write_string gcov_write_tag gcov_write_length -#pragma GCC poison gcov_read_string gcov_sync gcov_time gcov_magic +#pragma GCC poison gcov_write_tag gcov_write_length +#pragma GCC poison gcov_sync gcov_time gcov_magic #ifdef HAVE_GAS_HIDDEN #define ATTRIBUTE_HIDDEN __attribute__ ((__visibility__ ("hidden"))) @@ -432,6 +442,13 @@ #define GCOV_TAG_SUMMARY_LENGTH \ (1 + GCOV_COUNTERS_SUMMABLE * (2 + 3 * 2)) #define GCOV_TAG_MODULE_INFO ((gcov_unsigned_t)0xa4000000) +#define GCOV_TAG_PMU_LOAD_LATENCY_INFO ((gcov_unsigned_t)0xa5000000) +#define GCOV_TAG_PMU_LOAD_LATENCY_LENGTH(filename) \ + (gcov_string_length (filename) + 12 + 2) +#define GCOV_TAG_PMU_BRANCH_MISPREDICT_INFO ((gcov_unsigned_t)0xa7000000) +#define GCOV_TAG_PMU_BRANCH_MISPREDICT_LENGTH(filename) \ + (gcov_string_length (filename) + 5 + 2) +#define GCOV_TAG_PMU_TOOL_HEADER ((gcov_unsigned_t)0xa9000000) /* Counters that are collected. */ #define GCOV_COUNTER_ARCS 0 /* Arc transitions. */ @@ -545,6 +562,8 @@ #define GCOV_MODULE_ASM_STMTS (1 << 16) #define GCOV_MODULE_LANG_MASK 0xffff +enum print_newline {no_newline, add_newline}; + /* Source module info. The data structure is used in both runtime and profile-use phase. Make sure to allocate enough space for the variable length member. */ @@ -576,6 +595,91 @@ && !((module_infos[0]->lang & GCOV_MODULE_ASM_STMTS) \ && flag_ripa_disallow_asm_modules)) +/* Information about the hardware performance monitoring unit. */ +struct gcov_pmu_info +{ + const char *pmu_profile_filename; /* pmu profile filename */ + const char *pmu_tool; /* canonical pmu tool options */ + gcov_unsigned_t pmu_top_n_address; /* how many top addresses to symbolize */ +}; + +/* Information about the PMU tool header. 
*/ +typedef struct gcov_pmu_tool_header { + char *host_cpu; + char *hostname; + char *kernel_version; + char *column_header; + char *column_description; + char *full_header; +} gcov_pmu_tool_header_t; + +/* Available only for PMUs which support PEBS or IBS using pfmon + tool. If any field here is changed, the length computation in + GCOV_TAG_PMU_LOAD_LATENCY_LENGTH must be updated as well. All + percentages are multiplied by 100 to make them out of 10000 and + only integer part is kept. */ +typedef struct gcov_pmu_load_latency_info +{ + gcov_unsigned_t counts; /* raw count of samples */ + gcov_unsigned_t self; /* per 10k of total samples */ + gcov_unsigned_t cum; /* per 10k cumulative weight */ + gcov_unsigned_t lt_10; /* per 10k with latency <= 10 cycles */ + gcov_unsigned_t lt_32; /* per 10k with latency <= 32 cycles */ + gcov_unsigned_t lt_64; /* per 10k with latency <= 64 cycles */ + gcov_unsigned_t lt_256; /* per 10k with latency <= 256 cycles */ + gcov_unsigned_t lt_1024; /* per 10k with latency <= 1024 cycles */ + gcov_unsigned_t gt_1024; /* per 10k with latency > 1024 cycles */ + gcov_unsigned_t wself; /* weighted average cost of this miss in cycles */ + gcov_type code_addr; /* the actual miss address (pc+1 for Intel) */ + gcov_unsigned_t line; /* line number corresponding to this miss */ + gcov_unsigned_t discriminator; /* discriminator information for this miss */ + char *filename; /* filename corresponding to this miss */ +} gcov_pmu_ll_info_t; + +/* This structure is used during runtime as well as in gcov. */ +typedef struct load_latency_infos +{ + /* An array describing the total number of load latency fields. */ + gcov_pmu_ll_info_t **ll_array; + /* The total number of entries in the load latency array. */ + unsigned ll_count; + /* The total number of entries currently allocated in the array. + Used for bookkeeping. 
*/ + unsigned alloc_ll_count; + /* PMU tool header */ + gcov_pmu_tool_header_t *pmu_tool_header; +} ll_infos_t; + +/* Available only for PMUs which support PEBS or IBS using pfmon + tool. If any field here is changed, the length computation in + GCOV_TAG_PMU_BRANCH_MISPREDICT_LENGTH must be updated as well. All + percentages are multiplied by 100 to make them out of 10000 and + only integer part is kept. */ +typedef struct gcov_pmu_branch_mispredict_info +{ + gcov_unsigned_t counts; /* raw count of samples */ + gcov_unsigned_t self; /* per 10k of total samples */ + gcov_unsigned_t cum; /* per 10k cumulative weight */ + gcov_type code_addr; /* the actual mispredict address */ + gcov_unsigned_t line; /* line number corresponding to this event */ + gcov_unsigned_t discriminator; /* discriminator for this event */ + char *filename; /* filename corresponding to this event */ +} gcov_pmu_brm_info_t; + +/* This structure is used during runtime as well as in gcov. */ +typedef struct branch_mispredict_infos +{ + /* An array describing the total number of mispredict entries. */ + gcov_pmu_brm_info_t **brm_array; + /* The total number of entries in the above array. */ + unsigned brm_count; + /* The total number of entries currently allocated in the array. + Used for bookkeeping. */ + unsigned alloc_brm_count; + /* PMU tool header */ + gcov_pmu_tool_header_t *pmu_tool_header; +} brm_infos_t; + /* Structures embedded in coveraged program. The structures generated by write_profile must match these. */ @@ -635,9 +739,6 @@ /* Register a new object file module. */ extern void __gcov_init (struct gcov_info *) ATTRIBUTE_HIDDEN; -/* Set sampling rate to RATE. */ -extern void __gcov_set_sampling_rate (unsigned int rate); - /* Called before fork, to avoid double counting.
*/ extern void __gcov_flush (void) ATTRIBUTE_HIDDEN; @@ -674,6 +775,12 @@ extern void __gcov_ior_profiler (gcov_type *, gcov_type); extern void __gcov_sort_n_vals (gcov_type *value_array, int n); +/* Initialize/start/stop/dump performance monitoring unit (PMU) profile */ +void __gcov_init_pmu_profiler (struct gcov_pmu_info *) ATTRIBUTE_HIDDEN; +void __gcov_start_pmu_profiler (void) ATTRIBUTE_HIDDEN; +void __gcov_stop_pmu_profiler (void) ATTRIBUTE_HIDDEN; +void __gcov_end_pmu_profiler (int gcda_error) ATTRIBUTE_HIDDEN; + #ifndef inhibit_libc /* The wrappers around some library functions.. */ extern pid_t __gcov_fork (void) ATTRIBUTE_HIDDEN; @@ -746,14 +853,44 @@ static gcov_position_t gcov_position (void); static int gcov_is_error (void); +GCOV_LINKAGE const char *gcov_read_string (void) ATTRIBUTE_HIDDEN; GCOV_LINKAGE gcov_unsigned_t gcov_read_unsigned (void) ATTRIBUTE_HIDDEN; GCOV_LINKAGE gcov_type gcov_read_counter (void) ATTRIBUTE_HIDDEN; GCOV_LINKAGE void gcov_read_summary (struct gcov_summary *) ATTRIBUTE_HIDDEN; +GCOV_LINKAGE char *gcov_canonical_filename (char *filename) ATTRIBUTE_HIDDEN; +GCOV_LINKAGE void +gcov_read_pmu_load_latency_info (gcov_pmu_ll_info_t *ll_info, + gcov_unsigned_t len) ATTRIBUTE_HIDDEN; +GCOV_LINKAGE void +gcov_read_pmu_branch_mispredict_info (gcov_pmu_brm_info_t *brm_info, + gcov_unsigned_t len) ATTRIBUTE_HIDDEN; +GCOV_LINKAGE void +gcov_read_pmu_tool_header (gcov_pmu_tool_header_t *tool_header, + gcov_unsigned_t len) ATTRIBUTE_HIDDEN; +#ifndef __GCOV_KERNEL__ +GCOV_LINKAGE float convert_unsigned_to_pct ( + const unsigned number) ATTRIBUTE_HIDDEN; +#endif /* __GCOV_KERNEL__ */ + #if !IN_LIBGCOV && IN_GCOV != 1 GCOV_LINKAGE void gcov_read_module_info (struct gcov_module_info *mod_info, gcov_unsigned_t len) ATTRIBUTE_HIDDEN; +GCOV_LINKAGE void print_load_latency_line (FILE *fp, + const gcov_pmu_ll_info_t *ll_info, + const enum print_newline); +GCOV_LINKAGE void +print_branch_mispredict_line (FILE *fp, const gcov_pmu_brm_info_t *brm_info, 
+ const enum print_newline); +GCOV_LINKAGE void print_pmu_tool_header (FILE *fp, + gcov_pmu_tool_header_t *tool_header, + const enum print_newline); #endif +#if IN_GCOV != 1 +GCOV_LINKAGE void destroy_pmu_tool_header (gcov_pmu_tool_header_t *tool_header) + ATTRIBUTE_HIDDEN; +#endif + #if IN_LIBGCOV /* Available only in libgcov */ GCOV_LINKAGE void gcov_write_counter (gcov_type) ATTRIBUTE_HIDDEN; @@ -771,10 +908,10 @@ static void gcov_rewrite (void); GCOV_LINKAGE void gcov_seek (gcov_position_t /*position*/) ATTRIBUTE_HIDDEN; GCOV_LINKAGE void gcov_truncate (void) ATTRIBUTE_HIDDEN; +GCOV_LINKAGE gcov_unsigned_t gcov_string_length (const char *) ATTRIBUTE_HIDDEN; GCOV_LINKAGE unsigned gcov_gcda_file_size (struct gcov_info *); #else /* Available outside libgcov */ -GCOV_LINKAGE const char *gcov_read_string (void); GCOV_LINKAGE void gcov_sync (gcov_position_t /*base*/, gcov_unsigned_t /*length */); #endif @@ -782,11 +919,11 @@ #if !IN_GCOV /* Available outside gcov */ GCOV_LINKAGE void gcov_write_unsigned (gcov_unsigned_t) ATTRIBUTE_HIDDEN; +GCOV_LINKAGE void gcov_write_string (const char *) ATTRIBUTE_HIDDEN; #endif #if !IN_GCOV && !IN_LIBGCOV /* Available only in compiler */ -GCOV_LINKAGE void gcov_write_string (const char *); GCOV_LINKAGE gcov_position_t gcov_write_tag (gcov_unsigned_t); GCOV_LINKAGE void gcov_write_length (gcov_position_t /*position*/); #endif Index: gcc/opts.c =================================================================== --- gcc/opts.c (revision 175346) +++ gcc/opts.c (working copy) @@ -36,6 +36,9 @@ #include "insn-attr.h" /* For INSN_SCHEDULING and DELAY_SLOTS. */ #include "target.h" +/* Defined in coverage.c. */ +extern int check_pmu_profile_options (const char *options); + /* Parse the -femit-struct-debug-detailed option value and set the flag variables. 
*/ @@ -1597,6 +1600,15 @@ opts->x_flag_ipa_reference = false; break; + case OPT_fpmu_profile_generate_: + /* This should be ideally turned on in conjunction with + -fprofile-dir or -fprofile-generate in order to specify a + profile directory. */ + if (check_pmu_profile_options (arg)) + error ("Unrecognized pmu_profile_generate value \"%s\"", arg); + flag_pmu_profile_generate = xstrdup (arg); + break; + case OPT_fshow_column: dc->show_column = value; break; Index: gcc/pmu-profile.c =================================================================== --- gcc/pmu-profile.c (revision 0) +++ gcc/pmu-profile.c (revision 0) @@ -0,0 +1,1552 @@ +/* Performance monitoring unit (PMU) profiler. If available, use an + external tool to collect hardware performance counter data and + write it in the .gcda files. + + Copyright (C) 2011. Free Software Foundation, Inc. + Contributed by Sharad Singhai . + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free +Software Foundation; either version 3, or (at your option) any later +version. + +GCC is distributed in the hope that it will be useful, but WITHOUT ANY +WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +for more details. + +Under Section 7 of GPL version 3, you are granted additional +permissions described in the GCC Runtime Library Exception, version +3.1, as published by the Free Software Foundation. + +You should have received a copy of the GNU General Public License and +a copy of the GCC Runtime Library Exception along with this program; +see the files COPYING3 and COPYING.RUNTIME respectively. If not, see +. 
*/ + +#include "tconfig.h" +#include "tsystem.h" +#include "coretypes.h" +#include "tm.h" +#if (defined (__x86_64__) || defined (__i386__)) +#include "cpuid.h" +#endif + +#if defined(inhibit_libc) +#define IN_LIBGCOV (-1) +#else +#include +#include +#define IN_LIBGCOV 1 + #if defined(L_gcov) + #define GCOV_LINKAGE /* nothing */ + #endif +#endif +#include "gcov-io.h" +#ifdef TARGET_POSIX_IO + #include + #include + #include + #include +#endif + +#if defined(inhibit_libc) +#else +#include +#include +#include +#include + +#include +#include + +#define XNEWVEC(type,ne) (type *)calloc((ne),sizeof(type)) +#define XNEW(type) (type *)malloc(sizeof(type)) +#define XDELETEVEC(p) free(p) +#define XDELETE(p) free(p) + +#define PFMON_CMD "/usr/bin/pfmon" +#define ADDR2LINE_CMD "/usr/bin/addr2line" +#define PMU_TOOL_MAX_ARGS (20) +static char default_addr2line[] = "??:0"; +static const char pfmon_ll_header[] = "# counts %self %cum " + "<10 <32 <64 <256 <1024 >=1024 %wself " + "code addr symbol\n"; +static const char pfmon_bm_header[] = + "# counts %self %cum code addr symbol\n"; + +const char *pfmon_intel_ll_args[PMU_TOOL_MAX_ARGS] = { + PFMON_CMD, + "--aggregate-results", + "--follow-all", + "--with-header", + "--smpl-module=pebs-ll", + "--ld-lat-threshold=4", + "--pebs-ll-dcmiss-code", + "--resolve-addresses", + "-emem_inst_retired:LATENCY_ABOVE_THRESHOLD", + "--long-smpl-periods=10000", + 0 /* terminating NULL must be present */ +}; + +const char *pfmon_amd_ll_args[PMU_TOOL_MAX_ARGS] = { + PFMON_CMD, + "--aggregate-results", + "--follow-all", + "-uk", + "--with-header", + "--smpl-module=ibs", + "--resolve-addresses", + "-eibsop_event:uops", + "--ibs-dcmiss-code", + "--long-smpl-periods=0xffff0", + 0 /* terminating NULL must be present */ +}; + +const char *pfmon_intel_brm_args[PMU_TOOL_MAX_ARGS] = { + PFMON_CMD, + "--aggregate-results", + "--follow-all", + "--with-header", + "--resolve-addresses", + "-eMISPREDICTED_BRANCH_RETIRED", + "--long-smpl-periods=10000", + 0 /* 
terminating NULL must be present */ +}; + +const char *pfmon_amd_brm_args[PMU_TOOL_MAX_ARGS] = { + PFMON_CMD, + "--aggregate-results", + "--follow-all", + "--with-header", + "--resolve-addresses", + "-eRETIRED_MISPREDICTED_BRANCH_INSTRUCTIONS", + "--long-smpl-periods=10000", + 0 /* terminating NULL must be present */ +}; + +const char *addr2line_args[PMU_TOOL_MAX_ARGS] = { + ADDR2LINE_CMD, + "-e", + 0 /* terminating NULL must be present */ +}; + + +enum pmu_tool_type +{ + PTT_PFMON, + PTT_LAST +}; + +enum pmu_event_type +{ + PET_INTEL_LOAD_LATENCY, + PET_AMD_LOAD_LATENCY, + PET_INTEL_BRANCH_MISPREDICT, + PET_AMD_BRANCH_MISPREDICT, + PET_LAST +}; + +typedef struct pmu_tool_fns { + const char *name; /* name of the pmu tool */ + /* pmu tool commandline argument. */ + const char **arg_array; + /* Initialize pmu module. */ + void *(*init_pmu_module) (void); + /* Start profiling. */ + void (*start_pmu_module) (pid_t ppid, char *tmpfile, const char **args); + /* Stop profiling. */ + void (*stop_pmu_module) (void); + /* How to parse the output generated by the PMU tool. */ + int (*parse_pmu_output) (char *filename, void *pmu_data); + /* How to write parsed pmu data into gcda file. */ + void (*gcov_write_pmu_data) (void *data); + /* How to cleanup any data structure created during parsing. */ + void (*cleanup_pmu_data) (void *data); + /* How to initialize symbolizer for the PPID. */ + int (*start_symbolizer) (pid_t ppid); + void (*end_symbolizer) (void); + char *(*symbolize) (void *addr); +} pmu_tool_fns; + +enum pmu_state +{ + PMU_NONE, /* Not configured at all. */ + PMU_INITIALIZED, /* Configured and initialized. */ + PMU_ERROR, /* Configuration error. Cannot recover. */ + PMU_ON, /* Currently profiling. */ + PMU_OFF /* Currently stopped, but can be restarted. */ +}; + +enum cpu_vendor_signature +{ + CPU_VENDOR_UKNOWN = 0, + CPU_VENDOR_INTEL = 0x756e6547, /* Genu */ + CPU_VENDOR_AMD = 0x68747541 /* Auth */ +}; + +/* Info about pmu tool during the run time.
*/ +struct pmu_tool_info +{ + /* Current pmu tool. */ + enum pmu_tool_type tool; + /* Current event. */ + enum pmu_event_type event; + /* filename for storing the pmu profile. */ + char *pmu_profile_filename; + /* Intermediate file where the tool stores the PMU data. */ + char *raw_pmu_profile_filename; + /* Where PMU tool's stderr should be stored. */ + char *tool_stderr_filename; + enum pmu_state pmu_profiling_state; + enum cpu_vendor_signature cpu_vendor; /* as discovered by cpuid */ + pid_t pmu_tool_pid; /* process id of the pmu tool */ + pid_t symbolizer_pid; /* process id of the symbolizer */ + int symbolizer_to_pipefd[2]; /* pipe for writing to the symbolizer */ + int symbolizer_from_pipefd[2]; /* pipe for reading from the symbolizer */ + void *pmu_data; /* an opaque pointer for the tool to store pmu data */ + int verbose; /* turn on additional debugging */ + unsigned top_n_address; /* how many addresses to symbolize */ + pmu_tool_fns *tool_details; /* list of functions how to start/stop/parse */ +}; + +/* Global struct for recordkeeping. */ +static struct pmu_tool_info *the_pmu_tool_info; + +/* Additional info is printed if these are non-zero. 
*/ +static int tool_debug = 0; +static int sym_debug = 0; + +static int parse_load_latency_line (char *line, gcov_pmu_ll_info_t *ll_info); +static int parse_branch_mispredict_line (char *line, + gcov_pmu_brm_info_t *brm_info); +static unsigned convert_pct_to_unsigned (float pct); +static void start_pfmon_module (pid_t ppid, char *tmpfile, const char **pfmon_args); +static void *init_pmu_load_latency (void); +static void *init_pmu_branch_mispredict (void); +static void destroy_load_latency_infos (void *info); +static void destroy_branch_mispredict_infos (void *info); +static int parse_pfmon_load_latency (char *filename, void *pmu_data); +static int parse_pfmon_branch_mispredicts (char *filename, void *pmu_data); +static gcov_unsigned_t gcov_tag_pmu_tool_header_length (gcov_pmu_tool_header_t + *header); +static void gcov_write_tool_header (gcov_pmu_tool_header_t *header); +static void gcov_write_load_latency_infos (void *info); +static void gcov_write_branch_mispredict_infos (void *info); +static void gcov_write_ll_line (const gcov_pmu_ll_info_t *ll_info); +static void gcov_write_branch_mispredict_line (const gcov_pmu_brm_info_t + *brm_info); +static int start_addr2line_symbolizer (pid_t pid); +static void end_addr2line_symbolizer (void); +static char *symbolize_addr2line (void *p); +static void reset_symbolizer_parent_pipes (void); +static void reset_symbolizer_child_pipes (void); +/* parse and cache relevant tool info. */ +static int parse_pmu_profile_options (const char *options); +static gcov_pmu_tool_header_t *parse_pfmon_tool_header (FILE *fp, + const char *end_header); + + +/* How to access the necessary functions for the PMU tools. 
*/ +pmu_tool_fns all_pmu_tool_fns[PTT_LAST][PET_LAST] = { + { + { + "intel-load-latency", /* name */ + pfmon_intel_ll_args, /* tool args */ + init_pmu_load_latency, /* initialization */ + start_pfmon_module, /* start */ + 0, /* stop */ + parse_pfmon_load_latency, /* parse */ + gcov_write_load_latency_infos, /* write */ + destroy_load_latency_infos, /* cleanup */ + start_addr2line_symbolizer, /* start symbolizer */ + end_addr2line_symbolizer, /* end symbolizer */ + symbolize_addr2line, /* symbolize */ + }, + { + "amd-load-latency", /* name */ + pfmon_amd_ll_args, /* tool args */ + init_pmu_load_latency, /* initialization */ + start_pfmon_module, /* start */ + 0, /* stop */ + parse_pfmon_load_latency, /* parse */ + gcov_write_load_latency_infos, /* write */ + destroy_load_latency_infos, /* cleanup */ + start_addr2line_symbolizer, /* start symbolizer */ + end_addr2line_symbolizer, /* end symbolizer */ + symbolize_addr2line, /* symbolize */ + }, + { + "intel-branch-mispredict", /* name */ + pfmon_intel_brm_args, /* tool args */ + init_pmu_branch_mispredict, /* initialization */ + start_pfmon_module, /* start */ + 0, /* stop */ + parse_pfmon_branch_mispredicts, /* parse */ + gcov_write_branch_mispredict_infos,/* write */ + destroy_branch_mispredict_infos, /* cleanup */ + start_addr2line_symbolizer, /* start symbolizer */ + end_addr2line_symbolizer, /* end symbolizer */ + symbolize_addr2line, /* symbolize */ + }, + { + "amd-branch-mispredict", /* name */ + pfmon_amd_brm_args, /* tool args */ + init_pmu_branch_mispredict, /* initialization */ + start_pfmon_module, /* start */ + 0, /* stop */ + parse_pfmon_branch_mispredicts, /* parse */ + gcov_write_branch_mispredict_infos,/* write */ + destroy_branch_mispredict_infos, /* cleanup */ + start_addr2line_symbolizer, /* start symbolizer */ + end_addr2line_symbolizer, /* end symbolizer */ + symbolize_addr2line, /* symbolize */ + } + } +}; + +/* Determine the CPU vendor. 
Currently only distinguishes x86 based + cpus where the vendor is either Intel or AMD. Returns one of the + enum cpu_vendor_signatures. */ + +static unsigned int +get_x86cpu_vendor (void) +{ + unsigned int vendor = CPU_VENDOR_UKNOWN; + +#if (defined (__x86_64__) || defined (__i386__)) + if (__get_cpuid_max (0, &vendor) < 1) + return CPU_VENDOR_UKNOWN; /* Cannot determine cpu type. */ +#endif + + if (vendor == CPU_VENDOR_INTEL || vendor == CPU_VENDOR_AMD) + return vendor; + else + return CPU_VENDOR_UKNOWN; +} + + +/* Parse PMU tool option string provided on the command line and store + information in global structure. Return 0 on success, otherwise + return 1. Any changes to this should be synced with + check_pmu_profile_options() which does the compile time check. */ + +static int +parse_pmu_profile_options (const char *options) +{ + enum pmu_tool_type ptt = the_pmu_tool_info->tool; + enum pmu_event_type pet = PET_LAST; + const char *pmutool_path; + the_pmu_tool_info->cpu_vendor = get_x86cpu_vendor (); + /* Determine the platform we are running on. */ + if (the_pmu_tool_info->cpu_vendor == CPU_VENDOR_UKNOWN) + { + /* Cpuid failed or unknown vendor. */ + the_pmu_tool_info->pmu_profiling_state = PMU_ERROR; + return 1; + } + + /* Validate the options. */ + if (strcmp(options, "load-latency") && + strcmp(options, "load-latency-verbose") && + strcmp(options, "branch-mispredict") && + strcmp(options, "branch-mispredict-verbose")) + return 1; + + /* Check if we are asked to collect load latency PMU data. */ + if (!strcmp(options, "load-latency") || + !strcmp(options, "load-latency-verbose")) + { + if (the_pmu_tool_info->cpu_vendor == CPU_VENDOR_INTEL) + pet = PET_INTEL_LOAD_LATENCY; + else + pet = PET_AMD_LOAD_LATENCY; + if (!strcmp(options, "load-latency-verbose")) + the_pmu_tool_info->verbose = 1; + } + + /* Check if we are asked to collect branch mispredict PMU data.
*/ + if (!strcmp(options, "branch-mispredict") || + !strcmp(options, "branch-mispredict-verbose")) + { + if (the_pmu_tool_info->cpu_vendor == CPU_VENDOR_INTEL) + pet = PET_INTEL_BRANCH_MISPREDICT; + else + pet = PET_AMD_BRANCH_MISPREDICT; + if (!strcmp(options, "branch-mispredict-verbose")) + the_pmu_tool_info->verbose = 1; + } + + the_pmu_tool_info->tool_details = &all_pmu_tool_fns[ptt][pet]; + the_pmu_tool_info->event = pet; + + /* Allow users to override the default tool path. */ + pmutool_path = getenv ("GCOV_PMUTOOL_PATH"); + if (pmutool_path && strlen (pmutool_path)) + the_pmu_tool_info->tool_details->arg_array[0] = pmutool_path; + + return 0; +} + +/* Do the initialization of addr2line symbolizer for the process id + given by TASK_PID. It forks an addr2line process and creates two + pipes where addresses can be written and source_filename:line_num + entries can be read. Returns 0 on success, non-zero otherwise. */ + +static int +start_addr2line_symbolizer (pid_t task_pid) +{ + pid_t pid; + char *addr2line_path; + + /* Allow users to override the default addr2line path. */ + addr2line_path = getenv ("GCOV_ADDR2LINE_PATH"); + if (addr2line_path && strlen (addr2line_path)) + addr2line_args[0] = addr2line_path; + + if (pipe (the_pmu_tool_info->symbolizer_from_pipefd) == -1) + { + fprintf (stderr, "Cannot create symbolizer write pipe.\n"); + return 1; + } + if (pipe (the_pmu_tool_info->symbolizer_to_pipefd) == -1) + { + fprintf (stderr, "Cannot create symbolizer read pipe.\n"); + return 1; + } + + pid = fork (); + if (pid == -1) + { + /* error condition */ + fprintf (stderr, "Cannot create symbolizer process.\n"); + reset_symbolizer_parent_pipes (); + reset_symbolizer_child_pipes (); + return 1; + } + + if (pid == 0) + { + /* child does an exec and then connects to/from the pipe */ + unsigned n_args = 0; + char proc_exe_buf[128]; + int new_write_fd, new_read_fd; + int i; + + /* Go over the current addr2line args. 
*/ + for (i = 0; i < PMU_TOOL_MAX_ARGS && addr2line_args[i]; ++i) + n_args++; + + /* We are going to add one more arg for the /proc/pid/exe */ + if (n_args >= (PMU_TOOL_MAX_ARGS - 1)) + { + fprintf (stderr, "too many addr2line args: %d\n", n_args); + _exit (0); + } + snprintf (proc_exe_buf, sizeof (proc_exe_buf), "/proc/%d/exe", + task_pid); + + /* Add the extra arg for the process id. */ + addr2line_args[n_args] = proc_exe_buf; + n_args++; + + addr2line_args[n_args] = (const char *)NULL; /* terminating NULL */ + + if (sym_debug) + { + fprintf (stderr, "addr2line args:"); + for (i = 0; i < PMU_TOOL_MAX_ARGS && addr2line_args[i]; ++i) + fprintf (stderr, " %s", addr2line_args[i]); + fprintf (stderr, "\n"); + } + + /* Close unused ends of the two pipes. */ + reset_symbolizer_child_pipes (); + + /* Connect the pipes to stdin/stdout of the child process. */ + new_read_fd = dup2 (the_pmu_tool_info->symbolizer_to_pipefd[0], 0); + new_write_fd = dup2 (the_pmu_tool_info->symbolizer_from_pipefd[1], 1); + if (new_read_fd == -1 || new_write_fd == -1) + { + fprintf (stderr, "could not dup symbolizer fds\n"); + reset_symbolizer_parent_pipes (); + reset_symbolizer_child_pipes (); + _exit (0); + } + the_pmu_tool_info->symbolizer_to_pipefd[0] = new_read_fd; + the_pmu_tool_info->symbolizer_from_pipefd[1] = new_write_fd; + + /* Do execve with NULL env. */ + execve (addr2line_args[0], (char * const*)addr2line_args, + (char * const*)NULL); + /* exec returned, an error condition. */ + fprintf (stderr, "could not create symbolizer process: %s\n", + addr2line_args[0]); + reset_symbolizer_parent_pipes (); + reset_symbolizer_child_pipes (); + _exit (0); + } + else + { + /* parent */ + the_pmu_tool_info->symbolizer_pid = pid; + /* Close unused ends of the two pipes. */ + reset_symbolizer_parent_pipes (); + return 0; + } + return 0; +} + +/* Close unused write end of the from-pipe and read end of the + to-pipe. 
*/ + +static void +reset_symbolizer_parent_pipes (void) +{ + if (the_pmu_tool_info->symbolizer_from_pipefd[1] != -1) + { + close (the_pmu_tool_info->symbolizer_from_pipefd[1]); + the_pmu_tool_info->symbolizer_from_pipefd[1] = -1; + } + if (the_pmu_tool_info->symbolizer_to_pipefd[0] != -1) + { + close (the_pmu_tool_info->symbolizer_to_pipefd[0]); + the_pmu_tool_info->symbolizer_to_pipefd[0] = -1; + } +} + +/* Close unused write end of the to-pipe and read end of the + from-pipe. */ + +static void +reset_symbolizer_child_pipes (void) +{ + if (the_pmu_tool_info->symbolizer_to_pipefd[1] != -1) + { + close (the_pmu_tool_info->symbolizer_to_pipefd[1]); + the_pmu_tool_info->symbolizer_to_pipefd[1] = -1; + } + if (the_pmu_tool_info->symbolizer_from_pipefd[0] != -1) + { + close (the_pmu_tool_info->symbolizer_from_pipefd[0]); + the_pmu_tool_info->symbolizer_from_pipefd[0] = -1; + } +} + + +/* Perform cleanup for the symbolizer process. */ + +static void +end_addr2line_symbolizer (void) +{ + int pid_status; + int wait_status; + pid_t pid = the_pmu_tool_info->symbolizer_pid; + + /* Symbolizer was not running. */ + if (!pid) + return; + + reset_symbolizer_parent_pipes (); + reset_symbolizer_child_pipes (); + kill (pid, SIGTERM); + wait_status = waitpid (pid, &pid_status, 0); + if (sym_debug) + { + if (wait_status == pid) + fprintf (stderr, "Normal exit. symbolizer terminated.\n"); + else + fprintf (stderr, "Abnormal exit. symbolizer status, %d.\n", pid_status); + } + the_pmu_tool_info->symbolizer_pid = 0; /* Symbolizer no longer running. */ +} + + +/* Given an address ADDR, return a string containing + source_filename:line_num entries. 
*/ + +static char * +symbolize_addr2line (void *addr) +{ + char buf[32]; /* holds the ascii version of address */ + int write_count; + int read_count; + char *srcfile_linenum; + size_t max_length = 1024; + + if (!the_pmu_tool_info->symbolizer_pid) + return default_addr2line; /* symbolizer is not running */ + + write_count = snprintf (buf, sizeof (buf), "%p\n", addr); + + /* Write the address into the pipe. */ + if (write (the_pmu_tool_info->symbolizer_to_pipefd[1], buf, write_count) + < write_count) + { + if (sym_debug) + fprintf (stderr, "Cannot write symbolizer pipe.\n"); + return default_addr2line; + } + + srcfile_linenum = XNEWVEC (char, max_length); + read_count = read (the_pmu_tool_info->symbolizer_from_pipefd[0], + srcfile_linenum, max_length - 1); + if (read_count == -1) + { + if (sym_debug) + fprintf (stderr, "Cannot read symbolizer pipe.\n"); + XDELETEVEC (srcfile_linenum); + return default_addr2line; + } + + srcfile_linenum[read_count] = 0; + if (sym_debug) + fprintf (stderr, "symbolizer: for address %p, read_count %d, got %s\n", + addr, read_count, srcfile_linenum); + return srcfile_linenum; +} + +/* Start monitoring PPID process via pfmon tool using TMPFILE as a + file to store the raw data and using PFMON_ARGS as the command line + arguments. */ + +static void +start_pfmon_module (pid_t ppid, char *tmpfile, const char **pfmon_args) +{ + int i; + unsigned int n_args = 0; + unsigned n_chars; + char pid_buf[64]; + char filename_buf[1024]; + char top_n_buf[24]; + unsigned extra_args; + + /* Go over the current pfmon args */ + for (i = 0; i < PMU_TOOL_MAX_ARGS && pfmon_args[i]; ++i) + n_args++; + + if (the_pmu_tool_info->verbose) + extra_args = 4; /* account for additional --verbose */ + else + extra_args = 3; + + /* We are going to add args. 
*/ + if (n_args >= (PMU_TOOL_MAX_ARGS - extra_args)) + { + fprintf (stderr, "too many pfmon args: %u\n", n_args); + _exit (0); + } + + n_chars = snprintf (pid_buf, sizeof (pid_buf), "--attach-task=%ld", + (long)ppid); + if (n_chars >= sizeof (pid_buf)) + { + fprintf (stderr, "pfmon task id too long: %s\n", pid_buf); + return; + } + pfmon_args[n_args] = pid_buf; + n_args++; + + n_chars = snprintf (filename_buf, sizeof (filename_buf), "--smpl-outfile=%s", + tmpfile); + if (n_chars >= sizeof (filename_buf)) + { + fprintf (stderr, "pfmon filename too long: %s\n", filename_buf); + return; + } + pfmon_args[n_args] = filename_buf; + n_args++; + + n_chars = snprintf (top_n_buf, sizeof (top_n_buf), "--smpl-show-top=%d", + the_pmu_tool_info->top_n_address); + if (n_chars >= sizeof (top_n_buf)) + { + fprintf (stderr, "pfmon option too long: %s\n", top_n_buf); + return; + } + pfmon_args[n_args] = top_n_buf; + n_args++; + + if (the_pmu_tool_info->verbose) { + /* Add --verbose as well. */ + pfmon_args[n_args] = "--verbose"; + n_args++; + } + pfmon_args[n_args] = (char *)NULL; + + if (tool_debug) + { + fprintf (stderr, "pfmon args:"); + for (i = 0; i < PMU_TOOL_MAX_ARGS && pfmon_args[i]; ++i) + fprintf (stderr, " %s", pfmon_args[i]); + fprintf (stderr, "\n"); + } + /* Do execve with NULL env. */ + execve (pfmon_args[0], (char *const *)pfmon_args, (char * const*)NULL); + /* does not return */ +} + +/* Convert a fractional PCT to an unsigned integer after + multiplying by 100. */ + +static unsigned +convert_pct_to_unsigned (float pct) +{ + return (unsigned)(pct * 100.0f); +} + +/* Parse the load latency info pointed to by LINE and save it into + LL_INFO. Returns 0 if the line was parsed successfully, non-zero + otherwise. 
+ + An example header+line look like these: + "counts %self %cum <10 <32 <64 <256 <1024 >=1024 + %wself code addr symbol" + "218 24.06% 24.06% 100.00% 0.00% 0.00% 0.00% 0.00% 0.00% 22.70% + 0x0000000000413e75 CalcSSIM(...)+965" +*/ + +static int +parse_load_latency_line (char *line, gcov_pmu_ll_info_t *ll_info) +{ + unsigned counts; + /* These are percentages parsed as floats, but then converted to + integers after multiplying by 100. */ + float self, cum, lt_10, lt_32, lt_64, lt_256, lt_1024, gt_1024, wself; + long unsigned int p; + int n_values; + pmu_tool_fns *tool_details = the_pmu_tool_info->tool_details; + + n_values = sscanf (line, "%u%f%%%f%%%f%%%f%%%f%%%f%%%f%%%f%%%f%%%lx", + &counts, &self, &cum, &lt_10, &lt_32, &lt_64, &lt_256, + &lt_1024, &gt_1024, &wself, &p); + if (n_values != 11) + return 1; + + /* Values read successfully. Do the assignment after converting + * percentages into ints. */ + ll_info->counts = counts; + ll_info->self = convert_pct_to_unsigned (self); + ll_info->cum = convert_pct_to_unsigned (cum); + ll_info->lt_10 = convert_pct_to_unsigned (lt_10); + ll_info->lt_32 = convert_pct_to_unsigned (lt_32); + ll_info->lt_64 = convert_pct_to_unsigned (lt_64); + ll_info->lt_256 = convert_pct_to_unsigned (lt_256); + ll_info->lt_1024 = convert_pct_to_unsigned (lt_1024); + ll_info->gt_1024 = convert_pct_to_unsigned (gt_1024); + ll_info->wself = convert_pct_to_unsigned (wself); + ll_info->code_addr = p; + + /* Run the raw address through the symbolizer. */ + if (tool_details->symbolize) + { + char *sym_info = tool_details->symbolize ((void *)p); + /* sym_info is of the form src_filename:linenum. Discriminator is + currently not supported by addr2line. */ + char *sep = strchr (sym_info, ':'); + if (!sep) + { + /* Assume entire string is srcfile. */ + ll_info->filename = (char *)sym_info; + ll_info->line = 0; + } + else + { + /* Terminate the filename string at the separator. 
*/ + *sep = 0; + ll_info->filename = (char *)sym_info; + /* Convert rest of the sym info to a line number. */ + ll_info->line = atol (sep+1); + } + ll_info->discriminator = 0; + } + else + { + /* No symbolizer available. */ + ll_info->filename = NULL; + ll_info->line = 0; + ll_info->discriminator = 0; + } + return 0; +} + +/* Parse the branch mispredict info pointed to by LINE and save it into + BRM_INFO. Returns 0 if the line was parsed successfully, non-zero + otherwise. + + An example header+line look like these: + "counts %self %cum code addr symbol" + "6869 37.67% 37.67% 0x00000000004007e5 sum(std::vector > const&)+51" +*/ + +static int +parse_branch_mispredict_line (char *line, gcov_pmu_brm_info_t *brm_info) +{ + unsigned counts; + /* These are percentages parsed as floats, but then converted to + ints after multiplying by 100. */ + float self, cum; + long unsigned int p; + int n_values; + pmu_tool_fns *tool_details = the_pmu_tool_info->tool_details; + + n_values = sscanf (line, "%u%f%%%f%%%lx", + &counts, &self, &cum, &p); + if (n_values != 4) + return 1; + + /* Values read successfully. Do the assignment after converting + * percentages into ints. */ + brm_info->counts = counts; + brm_info->self = convert_pct_to_unsigned (self); + brm_info->cum = convert_pct_to_unsigned (cum); + brm_info->code_addr = p; + + /* Run the raw address through the symbolizer. */ + if (tool_details->symbolize) + { + char *sym_info = tool_details->symbolize ((void *)p); + /* sym_info is of the form src_filename:linenum. Discriminator is + currently not supported by addr2line. */ + char *sep = strchr (sym_info, ':'); + if (!sep) + { + /* Assume entire string is srcfile. */ + brm_info->filename = sym_info; + brm_info->line = 0; + } + else + { + /* Terminate the filename string at the separator. */ + *sep = 0; + brm_info->filename = sym_info; + /* Convert rest of the sym info to a line number. 
*/ + brm_info->line = atol (sep+1); + } + brm_info->discriminator = 0; + } + else + { + /* No symbolizer available. */ + brm_info->filename = NULL; + brm_info->line = 0; + brm_info->discriminator = 0; + } + return 0; +} + +/* Delete load latency info structures INFO. */ + +static void +destroy_load_latency_infos (void *info) +{ + unsigned i; + ll_infos_t* ll_infos = (ll_infos_t *)info; + + /* delete each element */ + for (i = 0; i < ll_infos->ll_count; ++i) + XDELETE (ll_infos->ll_array[i]); + /* delete the array itself */ + XDELETE (ll_infos->ll_array); + __destroy_pmu_tool_header (ll_infos->pmu_tool_header); + free (ll_infos->pmu_tool_header); + ll_infos->ll_array = 0; + ll_infos->ll_count = 0; +} + +/* Delete branch mispredict structure INFO. */ + +static void +destroy_branch_mispredict_infos (void *info) +{ + unsigned i; + brm_infos_t* brm_infos = (brm_infos_t *)info; + + /* delete each element */ + for (i = 0; i < brm_infos->brm_count; ++i) + XDELETE (brm_infos->brm_array[i]); + /* delete the array itself */ + XDELETE (brm_infos->brm_array); + __destroy_pmu_tool_header (brm_infos->pmu_tool_header); + free (brm_infos->pmu_tool_header); + brm_infos->brm_array = 0; + brm_infos->brm_count = 0; +} + +/* Parse FILENAME for load latency lines into a structure + PMU_DATA. Returns 0 on success. Returns non-zero on + failure. 
*/ + +static int +parse_pfmon_load_latency (char *filename, void *pmu_data) +{ + FILE *fp; + size_t buflen = 2*1024; + char *buf; + ll_infos_t *load_latency_infos = (ll_infos_t *)pmu_data; + gcov_pmu_tool_header_t *tool_header = 0; + + if ((fp = fopen (filename, "r")) == NULL) + { + fprintf (stderr, "cannot open pmu data file: %s\n", filename); + return 1; + } + + if (!(tool_header = parse_pfmon_tool_header (fp, pfmon_ll_header))) + { + fprintf (stderr, "cannot parse pmu data file header: %s\n", filename); + fclose (fp); + return 1; + } + + buf = XNEWVEC (char, buflen); + while (fgets (buf, buflen, fp)) + { + gcov_pmu_ll_info_t *ll_info = XNEW (gcov_pmu_ll_info_t); + if (!parse_load_latency_line (buf, ll_info)) + { + /* valid line, add to the array */ + load_latency_infos->ll_count++; + if (load_latency_infos->ll_count >= + load_latency_infos->alloc_ll_count) + { + /* Need to realloc: double the capacity; the size is in bytes. */ + load_latency_infos->alloc_ll_count *= 2; + load_latency_infos->ll_array = + realloc (load_latency_infos->ll_array, + load_latency_infos->alloc_ll_count + * sizeof (gcov_pmu_ll_info_t *)); + if (load_latency_infos->ll_array == NULL) + { + fprintf (stderr, "Cannot allocate load latency memory.\n"); + __destroy_pmu_tool_header (tool_header); + free (buf); + fclose (fp); + return 1; + } + } + load_latency_infos->ll_array[load_latency_infos->ll_count - 1] = + ll_info; + } + else + /* Delete invalid line. */ + XDELETE (ll_info); + } + free (buf); + fclose (fp); + load_latency_infos->pmu_tool_header = tool_header; + return 0; +} + +/* Parse open file FP until END_HEADER is seen. The data matching + gcov_pmu_tool_header_t fields is saved and returned in a new + struct. In case of failure, it returns NULL. 
*/ + +static gcov_pmu_tool_header_t * +parse_pfmon_tool_header (FILE *fp, const char *end_header) +{ + static const char tag_hostname[] = "# hostname: "; + static const char tag_kversion[] = "# kernel version: "; + static const char tag_hostcpu[] = "# host CPUs: "; + static const char tag_column_desc_start[] = "# description of columns:"; + static const char tag_column_desc_end[] = + "# other columns are self-explanatory"; + size_t buflen = 4*1024; + char *buf, *buf_start, *buf_end; + gcov_pmu_tool_header_t *tool_header = XNEWVEC (gcov_pmu_tool_header_t, 1); + char *hostname = 0; + char *kversion = 0; + char *hostcpu = 0; + char *column_description = 0; + char *column_desc_start = 0; + char *column_desc_end = 0; + const char *column_header = 0; + int got_hostname = 0; + int got_kversion = 0 ; + int got_hostcpu = 0; + int got_end_header = 0; + int got_column_description = 0; + + buf = XNEWVEC (char, buflen); + buf_start = buf; + buf_end = buf + buflen; + while (buf < (buf_end - 1) && fgets (buf, buf_end - buf, fp)) + { + if (strncmp (end_header, buf, buf_end - buf) == 0) + { + got_end_header = 1; + break; + } + if (!got_hostname && + strncmp (buf, tag_hostname, strlen (tag_hostname)) == 0) + { + size_t len = strlen (buf) - strlen (tag_hostname); + hostname = XNEWVEC (char, len); + memcpy (hostname, buf + strlen (tag_hostname), len); + hostname[len - 1] = 0; + tool_header->hostname = hostname; + got_hostname = 1; + } + + if (!got_kversion && + strncmp (buf, tag_kversion, strlen (tag_kversion)) == 0) + { + size_t len = strlen (buf) - strlen (tag_kversion); + kversion = XNEWVEC (char, len); + memcpy (kversion, buf + strlen (tag_kversion), len); + kversion[len - 1] = 0; + tool_header->kernel_version = kversion; + got_kversion = 1; + } + + if (!got_hostcpu && + strncmp (buf, tag_hostcpu, strlen (tag_hostcpu)) == 0) + { + size_t len = strlen (buf) - strlen (tag_hostcpu); + hostcpu = XNEWVEC (char, len); + memcpy (hostcpu, buf + strlen (tag_hostcpu), len); + hostcpu[len - 
1] = 0; + tool_header->host_cpu = hostcpu; + got_hostcpu = 1; + } + if (!got_column_description && + strncmp (buf, tag_column_desc_start, strlen (tag_column_desc_start)) + == 0) + { + column_desc_start = buf; + column_desc_end = 0; + /* Continue reading until end of the column descriptor. */ + while (buf < (buf_end - 1) && fgets (buf, buf_end - buf, fp)) + { + if (strncmp (buf, tag_column_desc_end, + strlen (tag_column_desc_end)) == 0) + { + column_desc_end = buf + strlen (tag_column_desc_end); + break; + } + buf += strlen (buf); + } + if (column_desc_end) + { + /* Found the end, copy it into a new string. */ + column_description = XNEWVEC (char, column_desc_end - + column_desc_start + 1); + got_column_description = 1; + strcpy (column_description, column_desc_start); + tool_header->column_description = column_description; + } + } + buf += strlen (buf); + } + + /* If we are missing any of the fields, return NULL. */ + if (!got_end_header || !got_hostname || !got_kversion || !got_hostcpu + || !got_column_description) + { + free (hostname); + free (kversion); + free (hostcpu); + free (column_description); + free (buf_start); + free (tool_header); + return NULL; + } + + switch (the_pmu_tool_info->event) + { + case PET_INTEL_LOAD_LATENCY: + case PET_AMD_LOAD_LATENCY: + column_header = pfmon_ll_header; + break; + case PET_INTEL_BRANCH_MISPREDICT: + case PET_AMD_BRANCH_MISPREDICT: + column_header = pfmon_bm_header; + break; + default: + break; + } + tool_header->column_header = strdup (column_header); + tool_header->full_header = buf_start; + return tool_header; +} + + +/* Parse FILENAME for branch mispredict lines into a structure + PMU_DATA. Returns 0 on success. Returns non-zero on + failure. 
*/ + +static int +parse_pfmon_branch_mispredicts (char *filename, void *pmu_data) +{ + FILE *fp; + size_t buflen = 2*1024; + char *buf; + brm_infos_t *brm_infos = (brm_infos_t *)pmu_data; + gcov_pmu_tool_header_t *tool_header = 0; + + if ((fp = fopen (filename, "r")) == NULL) + { + fprintf (stderr, "cannot open pmu data file: %s\n", filename); + return 1; + } + + if (!(tool_header = parse_pfmon_tool_header (fp, pfmon_bm_header))) + { + fprintf (stderr, "cannot parse pmu data file header: %s\n", filename); + fclose (fp); + return 1; + } + + buf = XNEWVEC (char, buflen); + while (fgets (buf, buflen, fp)) + { + gcov_pmu_brm_info_t *brm = XNEW (gcov_pmu_brm_info_t); + if (!parse_branch_mispredict_line (buf, brm)) + { + /* Valid line, add to the array. */ + brm_infos->brm_count++; + if (brm_infos->brm_count >= brm_infos->alloc_brm_count) + { + /* Need to realloc: double the capacity; the size is in bytes. */ + brm_infos->alloc_brm_count *= 2; + brm_infos->brm_array = + realloc (brm_infos->brm_array, + brm_infos->alloc_brm_count + * sizeof (gcov_pmu_brm_info_t *)); + if (brm_infos->brm_array == NULL) { + fprintf (stderr, + "Cannot allocate memory for br mispredicts.\n"); + __destroy_pmu_tool_header (tool_header); + free (buf); + fclose (fp); + return 1; + } + } + brm_infos->brm_array[brm_infos->brm_count - 1] = brm; + } + else + /* Delete invalid line. */ + XDELETE (brm); + } + free (buf); + fclose (fp); + brm_infos->pmu_tool_header = tool_header; + return 0; +} + +/* Start the monitoring process using pmu tool. Return 0 on success, + non-zero otherwise. 
*/ + +static int +pmu_start (void) +{ + pid_t pid; + + /* no start function */ + if (!the_pmu_tool_info->tool_details->start_pmu_module) + return 1; + + pid = fork (); + if (pid == -1) + { + /* error condition */ + fprintf (stderr, "Cannot create PMU profiling process, exiting.\n"); + return 1; + } + else if (pid == 0) + { + /* child */ + pid_t ppid = getppid(); + char *tmpfile = the_pmu_tool_info->raw_pmu_profile_filename; + const char **pfmon_args = the_pmu_tool_info->tool_details->arg_array; + int new_stderr_fd; + + /* Redirect stderr from the child process into a separate file. */ + new_stderr_fd = creat (the_pmu_tool_info->tool_stderr_filename, + S_IRUSR|S_IWUSR|S_IRGRP|S_IROTH); + if (new_stderr_fd != -1) + dup2 (new_stderr_fd, 2); + /* The following does an exec and thus is not expected to return. */ + the_pmu_tool_info->tool_details->start_pmu_module(ppid, tmpfile, + pfmon_args); + /* exec returned, an error condition. */ + fprintf (stderr, "could not create profiling process: %s\n", + the_pmu_tool_info->tool_details->arg_array[0]); + _exit (0); + } + else + { + /* parent */ + the_pmu_tool_info->pmu_tool_pid = pid; + return 0; + } +} + +/* Allocate and initialize pmu load latency structure. */ + +static void * +init_pmu_load_latency (void) +{ + ll_infos_t *load_latency = XNEWVEC (ll_infos_t, 1); + load_latency->ll_count = 0; + load_latency->alloc_ll_count = 64; + load_latency->ll_array = XNEWVEC (gcov_pmu_ll_info_t *, + load_latency->alloc_ll_count); + return (void *)load_latency; +} + +/* Allocate and initialize pmu branch mispredict structure. */ + +static void * +init_pmu_branch_mispredict (void) +{ + brm_infos_t *brm_info = XNEWVEC (brm_infos_t, 1); + brm_info->brm_count = 0; + brm_info->alloc_brm_count = 64; + brm_info->brm_array = XNEWVEC (gcov_pmu_brm_info_t *, + brm_info->alloc_brm_count); + return (void *)brm_info; +} + +/* Initialize pmu tool based upon PMU_INFO. Sets the appropriate tool + type in the global the_pmu_tool_info. 
*/ + +static int +init_pmu_tool (struct gcov_pmu_info *pmu_info) +{ + the_pmu_tool_info->pmu_profiling_state = PMU_NONE; + the_pmu_tool_info->verbose = 0; + the_pmu_tool_info->tool = PTT_PFMON; /* we support only pfmon */ + the_pmu_tool_info->pmu_tool_pid = 0; + the_pmu_tool_info->top_n_address = pmu_info->pmu_top_n_address; + the_pmu_tool_info->symbolizer_pid = 0; + the_pmu_tool_info->symbolizer_to_pipefd[0] = -1; + the_pmu_tool_info->symbolizer_to_pipefd[1] = -1; + the_pmu_tool_info->symbolizer_from_pipefd[0] = -1; + the_pmu_tool_info->symbolizer_from_pipefd[1] = -1; + + if (parse_pmu_profile_options (pmu_info->pmu_tool)) + return 1; + + if (the_pmu_tool_info->pmu_profiling_state == PMU_ERROR) + { + fprintf (stderr, "Unsupported PMU module: %s, disabling PMU profiling.\n", + pmu_info->pmu_tool); + return 1; + } + + if (the_pmu_tool_info->tool_details->init_pmu_module) + /* initialize module */ + the_pmu_tool_info->pmu_data = + the_pmu_tool_info->tool_details->init_pmu_module(); + return 0; +} + +/* Initialize PMU profiling based upon the information passed in + PMU_INFO and use pmu_profile_filename as the file to store the PMU + profile. This is called multiple times from libgcov, once per + object file. We need to make sure to do the necessary + initialization only the first time. For subsequent invocations it + behaves as a NOOP. */ + +void +__gcov_init_pmu_profiler (struct gcov_pmu_info *pmu_info) +{ + char *raw_pmu_profile_filename; + char *tool_stderr_filename; + if (!pmu_info || !pmu_info->pmu_profile_filename || !pmu_info->pmu_tool) + return; + + /* Allocate the global structure on first invocation. */ + if (!the_pmu_tool_info) + { + the_pmu_tool_info = XNEWVEC (struct pmu_tool_info, 1); + if (!the_pmu_tool_info) + { + fprintf (stderr, "Error allocating memory for PMU tool\n"); + return; + } + if (init_pmu_tool (pmu_info)) + { + /* Initialization error. 
*/ + XDELETE (the_pmu_tool_info); + the_pmu_tool_info = 0; + return; + } + } + + switch (the_pmu_tool_info->pmu_profiling_state) + { + case PMU_NONE: + the_pmu_tool_info->pmu_profile_filename = + strdup (pmu_info->pmu_profile_filename); + /* Construct an intermediate filename by substituting trailing + '.gcda' with '.pmud'. */ + raw_pmu_profile_filename = strdup (pmu_info->pmu_profile_filename); + if (raw_pmu_profile_filename == NULL) + { + fprintf (stderr, "Cannot allocate memory\n"); + exit (1); + } + strcpy (raw_pmu_profile_filename + strlen (raw_pmu_profile_filename) - 4, + "pmud"); + + /* Construct a filename for collecting PMU tool's stderr by + substituting trailing '.gcda' with '.stderr'. */ + tool_stderr_filename = + XNEWVEC (char, strlen (pmu_info->pmu_profile_filename) + 1 + 2); + strcpy (tool_stderr_filename, pmu_info->pmu_profile_filename); + strcpy (tool_stderr_filename + strlen (tool_stderr_filename) - 4, + "stderr"); + the_pmu_tool_info->raw_pmu_profile_filename = raw_pmu_profile_filename; + the_pmu_tool_info->tool_stderr_filename = tool_stderr_filename; + the_pmu_tool_info->pmu_profiling_state = PMU_INITIALIZED; + break; + + case PMU_INITIALIZED: + case PMU_OFF: + case PMU_ON: + case PMU_ERROR: + break; + default: + break; + } +} + +/* Start PMU profiling. It updates the current state. */ + +void +__gcov_start_pmu_profiler (void) +{ + if (!the_pmu_tool_info) + return; + + switch (the_pmu_tool_info->pmu_profiling_state) + { + case PMU_INITIALIZED: + if (!pmu_start ()) + the_pmu_tool_info->pmu_profiling_state = PMU_ON; + else + the_pmu_tool_info->pmu_profiling_state = PMU_ERROR; + break; + + case PMU_NONE: + /* PMU was not properly initialized, don't attempt to start it. */ + the_pmu_tool_info->pmu_profiling_state = PMU_ERROR; + break; + + case PMU_OFF: + /* Restarting PMU is not yet supported. */ + case PMU_ON: + /* Do nothing. */ + case PMU_ERROR: + break; + + default: + break; + } +} + +/* Stop PMU profiling. 
Currently it doesn't do anything except + bookkeeping. */ + +void +__gcov_stop_pmu_profiler (void) +{ + if (!the_pmu_tool_info) + return; + + if (the_pmu_tool_info->tool_details->stop_pmu_module) + the_pmu_tool_info->tool_details->stop_pmu_module(); + if (the_pmu_tool_info->pmu_profiling_state == PMU_ON) + the_pmu_tool_info->pmu_profiling_state = PMU_OFF; +} + +/* Write the load latency information LL_INFO into the gcda file. */ + +static void +gcov_write_ll_line (const gcov_pmu_ll_info_t *ll_info) +{ + gcov_unsigned_t len = GCOV_TAG_PMU_LOAD_LATENCY_LENGTH (ll_info->filename); + gcov_write_tag_length (GCOV_TAG_PMU_LOAD_LATENCY_INFO, len); + gcov_write_unsigned (ll_info->counts); + gcov_write_unsigned (ll_info->self); + gcov_write_unsigned (ll_info->cum); + gcov_write_unsigned (ll_info->lt_10); + gcov_write_unsigned (ll_info->lt_32); + gcov_write_unsigned (ll_info->lt_64); + gcov_write_unsigned (ll_info->lt_256); + gcov_write_unsigned (ll_info->lt_1024); + gcov_write_unsigned (ll_info->gt_1024); + gcov_write_unsigned (ll_info->wself); + gcov_write_counter (ll_info->code_addr); + gcov_write_unsigned (ll_info->line); + gcov_write_unsigned (ll_info->discriminator); + gcov_write_string (ll_info->filename); +} + + +/* Write the branch mispredict information BRM_INFO into the gcda file. */ + +static void +gcov_write_branch_mispredict_line (const gcov_pmu_brm_info_t *brm_info) +{ + gcov_unsigned_t len = GCOV_TAG_PMU_BRANCH_MISPREDICT_LENGTH ( + brm_info->filename); + gcov_write_tag_length (GCOV_TAG_PMU_BRANCH_MISPREDICT_INFO, len); + gcov_write_unsigned (brm_info->counts); + gcov_write_unsigned (brm_info->self); + gcov_write_unsigned (brm_info->cum); + gcov_write_counter (brm_info->code_addr); + gcov_write_unsigned (brm_info->line); + gcov_write_unsigned (brm_info->discriminator); + gcov_write_string (brm_info->filename); +} + +/* Write load latency information INFO into the gcda file. The gcda + file has already been opened and is available for writing. 
*/ + +static void +gcov_write_load_latency_infos (void *info) +{ + unsigned i; + const ll_infos_t *ll_infos = (const ll_infos_t *)info; + gcov_unsigned_t stamp = 0; /* Don't use stamp as we don't support merge. */ + /* We don't support merge, and instead always rewrite the file. But + to rewrite a gcov file we must first read it, however the read + value is ignored. */ + gcov_read_unsigned (); + gcov_rewrite (); + gcov_write_tag_length (GCOV_DATA_MAGIC, GCOV_VERSION); + gcov_write_unsigned (stamp); + if (ll_infos->pmu_tool_header) + gcov_write_tool_header (ll_infos->pmu_tool_header); + for (i = 0; i < ll_infos->ll_count; ++i) + { + /* Write each line. */ + gcov_write_ll_line (ll_infos->ll_array[i]); + } + gcov_truncate (); +} + +/* Write branch mispredict information INFO into the gcda file. The + gcda file has already been opened and is available for writing. */ + +static void +gcov_write_branch_mispredict_infos (void *info) +{ + unsigned i; + const brm_infos_t *brm_infos = (const brm_infos_t *)info; + gcov_unsigned_t stamp = 0; /* Don't use stamp as we don't support merge. */ + /* We don't support merge, and instead always rewrite the file. */ + gcov_rewrite (); + gcov_write_tag_length (GCOV_DATA_MAGIC, GCOV_VERSION); + gcov_write_unsigned (stamp); + if (brm_infos->pmu_tool_header) + gcov_write_tool_header (brm_infos->pmu_tool_header); + for (i = 0; i < brm_infos->brm_count; ++i) + { + /* Write each line. */ + gcov_write_branch_mispredict_line (brm_infos->brm_array[i]); + } + gcov_truncate (); +} + +/* Compute TOOL_HEADER length for writing into the gcov file. 
*/ + +static gcov_unsigned_t +gcov_tag_pmu_tool_header_length (gcov_pmu_tool_header_t *header) +{ + gcov_unsigned_t len = 0; + if (header) + { + len += gcov_string_length (header->host_cpu); + len += gcov_string_length (header->hostname); + len += gcov_string_length (header->kernel_version); + len += gcov_string_length (header->column_header); + len += gcov_string_length (header->column_description); + len += gcov_string_length (header->full_header); + } + return len; +} + +/* Write tool header into the gcda file. It assumes that the gcda file + has already been opened and is available for writing. */ + +static void +gcov_write_tool_header (gcov_pmu_tool_header_t *header) +{ + gcov_unsigned_t len = gcov_tag_pmu_tool_header_length (header); + gcov_write_tag_length (GCOV_TAG_PMU_TOOL_HEADER, len); + gcov_write_string (header->host_cpu); + gcov_write_string (header->hostname); + gcov_write_string (header->kernel_version); + gcov_write_string (header->column_header); + gcov_write_string (header->column_description); + gcov_write_string (header->full_header); +} + + +/* End PMU profiling. If GCDA_ERROR is zero then write profiling data into + the already open gcda file. */ + +void +__gcov_end_pmu_profiler (int gcda_error) +{ + int pid_status; + int wait_status; + pid_t pid; + pmu_tool_fns *tool_details; + + if (!the_pmu_tool_info) + return; + + tool_details = the_pmu_tool_info->tool_details; + pid = the_pmu_tool_info->pmu_tool_pid; + if (pid) + { + if (tool_debug) + fprintf (stderr, "terminating PMU profiling process %ld\n", (long)pid); + kill (pid, SIGTERM); + if (tool_debug) + fprintf (stderr, "parent: waiting for pmu process to end\n"); + wait_status = waitpid (pid, &pid_status, 0); + if (tool_debug) { + if (wait_status == pid) + fprintf (stderr, "Normal exit. Child terminated.\n"); + else + fprintf (stderr, "Abnormal exit. 
child status, %d.\n", pid_status); + } + } + + if (the_pmu_tool_info->pmu_profiling_state != PMU_OFF) + { + /* nothing to do */ + fprintf (stderr, + "__gcov_dump_pmu_profile: incorrect pmu state: %d, pid: %ld\n", + the_pmu_tool_info->pmu_profiling_state, + (unsigned long)pid); + return; + } + + if (!tool_details->parse_pmu_output) + return; + + /* Since we are going to parse the output, we also need symbolizer. */ + if (tool_details->start_symbolizer) + tool_details->start_symbolizer (getpid ()); + + if (!tool_details->parse_pmu_output + (the_pmu_tool_info->raw_pmu_profile_filename, + the_pmu_tool_info->pmu_data)) + { + if (!gcda_error && tool_details->gcov_write_pmu_data) + /* Write tool output into the gcda file. */ + tool_details->gcov_write_pmu_data (the_pmu_tool_info->pmu_data); + } + + if (tool_details->end_symbolizer) + tool_details->end_symbolizer (); + + if (tool_details->cleanup_pmu_data) + tool_details->cleanup_pmu_data (the_pmu_tool_info->pmu_data); +} + +#endif Index: gcc/coverage.c =================================================================== --- gcc/coverage.c (revision 175346) +++ gcc/coverage.c (working copy) @@ -62,6 +62,9 @@ #include "dbgcnt.h" #include "input.h" +/* Defined in tree-profile.c. */ +void gimple_init_instrumentation_sampling (void); + struct function_list { struct function_list *next; /* next function */ @@ -120,6 +123,9 @@ static char *da_base_file_name; static char *main_input_file_name; +/* Filename for the global pmu profile */ +static char pmu_profile_filename[] = "pmuprofile"; + /* Hash table of count data. */ static htab_t counts_hash = NULL; @@ -146,6 +152,16 @@ /* True if the current module has any asm statements. 
*/ static bool has_asm_statement; +/* extern const char * __gcov_pmu_profile_filename */ +static tree gcov_pmu_filename_decl = NULL_TREE; +/* extern const char * __gcov_pmu_profile_options */ +static tree gcov_pmu_options_decl = NULL_TREE; +/* extern gcov_unsigned_t __gcov_pmu_top_n_address */ +static tree gcov_pmu_top_n_address_decl = NULL_TREE; + +/* To ensure that the above variables are initialized only once. */ +static int pmu_profiling_initialized = 0; + /* Forward declarations. */ static hashval_t htab_counts_entry_hash (const void *); static int htab_counts_entry_eq (const void *, const void *); @@ -158,6 +174,8 @@ static tree build_gcov_info (void); static void create_coverage (void); static char * get_da_file_name (const char *); +static void init_pmu_profiling (void); +static bool profiling_enabled_p (void); /* Return the type node for gcov_type. */ @@ -175,6 +193,15 @@ return lang_hooks.types.type_for_size (32, true); } +/* Return the type node for const char *. */ + +static tree +get_const_string_type (void) +{ + return build_pointer_type + (build_qualified_type (char_type_node, TYPE_QUAL_CONST)); +} + static hashval_t htab_counts_entry_hash (const void *of) { @@ -1688,7 +1715,7 @@ no_coverage = 1; /* Disable any further coverage. */ - if (!prg_ctr_mask) + if (!prg_ctr_mask && !flag_pmu_profile_generate) return; t = build_gcov_info (); @@ -1910,8 +1937,122 @@ read_counts_file (get_da_file_name (module_infos[i]->da_filename), module_infos[i]->ident); } + + /* Define variables which are referenced at runtime by libgcov. */ + if (profiling_enabled_p ()) + { + init_pmu_profiling (); + gimple_init_instrumentation_sampling (); + } } +/* Return True if any type of profiling is enabled which requires linking + in libgcov otherwise return False. 
*/ + +static bool +profiling_enabled_p (void) +{ + return flag_pmu_profile_generate || profile_arc_flag || + flag_profile_generate_sampling || flag_test_coverage || + flag_branch_probabilities || flag_profile_reusedist; +} + +/* Construct variables for PMU profiling. + 1) __gcov_pmu_profile_filename, + 2) __gcov_pmu_profile_options, + 3) __gcov_pmu_top_n_address. */ + +static void +init_pmu_profiling (void) +{ + if (!pmu_profiling_initialized) + { + unsigned top_n_addr = PARAM_VALUE (PARAM_PMU_PROFILE_N_ADDRESS); + tree filename_ptr, options_ptr; + + /* Construct an initializer for __gcov_pmu_profile_filename. */ + gcov_pmu_filename_decl = + build_decl (UNKNOWN_LOCATION, VAR_DECL, + get_identifier ("__gcov_pmu_profile_filename"), + get_const_string_type ()); + TREE_PUBLIC (gcov_pmu_filename_decl) = 1; + DECL_ARTIFICIAL (gcov_pmu_filename_decl) = 1; + make_decl_one_only (gcov_pmu_filename_decl, + DECL_ASSEMBLER_NAME (gcov_pmu_filename_decl)); + TREE_STATIC (gcov_pmu_filename_decl) = 1; + + if (flag_pmu_profile_generate) + { + const char *filename = get_da_file_name (pmu_profile_filename); + int file_name_len; + tree filename_string; + file_name_len = strlen (filename); + filename_string = build_string (file_name_len + 1, filename); + TREE_TYPE (filename_string) = build_array_type + (char_type_node, build_index_type + (build_int_cst (NULL_TREE, file_name_len))); + filename_ptr = build1 (ADDR_EXPR, get_const_string_type (), + filename_string); + } + else + filename_ptr = null_pointer_node; + + DECL_INITIAL (gcov_pmu_filename_decl) = filename_ptr; + assemble_variable (gcov_pmu_filename_decl, 0, 0, 0); + + /* Construct an initializer for __gcov_pmu_profile_options. 
*/ + gcov_pmu_options_decl = + build_decl (UNKNOWN_LOCATION, VAR_DECL, + get_identifier ("__gcov_pmu_profile_options"), + get_const_string_type ()); + TREE_PUBLIC (gcov_pmu_options_decl) = 1; + DECL_ARTIFICIAL (gcov_pmu_options_decl) = 1; + make_decl_one_only (gcov_pmu_options_decl, + DECL_ASSEMBLER_NAME (gcov_pmu_options_decl)); + TREE_STATIC (gcov_pmu_options_decl) = 1; + + /* If the flag is false we generate a null pointer to indicate + that we are not doing the pmu profiling. */ + if (flag_pmu_profile_generate) + { + const char *pmu_options = flag_pmu_profile_generate; + int pmu_options_len; + tree pmu_options_string; + + pmu_options_len = strlen (pmu_options); + pmu_options_string = build_string (pmu_options_len + 1, pmu_options); + TREE_TYPE (pmu_options_string) = build_array_type + (char_type_node, build_index_type (build_int_cst + (NULL_TREE, pmu_options_len))); + options_ptr = build1 (ADDR_EXPR, get_const_string_type (), + pmu_options_string); + } + else + options_ptr = null_pointer_node; + + DECL_INITIAL (gcov_pmu_options_decl) = options_ptr; + assemble_variable (gcov_pmu_options_decl, 0, 0, 0); + + /* Construct an initializer for __gcov_pmu_top_n_address. We + don't need to guard this with flag_pmu_profile_generate + because the value of __gcov_pmu_top_n_address is ignored when + not doing profiling. */ + gcov_pmu_top_n_address_decl = + build_decl (UNKNOWN_LOCATION, VAR_DECL, + get_identifier ("__gcov_pmu_top_n_address"), + get_gcov_unsigned_t ()); + TREE_PUBLIC (gcov_pmu_top_n_address_decl) = 1; + DECL_ARTIFICIAL (gcov_pmu_top_n_address_decl) = 1; + make_decl_one_only (gcov_pmu_top_n_address_decl, + DECL_ASSEMBLER_NAME (gcov_pmu_top_n_address_decl)); + TREE_STATIC (gcov_pmu_top_n_address_decl) = 1; + DECL_INITIAL (gcov_pmu_top_n_address_decl) = + build_int_cstu (get_gcov_unsigned_t (), top_n_addr); + assemble_variable (gcov_pmu_top_n_address_decl, 0, 0, 0); + } + pmu_profiling_initialized = 1; +} + /* Performs file-level cleanup.
Close graph file, generate coverage variables and constructor. */ @@ -1989,4 +2130,19 @@ has_asm_statement = flag_ripa_disallow_asm_modules; } +/* Check the command line OPTIONS passed to + -fpmu-profile-generate. Return 0 if the options are valid, non-zero + otherwise. */ + +int +check_pmu_profile_options (const char *options) +{ + if (strcmp (options, "load-latency") && + strcmp (options, "load-latency-verbose") && + strcmp (options, "branch-mispredict") && + strcmp (options, "branch-mispredict-verbose")) + return 1; + return 0; +} + #include "gt-coverage.h" Index: gcc/coverage.h =================================================================== --- gcc/coverage.h (revision 175346) +++ gcc/coverage.h (working copy) @@ -77,4 +77,10 @@ /* Mark this module as containing asm statements. */ extern void coverage_has_asm_stmt (void); +/* Check if the specified options are valid for pmu profiling. */ +extern int check_pmu_profile_options (const char *options); + +/* Defined in tree-profile.c. */ +extern void gimple_init_instrumentation_sampling (void); + #endif Index: gcc/common.opt =================================================================== --- gcc/common.opt (revision 175346) +++ gcc/common.opt (working copy) @@ -1606,6 +1606,14 @@ Common Joined RejectNegative Var(common_deferred_options) Defer -fplugin-arg--[=] Specify argument = for plugin +fpmu-profile-generate= +Common Joined RejectNegative Var(flag_pmu_profile_generate) +-fpmu-profile-generate=[load-latency] Generate pmu profile for cache misses. Currently only pfmon based load latency profiling is supported on Intel/PEBS and AMD/IBS platforms. + +fpmu-profile-use= +Common Joined RejectNegative Var(flag_pmu_profile_use) +-fpmu-profile-use=[load-latency] Use pmu profile data while optimizing. Currently only perfmon based load latency profiling is supported on Intel/PEBS and AMD/IBS platforms. + fpredictive-commoning Common Report Var(flag_predictive_commoning) Optimization Run predictive commoning optimization.
Index: gcc/tree-profile.c =================================================================== --- gcc/tree-profile.c (revision 175346) +++ gcc/tree-profile.c (working copy) @@ -168,6 +168,9 @@ /* extern gcov_unsigned_t __gcov_sampling_rate */ static tree gcov_sampling_rate_decl = NULL_TREE; +/* forward declaration. */ +void gimple_init_instrumentation_sampling (void); + /* Insert STMT_IF around given sequence of consecutive statements in the same basic block starting with STMT_START, ending with STMT_END. */ @@ -287,7 +290,7 @@ } } -static void +void gimple_init_instrumentation_sampling (void) { if (!gcov_sampling_rate_decl) @@ -341,8 +344,6 @@ tree dc_profiler_fn_type; tree average_profiler_fn_type; - gimple_init_instrumentation_sampling (); - if (!gcov_type_node) { char name_buf[32]; Index: gcc/libgcov.c =================================================================== --- gcc/libgcov.c (revision 175346) +++ gcc/libgcov.c (working copy) @@ -123,9 +123,15 @@ } #ifndef __GCOV_KERNEL__ +/* Emitted in coverage.c. */ +extern char * __gcov_pmu_profile_filename; +extern char * __gcov_pmu_profile_options; +extern gcov_unsigned_t __gcov_pmu_top_n_address; + /* Sampling rate. */ extern gcov_unsigned_t __gcov_sampling_rate; static int gcov_sampling_rate_initialized = 0; +void __gcov_set_sampling_rate (unsigned int rate); /* Set sampling rate to RATE. */ @@ -343,7 +349,7 @@ /* Update complete filename with stripped original. */ if (prefix_length != 0 && !IS_DIR_SEPARATOR (*filename)) { - /* If prefix is given, add diretory separator. */ + /* If prefix is given, add directory separator. */ strcpy (gi_filename_up, "/"); strcpy (gi_filename_up + 1, filename); } @@ -351,6 +357,88 @@ strcpy (gi_filename_up, filename); } +/* This function allocates the space to store current file name. */ + +static void +gcov_alloc_filename (void) +{ + /* Get file name relocation prefix. Non-absolute values are ignored. 
*/ + char *gcov_prefix = 0; + + prefix_length = 0; + gcov_prefix_strip = 0; + + { + /* Check if the level of dirs to strip off specified. */ + char *tmp = getenv ("GCOV_PREFIX_STRIP"); + if (tmp) + { + gcov_prefix_strip = atoi (tmp); + /* Do not consider negative values. */ + if (gcov_prefix_strip < 0) + gcov_prefix_strip = 0; + } + } + /* Get file name relocation prefix. Non-absolute values are ignored. */ + gcov_prefix = getenv ("GCOV_PREFIX"); + if (gcov_prefix) + { + prefix_length = strlen(gcov_prefix); + + /* Remove an unnecessary trailing '/' */ + if (IS_DIR_SEPARATOR (gcov_prefix[prefix_length - 1])) + prefix_length--; + } + else + prefix_length = 0; + + /* If no prefix was specified but a prefix strip was, then we assume + relative. */ + if (gcov_prefix_strip != 0 && prefix_length == 0) + { + gcov_prefix = "."; + prefix_length = 1; + } + + /* Allocate and initialize the filename scratch space. */ + gi_filename = (char *) malloc (prefix_length + gcov_max_filename + 2); + if (prefix_length) + memcpy (gi_filename, gcov_prefix, prefix_length); + + gi_filename_up = gi_filename + prefix_length; +} + +/* Stop the pmu profiler and dump pmu profile info into the global file. */ + +static void +pmu_profile_stop (void) +{ + const char *pmu_profile_filename = __gcov_pmu_profile_filename; + const char *pmu_options = __gcov_pmu_profile_options; + size_t filename_length; + int gcda_error; + + if (!pmu_profile_filename || !pmu_options) + return; + + __gcov_stop_pmu_profiler (); + + filename_length = strlen (pmu_profile_filename); + if (filename_length > gcov_max_filename) + gcov_max_filename = filename_length; + /* Allocate and initialize the filename scratch space. */ + gcov_alloc_filename (); + GCOV_GET_FILENAME (prefix_length, gcov_prefix_strip, pmu_profile_filename, + gi_filename_up); + /* Open the gcda file for writing. We don't support merge yet.
*/ + gcda_error = gcov_open_by_filename (gi_filename); + __gcov_end_pmu_profiler (gcda_error); + if ((gcda_error = gcov_close ())) + gcov_error (gcda_error < 0 ? "pmu_profile_stop:%s:Overflow writing\n" : + "pmu_profile_stop:%s:Error writing\n", + gi_filename); +} + /* Sort N entries in VALUE_ARRAY in descending order. Each entry in VALUE_ARRAY has two values. The sorting is based on the second value. */ @@ -437,56 +525,7 @@ } } -/* This function allocates the space to store current file name. */ - static void -gcov_alloc_filename (void) -{ - /* Get file name relocation prefix. Non-absolute values are ignored. */ - char *gcov_prefix = 0; - - prefix_length = 0; - gcov_prefix_strip = 0; - - { - /* Check if the level of dirs to strip off specified. */ - char *tmp = getenv ("GCOV_PREFIX_STRIP"); - if (tmp) - { - gcov_prefix_strip = atoi (tmp); - /* Do not consider negative values. */ - if (gcov_prefix_strip < 0) - gcov_prefix_strip = 0; - } - } - /* Get file name relocation prefix. Non-absolute values are ignored. */ - gcov_prefix = getenv ("GCOV_PREFIX"); - if (gcov_prefix) - { - prefix_length = strlen(gcov_prefix); - - /* Remove an unnecessary trailing '/' */ - if (IS_DIR_SEPARATOR (gcov_prefix[prefix_length - 1])) - prefix_length--; - } - else - prefix_length = 0; - - /* If no prefix was specified and a prefix stip, then we assume - relative. */ - if (gcov_prefix_strip != 0 && prefix_length == 0) - { - gcov_prefix = "."; - prefix_length = 1; - } - - /* Allocate and initialize the filename scratch space.
*/ - gi_filename = (char *) malloc (prefix_length + gcov_max_filename + 2); - if (prefix_length) - memcpy (gi_filename, gcov_prefix, prefix_length); -} - -static void gcov_dump_module_info (void) { struct gcov_info *gi_ptr; @@ -498,8 +537,8 @@ { int error; - gcov_strip_leading_dirs (prefix_length, gcov_prefix_strip, - gi_ptr->filename, gi_filename_up); + GCOV_GET_FILENAME (prefix_length, gcov_prefix_strip, gi_ptr->filename, + gi_filename_up); error = gcov_open_by_filename (gi_filename); if (error != 0) continue; @@ -533,9 +572,11 @@ struct gcov_info *gi_ptr; int dump_module_info; + /* Stop and write the PMU profile data into the global file. */ + pmu_profile_stop (); + dump_module_info = gcov_exit_init (); - for (gi_ptr = __gcov_list; gi_ptr; gi_ptr = gi_ptr->next) gcov_dump_one_gcov (gi_ptr); @@ -571,11 +612,25 @@ const char *ptr = info->filename; gcov_unsigned_t crc32 = gcov_crc32; size_t filename_length = strlen (info->filename); + struct gcov_pmu_info pmu_info; /* Refresh the longest file name information. */ if (filename_length > gcov_max_filename) gcov_max_filename = filename_length; + /* Initialize the pmu profiler. */ + pmu_info.pmu_profile_filename = __gcov_pmu_profile_filename; + pmu_info.pmu_tool = __gcov_pmu_profile_options; + pmu_info.pmu_top_n_address = __gcov_pmu_top_n_address; + __gcov_init_pmu_profiler (&pmu_info); + if (pmu_info.pmu_profile_filename) + { + /* Refresh the longest file name information. */ + filename_length = strlen (pmu_info.pmu_profile_filename); + if (filename_length > gcov_max_filename) + gcov_max_filename = filename_length; + } + /* Assign the module ID (starting at 1). */ info->mod_info->ident = (++gcov_cur_module_id); gcc_assert (EXTRACT_MODULE_ID_FROM_GLOBAL_ID (GEN_FUNC_GLOBAL_ID ( @@ -600,7 +655,11 @@ gcov_crc32 = crc32; if (!__gcov_list) - atexit (gcov_exit); + { + atexit (gcov_exit); + /* Start pmu profiler. 
*/ + __gcov_start_pmu_profiler (); + } info->next = __gcov_list; __gcov_list = info; @@ -617,6 +676,7 @@ { const struct gcov_info *gi_ptr; + __gcov_stop_pmu_profiler (); gcov_exit (); for (gi_ptr = __gcov_list; gi_ptr; gi_ptr = gi_ptr->next) { @@ -630,6 +690,7 @@ ci_ptr++; } } + __gcov_start_pmu_profiler (); } #else /* __GCOV_KERNEL__ */ @@ -639,8 +700,8 @@ /* Copy the filename to the buffer. */ static inline void -gcov_get_filename (int prefix_length __attribute__ ((unused)), - int gcov_prefix_strip __attribute__ ((unused)), +gcov_get_filename (int prefix_length __attribute__ ((unused)), + int gcov_prefix_strip __attribute__ ((unused)), const char *filename, char *gi_filename_up) { strcpy (gi_filename_up, filename); @@ -666,6 +727,7 @@ prefix_length = 0; gcov_prefix_strip = 0; gi_filename = _kernel_gi_filename; + gi_filename_up = _kernel_gi_filename; } #endif /* __GCOV_KERNEL__ */ @@ -1089,7 +1151,6 @@ } gcov_alloc_filename (); - gi_filename_up = gi_filename + prefix_length; return dump_module_info; } Index: gcc/params.def =================================================================== --- gcc/params.def (revision 175346) +++ gcc/params.def (working copy) @@ -1011,6 +1011,11 @@ ".note.callgraph.text section", 0, 0, 0) +DEFPARAM (PARAM_PMU_PROFILE_N_ADDRESS, + "pmu_profile_n_addresses", + "While doing PMU profiling symbolize this many top addresses.", + 50, 1, 10000) + /* Local variables: mode:c Index: gcc/gcov-dump.c =================================================================== --- gcc/gcov-dump.c (revision 175346) +++ gcc/gcov-dump.c (working copy) @@ -39,6 +39,10 @@ static void tag_counters (const char *, unsigned, unsigned); static void tag_summary (const char *, unsigned, unsigned); static void tag_module_info (const char *, unsigned, unsigned); +static void tag_pmu_load_latency_info (const char *, unsigned, unsigned); +static void tag_pmu_branch_mispredict_info (const char *, unsigned, unsigned); +static void tag_pmu_tool_header (const char *, 
unsigned, unsigned); + extern int main (int, char **); typedef struct tag_format @@ -73,6 +77,11 @@ {GCOV_TAG_OBJECT_SUMMARY, "OBJECT_SUMMARY", tag_summary}, {GCOV_TAG_PROGRAM_SUMMARY, "PROGRAM_SUMMARY", tag_summary}, {GCOV_TAG_MODULE_INFO, "MODULE INFO", tag_module_info}, + {GCOV_TAG_PMU_LOAD_LATENCY_INFO, "PMU_LOAD_LATENCY_INFO", + tag_pmu_load_latency_info}, + {GCOV_TAG_PMU_BRANCH_MISPREDICT_INFO, "PMU_BRANCH_MISPREDICT_INFO", + tag_pmu_branch_mispredict_info}, + {GCOV_TAG_PMU_TOOL_HEADER, "PMU_TOOL_HEADER", tag_pmu_tool_header}, {0, NULL, NULL} }; @@ -519,3 +528,43 @@ printf (": %s [%s]", mod_info->source_filename, suffix); } } + +/* Read gcov tag GCOV_TAG_PMU_LOAD_LATENCY_INFO from the gcda file and + print the contents in a human readable form. */ + +static void +tag_pmu_load_latency_info (const char *filename ATTRIBUTE_UNUSED, + unsigned tag ATTRIBUTE_UNUSED, unsigned length) +{ + gcov_pmu_ll_info_t ll_info; + gcov_read_pmu_load_latency_info (&ll_info, length); + print_load_latency_line (stdout, &ll_info, no_newline); + free (ll_info.filename); +} + +/* Read gcov tag GCOV_TAG_PMU_BRANCH_MISPREDICT_INFO from the gcda + file and print the contents in a human readable form. */ + +static void +tag_pmu_branch_mispredict_info (const char *filename ATTRIBUTE_UNUSED, + unsigned tag ATTRIBUTE_UNUSED, unsigned length) +{ + gcov_pmu_brm_info_t brm_info; + gcov_read_pmu_branch_mispredict_info (&brm_info, length); + print_branch_mispredict_line (stdout, &brm_info, no_newline); + free (brm_info.filename); +} + + +/* Read gcov tag GCOV_TAG_PMU_TOOL_HEADER from the gcda file and print + the contents in a human readable form. */ + +static void +tag_pmu_tool_header (const char *filename ATTRIBUTE_UNUSED, + unsigned tag ATTRIBUTE_UNUSED, unsigned length) +{ + gcov_pmu_tool_header_t tool_header; + gcov_read_pmu_tool_header (&tool_header, length); + print_pmu_tool_header (stdout, &tool_header, no_newline); + destroy_pmu_tool_header (&tool_header); +}