Patchwork Add an intermediate coverage format for gcov

Submitter Sharad Singhai
Date Oct. 18, 2011, 11:19 p.m.
Message ID <CAKxPW66ch7Ym3Y7+0P-wtkq2_8RjBHFP8P_BMBs2VeN_EqEFtQ@mail.gmail.com>
Permalink /patch/120534/
State New

Comments

Sharad Singhai - Oct. 18, 2011, 11:19 p.m.
On Wed, Oct 5, 2011 at 9:58 AM, Mike Stump <mikestump@comcast.net> wrote:
>
> On Oct 5, 2011, at 12:47 AM, Sharad Singhai wrote:
> > This patch adds an intermediate coverage format (enabled via 'gcov
> > -i'). This is a compact format as it does not require source files.
>
> I don't like any of the tags, I think if you showed them to 100 people that had used gcov before, very few of them would be able to figure out the meaning without reading the manual.  Why make it completely cryptic?  A binary encoded compress format is smaller and just as readable.
>
> SF, sounds like single float to me, I'd propose file.
> FN, sounds like filename, I'd propose line.
> FNDA, can't even guess at what it means, I'd propose fcount.
> BA, I'd propose branch, for 0, notexe, for 1, nottaken, for 2 taken.
> DA, I'd propose lcount
>
> I'd propose fcount be listed as fname,ecount, to match the ordering of lcount.  Also, implicit in the name, is the ordering f, then count, l, then count.
>
> I think if you showed the above to 100 people that had used gcov before, I think we'd be up past 90% of the people able to figure it out, without reading the doc.

Okay, I liked the idea of self-descriptive tags. I have updated the
patch based on your suggestions. I have simplified the format
somewhat. Instead of repeating the function name, I use a 'function' tag
with the format

function:<name>,<line number>,<execution count>

I also dropped the unmangled function names; they were turning out to
be too unreadable and not really useful in this context.
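
As a rough illustration of how a consumer might read the 'function'
record above, here is a minimal C sketch. The names function_record and
parse_function_record are hypothetical and not part of the patch, and
the sketch assumes the name field contains no comma, which holds for
mangled C++ names and plain C identifiers.

```c
#include <stdio.h>

/* Hypothetical record type for illustration; not part of the patch.  */
struct function_record
{
  char name[256];            /* function name as printed by gcov -i */
  unsigned line;             /* line number of the function */
  unsigned long long count;  /* execution count */
};

/* Parse one "function:<name>,<line number>,<execution count>" record.
   Assumes <name> contains no ',' and is shorter than 256 characters.
   Returns 1 on success, 0 if LINE is not a well-formed function record.  */
static int
parse_function_record (const char *line, struct function_record *rec)
{
  return sscanf (line, "function:%255[^,],%u,%llu",
                 rec->name, &rec->line, &rec->count) == 3;
}
```

Given the line "function:main,22,1", this fills in the name "main",
line 22 and count 1; lines carrying any other tag are rejected.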

Here is the updated patch. OK for trunk?

2011-10-18   Sharad Singhai  <singhai@google.com>

	* doc/gcov.texi: Document gcov intermediate format.
	* gcov.c (print_usage): Handle new option.
	(process_args): Handle new option.
	(get_gcov_file_intermediate_name): New function.
	(output_intermediate_file): New function.
	(generate_results): Handle new option.
	* testsuite/lib/gcov.exp: Handle intermediate format.
	* testsuite/g++.dg/gcov/gcov-8.C: New testcase.

+/* { dg-final { run-gcov intermediate { -i -b gcov-8.C } } } */
Mike Stump - Oct. 19, 2011, 1:05 a.m.
On Oct 18, 2011, at 4:19 PM, Sharad Singhai <singhai@google.com> wrote:
> Okay, I liked the idea of self-descriptive tags. I have updated the
> patch based on your suggestions. I have simplified the format
> somewhat. Instead of repeating function name, I use a 'function' tag
> with the format
> 
> function:<name>,<line number>,<execution count>

Sound nice.

> I also dropped the unmangled function names, they were turning out to
> be too unreadable and not really useful in this context.

Ah, I'd argue for mangled names.  Every one knows they can stream through c++filt and get unmangled, but once you unmangle, you can never go back.  Also, the mangled version is significantly smaller.  For c, it is irrelevant, for c++, it makes a big difference.
Jan Hubicka - Oct. 19, 2011, 8:48 a.m.
> On Oct 18, 2011, at 4:19 PM, Sharad Singhai <singhai@google.com> wrote:
> > Okay, I liked the idea of self-descriptive tags. I have updated the
> > patch based on your suggestions. I have simplified the format
> > somewhat. Instead of repeating function name, I use a 'function' tag
> > with the format
> > 
> > function:<name>,<line number>,<execution count>
> 
> Sound nice.
> 
> > I also dropped the unmangled function names, they were turning out to
> > be too unreadable and not really useful in this context.
> 
> Ah, I'd argue for mangled names.  Every one knows they can stream through c++filt and get unmangled, but once you unmangle, you can never go back.  Also, the mangled version is significantly smaller.  For c, it is irrelevant, for c++, it makes a big difference.

I would also support unmangled variant. Otherwise the patch seems reasonable to me.

Honza
Sharad Singhai - Oct. 19, 2011, 7:06 p.m.
Since the updated patch already uses unmangled function names, is it
good to commit then?

Sharad

Sharad Singhai - Oct. 19, 2011, 7:12 p.m.
Sorry, I misunderstood your comment. I see that you are asking for
unmangled function names whereas the current patch supports only
mangled names. I can print unmangled names under another option. Would
that work?

Thanks,
Sharad

Mike Stump - Oct. 19, 2011, 8:41 p.m.
On Oct 19, 2011, at 12:12 PM, Sharad Singhai <singhai@google.com> wrote:
> Sorry, I misunderstood your comment. I see that you are asking for
> unmangled function names whereas the current patch supports only
> mangled names. I can print unmangled names under another option. Would
> that work?

I defer to the person that says ok.  :-)
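
For readers who want to consume the intermediate format documented in
the patch below, a small C sketch of a reader may help. It tallies
'file', 'lcount' and 'branch' records from a stream. The
coverage_summary type and the summarize_intermediate function are made
up for illustration and are not part of the patch, and the sketch
assumes every record fits in a 1024-byte buffer.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical summary type for illustration; not part of the patch.  */
struct coverage_summary
{
  unsigned files;           /* number of file: records */
  unsigned lines;           /* number of lcount: records */
  unsigned lines_executed;  /* lcount: records with a nonzero count */
  unsigned branches;        /* number of branch: records */
  unsigned branches_taken;  /* branch: records marked "taken" */
};

/* Read intermediate-format records from IN and tally them into SUM.
   function: records and unrecognized lines are ignored.  */
static void
summarize_intermediate (FILE *in, struct coverage_summary *sum)
{
  char buf[1024];

  memset (sum, 0, sizeof *sum);
  while (fgets (buf, sizeof buf, in))
    {
      unsigned line_num;
      unsigned long long count;
      char kind[16];

      if (strncmp (buf, "file:", 5) == 0)
        sum->files++;
      else if (sscanf (buf, "lcount:%u,%llu", &line_num, &count) == 2)
        {
          sum->lines++;
          if (count != 0)
            sum->lines_executed++;
        }
      else if (sscanf (buf, "branch:%u,%15[a-z]", &line_num, kind) == 2)
        {
          /* kind is one of notexec, taken, nottaken.  */
          sum->branches++;
          if (strcmp (kind, "taken") == 0)
            sum->branches_taken++;
        }
    }
}
```

Because a nonzero count is all that is tested, the sketch works whether
the lcount field holds a real execution count or only a 0/1 flag.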

Patch

Index: doc/gcov.texi
===================================================================
--- doc/gcov.texi	(revision 179873)
+++ doc/gcov.texi	(working copy)
@@ -130,6 +130,7 @@  gcov [@option{-v}|@option{--version}] [@
      [@option{-f}|@option{--function-summaries}]
     [@option{-o}|@option{--object-directory} @var{directory|file}] @var{sourcefiles}
      [@option{-u}|@option{--unconditional-branches}]
+     [@option{-i}|@option{--intermediate-format}]
      [@option{-d}|@option{--display-progress}]
 @c man end
 @c man begin SEEALSO
@@ -216,6 +217,45 @@  Unconditional branches are normally not
 @itemx --display-progress
 Display the progress on the standard output.

+@item -i
+@itemx --intermediate-format
+Output gcov file in an easy-to-parse intermediate text format that can
+be used by @command{lcov} or other tools. The output is a single
+@file{.gcov} file per @file{.gcda} file. No source code is required.
+
+The format of the intermediate @file{.gcov} file is plain text with
+one entry per line
+
+@smallexample
+file:@var{source_file_name}
+function:@var{function_name},@var{line_number},@var{execution_count}
+lcount:@var{line_number},@var{execution_count}
+branch:@var{line_number},@var{branch_coverage_type}
+
+Where the @var{branch_coverage_type} is
+   notexec (Branch not executed)
+   taken (Branch executed and taken)
+   nottaken (Branch executed, but not taken)
+
+There can be multiple @var{file} entries in an intermediate gcov
+file. All entries following a @var{file} pertain to that source file
+until the next @var{file} entry.
+@end smallexample
+
+Here is a sample when @option{-i} is used in conjunction with @option{-b} option:
+
+@smallexample
+file:array.cc
+function:_Z3sumRKSt6vectorIPiSaIS0_EE,11,1
+function:main,22,1
+lcount:11,1
+lcount:12,1
+lcount:14,1
+branch:14,taken
+lcount:26,1
+branch:28,nottaken
+@end smallexample
+
 @end table

 @command{gcov} should be run with the current directory the same as that
Index: gcov.c
===================================================================
--- gcov.c	(revision 179873)
+++ gcov.c	(working copy)
@@ -311,6 +311,9 @@  static int flag_gcov_file = 1;

 static int flag_display_progress = 0;

+/* Output *.gcov file in intermediate format used by 'lcov'.  */
+static int flag_intermediate_format = 0;
+
 /* For included files, make the gcov output file name include the name
    of the input source file.  For example, if x.h is included in a.c,
    then the output file name is a.c##x.h.gcov instead of x.h.gcov.  */
@@ -436,6 +439,7 @@  print_usage (int error_p)
   fnotice (file, "  -o, --object-directory DIR|FILE Search for object files in DIR or called FILE\n");
   fnotice (file, "  -p, --preserve-paths            Preserve all pathname components\n");
   fnotice (file, "  -u, --unconditional-branches    Show unconditional branch counts too\n");
+  fnotice (file, "  -i, --intermediate-format       Output .gcov file in intermediate text format\n");
   fnotice (file, "  -d, --display-progress          Display progress information\n");
   fnotice (file, "\nFor bug reporting instructions, please see:\n%s.\n",
 	   bug_report_url);
@@ -472,6 +476,7 @@  static const struct option options[] =
   { "object-file",          required_argument, NULL, 'o' },
   { "unconditional-branches", no_argument,     NULL, 'u' },
   { "display-progress",     no_argument,       NULL, 'd' },
+  { "intermediate-format",  no_argument,       NULL, 'i' },
   { 0, 0, 0, 0 }
 };

@@ -482,7 +487,8 @@  process_args (int argc, char **argv)
 {
   int opt;

-  while ((opt = getopt_long (argc, argv, "abcdfhlno:puv", options, NULL)) != -1)
+  while ((opt = getopt_long (argc, argv, "abcdfhilno:puv", options, NULL)) !=
+         -1)
     {
       switch (opt)
 	{
@@ -516,6 +522,10 @@  process_args (int argc, char **argv)
 	case 'u':
 	  flag_unconditional = 1;
 	  break;
+	case 'i':
+          flag_intermediate_format = 1;
+          flag_gcov_file = 1;
+          break;
         case 'd':
           flag_display_progress = 1;
           break;
@@ -531,6 +541,100 @@  process_args (int argc, char **argv)
   return optind;
 }

+/* Get the name of the gcov file.  The return value must be free'd.
+
+   It appends the '.gcov' extension to the *basename* of the file.
+   The resulting file name will be in PWD.
+
+   e.g.,
+   input: foo.da,       output: foo.da.gcov
+   input: a/b/foo.cc,   output: foo.cc.gcov  */
+
+static char *
+get_gcov_file_intermediate_name (const char *file_name)
+{
+  const char *gcov = ".gcov";
+  char *result;
+  const char *cptr;
+
+  /* Find the 'basename'.  */
+  cptr = lbasename (file_name);
+
+  result = XNEWVEC (char, strlen (cptr) + strlen (gcov) + 1);
+  sprintf (result, "%s%s", cptr, gcov);
+
+  return result;
+}
+
+/* Output the result in intermediate format used by 'lcov'.
+
+The intermediate format contains a single file named 'foo.cc.gcov',
+with no source code included.
+
+file:/home/.../foo.h
+lcount:10,1
+lcount:35,1
+file:/home/.../bar.h
+lcount:12,0
+lcount:33,0
+file:/home/.../foo.cc
+function:foo,32,1
+lcount:42,0
+lcount:53,1
+branch:55,taken
+branch:57,notexec
+lcount:95,1
+...
+
+The default format would have 3 separate files: 'foo.h.gcov',
+'foo.cc.gcov', 'bar.h.gcov', each with source code included.  */
+
+static void
+output_intermediate_file (FILE *gcov_file, source_t *src)
+{
+  unsigned line_num;    /* current line number.  */
+  const line_t *line;   /* current line info ptr.  */
+  function_t *fn;       /* current function info ptr. */
+
+  fprintf (gcov_file, "file:%s\n", src->name);    /* source file name */
+
+  for (fn = src->functions; fn; fn = fn->line_next)
+    {
+      /* function:<name>,<line_number>,<execution_count> */
+      fprintf (gcov_file, "function:%s,%u,%s\n", fn->name, fn->line,
+               format_gcov (fn->blocks[0].count, 0, -1));
+    }
+
+  for (line_num = 1, line = &src->lines[line_num];
+       line_num < src->num_lines;
+       line_num++, line++)
+    {
+      arc_t *arc;
+      if (line->exists)
+        fprintf (gcov_file, "lcount:%u,%d\n", line_num,
+                 line->count != 0 ? 1 : 0);
+      if (flag_branches)
+        for (arc = line->u.branches; arc; arc = arc->line_next)
+          {
+            if (!arc->is_unconditional && !arc->is_call_non_return)
+              {
+                const char *branch_type;
+                /* branch:<line_num>,<branch_coverage_type>
+                   branch_coverage_type
+                     : notexec (Branch not executed)
+                     : taken (Branch executed and taken)
+                     : nottaken (Branch executed, but not taken)
+                */
+                if (arc->src->count)
+                  branch_type = (arc->count > 0) ? "taken" : "nottaken";
+                else
+                  branch_type = "notexec";
+                fprintf (gcov_file, "branch:%u,%s\n", line_num, branch_type);
+              }
+          }
+    }
+}
+
 /* Process a single source file.  */

 static void
@@ -570,6 +674,8 @@  generate_results (const char *file_name)
 {
   source_t *src;
   function_t *fn;
+  FILE *gcov_file_intermediate = NULL;
+  char *gcov_file_intermediate_name = NULL;

   for (src = sources; src; src = src->next)
     src->lines = XCNEWVEC (line_t, src->num_lines);
@@ -587,31 +693,55 @@  generate_results (const char *file_name)
 	}
     }

+  if (flag_gcov_file && flag_intermediate_format)
+    {
+      /* Open the intermediate file.  */
+      gcov_file_intermediate_name =
+        get_gcov_file_intermediate_name (file_name);
+      gcov_file_intermediate = fopen (gcov_file_intermediate_name, "w");
+    }
   for (src = sources; src; src = src->next)
     {
       accumulate_line_counts (src);
       function_summary (&src->coverage, "File");
       if (flag_gcov_file)
 	{
-	  char *gcov_file_name = make_gcov_file_name (file_name, src->name);
-	  FILE *gcov_file = fopen (gcov_file_name, "w");
-
-	  if (gcov_file)
-	    {
-	      fnotice (stdout, "%s:creating '%s'\n",
-		       src->name, gcov_file_name);
-	      output_lines (gcov_file, src);
-	      if (ferror (gcov_file))
-		    fnotice (stderr, "%s:error writing output file '%s'\n",
-			     src->name, gcov_file_name);
-	      fclose (gcov_file);
-	    }
-	  else
-	    fnotice (stderr, "%s:could not open output file '%s'\n",
-		     src->name, gcov_file_name);
-	  free (gcov_file_name);
-	}
-      fnotice (stdout, "\n");
+         if (flag_intermediate_format)
+           /* Now output in the intermediate format without requiring
+              source files.  This outputs a section to a *single* file.  */
+           output_intermediate_file (gcov_file_intermediate, src);
+         else
+           {
+             /* Now output the version with source files.
+                This outputs a separate *.gcov file for each source file
+                involved.  */
+             char *gcov_file_name = make_gcov_file_name (file_name, src->name);
+             FILE *gcov_file = fopen (gcov_file_name, "w");
+
+             if (gcov_file)
+               {
+                 fnotice (stdout, "%s:creating '%s'\n",
+                          src->name, gcov_file_name);
+                 output_lines (gcov_file, src);
+                 if (ferror (gcov_file))
+                   fnotice (stderr, "%s:error writing output file '%s'\n",
+                            src->name, gcov_file_name);
+                 fclose (gcov_file);
+               }
+             else
+               fnotice (stderr, "%s:could not open output file '%s'\n",
+                        src->name, gcov_file_name);
+             free (gcov_file_name);
+           }
+         fnotice (stdout, "\n");
+        }
+    }
+
+  if (flag_gcov_file && flag_intermediate_format)
+    {
+      /* Now we've finished writing the intermediate file.  */
+      fclose (gcov_file_intermediate);
+      XDELETEVEC (gcov_file_intermediate_name);
     }
 }

Index: testsuite/lib/gcov.exp
===================================================================
--- testsuite/lib/gcov.exp	(revision 179873)
+++ testsuite/lib/gcov.exp	(working copy)
@@ -60,6 +60,59 @@  proc verify-lines { testcase file } {
 }

 #
+# verify-intermediate -- check that intermediate file has certain lines
+#
+# TESTCASE is the name of the test.
+# FILE is the name of the gcov output file.
+#
+# Checks are very loose: they only verify that certain tags are present
+# in the output. They do not check for exact expected execution
+# counts. For that the regular gcov format should be checked.
+#
+proc verify-intermediate { testcase file } {
+    set failed 0
+    set srcfile 0
+    set function 0
+    set lcount 0
+    set branch 0
+    set fd [open $file r]
+    while { [gets $fd line] >= 0 } {
+	if [regexp "^file:" $line] {
+	    incr srcfile
+	}
+	if [regexp "^function:.*,(\[0-9\]+),(\[0-9\]+)" $line] {
+	    incr function
+	}
+	if [regexp "^lcount:(\[0-9\]+),(\[0-9\]+)" $line] {
+	    incr lcount
+	}
+	if [regexp "^branch:(\[0-9\]+)," $line] {
+	    incr branch
+	}
+    }
+
+    # We should see at least one tag of each type
+    if {$srcfile == 0} {
+	fail "expected 'file:' tag not found"
+	incr failed
+    }
+    if {$function == 0} {
+	fail "expected 'function:' tag not found"
+	incr failed
+    }
+    if {$lcount == 0} {
+	fail "expected 'lcount:' tag not found"
+	incr failed
+    }
+    if {$branch == 0} {
+	fail "expected 'branch:' tag not found"
+	incr failed
+    }
+    return $failed
+}
+
+
+#
 # verify-branches -- check that branch percentages are as expected
 #
 # TESTCASE is the name of the test.
@@ -234,6 +287,8 @@  proc run-gcov { args } {

     set gcov_verify_calls 0
     set gcov_verify_branches 0
+    set gcov_verify_lines 1
+    set gcov_verify_intermediate 0
     set gcov_execute_xfail ""
     set gcov_verify_xfail ""

@@ -242,6 +297,11 @@  proc run-gcov { args } {
 	  set gcov_verify_calls 1
 	} elseif { $a == "branches" } {
 	  set gcov_verify_branches 1
+	} elseif { $a == "intermediate" } {
+	  set gcov_verify_intermediate 1
+	  set gcov_verify_calls 0
+	  set gcov_verify_branches 0
+	  set gcov_verify_lines 0
 	}
     }

@@ -274,8 +334,12 @@  proc run-gcov { args } {
 	eval setup_xfail [split $gcov_verify_xfail]
     }

-    # Check that line execution counts are as expected.
-    set lfailed [verify-lines $testcase $testcase.gcov]
+    if { $gcov_verify_lines } {
+	# Check that line execution counts are as expected.
+	set lfailed [verify-lines $testcase $testcase.gcov]
+    } else {
+	set lfailed 0
+    }

     # If requested via the .x file, check that branch and call information
     # is correct.
@@ -289,12 +353,18 @@  proc run-gcov { args } {
     } else {
 	set cfailed 0
     }
+    if { $gcov_verify_intermediate } {
+	# Check that the intermediate file has the expected format
+	set ifailed [verify-intermediate $testcase $testcase.gcov]
+    } else {
+	set ifailed 0
+    }

     # Report whether the gcov test passed or failed.  If there were
     # multiple failures then the message is a summary.
-    set tfailed [expr $lfailed + $bfailed + $cfailed]
+    set tfailed [expr $lfailed + $bfailed + $cfailed + $ifailed]
     if { $tfailed > 0 } {
-	fail "$subdir/$testcase gcov: $lfailed failures in line counts, $bfailed in branch percentages, $cfailed in return percentages"
+	fail "$subdir/$testcase gcov: $lfailed failures in line counts, $bfailed in branch percentages, $cfailed in return percentages, $ifailed in intermediate format"
     } else {
 	pass "$subdir/$testcase gcov"
 	clean-gcov $testcase
Index: testsuite/g++.dg/gcov/gcov-8.C
===================================================================
--- testsuite/g++.dg/gcov/gcov-8.C	(revision 0)
+++ testsuite/g++.dg/gcov/gcov-8.C	(revision 0)
@@ -0,0 +1,35 @@ 
+/* Verify that intermediate coverage format can be generated for simple code. */
+
+/* { dg-options "-fprofile-arcs -ftest-coverage" } */
+/* { dg-do run { target native } } */
+
+class C {
+public:
+  C()
+  {
+    i = 0;
+  }
+  ~C() {}
+  void seti (int j)
+  {
+    if (j > 0)
+      i = j;
+    else
+      i = 0;
+  }
+private:
+  int i;
+};
+
+void foo()
+{
+  C c;
+  c.seti (1);
+}
+
+int main()
+{
+  foo();
+}
+
+/* { dg-final { run-gcov intermediate { -i -b gcov-8.C } } } */