-fsave-optimization-record: compress the output using zlib

Message ID 1541785877-4545-1-git-send-email-dmalcolm@redhat.com
State New

Commit Message

David Malcolm Nov. 9, 2018, 5:51 p.m.
One of the concerns noted at Cauldron about -fsave-optimization-record
was the size of the output files.

This patch implements compression of the -fsave-optimization-record
output, using zlib.

I did some before/after testing of this patch, using SPEC 2017's
502.gcc_r with -O3, looking at the sizes of the generated
FILENAME.opt-record.json[.gz] files.

The largest file was for insn-attrtab.c:
  before:  171736285 bytes (164M)
  after:     5304015 bytes (5.1M)

The smallest file was for vasprintf.c:
  before:      30567 bytes
  after:        4485 bytes

The median file by size (before compression) was lambda-mat.c:
  before:    2266738 bytes (2.2M)
  after:       75988 bytes (74K)

Total of all files in the benchmark:
  before: 2041720713 bytes (1.9G)
  after:    66870770 bytes (63.8M)
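
For reference, the compression ratios implied by the figures above can be worked out with a few lines of Python. This is only arithmetic on the byte counts quoted in this message; the dictionary labels and the helper name `compression_ratio` are made up for illustration.

```python
# Sizes in bytes, copied from the measurements above: (before, after).
sizes = {
    "insn-attrtab.c (largest)": (171736285, 5304015),
    "vasprintf.c (smallest)": (30567, 4485),
    "lambda-mat.c (median)": (2266738, 75988),
    "total": (2041720713, 66870770),
}


def compression_ratio(before, after):
    """Return how many times smaller the compressed file is."""
    return before / after


ratios = {name: compression_ratio(b, a) for name, (b, a) in sizes.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x smaller")
```

The small file compresses far less well than the large ones, which is what one would expect if the savings come mostly from repeated strings in the JSON.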

...so clearly compression is a big win in terms of file size, at the
cost of making the files slightly more awkward to work with. [1]
I also wonder if we want to support any pre-filtering of the output
(FWIW roughly half of the biggest file seems to be "Adding assert for "
messages from tree-vrp.c).
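
As a rough illustration of what filtering by message prefix could look like, here is a minimal Python sketch that works on records after they have been loaded, not inside the compiler. The record layout is an assumption: each record is taken to be a dict with a "message" list whose items are either plain strings or nested objects. The helper names and the sample data are hypothetical.

```python
def message_text(record):
    """Join the plain-string parts of a record's message (assumed layout)."""
    parts = record.get("message", [])
    return "".join(p for p in parts if isinstance(p, str))


def drop_by_prefix(records, prefix):
    """Return only the records whose message text does not start with prefix."""
    return [r for r in records if not message_text(r).startswith(prefix)]


# Made-up sample records, not real compiler output.
sample = [
    {"pass": "vrp", "message": ["Adding assert for ", {"expr": "x_1"}]},
    {"pass": "vect", "message": ["loop vectorized"]},
]

kept = drop_by_prefix(sample, "Adding assert for ")
print(len(kept), "of", len(sample), "records kept")
```

Filtering on the consumer side like this does nothing for the size of the file on disk; it only shows the kind of predicate a pre-filter would need.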

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.

OK for trunk?

[1] I've updated my optrecord.py module to deal with this, which
simplifies things; it's still not clear to me if that should live
in "contrib/" or not.
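
To give an idea of how little a consumer needs to change, here is a minimal sketch of reading either form of the file from Python using only the standard library. It is not the actual optrecord.py code; the function name `load_opt_records` and the demo file name are hypothetical. It relies on the output being an ordinary gzip file, which is what zlib's gzopen produces.

```python
import gzip
import json
import os
import tempfile


def load_opt_records(path):
    """Load a records file, transparently handling a ".gz" suffix."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8") as f:
        return json.load(f)


# Round-trip demo with placeholder data (not real compiler output).
with tempfile.TemporaryDirectory() as tmpdir:
    demo_path = os.path.join(tmpdir, "demo.c.opt-record.json.gz")
    with gzip.open(demo_path, "wt", encoding="utf-8") as f:
        json.dump([{"placeholder": True}], f)
    loaded = load_opt_records(demo_path)

print(loaded)
```

From a shell, `zcat` or `gzip -dc` on the file gives the same JSON text.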

gcc/ChangeLog:
	* doc/invoke.texi (-fsave-optimization-record): Note that the
	output is compressed.
	* optinfo-emit-json.cc: Include <zlib.h>.
	(optrecord_json_writer::write): Compress the output.
---
 gcc/doc/invoke.texi      |  6 +++---
 gcc/optinfo-emit-json.cc | 35 +++++++++++++++++++++++++++--------
 2 files changed, 30 insertions(+), 11 deletions(-)

Comments

Jeff Law Nov. 9, 2018, 9 p.m. | #1
OK.
jeff

Patch

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 4ff3a150..f26ada0 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -14408,11 +14408,11 @@  dumps from the vectorizer about missed opportunities.
 
 @item -fsave-optimization-record
 @opindex fsave-optimization-record
-Write a SRCFILE.opt-record.json file detailing what optimizations
+Write a SRCFILE.opt-record.json.gz file detailing what optimizations
 were performed, for those optimizations that support @option{-fopt-info}.
 
-This option is experimental and the format of the data within the JSON
-file is subject to change.
+This option is experimental and the format of the data within the
+compressed JSON file is subject to change.
 
 It is roughly equivalent to a machine-readable version of
 @option{-fopt-info-all}, as a collection of messages with source file,
diff --git a/gcc/optinfo-emit-json.cc b/gcc/optinfo-emit-json.cc
index 31029ad..6d4502c 100644
--- a/gcc/optinfo-emit-json.cc
+++ b/gcc/optinfo-emit-json.cc
@@ -45,6 +45,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "pass_manager.h"
 #include "selftest.h"
 #include "dump-context.h"
+#include <zlib.h>
 
 /* A class for writing out optimization records in JSON format.  */
 
@@ -133,16 +134,34 @@  optrecord_json_writer::~optrecord_json_writer ()
 void
 optrecord_json_writer::write () const
 {
-  char *filename = concat (dump_base_name, ".opt-record.json", NULL);
-  FILE *outfile = fopen (filename, "w");
-  if (outfile)
+  pretty_printer pp;
+  m_root_tuple->print (&pp);
+
+  bool emitted_error = false;
+  char *filename = concat (dump_base_name, ".opt-record.json.gz", NULL);
+  gzFile outfile = gzopen (filename, "w");
+  if (outfile == NULL)
     {
-      m_root_tuple->dump (outfile);
-      fclose (outfile);
+      error_at (UNKNOWN_LOCATION, "cannot open file %qs for writing optimization records",
+		filename); // FIXME: more info?
+      goto cleanup;
     }
-  else
-    error_at (UNKNOWN_LOCATION, "unable to write optimization records to %qs",
-	      filename); // FIXME: more info?
+
+  if (gzputs (outfile, pp_formatted_text (&pp)) <= 0)
+    {
+      int tmp;
+      error_at (UNKNOWN_LOCATION, "error writing optimization records to %qs: %s",
+		filename, gzerror (outfile, &tmp));
+      emitted_error = true;
+    }
+
+ cleanup:
+  if (outfile)
+    if (gzclose (outfile) != Z_OK)
+      if (!emitted_error)
+	error_at (UNKNOWN_LOCATION, "error closing optimization records %qs",
+		  filename);
+
   free (filename);
 }