[v4,2015.08,2/4] support/scripts: add size-stats script

Message ID 1432591007-27689-3-git-send-email-thomas.petazzoni@free-electrons.com
State Superseded

Commit Message

Thomas Petazzoni May 25, 2015, 9:56 p.m. UTC
This new script uses the data collected by the step_pkg_size
instrumentation hook to generate a pie chart of the size contribution
of each package to the target root filesystem, and two CSV files with
statistics about the package size and file size. To achieve this, it
looks at each file in $(TARGET_DIR), and using the
packages-file-list.txt information collected by the step_pkg_size
hook, it determines to which package the file belongs. It is therefore
able to give the size installed by each package.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
---
 support/scripts/size-stats | 238 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 238 insertions(+)
 create mode 100755 support/scripts/size-stats

Comments

Ryan Barnett May 28, 2015, 3:18 a.m. UTC | #1
Thomas,

On Mon, May 25, 2015 at 4:56 PM, Thomas Petazzoni
<thomas.petazzoni@free-electrons.com> wrote:
> This new script uses the data collected by the step_pkg_size
> instrumentation hook to generate a pie chart of the size contribution
> of each package to the target root filesystem, and two CSV files with
> statistics about the package size and file size. To achieve this, it
> looks at each file in $(TARGET_DIR), and using the
> packages-file-list.txt information collected by the step_pkg_size
> hook, it determines to which package the file belongs. It is therefore
> able to give the size installed by each package.
>
> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
> ---
>  support/scripts/size-stats | 238 +++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 238 insertions(+)
>  create mode 100755 support/scripts/size-stats

Other than a few minor suggestions below, things look good.

Reviewed-by: Ryan Barnett <ryanbarnett3@gmail.com>
Tested-by: Ryan Barnett <ryanbarnett3@gmail.com>

> diff --git a/support/scripts/size-stats b/support/scripts/size-stats
> new file mode 100755
> index 0000000..48a64cd
> --- /dev/null
> +++ b/support/scripts/size-stats
> @@ -0,0 +1,238 @@

[...]

> +#
> +# This function builds a dictionary that contains the name of a
> +# package as key, and the size of the files installed by this package
> +# as the value.
> +#
> +# filesdict: dictionary with the name of the files as key, and as
> +# value a tuple containing the name of the package to which the files
> +# belongs, and the size of the file. As returned by
> +# build_package_dict.
> +#
> +# builddir: path to the Buildroot output directory
> +#
> +def build_package_size(filesdict, builddir):
> +    pkgsize = collections.defaultdict(int)
> +
> +    for root, _, files in os.walk(os.path.join(builddir, "target")):
> +        for f in files:
> +            fpath = os.path.join(root, f)
> +            if os.path.islink(fpath):
> +                continue
> +            frelpath = os.path.relpath(fpath, os.path.join(builddir, "target"))
> +            if not frelpath in filesdict:
> +                print("WARNING: %s is not part of any package" % frelpath)

Would it be useful to have an exclusion list since this will always be
printed out?

Every time you run 'make clean all size-stats' you will be faced with
warnings such as this:

WARNING: THIS_IS_NOT_YOUR_ROOT_FILESYSTEM is not part of any package
WARNING: etc/ld.so.cache is not part of any package
WARNING: etc/hostname is not part of any package
WARNING: etc/os-release is not part of any package
WARNING: etc/nsswitch.conf is not part of any package
WARNING: etc/ld.so.conf is not part of any package
WARNING: etc/network/interfaces is not part of any package
WARNING: tmp/ldconfig/aux-cache is not part of any package
WARNING: dev/console is not part of any package

Initially when I saw this I thought I had done something incorrectly;
however, I quickly realized that these are files generated by
Buildroot's makefiles (such as THIS_IS_NOT_YOUR_ROOT_FILESYSTEM and
etc/hostname). Since these files are generated by Buildroot, one
shouldn't be concerned about them not being part of any package.
While typing this: would it make sense to create a package called
'buildroot' whose files are defined statically within this script?
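
A minimal sketch of that idea, assuming a static set of paths and a
lookup helper. BUILDROOT_FILES and owner_of() are hypothetical names,
not part of the patch, and the set only lists the two files named
above:

```python
# Hypothetical sketch: BUILDROOT_FILES and owner_of() are illustrative
# names that do not exist in the size-stats script under review.
BUILDROOT_FILES = frozenset([
    "THIS_IS_NOT_YOUR_ROOT_FILESYSTEM",
    "etc/hostname",
])


def owner_of(frelpath, filesdict):
    """Return the package owning frelpath, or a pseudo-package name."""
    if frelpath in filesdict:
        # filesdict maps a relative path to a (package, size) tuple
        return filesdict[frelpath][0]
    if frelpath in BUILDROOT_FILES:
        # Known to be generated by Buildroot itself: no warning needed
        return "buildroot"
    print("WARNING: %s is not part of any package" % frelpath)
    return "unknown"
```

build_package_size() could then call owner_of(frelpath, filesdict)
instead of open-coding the lookup, so only truly unexpected files
would trigger a warning.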

> +                pkg = "unknown"
> +            else:
> +                pkg = filesdict[frelpath][0]
> +
> +            pkgsize[pkg] += os.path.getsize(fpath)
> +
> +    return pkgsize
> +
> +#
> +# Given a dict returned by build_package_size(), this function
> +# generates a pie chart of the size installed by each package.
> +#
> +# pkgsize: dictionary with the name of the package as a key, and the
> +# size as the value, as returned by build_package_size.
> +#
> +# outputf: output file for the graph
> +#
> +def draw_graph(pkgsize, outputf):
> +    total = sum(pkgsize.values())
> +    labels = []
> +    values = []
> +    other_value = 0
> +    for (p, sz) in pkgsize.items():
> +        if sz < (total * 0.01):
> +            other_value += sz
> +        else:
> +            labels.append("%s (%d kB)" % (p, sz / 1000.))
> +            values.append(sz)
> +    labels.append("Other (%d kB)" % (other_value / 1000.))
> +    values.append(other_value)
> +
> +    plt.figure()
> +    patches, texts, autotexts = plt.pie(values, labels=labels,
> +                                        autopct='%1.1f%%', shadow=True,
> +                                        colors=colors)
> +    # Reduce text size
> +    proptease = fm.FontProperties()
> +    proptease.set_size('xx-small')
> +    plt.setp(autotexts, fontproperties=proptease)
> +    plt.setp(texts, fontproperties=proptease)

Could the total size of the filesystem be placed on this graph? I was
thinking maybe at the bottom of the graph as a subtitle; I don't know
if this is possible.

Another idea would be an option to specify a chart title. This could
also be used with the dependency graph or any other graph generated
by Buildroot (I can't think of any others at the moment). However, I
would say that should be a future feature.
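
For the total size, something along these lines might work. This is
only a sketch: total_size_label() is a hypothetical helper, and the
figtext coordinates are a guess that would need tuning:

```python
# Hypothetical helper; total_size_label() is not part of the patch.
def total_size_label(pkgsize):
    """Format the sum of all package sizes, in kB, for display."""
    total = sum(pkgsize.values())
    return "Total filesystem size: %d kB" % (total / 1000.)

# Inside draw_graph(), before plt.savefig(outputf), the label could be
# placed under the pie with matplotlib's figure-level text call:
#     plt.figtext(0.5, 0.02, total_size_label(pkgsize), ha="center")
```

The helper uses the same "/ 1000." convention as the existing slice
labels, so the total stays consistent with the per-package numbers.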

Thanks,
-Ryan

> +    plt.title('Size per package')
> +    plt.savefig(outputf)

[...]
Matt Weber May 28, 2015, 2:55 p.m. UTC | #2
Ryan, Thomas,

On Wed, May 27, 2015 at 10:18 PM, Ryan Barnett <ryanbarnett3@gmail.com> wrote:
> Thomas,
>
> On Mon, May 25, 2015 at 4:56 PM, Thomas Petazzoni
> <thomas.petazzoni@free-electrons.com> wrote:
>> This new script uses the data collected by the step_pkg_size
>> instrumentation hook to generate a pie chart of the size contribution
>> of each package to the target root filesystem, and two CSV files with
>> statistics about the package size and file size. To achieve this, it
>> looks at each file in $(TARGET_DIR), and using the
>> packages-file-list.txt information collected by the step_pkg_size
>> hook, it determines to which package the file belongs. It is therefore
>> able to give the size installed by each package.
>>
>> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
>> ---
>>  support/scripts/size-stats | 238 +++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 238 insertions(+)
>>  create mode 100755 support/scripts/size-stats
>
> Other than a few minor suggestion below, things look good.
>
> Reviewed-by: Ryan Barnett <ryanbarnett3@gmail.com>
> Tested-by: Ryan Barnett <ryanbarnett3@gmail.com>
>
>> diff --git a/support/scripts/size-stats b/support/scripts/size-stats
>> new file mode 100755
>> index 0000000..48a64cd
>> --- /dev/null
>> +++ b/support/scripts/size-stats
>> @@ -0,0 +1,238 @@
>
> [...]
>
>> +#
>> +# This function builds a dictionary that contains the name of a
>> +# package as key, and the size of the files installed by this package
>> +# as the value.
>> +#
>> +# filesdict: dictionary with the name of the files as key, and as
>> +# value a tuple containing the name of the package to which the files
>> +# belongs, and the size of the file. As returned by
>> +# build_package_dict.
>> +#
>> +# builddir: path to the Buildroot output directory
>> +#
>> +def build_package_size(filesdict, builddir):
>> +    pkgsize = collections.defaultdict(int)
>> +
>> +    for root, _, files in os.walk(os.path.join(builddir, "target")):
>> +        for f in files:
>> +            fpath = os.path.join(root, f)
>> +            if os.path.islink(fpath):
>> +                continue
>> +            frelpath = os.path.relpath(fpath, os.path.join(builddir, "target"))
>> +            if not frelpath in filesdict:
>> +                print("WARNING: %s is not part of any package" % frelpath)
>
> Would it be useful to have an exclusion list since this will always be
> printed out?

Or maybe a single warning that has a path to a file where these are appended to?
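
Roughly what I have in mind, as a sketch only. report_unowned() and
the log path are hypothetical, and the caller would have to collect
the unowned paths while walking the target directory:

```python
# Hypothetical sketch: report_unowned() is an illustrative name that
# does not exist in the size-stats script under review.
def report_unowned(unowned, logpath):
    """Append unowned file paths to logpath and print a single warning.

    Returns the number of paths written (0 when there are none).
    """
    if not unowned:
        return 0
    with open(logpath, "a") as logf:
        for frelpath in sorted(unowned):
            logf.write(frelpath + "\n")
    print("WARNING: %d files are not part of any package, see %s"
          % (len(unowned), logpath))
    return len(unowned)
```

That keeps the console output to one line while still leaving the
full list on disk for anyone who wants to inspect it.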

<snip>



-- 
Matthew L Weber / Pr Software Engineer
Airborne Information Systems / Security Systems and Software / Secure Platforms
MS 131-100, C Ave NE, Cedar Rapids, IA, 52498, USA
www.rockwellcollins.com

Note: Any Export License Required Information and License Restricted
Third Party Intellectual Property (TPIP) content must be encrypted and
sent to matthew.weber@corp.rockwellcollins.com.
_______________________________________________
buildroot mailing list
buildroot@busybox.net
http://lists.busybox.net/mailman/listinfo/buildroot
Clayton Shotwell June 3, 2015, 3:50 p.m. UTC | #3
Thomas,

On Mon, May 25, 2015 at 4:56 PM, Thomas Petazzoni
<thomas.petazzoni@free-electrons.com> wrote:
> This new script uses the data collected by the step_pkg_size
> instrumentation hook to generate a pie chart of the size contribution
> of each package to the target root filesystem, and two CSV files with
> statistics about the package size and file size. To achieve this, it
> looks at each file in $(TARGET_DIR), and using the
> packages-file-list.txt information collected by the step_pkg_size
> hook, it determines to which package the file belongs. It is therefore
> able to give the size installed by each package.
>
> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
> ---
>  support/scripts/size-stats | 238 +++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 238 insertions(+)
>  create mode 100755 support/scripts/size-stats
>
> diff --git a/support/scripts/size-stats b/support/scripts/size-stats
> new file mode 100755
> index 0000000..48a64cd
> --- /dev/null
> +++ b/support/scripts/size-stats

> +import sys
> +import os
> +import os.path
> +import argparse
> +import csv
> +import collections
> +
> +try:

Would it be possible to add in the following lines here to ensure the
graphing does not try to connect to an X-server?

import matplotlib
matplotlib.use('Agg')

I ran into an issue testing this on my setup where I ssh into my build
server using screen. I found this solution on stack overflow at the
following link.

http://stackoverflow.com/questions/4706451/how-to-save-a-figure-remotely-with-pylab/4706614#4706614
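
To make the ordering explicit, here is a small sketch.
setup_headless_matplotlib() is a hypothetical wrapper used only for
illustration; in the script itself the two lines would simply sit at
the top of the existing try block, where the call is safest placed
before the first import of matplotlib.pyplot:

```python
# Hypothetical wrapper, for illustration only; the patch would just
# add the import and the use() call inside its existing try block.
def setup_headless_matplotlib():
    """Select the Agg backend; return False if matplotlib is missing."""
    try:
        import matplotlib
    except ImportError:
        return False
    # Agg renders to image files and needs no X server connection
    matplotlib.use('Agg')
    return True
```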

> +    import matplotlib.font_manager as fm
> +    import matplotlib.pyplot as plt
> +except ImportError:
> +    sys.stderr.write("You need python-matplotlib to generate the size graph\n")
> +    exit(1)

Thanks,
Clayton

Clayton Shotwell
Senior Software Engineer, Rockwell Collins
clayton.shotwell@rockwellcollins.com
Romain Naour July 11, 2015, 11:46 a.m. UTC | #4
Hi Clayton, Thomas, all

On 03/06/2015 17:50, Clayton Shotwell wrote:
> Thomas,
> 
> On Mon, May 25, 2015 at 4:56 PM, Thomas Petazzoni
> <thomas.petazzoni@free-electrons.com> wrote:
>> This new script uses the data collected by the step_pkg_size
>> instrumentation hook to generate a pie chart of the size contribution
>> of each package to the target root filesystem, and two CSV files with
>> statistics about the package size and file size. To achieve this, it
>> looks at each file in $(TARGET_DIR), and using the
>> packages-file-list.txt information collected by the step_pkg_size
>> hook, it determines to which package the file belongs. It is therefore
>> able to give the size installed by each package.
>>
>> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
>> ---
>>  support/scripts/size-stats | 238 +++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 238 insertions(+)
>>  create mode 100755 support/scripts/size-stats
>>
>> diff --git a/support/scripts/size-stats b/support/scripts/size-stats
>> new file mode 100755
>> index 0000000..48a64cd
>> --- /dev/null
>> +++ b/support/scripts/size-stats
> 
>> +import sys
>> +import os
>> +import os.path
>> +import argparse
>> +import csv
>> +import collections
>> +
>> +try:
> 
> Would it be possible to add in the following lines here to ensure the
> graphing does not try to connect to an X-server?
> 
> import matplotlib
> matplotlib.use('Agg')
> 
> I ran into an issue testing this on my setup where I ssh into my build
> server using screen. I found this solution on stack overflow at the
> following link.
> 
> http://stackoverflow.com/questions/4706451/how-to-save-a-figure-remotely-with-pylab/4706614#4706614

Samuel and I are OK with importing matplotlib entirely to ensure that
it doesn't try to connect to an X server.
The overhead seems negligible; I am not able to see any difference with
or without the matplotlib import.

Best regards,
Romain Naour

> 
>> +    import matplotlib.font_manager as fm
>> +    import matplotlib.pyplot as plt
>> +except ImportError:
>> +    sys.stderr.write("You need python-matplotlib to generate the size graph\n")
>> +    exit(1)
> 
> Thanks,
> Clayton
> 
> Clayton Shotwell
> Senior Software Engineer, Rockwell Collins
> clayton.shotwell@rockwellcollins.com
Thomas Petazzoni Sept. 2, 2015, 9:08 p.m. UTC | #5
Ryan,

(Yes, I'm replying to a very old e-mail)

On Wed, 27 May 2015 22:18:32 -0500, Ryan Barnett wrote:

> > +def build_package_size(filesdict, builddir):
> > +    pkgsize = collections.defaultdict(int)
> > +
> > +    for root, _, files in os.walk(os.path.join(builddir, "target")):
> > +        for f in files:
> > +            fpath = os.path.join(root, f)
> > +            if os.path.islink(fpath):
> > +                continue
> > +            frelpath = os.path.relpath(fpath, os.path.join(builddir, "target"))
> > +            if not frelpath in filesdict:
> > +                print("WARNING: %s is not part of any package" % frelpath)
> 
> Would it be useful to have an exclusion list since this will always be
> printed out?
> 
> Every time you run 'make clean all size-stats' you will be faced with
> warnings such as this:
> 
> WARNING: THIS_IS_NOT_YOUR_ROOT_FILESYSTEM is not part of any package
> WARNING: etc/ld.so.cache is not part of any package
> WARNING: etc/hostname is not part of any package
> WARNING: etc/os-release is not part of any package
> WARNING: etc/nsswitch.conf is not part of any package
> WARNING: etc/ld.so.conf is not part of any package
> WARNING: etc/network/interfaces is not part of any package
> WARNING: tmp/ldconfig/aux-cache is not part of any package
> WARNING: dev/console is not part of any package
> 
> Initially when I saw this I thought I had done something incorrectly;
> however, I quickly realized that these are files generated by
> Buildroot's makefiles (such as THIS_IS_NOT_YOUR_ROOT_FILESYSTEM and
> etc/hostname). Since these files are generated by Buildroot, one
> shouldn't be concerned about them not being part of any package.
> While typing this: would it make sense to create a package called
> 'buildroot' whose files are defined statically within this script?

Since I am not sure how to handle those files yet, I've left this as is
for the moment. In my tests I'm seeing fewer warnings now:

thomas@skate:~/projets/buildroot (size-stats-v5)$ make size-stats
WARNING: etc/os-release is not part of any package
WARNING: etc/ld.so.conf is not part of any package
WARNING: etc/hostname is not part of any package
WARNING: etc/network/interfaces is not part of any package

Which looks a bit more reasonable.


> > +    # Reduce text size
> > +    proptease = fm.FontProperties()
> > +    proptease.set_size('xx-small')
> > +    plt.setp(autotexts, fontproperties=proptease)
> > +    plt.setp(texts, fontproperties=proptease)
> 
> Could the total size of the filesystem be placed on this graph? I was
> thinking maybe at the bottom of the graph as a subtitle; I don't know
> if this is possible.

I've implemented this idea, thanks for the suggestion!

> Another idea would be an option to specify a chart title. This could
> also be used with the dependency graph or any other graph generated
> by Buildroot (I can't think of any others at the moment). However, I
> would say that should be a future feature.

For this one, I'd say we should handle it together with the other
graphs generated by Buildroot, so I've left it on the side for now.

Thanks!

Thomas

Patch

diff --git a/support/scripts/size-stats b/support/scripts/size-stats
new file mode 100755
index 0000000..48a64cd
--- /dev/null
+++ b/support/scripts/size-stats
@@ -0,0 +1,238 @@ 
+#!/usr/bin/env python
+
+# Copyright (C) 2014 by Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+# General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+
+import sys
+import os
+import os.path
+import argparse
+import csv
+import collections
+
+try:
+    import matplotlib.font_manager as fm
+    import matplotlib.pyplot as plt
+except ImportError:
+    sys.stderr.write("You need python-matplotlib to generate the size graph\n")
+    exit(1)
+
+colors = ['#e60004', '#009836', '#2e1d86', '#ffed00',
+          '#0068b5', '#f28e00', '#940084', '#97c000']
+
+#
+# This function adds a new file to 'filesdict', after checking its
+# size. The 'filesdict' contains the relative path of the file as the
+# key, and as the value a tuple containing the name of the package to
+# which the file belongs and the size of the file.
+#
+# filesdict: the dict to which the file is added
+# relpath: relative path of the file
+# abspath: absolute path to the file
+# pkg: package to which the file belongs
+#
+def add_file(filesdict, relpath, abspath, pkg):
+    if not os.path.exists(abspath):
+        return
+    if os.path.islink(abspath):
+        return
+    sz = os.stat(abspath).st_size
+    filesdict[relpath] = (pkg, sz)
+
+#
+# This function returns a dict containing as keys the files present in
+# the filesystem skeleton, and as value, the string "skeleton". It is
+# used to simulate a fake "skeleton" package, to assign the files from
+# the skeleton to some package.
+#
+# builddir: path to the Buildroot output directory
+# skeleton_path: path to the rootfs skeleton
+#
+def build_skeleton_dict(builddir, skeleton_path):
+    skeleton_files = {}
+    for root, _, files in os.walk(skeleton_path):
+        for f in files:
+            if f == ".empty":
+                continue
+            frelpath = os.path.relpath(os.path.join(root, f), skeleton_path)
+            # Get the real size of the installed file
+            targetpath = os.path.join(builddir, "target", frelpath)
+            add_file(skeleton_files, frelpath, targetpath, "skeleton")
+    return skeleton_files
+
+#
+# This function returns a dict where each key is the path of a file in
+# the root filesystem, and the value is a tuple containing two
+# elements: the name of the package to which this file belongs and the
+# size of the file.
+#
+# builddir: path to the Buildroot output directory
+#
+def build_package_dict(builddir):
+    filesdict = {}
+    with open(os.path.join(builddir, "build", "packages-file-list.txt")) as filelistf:
+        for l in filelistf.readlines():
+            pkg, fpath = l.split(",")
+            # remove the initial './' in each file path
+            fpath = fpath.strip()[2:]
+            fullpath = os.path.join(builddir, "target", fpath)
+            add_file(filesdict, fpath, fullpath, pkg)
+    return filesdict
+
+#
+# This function builds a dictionary that contains the name of a
+# package as key, and the size of the files installed by this package
+# as the value.
+#
+# filesdict: dictionary with the name of the files as key, and as
+# value a tuple containing the name of the package to which the file
+# belongs, and the size of the file. As returned by
+# build_package_dict.
+#
+# builddir: path to the Buildroot output directory
+#
+def build_package_size(filesdict, builddir):
+    pkgsize = collections.defaultdict(int)
+
+    for root, _, files in os.walk(os.path.join(builddir, "target")):
+        for f in files:
+            fpath = os.path.join(root, f)
+            if os.path.islink(fpath):
+                continue
+            frelpath = os.path.relpath(fpath, os.path.join(builddir, "target"))
+            if not frelpath in filesdict:
+                print("WARNING: %s is not part of any package" % frelpath)
+                pkg = "unknown"
+            else:
+                pkg = filesdict[frelpath][0]
+
+            pkgsize[pkg] += os.path.getsize(fpath)
+
+    return pkgsize
+
+#
+# Given a dict returned by build_package_size(), this function
+# generates a pie chart of the size installed by each package.
+#
+# pkgsize: dictionary with the name of the package as a key, and the
+# size as the value, as returned by build_package_size.
+#
+# outputf: output file for the graph
+#
+def draw_graph(pkgsize, outputf):
+    total = sum(pkgsize.values())
+    labels = []
+    values = []
+    other_value = 0
+    for (p, sz) in pkgsize.items():
+        if sz < (total * 0.01):
+            other_value += sz
+        else:
+            labels.append("%s (%d kB)" % (p, sz / 1000.))
+            values.append(sz)
+    labels.append("Other (%d kB)" % (other_value / 1000.))
+    values.append(other_value)
+
+    plt.figure()
+    patches, texts, autotexts = plt.pie(values, labels=labels,
+                                        autopct='%1.1f%%', shadow=True,
+                                        colors=colors)
+    # Reduce text size
+    proptease = fm.FontProperties()
+    proptease.set_size('xx-small')
+    plt.setp(autotexts, fontproperties=proptease)
+    plt.setp(texts, fontproperties=proptease)
+
+    plt.title('Size per package')
+    plt.savefig(outputf)
+
+#
+# Generate a CSV file with statistics about the size of each file, its
+# size contribution to the package and to the overall system.
+#
+# filesdict: dictionary with the name of the files as key, and as
+# value a tuple containing the name of the package to which the file
+# belongs, and the size of the file. As returned by
+# build_package_dict.
+#
+# pkgsizes: dictionary with the name of the package as a key, and the
+# size as the value, as returned by build_package_size.
+#
+# outputf: output CSV file
+#
+def gen_files_csv(filesdict, pkgsizes, outputf):
+    total = 0
+    for (p, sz) in pkgsizes.items():
+        total += sz
+    with open(outputf, 'w') as csvfile:
+        wr = csv.writer(csvfile, delimiter=',', quoting=csv.QUOTE_MINIMAL)
+        wr.writerow(["File name",
+                     "Package name",
+                     "File size",
+                     "Package size",
+                     "File size in package (%)",
+                     "File size in system (%)"])
+        for f, (pkgname, filesize) in filesdict.items():
+            pkgsize = pkgsizes[pkgname]
+            wr.writerow([f, pkgname, filesize, pkgsize,
+                         "%.1f" % (float(filesize) / pkgsize * 100),
+                         "%.1f" % (float(filesize) / total * 100)])
+
+
+#
+# Generate a CSV file with statistics about the size of each package,
+# and their size contribution to the overall system.
+#
+# pkgsizes: dictionary with the name of the package as a key, and the
+# size as the value, as returned by build_package_size.
+#
+# outputf: output CSV file
+#
+def gen_packages_csv(pkgsizes, outputf):
+    total = sum(pkgsizes.values())
+    with open(outputf, 'w') as csvfile:
+        wr = csv.writer(csvfile, delimiter=',', quoting=csv.QUOTE_MINIMAL)
+        wr.writerow(["Package name", "Package size", "Package size in system (%)"])
+        for (pkg, size) in pkgsizes.items():
+            wr.writerow([pkg, size, "%.1f" % (float(size) / total * 100)])
+
+parser = argparse.ArgumentParser(description='Draw size statistics graphs')
+
+parser.add_argument("--builddir", '-i', metavar="BUILDDIR", required=True,
+                    help="Buildroot output directory")
+parser.add_argument("--graph", '-g', metavar="GRAPH",
+                    help="Graph output file (.pdf or .png extension)")
+parser.add_argument("--file-size-csv", '-f', metavar="FILE_SIZE_CSV",
+                    help="CSV output file with file size statistics")
+parser.add_argument("--package-size-csv", '-p', metavar="PKG_SIZE_CSV",
+                    help="CSV output file with package size statistics")
+parser.add_argument("--skeleton-path", '-s', metavar="SKELETON_PATH", required=True,
+                    help="Path to the skeleton used for the system")
+args = parser.parse_args()
+
+# Find out which package installed what files
+pkgdict = build_package_dict(args.builddir)
+pkgdict.update(build_skeleton_dict(args.builddir, args.skeleton_path))
+
+# Collect the size installed by each package
+pkgsize = build_package_size(pkgdict, args.builddir)
+
+if args.graph:
+    draw_graph(pkgsize, args.graph)
+if args.file_size_csv:
+    gen_files_csv(pkgdict, pkgsize, args.file_size_csv)
+if args.package_size_csv:
+    gen_packages_csv(pkgsize, args.package_size_csv)