Message ID | 20190711092801.6314-1-victor.huesca@bootlin.com |
---|---|
State | Changes Requested |
Headers | show |
Series | [v2] support/script/pkg-stats: add support for json output | expand |
Hello Victor, Thanks for this work! On Thu, 11 Jul 2019 11:28:02 +0200 Victor Huesca <victor.huesca@bootlin.com> wrote: > Pkg-stats is a great script that get a lot of intersting infos from Typo: interesting > buildroot packages. Unfortunatly it is currently designed to output a Typo: unfortunately > static HTML page only. While html is great for human to read, it is a > pain to parse for scripts. > > This patch provide a new option to output a JSON file in addition to the provide -> provides > HTML one. This allow other scripts to use all the usefull informations allow -> allows informations -> information > computed by pkg-stats. > > Please note that the old 'output' option has been renamed to 'html' > to distinguish from the new 'json' option. Also the 'npackages' and > 'packages' now uses the argparse `mutualy_exclusive` group. uses -> use mutually_exclusive > > Signed-off-by: Victor Huesca <victor.huesca@bootlin.com> I think ideally this patch should have been split into two: - One migrating the options to a better argparse usage - A second one adding the JSON output support itself. > +def dump_json(packages, stats, output): > + excluded_fields = ['url_worker', 'name'] > + pkgs = { > + pkg.name: { > + k: v > + for k, v in pkg.__dict__.items() > + if k not in excluded_fields > + } for pkg in packages > + } > + statistics = { > + k: v > + for k, v in stats.items() > + if not k.startswith('infra-') > + } > + statistics['infra'] = {k[6:]: v for k, v in stats.items() if k.startswith('infra-')} Perhaps a short comment here would be useful to explain what's going on with the "infra" key in the statistics dict. > + o = subprocess.check_output(['git', 'log', 'master', '-n', '1', '--pretty=format:%H']) > + final = {'packages': pkgs, > + 'stats': statistics, > + 'commit': o.splitlines()[0], > + 'date': str(datetime.datetime.utcnow())} For both the git commit and date, I find it a bit odd that we need to retrieve it separately once for the HTML output and once for the JSON output. I think it would make more sense if the calling function of dump_html() and dump_json() was getting those two values (Git commit id and date), and pass them as argument to dump_html() and dump_json(). Besides these comments, it looks good to me! Thanks again, Thomas
diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats index b0be7d919b..1ff3fb2bd1 100755 --- a/support/scripts/pkg-stats +++ b/support/scripts/pkg-stats @@ -23,7 +23,6 @@ import os from collections import defaultdict import re import subprocess -import sys import requests # URL checking import json import certifi @@ -696,22 +695,52 @@ def dump_html(packages, stats, output): f.write(html_footer) +def dump_json(packages, stats, output): + excluded_fields = ['url_worker', 'name'] + pkgs = { + pkg.name: { + k: v + for k, v in pkg.__dict__.items() + if k not in excluded_fields + } for pkg in packages + } + statistics = { + k: v + for k, v in stats.items() + if not k.startswith('infra-') + } + statistics['infra'] = {k[6:]: v for k, v in stats.items() if k.startswith('infra-')} + o = subprocess.check_output(['git', 'log', 'master', '-n', '1', '--pretty=format:%H']) + final = {'packages': pkgs, + 'stats': statistics, + 'commit': o.splitlines()[0], + 'date': str(datetime.datetime.utcnow())} + + with open(output, 'w') as f: + json.dump(final, f, indent=2, separators=(',', ': ')) + f.write('\n') + + def parse_args(): parser = argparse.ArgumentParser() - parser.add_argument('-o', dest='output', action='store', required=True, + group1 = parser.add_argument_group('output', 'Output file(s)') + group1.add_argument('--html', dest='html', action='store', help='HTML output file') - parser.add_argument('-n', dest='npackages', type=int, action='store', + group1.add_argument('--json', dest='json', action='store', + help='JSON output file') + group2 = parser.add_mutually_exclusive_group() + group2.add_argument('-n', dest='npackages', type=int, action='store', help='Number of packages') - parser.add_argument('-p', dest='packages', action='store', + group2.add_argument('-p', dest='packages', action='store', help='List of packages (comma separated)') - return parser.parse_args() + args = parser.parse_args() + if not args.html and not args.json: + parser.error('at least one of --html or --json (or both) is required') + return args def __main__(): args = parse_args() - if args.npackages and args.packages: - print("ERROR: -n and -p are mutually exclusive") - sys.exit(1) if args.packages: package_list = args.packages.split(",") else: @@ -735,8 +764,12 @@ def __main__(): check_package_latest_version(packages) print("Calculate stats") stats = calculate_stats(packages) - print("Write HTML") - dump_html(packages, stats, args.output) + if args.html: + print("Write HTML") + dump_html(packages, stats, args.html) + if args.json: + print("Write JSON") + dump_json(packages, stats, args.json) __main__()
Pkg-stats is a great script that get a lot of intersting infos from buildroot packages. Unfortunatly it is currently designed to output a static HTML page only. While html is great for human to read, it is a pain to parse for scripts. This patch provide a new option to output a JSON file in addition to the HTML one. This allow other scripts to use all the usefull informations computed by pkg-stats. Please note that the old 'output' option has been renamed to 'html' to distinguish from the new 'json' option. Also the 'npackages' and 'packages' now uses the argparse `mutualy_exclusive` group. Signed-off-by: Victor Huesca <victor.huesca@bootlin.com> --- Changes v1 --> v2: - Make commit message more explicit - fix coding style using flake8 --- support/scripts/pkg-stats | 53 +++++++++++++++++++++++++++++++-------- 1 file changed, 43 insertions(+), 10 deletions(-)