[FYI] Scripts to generate project stats

Submitted by Anthony Liguori on Aug. 2, 2011, 2:36 a.m.

Details

Message ID 1312252615-31567-1-git-send-email-aliguori@us.ibm.com
State New

Commit Message

Anthony Liguori Aug. 2, 2011, 2:36 a.m.
As part of my talk for KVM Forum, I am collecting some stats on the project
since last year.  I thought I'd share the scripts in case anyone is interested
in how they work.

I think this is just about all of the data I need, but patches are certainly
welcome.

Of course, you'll have to come to KVM Forum to see the pretty version of these
stats (there should be videos too for those who can't make it :-))

And thanks to Alex for poking me to collect these too.
---
 scripts/aliases.txt   |   18 +++++++++
 scripts/companies.txt |   20 ++++++++++
 scripts/genstats.sh   |   96 +++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 134 insertions(+), 0 deletions(-)
 create mode 100644 scripts/aliases.txt
 create mode 100644 scripts/companies.txt
 create mode 100755 scripts/genstats.sh

Comments

Anthony Liguori Aug. 2, 2011, 3:12 a.m.
On 08/01/2011 09:36 PM, Anthony Liguori wrote:
> As part of my talk for KVM Forum, I am collecting some stats on the project
> since last year.  I thought I'd share the scripts in case anyone is interested
> in how they work.
>
> +function gen-stats() {
> +    until="$1"
> +    since="$2"
> +
> +    echo 'Total Commits'
> +    echo '-------------'
> +    gen-commits "$until" "$since"
> +    echo
> +
> +    echo 'Committers'
> +    echo '----------'
> +    gen-committers "$until" "$since"
> +    echo
> +
> +    echo 'Authors'
> +    echo '-------'
> +    gen-authors "$until" "$since"
> +    echo
> +
> +    echo 'Companies'
> +    echo '---------'
> +    gen-companies "$util" "$since"

Should be:

gen-companies "$until" "$since"

Regards,

Anthony Liguori

> +}
> +
> +gen-stats "$1" "$2"
> +
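
Note on the typo above: a misspelled variable such as "$util" expands silently to an empty string in bash, so gen-companies would have run without an effective --until bound. As a minimal sketch (not part of the posted patch), enabling `set -u` makes bash abort on a reference to an unset variable. The helper name check_typo is hypothetical, and the sketch assumes no variable called util is set in the environment.

```shell
#!/bin/bash
# Sketch only: show how `set -u` (nounset) surfaces a misspelled variable.
# check_typo is a hypothetical helper, not part of genstats.sh.

check_typo() {
    # Run in a subshell so the failure does not kill the caller.
    (
        unset util          # make sure the misspelled name really is unset
        set -u
        until="$1"
        # "$util" is the typo from the patch; with nounset the subshell
        # aborts here with a non-zero status instead of printing "until=".
        echo "until=$util"
    ) 2>/dev/null
}

if check_typo "today"; then
    echo "typo went unnoticed"
else
    echo "typo detected"
fi
```

Whether genstats.sh should run with `set -u` throughout is a judgment call; the point is only that the check is cheap.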

Peter Maydell Aug. 2, 2011, 9:13 a.m.
On 2 August 2011 03:36, Anthony Liguori <aliguori@us.ibm.com> wrote:
> diff --git a/scripts/aliases.txt b/scripts/aliases.txt
> new file mode 100644
> index 0000000..aadea25
> --- /dev/null
> +++ b/scripts/aliases.txt
> @@ -0,0 +1,18 @@
> +andrew.zaborowski@intel.com: balrog@zabor.org
> +edgar@axis.com: edgar.iglesias@gmail.com
> +edgar.iglesias@petalogix.com: edgar.iglesias@gmail.com
> +lcapitulino@gmail.com: lcapitulino@redhat.com
> +riku.voipio@nokia.com: riku.voipio@linaro.org
> +riku.voipio@iki.fi: riku.voipio@linaro.org
> +andreas.faerber: andreas.faerber@web.de
> +anthony@codemonkey.ws: aliguori@us.ibm.com
> +atar4qemu@googlemail.com: atar4qemu@gmail.com
> +bernhard.kohl@gmx.net: bernhard.kohl@nsn.com
> +jan.kiszka@web.de: jan.kiszka@siemens.com
> +mail@kevin-wolf.de: kwolf@redhat.com
> +marcandre.lureau@gmail.com: marcandre.lureau@redhat.com
> +rth@twiddle.net: rth@redhat.com
> +sripathi@sripathi.in.ibm.com: sripathik@in.ibm.com

Maybe we should have a git ".mailmap" instead? Then
"git shortlog -nse" and friends would use it.
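
For illustration, a minimal sketch of what such a conversion could look like: each "alias: canonical" line of aliases.txt becomes a "<canonical> <alias>" mailmap entry, which is one of the entry forms git accepts. The function name aliases_to_mailmap is hypothetical, and this has not been run against the QEMU tree.

```shell
#!/bin/bash
# Sketch only: convert "alias: canonical" lines (the aliases.txt layout)
# into .mailmap entries of the form "<canonical> <alias>".
# aliases_to_mailmap is a hypothetical helper name.

aliases_to_mailmap() {
    while IFS= read -r line; do
        # Skip blank lines such as the trailing ones in aliases.txt.
        [ -z "$line" ] && continue
        alias_addr="${line%%: *}"    # text before the first ": "
        canonical="${line#*: }"      # text after the first ": "
        printf '<%s> <%s>\n' "$canonical" "$alias_addr"
    done
}

# Example: one entry taken from the patch.
echo 'mail@kevin-wolf.de: kwolf@redhat.com' | aliases_to_mailmap
# prints: <kwolf@redhat.com> <mail@kevin-wolf.de>

# Assumed usage from the top of the tree:
#   aliases_to_mailmap < scripts/aliases.txt > .mailmap
#   git shortlog -nse
```

With a .mailmap in place, the dedup step in genstats.sh would arguably become unnecessary for tools that honour the mailmap.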

> +++ b/scripts/companies.txt

> +Code Sourcery: codesourcery.com

I believe there's no space in this name, i.e. it should be
"CodeSourcery". (source: http://www.codesourcery.com/company)

-- PMM

Patch

diff --git a/scripts/aliases.txt b/scripts/aliases.txt
new file mode 100644
index 0000000..aadea25
--- /dev/null
+++ b/scripts/aliases.txt
@@ -0,0 +1,18 @@ 
+andrew.zaborowski@intel.com: balrog@zabor.org
+edgar@axis.com: edgar.iglesias@gmail.com
+edgar.iglesias@petalogix.com: edgar.iglesias@gmail.com
+lcapitulino@gmail.com: lcapitulino@redhat.com
+riku.voipio@nokia.com: riku.voipio@linaro.org
+riku.voipio@iki.fi: riku.voipio@linaro.org
+andreas.faerber: andreas.faerber@web.de
+anthony@codemonkey.ws: aliguori@us.ibm.com
+atar4qemu@googlemail.com: atar4qemu@gmail.com
+bernhard.kohl@gmx.net: bernhard.kohl@nsn.com
+jan.kiszka@web.de: jan.kiszka@siemens.com
+mail@kevin-wolf.de: kwolf@redhat.com
+marcandre.lureau@gmail.com: marcandre.lureau@redhat.com
+rth@twiddle.net: rth@redhat.com
+sripathi@sripathi.in.ibm.com: sripathik@in.ibm.com
+
+
+
diff --git a/scripts/companies.txt b/scripts/companies.txt
new file mode 100644
index 0000000..436e3b3
--- /dev/null
+++ b/scripts/companies.txt
@@ -0,0 +1,20 @@ 
+Red Hat: redhat.com hch@lst.de glommer@mothafucka.localdomain
+SuSE: suse.de novell.com
+IBM: ibm.com kernel.crashing.org gibson.dropbear.id.au
+AMD: amd.com
+Citrix: citrix.com
+Canonical: canonical.com
+Intel: intel.com
+VIA: viatech.com.cn
+Linaro: linaro
+Google: google.com
+CodeSourcery: codesourcery.com
+Siemens: siemens.com siemens-enterprise.com
+Fujitsu: fujitsu.com
+Dream Host: dreamhost.com
+Nokia: nokia.com
+Samsung: samsung.com
+NTT: lab.ntt.co.jp
+FreeScale: freescale.com
+XenSource: xensource.com
+VA Linux: valinux.co.jp
diff --git a/scripts/genstats.sh b/scripts/genstats.sh
new file mode 100755
index 0000000..6d1228f
--- /dev/null
+++ b/scripts/genstats.sh
@@ -0,0 +1,96 @@ 
+#!/bin/bash
+
+# Usage: scripts/genstats.sh "today" "1 year ago"
+
+aliases="scripts/aliases.txt"
+companies="scripts/companies.txt"
+
+function dedup() {
+    while read addr; do
+	f=`grep "^$addr: " "$aliases" | cut -f2- -d' '`
+	if test "$f"; then
+	    echo "$f"
+	else
+	    echo "$addr"
+	fi
+    done
+}
+
+function gen-committers() {
+    until="$1"
+    since="$2"
+
+    git log --until="$until" --since="$since" --pretty=format:%ce | \
+	sort -u | dedup | sort -u | while read committer; do
+	addresses=`grep " $committer\$" "$aliases" | cut -f1 -d: | while read a; do echo -n "--committer=$a "; done`
+	
+	echo -n "$committer, "
+	git log --until="$until" --since="$since" \
+	    --pretty=oneline --committer="$committer" $addresses | wc -l
+    done
+}
+
+function gen-authors() {
+    until="$1"
+    since="$2"
+
+    git log --until="$until" --since="$since" --pretty=format:%ae | \
+	sort -u | dedup | sort -u | while read author; do
+	addresses=`grep " $author\$" "$aliases" | cut -f1 -d: | while read a; do echo -n "--author=$a "; done`
+	
+	echo -n "$author, "
+	git log --until="$until" --since="$since" \
+	    --pretty=oneline --author="$author" $addresses | wc -l
+    done
+}
+
+function gen-commits() {
+    until="$1"
+    since="$2"
+
+    git log --until="$until" --since="$since" --pretty=oneline | wc -l
+}
+
+function gen-companies() {
+    until="$1"
+    since="$2"
+
+    cat "$companies" | while read LINE; do
+	company=`echo $LINE | cut -f1 -d:`
+	addrs=`echo $LINE | cut -f2- -d:`
+
+	authors=`echo "$addrs" | sed -e 's: : --author=:g'`
+	echo "$company," \
+	    `git log --until="$until" --since="$since" --pretty=oneline \
+	         $authors | wc -l`, \
+            `git log --until="$until" --since="$since" --pretty=format:%ae \
+	         $authors | sort -u | dedup | sort -u | wc -l`
+    done
+}
+
+function gen-stats() {
+    until="$1"
+    since="$2"
+
+    echo 'Total Commits'
+    echo '-------------'
+    gen-commits "$until" "$since"
+    echo
+
+    echo 'Committers'
+    echo '----------'
+    gen-committers "$until" "$since"
+    echo
+
+    echo 'Authors'
+    echo '-------'
+    gen-authors "$until" "$since"
+    echo
+
+    echo 'Companies'
+    echo '---------'
+    gen-companies "$until" "$since"
+}
+
+gen-stats "$1" "$2"
+
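
To make the gen-companies step easier to follow, here is a minimal standalone sketch of how one companies.txt line is split into a company name and a set of --author= options. It mirrors the cut/sed logic in the patch but uses parameter expansion for the split; the helper name company_author_flags is hypothetical.

```shell
#!/bin/bash
# Sketch only: show how one companies.txt line is split into a company
# name and a list of --author= patterns, mirroring gen-companies.
# company_author_flags is a hypothetical helper name.

company_author_flags() {
    line="$1"
    company="${line%%:*}"          # text before the first ':'
    addrs="${line#*:}"             # text after it, starting with a space
    # Each space-separated pattern becomes its own --author= option.
    flags=$(echo "$addrs" | sed -e 's: : --author=:g')
    echo "$company:$flags"
}

company_author_flags 'SuSE: suse.de novell.com'
# prints: SuSE: --author=suse.de --author=novell.com
```

git log treats several --author options as alternatives and matches each one as a pattern against the author line, so a bare entry such as "linaro" matches any address containing that string.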