From patchwork Fri Feb 24 04:28:35 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Axtens
X-Patchwork-Id: 731892
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from lists.ozlabs.org (lists.ozlabs.org [103.22.144.68])
 (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by ozlabs.org (Postfix) with ESMTPS id 3vTypC5RsZz9s7R
 for ; Fri, 24 Feb 2017 15:30:43 +1100 (AEDT)
Authentication-Results: ozlabs.org;
 dkim=fail reason="signature verification failed" (1024-bit key; unprotected)
 header.d=axtens.net header.i=@axtens.net header.b="W/wbTMoT";
 dkim-atps=neutral
Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])
 by lists.ozlabs.org (Postfix) with ESMTP id 3vTypC4CZgzDqJ6
 for ; Fri, 24 Feb 2017 15:30:43 +1100 (AEDT)
Authentication-Results: lists.ozlabs.org;
 dkim=fail reason="signature verification failed" (1024-bit key; unprotected)
 header.d=axtens.net header.i=@axtens.net header.b="W/wbTMoT";
 dkim-atps=neutral
X-Original-To: patchwork@lists.ozlabs.org
Delivered-To: patchwork@lists.ozlabs.org
Received: from mail-pg0-x243.google.com (mail-pg0-x243.google.com [IPv6:2607:f8b0:400e:c05::243])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (No client certificate requested)
 by lists.ozlabs.org (Postfix) with ESMTPS id 3vTymD0wxZzDqHB
 for ; Fri, 24 Feb 2017 15:28:59 +1100 (AEDT)
Authentication-Results: lists.ozlabs.org;
 dkim=pass (1024-bit key; unprotected)
 header.d=axtens.net header.i=@axtens.net header.b="W/wbTMoT";
 dkim-atps=neutral
Received: by mail-pg0-x243.google.com with SMTP id 1so1558084pgz.2
 for ; Thu, 23 Feb 2017 20:28:59 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=axtens.net; s=google;
 h=from:to:cc:subject:date:message-id;
 bh=oRJtw5GbGlV8QftXOcGuR7mdCQ2CdXIRI8TVMm+QP3c=;
 b=W/wbTMoTuHEajNF3caYlohLm8zKMpMdEnqYo0NeKdbLf7esgs6RqJAwnD30FhUFIVf
 80T46XTvAPoHBFcDBEloHrzYv/srrOB5+qGu4HcA3uIGaCkC7tVubya7Dfaf4Xc+cYSW
 qZP/YSiGyqdayR2GhRa+9V05xCRp7BhJFILYI=
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:from:to:cc:subject:date:message-id;
 bh=oRJtw5GbGlV8QftXOcGuR7mdCQ2CdXIRI8TVMm+QP3c=;
 b=iqtKa/CIC5qaQc7cwqB0p/JY7mkVjJVPBUchjAqpBPYNN+J1LuOdflYTaubZxAP1+6
 c1/cmh7z1FDMyzA6IWPTz+pWp9fDTN+H1oe9mxtZJeg4r9AqP5ulna3wAl8anKr3gPPc
 YWu2x6OsGDJBt9JcTaS9UTTsZ9xUiBLF6wxl0Rg5oo9oC7mw4/1620rc4QE2z848CZsy
 +jbg6RqNhJKctjkY9881ETPQgvnoMnjlA2dCRA8XneUp9P2xapypp9syzND9Wvo+8qUI
 3wPhrBgS3V14tCNaDZTVEUkK6YXjDsxg5J5R+Zdfdo5/8XfozIxfYelEjmcJKPSw5T4S
 Bg/g==
X-Gm-Message-State: AMke39n7drRmxlVQkryhLePFZLTpPy6teLcselmByTOjer/5Z38e2Q4mLFt+a2alm9/T/Q==
X-Received: by 10.99.123.65 with SMTP id k1mr985877pgn.17.1487910537945;
 Thu, 23 Feb 2017 20:28:57 -0800 (PST)
Received: from connectitude.ozlabs.ibm.com ([122.99.82.10])
 by smtp.gmail.com with ESMTPSA id g28sm12431908pgn.3.2017.02.23.20.28.56
 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Thu, 23 Feb 2017 20:28:57 -0800 (PST)
From: Daniel Axtens
To: patchwork@lists.ozlabs.org
Subject: [PATCH] Make parsemail-batch a management command
Date: Fri, 24 Feb 2017 15:28:35 +1100
Message-Id: <20170224042835.12728-1-dja@axtens.net>
X-Mailer: git-send-email 2.9.3
X-BeenThere: patchwork@lists.ozlabs.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Patchwork development
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
MIME-Version: 1.0
Errors-To: patchwork-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org
Sender: "Patchwork"

This brings it into line with the other import commands.

It's also an order of magnitude or so faster.

Signed-off-by: Daniel Axtens
---
This isn't that useful per se, but I'm using it to benchmark the
performance impact of changes and look for opportunities to speed up.
---
 patchwork/bin/parsemail-batch.sh                 | 25 +++-----
 patchwork/management/commands/parsemail-batch.py | 77 ++++++++++++++++++++++++
 2 files changed, 85 insertions(+), 17 deletions(-)
 create mode 100644 patchwork/management/commands/parsemail-batch.py

diff --git a/patchwork/bin/parsemail-batch.sh b/patchwork/bin/parsemail-batch.sh
index d42712ed0995..f35808e9d11b 100755
--- a/patchwork/bin/parsemail-batch.sh
+++ b/patchwork/bin/parsemail-batch.sh
@@ -20,25 +20,16 @@
 # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 
 BIN_DIR=$(dirname "$0")
+PATCHWORK_BASE=$(readlink -e "$BIN_DIR/../..")
 
-if [ $# -lt 1 ]; then
-    echo "usage: $0 [options]" >&2
-    exit 1
+if [ -z "$PW_PYTHON" ]; then
+    PW_PYTHON=python2
 fi
 
-mail_dir="$1"
-
-echo "dir: $mail_dir"
-
-if [ ! -d "$mail_dir" ]; then
-    echo "$mail_dir should be a directory"? >&2
-    exit 1
+if [ -z "$DJANGO_SETTINGS_MODULE" ]; then
+    DJANGO_SETTINGS_MODULE=patchwork.settings.production
 fi
 
-shift
-
-find "$mail_dir" -maxdepth 1 |
-while read -r line; do
-    echo "$line"
-    "$BIN_DIR/parsemail.sh" "$@" < "$line"
-done
+PYTHONPATH="${PATCHWORK_BASE}:${PATCHWORK_BASE}/lib/python:$PYTHONPATH" \
+    DJANGO_SETTINGS_MODULE="$DJANGO_SETTINGS_MODULE" \
+    "$PW_PYTHON" "$PATCHWORK_BASE/manage.py" parsemail-batch "$@"
diff --git a/patchwork/management/commands/parsemail-batch.py b/patchwork/management/commands/parsemail-batch.py
new file mode 100644
index 000000000000..ad4dce35f595
--- /dev/null
+++ b/patchwork/management/commands/parsemail-batch.py
@@ -0,0 +1,77 @@
+# Patchwork - automated patch tracking system
+# Copyright (C) 2017, Daniel Axtens, IBM Corporation.
+#
+# This file is part of the Patchwork package.
+#
+# Patchwork is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# Patchwork is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+
+import email
+import logging
+from optparse import make_option
+import sys
+import os
+
+import django
+from django.core.management import base
+from django.utils import six
+
+from patchwork.parser import parse_mail
+
+logger = logging.getLogger(__name__)
+
+
+class Command(base.BaseCommand):
+    help = 'Parse a directory of mbox files and store any patches/comments found.'
+
+    if django.VERSION < (1, 8):
+        args = ''
+        option_list = base.BaseCommand.option_list + (
+            make_option(
+                '--list-id',
+                help='mailing list ID. If not supplied, this will be '
+                'extracted from the mail headers.'),
+        )
+    else:
+        def add_arguments(self, parser):
+            parser.add_argument(
+                'indir',
+                type=str,
+                default=None,
+                help='input mbox directory')
+            parser.add_argument(
+                '--list-id',
+                help='mailing list ID. If not supplied, this will be '
+                'extracted from the mail headers.')
+
+    def handle(self, *args, **options):
+        indir = args[0] if args else options['indir']
+
+        if not os.path.isdir(indir):
+            logger.error('%s is not a directory' % indir)
+            sys.exit(1)
+
+        for infile in sorted(os.listdir(indir)):
+            print(infile)
+            if six.PY3:
+                with open(os.path.join(indir, infile), 'rb') as file_:
+                    mail = email.message_from_binary_file(file_)
+            else:
+                with open(os.path.join(indir, infile)) as file_:
+                    mail = email.message_from_file(file_)
+
+            try:
+                result = parse_mail(mail, options['list_id'])
+                if not result:
+                    logger.warning('Failed to parse mail')
+            except Exception as e:
+                logger.warning('Error when parsing incoming email ' + \
+                               str(e),
+                               extra={'mail': mail.as_string()})
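
For anyone who wants to try the directory walk without a Django setup, here
is a minimal Python 3 sketch of the loop in handle() above. It is only an
illustration under stated assumptions: the name parse_directory is
hypothetical and not part of the patch, and the isfile() check is an
addition of mine (the patch opens every directory entry as-is). It parses
each file with the standard library only and does not call
patchwork.parser.parse_mail or touch the database.

```python
import email
import os


def parse_directory(indir):
    """Yield (filename, message) pairs for the files in indir.

    Hypothetical helper, not part of the patch. Files are visited in
    sorted order, as in the patch's handle() loop.
    """
    for name in sorted(os.listdir(indir)):
        path = os.path.join(indir, name)
        # Assumption: skip subdirectories and other non-files. The patch
        # itself does not do this and would fail at open() on a directory.
        if not os.path.isfile(path):
            continue
        # Read as bytes so that mails in any encoding can be parsed.
        with open(path, 'rb') as file_:
            yield name, email.message_from_binary_file(file_)
```

A caller could then iterate with "for name, mail in parse_directory(path):"
and read headers such as mail['Subject'] before handing each message to a
parser of its choice.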