From patchwork Wed Feb 21 14:17:10 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Axtens
X-Patchwork-Id: 876138
From: Daniel Axtens
To: patchwork@lists.ozlabs.org
Cc: Thomas Petazzoni, Andrew Donnellan
Subject: [PATCH 3/9] tools/scripts: split a mbox N ways
Date: Thu, 22 Feb 2018 01:17:10 +1100
Message-Id: <20180221141716.10908-4-dja@axtens.net>
X-Mailer: git-send-email 2.14.1
In-Reply-To: <20180221141716.10908-1-dja@axtens.net>
References: <20180221141716.10908-1-dja@axtens.net>
List-Id: Patchwork development

To test parallel loading of mail, it's handy to be able to split
an existing mbox file into N mbox files in an alternating pattern
(e.g. 1 2 1 2 or 1 2 3 4 1 2 3 4 etc)

Introduce tools/scripts as a place to put things like this.

Signed-off-by: Daniel Axtens
Reviewed-by: Andrew Donnellan
---
 tools/scripts/split_mail.py | 76 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 76 insertions(+)
 create mode 100755 tools/scripts/split_mail.py

diff --git a/tools/scripts/split_mail.py b/tools/scripts/split_mail.py
new file mode 100755
index 000000000000..ce71fe16c362
--- /dev/null
+++ b/tools/scripts/split_mail.py
@@ -0,0 +1,76 @@
+#!/usr/bin/python3
+# Patchwork - automated patch tracking system
+# Copyright (C) 2018 Daniel Axtens
+#
+# This file is part of the Patchwork package.
+#
+# Patchwork is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# Patchwork is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+
+import sys
+import os
+import mailbox
+
+usage = """Split a maildir or mbox into N mboxes
+in an alternating pattern
+
+Usage: ./split_mail.py <in> <out> <N>
+
+ <in>: input mbox file or Maildir
+ <out>: output mbox
+        <out>-1... must not exist
+ <N>    N-way split"""
+
+
+in_name = sys.argv[1]
+out_name = sys.argv[2]
+
+try:
+    n = int(sys.argv[3])
+except:
+    print("N must be an integer.")
+    print(" ")
+    print(usage)
+    exit(1)
+
+if n < 2:
+    print("N must be at least 2")
+    print(" ")
+    print(usage)
+    exit(1)
+
+if not os.path.exists(in_name):
+    print("No input at ", in_name)
+    print(" ")
+    print(usage)
+    exit(1)
+
+print("Opening", in_name)
+if os.path.isdir(in_name):
+    inmail = mailbox.Maildir(in_name)
+else:
+    inmail = mailbox.mbox(in_name)
+
+out=[]
+for i in range(n):
+    if os.path.exists(out_name+"-"+str(i+1)):
+        print("mbox already exists at ", out_name+"-"+str(i+1))
+        print(" ")
+        print(usage)
+        exit(1)
+
+    out += [mailbox.mbox(out_name+'-'+str(i+1))]
+
+print("Copying messages")
+
+for (i, msg) in enumerate(inmail):
+    out[i % n].add(msg)
+
+print("Done")
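
For reference, the alternating split can also be written as a reusable
function. The following is only a sketch and not part of the patch: the name
split_round_robin and the choice to raise exceptions instead of printing the
usage text are assumptions made for illustration. It keeps the patch's rule
that message i (counting from 0) goes to output number (i % N) + 1, and it
closes every output explicitly so that pending writes are flushed, which the
standard mailbox documentation recommends.

```python
import mailbox
import os


def split_round_robin(in_name, out_name, n):
    """Copy messages from in_name into n mboxes, out_name-1 .. out_name-n.

    Hypothetical helper, not part of the patch. Message i (0-based) is
    appended to output number (i % n) + 1. Returns the output paths.
    """
    if n < 2:
        raise ValueError("N must be at least 2")
    if not os.path.exists(in_name):
        raise FileNotFoundError(in_name)

    # Refuse to touch outputs that already exist, as the patch does.
    paths = ["%s-%d" % (out_name, i + 1) for i in range(n)]
    for path in paths:
        if os.path.exists(path):
            raise FileExistsError(path)

    # A directory is treated as a Maildir, anything else as an mbox.
    if os.path.isdir(in_name):
        inmail = mailbox.Maildir(in_name, create=False)
    else:
        inmail = mailbox.mbox(in_name, create=False)

    outs = [mailbox.mbox(path) for path in paths]
    try:
        for i, msg in enumerate(inmail):
            outs[i % n].add(msg)
    finally:
        # close() flushes pending changes before releasing the file.
        for out in outs:
            out.close()
        inmail.close()
    return paths
```

With N = 2 and five input messages, the first output would receive messages
0, 2 and 4 and the second messages 1 and 3, matching the "1 2 1 2" pattern
in the commit message.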