Message ID | 20180327151332.15881-1-vkabatov@redhat.com |
---|---|
State | Changes Requested |
Headers | show |
Series | Include all email headers in mboxes | expand |
vkabatov@redhat.com writes: > From: Veronika Kabatova <vkabatov@redhat.com> > > Solves issue #165 (Exported mboxes should include In-Reply-To, > References, etc headers). Instead of including only a few chosen ones, > all received headers are added to mboxes. Thanks for the patch. I'm a little worried that this will get really messy - I've included a snippet of headers from an email from an unrelated bug below. Maybe we don't care - I guess this isn't really for human consumption but is for consumption by e.g. git-am. Alternatively we can blacklist headers: I don't think there's anything worth having in Received, X-*, List-*, DKIM, ARC, SPF, etc. But I wonder if this is just whack-a-mole in reverse. Thoughts? Regards, Daniel Delivered-To: dja@axtens.net Received: by 10.74.139.216 with SMTP id p24csp390579ooj; Wed, 21 Feb 2018 01:07:26 -0800 (PST) X-Google-Smtp-Source: AH8x227WmScjPDkd1bgpNsd0DYg4ae/OzHPrTT68ezy+A7p6CMpHdsmU6pB6dZluayRkf9LDxR5l X-Received: by 10.99.55.1 with SMTP id e1mr2155932pga.237.1519204046116; Wed, 21 Feb 2018 01:07:26 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1519204046; cv=none; d=google.com; s=arc-20160816; b=PXNfl2kupZqmeFoDWy2tLz6100KxBfLj/Rmu4/OTqiLwcPVNIsGBGE2FN/e/Gj2LCu NDdGSTuaDdYxI0wlCEqBZBrVatoda/PHBX2Fh2HZ2rzEahAs5R2w+YucxRTAi8UT7MTo xxxHqQ1DehoZmG9J3l0Ckh1cXmAplnmLB7+MufnD7FT7ajNufPhGiZ2ovDCcqb0/6KxG 8cfsjAnNAZAF3M8VF8r2KSubGuplPq4sB7aF9eklL0HmnXNVLYkyiwZbJuI/vVh6yBkc rKSdP02SlazOIL8FJuJQC5reac17qfbyjg/tbxyCC+/Lmc/DwexMfA4NPhEgqPZgwFWr 9ILQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :message-id:date:subject:to:from:dkim-signature :arc-authentication-results; bh=CFYy3bhn7eoWPF6AJMH1sMf+K6u6usD+1M2NOg/UHAw=; b=VtdyXugEOzZm0I0t+YzlUoOWX+xI4UC50ZF9w5gd+zlNTsJMDGtykiwjZujU4FzMiP CdCwQqdlTuB5aY99HtOyBl1jjGz2zo1bGxRwr1aGFE+pbyUN5FFlTntYxZ0AD5wbDP3S 2hn6mDxzugFSJ9+LbDaa8lS4g6RENzaxHTPt7B6paAkbSFSZsBtY9oASno7RvI3ctlfz LCb1gAnhvL5p2H5vpNeIOr9eTQIOqhh7+kmmq7tMky2/aB95az+AaU/dKJQoUTsIY7qT gOTZAdHc4+fZHD2CeP9bX8MTUXy1Yc3fNy7ETbYqoe0+Gl2ke8m9nw84rHTmidoygKrH CwoQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=YTP6W3WD; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180 .67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: <linux-kernel-owner@vger.kernel.org> Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id s9si2120330pgn.281.2018.02.21.01.07.26 for <dja@axtens.net>; Wed, 21 Feb 2018 01:07:26 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132. 180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=YTP6W3WD; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180 .67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753028AbeBUJHV (ORCPT <rfc822;dja@axtens.net>); Wed, 21 Feb 2018 04:07:21 -0500 Received: from mail-wr0-f196.google.com ([209.85.128.196]:34728 "EHLO mail-wr0-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752409AbeBUJHR (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Wed, 21 Feb 2018 04:07:17 -0500 Received: by mail-wr0-f196.google.com with SMTP id m5so2297944wrg.1; Wed, 21 Feb 2018 01:07:17 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:subject:date:message-id:mime-version :content-transfer-encoding; bh=CFYy3bhn7eoWPF6AJMH1sMf+K6u6usD+1M2NOg/UHAw=; b=YTP6W3WDHDARqaf+T4aDY1Z5IM4p2NlgLAJEnVsmJMIqE6PIL/1gIU2w7znKP2BZ2X 7xNMoPQrYm8bHmWr5ZNG5d2ql+zQOCnFE8XKIYCUCz4rYb0umccrvDcSgrGs27/UvIw3 8Tc0WFplVgoOsHMW+k7YF+a3an50Q/gDWwjT/9h/lvzRRZ5H2t5NAaKuuikgQoGSi6Su /FQBXhY8qT6E5TeC2r6XnuyQy73YV1DsUMZ8Ehjy/6tbJhUMsveshHC8/tNh9TbxhWoK fVlGEDVNmR2MHAkDnZFkACakllfB9sE3FlkwEQ0hPDUsUKEgS5RXoNlfc03Y3W7RNFpx XXIw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:mime-version :content-transfer-encoding; bh=CFYy3bhn7eoWPF6AJMH1sMf+K6u6usD+1M2NOg/UHAw=; b=ORPK99ZlmTZ3ktjBaeVk3H3JZ/9sR4C8ohoUBdTnadoY6qa/wIhusqOXjEJo+RBS7D YZ6zbHR0iFM2ErnSnEgBUWQMgjsi3uSzTrHz+Fz0JDW02P9/noCHfXfl95yMRR5xLlS4 GeAIZKjHHYNmKK44oYrUYHfhpokK99afmJ0zIlbDjBbuyjsCgGih4s4+/n1IcGQQ91er pc9BUG/AwcAfmRkm6dryBzc3Qk4UA0EulG2iVjZmMlqO0GL1V4w0RjTx37Ym9+gAmMlq oKbLQq/ELtj252qNkRw1bxoE5PLBpfszmAv0kREKzvN7+9LNjYCac3h97xIz7AqQKdFW PiEg== X-Gm-Message-State: APf1xPDny4m9NzXZAkwJRYoao3cmAcW+tMOYiCFlZSovJ31ZZJ21t0qk lHrju+PIF9J/hpFO6iSeSeE= X-Received: by 10.28.167.208 with SMTP id q199mr1530756wme.29.1519204036496; Wed, 21 Feb 2018 01:07:16 -0800 (PST) Received: from baker.fritz.box ([2a02:908:1251:8fc0:d050:cb11:74d7:d8f6]) by smtp.gmail.com with ESMTPSA id x189sm1661875wmg.23.2018.02.21.01.07.15 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 21 Feb 2018 01:07:16 -0800 (PST) From: "=?UTF-8?q?Christian=20K=C3=B6nig?=" <ckoenig.leichtzumerken@gmail.com> X-Google-Original-From: =?UTF-8?q?Christian=20K=C3=B6nig?= <christian.koenig@amd.com> To: bhelgaas@google.com, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org Subject: [PATCH] PCI: stop crashing in pci_release_resource v2 Date: Wed, 21 Feb 2018 10:07:15 +0100 Message-Id: <20180221090715.2853-1-christian.koenig@amd.com> X-Mailer: git-send-email 2.14.1 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-TUID: AvE/m/C3rDgS > > Signed-off-by: Veronika Kabatova <vkabatov@redhat.com> > --- > patchwork/views/utils.py | 8 ++------ > 1 file changed, 2 insertions(+), 6 deletions(-) > > diff --git a/patchwork/views/utils.py b/patchwork/views/utils.py > index 84682b8..5c28308 100644 > --- a/patchwork/views/utils.py > +++ b/patchwork/views/utils.py > @@ -89,21 +89,17 @@ def _submission_to_mbox(submission): > utc_timestamp = delta.seconds + delta.days * 24 * 3600 > > mail = PatchMbox(body) > - mail['Subject'] = submission.name > mail['X-Patchwork-Submitter'] = email.utils.formataddr(( > str(Header(submission.submitter.name, mail.patch_charset)), > submission.submitter.email)) > mail['X-Patchwork-Id'] = str(submission.id) > if is_patch and submission.delegate: > mail['X-Patchwork-Delegate'] = str(submission.delegate.email) > - mail['Message-Id'] = submission.msgid > mail.set_unixfrom('From patchwork ' + submission.date.ctime()) > > - copied_headers = ['To', 'Cc', 'Date', 'From', 'List-Id'] > orig_headers = HeaderParser().parsestr(str(submission.headers)) > - for header in copied_headers: > - if header in orig_headers: > - mail[header] = orig_headers[header] > + for header in orig_headers.keys(): > + mail[header] = orig_headers[header] > > if 'Date' not in mail: > mail['Date'] = email.utils.formatdate(utc_timestamp) > -- > 2.13.6 > > _______________________________________________ > Patchwork mailing list > Patchwork@lists.ozlabs.org > https://lists.ozlabs.org/listinfo/patchwork
On Thu, 2018-04-05 at 19:58 +1000, Daniel Axtens wrote: > vkabatov@redhat.com writes: > > > From: Veronika Kabatova <vkabatov@redhat.com> > > > > Solves issue #165 (Exported mboxes should include In-Reply-To, > > References, etc headers). Instead of including only a few chosen ones, > > all received headers are added to mboxes. > > Thanks for the patch. > > I'm a little worried that this will get really messy - I've included a > snippet of headers from an email from an unrelated bug below. Maybe we > don't care - I guess this isn't really for human consumption but is for > consumption by e.g. git-am. > > Alternatively we can blacklist headers: I don't think there's anything > worth having in Received, X-*, List-*, DKIM, ARC, SPF, etc. But I wonder > if this is just whack-a-mole in reverse. > > Thoughts? I'm not really sure human consumption would be a worry for mbox files? Having the full headers - since it's sort of an email archive already - would be useful though. In particular, the change to keep the original Subject would be useful for us, as we have a script that automatically replies to the email saying "thank you, I've applied your patch" or similar. That said, using a dict for this is in general not quite right, since many header lines are valid multiple times, e.g. "Received:", but I'm not sure they're even all stored. johannes
----- Original Message ----- > From: "Daniel Axtens" <dja@axtens.net> > To: vkabatov@redhat.com, patchwork@lists.ozlabs.org > Sent: Thursday, April 5, 2018 11:58:15 AM > Subject: Re: [PATCH] Include all email headers in mboxes > > vkabatov@redhat.com writes: > > > From: Veronika Kabatova <vkabatov@redhat.com> > > > > Solves issue #165 (Exported mboxes should include In-Reply-To, > > References, etc headers). Instead of including only a few chosen ones, > > all received headers are added to mboxes. > > Thanks for the patch. > > I'm a little worried that this will get really messy - I've included a > snippet of headers from an email from an unrelated bug below. Maybe we > don't care - I guess this isn't really for human consumption but is for > consumption by e.g. git-am. > > Alternatively we can blacklist headers: I don't think there's anything > worth having in Received, X-*, List-*, DKIM, ARC, SPF, etc. But I wonder > if this is just whack-a-mole in reverse. > > Thoughts? > Hi, I've thought about ignoring eg headers starting with X-* but ultimately decided I don't want to exclude them. Maybe people use custom headers for patch apply hooks or similar craziness, or have CI / tools that download the mbox and require a specific header set (for example reply to the author and mailing list if the patch is broken unless X-ignore header is set) etc. Whatever subset we decide to include, there might be someone who needs one of the headers that wasn't added. Granted, you can extract the headers from the API itself but we can say the same about extracting the patch from there since git am eats input from stdin too. From my experience, blacklisting never works. Things evolve and new "messy" headers will be added, or people will just use a different nonstandard header name if they need custom headers and so on. As you said, I doubt anyone actually needs to read the mbox files. Everything included there is already visible in WebUI or API. Also, skipping the header section / searching the text should work if somebody really needs to consume raw mboxes without tools. Veronika > Regards, > Daniel > > > Delivered-To: dja@axtens.net > Received: by 10.74.139.216 with SMTP id p24csp390579ooj; > Wed, 21 Feb 2018 01:07:26 -0800 (PST) > X-Google-Smtp-Source: > AH8x227WmScjPDkd1bgpNsd0DYg4ae/OzHPrTT68ezy+A7p6CMpHdsmU6pB6dZluayRkf9LDxR5l > X-Received: by 10.99.55.1 with SMTP id e1mr2155932pga.237.1519204046116; > Wed, 21 Feb 2018 01:07:26 -0800 (PST) > ARC-Seal: i=1; a=rsa-sha256; t=1519204046; cv=none; > d=google.com; s=arc-20160816; > b=PXNfl2kupZqmeFoDWy2tLz6100KxBfLj/Rmu4/OTqiLwcPVNIsGBGE2FN/e/Gj2LCu > NDdGSTuaDdYxI0wlCEqBZBrVatoda/PHBX2Fh2HZ2rzEahAs5R2w+YucxRTAi8UT7MTo > xxxHqQ1DehoZmG9J3l0Ckh1cXmAplnmLB7+MufnD7FT7ajNufPhGiZ2ovDCcqb0/6KxG > 8cfsjAnNAZAF3M8VF8r2KSubGuplPq4sB7aF9eklL0HmnXNVLYkyiwZbJuI/vVh6yBkc > rKSdP02SlazOIL8FJuJQC5reac17qfbyjg/tbxyCC+/Lmc/DwexMfA4NPhEgqPZgwFWr > 9ILQ== > ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; > s=arc-20160816; > h=list-id:precedence:sender:content-transfer-encoding:mime-version > :message-id:date:subject:to:from:dkim-signature > :arc-authentication-results; > bh=CFYy3bhn7eoWPF6AJMH1sMf+K6u6usD+1M2NOg/UHAw=; > b=VtdyXugEOzZm0I0t+YzlUoOWX+xI4UC50ZF9w5gd+zlNTsJMDGtykiwjZujU4FzMiP > CdCwQqdlTuB5aY99HtOyBl1jjGz2zo1bGxRwr1aGFE+pbyUN5FFlTntYxZ0AD5wbDP3S > 2hn6mDxzugFSJ9+LbDaa8lS4g6RENzaxHTPt7B6paAkbSFSZsBtY9oASno7RvI3ctlfz > LCb1gAnhvL5p2H5vpNeIOr9eTQIOqhh7+kmmq7tMky2/aB95az+AaU/dKJQoUTsIY7qT > gOTZAdHc4+fZHD2CeP9bX8MTUXy1Yc3fNy7ETbYqoe0+Gl2ke8m9nw84rHTmidoygKrH > CwoQ== > ARC-Authentication-Results: i=1; mx.google.com; > dkim=pass header.i=@gmail.com header.s=20161025 header.b=YTP6W3WD; > spf=pass (google.com: best guess record for domain of > linux-kernel-owner@vger.kernel.org designates 209.132.180 > .67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; > dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com > Return-Path: <linux-kernel-owner@vger.kernel.org> > Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) > by mx.google.com with ESMTP id s9si2120330pgn.281.2018.02.21.01.07.26 > for <dja@axtens.net>; > Wed, 21 Feb 2018 01:07:26 -0800 (PST) > Received-SPF: pass (google.com: best guess record for domain of > linux-kernel-owner@vger.kernel.org designates 209.132. > 180.67 as permitted sender) client-ip=209.132.180.67; > Authentication-Results: mx.google.com; > dkim=pass header.i=@gmail.com header.s=20161025 header.b=YTP6W3WD; > spf=pass (google.com: best guess record for domain of > linux-kernel-owner@vger.kernel.org designates 209.132.180 > .67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; > dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com > Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand > id S1753028AbeBUJHV (ORCPT <rfc822;dja@axtens.net>); > Wed, 21 Feb 2018 04:07:21 -0500 > Received: from mail-wr0-f196.google.com ([209.85.128.196]:34728 "EHLO > mail-wr0-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org > with ESMTP id S1752409AbeBUJHR (ORCPT > <rfc822;linux-kernel@vger.kernel.org>); > Wed, 21 Feb 2018 04:07:17 -0500 > Received: by mail-wr0-f196.google.com with SMTP id m5so2297944wrg.1; > Wed, 21 Feb 2018 01:07:17 -0800 (PST) > DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; > d=gmail.com; s=20161025; > h=from:to:subject:date:message-id:mime-version > :content-transfer-encoding; > bh=CFYy3bhn7eoWPF6AJMH1sMf+K6u6usD+1M2NOg/UHAw=; > b=YTP6W3WDHDARqaf+T4aDY1Z5IM4p2NlgLAJEnVsmJMIqE6PIL/1gIU2w7znKP2BZ2X > 7xNMoPQrYm8bHmWr5ZNG5d2ql+zQOCnFE8XKIYCUCz4rYb0umccrvDcSgrGs27/UvIw3 > 8Tc0WFplVgoOsHMW+k7YF+a3an50Q/gDWwjT/9h/lvzRRZ5H2t5NAaKuuikgQoGSi6Su > /FQBXhY8qT6E5TeC2r6XnuyQy73YV1DsUMZ8Ehjy/6tbJhUMsveshHC8/tNh9TbxhWoK > fVlGEDVNmR2MHAkDnZFkACakllfB9sE3FlkwEQ0hPDUsUKEgS5RXoNlfc03Y3W7RNFpx > XXIw== > X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; > d=1e100.net; s=20161025; > h=x-gm-message-state:from:to:subject:date:message-id:mime-version > :content-transfer-encoding; > bh=CFYy3bhn7eoWPF6AJMH1sMf+K6u6usD+1M2NOg/UHAw=; > b=ORPK99ZlmTZ3ktjBaeVk3H3JZ/9sR4C8ohoUBdTnadoY6qa/wIhusqOXjEJo+RBS7D > YZ6zbHR0iFM2ErnSnEgBUWQMgjsi3uSzTrHz+Fz0JDW02P9/noCHfXfl95yMRR5xLlS4 > GeAIZKjHHYNmKK44oYrUYHfhpokK99afmJ0zIlbDjBbuyjsCgGih4s4+/n1IcGQQ91er > pc9BUG/AwcAfmRkm6dryBzc3Qk4UA0EulG2iVjZmMlqO0GL1V4w0RjTx37Ym9+gAmMlq > oKbLQq/ELtj252qNkRw1bxoE5PLBpfszmAv0kREKzvN7+9LNjYCac3h97xIz7AqQKdFW > PiEg== > X-Gm-Message-State: APf1xPDny4m9NzXZAkwJRYoao3cmAcW+tMOYiCFlZSovJ31ZZJ21t0qk > lHrju+PIF9J/hpFO6iSeSeE= > X-Received: by 10.28.167.208 with SMTP id q199mr1530756wme.29.1519204036496; > Wed, 21 Feb 2018 01:07:16 -0800 (PST) > Received: from baker.fritz.box ([2a02:908:1251:8fc0:d050:cb11:74d7:d8f6]) > by smtp.gmail.com with ESMTPSA id > x189sm1661875wmg.23.2018.02.21.01.07.15 > (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); > Wed, 21 Feb 2018 01:07:16 -0800 (PST) > From: "=?UTF-8?q?Christian=20K=C3=B6nig?=" > <ckoenig.leichtzumerken@gmail.com> > X-Google-Original-From: =?UTF-8?q?Christian=20K=C3=B6nig?= > <christian.koenig@amd.com> > To: bhelgaas@google.com, linux-kernel@vger.kernel.org, > linux-pci@vger.kernel.org > Subject: [PATCH] PCI: stop crashing in pci_release_resource v2 > Date: Wed, 21 Feb 2018 10:07:15 +0100 > Message-Id: <20180221090715.2853-1-christian.koenig@amd.com> > X-Mailer: git-send-email 2.14.1 > MIME-Version: 1.0 > Content-Type: text/plain; charset=UTF-8 > Content-Transfer-Encoding: 8bit > Sender: linux-kernel-owner@vger.kernel.org > Precedence: bulk > List-ID: <linux-kernel.vger.kernel.org> > X-Mailing-List: linux-kernel@vger.kernel.org > X-TUID: AvE/m/C3rDgS > > > > > Signed-off-by: Veronika Kabatova <vkabatov@redhat.com> > > --- > > patchwork/views/utils.py | 8 ++------ > > 1 file changed, 2 insertions(+), 6 deletions(-) > > > > diff --git a/patchwork/views/utils.py b/patchwork/views/utils.py > > index 84682b8..5c28308 100644 > > --- a/patchwork/views/utils.py > > +++ b/patchwork/views/utils.py > > @@ -89,21 +89,17 @@ def _submission_to_mbox(submission): > > utc_timestamp = delta.seconds + delta.days * 24 * 3600 > > > > mail = PatchMbox(body) > > - mail['Subject'] = submission.name > > mail['X-Patchwork-Submitter'] = email.utils.formataddr(( > > str(Header(submission.submitter.name, mail.patch_charset)), > > submission.submitter.email)) > > mail['X-Patchwork-Id'] = str(submission.id) > > if is_patch and submission.delegate: > > mail['X-Patchwork-Delegate'] = str(submission.delegate.email) > > - mail['Message-Id'] = submission.msgid > > mail.set_unixfrom('From patchwork ' + submission.date.ctime()) > > > > - copied_headers = ['To', 'Cc', 'Date', 'From', 'List-Id'] > > orig_headers = HeaderParser().parsestr(str(submission.headers)) > > - for header in copied_headers: > > - if header in orig_headers: > > - mail[header] = orig_headers[header] > > + for header in orig_headers.keys(): > > + mail[header] = orig_headers[header] > > > > if 'Date' not in mail: > > mail['Date'] = email.utils.formatdate(utc_timestamp) > > -- > > 2.13.6 > > > > _______________________________________________ > > Patchwork mailing list > > Patchwork@lists.ozlabs.org > > https://lists.ozlabs.org/listinfo/patchwork >
Johannes Berg <johannes@sipsolutions.net> writes: > On Thu, 2018-04-05 at 19:58 +1000, Daniel Axtens wrote: >> vkabatov@redhat.com writes: >> >> > From: Veronika Kabatova <vkabatov@redhat.com> >> > >> > Solves issue #165 (Exported mboxes should include In-Reply-To, >> > References, etc headers). Instead of including only a few chosen ones, >> > all received headers are added to mboxes. >> >> Thanks for the patch. >> >> I'm a little worried that this will get really messy - I've included a >> snippet of headers from an email from an unrelated bug below. Maybe we >> don't care - I guess this isn't really for human consumption but is for >> consumption by e.g. git-am. >> >> Alternatively we can blacklist headers: I don't think there's anything >> worth having in Received, X-*, List-*, DKIM, ARC, SPF, etc. But I wonder >> if this is just whack-a-mole in reverse. >> >> Thoughts? > > I'm not really sure human consumption would be a worry for mbox files? > Having the full headers - since it's sort of an email archive already - > would be useful though. > > In particular, the change to keep the original Subject would be useful > for us, as we have a script that automatically replies to the email > saying "thank you, I've applied your patch" or similar. Hmm, I think the hasher.py and patchwork-update-commits scripts were designed to facilitate this sort of thing in a slightly more robust way. But I've never really felt like I understood it, and I don't recall any docs. I hope that the change to the subject mangling doesn't break anything else, but I can't imagine it would. [0] > That said, using a dict for this is in general not quite right, since > many header lines are valid multiple times, e.g. "Received:", but I'm > not sure they're even all stored. Right, you and Veronika have convinced me. Clearly as a developer of patchwork my view is a bit skewed - I have to look at them fairly frequently when people report parsing bugs so I forget that other people don't do this. Veronika - can you check on repeated headers issue that Johannes raises? Thanks all. Regards, Daniel [0] Obligatory xkcd - https://xkcd.com/1172/ > > johannes
----- Original Message ----- > From: "Daniel Axtens" <dja@axtens.net> > To: "Johannes Berg" <johannes@sipsolutions.net>, vkabatov@redhat.com, patchwork@lists.ozlabs.org > Sent: Thursday, April 5, 2018 3:47:03 PM > Subject: Re: [PATCH] Include all email headers in mboxes > > Johannes Berg <johannes@sipsolutions.net> writes: > > > On Thu, 2018-04-05 at 19:58 +1000, Daniel Axtens wrote: > >> vkabatov@redhat.com writes: > >> > >> > From: Veronika Kabatova <vkabatov@redhat.com> > >> > > >> > Solves issue #165 (Exported mboxes should include In-Reply-To, > >> > References, etc headers). Instead of including only a few chosen ones, > >> > all received headers are added to mboxes. > >> > >> Thanks for the patch. > >> > >> I'm a little worried that this will get really messy - I've included a > >> snippet of headers from an email from an unrelated bug below. Maybe we > >> don't care - I guess this isn't really for human consumption but is for > >> consumption by e.g. git-am. > >> > >> Alternatively we can blacklist headers: I don't think there's anything > >> worth having in Received, X-*, List-*, DKIM, ARC, SPF, etc. But I wonder > >> if this is just whack-a-mole in reverse. > >> > >> Thoughts? > > > > I'm not really sure human consumption would be a worry for mbox files? > > Having the full headers - since it's sort of an email archive already - > > would be useful though. > > > > In particular, the change to keep the original Subject would be useful > > for us, as we have a script that automatically replies to the email > > saying "thank you, I've applied your patch" or similar. > > Hmm, I think the hasher.py and patchwork-update-commits scripts were > designed to facilitate this sort of thing in a slightly more robust > way. But I've never really felt like I understood it, and I don't recall > any docs. > > I hope that the change to the subject mangling doesn't break anything > else, but I can't imagine it would. [0] > > > That said, using a dict for this is in general not quite right, since > > many header lines are valid multiple times, e.g. "Received:", but I'm > > not sure they're even all stored. > > Right, you and Veronika have convinced me. Clearly as a developer of > patchwork my view is a bit skewed - I have to look at them fairly > frequently when people report parsing bugs so I forget that other people > don't do this. > > Veronika - can you check on repeated headers issue that Johannes raises? > Right, this won't be too friendly in case of duplicate header keys. We do store all headers with duplicate keys so I can easily change that and post updated patch shortly. However, since this was brought up, the same situation is present in the API -- only one of the duplicate headers is shown. Based on my testing, the code on our side does return correct values so this should be a problem with the REST framework or something similar. I might take a look at it later but can't promise I'll be able to fix it easily. Thanks, Veronika > Thanks all. > > Regards, > Daniel > > [0] Obligatory xkcd - https://xkcd.com/1172/ > > > > johannes >
>> > That said, using a dict for this is in general not quite right, since >> > many header lines are valid multiple times, e.g. "Received:", but I'm >> > not sure they're even all stored. >> >> Right, you and Veronika have convinced me. Clearly as a developer of >> patchwork my view is a bit skewed - I have to look at them fairly >> frequently when people report parsing bugs so I forget that other people >> don't do this. >> >> Veronika - can you check on repeated headers issue that Johannes raises? >> > > Right, this won't be too friendly in case of duplicate header keys. We do > store all headers with duplicate keys so I can easily change that and post > updated patch shortly. > > However, since this was brought up, the same situation is present in > the API -- only one of the duplicate headers is shown. Based on my testing, > the code on our side does return correct values so this should be a problem > with the REST framework or something similar. I might take a look at it > later but can't promise I'll be able to fix it easily. Thanks! Fixing one problem at a time is fine - feel free to leave the API for later. I think it'll need a special serializer so we may need Stephen's expertise. Regards, Daniel
diff --git a/patchwork/views/utils.py b/patchwork/views/utils.py index 84682b8..5c28308 100644 --- a/patchwork/views/utils.py +++ b/patchwork/views/utils.py @@ -89,21 +89,17 @@ def _submission_to_mbox(submission): utc_timestamp = delta.seconds + delta.days * 24 * 3600 mail = PatchMbox(body) - mail['Subject'] = submission.name mail['X-Patchwork-Submitter'] = email.utils.formataddr(( str(Header(submission.submitter.name, mail.patch_charset)), submission.submitter.email)) mail['X-Patchwork-Id'] = str(submission.id) if is_patch and submission.delegate: mail['X-Patchwork-Delegate'] = str(submission.delegate.email) - mail['Message-Id'] = submission.msgid mail.set_unixfrom('From patchwork ' + submission.date.ctime()) - copied_headers = ['To', 'Cc', 'Date', 'From', 'List-Id'] orig_headers = HeaderParser().parsestr(str(submission.headers)) - for header in copied_headers: - if header in orig_headers: - mail[header] = orig_headers[header] + for header in orig_headers.keys(): + mail[header] = orig_headers[header] if 'Date' not in mail: mail['Date'] = email.utils.formatdate(utc_timestamp)