From patchwork Tue Aug 24 15:28:11 2010
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Subject: handle octet-stream attachments that contain patches
From: Patrick Georgi
X-Patchwork-Id: 62650
Message-Id: <4C73E50B.4000707@georgi-clan.de>
To: patchwork@lists.ozlabs.org
Date: Tue, 24 Aug 2010 17:28:11 +0200

Hi,

The attached patch adds handling for octet-stream typed patches, as Google
Mail sends them. It relies on parse_patch to determine whether the
octet-stream part is actually a patch. I didn't check how robust that is
against abuse (e.g. binary data), so it might not be the best approach.

Signed-off-by: Patrick Georgi

The other issue I'd like to fix (or, preferably, see fixed :-) ) is that
replies that dissect a patch by commenting inline sometimes end up as new
(and invalid) patches in the patchwork queue. My approach would be to make
parse_patch somewhat stricter about the patch format. I haven't looked at it
yet, but that should fix it. Is it a design goal to keep reformatted patches
(e.g. with changed line breaks) alive? That would break said approach.

Regards,
Patrick Georgi

diff --git a/apps/patchwork/bin/parsemail.py b/apps/patchwork/bin/parsemail.py
index 68bd94c..190b312 100755
--- a/apps/patchwork/bin/parsemail.py
+++ b/apps/patchwork/bin/parsemail.py
@@ -140,31 +140,44 @@ def find_content(project, mail):
     commentbuf = ''
 
     for part in mail.walk():
-        if part.get_content_maintype() != 'text':
-            continue
+        if part.get_content_maintype() == 'text':
+            payload = part.get_payload(decode=True)
+            charset = part.get_content_charset()
+            subtype = part.get_content_subtype()
 
-        payload = part.get_payload(decode=True)
-        charset = part.get_content_charset()
-        subtype = part.get_content_subtype()
+            # if we don't have a charset, assume utf-8
+            if charset is None:
+                charset = 'utf-8'
 
-        # if we don't have a charset, assume utf-8
-        if charset is None:
-            charset = 'utf-8'
+            if not isinstance(payload, unicode):
+                payload = unicode(payload, charset)
 
-        if not isinstance(payload, unicode):
-            payload = unicode(payload, charset)
+            if subtype in ['x-patch', 'x-diff']:
+                patchbuf = payload
 
-        if subtype in ['x-patch', 'x-diff']:
-            patchbuf = payload
+            elif subtype == 'plain':
+                if not patchbuf:
+                    (patchbuf, c) = parse_patch(payload)
+                else:
+                    c = payload
 
-        elif subtype == 'plain':
-            if not patchbuf:
-                (patchbuf, c) = parse_patch(payload)
-            else:
-                c = payload
+                if c is not None:
+                    commentbuf += c.strip() + '\n'
 
-            if c is not None:
-                commentbuf += c.strip() + '\n'
+        elif part.get_content_maintype() == 'application' and part.get_content_subtype() == 'octet-stream':
+            payload = part.get_payload(decode=True)
+            charset = part.get_content_charset()
+            # if we don't have a charset, assume utf-8
+            if charset is None:
+                charset = 'utf-8'
+
+            if not isinstance(payload, unicode):
+                payload = unicode(payload, charset)
+
+            (patchbuf, c) = parse_patch(payload)
+
+            if c is not None:
+                commentbuf += c.strip() + '\n'
 
     patch = None
     comment = None
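
The message leaves open how robust the octet-stream path is against binary
data. One possible guard is to check the raw payload before it is handed to
parse_patch. The following is a minimal sketch of that idea, not part of the
submitted patch: it is written for Python 3 (the diff above targets Python 2),
the helper names decode_if_text and octet_stream_texts are hypothetical, and
treating a NUL byte as a sign of binary content is an assumed heuristic.

```python
def decode_if_text(payload, charset=None):
    """Return payload decoded to str, or None if it looks like binary data.

    Hypothetical helper: `payload` is the bytes object returned by
    part.get_payload(decode=True), `charset` the declared charset (or None).
    """
    if payload is None or b'\x00' in payload:
        # Assumed heuristic: NUL bytes do not occur in a textual patch.
        return None
    try:
        # Same fallback as the patch: assume utf-8 when no charset is given.
        return payload.decode(charset or 'utf-8')
    except (UnicodeDecodeError, LookupError):
        # Undecodable bytes or an unknown charset name: treat as non-text.
        return None


def octet_stream_texts(mail):
    """Yield decoded text for every application/octet-stream part of an
    email.message.Message that passes the check in decode_if_text."""
    for part in mail.walk():
        if part.get_content_type() != 'application/octet-stream':
            continue
        text = decode_if_text(part.get_payload(decode=True),
                              part.get_content_charset())
        if text is not None:
            yield text
```

Each string yielded by octet_stream_texts could then be passed to parse_patch
in place of the unconditional unicode() conversion, so that attachments which
fail to decode are skipped instead of raising an exception.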