Message ID | 1297719787.10411.46.camel@feioso |
---|---|
State | RFC |
Headers | show |
Hi Guilherme, > We're going to use Patchwork to track Linaro patches, but instead of > subscribing to all mailing lists to which patches may be sent, we're > asking all Linaro developers to CC a common email address whenever they > send a patch upstream, and we'll then feed all mail delivered to that > address into Patchwork. I don't see the reasoning for doing this - it's more work (once-off setup vs getting every contributor to do stuff repeatedly) and less reliable (contributors may forget to add the magic address, or not know about this requirement). If there's sound justification for doing this though, I'm not averse to the change, but it needs to be sensible :) Cheers, Jeremy
Hi Jeremy, On Tue, 2011-02-15 at 08:42 +0800, Jeremy Kerr wrote: > Hi Guilherme, > > > We're going to use Patchwork to track Linaro patches, but instead of > > subscribing to all mailing lists to which patches may be sent, we're > > asking all Linaro developers to CC a common email address whenever they > > send a patch upstream, and we'll then feed all mail delivered to that > > address into Patchwork. > > I don't see the reasoning for doing this - it's more work (once-off setup vs > getting every contributor to do stuff repeatedly) and less reliable > (contributors may forget to add the magic address, or not know about this > requirement). Maybe I wasn't clear, but this is not to track patches sent to Linaro; what we want is to track patches sent upstream by engineers working for Linaro. It already is the company's policy that all patches sent upstream should be CCed to this magic address, and feeding that address' mailbox to patchwork.linaro.org seemed like a better alternative than continuously having to subscribe patchwork.l.o to the mailing lists of all different projects that Linaro engineers may contribute to (we don't have a list of all these projects beforehand and we expect it to grow all the time). Cheers,
Hi Guilherme, > all different projects that Linaro engineers may contribute to (we don't > have a list of all these projects beforehand and we expect it to grow > all the time). You'll still have the same problem here; patchwork needs to know about the project before it will parse patches for that project. My only concern is the impresicion of using the list address; it'd be common to send a patch to mulitple lists. If patchwork is configured for both lists, we'll end up with a dropped patch on the non-matched list. We could work-around this with handling multiple 'parses' per incoming email, but we'd need to be careful about avoiding duplicates here. Also, if this was configurable (PATCHWORK_FALLBACK_TO_LISTEMAIL perhaps?), I'd be much happier :) Cheers, Jeremy
On Tue, 2011-03-08 at 13:58 +0800, Jeremy Kerr wrote: > Hi Guilherme, > > > all different projects that Linaro engineers may contribute to (we don't > > have a list of all these projects beforehand and we expect it to grow > > all the time). > > You'll still have the same problem here; patchwork needs to know about the > project before it will parse patches for that project. You're right, but we have a custom email-parsing script which places patches sent to unknown mailing lists under a catch-all project. That allows us to identify new projects that need to be created, and together with a new form for moving multiple patches from one project to another we can also easily move the patches from the catch-all project to the newly created ones. I haven't sent the patches for those changes because I thought they wouldn't be of much use in general, but I'm happy to send them if you think they could be useful to others. > > My only concern is the impresicion of using the list address; it'd be common > to send a patch to mulitple lists. If patchwork is configured for both lists, > we'll end up with a dropped patch on the non-matched list. > > We could work-around this with handling multiple 'parses' per incoming email, > but we'd need to be careful about avoiding duplicates here. We could also make find_project() return a list of projects and create one Patch for every project returned. > > Also, if this was configurable (PATCHWORK_FALLBACK_TO_LISTEMAIL perhaps?), I'd > be much happier :) Sure, I'm happy to make it configurable. Cheers,
diff --git a/apps/patchwork/bin/parsemail.py b/apps/patchwork/bin/parsemail.py index 700cb6f..305ab95 100755 --- a/apps/patchwork/bin/parsemail.py +++ b/apps/patchwork/bin/parsemail.py @@ -55,7 +55,7 @@ def clean_header(header): return normalise_space(u' '.join(fragments)) -def find_project(mail): +def find_project_by_listid(mail): project = None listid_res = [re.compile('.*<([^>]+)>.*', re.S), re.compile('^([\S]+)$', re.S)] @@ -81,6 +81,30 @@ def find_project(mail): return project +def extract_email_addresses(str): + email_re = re.compile( + r"([_\.0-9a-zA-Z-+=]+@(([0-9a-zA-Z-]{1,}\.)*)[a-zA-Z]{2,})") + # re.findall() will return a list of tuples because we have multiple + # groups on the regex above, but we're only interested on the outermost + # group (which should contain the whole email address), so we drop the + # second and third groups. + return [email for email, dummy, dummy2 in email_re.findall(str)] + +def find_project_by_list_address(mail): + recipients = mail.get('To', '') + mail.get('CC', '') + for email_address in extract_email_addresses(recipients): + try: + return Project.objects.get(listemail = email_address) + except Project.DoesNotExist: + pass + return None + +def find_project(mail): + project = find_project_by_listid(mail) + if project is None: + project = find_project_by_list_address(mail) + return project + def find_author(mail): from_header = clean_header(mail.get('From')) diff --git a/apps/patchwork/tests/patchparser.py b/apps/patchwork/tests/patchparser.py index ff0025a..d4d15d1 100644 --- a/apps/patchwork/tests/patchparser.py +++ b/apps/patchwork/tests/patchparser.py @@ -34,8 +34,9 @@ class PatchTest(unittest.TestCase): default_subject = defaults.subject project = defaults.project -from patchwork.bin.parsemail import find_content, find_author, find_project, \ - parse_mail +from patchwork.bin.parsemail import ( + extract_email_addresses, find_content, find_author, find_project, + parse_mail) class InlinePatchTest(PatchTest): patch_filename = '0001-add-line.patch' @@ -276,18 +277,44 @@ class MultipleProjectPatchCommentTest(MultipleProjectPatchTest): # and the one we parsed in setUp() self.assertEquals(Comment.objects.filter(patch = patch).count(), 2) -class ListIdHeaderTest(unittest.TestCase): - """ Test that we parse List-Id headers from mails correctly """ +class EmailProjectGuessing(unittest.TestCase): + """Projects are guessed based on List-Id headers or recipient addresses""" def setUp(self): self.project = Project(linkname = 'test-project-1', name = 'Project 1', listid = '1.example.com', listemail='1@example.com') self.project.save() + def testExtractingEmailAddressesFromRecipientsList(self): + emails = extract_email_addresses( + '"Foo Bar" <foo.bar@example.com>,' + '<baz+list@foo.example.com>,' + 'bar-foo@bar.foo.example.com,' + # Notice that this one is not a valid email address. + 'bar-foo@.com') + self.assertEqual( + ['foo.bar@example.com', + 'baz+list@foo.example.com', + 'bar-foo@bar.foo.example.com'], + emails) + def testNoListId(self): email = MIMEText('') project = find_project(email) self.assertEquals(project, None) + def testNoListIdWithListEmailAsRecipient(self): + email = MIMEText('') + email['To'] = '"First dev list" <1@example.com>' + project = find_project(email) + self.assertEquals(self.project, project) + + def testNoListIdWithListEmailAsCC(self): + email = MIMEText('') + email['CC'] = ('"First maintainer <maintainer@example.com>, ' + '"First dev list" <1@example.com>') + project = find_project(email) + self.assertEquals(self.project, project) + def testBlankListId(self): email = MIMEText('') email['List-Id'] = ''