[0/9] Tools and fixes for parallel parsing

Message ID 20180221141716.10908-1-dja@axtens.net

Message

Daniel Axtens Feb. 21, 2018, 2:17 p.m. UTC
Thomas Petazzoni reported that Patchwork would occasionally lose
Buildroot email. Andrew - having talked to jk and sfr - suggested that
this might be race-condition related.

I investigated and found some bugs. To do that, I first had to develop
some tools, and along the way I found other, unrelated bugs too.

Patches 1-4 are tooling: ways to parse messages in parallel and to
collect and compare the output. (Patch 1 fixes an issue I found when
running the tool from patch 2.)
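
Splitting an mbox N ways can be as simple as round-robin distribution;
here is a minimal sketch of that approach (illustrative only - the
actual split_mail.py from patch 3 may work differently):

    # Sketch only: round-robin split of an mbox into N parts.  The real
    # tools/scripts/split_mail.py differs in interface and edge cases.
    import mailbox
    import sys

    def split_mbox(path, n):
        # Distribute messages across n new mbox files, round-robin.
        src = mailbox.mbox(path)
        parts = [mailbox.mbox('%s.%d' % (path, i)) for i in range(n)]
        for i, msg in enumerate(src):
            parts[i % n].add(msg)
        for part in parts:
            part.flush()
            part.close()
        src.close()

    if __name__ == '__main__':
        split_mbox(sys.argv[1], int(sys.argv[2]))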

Patch 5 is an unrelated fix that came up along the way and
demonstrates that humans remain the best fuzzers, and that Python's
email module is still adorably* quirky.

Patch 6 fixes a bug that came up very quickly in testing but is
unlikely to be the actual bug Buildroot is hitting, as it can only
occur the first time an email address is seen.
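
For context, the race has the classic check-then-create shape. A
Django-flavoured sketch of the problem and the usual fix (Person here
is a stand-in, not the exact Patchwork code):

    # Sketch: two parsers race between the existence check and the
    # INSERT; catching IntegrityError and re-fetching resolves it.
    # `Person` stands in for the real Patchwork model.
    from django.db import IntegrityError

    def get_or_create_person(email, name):
        try:
            return Person.objects.get(email=email)
        except Person.DoesNotExist:
            pass
        try:
            person = Person(email=email, name=name)
            person.save()
        except IntegrityError:
            # Lost the race: the row exists now, so fetch it instead.
            person = Person.objects.get(email=email)
        return person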

Patch 7 is a related tidy-up/optimisation.
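
The idea is simply to skip the write when nothing has changed, roughly
along these lines (again a sketch, not the exact code):

    # Sketch: only issue an UPDATE when the stored name actually
    # changes, instead of saving unconditionally on every mail.
    def maybe_update_person(person, name):
        if name and person.name != name:
            person.name = name
            person.save(update_fields=['name'])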

Patch 8 fixes up a MySQL-only bug, but also adds some robustness.

I think patch 9 closes the most likely issue for Buildroot patches.
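
Roughly, the failure mode is a bare get() blowing up with
MultipleObjectsReturned once racing parsers have left duplicate rows
behind. A sketch of the defensive pattern (model name illustrative):

    # Sketch: any one of the duplicate rows identifies the series, so
    # fall back to the first match rather than dropping the mail.
    def find_series_reference(msgid):
        try:
            return SeriesReference.objects.get(msgid=msgid)
        except SeriesReference.DoesNotExist:
            return None
        except SeriesReference.MultipleObjectsReturned:
            return SeriesReference.objects.filter(msgid=msgid).first()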

Pending review, patches 5, 6, 8 and 9 should go to stable.

Regards,
Daniel

Cc: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>

Daniel Axtens (9):
  tools/docker: assume terminal supports utf-8
  debugging: add command to dump patches and series
  tools/scripts: split a mbox N ways
  tools/scripts: parallel_parsearchive - load archives in parallel
  parser: Handle even more exotically broken headers
  parser: close a TOCTTOU bug on Person creation
  parser: avoid an unnecessary UPDATE of Person
  parser: use Patch.objects.create instead of save()
  parser: don't fail on multiple SeriesReferences

 patchwork/management/commands/debug_dump.py | 46 +++++++++++++++
 patchwork/parser.py                         | 92 +++++++++++++++++++----------
 patchwork/tests/fuzztests/x-face.mbox       | 58 ++++++++++++++++++
 patchwork/tests/test_parser.py              | 44 +++++++-------
 tools/docker/Dockerfile                     |  1 +
 tools/scripts/parallel_parsearchive.sh      | 55 +++++++++++++++++
 tools/scripts/split_mail.py                 | 76 ++++++++++++++++++++++++
 7 files changed, 317 insertions(+), 55 deletions(-)
 create mode 100644 patchwork/management/commands/debug_dump.py
 create mode 100644 patchwork/tests/fuzztests/x-face.mbox
 create mode 100755 tools/scripts/parallel_parsearchive.sh
 create mode 100755 tools/scripts/split_mail.py

Comments

Thomas Petazzoni Feb. 21, 2018, 2:29 p.m. UTC | #1
Hello Daniel,

On Thu, 22 Feb 2018 01:17:07 +1100, Daniel Axtens wrote:
> Thomas Petazzoni reported that Patchwork would occasionally lose
> Buildroot email. Andrew - having talked to jk and sfr - suggested that
> this may be race-condition related.
> 
> I investigated and found some bugs. I first had to develop some tools.
> Along the way I found other unrelated bugs too.
> 
> Patches 1-4 are tooling - ways to do parallel parsing of messages and
> get and compare the output. (Patch 1 fixes an issue I found when
> running the tool from patch 2)
> 
> Patch 5 is an unrelated fix that came up along the way and
> demonstrates that humans remain the best fuzzers, and that Python's
> email module is still adorably* quirky.
> 
> Patch 6 is a bug that came up very quickly in testing but is unlikely
> to be the actual bug Buildroot is hitting, as it can only occur the
> first time an email address is seen.
> 
> Patch 7 is a related tidy-up/optimisation.
> 
> Patch 8 fixes up a MySQL-only bug, but also adds some robustness.
> 
> I think patch 9 closes the most likely issue for Buildroot patches.
> 
> Pending review, patches 5, 6, 8 and 9 should go to stable.

Thanks a lot for your work on this issue, much appreciated!

Unfortunately, I have no idea how we could easily test this, since
we're using the official ozlabs.org instance. Do you think it would be
possible to have a separate testing instance set up, subscribed to the
Buildroot mailing list as well, just to check whether the problem is
fixed? Of course, we wouldn't use this testing instance to update the
status of patches, but we would at least be able to verify that no
patches are lost.

What do you think?

Best regards,

Thomas
Andrew Donnellan Feb. 22, 2018, 1:17 a.m. UTC | #2
On 22/02/18 01:29, Thomas Petazzoni wrote:
> Unfortunately, I have no idea how we could easily test this, since
> we're using the official ozlabs.org instance. Do you think it would be
> possible to have a separate testing instance set up, subscribed to the
> Buildroot mailing list as well, just to check whether the problem is
> fixed? Of course, we wouldn't use this testing instance to update the
> status of patches, but we would at least be able to verify that no
> patches are lost.
> 
> What do you think?

dja used to run a testing instance but VMs ain't free :)

We'll probably just backport this to ozlabs.org after this gets a bit of 
review.
Daniel Axtens Feb. 22, 2018, 1:49 a.m. UTC | #3
Andrew Donnellan <andrew.donnellan@au1.ibm.com> writes:

> On 22/02/18 01:29, Thomas Petazzoni wrote:
>> Unfortunately, I have no idea how we could easily test this, since
>> we're using the official ozlabs.org instance. Do you think it would be
>> possible to have a separate testing instance set up, subscribed to the
>> Buildroot mailing list as well, just to check whether the problem is
>> fixed? Of course, we wouldn't use this testing instance to update the
>> status of patches, but we would at least be able to verify that no
>> patches are lost.
>> 
>> What do you think?
>
> dja used to run a testing instance but VMs ain't free :)

I thought about spinning up a VM, but I'm not sure if it would be
sufficiently equivalent to the OzLabs setup to actually hit the bug...

I think hitting the issue is due to a combination of OzLabs and
Buildroot features:
 - how quickly the buildroot mailing list sends mail
 - how quickly ozlabs receives it - both in terms of mail system lag and
   just in terms of network locality
 - how many cpus ozlabs has to parallelise processing, etc
I don't think I can get an OzLabs-like virtual machine for very long
without spending a *lot* of money.
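
If anyone does want to poke at this locally, the rough recipe is just
to split an archive and feed the parts to concurrent parser processes,
along these lines (a sketch assuming Patchwork's parsearchive
management command; the real parallel_parsearchive.sh from patch 4
differs):

    # Sketch: run several parsers at once to widen the race window.
    import subprocess
    from multiprocessing import Pool

    def parse_part(part):
        # Each worker feeds one pre-split mbox part to the parser.
        subprocess.check_call(['python', 'manage.py', 'parsearchive', part])

    if __name__ == '__main__':
        parts = ['archive.mbox.%d' % i for i in range(4)]
        with Pool(processes=len(parts)) as pool:
            pool.map(parse_part, parts)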
 
> We'll probably just backport this to ozlabs.org after this gets a bit of 
> review.

I think this is probably the way to go, but it's worth noting that,
as always, I'm not at OzLabs any more :)

Regards,
Daniel

Andrew Donnellan Feb. 22, 2018, 1:52 a.m. UTC | #4
On 22/02/18 12:49, Daniel Axtens wrote:
>> dja used to run a testing instance but VMs ain't free :)
> 
> I thought about spinning up a VM, but I'm not sure if it would be
> sufficiently equivalent to the OzLabs set up to actually hit the bug...
> 
> I think hitting the issue is due to a combination of OzLabs and
> Buildroot features:
>   - how quickly the buildroot mailing list sends mail
>   - how quickly ozlabs receives it - both in terms of mail system lag and
>     just in terms of network locality
>   - how many cpus ozlabs has to parallelise processing, etc
> I don't think I can get an OzLabs like virtual machine for very long
> without spending a *lot* of money.

ACK - given that it's a rare enough issue even on the massive
ozlabs.org instance, it'd be very hard to replicate.

> 
>> We'll probably just backport this to ozlabs.org after this gets a bit of
>> review.
> 
> I think this is probably the way to go, but it's worth noting that as
> always I'm not at OzLabs any more :)

Incorrect - you're no longer at *IBM* ;)
Georg Faerber Feb. 22, 2018, 3:10 a.m. UTC | #5
Hi,

On 18-02-22 12:49:54, Daniel Axtens wrote:
> I don't think I can get an OzLabs like virtual machine for very long
> without spending a *lot* of money.

Could you or someone else elaborate on that? Which kind of VM is it, in
terms of memory, cpu, storage, etc?

Cheers,
Georg
Daniel Axtens Feb. 22, 2018, 3:35 a.m. UTC | #6
Georg Faerber <georg@riseup.net> writes:

> Hi,
>
> On 18-02-22 12:49:54, Daniel Axtens wrote:
>> I don't think I can get an OzLabs like virtual machine for very long
>> without spending a *lot* of money.
>
> Could you or someone else elaborate on that? Which kind of VM is it, in
> terms of memory, cpu, storage, etc?
The OzLabs server is a reasonably powerful dedicated physical machine, co-located
in Canberra, Australia. Beyond that I don't know.

Regards,
Daniel
Andrew Donnellan Feb. 22, 2018, 3:54 a.m. UTC | #7
On 22/02/18 14:35, Daniel Axtens wrote:
>> On 18-02-22 12:49:54, Daniel Axtens wrote:
>>> I don't think I can get an OzLabs like virtual machine for very long
>>> without spending a *lot* of money.
>>
>> Could you or someone else elaborate on that? Which kind of VM is it, in
>> terms of memory, cpu, storage, etc?
> The OzLabs server is a reasonably powerful dedicated physical machine, co-located
> in Canberra, Australia. Beyond that I don't know.

It's not an incredibly powerful box:

ajd@bilbo:~$ free -h
               total        used        free      shared  buff/cache   available
Mem:            15G        6.2G        298M        228M        9.1G        8.8G
Swap:          9.3G        1.1G        8.2G

ajd@bilbo:~$ lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              4
On-line CPU(s) list: 0-3
Thread(s) per core:  1
Core(s) per socket:  4
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               60
Model name:          Intel(R) Xeon(R) CPU E3-1220 v3 @ 3.10GHz
Thomas Petazzoni Feb. 22, 2018, 8:19 a.m. UTC | #8
Hello,

On Thu, 22 Feb 2018 12:49:54 +1100, Daniel Axtens wrote:

> I thought about spinning up a VM, but I'm not sure if it would be
> sufficiently equivalent to the OzLabs set up to actually hit the bug...
> 
> I think hitting the issue is due to a combination of OzLabs and
> Buildroot features:
>  - how quickly the buildroot mailing list sends mail
>  - how quickly ozlabs receives it - both in terms of mail system lag and
>    just in terms of network locality
>  - how many cpus ozlabs has to parallelise processing, etc

Indeed, it's quite surprising that we seem to be the only project
affected by this issue. Yesterday, my colleague Alexandre Belloni, who
maintains the kernel RTC subsystem, sent a 100-patch series to the
Linux RTC mailing list, and all of the patches were recorded by
patchwork. On the other hand, someone sent a 6-patch series to the
Buildroot mailing list yesterday, and only 2 of the 6 patches were
recorded by patchwork.

However, I believe the RTC mailing list is using Google Groups, while
we are using a Mailman instance hosted at OSUOSL, which is perhaps
sending e-mail faster than Google Groups.

Anyway, looking forward to seeing those fixes deployed on ozlabs.org!

Thanks again!

Thomas
Thomas Petazzoni Feb. 22, 2018, 8:44 a.m. UTC | #9
Hello,

On Thu, 22 Feb 2018 09:29:55 +0100, Alexandre Belloni wrote:

> > Indeed, it's quite surprising that we seem to be the only project
> > affected by this issue. Yesterday, my colleague Alexandre Belloni, who
> > maintains the kernel RTC subsystem, sent a 100-patch series to the
> > Linux RTC mailing list, and all of the patches were recorded by
> > patchwork. On the other hand, someone sent a 6-patch series to the
> > Buildroot mailing list yesterday, and only 2 of the 6 patches were
> > recorded by patchwork.
> > 
> > However, I believe the RTC mailing list is using Google Groups, while
> > we are using a Mailman instance hosted at OSUOSL, which is perhaps
> > sending e-mail faster than Google Groups.
> 
> No, it is now hosted on vger.kernel.org but maybe they also have some
> kind of throttling.

Ah sorry. Wasn't it hosted on Google Groups in the past?

Thomas