[v2,8/9] parser: use Patch.objects.create instead of save()

Message ID 20180224145020.15181-9-dja@axtens.net
State Accepted
Series Tools and fixes for parallel parsing

Checks

Context                                          Check    Description
dja/snowpatch-0_1_0                              success  master/apply_patch Successfully applied
dja/snowpatch-snowpatch_job_snowpatch-patchwork  success  Test snowpatch/job/snowpatch-patchwork on branch master

Commit Message

Daniel Axtens Feb. 24, 2018, 2:50 p.m. UTC
Attempts to do parallel parsing with MySQL threw the following errors:

_mysql_exceptions.OperationalError: (1213, 'Deadlock found when trying to get lock; try restarting transaction')

Looking at the code, it was thrown when we created a patch like this:

patch = Patch(...)
patch.save()

The SQL statements that were being generated were weird:

UPDATE "patchwork_patch" SET ...
INSERT INTO "patchwork_patch" (...) VALUES (...)

As far as I can tell, the update could never work, because it was
trying to update a patch that didn't exist yet. My hypothesis is
that Django somehow didn't quite 'get' that because of the backend
complexity of the Patch model, so it tried to do an update, failed,
and then tried an insert.

Change the code to use Patch.objects.create, which makes the UPDATEs
and the weird MySQL errors go away.
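
For readers who want to see the two statement patterns side by side, here is a minimal sketch. It uses the standard-library sqlite3 module as a stand-in for the ORM, so it does not reproduce the MySQL deadlock, and the table and helper names (patch, save_update_then_insert, create_forced_insert) are hypothetical. It is consistent with Django's documented behaviour, where save() on an instance whose primary key is already set issues an UPDATE and falls back to an INSERT if no row matched, while objects.create() saves with force_insert=True. Whether that is exactly what happened for the Patch model here remains the hypothesis stated above.

```python
import sqlite3


def save_update_then_insert(conn, log, pk, name):
    """Mimic a save() that cannot tell whether the row exists yet:
    try an UPDATE first, then INSERT if nothing was updated."""
    log.append("UPDATE")
    cur = conn.execute("UPDATE patch SET name = ? WHERE id = ?", (name, pk))
    if cur.rowcount == 0:
        # No row matched, so the UPDATE was wasted work; fall back to INSERT.
        log.append("INSERT")
        conn.execute("INSERT INTO patch (id, name) VALUES (?, ?)", (pk, name))


def create_forced_insert(conn, log, pk, name):
    """Mimic create(): the caller knows the row is new, so INSERT directly."""
    log.append("INSERT")
    conn.execute("INSERT INTO patch (id, name) VALUES (?, ?)", (pk, name))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patch (id INTEGER PRIMARY KEY, name TEXT)")

save_log, create_log = [], []
save_update_then_insert(conn, save_log, 1, "first")
create_forced_insert(conn, create_log, 2, "second")

rows = conn.execute("SELECT id, name FROM patch ORDER BY id").fetchall()
print(save_log, create_log, rows)
```

Both helpers end up with the row stored; the difference is only the extra UPDATE, which is the statement that disappears once the code switches to create().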

Also move it up a bit earlier in the process so that if things go wrong
later at least we've committed the patch to the db.

Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Daniel Axtens <dja@axtens.net>
---
 patchwork/parser.py | 29 ++++++++++++++---------------
 1 file changed, 14 insertions(+), 15 deletions(-)

Comments

Stephen Finucane Feb. 25, 2018, 12:25 p.m. UTC | #1
On Sun, 2018-02-25 at 01:50 +1100, Daniel Axtens wrote:
> Attempts to do parallel parsing with MySQL threw the following errors:
> 
> _mysql_exceptions.OperationalError: (1213, 'Deadlock found when trying to get lock; try restarting transaction')
> 
> Looking at the code, it was thrown when we created a patch like this:
> 
> patch = Patch(...)
> patch.save()
> 
> The SQL statements that were being generated were weird:
> 
> UPDATE "patchwork_patch" SET ...
> INSERT INTO "patchwork_patch" (...) VALUES (...)
> 
> As far as I can tell, the update could never work, because it was
> trying to update a patch that didn't exist yet. My hypothesis is
> that Django somehow didn't quite 'get' that because of the backend
> complexity of the Patch model, so it tried to do an update, failed,
> and then tried an insert.
> 
> Change the code to use Patch.objects.create, which makes the UPDATEs
> and the weird MySQL errors go away.
> 
> Also move it up a bit earlier in the process so that if things go wrong
> later at least we've committed the patch to the db.
> 
> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
> Signed-off-by: Daniel Axtens <dja@axtens.net>

Yup, this makes total sense. The reason this was done this way
previously was that I wanted to ensure as much pre-work as possible was
done before saving patches. If that failed, the patch was "broken" and
shouldn't be saved. However, as we've seen, it's usually Patchwork, not
the patch, that's broken, and this pattern ensures the patch is never
saved. Changing the ordering here is therefore a sensible thing to do.
As such:

Reviewed-by: Stephen Finucane <stephen@that.guru>

Patch

diff --git a/patchwork/parser.py b/patchwork/parser.py
index 9502162be90c..56dc7006c811 100644
--- a/patchwork/parser.py
+++ b/patchwork/parser.py
@@ -984,6 +984,20 @@  def parse_mail(mail, list_id=None):
             filenames = find_filenames(diff)
             delegate = find_delegate_by_filename(project, filenames)
 
+        patch = Patch.objects.create(
+            msgid=msgid,
+            project=project,
+            name=name[:255],
+            date=date,
+            headers=headers,
+            submitter=author,
+            content=message,
+            diff=diff,
+            pull_url=pull_url,
+            delegate=delegate,
+            state=find_state(mail))
+        logger.debug('Patch saved')
+
         # if we don't have a series marker, we will never have an existing
         # series to match against.
         series = None
@@ -1024,21 +1038,6 @@  def parse_mail(mail, list_id=None):
                 except SeriesReference.DoesNotExist:
                     SeriesReference.objects.create(series=series, msgid=ref)
 
-        patch = Patch(
-            msgid=msgid,
-            project=project,
-            name=name[:255],
-            date=date,
-            headers=headers,
-            submitter=author,
-            content=message,
-            diff=diff,
-            pull_url=pull_url,
-            delegate=delegate,
-            state=find_state(mail))
-        patch.save()
-        logger.debug('Patch saved')
-
         # add to a series if we have found one, and we have a numbered
         # patch. Don't add unnumbered patches (for example diffs sent
         # in reply, or just messages with random refs/in-reply-tos)