diff mbox series

[8/9] parser: use Patch.objects.create instead of save()

Message ID 20180221141716.10908-9-dja@axtens.net
State Changes Requested
Headers show
Series Tools and fixes for parallel parsing | expand

Checks

Context Check Description
dja/snowpatch-0_1_0 success master/apply_patch Successfully applied
dja/snowpatch-snowpatch_job_snowpatch-patchwork success Test snowpatch/job/snowpatch-patchwork on branch master

Commit Message

Daniel Axtens Feb. 21, 2018, 2:17 p.m. UTC
Attempts to do parallel parsing with MySQL threw the following errors:

_mysql_exceptions.OperationalError: (1213, 'Deadlock found when trying to get lock; try restarting transaction')

Looking at the code, it was thrown when we created a patch like this:

patch = Patch(...)
patch.save()

The SQL statements that were being generated were weird:

UPDATE "patchwork_patch" SET ...
INSERT INTO "patchwork_patch" (...) VALUES (...)

As far as I can tell, the update could never work, because it was
trying to update a patch that didn't exist yet. My hypothesis is
that Django somehow didn't quite 'get' that because of the backend
complexity of the Patch model, so it tried to do an update, failed,
and then tried an insert.

Change the code to use Patch.objects.create, which makes the UPDATEs
and the weird MySQL errors go away.

Also move it up a bit earlier in the process so that if things go wrong
later at least we've committed the patch to the db.

Signed-off-by: Daniel Axtens <dja@axtens.net>
---
 patchwork/parser.py | 29 ++++++++++++++---------------
 1 file changed, 14 insertions(+), 15 deletions(-)

Comments

Andrew Donnellan Feb. 22, 2018, 4:50 a.m. UTC | #1
On 22/02/18 01:17, Daniel Axtens wrote:
> Attempts to do parallel parsing with MySQL threw the following errors:
> 
> _mysql_exceptions.OperationalError: (1213, 'Deadlock found when trying to get lock; try restarting transaction')
> 
> Looking at the code, it was thrown when we created a patch like this:
> 
> patch = Patch(...)
> patch.save()
> 
> The SQL statements that were being generated were weird:
> 
> UPDATE "patchwork_patch" SET ...
> INSERT INTO "patchwork_patch" (...) VALUES (...)
> 
> As far as I can tell, the update could never work, because it was
> trying to update a patch that didn't exist yet. My hypothesis is
> that Django somehow didn't quite 'get' that because of the backend
> complexity of the Patch model, so it tried to do an update, failed,
> and then tried an insert.

Backend complexity... subclassing bug or something? Hmm.

> 
> Change the code to use Patch.objects.create, which makes the UPDATEs
> and the weird MySQL errors go away.
> 
> Also move it up a bit earlier in the process so that if things go wrong
> later at least we've committed the patch to the db.
> 
> Signed-off-by: Daniel Axtens <dja@axtens.net>

In any case, objects.create() is stylistically nicer imho.

I definitely think we should commit the patch to the database before 
Series...

Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>

> ---
>   patchwork/parser.py | 29 ++++++++++++++---------------
>   1 file changed, 14 insertions(+), 15 deletions(-)
> 
> diff --git a/patchwork/parser.py b/patchwork/parser.py
> index 3d40b74375e0..0e53e6b9a3af 100644
> --- a/patchwork/parser.py
> +++ b/patchwork/parser.py
> @@ -984,6 +984,20 @@ def parse_mail(mail, list_id=None):
>               filenames = find_filenames(diff)
>               delegate = find_delegate_by_filename(project, filenames)
> 
> +        patch = Patch.objects.create(
> +            msgid=msgid,
> +            project=project,
> +            name=name[:255],
> +            date=date,
> +            headers=headers,
> +            submitter=author,
> +            content=message,
> +            diff=diff,
> +            pull_url=pull_url,
> +            delegate=delegate,
> +            state=find_state(mail))
> +        logger.debug('Patch saved')
> +
>           # if we don't have a series marker, we will never have an existing
>           # series to match against.
>           series = None
> @@ -1024,21 +1038,6 @@ def parse_mail(mail, list_id=None):
>                   except SeriesReference.DoesNotExist:
>                       SeriesReference.objects.create(series=series, msgid=ref)
> 
> -        patch = Patch(
> -            msgid=msgid,
> -            project=project,
> -            name=name[:255],
> -            date=date,
> -            headers=headers,
> -            submitter=author,
> -            content=message,
> -            diff=diff,
> -            pull_url=pull_url,
> -            delegate=delegate,
> -            state=find_state(mail))
> -        patch.save()
> -        logger.debug('Patch saved')
> -
>           # add to a series if we have found one, and we have a numbered
>           # patch. Don't add unnumbered patches (for example diffs sent
>           # in reply, or just messages with random refs/in-reply-tos)
>
Daniel Axtens Feb. 24, 2018, 2:26 p.m. UTC | #2
Andrew Donnellan <andrew.donnellan@au1.ibm.com> writes:

> On 22/02/18 01:17, Daniel Axtens wrote:
>> Attempts to do parallel parsing with MySQL threw the following errors:
>> 
>> _mysql_exceptions.OperationalError: (1213, 'Deadlock found when trying to get lock; try restarting transaction')
>> 
>> Looking at the code, it was thrown when we created a patch like this:
>> 
>> patch = Patch(...)
>> patch.save()
>> 
>> The SQL statements that were being generated were weird:
>> 
>> UPDATE "patchwork_patch" SET ...
>> INSERT INTO "patchwork_patch" (...) VALUES (...)
>> 
>> As far as I can tell, the update could never work, because it was
>> trying to update a patch that didn't exist yet. My hypothesis is
>> that Django somehow didn't quite 'get' that because of the backend
>> complexity of the Patch model, so it tried to do an update, failed,
>> and then tried an insert.
>
> Backend complexity... subclassing bug or something? Hmm.

This is definitely my suspicion but I have 0 interest in finding out: I
really want to denormalise some data and squish some of the class
hierarchy for performance reasons - so hopefully the relevant code goes
away soon anyway.

>> 
>> Change the code to use Patch.objects.create, which makes the UPDATEs
>> and the weird MySQL errors go away.
>> 
>> Also move it up a bit earlier in the process so that if things go wrong
>> later at least we've committed the patch to the db.
>> 
>> Signed-off-by: Daniel Axtens <dja@axtens.net>
>
> In any case, objects.create() is stylistically nicer imho.
>
> I definitely think we should commit the patch to the database before 
> Series...
>
> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>

Thanks!

Regards,
Daniel

>> ---
>>   patchwork/parser.py | 29 ++++++++++++++---------------
>>   1 file changed, 14 insertions(+), 15 deletions(-)
>> 
>> diff --git a/patchwork/parser.py b/patchwork/parser.py
>> index 3d40b74375e0..0e53e6b9a3af 100644
>> --- a/patchwork/parser.py
>> +++ b/patchwork/parser.py
>> @@ -984,6 +984,20 @@ def parse_mail(mail, list_id=None):
>>               filenames = find_filenames(diff)
>>               delegate = find_delegate_by_filename(project, filenames)
>> 
>> +        patch = Patch.objects.create(
>> +            msgid=msgid,
>> +            project=project,
>> +            name=name[:255],
>> +            date=date,
>> +            headers=headers,
>> +            submitter=author,
>> +            content=message,
>> +            diff=diff,
>> +            pull_url=pull_url,
>> +            delegate=delegate,
>> +            state=find_state(mail))
>> +        logger.debug('Patch saved')
>> +
>>           # if we don't have a series marker, we will never have an existing
>>           # series to match against.
>>           series = None
>> @@ -1024,21 +1038,6 @@ def parse_mail(mail, list_id=None):
>>                   except SeriesReference.DoesNotExist:
>>                       SeriesReference.objects.create(series=series, msgid=ref)
>> 
>> -        patch = Patch(
>> -            msgid=msgid,
>> -            project=project,
>> -            name=name[:255],
>> -            date=date,
>> -            headers=headers,
>> -            submitter=author,
>> -            content=message,
>> -            diff=diff,
>> -            pull_url=pull_url,
>> -            delegate=delegate,
>> -            state=find_state(mail))
>> -        patch.save()
>> -        logger.debug('Patch saved')
>> -
>>           # add to a series if we have found one, and we have a numbered
>>           # patch. Don't add unnumbered patches (for example diffs sent
>>           # in reply, or just messages with random refs/in-reply-tos)
>> 
>
> -- 
> Andrew Donnellan              OzLabs, ADL Canberra
> andrew.donnellan@au1.ibm.com  IBM Australia Limited
diff mbox series

Patch

diff --git a/patchwork/parser.py b/patchwork/parser.py
index 3d40b74375e0..0e53e6b9a3af 100644
--- a/patchwork/parser.py
+++ b/patchwork/parser.py
@@ -984,6 +984,20 @@  def parse_mail(mail, list_id=None):
             filenames = find_filenames(diff)
             delegate = find_delegate_by_filename(project, filenames)
 
+        patch = Patch.objects.create(
+            msgid=msgid,
+            project=project,
+            name=name[:255],
+            date=date,
+            headers=headers,
+            submitter=author,
+            content=message,
+            diff=diff,
+            pull_url=pull_url,
+            delegate=delegate,
+            state=find_state(mail))
+        logger.debug('Patch saved')
+
         # if we don't have a series marker, we will never have an existing
         # series to match against.
         series = None
@@ -1024,21 +1038,6 @@  def parse_mail(mail, list_id=None):
                 except SeriesReference.DoesNotExist:
                     SeriesReference.objects.create(series=series, msgid=ref)
 
-        patch = Patch(
-            msgid=msgid,
-            project=project,
-            name=name[:255],
-            date=date,
-            headers=headers,
-            submitter=author,
-            content=message,
-            diff=diff,
-            pull_url=pull_url,
-            delegate=delegate,
-            state=find_state(mail))
-        patch.save()
-        logger.debug('Patch saved')
-
         # add to a series if we have found one, and we have a numbered
         # patch. Don't add unnumbered patches (for example diffs sent
         # in reply, or just messages with random refs/in-reply-tos)