Message ID | 20170928141935.23204-1-dja@axtens.net |
---|---|
State | Superseded |
Headers | show |
Series | xmlrpc/patch_list: only fetch required fields | expand |
Daniel Axtens <dja@axtens.net> writes: > OzLabs noticed *massive* slowdowns in queries like this one: > > SELECT "patchwork_submission"."id", "patchwork_submission"."msgid", > "patchwork_submission"."date", "patchwork_submission"."headers", > "patchwork_submission"."submitter_id", > "patchwork_submission"."content", "patchwork_submission"."project_id", > "patchwork_submission"."name", "patchwork_patch"."submission_ptr_id", > "patchwork_patch"."diff", "patchwork_patch"."commit_ref", > "patchwork_patch"."pull_url", "patchwork_patch"."delegate_id", > "patchwork_patch"."state_id", "patchwork_patch"."archived", > "patchwork_patch"."hash" FROM "patchwork_patch" INNER JOIN > "patchwork_submission" ON ("patchwork_patch"."submission_ptr_id" = > "patchwork_submission"."id") WHERE > ("patchwork_submission"."project_id" = 2 AND > "patchwork_patch"."state_id" = 1) ORDER BY > "patchwork_submission"."date" ASC > > These appear to be a result of pwclient list operations. We *do not* > need content/headers/diff in this case - so do not fetch them as it > is incredibly expensive - queries in excess of 50s have been observed. > > This should go to stable/2.0. > > Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> > Reported-by: Michael Ellerman <mpe@ellerman.id.au> > Cc: Jeremy Kerr <jk@ozlabs.org> > Signed-off-by: Daniel Axtens <dja@axtens.net> > > --- > We're seeing this much more than in v1, I think because of the join > that is now required. More on this later. > --- > patchwork/views/xmlrpc.py | 8 ++++++++ > 1 file changed, 8 insertions(+) > > diff --git a/patchwork/views/xmlrpc.py b/patchwork/views/xmlrpc.py > index 8de9d10e1d50..186f54cac684 100644 > --- a/patchwork/views/xmlrpc.py > +++ b/patchwork/views/xmlrpc.py > @@ -585,6 +585,14 @@ def patch_list(filt=None): > > patches = Patch.objects.filter(**dfilter) > > + # Only extract the relevant fields. This saves a big db load as we > + # no longer fetch content/headers/etc for potentially every patch > + # in a project. > + patches = patches.only('id', 'date', 'msgid', 'name', > + 'project_id', 'state_id', 'archived', > + 'submitter_id', 'delegate_id', 'commit_ref', > + 'hash') > + We probably also want to chuck a select/prefetch_related in here but for some reason xmlrpc doesn't want to test locally so I haven't been able to verify it just yet. This should still be a big improvement though! Regards, Daniel > return _get_objects(patch_to_dict, patches, max_count) > > > -- > 2.11.0
On 29/09/17 00:19, Daniel Axtens wrote: > OzLabs noticed *massive* slowdowns in queries like this one: > > SELECT "patchwork_submission"."id", "patchwork_submission"."msgid", > "patchwork_submission"."date", "patchwork_submission"."headers", > "patchwork_submission"."submitter_id", > "patchwork_submission"."content", "patchwork_submission"."project_id", > "patchwork_submission"."name", "patchwork_patch"."submission_ptr_id", > "patchwork_patch"."diff", "patchwork_patch"."commit_ref", > "patchwork_patch"."pull_url", "patchwork_patch"."delegate_id", > "patchwork_patch"."state_id", "patchwork_patch"."archived", > "patchwork_patch"."hash" FROM "patchwork_patch" INNER JOIN > "patchwork_submission" ON ("patchwork_patch"."submission_ptr_id" = > "patchwork_submission"."id") WHERE > ("patchwork_submission"."project_id" = 2 AND > "patchwork_patch"."state_id" = 1) ORDER BY > "patchwork_submission"."date" ASC > > These appear to be a result of pwclient list operations. We *do not* > need content/headers/diff in this case - so do not fetch them as it > is incredibly expensive - queries in excess of 50s have been observed. > > This should go to stable/2.0. > > Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> > Reported-by: Michael Ellerman <mpe@ellerman.id.au> > Cc: Jeremy Kerr <jk@ozlabs.org> > Signed-off-by: Daniel Axtens <dja@axtens.net> Looks good to me Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> > > --- > We're seeing this much more than in v1, I think because of the join > that is now required. More on this later. > --- > patchwork/views/xmlrpc.py | 8 ++++++++ > 1 file changed, 8 insertions(+) > > diff --git a/patchwork/views/xmlrpc.py b/patchwork/views/xmlrpc.py > index 8de9d10e1d50..186f54cac684 100644 > --- a/patchwork/views/xmlrpc.py > +++ b/patchwork/views/xmlrpc.py > @@ -585,6 +585,14 @@ def patch_list(filt=None): > > patches = Patch.objects.filter(**dfilter) > > + # Only extract the relevant fields. This saves a big db load as we > + # no longer fetch content/headers/etc for potentially every patch > + # in a project. > + patches = patches.only('id', 'date', 'msgid', 'name', > + 'project_id', 'state_id', 'archived', > + 'submitter_id', 'delegate_id', 'commit_ref', > + 'hash') > + > return _get_objects(patch_to_dict, patches, max_count) > > >
diff --git a/patchwork/views/xmlrpc.py b/patchwork/views/xmlrpc.py index 8de9d10e1d50..186f54cac684 100644 --- a/patchwork/views/xmlrpc.py +++ b/patchwork/views/xmlrpc.py @@ -585,6 +585,14 @@ def patch_list(filt=None): patches = Patch.objects.filter(**dfilter) + # Only extract the relevant fields. This saves a big db load as we + # no longer fetch content/headers/etc for potentially every patch + # in a project. + patches = patches.only('id', 'date', 'msgid', 'name', + 'project_id', 'state_id', 'archived', + 'submitter_id', 'delegate_id', 'commit_ref', + 'hash') + return _get_objects(patch_to_dict, patches, max_count)
OzLabs noticed *massive* slowdowns in queries like this one: SELECT "patchwork_submission"."id", "patchwork_submission"."msgid", "patchwork_submission"."date", "patchwork_submission"."headers", "patchwork_submission"."submitter_id", "patchwork_submission"."content", "patchwork_submission"."project_id", "patchwork_submission"."name", "patchwork_patch"."submission_ptr_id", "patchwork_patch"."diff", "patchwork_patch"."commit_ref", "patchwork_patch"."pull_url", "patchwork_patch"."delegate_id", "patchwork_patch"."state_id", "patchwork_patch"."archived", "patchwork_patch"."hash" FROM "patchwork_patch" INNER JOIN "patchwork_submission" ON ("patchwork_patch"."submission_ptr_id" = "patchwork_submission"."id") WHERE ("patchwork_submission"."project_id" = 2 AND "patchwork_patch"."state_id" = 1) ORDER BY "patchwork_submission"."date" ASC These appear to be a result of pwclient list operations. We *do not* need content/headers/diff in this case - so do not fetch them as it is incredibly expensive - queries in excess of 50s have been observed. This should go to stable/2.0. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Reported-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Daniel Axtens <dja@axtens.net> --- We're seeing this much more than in v1, I think because of the join that is now required. More on this later. --- patchwork/views/xmlrpc.py | 8 ++++++++ 1 file changed, 8 insertions(+)