Message ID | 20180810080106.10714-2-stewart@linux.ibm.com |
---|---|
State | Accepted |
Headers | show |
Series | Performance for ALL THE THINGS! | expand |
Stewart Smith <stewart@linux.ibm.com> writes: > There's two main bits that are really expensive when composing the list > of patches for a project: the query getting the list, and the query > finding the series for each patch. > > If we look at the query getting the list, it gets a lot of unnesseccary > fields such as 'headers' and 'content', even though we tell Django not > to. It turns out that Django seems to ignore the Submission relationship > and I have no idea how to force it to ignore that thing (defer doesn't > work) but if we go only, then it works okay. > > From my import of ~8000 messages for a few projects, my laptop query > time (MySQL, as setup by whatever the docker-compose things do) goes > from: > > http://localhost:8000/project/linux-kernel/list/ > FROM: > 342ms SQL queries cold cache, 268ms warm cache > TO: > 118ms SQL queries cold cache, 88ms warm cache > > Which is... non trivial to say the least. > > The big jump is the patches.only change, and the removal of ordering > on the patchseries takes a further 10ms off. For some strange reason, it > seems rather hard to tell Django that you don't care what order the > results come back in for that query (if we do, then the db server has to > do a sort rather than just return each row) Thanks Stewart! It's great to get some real DB experience - feel free to hang around! :) So, further to our conversation with Konstantin, I tested this against Django 2.0. It still saves us some time - it means we no longer load the following fields: `patchwork_submission`.`id`, `patchwork_submission`.`msgid`, `patchwork_patch`.`commit_ref`, `patchwork_patch`.`pull_url`, `patchwork_patch`.`archived`, `patchwork_patch`.`hash`, `patchwork_patch`.`patch_project_id`, This obviously saves the db some work and communication overhead. I'm a little nervous that this will slightly complicate some of the further denormalisation but I think that's probably a price worth paying unless Stephen objects. I do still want to test the 'ordering' change a bit more before committing it though. Regards, Daniel > > Signed-off-by: Stewart Smith <stewart@linux.ibm.com> > --- > patchwork/models.py | 1 - > patchwork/views/__init__.py | 2 ++ > 2 files changed, 2 insertions(+), 1 deletion(-) > > diff --git a/patchwork/models.py b/patchwork/models.py > index 6268f5b72e3c..d2b88fc48c91 100644 > --- a/patchwork/models.py > +++ b/patchwork/models.py > @@ -747,7 +747,6 @@ class Series(FilenameMixin, models.Model): > return self.name if self.name else 'Untitled series #%d' % self.id > > class Meta: > - ordering = ('date',) > verbose_name_plural = 'Series' > > > diff --git a/patchwork/views/__init__.py b/patchwork/views/__init__.py > index f8d23a388ac7..96fd0798af5a 100644 > --- a/patchwork/views/__init__.py > +++ b/patchwork/views/__init__.py > @@ -287,6 +287,8 @@ def generic_list(request, project, view, view_args=None, filter_settings=None, > # rendering the list template > patches = patches.select_related('state', 'submitter', 'delegate') > > + patches = patches.only('state','submitter','delegate','project','name','date') > + > # we also need checks and series > patches = patches.prefetch_related('check_set', 'series') > > -- > 2.17.1 > > _______________________________________________ > Patchwork mailing list > Patchwork@lists.ozlabs.org > https://lists.ozlabs.org/listinfo/patchwork
Daniel Axtens <dja@axtens.net> writes: > Stewart Smith <stewart@linux.ibm.com> writes: > >> There's two main bits that are really expensive when composing the list >> of patches for a project: the query getting the list, and the query >> finding the series for each patch. >> >> If we look at the query getting the list, it gets a lot of unnesseccary >> fields such as 'headers' and 'content', even though we tell Django not >> to. It turns out that Django seems to ignore the Submission relationship >> and I have no idea how to force it to ignore that thing (defer doesn't >> work) but if we go only, then it works okay. >> >> From my import of ~8000 messages for a few projects, my laptop query >> time (MySQL, as setup by whatever the docker-compose things do) goes >> from: >> >> http://localhost:8000/project/linux-kernel/list/ >> FROM: >> 342ms SQL queries cold cache, 268ms warm cache >> TO: >> 118ms SQL queries cold cache, 88ms warm cache >> >> Which is... non trivial to say the least. >> >> The big jump is the patches.only change, and the removal of ordering >> on the patchseries takes a further 10ms off. For some strange reason, it >> seems rather hard to tell Django that you don't care what order the >> results come back in for that query (if we do, then the db server has to >> do a sort rather than just return each row) > > Thanks Stewart! It's great to get some real DB experience - feel free to > hang around! :) I'll try to :) Or at least pop up to make things faster/nicer to the database. > So, further to our conversation with Konstantin, I tested this against > Django 2.0. It still saves us some time - it means we no longer load the > following fields: > > `patchwork_submission`.`id`, `patchwork_submission`.`msgid`, > `patchwork_patch`.`commit_ref`, `patchwork_patch`.`pull_url`, > `patchwork_patch`.`archived`, `patchwork_patch`.`hash`, > `patchwork_patch`.`patch_project_id`, > > This obviously saves the db some work and communication overhead. I found that 'headers' and 'content' from the patch were also being fetched, and it saved fetching those (which are really quite expensive to do, as blobs are typically mostly stored on a separate page than the row itself). It's the blobs that were the big expense, as they'd likely generate 2 disk seeks and reads per row. > I'm a little nervous that this will slightly complicate some of the > further denormalisation but I think that's probably a price worth paying > unless Stephen objects. I don't think it should really be too much of a problem, as any denormalisation occurs, folk are going to have to look at the queries being produced and executed anyway otherwise there's always a chance for regressions. > I do still want to test the 'ordering' change a bit more before > committing it though. Yeah, that struck me as a weird one, but it certainly did have an effect on some things.
On Sat, 2018-08-11 at 04:28 +1000, Daniel Axtens wrote: > Stewart Smith <stewart@linux.ibm.com> writes: > > > There's two main bits that are really expensive when composing the list > > of patches for a project: the query getting the list, and the query > > finding the series for each patch. > > > > If we look at the query getting the list, it gets a lot of unnesseccary > > fields such as 'headers' and 'content', even though we tell Django not > > to. It turns out that Django seems to ignore the Submission relationship > > and I have no idea how to force it to ignore that thing (defer doesn't > > work) but if we go only, then it works okay. > > > > From my import of ~8000 messages for a few projects, my laptop query > > time (MySQL, as setup by whatever the docker-compose things do) goes > > from: > > > > http://localhost:8000/project/linux-kernel/list/ > > FROM: > > 342ms SQL queries cold cache, 268ms warm cache > > TO: > > 118ms SQL queries cold cache, 88ms warm cache > > > > Which is... non trivial to say the least. > > > > The big jump is the patches.only change, and the removal of ordering > > on the patchseries takes a further 10ms off. For some strange reason, it > > seems rather hard to tell Django that you don't care what order the > > results come back in for that query (if we do, then the db server has to > > do a sort rather than just return each row) > > Thanks Stewart! It's great to get some real DB experience - feel free to > hang around! :) +1000 > So, further to our conversation with Konstantin, I tested this against > Django 2.0. It still saves us some time - it means we no longer load the > following fields: > > `patchwork_submission`.`id`, `patchwork_submission`.`msgid`, > `patchwork_patch`.`commit_ref`, `patchwork_patch`.`pull_url`, > `patchwork_patch`.`archived`, `patchwork_patch`.`hash`, > `patchwork_patch`.`patch_project_id`, Out of curiosity, does this allow us to drop the patch_project_id field entirely or not (bear in mind, I haven't read through the rest of this). > This obviously saves the db some work and communication overhead. > > I'm a little nervous that this will slightly complicate some of the > further denormalisation but I think that's probably a price worth paying > unless Stephen objects. To be honest, it could be 6 months before we get any further on big efforts like this. A 3x performance boost right now beats an unknown boost N months down the line, IMO. My 2c anyway. > I do still want to test the 'ordering' change a bit more before > committing it though. I'm going to attempt to do my own testing of this (with a much larger dataset) some evening this week. I'll hold off applying anything pending this though. Stephen
Stephen Finucane <stephen@that.guru> writes: > On Sat, 2018-08-11 at 04:28 +1000, Daniel Axtens wrote: >> Stewart Smith <stewart@linux.ibm.com> writes: >> So, further to our conversation with Konstantin, I tested this against >> Django 2.0. It still saves us some time - it means we no longer load the >> following fields: >> >> `patchwork_submission`.`id`, `patchwork_submission`.`msgid`, >> `patchwork_patch`.`commit_ref`, `patchwork_patch`.`pull_url`, >> `patchwork_patch`.`archived`, `patchwork_patch`.`hash`, >> `patchwork_patch`.`patch_project_id`, > > Out of curiosity, does this allow us to drop the patch_project_id field > entirely or not (bear in mind, I haven't read through the rest of > this). I'm not too sure... Umm... maybe not as I think we can go from patch to project when viewing a patch? >> This obviously saves the db some work and communication overhead. >> >> I'm a little nervous that this will slightly complicate some of the >> further denormalisation but I think that's probably a price worth paying >> unless Stephen objects. > > To be honest, it could be 6 months before we get any further on big > efforts like this. A 3x performance boost right now beats an unknown > boost N months down the line, IMO. My 2c anyway. I'd tend to agree, we may as well get the maximum performance we can today, and then if it's *really* needed later down the track, we can change the schema a bunch to denormalise. From my benchmarks, the biggest boost currently would be to precompute and cache patch counts for most of the common views, as a *lot* of page time is for that. >> I do still want to test the 'ordering' change a bit more before >> committing it though. > > I'm going to attempt to do my own testing of this (with a much larger > dataset) some evening this week. I'll hold off applying anything > pending this though. Nice, let me know how it goes. I'll be off somewhere in a national park for a bunch of next week, which means I won't look at email, so it's the *perfect* time to merge and deploy all of my patches :)
Stephen Finucane <stephen@that.guru> writes: > On Sat, 2018-08-11 at 04:28 +1000, Daniel Axtens wrote: >> Stewart Smith <stewart@linux.ibm.com> writes: >> >> > There's two main bits that are really expensive when composing the list >> > of patches for a project: the query getting the list, and the query >> > finding the series for each patch. >> > >> > If we look at the query getting the list, it gets a lot of unnesseccary >> > fields such as 'headers' and 'content', even though we tell Django not >> > to. It turns out that Django seems to ignore the Submission relationship >> > and I have no idea how to force it to ignore that thing (defer doesn't >> > work) but if we go only, then it works okay. >> > >> > From my import of ~8000 messages for a few projects, my laptop query >> > time (MySQL, as setup by whatever the docker-compose things do) goes >> > from: >> > >> > http://localhost:8000/project/linux-kernel/list/ >> > FROM: >> > 342ms SQL queries cold cache, 268ms warm cache >> > TO: >> > 118ms SQL queries cold cache, 88ms warm cache >> > >> > Which is... non trivial to say the least. >> > >> > The big jump is the patches.only change, and the removal of ordering >> > on the patchseries takes a further 10ms off. For some strange reason, it >> > seems rather hard to tell Django that you don't care what order the >> > results come back in for that query (if we do, then the db server has to >> > do a sort rather than just return each row) >> >> Thanks Stewart! It's great to get some real DB experience - feel free to >> hang around! :) > > +1000 > >> So, further to our conversation with Konstantin, I tested this against >> Django 2.0. It still saves us some time - it means we no longer load the >> following fields: >> >> `patchwork_submission`.`id`, `patchwork_submission`.`msgid`, >> `patchwork_patch`.`commit_ref`, `patchwork_patch`.`pull_url`, >> `patchwork_patch`.`archived`, `patchwork_patch`.`hash`, >> `patchwork_patch`.`patch_project_id`, > > Out of curiosity, does this allow us to drop the patch_project_id field > entirely or not (bear in mind, I haven't read through the rest of > this). > >> This obviously saves the db some work and communication overhead. >> >> I'm a little nervous that this will slightly complicate some of the >> further denormalisation but I think that's probably a price worth paying >> unless Stephen objects. > > To be honest, it could be 6 months before we get any further on big > efforts like this. A 3x performance boost right now beats an unknown > boost N months down the line, IMO. My 2c anyway. OK then, reaffirming my in-principle ack to merging these once Stephen or I can test the things they touch. Regards, Daniel > >> I do still want to test the 'ordering' change a bit more before >> committing it though. > > I'm going to attempt to do my own testing of this (with a much larger > dataset) some evening this week. I'll hold off applying anything > pending this though. > > Stephen
On Fri, 2018-08-10 at 18:00 +1000, Stewart Smith wrote: > There's two main bits that are really expensive when composing the list > of patches for a project: the query getting the list, and the query > finding the series for each patch. > > If we look at the query getting the list, it gets a lot of unnesseccary > fields such as 'headers' and 'content', even though we tell Django not > to. It turns out that Django seems to ignore the Submission relationship > and I have no idea how to force it to ignore that thing (defer doesn't > work) but if we go only, then it works okay. > > From my import of ~8000 messages for a few projects, my laptop query > time (MySQL, as setup by whatever the docker-compose things do) goes > from: > > http://localhost:8000/project/linux-kernel/list/ > FROM: > 342ms SQL queries cold cache, 268ms warm cache > TO: > 118ms SQL queries cold cache, 88ms warm cache > > Which is... non trivial to say the least. > > The big jump is the patches.only change, and the removal of ordering > on the patchseries takes a further 10ms off. For some strange reason, it > seems rather hard to tell Django that you don't care what order the > results come back in for that query (if we do, then the db server has to > do a sort rather than just return each row) > > Signed-off-by: Stewart Smith <stewart@linux.ibm.com> Tested this and it looks good to me. There's a migration missing (it's instead squashed into the first "add index" patch) but I can fix that at merge time. I was concerned that the combination of '.only' followed immediately by '.prefetch_related' (for checks and series) would be an issue and indeed it seems that, at this patch set at least, we do generate a query per check. However, this is no different to before and it resolved in a later check. As such: Reviewed-by: Stephen Finucane <stephen@that.guru> I'm not going to push this (or any of the series) yet in case Daniel wants to weigh in but will do so next week if there are no complaints. Stephen > --- > patchwork/models.py | 1 - > patchwork/views/__init__.py | 2 ++ > 2 files changed, 2 insertions(+), 1 deletion(-) > > diff --git a/patchwork/models.py b/patchwork/models.py > index 6268f5b72e3c..d2b88fc48c91 100644 > --- a/patchwork/models.py > +++ b/patchwork/models.py > @@ -747,7 +747,6 @@ class Series(FilenameMixin, models.Model): > return self.name if self.name else 'Untitled series #%d' % self.id > > class Meta: > - ordering = ('date',) > verbose_name_plural = 'Series' > > > diff --git a/patchwork/views/__init__.py b/patchwork/views/__init__.py > index f8d23a388ac7..96fd0798af5a 100644 > --- a/patchwork/views/__init__.py > +++ b/patchwork/views/__init__.py > @@ -287,6 +287,8 @@ def generic_list(request, project, view, view_args=None, filter_settings=None, > # rendering the list template > patches = patches.select_related('state', 'submitter', 'delegate') > > + patches = patches.only('state','submitter','delegate','project','name','date') > + > # we also need checks and series > patches = patches.prefetch_related('check_set', 'series') >
Stephen Finucane <stephen@that.guru> writes: > On Fri, 2018-08-10 at 18:00 +1000, Stewart Smith wrote: >> There's two main bits that are really expensive when composing the list >> of patches for a project: the query getting the list, and the query >> finding the series for each patch. >> >> If we look at the query getting the list, it gets a lot of unnesseccary >> fields such as 'headers' and 'content', even though we tell Django not >> to. It turns out that Django seems to ignore the Submission relationship >> and I have no idea how to force it to ignore that thing (defer doesn't >> work) but if we go only, then it works okay. >> >> From my import of ~8000 messages for a few projects, my laptop query >> time (MySQL, as setup by whatever the docker-compose things do) goes >> from: >> >> http://localhost:8000/project/linux-kernel/list/ >> FROM: >> 342ms SQL queries cold cache, 268ms warm cache >> TO: >> 118ms SQL queries cold cache, 88ms warm cache >> >> Which is... non trivial to say the least. >> >> The big jump is the patches.only change, and the removal of ordering >> on the patchseries takes a further 10ms off. For some strange reason, it >> seems rather hard to tell Django that you don't care what order the >> results come back in for that query (if we do, then the db server has to >> do a sort rather than just return each row) >> >> Signed-off-by: Stewart Smith <stewart@linux.ibm.com> > > Tested this and it looks good to me. There's a migration missing (it's > instead squashed into the first "add index" patch) but I can fix that > at merge time. Ahh, sorry about that. > I was concerned that the combination of '.only' followed immediately by > '.prefetch_related' (for checks and series) would be an issue and > indeed it seems that, at this patch set at least, we do generate a > query per check. However, this is no different to before and it > resolved in a later check. As such: > > Reviewed-by: Stephen Finucane <stephen@that.guru> > > I'm not going to push this (or any of the series) yet in case Daniel > wants to weigh in but will do so next week if there are no complaints. Thanks for testing it all, increases my confidence with the patches as well :)
Stephen Finucane <stephen@that.guru> writes: > On Fri, 2018-08-10 at 18:00 +1000, Stewart Smith wrote: >> There's two main bits that are really expensive when composing the list >> of patches for a project: the query getting the list, and the query >> finding the series for each patch. >> >> If we look at the query getting the list, it gets a lot of unnesseccary >> fields such as 'headers' and 'content', even though we tell Django not >> to. It turns out that Django seems to ignore the Submission relationship >> and I have no idea how to force it to ignore that thing (defer doesn't >> work) but if we go only, then it works okay. >> >> From my import of ~8000 messages for a few projects, my laptop query >> time (MySQL, as setup by whatever the docker-compose things do) goes >> from: >> >> http://localhost:8000/project/linux-kernel/list/ >> FROM: >> 342ms SQL queries cold cache, 268ms warm cache >> TO: >> 118ms SQL queries cold cache, 88ms warm cache >> >> Which is... non trivial to say the least. >> >> The big jump is the patches.only change, and the removal of ordering >> on the patchseries takes a further 10ms off. For some strange reason, it >> seems rather hard to tell Django that you don't care what order the >> results come back in for that query (if we do, then the db server has to >> do a sort rather than just return each row) >> >> Signed-off-by: Stewart Smith <stewart@linux.ibm.com> > > Tested this and it looks good to me. There's a migration missing (it's > instead squashed into the first "add index" patch) but I can fix that > at merge time. > > I was concerned that the combination of '.only' followed immediately by > '.prefetch_related' (for checks and series) would be an issue and > indeed it seems that, at this patch set at least, we do generate a > query per check. However, this is no different to before and it > resolved in a later check. As such: > > Reviewed-by: Stephen Finucane <stephen@that.guru> > > I'm not going to push this (or any of the series) yet in case Daniel > wants to weigh in but will do so next week if there are no complaints. I'm not likely to find the time to do so, so if you're happy I'm happy :) Merge at will! Regards, Daniel > > Stephen > >> --- >> patchwork/models.py | 1 - >> patchwork/views/__init__.py | 2 ++ >> 2 files changed, 2 insertions(+), 1 deletion(-) >> >> diff --git a/patchwork/models.py b/patchwork/models.py >> index 6268f5b72e3c..d2b88fc48c91 100644 >> --- a/patchwork/models.py >> +++ b/patchwork/models.py >> @@ -747,7 +747,6 @@ class Series(FilenameMixin, models.Model): >> return self.name if self.name else 'Untitled series #%d' % self.id >> >> class Meta: >> - ordering = ('date',) >> verbose_name_plural = 'Series' >> >> >> diff --git a/patchwork/views/__init__.py b/patchwork/views/__init__.py >> index f8d23a388ac7..96fd0798af5a 100644 >> --- a/patchwork/views/__init__.py >> +++ b/patchwork/views/__init__.py >> @@ -287,6 +287,8 @@ def generic_list(request, project, view, view_args=None, filter_settings=None, >> # rendering the list template >> patches = patches.select_related('state', 'submitter', 'delegate') >> >> + patches = patches.only('state','submitter','delegate','project','name','date') >> + >> # we also need checks and series >> patches = patches.prefetch_related('check_set', 'series') >> > > > _______________________________________________ > Patchwork mailing list > Patchwork@lists.ozlabs.org > https://lists.ozlabs.org/listinfo/patchwork
On Mon, 2018-09-10 at 23:26 +1000, Daniel Axtens wrote: > Stephen Finucane <stephen@that.guru> writes: > > > On Fri, 2018-08-10 at 18:00 +1000, Stewart Smith wrote: > > > There's two main bits that are really expensive when composing the list > > > of patches for a project: the query getting the list, and the query > > > finding the series for each patch. > > > > > > If we look at the query getting the list, it gets a lot of unnesseccary > > > fields such as 'headers' and 'content', even though we tell Django not > > > to. It turns out that Django seems to ignore the Submission relationship > > > and I have no idea how to force it to ignore that thing (defer doesn't > > > work) but if we go only, then it works okay. > > > > > > From my import of ~8000 messages for a few projects, my laptop query > > > time (MySQL, as setup by whatever the docker-compose things do) goes > > > from: > > > > > > http://localhost:8000/project/linux-kernel/list/ > > > FROM: > > > 342ms SQL queries cold cache, 268ms warm cache > > > TO: > > > 118ms SQL queries cold cache, 88ms warm cache > > > > > > Which is... non trivial to say the least. > > > > > > The big jump is the patches.only change, and the removal of ordering > > > on the patchseries takes a further 10ms off. For some strange reason, it > > > seems rather hard to tell Django that you don't care what order the > > > results come back in for that query (if we do, then the db server has to > > > do a sort rather than just return each row) > > > > > > Signed-off-by: Stewart Smith <stewart@linux.ibm.com> > > > > Tested this and it looks good to me. There's a migration missing (it's > > instead squashed into the first "add index" patch) but I can fix that > > at merge time. > > > > I was concerned that the combination of '.only' followed immediately by > > '.prefetch_related' (for checks and series) would be an issue and > > indeed it seems that, at this patch set at least, we do generate a > > query per check. However, this is no different to before and it > > resolved in a later check. As such: > > > > Reviewed-by: Stephen Finucane <stephen@that.guru> > > > > I'm not going to push this (or any of the series) yet in case Daniel > > wants to weigh in but will do so next week if there are no complaints. > > I'm not likely to find the time to do so, so if you're happy I'm happy :) > Merge at will! Sweet. Have pushed all of this now so. Stephen > Regards, > Daniel > > > > > Stephen > > > > > --- > > > patchwork/models.py | 1 - > > > patchwork/views/__init__.py | 2 ++ > > > 2 files changed, 2 insertions(+), 1 deletion(-) > > > > > > diff --git a/patchwork/models.py b/patchwork/models.py > > > index 6268f5b72e3c..d2b88fc48c91 100644 > > > --- a/patchwork/models.py > > > +++ b/patchwork/models.py > > > @@ -747,7 +747,6 @@ class Series(FilenameMixin, models.Model): > > > return self.name if self.name else 'Untitled series #%d' % self.id > > > > > > class Meta: > > > - ordering = ('date',) > > > verbose_name_plural = 'Series' > > > > > > > > > diff --git a/patchwork/views/__init__.py b/patchwork/views/__init__.py > > > index f8d23a388ac7..96fd0798af5a 100644 > > > --- a/patchwork/views/__init__.py > > > +++ b/patchwork/views/__init__.py > > > @@ -287,6 +287,8 @@ def generic_list(request, project, view, view_args=None, filter_settings=None, > > > # rendering the list template > > > patches = patches.select_related('state', 'submitter', 'delegate') > > > > > > + patches = patches.only('state','submitter','delegate','project','name','date') > > > + > > > # we also need checks and series > > > patches = patches.prefetch_related('check_set', 'series') > > > > > > > > > _______________________________________________ > > Patchwork mailing list > > Patchwork@lists.ozlabs.org > > https://lists.ozlabs.org/listinfo/patchwork
diff --git a/patchwork/models.py b/patchwork/models.py index 6268f5b72e3c..d2b88fc48c91 100644 --- a/patchwork/models.py +++ b/patchwork/models.py @@ -747,7 +747,6 @@ class Series(FilenameMixin, models.Model): return self.name if self.name else 'Untitled series #%d' % self.id class Meta: - ordering = ('date',) verbose_name_plural = 'Series' diff --git a/patchwork/views/__init__.py b/patchwork/views/__init__.py index f8d23a388ac7..96fd0798af5a 100644 --- a/patchwork/views/__init__.py +++ b/patchwork/views/__init__.py @@ -287,6 +287,8 @@ def generic_list(request, project, view, view_args=None, filter_settings=None, # rendering the list template patches = patches.select_related('state', 'submitter', 'delegate') + patches = patches.only('state','submitter','delegate','project','name','date') + # we also need checks and series patches = patches.prefetch_related('check_set', 'series')
There's two main bits that are really expensive when composing the list of patches for a project: the query getting the list, and the query finding the series for each patch. If we look at the query getting the list, it gets a lot of unnesseccary fields such as 'headers' and 'content', even though we tell Django not to. It turns out that Django seems to ignore the Submission relationship and I have no idea how to force it to ignore that thing (defer doesn't work) but if we go only, then it works okay. From my import of ~8000 messages for a few projects, my laptop query time (MySQL, as setup by whatever the docker-compose things do) goes from: http://localhost:8000/project/linux-kernel/list/ FROM: 342ms SQL queries cold cache, 268ms warm cache TO: 118ms SQL queries cold cache, 88ms warm cache Which is... non trivial to say the least. The big jump is the patches.only change, and the removal of ordering on the patchseries takes a further 10ms off. For some strange reason, it seems rather hard to tell Django that you don't care what order the results come back in for that query (if we do, then the db server has to do a sort rather than just return each row) Signed-off-by: Stewart Smith <stewart@linux.ibm.com> --- patchwork/models.py | 1 - patchwork/views/__init__.py | 2 ++ 2 files changed, 2 insertions(+), 1 deletion(-)