From patchwork Tue Mar 17 13:59:13 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Axtens X-Patchwork-Id: 1256492 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48hblB3YdDz9sSd for ; Wed, 18 Mar 2020 01:53:26 +1100 (AEDT) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=axtens.net Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=axtens.net header.i=@axtens.net header.a=rsa-sha256 header.s=google header.b=O0cH48RH; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 48hbl95VVQzDqRj for ; Wed, 18 Mar 2020 01:53:25 +1100 (AEDT) X-Original-To: patchwork@lists.ozlabs.org Delivered-To: patchwork@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=axtens.net (client-ip=2607:f8b0:4864:20::1036; helo=mail-pj1-x1036.google.com; envelope-from=dja@axtens.net; receiver=) Authentication-Results: lists.ozlabs.org; dmarc=none (p=none dis=none) header.from=axtens.net Authentication-Results: lists.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=axtens.net header.i=@axtens.net header.a=rsa-sha256 header.s=google header.b=O0cH48RH; dkim-atps=neutral Received: from mail-pj1-x1036.google.com (mail-pj1-x1036.google.com [IPv6:2607:f8b0:4864:20::1036]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 48hZXx2DPGzDqWj for ; Wed, 18 Mar 2020 00:59:29 +1100 (AEDT) Received: by mail-pj1-x1036.google.com with SMTP id nu11so7301443pjb.1 for ; Tue, 17 Mar 2020 06:59:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=axtens.net; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=L3+/3evD5dv1pHiLWwLiGzQihuoPYE7zBD95oriGaZs=; b=O0cH48RH2DywEC9MayD5CWe4kXj/vayc0j/K0iS8ywuI0e3Cay0zoBTu02VbFtnVPO 7Q7f0AG9Qx9m9w6m8j+NFaw0+76KQBpWVadQ6HLfeQlrhGbbuA4x79Xl5RS6FoALuN+F 6OQri7qHcV1XcmCAsXoEiHxjgNSyCS7DDKc1Y= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=L3+/3evD5dv1pHiLWwLiGzQihuoPYE7zBD95oriGaZs=; b=F96CBFd1zL7rjDOBLOKBPUgpo3ORsJxAH4v9Avl3A7Q9lGpjl5ykxaJxh7A7MB9ph/ 8oa7J9gUNIBTfspkkgNUc3OM0lVEDpc7Yv4JCtOsDpi47dxXZrV33X+EbVcKcVH50v7C xQVRQSq+2lU2RO+c9TfKDP4g+3+f3PDRKTyVuEqJbqwkAT51uREfyDsrvwa59QuftVzW AaXvuobx8CXEW03eWZ3i9kIKjlS2k0Z/cUlKeVdAaDebC/BiOvNT68WZOrk/Mzx1zzWO iQ1QYX2CyG5bVcl0Qv1LLNh5Dhb0gSD6QiKE5IlX7Fxt7EOqFFXxSbXFXcj9D+0sVmNh ipIQ== X-Gm-Message-State: ANhLgQ2iuZgV2sz6pxPaFcLCvOSzbGYbCPmPSTbg/Kry6dB9cushrtn9 h96O9c3PNn0wdzk0SEpIqD/PC7rPpDE= X-Google-Smtp-Source: ADFU+vvzE4k7EChhspXqjuOta/J9gpm+KdQF5PtTBVdXIz+PWs6dYGROu4BmRlALeBmvsnZwsWVQYQ== X-Received: by 2002:a17:902:ba93:: with SMTP id k19mr2289659pls.73.1584453566273; Tue, 17 Mar 2020 06:59:26 -0700 (PDT) Received: from localhost (2001-44b8-111e-5c00-11e1-e7cb-3c10-05d6.static.ipv6.internode.on.net. [2001:44b8:111e:5c00:11e1:e7cb:3c10:5d6]) by smtp.gmail.com with ESMTPSA id x71sm3447796pfd.129.2020.03.17.06.59.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 17 Mar 2020 06:59:25 -0700 (PDT) From: Daniel Axtens To: patchwork@lists.ozlabs.org Subject: [PATCH 1/4] REST: massively improve the patch counting query under filters Date: Wed, 18 Mar 2020 00:59:13 +1100 Message-Id: <20200317135916.13691-2-dja@axtens.net> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200317135916.13691-1-dja@axtens.net> References: <20200317135916.13691-1-dja@axtens.net> MIME-Version: 1.0 X-BeenThere: patchwork@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Patchwork development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: patchwork-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Patchwork" The DRF web view counts the patches as part of pagination. The query it uses is a disaster zone: SELECT COUNT(*) FROM (SELECT DISTINCT `patchwork_submission`.`id` AS Col1, `patchwork_submission`.`msgid` AS Col2, `patchwork_submission`.`date` AS Col3, `patchwork_submission`.`submitter_id` AS Col4, `patchwork_submission`.`project_id` AS Col5, `patchwork_submission`.`name` AS Col6, `patchwork_patch`.`submission_ptr_id` AS Col7, `patchwork_patch`.`commit_ref` AS Col8, `patchwork_patch`.`pull_url` AS Col9, `patchwork_patch`.`delegate_id` AS Col10, `patchwork_patch`.`state_id` AS Col11, `patchwork_patch`.`archived` AS Col12, `patchwork_patch`.`hash` AS Col13, `patchwork_patch`.`patch_project_id` AS Col14, `patchwork_patch`.`series_id` AS Col15, `patchwork_patch`.`number` AS Col16, `patchwork_patch`.`related_id` AS Col17 FROM `patchwork_patch` INNER JOIN `patchwork_submission` ON (`patchwork_patch`.`submission_ptr_id` = `patchwork_submission`.`id`) WHERE `patchwork_submission`.`project_id` = 1) subquery This is because django-filters adds a DISTINCT qualifier on a ModelMultiChoiceFilter by default. I guess it makes sense and they do a decent job of justifying it, but it causes the count to be made with this awful subquery. (The justification is that they don't know if you're filtering on a to-many relationship, in which case there could be duplicate values that need to be removed.) While fixing that, we can also tell the filter to filter on patch_project rather than submission's project, which allows us in some cases to avoid the join entirely. The resultant SQL is beautiful when filtering by project only: SELECT COUNT(*) AS `__count` FROM `patchwork_patch` WHERE `patchwork_patch`.`patch_project_id` = 1 On my test setup (2x canonical kernel mailing list in the db, warm cache, my laptop) this query goes from >1s to ~10ms, a ~100x improvement. If we filter by project and date the query is still nice, but still also very slow: SELECT COUNT(*) AS `__count` FROM `patchwork_patch` INNER JOIN `patchwork_submission` ON (`patchwork_patch`.`submission_ptr_id` = `patchwork_submission`.`id`) WHERE (`patchwork_patch`.`patch_project_id` = 1 AND `patchwork_submission`.`date` >= '2010-11-01 00:00:00') This us from ~1.3s to a bit under 400ms - still not ideal, but I'll take the 3x improvement! Reported-by: Konstantin Ryabitsev Signed-off-by: Daniel Axtens --- patchwork/api/filters.py | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/patchwork/api/filters.py b/patchwork/api/filters.py index 60f1a3634fcf..a3b6f03c6ccd 100644 --- a/patchwork/api/filters.py +++ b/patchwork/api/filters.py @@ -188,7 +188,8 @@ class CoverLetterFilterSet(TimestampMixin, BaseFilterSet): class PatchFilterSet(TimestampMixin, BaseFilterSet): - project = ProjectFilter(queryset=Project.objects.all()) + project = ProjectFilter(queryset=Project.objects.all(), distinct=False, + field_name='patch_project') # NOTE(stephenfin): We disable the select-based HTML widgets for these # filters as the resulting query is _huge_ series = BaseFilter(queryset=Series.objects.all(),