From patchwork Sat Dec 18 21:19:56 2021
X-Patchwork-Submitter: Yury Norov
X-Patchwork-Id: 1570627
From: Yury Norov
To: linux-kernel@vger.kernel.org, Yury Norov, "James E.J. Bottomley",
 "Martin K. Petersen", Michał Mirosław, "Paul E. McKenney",
 "Rafael J. Wysocki", Alexander Shishkin, Alexey Klimov, Amitkumar Karwar,
 Andi Kleen, Andrew Lunn, Andrew Morton, Andy Gross, Andy Lutomirski,
 Andy Shevchenko, Anup Patel, Ard Biesheuvel, Arnaldo Carvalho de Melo,
 Arnd Bergmann, Borislav Petkov, Catalin Marinas, Christoph Hellwig,
 Christoph Lameter, Daniel Vetter, Dave Hansen, David Airlie, David Laight,
 Dennis Zhou, Emil Renner Berthing, Geert Uytterhoeven, Geetha sowjanya,
 Greg Kroah-Hartman, Guo Ren, Hans de Goede, Heiko Carstens, Ian Rogers,
 Ingo Molnar, Jakub Kicinski, Jason Wessel, Jens Axboe, Jiri Olsa,
 Joe Perches, Jonathan Cameron, Juri Lelli, Kees Cook, Krzysztof Kozlowski,
 Lee Jones, Marc Zyngier, Marcin Wojtas, Mark Gross, Mark Rutland,
 Matti Vaittinen, Mauro Carvalho Chehab, Mel Gorman, Michael Ellerman,
 Mike Marciniszyn, Nicholas Piggin, Palmer Dabbelt, Peter Zijlstra,
 Petr Mladek, Randy Dunlap, Rasmus Villemoes, Russell King, Saeed Mahameed,
 Sagi Grimberg, Sergey Senozhatsky, Solomon Peachy, Stephen Boyd,
 Stephen Rothwell, Steven Rostedt, Subbaraya Sundeep, Sudeep Holla,
 Sunil Goutham, Tariq Toukan, Tejun Heo, Thomas Bogendoerfer,
 Thomas Gleixner, Ulf Hansson, Vincent Guittot, Vineet Gupta, Viresh Kumar,
 Vivien Didelot, Vlastimil Babka, Will Deacon,
 bcm-kernel-feedback-list@broadcom.com, kvm@vger.kernel.org,
 linux-alpha@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linux-crypto@vger.kernel.org, linux-csky@vger.kernel.org,
 linux-ia64@vger.kernel.org, linux-mips@vger.kernel.org, linux-mm@kvack.org,
 linux-perf-users@vger.kernel.org, linux-riscv@lists.infradead.org,
 linux-s390@vger.kernel.org, linux-snps-arc@lists.infradead.org,
 linuxppc-dev@lists.ozlabs.org
Subject: [PATCH v2 00/17] lib/bitmap: optimize bitmap_weight() usage
Date: Sat, 18 Dec 2021 13:19:56 -0800
Message-Id: <20211218212014.1315894-1-yury.norov@gmail.com>
X-Mailer: git-send-email 2.30.2

In many cases people use bitmap_weight()-based functions to compare the
result against a number or an expression:

	if (cpumask_weight(...) > 1)
		do_something();

This may take a considerable amount of time on many-CPU machines, because
cpumask_weight(...) traverses every word of the underlying cpumask
unconditionally.

We can significantly improve on this in many real cases if we stop
traversing the mask as soon as the count of CPUs exceeds 1:

	if (cpumask_weight_gt(..., 1))
		do_something();

To implement this idea, the series adds the bitmap_weight_cmp() function
and the bitmap_weight_{eq,gt,ge,lt,le} macros on top of it, along with the
corresponding wrappers in cpumask and nodemask.

There are three cpumasks for which the weight is counted frequently:
possible, present and active.
They are all read-mostly, and to optimize counting the number of set bits
in them, this series adds atomic counters, similar to what is already done
for the online cpumask.

v1: https://lkml.org/lkml/2021/11/27/339
v2:
  - add bitmap_weight_cmp();
  - fix bitmap_weight_le semantics and provide the full set of
    {eq,gt,ge,lt,le} as wrappers around bitmap_weight_cmp();
  - don't touch small bitmaps (less than 32 bits) - the optimization works
    only for large bitmaps;
  - move the bitmap_weight() == 0 -> bitmap_empty() conversion to a
    separate patch, ditto cpumask_weight() and nodes_weight;
  - add counters for possible, present and active cpus;
  - drop bitmap_empty() where possible;
  - various fixes around bit counting that caught my eye.

Yury Norov (17):
  all: don't use bitmap_weight() where possible
  drivers: rename num_*_cpus variables
  fix open-coded for_each_set_bit()
  all: replace bitmap_weight with bitmap_empty where appropriate
  all: replace cpumask_weight with cpumask_empty where appropriate
  all: replace nodes_weight with nodes_empty where appropriate
  lib/bitmap: add bitmap_weight_{cmp,eq,gt,ge,lt,le} functions
  all: replace bitmap_weight with bitmap_weight_{eq,gt,ge,lt,le} where
    appropriate
  lib/cpumask: add cpumask_weight_{eq,gt,ge,lt,le}
  lib/nodemask: add nodemask_weight_{eq,gt,ge,lt,le}
  lib/nodemask: add num_node_state_eq()
  kernel/cpu.c: fix init_cpu_online
  kernel/cpu: add num_possible_cpus counter
  kernel/cpu: add num_present_cpu counter
  kernel/cpu: add num_active_cpu counter
  tools/bitmap: sync bitmap_weight
  MAINTAINERS: add cpumask and nodemask files to BITMAP_API

 MAINTAINERS                                  |   4 +
 arch/alpha/kernel/process.c                  |   2 +-
 arch/ia64/kernel/setup.c                     |   2 +-
 arch/ia64/mm/tlb.c                           |   2 +-
 arch/mips/cavium-octeon/octeon-irq.c         |   4 +-
 arch/mips/kernel/crash.c                     |   2 +-
 arch/nds32/kernel/perf_event_cpu.c           |   2 +-
 arch/powerpc/kernel/smp.c                    |   2 +-
 arch/powerpc/kernel/watchdog.c               |   2 +-
 arch/powerpc/xmon/xmon.c                     |   4 +-
 arch/s390/kernel/perf_cpum_cf.c              |   2 +-
 arch/x86/kernel/cpu/resctrl/rdtgroup.c       |  16 +--
 arch/x86/kernel/smpboot.c                    |   4 +-
 arch/x86/kvm/hyperv.c                        |   8 +-
 arch/x86/mm/amdtopology.c                    |   2 +-
 arch/x86/mm/mmio-mod.c                       |   2 +-
 arch/x86/mm/numa_emulation.c                 |   4 +-
 arch/x86/platform/uv/uv_nmi.c                |   2 +-
 drivers/acpi/numa/srat.c                     |   2 +-
 drivers/cpufreq/qcom-cpufreq-hw.c            |   2 +-
 drivers/cpufreq/scmi-cpufreq.c               |   2 +-
 drivers/firmware/psci/psci_checker.c         |   2 +-
 drivers/gpu/drm/i915/i915_pmu.c              |   2 +-
 drivers/gpu/drm/msm/disp/mdp5/mdp5_smp.c     |   2 +-
 drivers/hv/channel_mgmt.c                    |   4 +-
 drivers/iio/dummy/iio_simple_dummy_buffer.c  |   4 +-
 drivers/iio/industrialio-trigger.c           |   2 +-
 drivers/infiniband/hw/hfi1/affinity.c        |  13 +-
 drivers/infiniband/hw/qib/qib_file_ops.c     |   2 +-
 drivers/infiniband/hw/qib/qib_iba7322.c      |   2 +-
 drivers/irqchip/irq-bcm6345-l1.c             |   2 +-
 drivers/leds/trigger/ledtrig-cpu.c           |   6 +-
 drivers/memstick/core/ms_block.c             |   4 +-
 drivers/net/dsa/b53/b53_common.c             |   6 +-
 drivers/net/ethernet/broadcom/bcmsysport.c   |   6 +-
 .../net/ethernet/intel/ice/ice_virtchnl_pf.c |   4 +-
 .../net/ethernet/intel/ixgbe/ixgbe_sriov.c   |   2 +-
 .../marvell/octeontx2/nic/otx2_ethtool.c     |   2 +-
 .../marvell/octeontx2/nic/otx2_flows.c       |   8 +-
 .../ethernet/marvell/octeontx2/nic/otx2_pf.c |   2 +-
 drivers/net/ethernet/mellanox/mlx4/cmd.c     |  33 ++---
 drivers/net/ethernet/mellanox/mlx4/eq.c      |   4 +-
 drivers/net/ethernet/mellanox/mlx4/fw.c      |   4 +-
 drivers/net/ethernet/mellanox/mlx4/main.c    |   2 +-
 drivers/net/ethernet/qlogic/qed/qed_rdma.c   |   4 +-
 drivers/net/ethernet/qlogic/qed/qed_roce.c   |   2 +-
 drivers/perf/arm-cci.c                       |   2 +-
 drivers/perf/arm_pmu.c                       |   4 +-
 drivers/perf/hisilicon/hisi_uncore_pmu.c     |   2 +-
 drivers/perf/thunderx2_pmu.c                 |   4 +-
 drivers/perf/xgene_pmu.c                     |   2 +-
 drivers/scsi/lpfc/lpfc_init.c                |   2 +-
 drivers/scsi/storvsc_drv.c                   |   6 +-
 drivers/soc/fsl/qbman/qman_test_stash.c      |   2 +-
 drivers/staging/media/tegra-video/vi.c       |   2 +-
 drivers/thermal/intel/intel_powerclamp.c     |   9 +-
 include/linux/bitmap.h                       |  80 +++++++++++
 include/linux/cpumask.h                      | 131 +++++++++++++-----
 include/linux/nodemask.h                     |  40 ++++++
 kernel/cpu.c                                 |  54 ++++++++
 kernel/irq/affinity.c                        |   2 +-
 kernel/padata.c                              |   2 +-
 kernel/rcu/tree_nocb.h                       |   4 +-
 kernel/rcu/tree_plugin.h                     |   2 +-
 kernel/sched/core.c                          |  10 +-
 kernel/sched/topology.c                      |   4 +-
 kernel/time/clockevents.c                    |   2 +-
 kernel/time/clocksource.c                    |   2 +-
 lib/bitmap.c                                 |  21 +++
 mm/mempolicy.c                               |   2 +-
 mm/page_alloc.c                              |   2 +-
 mm/vmstat.c                                  |   4 +-
 tools/include/linux/bitmap.h                 |  44 ++++++
 tools/lib/bitmap.c                           |  20 +++
 tools/perf/builtin-c2c.c                     |   4 +-
 tools/perf/util/pmu.c                        |   2 +-
 76 files changed, 480 insertions(+), 183 deletions(-)
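
For illustration only, the early-exit comparison described in the cover
letter could be sketched in user-space C roughly as follows. This is not
the code from the series: the sk_* names, the SK_BITS_PER_LONG macro and
the portable popcount helper are all hypothetical stand-ins, and the real
bitmap_weight_cmp() may differ in signature and in how it short-circuits.
The sketch only shows why a "greater than N" test can stop scanning as
soon as the running count exceeds N.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's BITS_PER_LONG. */
#define SK_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Portable popcount of one word: clear the lowest set bit until zero. */
static unsigned int sk_popcount(unsigned long w)
{
	unsigned int n = 0;

	while (w) {
		w &= w - 1;
		n++;
	}
	return n;
}

/*
 * Compare the number of set bits in the first nbits of map with num.
 * Returns a negative value, zero, or a positive value if the weight is
 * less than, equal to, or greater than num. The scan stops as soon as
 * the running count exceeds num, so the remaining words are never read.
 */
static int sk_weight_cmp(const unsigned long *map, size_t nbits,
			 unsigned int num)
{
	size_t full = nbits / SK_BITS_PER_LONG;
	size_t rem = nbits % SK_BITS_PER_LONG;
	unsigned int w = 0;
	size_t k;

	assert(map != NULL || nbits == 0);

	for (k = 0; k < full; k++) {
		w += sk_popcount(map[k]);
		if (w > num)
			return 1;	/* early exit: answer is already known */
	}

	/* Count only the valid low bits of a partial last word. */
	if (rem)
		w += sk_popcount(map[k] & ((1UL << rem) - 1));

	if (w > num)
		return 1;
	return w < num ? -1 : 0;
}

/* Thin wrappers in the spirit of the {eq,gt,...} helpers. */
static int sk_weight_gt(const unsigned long *map, size_t nbits,
			unsigned int num)
{
	return sk_weight_cmp(map, nbits, num) > 0;
}

static int sk_weight_eq(const unsigned long *map, size_t nbits,
			unsigned int num)
{
	return sk_weight_cmp(map, nbits, num) == 0;
}
```

With this shape, a check such as sk_weight_gt(mask, nbits, 1) on a large,
densely populated mask usually returns after the first word, whereas a
full weight computation would visit every word before comparing.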