{"id":2234978,"url":"http://patchwork.ozlabs.org/api/1.2/patches/2234978/?format=json","web_url":"http://patchwork.ozlabs.org/project/linux-ext4/patch/20260508121539.4174601-2-libaokun@linux.alibaba.com/","project":{"id":8,"url":"http://patchwork.ozlabs.org/api/1.2/projects/8/?format=json","name":"Linux ext4 filesystem development","link_name":"linux-ext4","list_id":"linux-ext4.vger.kernel.org","list_email":"linux-ext4@vger.kernel.org","web_url":null,"scm_url":null,"webscm_url":null,"list_archive_url":"","list_archive_url_format":"","commit_url_format":""},"msgid":"<20260508121539.4174601-2-libaokun@linux.alibaba.com>","list_archive_url":null,"date":"2026-05-08T12:15:23","name":"[RFC,01/17] lib/crc: add crc32c_flip_range() for incremental CRC update","commit_ref":null,"pull_url":null,"state":"new","archived":false,"hash":"d5f5f6f966583ddc7246a60d6b31d407074a79c8","submitter":{"id":92757,"url":"http://patchwork.ozlabs.org/api/1.2/people/92757/?format=json","name":"Baokun Li","email":"libaokun@linux.alibaba.com"},"delegate":null,"mbox":"http://patchwork.ozlabs.org/project/linux-ext4/patch/20260508121539.4174601-2-libaokun@linux.alibaba.com/mbox/","series":[{"id":503377,"url":"http://patchwork.ozlabs.org/api/1.2/series/503377/?format=json","web_url":"http://patchwork.ozlabs.org/project/linux-ext4/list/?series=503377","date":"2026-05-08T12:15:22","name":"ext4/lib-crc: LBS performance part 1 - incremental CRC32c for bitmap checksums","version":1,"mbox":"http://patchwork.ozlabs.org/series/503377/mbox/"}],"comments":"http://patchwork.ozlabs.org/api/patches/2234978/comments/","check":"pending","checks":"http://patchwork.ozlabs.org/api/patches/2234978/checks/","tags":{},"related":[],"headers":{"Return-Path":"\n <SRS0=SThK=DF=vger.kernel.org=linux-ext4+bounces-16365-patchwork-incoming=ozlabs.org@ozlabs.org>","X-Original-To":["incoming@patchwork.ozlabs.org","linux-ext4@vger.kernel.org"],"Delivered-To":["patchwork-incoming@legolas.ozlabs.org","patchwork-incoming@ozlabs.org"],"Authentication-Results":["legolas.ozlabs.org;\n\tdkim=pass (1024-bit key;\n unprotected) header.d=linux.alibaba.com header.i=@linux.alibaba.com\n header.a=rsa-sha256 header.s=default header.b=Sf2QXxMz;\n\tdkim-atps=neutral","legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=ozlabs.org\n (client-ip=2404:9400:2221:ea00::3; helo=mail.ozlabs.org;\n envelope-from=srs0=sthk=df=vger.kernel.org=linux-ext4+bounces-16365-patchwork-incoming=ozlabs.org@ozlabs.org;\n receiver=patchwork.ozlabs.org)","gandalf.ozlabs.org;\n arc=pass smtp.remote-ip=\"2600:3c04:e001:36c::12fc:5321\"\n arc.chain=subspace.kernel.org","gandalf.ozlabs.org;\n dmarc=pass (p=none dis=none) header.from=linux.alibaba.com","gandalf.ozlabs.org;\n\tdkim=pass (1024-bit key;\n unprotected) header.d=linux.alibaba.com header.i=@linux.alibaba.com\n header.a=rsa-sha256 header.s=default header.b=Sf2QXxMz;\n\tdkim-atps=neutral","gandalf.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org\n (client-ip=2600:3c04:e001:36c::12fc:5321; helo=tor.lore.kernel.org;\n envelope-from=linux-ext4+bounces-16365-patchwork-incoming=ozlabs.org@vger.kernel.org;\n receiver=ozlabs.org)","smtp.subspace.kernel.org;\n\tdkim=pass (1024-bit key) header.d=linux.alibaba.com\n header.i=@linux.alibaba.com header.b=\"Sf2QXxMz\"","smtp.subspace.kernel.org;\n arc=none smtp.client-ip=115.124.30.111","smtp.subspace.kernel.org;\n dmarc=pass (p=none dis=none) header.from=linux.alibaba.com","smtp.subspace.kernel.org;\n spf=pass smtp.mailfrom=linux.alibaba.com"],"Received":["from mail.ozlabs.org (mail.ozlabs.org [IPv6:2404:9400:2221:ea00::3])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange x25519 server-signature ECDSA (secp384r1 raw public key)\n server-digest SHA384)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4gBp4k5QBxz1yKm\n\tfor <incoming@patchwork.ozlabs.org>; Fri, 08 May 2026 22:16:21 +1000 (AEST)","from mail.ozlabs.org (mail.ozlabs.org [IPv6:2404:9400:2221:ea00::3])\n\tby gandalf.ozlabs.org (Postfix) with ESMTP id 4gBp4j2MfSz4wJg\n\tfor <incoming@patchwork.ozlabs.org>; Fri, 08 May 2026 22:16:21 +1000 (AEST)","by gandalf.ozlabs.org (Postfix)\n\tid 4gBp4j2HKHz4wKS; Fri, 08 May 2026 22:16:21 +1000 (AEST)","from tor.lore.kernel.org (tor.lore.kernel.org\n [IPv6:2600:3c04:e001:36c::12fc:5321])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange x25519)\n\t(No client certificate requested)\n\tby gandalf.ozlabs.org (Postfix) with ESMTPS id 4gBp4d6MTyz4wJg\n\tfor <patchwork-incoming@ozlabs.org>; Fri, 08 May 2026 22:16:17 +1000 (AEST)","from smtp.subspace.kernel.org (conduit.subspace.kernel.org\n [100.90.174.1])\n\tby tor.lore.kernel.org (Postfix) with ESMTP id 0F5F23003531\n\tfor <patchwork-incoming@ozlabs.org>; Fri,  8 May 2026 12:16:14 +0000 (UTC)","from localhost.localdomain (localhost.localdomain [127.0.0.1])\n\tby smtp.subspace.kernel.org (Postfix) with ESMTP id 3F2443DA5DC;\n\tFri,  8 May 2026 12:16:13 +0000 (UTC)","from out30-111.freemail.mail.aliyun.com\n (out30-111.freemail.mail.aliyun.com [115.124.30.111])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))\n\t(No client certificate requested)\n\tby smtp.subspace.kernel.org (Postfix) with ESMTPS id 3745A37FF76;\n\tFri,  8 May 2026 12:16:09 +0000 (UTC)","from\n x31h02109.sqa.na131.tbsite.net(mailfrom:libaokun@linux.alibaba.com\n fp:SMTPD_---0X2XQgnZ_1778242559 cluster:ay36)\n          by smtp.aliyun-inc.com;\n          Fri, 08 May 2026 20:16:00 +0800"],"ARC-Seal":["i=2; a=rsa-sha256; d=ozlabs.org; s=201707; t=1778242581; cv=pass;\n\tb=E8z785dPIJdcx+ujtuew49K05lVvY2grsuJkh0hOnsNSbtgyDCjhxTyS0gwL+r/ZkVqSoWDpH11WXMgQypLAf3O2t/RrcuobYaCunyJ0/s/IQ6++387wLcnoB45xtfZcduMuq1achqBYpUttfFQPO0cNi/2/4xF2apmvwPJ1VuvJ18wxYmV0u1mf5mGaUJZ/1C79DKrVRYOhY/CBM9q92VbbJpFusGakTZazrGCg+0XvrNnT8WyO4f01X1Lf5NpKLMawyim59c6CiP7J2K6dZ817x8GePb7RtIGb5X3NryfCV5jr3ZOdkV0a6hCFtb2Iu6QDgxB2rOarm+WJJUdSEA==","i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;\n\tt=1778242572; cv=none;\n b=PbdMQJYvPaCP0TvQv8dHL/hJYKuNPltI1jXotDCOyZwaF4gbIKf2U6WbM1K3SH9NAxWE0HnvkUqpkOhEAinK5r7ZM3rKAZe5hEjEWZ9/bW9XKHyDv7HmVau8jg1OgSCAHNpR+J9+eEa0pfGWEDdF5hPF5svVnkjuw3Tyv3yZt+s="],"ARC-Message-Signature":["i=2; a=rsa-sha256; d=ozlabs.org; s=201707;\n\tt=1778242581; c=relaxed/relaxed;\n\tbh=zVCvkhkySoy4vLElucDPi/u0WE7sWUYB9QYlCMmSWqI=;\n\th=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References:\n\t MIME-Version;\n b=t3TQDEMhUTKCTKol5NkZOkr6YEXHlhE268YL/HJLP/vaPoNDBnBSEvDlLOM8/nKJX77baryeRCPU7QtKdWBixoQXbmAOCF5NlztZ5irG37fsvmU3jB7uAVSz9L5TYyJVMCsH7VwMDaimQRRGvpdkx4DT2+wDCbvXI883JKNvM8ATJddbSOWnsNv3sjgmHWjyZF5yi0mVDoz/RlvdLolfr9sw3nl7dTzP3Xp4Z5E0XgqV8WPDLlhzVKKmDBAVoDBBaq2YrfJRhK8T20x3xYiO0EvKlGliWNwScFzD7JGoG4ZObZ9NeKSTTZwlnNQDoODJNt6Sm+ohY7oVeMeiVx04pg==","i=1; a=rsa-sha256; d=subspace.kernel.org;\n\ts=arc-20240116; t=1778242572; c=relaxed/simple;\n\tbh=UcUtEoeFLQkTX+9bdpVthUhnxii8XmIfti4ngDriq2g=;\n\th=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References:\n\t MIME-Version;\n b=qxNxr6lX/6bw6qwM+S8k6yTkxvGLu/hYA9YRyMmCNB1ax4VioYQHTVoevqnVnS12IwYrBc4DhiupdNjP/R4tid0DEuagowuL44mWcK4CiYwQZLmHpHo3q4N6t5iQJiFSW809DD+52b3fjpzPwDa4jSRW845FMdJyYhJDRWDU3BI="],"ARC-Authentication-Results":["i=2; gandalf.ozlabs.org;\n dmarc=pass (p=none dis=none) header.from=linux.alibaba.com;\n dkim=pass (1024-bit key;\n unprotected) header.d=linux.alibaba.com header.i=@linux.alibaba.com\n header.a=rsa-sha256 header.s=default header.b=Sf2QXxMz; dkim-atps=neutral;\n spf=pass (client-ip=2600:3c04:e001:36c::12fc:5321; helo=tor.lore.kernel.org;\n envelope-from=linux-ext4+bounces-16365-patchwork-incoming=ozlabs.org@vger.kernel.org;\n receiver=ozlabs.org) smtp.mailfrom=vger.kernel.org","i=1; smtp.subspace.kernel.org;\n dmarc=pass (p=none dis=none) header.from=linux.alibaba.com;\n spf=pass smtp.mailfrom=linux.alibaba.com;\n dkim=pass (1024-bit key) header.d=linux.alibaba.com\n header.i=@linux.alibaba.com header.b=Sf2QXxMz;\n arc=none smtp.client-ip=115.124.30.111"],"DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=linux.alibaba.com; s=default;\n\tt=1778242560; h=From:To:Subject:Date:Message-ID:MIME-Version;\n\tbh=zVCvkhkySoy4vLElucDPi/u0WE7sWUYB9QYlCMmSWqI=;\n\tb=Sf2QXxMzK1F2by5SdGfNtdPpAr9Gg8Q7kUg0hiWwkMwE1fwvLQqPX/KPXwb1rrrCSs9FYeVBiz6499ed8mqocA6XvUdprqJWYF57nFUYiJlzcd9vQ/I+gGMYVqZzhidnSEtX7X0SwsU7uMJv9P+SCijBC0R6BnLfofg3d9MnWXg=","X-Alimail-AntiSpam":"\n AC=PASS;BC=-1|-1;BR=01201311R591e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam011083073210;MF=libaokun@linux.alibaba.com;NM=1;PH=DS;RN=11;SR=0;TI=SMTPD_---0X2XQgnZ_1778242559;","From":"Baokun Li <libaokun@linux.alibaba.com>","To":"linux-ext4@vger.kernel.org","Cc":"linux-crypto@vger.kernel.org,\n\tebiggers@kernel.org,\n\tardb@kernel.org,\n\ttytso@mit.edu,\n\tadilger.kernel@dilger.ca,\n\tjack@suse.cz,\n\tyi.zhang@huawei.com,\n\tojaswin@linux.ibm.com,\n\tritesh.list@gmail.com,\n\tBaokun Li <libaokun@linux.alibaba.com>","Subject":"[PATCH RFC 01/17] lib/crc: add crc32c_flip_range() for incremental\n CRC update","Date":"Fri,  8 May 2026 20:15:23 +0800","Message-ID":"<20260508121539.4174601-2-libaokun@linux.alibaba.com>","X-Mailer":"git-send-email 2.43.7","In-Reply-To":"<20260508121539.4174601-1-libaokun@linux.alibaba.com>","References":"<20260508121539.4174601-1-libaokun@linux.alibaba.com>","Precedence":"bulk","X-Mailing-List":"linux-ext4@vger.kernel.org","List-Id":"<linux-ext4.vger.kernel.org>","List-Subscribe":"<mailto:linux-ext4+subscribe@vger.kernel.org>","List-Unsubscribe":"<mailto:linux-ext4+unsubscribe@vger.kernel.org>","MIME-Version":"1.0","Content-Transfer-Encoding":"8bit","X-Spam-Status":"No, score=-8.7 required=5.0 tests=ARC_SIGNED,ARC_VALID,\n\tDKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DMARC_PASS,\n\tHEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE,\n\tSPF_PASS,UNPARSEABLE_RELAY,USER_IN_DEF_DKIM_WL autolearn=disabled\n\tversion=4.0.1","X-Spam-Checker-Version":"SpamAssassin 4.0.1 (2024-03-25) on gandalf.ozlabs.org"},"content":"When a contiguous range of bits in a buffer is flipped, the CRC32c\nchecksum can be updated incrementally without re-scanning the entire\nbuffer, by exploiting the linearity of CRCs over GF(2):\n\n  New_CRC = Old_CRC ^ CRC(flip_mask << trailing_bits)\n\nIntroduce crc32c_flip_range() which computes this delta using\nprecomputed GF(2) shift matrices and nibble-indexed lookup tables.\nThe implementation decomposes nbits and trailing_bits into\npower-of-2 components and combines them via the CRC concatenation\nproperty:\n\n  CRC(A || B) = shift(CRC(A), len(B)) ^ CRC(B)\n\nThis gives O(log N) complexity with only ~9.8KB of static tables\n(fits in L1 cache).  The current maximum supported buffer size is\n64KB (INCR_MAX_ORDER = 19, i.e. 2^19 bits = 524288 bits = 64KB).\n\nThis is useful for filesystems like ext4, where bitmap updates\ninvolve flipping a contiguous range of bits, and recalculating\nthe full CRC after every update is wasteful.\n\nBenchmark results on Intel Xeon (Ice Lake) with CRC32c hardware\nacceleration:\n\n  bitmap:      1024  2048  4096  8192  16384  32768  65536\n  flip(ns):      48    53    57    63     68     73     78\n  full(ns):      45    88   182   357    709   1421   2853\n  speedup:     0.9x  1.6x  3.1x  5.6x  10.3x  19.3x  36.3x\n\nSigned-off-by: Baokun Li <libaokun@linux.alibaba.com>\n---\n include/linux/crc32.h           |  25 ++++++\n lib/crc/.gitignore              |   2 +\n lib/crc/Makefile                |  13 ++-\n lib/crc/crc32c-incr.c           | 140 +++++++++++++++++++++++++++++++\n lib/crc/gen_crc32c_incr_table.c | 141 ++++++++++++++++++++++++++++++++\n 5 files changed, 318 insertions(+), 3 deletions(-)\n create mode 100644 lib/crc/crc32c-incr.c\n create mode 100644 lib/crc/gen_crc32c_incr_table.c","diff":"diff --git a/include/linux/crc32.h b/include/linux/crc32.h\nindex da78b215ff2e..034f73f0f5dc 100644\n--- a/include/linux/crc32.h\n+++ b/include/linux/crc32.h\n@@ -81,6 +81,31 @@ u32 crc32_be(u32 crc, const void *p, size_t len);\n  */\n u32 crc32c(u32 crc, const void *p, size_t len);\n \n+/**\n+ * crc32c_flip_range - Update CRC32c after flipping a range of bits\n+ * @old_crc:    Existing CRC32c value of the buffer (pre-flip).\n+ * @total_bits: Total size of the buffer in bits (e.g., 524288 for 64KB).\n+ * @bit_off:    Starting bit offset of the modified range.\n+ * @nbits:      Length of the flipped bit sequence.\n+ *\n+ * This function calculates the new CRC32c value when a contiguous range of\n+ * bits is flipped (XORed with 1s) without re-scanning the entire buffer.\n+ * It leverages the linearity of CRCs in Galois Field GF(2):\n+ *\n+ * New_CRC = Old_CRC ^ CRC(Mask_of_Ones << Trailing_Bits)\n+ *\n+ * The complexity is O(log nbits + log trailing_bits), making it\n+ * significantly faster than recomputing the CRC for large buffers.\n+ *\n+ * Note: @total_bits must not exceed 524288 (2^19 bits = 64KB).  Callers\n+ * must ensure that @bit_off + @nbits <= @total_bits.  Behavior is\n+ * undefined if these constraints are violated.\n+ *\n+ * Return: The updated CRC32c value.\n+ */\n+u32 crc32c_flip_range(u32 old_crc, u32 total_bits,\n+\t\t      u32 bit_off, u32 nbits);\n+\n /*\n  * crc32_optimizations() returns flags that indicate which CRC32 library\n  * functions are using architecture-specific optimizations.  Unlike\ndiff --git a/lib/crc/.gitignore b/lib/crc/.gitignore\nindex a9e48103c9fb..4e2b9524426d 100644\n--- a/lib/crc/.gitignore\n+++ b/lib/crc/.gitignore\n@@ -1,5 +1,7 @@\n # SPDX-License-Identifier: GPL-2.0-only\n /crc32table.h\n+/crc32c-incr-table.h\n /crc64table.h\n /gen_crc32table\n+/gen_crc32c_incr_table\n /gen_crc64table\ndiff --git a/lib/crc/Makefile b/lib/crc/Makefile\nindex ff213590e4e3..2c255ac029d0 100644\n--- a/lib/crc/Makefile\n+++ b/lib/crc/Makefile\n@@ -21,7 +21,7 @@ crc-t10dif-$(CONFIG_X86) += x86/crc16-msb-pclmul.o\n endif\n \n obj-$(CONFIG_CRC32) += crc32.o\n-crc32-y := crc32-main.o\n+crc32-y := crc32-main.o crc32c-incr.o\n ifeq ($(CONFIG_CRC32_ARCH),y)\n CFLAGS_crc32-main.o += -I$(src)/$(SRCARCH)\n crc32-$(CONFIG_ARM) += arm/crc32-core.o\n@@ -49,20 +49,27 @@ endif # CONFIG_CRC64_ARCH\n \n obj-y += tests/\n \n-hostprogs := gen_crc32table gen_crc64table\n-clean-files := crc32table.h crc64table.h\n+hostprogs := gen_crc32table gen_crc32c_incr_table gen_crc64table\n+clean-files := crc32table.h crc32c-incr-table.h crc64table.h\n \n $(obj)/crc32-main.o: $(obj)/crc32table.h\n+$(obj)/crc32c-incr.o: $(obj)/crc32c-incr-table.h\n $(obj)/crc64-main.o: $(obj)/crc64table.h\n \n quiet_cmd_crc32 = GEN     $@\n       cmd_crc32 = $< > $@\n \n+quiet_cmd_crc32c_incr = GEN     $@\n+      cmd_crc32c_incr = $< > $@\n+\n quiet_cmd_crc64 = GEN     $@\n       cmd_crc64 = $< > $@\n \n $(obj)/crc32table.h: $(obj)/gen_crc32table\n \t$(call cmd,crc32)\n \n+$(obj)/crc32c-incr-table.h: $(obj)/gen_crc32c_incr_table\n+\t$(call cmd,crc32c_incr)\n+\n $(obj)/crc64table.h: $(obj)/gen_crc64table\n \t$(call cmd,crc64)\ndiff --git a/lib/crc/crc32c-incr.c b/lib/crc/crc32c-incr.c\nnew file mode 100644\nindex 000000000000..b6258231cc0d\n--- /dev/null\n+++ b/lib/crc/crc32c-incr.c\n@@ -0,0 +1,140 @@\n+// SPDX-License-Identifier: GPL-2.0-only\n+/*\n+ * GF(2) matrix-based CRC32c incremental update.\n+ *\n+ * When a contiguous range of bits is flipped, the new CRC can be\n+ * derived from the old one without re-scanning the buffer:\n+ *   New_CRC = Old_CRC ^ CRC(flip_mask << trailing_bits)\n+ *\n+ * The delta CRC is computed by decomposing num_bits and trailing_bits\n+ * into power-of-2 components and combining them via the CRC\n+ * concatenation property, giving O(log N) complexity.\n+ *\n+ * Memory usage: ~9.8KB\n+ * - crc32c_incr_nibble_table: 19 * 8 * 16 * 4 = 9728 bytes\n+ * - crc32c_incr_ones_lookup:  20 * 4          = 80   bytes\n+ *\n+ * Tables are generated at compile time by gen_crc32c_incr_table.\n+ * INCR_MAX_ORDER 19 supports up to 64KB buffers (2^19 bits).\n+ *\n+ * Copyright (C) 2026 Alibaba Inc.\n+ */\n+\n+#include <linux/bitops.h>\n+#include <linux/bug.h>\n+#include <linux/export.h>\n+#include <linux/crc32.h>\n+\n+#include \"crc32c-incr-table.h\"\n+\n+#define INCR_MAX_ORDER\t\t19\n+\n+/**\n+ * gf2_xform - Multiply a CRC state vector by a GF(2) shift matrix\n+ * @order: Selects the precomputed matrix M^(2^order).\n+ * @v: The 32-bit CRC state vector.\n+ *\n+ * Computes v * M^(2^order) using nibble (4-bit) indexed tables,\n+ * reducing the operation from 32 bit-level iterations to 8 lookups.\n+ */\n+static inline u32 gf2_xform(int order, u32 v)\n+{\n+\tconst u32 (*tab)[16] = crc32c_incr_nibble_table[order];\n+\n+\treturn tab[0][v & 0xf] ^\n+\t       tab[1][(v >> 4) & 0xf] ^\n+\t       tab[2][(v >> 8) & 0xf] ^\n+\t       tab[3][(v >> 12) & 0xf] ^\n+\t       tab[4][(v >> 16) & 0xf] ^\n+\t       tab[5][(v >> 20) & 0xf] ^\n+\t       tab[6][(v >> 24) & 0xf] ^\n+\t       tab[7][(v >> 28) & 0xf];\n+}\n+\n+/**\n+ * crc32c_incr_get_ones_delta - Compute CRC of an all-ones bit sequence\n+ * @num_bits: Length of the all-ones sequence.\n+ *\n+ * Returns CRC(0, [111...1] of length num_bits).  Decomposes num_bits\n+ * into powers of 2 (MSB-first) and combines using:\n+ *   CRC(A || B) = shift(CRC(A), len(B)) ^ CRC(B)\n+ *\n+ * This requires only (popcount - 1) gf2_xform calls, each doing\n+ * 8 table lookups.\n+ *\n+ * Caller must ensure num_bits <= (1UL << INCR_MAX_ORDER).\n+ */\n+static u32 crc32c_incr_get_ones_delta(size_t num_bits)\n+{\n+\tu32 delta;\n+\tint n;\n+\n+\tif (!num_bits)\n+\t\treturn 0;\n+\n+\t/* Initialize with the highest power-of-2 block */\n+\tn = __fls(num_bits);\n+\tdelta = crc32c_incr_ones_lookup[n];\n+\tnum_bits ^= (1UL << n);\n+\n+\t/* Concatenate remaining blocks from high to low */\n+\twhile (num_bits) {\n+\t\tn = __fls(num_bits);\n+\t\tdelta = gf2_xform(n, delta);\n+\t\tdelta ^= crc32c_incr_ones_lookup[n];\n+\t\tnum_bits ^= (1UL << n);\n+\t}\n+\treturn delta;\n+}\n+\n+/**\n+ * gf2_shift_crc - Shift a CRC state by @trailing_bits zero-bit positions\n+ * @crc: The CRC state vector.\n+ * @trailing_bits: Number of zero bits to shift through.\n+ *\n+ * Equivalent to appending @trailing_bits zero bits to the data stream\n+ * and continuing the CRC computation.  Decomposes trailing_bits into\n+ * powers of 2 and applies the corresponding precomputed matrices.\n+ */\n+static u32 gf2_shift_crc(u32 crc, size_t trailing_bits)\n+{\n+\tint n;\n+\n+\tfor (n = 0; trailing_bits > 0 && n < INCR_MAX_ORDER; n++) {\n+\t\tif (trailing_bits & 1)\n+\t\t\tcrc = gf2_xform(n, crc);\n+\t\ttrailing_bits >>= 1;\n+\t}\n+\treturn crc;\n+}\n+\n+/* See full kernel-doc in include/linux/crc32.h */\n+u32 crc32c_flip_range(u32 old_crc, u32 total_bits,\n+\t\t      u32 bit_off, u32 nbits)\n+{\n+\tu32 delta, trailing_bits;\n+\n+\tif (!nbits)\n+\t\treturn old_crc;\n+\n+\t/*\n+\t * total_bits must not exceed 2^INCR_MAX_ORDER bits (64KB).\n+\t * bit_off + nbits must not exceed total_bits.\n+\t */\n+\tif (WARN_ON_ONCE(total_bits > (1UL << INCR_MAX_ORDER)))\n+\t\treturn old_crc;\n+\tif (WARN_ON_ONCE(bit_off + nbits > total_bits))\n+\t\treturn old_crc;\n+\n+\ttrailing_bits = total_bits - (bit_off + nbits);\n+\n+\t/* 1. Calculate CRC of the flip-mask (all 1s of length nbits) */\n+\tdelta = crc32c_incr_get_ones_delta(nbits);\n+\n+\t/* 2. Shift the mask-CRC to the correct bit position */\n+\tdelta = gf2_shift_crc(delta, trailing_bits);\n+\n+\t/* 3. Apply the delta to the existing CRC */\n+\treturn old_crc ^ delta;\n+}\n+EXPORT_SYMBOL(crc32c_flip_range);\ndiff --git a/lib/crc/gen_crc32c_incr_table.c b/lib/crc/gen_crc32c_incr_table.c\nnew file mode 100644\nindex 000000000000..f906506282cc\n--- /dev/null\n+++ b/lib/crc/gen_crc32c_incr_table.c\n@@ -0,0 +1,141 @@\n+// SPDX-License-Identifier: GPL-2.0\n+/*\n+ * Generate GF(2) nibble-based lookup tables for incremental CRC32c updates.\n+ * MAX_ORDER 19 supports up to 64KB buffers (2^19 bits = 524288 bits).\n+ *\n+ * Instead of storing raw 32x32 bit matrices (32 rows per order),\n+ * we precompute nibble (4-bit) indexed tables.  This reduces gf2_xform\n+ * to 8 table lookups instead of 32 branchless mask-and-XOR iterations.\n+ *\n+ * Memory layout:\n+ * - crc32c_incr_nibble_table[19][8][16]: 19 * 8 * 16 * 4 = 9728 bytes\n+ * - crc32c_incr_ones_lookup[20]:         20 * 4          = 80   bytes\n+ * Total: ~9.8KB (fits comfortably in L1 cache)\n+ *\n+ * Copyright (C) 2026 Alibaba Inc.\n+ */\n+\n+#include <stdio.h>\n+#include <inttypes.h>\n+\n+#include \"../../include/linux/crc32poly.h\"\n+\n+#define CRC32C_INCR_MAX_ORDER\t19\n+#define NIBBLES_PER_U32\t\t8\n+\n+static uint32_t bit_matrix[CRC32C_INCR_MAX_ORDER][32];\n+static uint32_t nibble_table[CRC32C_INCR_MAX_ORDER][NIBBLES_PER_U32][16];\n+static uint32_t ones_lookup[CRC32C_INCR_MAX_ORDER + 1];\n+\n+static void crc32c_incr_init(void)\n+{\n+\tint n, i, k, v;\n+\n+\t/*\n+\t * Step 1: Build the order-0 matrix M, where M[i] is the CRC\n+\t * state after shifting basis vector e_i by one bit position.\n+\t */\n+\tfor (i = 0; i < 32; i++) {\n+\t\tuint32_t r = 1U << i;\n+\n+\t\tbit_matrix[0][i] = (r & 1) ?\n+\t\t\t(r >> 1) ^ CRC32C_POLY_LE : (r >> 1);\n+\t}\n+\n+\t/* Step 2: M^(2^n) = (M^(2^(n-1)))^2 via matrix squaring */\n+\tfor (n = 1; n < CRC32C_INCR_MAX_ORDER; n++) {\n+\t\tfor (i = 0; i < 32; i++) {\n+\t\t\tuint32_t r = bit_matrix[n - 1][i];\n+\t\t\tuint32_t res = 0;\n+\n+\t\t\tfor (k = 0; k < 32; k++) {\n+\t\t\t\tif (r & (1U << k))\n+\t\t\t\t\tres ^= bit_matrix[n - 1][k];\n+\t\t\t}\n+\t\t\tbit_matrix[n][i] = res;\n+\t\t}\n+\t}\n+\n+\t/* Step 3: Convert bit matrices to nibble-indexed lookup tables */\n+\tfor (n = 0; n < CRC32C_INCR_MAX_ORDER; n++) {\n+\t\tfor (i = 0; i < NIBBLES_PER_U32; i++) {\n+\t\t\tnibble_table[n][i][0] = 0;\n+\t\t\tfor (v = 1; v < 16; v++) {\n+\t\t\t\tuint32_t res = 0;\n+\n+\t\t\t\tfor (k = 0; k < 4; k++) {\n+\t\t\t\t\tif (v & (1 << k))\n+\t\t\t\t\t\tres ^= bit_matrix[n][i * 4 + k];\n+\t\t\t\t}\n+\t\t\t\tnibble_table[n][i][v] = res;\n+\t\t\t}\n+\t\t}\n+\t}\n+\n+\t/*\n+\t * Step 4: ones_lookup[n] = CRC(0, all-ones of 2^n bits).\n+\t * Uses CRC(A||B) = shift(CRC(A), len(B)) ^ CRC(B) to double\n+\t * the length at each step.  ones_lookup[0] = CRC of a single\n+\t * 1-bit, which equals the generator polynomial.\n+\t */\n+\tones_lookup[0] = CRC32C_POLY_LE;\n+\n+\tfor (n = 1; n <= CRC32C_INCR_MAX_ORDER; n++) {\n+\t\tuint32_t low = ones_lookup[n - 1];\n+\t\tuint32_t high = 0;\n+\n+\t\tfor (k = 0; k < 32; k++) {\n+\t\t\tif (low & (1U << k))\n+\t\t\t\thigh ^= bit_matrix[n - 1][k];\n+\t\t}\n+\t\tones_lookup[n] = low ^ high;\n+\t}\n+}\n+\n+int main(int argc, char **argv)\n+{\n+\tint n, i, v;\n+\n+\tcrc32c_incr_init();\n+\n+\tprintf(\"/* this file is generated - do not edit */\\n\\n\");\n+\n+\tprintf(\"static const u32 crc32c_incr_nibble_table[%d][%d][16] = {\\n\",\n+\t       CRC32C_INCR_MAX_ORDER, NIBBLES_PER_U32);\n+\tfor (n = 0; n < CRC32C_INCR_MAX_ORDER; n++) {\n+\t\tprintf(\"\\t{\\n\");\n+\t\tfor (i = 0; i < NIBBLES_PER_U32; i++) {\n+\t\t\tprintf(\"\\t\\t{\\n\");\n+\t\t\tfor (v = 0; v < 16; v += 4) {\n+\t\t\t\tprintf(\"\\t\\t\\t0x%08x, 0x%08x, 0x%08x, 0x%08x,\\n\",\n+\t\t\t\t       nibble_table[n][i][v],\n+\t\t\t\t       nibble_table[n][i][v + 1],\n+\t\t\t\t       nibble_table[n][i][v + 2],\n+\t\t\t\t       nibble_table[n][i][v + 3]);\n+\t\t\t}\n+\t\t\tprintf(\"\\t\\t},\\n\");\n+\t\t}\n+\t\tprintf(\"\\t},\\n\");\n+\t}\n+\tprintf(\"};\\n\\n\");\n+\n+\tprintf(\"static const u32 crc32c_incr_ones_lookup[%d] = {\\n\",\n+\t       CRC32C_INCR_MAX_ORDER + 1);\n+\tfor (n = 0; n <= CRC32C_INCR_MAX_ORDER; n += 4) {\n+\t\tint remaining = CRC32C_INCR_MAX_ORDER + 1 - n;\n+\n+\t\tif (remaining >= 4) {\n+\t\t\tprintf(\"\\t0x%08x, 0x%08x, 0x%08x, 0x%08x,\\n\",\n+\t\t\t       ones_lookup[n], ones_lookup[n + 1],\n+\t\t\t       ones_lookup[n + 2], ones_lookup[n + 3]);\n+\t\t} else {\n+\t\t\tprintf(\"\\t\");\n+\t\t\tfor (i = 0; i < remaining; i++)\n+\t\t\t\tprintf(\"0x%08x, \", ones_lookup[n + i]);\n+\t\t\tprintf(\"\\n\");\n+\t\t}\n+\t}\n+\tprintf(\"};\\n\");\n+\n+\treturn 0;\n+}\n","prefixes":["RFC","01/17"]}