{"id":2232137,"url":"http://patchwork.ozlabs.org/api/covers/2232137/?format=json","web_url":"http://patchwork.ozlabs.org/project/gcc/cover/20260503053451.48504-1-herumi@nifty.com/","project":{"id":17,"url":"http://patchwork.ozlabs.org/api/projects/17/?format=json","name":"GNU Compiler Collection","link_name":"gcc","list_id":"gcc-patches.gcc.gnu.org","list_email":"gcc-patches@gcc.gnu.org","web_url":null,"scm_url":null,"webscm_url":null,"list_archive_url":"","list_archive_url_format":"","commit_url_format":""},"msgid":"<20260503053451.48504-1-herumi@nifty.com>","list_archive_url":null,"date":"2026-05-03T05:34:48","name":"[v2,0/3] Optimize 32-bit unsigned constant division for 64-bit targets","submitter":{"id":92964,"url":"http://patchwork.ozlabs.org/api/people/92964/?format=json","name":null,"email":"herumi@nifty.com"},"mbox":"http://patchwork.ozlabs.org/project/gcc/cover/20260503053451.48504-1-herumi@nifty.com/mbox/","series":[{"id":502562,"url":"http://patchwork.ozlabs.org/api/series/502562/?format=json","web_url":"http://patchwork.ozlabs.org/project/gcc/list/?series=502562","date":"2026-05-03T05:34:51","name":"Optimize 32-bit unsigned constant division for 64-bit targets","version":2,"mbox":"http://patchwork.ozlabs.org/series/502562/mbox/"}],"comments":"http://patchwork.ozlabs.org/api/covers/2232137/comments/","headers":{"Return-Path":"<gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org>","X-Original-To":["incoming@patchwork.ozlabs.org","gcc-patches@gcc.gnu.org"],"Delivered-To":["patchwork-incoming@legolas.ozlabs.org","gcc-patches@gcc.gnu.org"],"Authentication-Results":["legolas.ozlabs.org;\n\tdkim=pass (2048-bit key;\n unprotected) header.d=nifty.com header.i=@nifty.com header.a=rsa-sha256\n header.s=default-1th84yt82rvi header.b=t5rr0gKo;\n\tdkim-atps=neutral","legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org\n (client-ip=2620:52:6:3111::32; helo=vm01.sourceware.org;\n envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org;\n receiver=patchwork.ozlabs.org)","sourceware.org;\n\tdkim=pass (2048-bit key,\n unprotected) header.d=nifty.com header.i=@nifty.com header.a=rsa-sha256\n header.s=default-1th84yt82rvi header.b=t5rr0gKo","sourceware.org;\n dmarc=pass (p=none dis=none) header.from=nifty.com","sourceware.org; spf=pass smtp.mailfrom=nifty.com","server2.sourceware.org;\n arc=none smtp.remote-ip=2001:268:fa30:831:6a:99:e3:27"],"Received":["from vm01.sourceware.org (vm01.sourceware.org\n [IPv6:2620:52:6:3111::32])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange x25519 server-signature ECDSA (secp384r1) server-digest SHA384)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4g7YSf5yLVz1yJV\n\tfor <incoming@patchwork.ozlabs.org>; Sun, 03 May 2026 15:37:22 +1000 (AEST)","from vm01.sourceware.org (localhost [127.0.0.1])\n\tby sourceware.org (Postfix) with ESMTP id D0CED4B920C7\n\tfor <incoming@patchwork.ozlabs.org>; Sun,  3 May 2026 05:37:20 +0000 (GMT)","from mta-snd-w07.mail.nifty.com (mta-snd-w07.mail.nifty.com\n [IPv6:2001:268:fa30:831:6a:99:e3:27])\n by sourceware.org (Postfix) with ESMTPS id D892C4BB24E7\n for <gcc-patches@gcc.gnu.org>; Sun,  3 May 2026 05:35:31 +0000 (GMT)","from sapp by mta-snd-w07.mail.nifty.com with ESMTP\n id <20260503053529048.UFD.19957.sapp@nifty.com>;\n Sun, 3 May 2026 14:35:29 +0900"],"DKIM-Filter":["OpenDKIM Filter v2.11.0 sourceware.org D0CED4B920C7","OpenDKIM Filter v2.11.0 sourceware.org D892C4BB24E7"],"DMARC-Filter":"OpenDMARC Filter v1.4.2 sourceware.org D892C4BB24E7","ARC-Filter":"OpenARC Filter v1.0.0 sourceware.org D892C4BB24E7","ARC-Seal":"i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1777786533; cv=none;\n b=YrWi+DdJJPfL/ykdtod8suIN0cpaNMdTo5PV3+GaR5EZ+UnKNyuSJitt7orV24DXSqkhTIGcb+l4FHjfuD/XcppTHBq4FTXZvVbwdzAn1TUchiJD0kMs0HLEBUpp3jUcPIB/3BPUUvc2iqtidCgV0xawtOKHxDe6vMFwAEsNT1M=","ARC-Message-Signature":"i=1; a=rsa-sha256; d=sourceware.org; s=key;\n t=1777786533; c=relaxed/simple;\n bh=6xdcfpDiybYQLZ68dBW6YlhNG0VQ+5HI/O9yy79c5qc=;\n h=From:To:Subject:Date:Message-ID:MIME-Version:DKIM-Signature;\n b=iFQKOopKSXmzO4C00j7gL1go+jiujqQXZjlvjO6NYRJKDU2QbzgS8ktP/sG4xsY68G0m3PbeqcBQ2J35JH1cLhf7YrwF0vxpuuB6+lb8VKe90vIcQ1UvHMsZ/AR+lJFZsCQARU23HV/zn1X+YZLa96C3000UNmlXhwvXp+UIUFQ=","ARC-Authentication-Results":"i=1; server2.sourceware.org","From":"herumi@nifty.com","To":"gcc-patches@gcc.gnu.org","Cc":"rguenther@suse.de, jeffreyalaw@gmail.com, ubizjak@gmail.com,\n MITSUNARI Shigeo <herumi@nifty.com>","Subject":"[PATCH v2 0/3] Optimize 32-bit unsigned constant division for 64-bit\n targets","Date":"Sun,  3 May 2026 14:34:48 +0900","Message-ID":"<20260503053451.48504-1-herumi@nifty.com>","X-Mailer":"git-send-email 2.43.0","In-Reply-To":"<727728f8-76e5-457b-ab9f-d650550e0702@gmail.com>","References":"<727728f8-76e5-457b-ab9f-d650550e0702@gmail.com>","MIME-Version":"1.0","Content-Transfer-Encoding":"8bit","DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed; d=nifty.com;\n s=default-1th84yt82rvi; t=1777786529;\n bh=fd7rht+IvbAwrYd3C92JvlLFmZSfpGT8yUmNGjkdFK0=;\n h=From:To:Cc:Subject:Date:In-Reply-To:References;\n b=t5rr0gKokbq+rRsjGkuAGA6Fiz11rKdHyyBDJA1MdvjkRMJcGI5MjhAmQnHxcwxfqUwXheZq\n vjQ0Zgu5SIeryG6EAshRPWu8ynXI3gUnvOuf6zh/d2osK+1ErHJr5oQlMOTgLxjm1F+KPnt1fp\n V36rd+DtTLqGQymUIMtM5VaL5i5nb4jzGpnQ+zmh4ls2YdztqKzhBM/VLntY9RL1zDRy6Pu0QW\n 3Z7UnQ35pn5upuKIZ954dVgBrkbmxVfNlOgb0mtvw8nb7u5uwAFpMkV3vE3PPejaux9u2uqOT+\n pxNxMrJvCZSvN3jRyEEGdJVqxm43kYkiTgPQG8NvpEH85X/A==","X-BeenThere":"gcc-patches@gcc.gnu.org","X-Mailman-Version":"2.1.30","Precedence":"list","List-Id":"Gcc-patches mailing list <gcc-patches.gcc.gnu.org>","List-Unsubscribe":"<https://gcc.gnu.org/mailman/options/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>","List-Archive":"<https://gcc.gnu.org/pipermail/gcc-patches/>","List-Post":"<mailto:gcc-patches@gcc.gnu.org>","List-Help":"<mailto:gcc-patches-request@gcc.gnu.org?subject=help>","List-Subscribe":"<https://gcc.gnu.org/mailman/listinfo/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>","Errors-To":"gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org"},"content":"From: MITSUNARI Shigeo <herumi@nifty.com>\n\nCompiler optimization for constant division of `uint32_t` variables (e.g.,\n`x / 7`) is based on the Granlund-Montgomery (GM) method [1].\nWhen the magic multiplier requires 33 bits (`mh != 0`), the GM method uses a\nsub/shift/add sequence designed for 32-bit registers.\nThis patch pre-shifts the 33-bit magic constant into a 64-bit value and uses a\nsingle 64x64->128-bit high-part multiply to obtain the quotient directly,\neliminating the post-multiply add/sub/shift sequence.\n\nOn x86_64, `x / 7` is reduced from 7 instructions to 4 (or 3 with BMI2).\nA benchmark on Xeon w9-3495X shows a 1.67x speedup.\n\nChanges in v2:\n- Rebased on current master (post GCC 16 release).\n\nAffected divisors: 32-bit unsigned constant divisors where `mh != 0`\n(7, 19, 21, 27, 31, 35, 37, 107, etc.)\nArchitectures: 64-bit or wider targets with 64x64->128-bit high-part multiply\n(x86_64, AArch64, RISC-V64, etc.)\npatch2, patch3: x86_64 BMI2 targets only.\n\nTesting: Ran `make check-gcc RUNTESTFLAGS=\"i386.exp\"` before and after applying\nthe patches. The results are identical.\n\nNote: The same optimization technique has been applied to LLVM and merged into\nllvm:main at https://github.com/llvm/llvm-project/pull/181288.\n\nReferences:\n[1] T. Granlund, P. L. Montgomery, \"Division by Invariant Integers using\n    Multiplication\", PLDI 1994\n[2] https://arxiv.org/abs/2604.07902\n\nMITSUNARI Shigeo (3):\n  expmed: Optimize 32-bit unsigned division by constants on 64-bit\n    targets\n  i386: Add BMI2 MULX pattern for highpart-only multiplication\n  i386: Add peephole2 to convert highpart mul to mulx\n\n gcc/config/i386/i386.md | 32 +++++++++++++++\n gcc/expmed.cc           | 86 +++++++++++++++++++++++++++++------------\n 2 files changed, 93 insertions(+), 25 deletions(-)"}