{"id":2232261,"url":"http://patchwork.ozlabs.org/api/patches/2232261/?format=json","web_url":"http://patchwork.ozlabs.org/project/gcc/patch/20260504065508.399252-1-herumi@nifty.com/","project":{"id":17,"url":"http://patchwork.ozlabs.org/api/projects/17/?format=json","name":"GNU Compiler Collection","link_name":"gcc","list_id":"gcc-patches.gcc.gnu.org","list_email":"gcc-patches@gcc.gnu.org","web_url":null,"scm_url":null,"webscm_url":null,"list_archive_url":"","list_archive_url_format":"","commit_url_format":""},"msgid":"<20260504065508.399252-1-herumi@nifty.com>","list_archive_url":null,"date":"2026-05-04T06:55:08","name":"[v4,3/3] i386: Add peephole2 to convert highpart mul to mulx","commit_ref":null,"pull_url":null,"state":"new","archived":false,"hash":"eebc4ffc02b163d5e5507e5bfa4fc1c842b2ac9d","submitter":{"id":92964,"url":"http://patchwork.ozlabs.org/api/people/92964/?format=json","name":null,"email":"herumi@nifty.com"},"delegate":null,"mbox":"http://patchwork.ozlabs.org/project/gcc/patch/20260504065508.399252-1-herumi@nifty.com/mbox/","series":[{"id":502615,"url":"http://patchwork.ozlabs.org/api/series/502615/?format=json","web_url":"http://patchwork.ozlabs.org/project/gcc/list/?series=502615","date":"2026-05-04T06:55:08","name":null,"version":4,"mbox":"http://patchwork.ozlabs.org/series/502615/mbox/"}],"comments":"http://patchwork.ozlabs.org/api/patches/2232261/comments/","check":"pending","checks":"http://patchwork.ozlabs.org/api/patches/2232261/checks/","tags":{},"related":[],"headers":{"Return-Path":"<gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org>","X-Original-To":["incoming@patchwork.ozlabs.org","gcc-patches@gcc.gnu.org"],"Delivered-To":["patchwork-incoming@legolas.ozlabs.org","gcc-patches@gcc.gnu.org"],"Authentication-Results":["legolas.ozlabs.org;\n\tdkim=pass (2048-bit key;\n unprotected) header.d=nifty.com header.i=@nifty.com header.a=rsa-sha256\n header.s=default-1th84yt82rvi header.b=CrHoixxf;\n\tdkim-atps=neutral","legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org\n (client-ip=38.145.34.32; helo=vm01.sourceware.org;\n envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org;\n receiver=patchwork.ozlabs.org)","sourceware.org;\n\tdkim=pass (2048-bit key,\n unprotected) header.d=nifty.com header.i=@nifty.com header.a=rsa-sha256\n header.s=default-1th84yt82rvi header.b=CrHoixxf","sourceware.org;\n dmarc=pass (p=none dis=none) header.from=nifty.com","sourceware.org; spf=pass smtp.mailfrom=nifty.com","server2.sourceware.org;\n arc=none smtp.remote-ip=106.153.226.39"],"Received":["from vm01.sourceware.org (vm01.sourceware.org [38.145.34.32])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange x25519 server-signature ECDSA (secp384r1) server-digest SHA384)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4g8C8t0Fp6z1yJV\n\tfor <incoming@patchwork.ozlabs.org>; Mon, 04 May 2026 16:55:57 +1000 (AEST)","from vm01.sourceware.org (localhost [127.0.0.1])\n\tby sourceware.org (Postfix) with ESMTP id 0022E4BAE7D5\n\tfor <incoming@patchwork.ozlabs.org>; Mon,  4 May 2026 06:55:54 +0000 (GMT)","from mta-snd-e07.mail.nifty.com (mta-snd-e07.mail.nifty.com\n [106.153.226.39])\n by sourceware.org (Postfix) with ESMTPS id A1FB04BAE7D3\n for <gcc-patches@gcc.gnu.org>; Mon,  4 May 2026 06:55:22 +0000 (GMT)","from sapp by mta-snd-e07.mail.nifty.com with ESMTP\n id <20260504065520643.OHED.14880.sapp@nifty.com>;\n Mon, 4 May 2026 15:55:20 +0900"],"DKIM-Filter":["OpenDKIM Filter v2.11.0 sourceware.org 0022E4BAE7D5","OpenDKIM Filter v2.11.0 sourceware.org A1FB04BAE7D3"],"DMARC-Filter":"OpenDMARC Filter v1.4.2 sourceware.org A1FB04BAE7D3","ARC-Filter":"OpenARC Filter v1.0.0 sourceware.org A1FB04BAE7D3","ARC-Seal":"i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1777877723; cv=none;\n b=uZ0PAwEOASXS72z0rB0O6+WqbEo88UaayM9W16fuQTh3sQnASqmbNWz/qRUIG+V/rWl9MO7LYHZ5yMHVPJMwTLLW3m6ZfB3jr3rB3bgOieLQMCdkTQhB/zbEIoveIONAmYtkILQM0N9Aq9XV/ls7YpFcgjvf8B9/aMaeRZrdjyU=","ARC-Message-Signature":"i=1; a=rsa-sha256; d=sourceware.org; s=key;\n t=1777877723; c=relaxed/simple;\n bh=Rq5MXooiSnzrfTlk29geX+kvb1807AZ8MyhETpyl97U=;\n h=From:To:Subject:Date:Message-ID:MIME-Version:DKIM-Signature;\n b=TVuYlzua2hf9cgmK0jwJMvOswoldY3tta4c0DAt67aXcr+KxNTe8QuANOAYWFXVCPo7XvKsrUzjWCOld3ceaA9IcKq6vxOKk6WnE08NHnPCSoUGBNcq5gqxsbkI73NfwbdXQuauOfb33RjvwrszHh+7AsdwhCLo6RKJAQj12uCQ=","ARC-Authentication-Results":"i=1; server2.sourceware.org","From":"herumi@nifty.com","To":"gcc-patches@gcc.gnu.org","Cc":"MITSUNARI Shigeo <herumi@nifty.com>","Subject":"[PATCH v4 3/3] i386: Add peephole2 to convert highpart mul to mulx","Date":"Mon,  4 May 2026 15:55:08 +0900","Message-ID":"<20260504065508.399252-1-herumi@nifty.com>","X-Mailer":"git-send-email 2.43.0","In-Reply-To":"\n <CAMe9rOr2yt9MKP=woDxtn8DM_VgVBkDCkchmYvSwpTN86wCmZQ@mail.gmail.com>","References":"\n <CAMe9rOr2yt9MKP=woDxtn8DM_VgVBkDCkchmYvSwpTN86wCmZQ@mail.gmail.com>","MIME-Version":"1.0","Content-Transfer-Encoding":"8bit","DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed; d=nifty.com;\n s=default-1th84yt82rvi; t=1777877720;\n bh=GThbQ65werV71DibGXC47jjoeE+ENXK6XNbNw4USuQ4=;\n h=From:To:Cc:Subject:Date:In-Reply-To:References;\n b=CrHoixxf1i9mpfv+6rCK9OqoACjHH0XqchaaJPHYCyH2c66Vis35QRSPVz3eY889qVALCEhi\n hqnuuVLLhwRfugtagMvelPLbwq2MSP2h6jUvIaxLNYVJU44FtNft2wJRh0FRJj3sOkUebD+8l8\n 7Poqjr5i9xBwTS84zKOb0EJdNgx9qu5qUCG1tfyT+Owf21cqgniuHdc8xpqAiuWMw7FCxUfa9b\n zAfgqYV8eZWg1MFIb9IokXBl2wM/w5Vp5GyclekrJitT3YwoEeN3cm85ahyl/lm9gfrbHpCk8A\n EVrhP0QNizUhJhy7xjFbeXhJLvcOkhJf7LfBvtBdEOrADmmQ==","X-BeenThere":"gcc-patches@gcc.gnu.org","X-Mailman-Version":"2.1.30","Precedence":"list","List-Id":"Gcc-patches mailing list <gcc-patches.gcc.gnu.org>","List-Unsubscribe":"<https://gcc.gnu.org/mailman/options/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>","List-Archive":"<https://gcc.gnu.org/pipermail/gcc-patches/>","List-Post":"<mailto:gcc-patches@gcc.gnu.org>","List-Help":"<mailto:gcc-patches-request@gcc.gnu.org?subject=help>","List-Subscribe":"<https://gcc.gnu.org/mailman/listinfo/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>","Errors-To":"gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org"},"content":"From: MITSUNARI Shigeo <herumi@nifty.com>\n\nWhen the register allocator selects the MUL-based highpart pattern\n(umuldi3_highpart) with the source value already in %rdx, it inserts\na redundant mov to %rax before the mul instruction.  Add a peephole2\nthat detects this mov + mul sequence and converts it to a single mulx,\neliminating the extra mov.\n\nThis improves inlined loops that perform multiple unsigned divisions\nby constants.  For example, a loop with three div-by-constant\noperations now generates 15 instructions (matching LLVM) instead\nof 18.\n\nBefore (loop body excerpt):\n\tmov\trax, rdx\n\tmul\tr9\n\nAfter:\n\tmulx\trdx, rax, r9\n\ngcc/ChangeLog:\n\n\t* config/i386/i386.md: Add peephole2 to convert\n\tmov + umul_highpart to mulx on BMI2 targets.\n\ngcc/testsuite/ChangeLog:\n\n        * gcc.target/i386/bmi2-mulx-highpart-2.c: New test.\n---\n gcc/config/i386/i386.md                       | 17 +++++++++++++++++\n .../gcc.target/i386/bmi2-mulx-highpart-2.c    | 19 +++++++++++++++++++\n 2 files changed, 36 insertions(+)\n create mode 100644 gcc/testsuite/gcc.target/i386/bmi2-mulx-highpart-2.c","diff":"diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md\nindex 472f9d41332..1c394690b04 100644\n--- a/gcc/config/i386/i386.md\n+++ b/gcc/config/i386/i386.md\n@@ -11522,6 +11522,23 @@\n    (set_attr \"prefix\" \"vex\")\n    (set_attr \"mode\" \"<MODE>\")])\n \n+;; Convert mov + highpart mul to mulx when the mov source is %rdx.\n+;; mov %rdx, %rax; mulq %src -> mulx %src, %rax, %out\n+(define_peephole2\n+  [(set (match_operand:DWIH 0 \"register_operand\")\n+\t(match_operand:DWIH 1 \"register_operand\"))\n+   (parallel [(set (match_operand:DWIH 2 \"register_operand\")\n+\t\t   (umul_highpart:DWIH (match_dup 0)\n+\t\t\t(match_operand:DWIH 3 \"nonimmediate_operand\")))\n+\t      (clobber (match_dup 0))\n+\t      (clobber (reg:CC FLAGS_REG))])]\n+  \"TARGET_BMI2\n+   && REGNO (operands[1]) == DX_REG\n+   && REGNO (operands[0]) != REGNO (operands[2])\"\n+  [(parallel [(set (match_dup 2)\n+\t\t   (umul_highpart:DWIH (match_dup 1) (match_dup 3)))\n+\t      (clobber (match_dup 0))])])\n+\n ;; Highpart multiplication patterns\n (define_insn \"<s>mul<mode>3_highpart\"\n   [(set (match_operand:DWIH 0 \"register_operand\" \"=d\")\ndiff --git a/gcc/testsuite/gcc.target/i386/bmi2-mulx-highpart-2.c b/gcc/testsuite/gcc.target/i386/bmi2-mulx-highpart-2.c\nnew file mode 100644\nindex 00000000000..be56cf15d07\n--- /dev/null\n+++ b/gcc/testsuite/gcc.target/i386/bmi2-mulx-highpart-2.c\n@@ -0,0 +1,19 @@\n+/* { dg-do compile { target { ! ia32 } } } */\n+/* { dg-options \"-O2 -mbmi2\" } */\n+/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target *-*-linux* *-*-gnu* } } } */\n+\n+/*\n+**div7loop:\n+**...\n+**\tmulx\t%rsi, %rax, %rdx\n+**...\n+*/\n+\n+unsigned int\n+div7loop (unsigned int x)\n+{\n+  for (int i = 0; i < 10000; i++) {\n+    x ^= (i ^ x) / 7;\n+  }\n+  return x;\n+}\n","prefixes":["v4","3/3"]}