[{"id":3685502,"web_url":"http://patchwork.ozlabs.org/comment/3685502/","msgid":"<CAMe9rOr2yt9MKP=woDxtn8DM_VgVBkDCkchmYvSwpTN86wCmZQ@mail.gmail.com>","list_archive_url":null,"date":"2026-05-04T05:45:03","subject":"Re: [PATCH v3 3/3] i386: Add peephole2 to convert highpart mul to\n mulx","submitter":{"id":4387,"url":"http://patchwork.ozlabs.org/api/people/4387/","name":"H.J. Lu","email":"hjl.tools@gmail.com"},"content":"On Mon, May 4, 2026 at 7:59 AM <herumi@nifty.com> wrote:\n>\n> From: MITSUNARI Shigeo <herumi@nifty.com>\n>\n> When the register allocator selects the MUL-based highpart pattern\n> (umuldi3_highpart) with the source value already in %rdx, it inserts\n> a redundant mov to %rax before the mul instruction.  Add a peephole2\n> that detects this mov + mul sequence and converts it to a single mulx,\n> eliminating the extra mov.\n>\n> This improves inlined loops that perform multiple unsigned divisions\n> by constants.  For example, a loop with three div-by-constant\n> operations now generates 15 instructions (matching LLVM) instead\n> of 18.\n>\n> Before (loop body excerpt):\n>         mov     rax, rdx\n>         mul     r9\n>\n> After:\n>         mulx    rdx, rax, r9\n>\n> gcc/ChangeLog:\n>\n>         * config/i386/i386.md: Add peephole2 to convert\n>         mov + umul_highpart to mulx on BMI2 targets.\n\nNeed the ChangeLog entry for the test.\n\n> ---\n>  gcc/config/i386/i386.md                       | 17 +++++++++++++++++\n>  .../gcc.target/i386/bmi2-mulx-highpart-2.c    | 19 +++++++++++++++++++\n>  2 files changed, 36 insertions(+)\n>  create mode 100644 gcc/testsuite/gcc.target/i386/bmi2-mulx-highpart-2.c\n>\n> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md\n> index 472f9d41332..1c394690b04 100644\n> --- a/gcc/config/i386/i386.md\n> +++ b/gcc/config/i386/i386.md\n> @@ -11522,6 +11522,23 @@\n>     (set_attr \"prefix\" \"vex\")\n>     (set_attr \"mode\" \"<MODE>\")])\n>\n> +;; Convert mov + highpart mul to mulx when the mov source is %rdx.\n> +;; mov %rdx, %rax; mulq %src -> mulx %src, %rax, %out\n> +(define_peephole2\n> +  [(set (match_operand:DWIH 0 \"register_operand\")\n> +       (match_operand:DWIH 1 \"register_operand\"))\n> +   (parallel [(set (match_operand:DWIH 2 \"register_operand\")\n> +                  (umul_highpart:DWIH (match_dup 0)\n> +                       (match_operand:DWIH 3 \"nonimmediate_operand\")))\n> +             (clobber (match_dup 0))\n> +             (clobber (reg:CC FLAGS_REG))])]\n> +  \"TARGET_BMI2\n> +   && REGNO (operands[1]) == DX_REG\n> +   && REGNO (operands[0]) != REGNO (operands[2])\"\n> +  [(parallel [(set (match_dup 2)\n> +                  (umul_highpart:DWIH (match_dup 1) (match_dup 3)))\n> +             (clobber (match_dup 0))])])\n> +\n>  ;; Highpart multiplication patterns\n>  (define_insn \"<s>mul<mode>3_highpart\"\n>    [(set (match_operand:DWIH 0 \"register_operand\" \"=d\")\n> diff --git a/gcc/testsuite/gcc.target/i386/bmi2-mulx-highpart-2.c b/gcc/testsuite/gcc.target/i386/bmi2-mulx-highpart-2.c\n> new file mode 100644\n> index 00000000000..be56cf15d07\n> --- /dev/null\n> +++ b/gcc/testsuite/gcc.target/i386/bmi2-mulx-highpart-2.c\n> @@ -0,0 +1,19 @@\n> +/* { dg-do compile { target { ! ia32 } } } */\n> +/* { dg-options \"-O2 -mbmi2\" } */\n> +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target *-*-linux* *-*-gnu* } } } */\n> +\n> +/*\n> +**div7loop:\n> +**...\n> +**     mulx    %rsi, %rax, %rdx\n> +**...\n> +*/\n> +\n> +unsigned int\n> +div7loop (unsigned int x)\n> +{\n> +  for (int i = 0; i < 10000; i++) {\n> +    x ^= (i ^ x) / 7;\n> +  }\n> +  return x;\n> +}\n> --\n> 2.43.0\n>","headers":{"Return-Path":"<gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org>","X-Original-To":["incoming@patchwork.ozlabs.org","gcc-patches@gcc.gnu.org"],"Delivered-To":["patchwork-incoming@legolas.ozlabs.org","gcc-patches@gcc.gnu.org"],"Authentication-Results":["legolas.ozlabs.org;\n\tdkim=pass (2048-bit key;\n unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256\n header.s=20251104 header.b=DIK/V47d;\n\tdkim-atps=neutral","legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org\n (client-ip=2620:52:6:3111::32; helo=vm01.sourceware.org;\n envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org;\n receiver=patchwork.ozlabs.org)","sourceware.org;\n\tdkim=pass (2048-bit key,\n unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256\n header.s=20251104 header.b=DIK/V47d","sourceware.org;\n dmarc=pass (p=none dis=none) header.from=gmail.com","sourceware.org; spf=pass smtp.mailfrom=gmail.com","server2.sourceware.org;\n arc=pass smtp.remote-ip=2607:f8b0:4864:20::530"],"Received":["from vm01.sourceware.org (vm01.sourceware.org\n [IPv6:2620:52:6:3111::32])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange x25519 server-signature ECDSA (secp384r1) server-digest SHA384)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4g89cR1zL2z1yJ9\n\tfor <incoming@patchwork.ozlabs.org>; Mon, 04 May 2026 15:46:13 +1000 (AEST)","from vm01.sourceware.org (localhost [127.0.0.1])\n\tby sourceware.org (Postfix) with ESMTP id 487304BAE7E7\n\tfor <incoming@patchwork.ozlabs.org>; Mon,  4 May 2026 05:46:09 +0000 (GMT)","from mail-pg1-x530.google.com (mail-pg1-x530.google.com\n [IPv6:2607:f8b0:4864:20::530])\n by sourceware.org (Postfix) with ESMTPS id 67C9F4BAD16A\n for <gcc-patches@gcc.gnu.org>; Mon,  4 May 2026 05:45:42 +0000 (GMT)","by mail-pg1-x530.google.com with SMTP id\n 41be03b00d2f7-c736261ee8dso1084169a12.1\n for <gcc-patches@gcc.gnu.org>; Sun, 03 May 2026 22:45:42 -0700 (PDT)"],"DKIM-Filter":["OpenDKIM Filter v2.11.0 sourceware.org 487304BAE7E7","OpenDKIM Filter v2.11.0 sourceware.org 67C9F4BAD16A"],"DMARC-Filter":"OpenDMARC Filter v1.4.2 sourceware.org 67C9F4BAD16A","ARC-Filter":"OpenARC Filter v1.0.0 sourceware.org 67C9F4BAD16A","ARC-Seal":["i=2; a=rsa-sha256; d=sourceware.org; s=key; t=1777873542; cv=pass;\n b=pCVn4dBkYfzM4TuL4wDK8BqOWC1zyEycEMzmVlsUEwVnnCD5NV1vvzuNcICiEsX2nXeCkuGvNN9RESBu1LkwplYaZT0eGhuZv1oSaXmg6ax9BJ3f2c6JgiyTvd3y7U6tc++8p9VOl5GkeYSc+41nVK5L2wrNKUacyr/HCXhFwis=","i=1; a=rsa-sha256; t=1777873541; cv=none;\n d=google.com; s=arc-20240605;\n b=LEW8izAp6FW/AUP13To7Y/bX1Ts3xRokDHosaB8JtCfNg4ftyt1ApaLuJRPnP2AK12\n UFJlkzL1zxOCHPdOVrvQCUu4EWOkKra5zPHZWythJfZuopupEkorBABae/zQtr8v/sqJ\n cQtbSIWLEGA+dZfOQdRg1NruqUIsP6OGuF/iOdeXxlLHbfrLT5Xp2CYAdXxWdth35vxW\n qy9CV6Pnou9sZvlg8PuPTXqggCi6OVYR0H1ztCK0Hn/kvrmi7/sMilObNZ6YdkjXNm06\n fy5yOFt5HiQ4rM6gvgHaKuY71ymAfJrQSdYFCaaboVLSea4GEKbG/34G5lORvsy29s5o\n Friw=="],"ARC-Message-Signature":["i=2; a=rsa-sha256; d=sourceware.org; s=key;\n t=1777873542; c=relaxed/simple;\n bh=TpVcFJ/IP1F2iKShDsLi7xui4/vjkAw/pdQ9VwSqMu0=;\n h=DKIM-Signature:MIME-Version:From:Date:Message-ID:Subject:To;\n b=kgsmteBW7Q6NDeEE2iBvIs68oENdR2WJZKKbSGgp2Cts6hc2FldpC529gBAASlU9pguCtFGaTpZLDGz95h+vrHpCQyMfqWVa14xezTLNpakh5VtwobYiXt4B1VseTxwotW0rnR2odOT8iF3YcfTnlocBA2T7gSgtSCBYV1IA8bQ=","i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com;\n s=arc-20240605;\n h=content-transfer-encoding:cc:to:subject:message-id:date:from\n :in-reply-to:references:mime-version:dkim-signature;\n bh=3S9MkBdj+6rP1gPEHM199QuFFTQzEi21wHrU+tuyRV0=;\n fh=0iv4iZA3+g7dVzf5zgf2FQXCrAsNlpV6Q4MoJTRoQqQ=;\n b=BtOPo/LEE/PBcaNksK04JeOpc7hzmICA/4l7HfB/zA+68+94gZ0Qe4LeYIyCoLb22M\n Ck8ITk70JRzx1gw6rupyMHK/RmzMZ56xTctH2QjcfP+qLRbwwddWKKz11cSobt8KAPBb\n uAQIrD9JfXb0jgANRauUKvJHOBMqP/GQShhMO2QuQoaYOG20PlW3thAEdSONUa8Mq8TW\n 6IZCBcOQ5p1N0wfCY5jngr+CDO3jnPF5V5allEd28WuaNVUmY7QXFhXXW/FRZBI1V5Qp\n gyXE7g37RbxL3WznnKvroJ7PeAHFx/xcDZPFIVU7E/D0teV5w4DmpCQ17Br6hKIRDCG5\n ZB9A==; darn=gcc.gnu.org"],"ARC-Authentication-Results":["i=2; server2.sourceware.org","i=1; mx.google.com; arc=none"],"DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n d=gmail.com; s=20251104; t=1777873541; x=1778478341; darn=gcc.gnu.org;\n h=content-transfer-encoding:cc:to:subject:message-id:date:from\n :in-reply-to:references:mime-version:from:to:cc:subject:date\n :message-id:reply-to;\n bh=3S9MkBdj+6rP1gPEHM199QuFFTQzEi21wHrU+tuyRV0=;\n b=DIK/V47dvVrBD4e/2J29AprE2UrByFR6Jy0p20nPe9e8z8knp4zPZ5shG9lnG13YKG\n gUht6aghTACuDGUSF9m6Yjq/N0vhk341FFwYEZszSOAAPaztpERWQFJjsmL+lRNxmNCt\n 6gEfmZVeAF35q+5fQeYvhmwKk76biz7KJfoGudNzr1IF3gD9A44SN+PmFnouxH8PHY9b\n gvYcgE8jEgEeSP4hyZGgwewVGq7C7yQJbDD0CM7PkZvztN/WlIWClULcAiBydCyktSof\n cMQIIZl5vt8N14hb+IKuvbT9laKea4gzLI9QklpLtY88z/cFQa3QLVNwfYII+9WTVaHD\n /pjg==","X-Google-DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n d=1e100.net; s=20251104; t=1777873541; x=1778478341;\n h=content-transfer-encoding:cc:to:subject:message-id:date:from\n :in-reply-to:references:mime-version:x-gm-gg:x-gm-message-state:from\n :to:cc:subject:date:message-id:reply-to;\n bh=3S9MkBdj+6rP1gPEHM199QuFFTQzEi21wHrU+tuyRV0=;\n b=L2RwJCBAMbtvPObTu4L51c2bqYuaiDL46VJJv73tCup8Kp7/Hy4Iv5s/+LnSqREppm\n pCDrfZGeU+9l9DxroyqzYuQY77T4xaLyDf/3Ed4QVI7DCPG3Q56TkkEWQyMDoTRam/Qt\n 9vzJajTHHGhzZ0a3N/1YrF5JwT18h7sGRO2Vw7Grf9uxnshoZSd7WnLJCzatGfHN0dMq\n WaySCd35u6WXY8JumVUM8r20rbRptlPMBAjgPc7J0VGDlo/qheF7ugr8AoRkz4cPjcIt\n x7T7KeX7U6jOYvINxJeuAdWs9zMfGpLavNYyV7nV0lLnn2nRxcdnKajoenDvQZhCK5D2\n JuBQ==","X-Gm-Message-State":"AOJu0YyMnUniJ73iCmOovElx+6p33JxNUzWDx/XlGwYzjKZC+xoitnBW\n 0yQ1T7SyZ+YA841McWXAW+y75mOfDwCd5Ibum5aIag+2QwZNplWvtehuGBC1zdu3PCgvwOi4VGO\n DcTZ0mwoIgAwCSljWCQWGJFcoAst1wGg=","X-Gm-Gg":"AeBDievQQYmlP2DHkcFnCVqmXIlevBl5hHRxlnA2yVgDDydzb5XbQOwh8CybiPa7Mm5\n siGMppICbbcTdqavyf8n7uAAeVKKFCdBiYByI0ONQBayWbozmqMxVvjri98aEZjqS90UeowaLvG\n kGzy9URefg6YZjgmnC4g1mxhCa2iihOcP+ziOEXMglalJNs2CptfqxwvEql+KawGOpIKMRdu0T1\n ebMuku4aQLQ8Wj2xOD5i714QCU5mJJMJmV8S3REAx5UGPPYQ+WfslOHrDakii0xHGb2z1nrlxv2\n Ya+0au8KY5vpCHNMHjo=","X-Received":"by 2002:a05:6a20:a109:b0:3a2:edff:297c with SMTP id\n adf61e73a8af0-3a7f1776799mr8402758637.0.1777873541090; Sun, 03 May 2026\n 22:45:41 -0700 (PDT)","MIME-Version":"1.0","References":"<20260503053451.48504-4-herumi@nifty.com>\n <20260503235813.389169-1-herumi@nifty.com>","In-Reply-To":"<20260503235813.389169-1-herumi@nifty.com>","From":"\"H.J. Lu\" <hjl.tools@gmail.com>","Date":"Mon, 4 May 2026 13:45:03 +0800","X-Gm-Features":"AVHnY4Koa5Pe7YPGZSP1qOqnpiin_oOYDTaoKskz78sfHpMg-G4BDnbHFQ4Jtjc","Message-ID":"\n <CAMe9rOr2yt9MKP=woDxtn8DM_VgVBkDCkchmYvSwpTN86wCmZQ@mail.gmail.com>","Subject":"Re: [PATCH v3 3/3] i386: Add peephole2 to convert highpart mul to\n mulx","To":"herumi@nifty.com","Cc":"gcc-patches@gcc.gnu.org","Content-Type":"text/plain; charset=\"UTF-8\"","Content-Transfer-Encoding":"quoted-printable","X-BeenThere":"gcc-patches@gcc.gnu.org","X-Mailman-Version":"2.1.30","Precedence":"list","List-Id":"Gcc-patches mailing list <gcc-patches.gcc.gnu.org>","List-Unsubscribe":"<https://gcc.gnu.org/mailman/options/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>","List-Archive":"<https://gcc.gnu.org/pipermail/gcc-patches/>","List-Post":"<mailto:gcc-patches@gcc.gnu.org>","List-Help":"<mailto:gcc-patches-request@gcc.gnu.org?subject=help>","List-Subscribe":"<https://gcc.gnu.org/mailman/listinfo/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>","Errors-To":"gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org"}}]