Patch Detail
get:
Show a patch.
patch:
Update a patch.
put:
Update a patch.
GET /api/patches/2225582/?format=api
{ "id": 2225582, "url": "http://patchwork.ozlabs.org/api/patches/2225582/?format=api", "web_url": "http://patchwork.ozlabs.org/project/gcc/patch/20260421082950.27099-1-jinma@linux.alibaba.com/", "project": { "id": 17, "url": "http://patchwork.ozlabs.org/api/projects/17/?format=api", "name": "GNU Compiler Collection", "link_name": "gcc", "list_id": "gcc-patches.gcc.gnu.org", "list_email": "gcc-patches@gcc.gnu.org", "web_url": null, "scm_url": null, "webscm_url": null, "list_archive_url": "", "list_archive_url_format": "", "commit_url_format": "" }, "msgid": "<20260421082950.27099-1-jinma@linux.alibaba.com>", "list_archive_url": null, "date": "2026-04-21T08:29:50", "name": "ira: Remove subset classes from restrict_cost_classes narrow list", "commit_ref": null, "pull_url": null, "state": "new", "archived": false, "hash": "8643743e809cc2d608537308ee4988c88959be96", "submitter": { "id": 85474, "url": "http://patchwork.ozlabs.org/api/people/85474/?format=api", "name": "Jin Ma", "email": "jinma@linux.alibaba.com" }, "delegate": null, "mbox": "http://patchwork.ozlabs.org/project/gcc/patch/20260421082950.27099-1-jinma@linux.alibaba.com/mbox/", "series": [ { "id": 500758, "url": "http://patchwork.ozlabs.org/api/series/500758/?format=api", "web_url": "http://patchwork.ozlabs.org/project/gcc/list/?series=500758", "date": "2026-04-21T08:29:50", "name": "ira: Remove subset classes from restrict_cost_classes narrow list", "version": 1, "mbox": "http://patchwork.ozlabs.org/series/500758/mbox/" } ], "comments": "http://patchwork.ozlabs.org/api/patches/2225582/comments/", "check": "pending", "checks": "http://patchwork.ozlabs.org/api/patches/2225582/checks/", "tags": {}, "related": [], "headers": { "Return-Path": "<gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org>", "X-Original-To": [ "incoming@patchwork.ozlabs.org", "gcc-patches@gcc.gnu.org" ], "Delivered-To": [ "patchwork-incoming@legolas.ozlabs.org", "gcc-patches@gcc.gnu.org" ], "Authentication-Results": [ "legolas.ozlabs.org;\n\tdkim=pass (1024-bit key;\n unprotected) header.d=linux.alibaba.com header.i=@linux.alibaba.com\n header.a=rsa-sha256 header.s=default header.b=rtoVNpfN;\n\tdkim-atps=neutral", "legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org\n (client-ip=2620:52:6:3111::32; helo=vm01.sourceware.org;\n envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org;\n receiver=patchwork.ozlabs.org)", "sourceware.org;\n\tdkim=pass (1024-bit key,\n unprotected) header.d=linux.alibaba.com header.i=@linux.alibaba.com\n header.a=rsa-sha256 header.s=default header.b=rtoVNpfN", "sourceware.org; dmarc=pass (p=none dis=none)\n header.from=linux.alibaba.com", "sourceware.org;\n spf=pass smtp.mailfrom=linux.alibaba.com", "server2.sourceware.org;\n arc=none smtp.remote-ip=115.124.30.130" ], "Received": [ "from vm01.sourceware.org (vm01.sourceware.org\n [IPv6:2620:52:6:3111::32])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange x25519 server-signature ECDSA (secp384r1) server-digest SHA384)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4g0FtQ3Kwdz1yGt\n\tfor <incoming@patchwork.ozlabs.org>; Tue, 21 Apr 2026 18:30:54 +1000 (AEST)", "from vm01.sourceware.org (localhost [127.0.0.1])\n\tby sourceware.org (Postfix) with ESMTP id 9F8814BA9010\n\tfor <incoming@patchwork.ozlabs.org>; Tue, 21 Apr 2026 08:30:52 +0000 (GMT)", "from out30-130.freemail.mail.aliyun.com\n (out30-130.freemail.mail.aliyun.com [115.124.30.130])\n by sourceware.org (Postfix) with ESMTPS id 79CDD4BA2E3C\n for <gcc-patches@gcc.gnu.org>; Tue, 21 Apr 2026 08:30:23 +0000 (GMT)", "from localhost.localdomain(mailfrom:jinma@linux.alibaba.com\n fp:SMTPD_---0X1SPf8k_1776760218 cluster:ay36) by smtp.aliyun-inc.com;\n Tue, 21 Apr 2026 16:30:18 +0800" ], "DKIM-Filter": [ "OpenDKIM Filter v2.11.0 sourceware.org 9F8814BA9010", "OpenDKIM Filter v2.11.0 sourceware.org 79CDD4BA2E3C" ], "DMARC-Filter": "OpenDMARC Filter v1.4.2 sourceware.org 79CDD4BA2E3C", "ARC-Filter": "OpenARC Filter v1.0.0 sourceware.org 79CDD4BA2E3C", "ARC-Seal": "i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1776760224; cv=none;\n b=hLGHX9tjGO1titfmMat3oCOpxMzpd1pXJfR5vccRf8drnKDt8zKT/yv7T6wggt0xtKQfUz+kPXaxQXPkL4TCzEuIWpSe/9A1KxlNlqsBl83WKYuxyZfPn8VdlRkY1BFSCvTTZtZyfy8BOFRkO6wHOSvQjwk659zI9xj3TD8yR2k=", "ARC-Message-Signature": "i=1; a=rsa-sha256; d=sourceware.org; s=key;\n t=1776760224; c=relaxed/simple;\n bh=iw+z8H3ZyehVQ3dcGP0G7oRzRet6b7yK8dK9SdEDYWE=;\n h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version;\n b=xBfbYqfl3Yl2G6iwi6BRjBxIN0ziTBH80WxFDIeKYdevqh2g3gU66HG6VrCssjOYM6LIchSiE13/kBS3CTzNAwoczXUY+mZC0D2aZVrZWK0MC3q6XbI4WgJgqsPVq2VEXAc0zD1rqMd/SfMWQ/qcXt6k+zqvkoWzIgpcvFrlsUo=", "ARC-Authentication-Results": "i=1; server2.sourceware.org", "DKIM-Signature": "v=1; a=rsa-sha256; c=relaxed/relaxed;\n d=linux.alibaba.com; s=default;\n t=1776760220; h=From:To:Subject:Date:Message-ID:MIME-Version;\n bh=x2u9k/l72gPcgS69kMhYrIpJYO7b8QwMfvR6U1lZYBg=;\n b=rtoVNpfNIYAStMAwy16uetPVi95Hk/Cq5KS/5lokk9Ca8DtwpCzTm1KUanfFbIjI9KD1ruooY36QN+CTOoHOUTcwsqzoYerUyIluFMqwQtG+zO38vF2DcwuenNgR4dlxWlAJkRdeoczcgCnbOGbfWDWoRiWK86bsqMSt6Wa1DHY=", "X-Alimail-AntiSpam": "AC=PASS; BC=-1|-1; BR=01201311R631e4; CH=green;\n DM=||false|;\n DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=maildocker-contentspam011083073210;\n MF=jinma@linux.alibaba.com; NM=1; PH=DS; RN=5; SR=0;\n TI=SMTPD_---0X1SPf8k_1776760218;", "From": "Jin Ma <jinma@linux.alibaba.com>", "To": "gcc-patches@gcc.gnu.org", "Cc": "jeffreyalaw@gmail.com, palmer@dabbelt.com, richard.sandiford@arm.com,\n Jin Ma <jinma@linux.alibaba.com>", "Subject": "[PATCH] ira: Remove subset classes from restrict_cost_classes narrow\n list", "Date": "Tue, 21 Apr 2026 16:29:50 +0800", "Message-ID": "<20260421082950.27099-1-jinma@linux.alibaba.com>", "X-Mailer": "git-send-email 2.50.1", "MIME-Version": "1.0", "Content-Transfer-Encoding": "8bit", "X-BeenThere": "gcc-patches@gcc.gnu.org", "X-Mailman-Version": "2.1.30", "Precedence": "list", "List-Id": "Gcc-patches mailing list <gcc-patches.gcc.gnu.org>", "List-Unsubscribe": "<https://gcc.gnu.org/mailman/options/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>", "List-Archive": "<https://gcc.gnu.org/pipermail/gcc-patches/>", "List-Post": "<mailto:gcc-patches@gcc.gnu.org>", "List-Help": "<mailto:gcc-patches-request@gcc.gnu.org?subject=help>", "List-Subscribe": "<https://gcc.gnu.org/mailman/listinfo/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>", "Errors-To": "gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org" }, "content": "The restrict_cost_classes function builds a \"narrow\" list of register\nclasses for IRA cost computation. When iterating over classes, it skips\na class if it is a subset of an already-added class. However, it does\nnot handle the reverse case: when a newly encountered class is a\nsuperset of an already-added class.\n\nPer GCC internals Section 19.8, subclasses should have lower class\nnumbers than their parent classes (\"Order the classes so that if class x\nis contained in class y then x has a lower class number than y\"). This\nmeans subclasses are enumerated first and may be added to the narrow\nlist before their parent class is encountered. When the parent class\narrives, it is not a subset of the subclass, so it is also added,\nresulting in both classes coexisting in the narrow list.\n\nThis causes problems for targets with constrained register subsets.\nFor example, on RISC-V, RVC_FP_REGS (8 registers, f8-f15) appears\nbefore FP_REGS (32 registers) in enum reg_class. Both end up in the\nnarrow list, and since their register move costs are equal, the\nallocator may choose the smaller RVC_FP_REGS as the cost class. This\noverestimates register pressure (8 vs 32 available registers), leading\nto unnecessary spills and suboptimal instruction scheduling. This was\nobserved as a 6.3% performance regression in lmbench's double bogo\nbenchmark on the C907 rv32 platform.\n\nSimilarly, on AArch64, FP_LO8_REGS (8 registers) appears before\nFP_LO_REGS (16 registers) and FP_REGS (32 registers), and could\nexhibit the same issue.\n\nFix this by removing existing subset classes from the narrow list when\na superset class is added. This ensures the narrow list result is\nindependent of the enumeration order of reg_class, and the allocator\nalways uses the largest applicable class for pressure estimation.\n\ngcc/ChangeLog:\n\n\t* ira-costs.cc (restrict_cost_classes): When adding a new class\n\tto the narrow list, remove any existing classes that are subsets\n\tof the new class. Update the map array to reflect the removal.\n\ngcc/testsuite/ChangeLog:\n\n\t* gcc.target/riscv/ira-cost-class-subset.c: New test.\n\t* gcc.target/aarch64/ira-cost-class-subset.c: New test.\n---\n gcc/ira-costs.cc | 47 +++++++++++++++++\n .../aarch64/ira-cost-class-subset.c | 50 +++++++++++++++++++\n .../gcc.target/riscv/ira-cost-class-subset.c | 50 +++++++++++++++++++\n 3 files changed, 147 insertions(+)\n create mode 100644 gcc/testsuite/gcc.target/aarch64/ira-cost-class-subset.c\n create mode 100644 gcc/testsuite/gcc.target/riscv/ira-cost-class-subset.c", "diff": "diff --git a/gcc/ira-costs.cc b/gcc/ira-costs.cc\nindex 8a731422761..a24ef87537f 100644\n--- a/gcc/ira-costs.cc\n+++ b/gcc/ira-costs.cc\n@@ -285,6 +285,8 @@ restrict_cost_classes (cost_classes_t full, machine_mode mode,\n \t union-based definition would lose the extra restrictions\n \t placed on FR_REGS. GR_AND_FR_REGS is therefore only useful\n \t for cases where GR_REGS and FP_REGS are both valid. */\n+ /* Check if the current class is a subset of an existing class\n+\t in the narrow list. If so, skip it. */\n int pos;\n for (pos = 0; pos < narrow.num; ++pos)\n \t{\n@@ -300,6 +302,51 @@ restrict_cost_classes (cost_classes_t full, machine_mode mode,\n \t enum reg_class cl2 = ira_allocno_class_translate[cl];\n \t if (ira_class_hard_regs_num[cl] == ira_class_hard_regs_num[cl2])\n \t cl = cl2;\n+\n+\t /* Remove any existing classes in the narrow list that are\n+\t subsets of the new class CL. Per GCC internals Section\n+\t 19.8 Register Classes, subclasses have lower class numbers\n+\t and may appear earlier in the enumeration. Without this\n+\t removal, both the subclass and superclass would remain in\n+\t the narrow list, causing the register pressure computation\n+\t to use the smaller subclass (fewer registers) and overestimate\n+\t pressure. */\n+\t int old_num = narrow.num;\n+\t int remap[N_REG_CLASSES];\n+\t int new_num = 0;\n+\t for (int j = 0; j < old_num; ++j)\n+\t {\n+\t if (hard_reg_set_subset_p (reg_class_contents[narrow.classes[j]],\n+\t\t\t\t\t reg_class_contents[cl]))\n+\t\t{\n+\t\t remap[j] = -1;\n+\t\t continue;\n+\t\t}\n+\t remap[j] = new_num;\n+\t narrow.classes[new_num++] = narrow.classes[j];\n+\t }\n+\n+\t if (new_num < old_num)\n+\t {\n+\t narrow.num = new_num;\n+\n+\t /* Fix up map entries for previously processed classes\n+\t\t whose narrow positions shifted due to the removal.\n+\t\t Entries that pointed to a removed subclass now point\n+\t\t to the parent class CL at position narrow.num. */\n+\t for (int k = 0; k < i; ++k)\n+\t\t{\n+\t\t if (map[k] >= 0 && map[k] < old_num)\n+\t\t {\n+\t\t if (remap[map[k]] == -1)\n+\t\t\tmap[k] = narrow.num;\n+\t\t else\n+\t\t\tmap[k] = remap[map[k]];\n+\t\t }\n+\t\t}\n+\t }\n+\n+\t map[i] = narrow.num;\n \t narrow.classes[narrow.num++] = cl;\n \t}\n }\ndiff --git a/gcc/testsuite/gcc.target/aarch64/ira-cost-class-subset.c b/gcc/testsuite/gcc.target/aarch64/ira-cost-class-subset.c\nnew file mode 100644\nindex 00000000000..a0bb8dadd51\n--- /dev/null\n+++ b/gcc/testsuite/gcc.target/aarch64/ira-cost-class-subset.c\n@@ -0,0 +1,50 @@\n+/* Verify that the IRA cost class computation does not overestimate register\n+ pressure due to FP subset classes (FP_LO8_REGS / FP_LO_REGS) appearing\n+ before their parent class FP_REGS in enum reg_class. The\n+ restrict_cost_classes function should eliminate subset classes when a\n+ superset class is present, regardless of enumeration order.\n+\n+ This test is derived from lmbench's do_double_bogomflops, which exercises\n+ heavy FP register pressure. Without the fix, the compiler may treat the\n+ 8-register FP_LO8_REGS as the bottleneck, causing unnecessary spills. */\n+\n+/* { dg-do compile } */\n+/* { dg-skip-if \"\" { *-*-* } { \"-O0\" \"-O1\" \"-Og\" \"-Os\" \"-Oz\" } } */\n+/* { dg-options \"-O2 -fschedule-insns -fdump-rtl-sched1-details\" } */\n+\n+struct _state\n+{\n+ int N;\n+ int M;\n+ int K;\n+ double *data;\n+};\n+\n+void do_double_bogomflops(unsigned long int iterations, void *cookie)\n+{\n+ struct _state *pState = (struct _state *)cookie;\n+ register int i;\n+ register int M = pState->M / 10;\n+\n+ while (iterations-- > 0)\n+ {\n+ register double *x = (double *)pState->data;\n+ for (i = 0; i < M; ++i)\n+ {\n+ x[0] = (1.0f + x[0]) * (1.5f - x[0]) / x[0];\n+ x[1] = (1.0f + x[1]) * (1.5f - x[1]) / x[1];\n+ x[2] = (1.0f + x[2]) * (1.5f - x[2]) / x[2];\n+ x[3] = (1.0f + x[3]) * (1.5f - x[3]) / x[3];\n+ x[4] = (1.0f + x[4]) * (1.5f - x[4]) / x[4];\n+ x[5] = (1.0f + x[5]) * (1.5f - x[5]) / x[5];\n+ x[6] = (1.0f + x[6]) * (1.5f - x[6]) / x[6];\n+ x[7] = (1.0f + x[7]) * (1.5f - x[7]) / x[7];\n+ x[8] = (1.0f + x[8]) * (1.5f - x[8]) / x[8];\n+ x[9] = (1.0f + x[9]) * (1.5f - x[9]) / x[9];\n+ x += 10;\n+ }\n+ }\n+}\n+\n+/* { dg-final { scan-rtl-dump-not \"FP_LO8_REGS\" \"sched1\" } } */\n+/* { dg-final { scan-rtl-dump-not \"FP_LO_REGS\" \"sched1\" } } */\ndiff --git a/gcc/testsuite/gcc.target/riscv/ira-cost-class-subset.c b/gcc/testsuite/gcc.target/riscv/ira-cost-class-subset.c\nnew file mode 100644\nindex 00000000000..b28176726a9\n--- /dev/null\n+++ b/gcc/testsuite/gcc.target/riscv/ira-cost-class-subset.c\n@@ -0,0 +1,50 @@\n+/* Verify that the IRA cost class computation does not overestimate register\n+ pressure due to RVC subset classes (RVC_FP_REGS / RVC_GR_REGS) appearing\n+ before their parent classes in enum reg_class. The restrict_cost_classes\n+ function should eliminate subset classes when a superset class is present,\n+ regardless of enumeration order.\n+\n+ This test is derived from lmbench's do_double_bogomflops, which exercises\n+ heavy FP register pressure. Without the fix, the compiler treats the\n+ 8-register RVC_FP_REGS as the bottleneck, causing unnecessary spills. */\n+\n+/* { dg-do compile } */\n+/* { dg-skip-if \"\" { *-*-* } { \"-O0\" \"-O1\" \"-O3\" \"-Og\" \"-Os\" \"-Oz\" \"-flto\" } } */\n+/* { dg-options \"-O2 -fschedule-insns -fdump-rtl-sched1-details\" } */\n+\n+struct _state\n+{\n+ int N;\n+ int M;\n+ int K;\n+ double *data;\n+};\n+\n+void do_double_bogomflops(unsigned long int iterations, void *cookie)\n+{\n+ struct _state *pState = (struct _state *)cookie;\n+ register int i;\n+ register int M = pState->M / 10;\n+\n+ while (iterations-- > 0)\n+ {\n+ register double *x = (double *)pState->data;\n+ for (i = 0; i < M; ++i)\n+ {\n+ x[0] = (1.0f + x[0]) * (1.5f - x[0]) / x[0];\n+ x[1] = (1.0f + x[1]) * (1.5f - x[1]) / x[1];\n+ x[2] = (1.0f + x[2]) * (1.5f - x[2]) / x[2];\n+ x[3] = (1.0f + x[3]) * (1.5f - x[3]) / x[3];\n+ x[4] = (1.0f + x[4]) * (1.5f - x[4]) / x[4];\n+ x[5] = (1.0f + x[5]) * (1.5f - x[5]) / x[5];\n+ x[6] = (1.0f + x[6]) * (1.5f - x[6]) / x[6];\n+ x[7] = (1.0f + x[7]) * (1.5f - x[7]) / x[7];\n+ x[8] = (1.0f + x[8]) * (1.5f - x[8]) / x[8];\n+ x[9] = (1.0f + x[9]) * (1.5f - x[9]) / x[9];\n+ x += 10;\n+ }\n+ }\n+}\n+\n+/* { dg-final { scan-rtl-dump-not \"RVC_GR_REGS\" \"sched1\" } } */\n+/* { dg-final { scan-rtl-dump-not \"RVC_FP_REGS\" \"sched1\" } } */\n", "prefixes": [] }