Cover Letter Detail
Show a cover letter.
GET /api/covers/2196925/?format=api
{ "id": 2196925, "url": "http://patchwork.ozlabs.org/api/covers/2196925/?format=api", "web_url": "http://patchwork.ozlabs.org/project/qemu-devel/cover/20260216145219.1959-1-alireza.sanaee@huawei.com/", "project": { "id": 14, "url": "http://patchwork.ozlabs.org/api/projects/14/?format=api", "name": "QEMU Development", "link_name": "qemu-devel", "list_id": "qemu-devel.nongnu.org", "list_email": "qemu-devel@nongnu.org", "web_url": "", "scm_url": "", "webscm_url": "", "list_archive_url": "", "list_archive_url_format": "", "commit_url_format": "" }, "msgid": "<20260216145219.1959-1-alireza.sanaee@huawei.com>", "list_archive_url": null, "date": "2026-02-16T14:52:16", "name": "[v3,0/2] Performant CXL type 3 non-interleaved regions", "submitter": { "id": 90159, "url": "http://patchwork.ozlabs.org/api/people/90159/?format=api", "name": "Alireza Sanaee", "email": "alireza.sanaee@huawei.com" }, "mbox": "http://patchwork.ozlabs.org/project/qemu-devel/cover/20260216145219.1959-1-alireza.sanaee@huawei.com/mbox/", "series": [ { "id": 492317, "url": "http://patchwork.ozlabs.org/api/series/492317/?format=api", "web_url": "http://patchwork.ozlabs.org/project/qemu-devel/list/?series=492317", "date": "2026-02-16T14:52:17", "name": "Performant CXL type 3 non-interleaved regions", "version": 3, "mbox": "http://patchwork.ozlabs.org/series/492317/mbox/" } ], "comments": "http://patchwork.ozlabs.org/api/covers/2196925/comments/", "headers": { "Return-Path": "<qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org>", "X-Original-To": "incoming@patchwork.ozlabs.org", "Delivered-To": "patchwork-incoming@legolas.ozlabs.org", "Authentication-Results": "legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org\n (client-ip=209.51.188.17; helo=lists.gnu.org;\n envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org;\n receiver=patchwork.ozlabs.org)", "Received": [ "from lists.gnu.org (lists.gnu.org [209.51.188.17])\n\t(using TLSv1.2 with cipher 
ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits))\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4fF5Q85zhMz1xpY\n\tfor <incoming@patchwork.ozlabs.org>; Tue, 17 Feb 2026 01:54:08 +1100 (AEDT)", "from localhost ([::1] helo=lists1p.gnu.org)\n\tby lists.gnu.org with esmtp (Exim 4.90_1)\n\t(envelope-from <qemu-devel-bounces@nongnu.org>)\n\tid 1vrzyz-0006r6-Od; Mon, 16 Feb 2026 09:53:53 -0500", "from eggs.gnu.org ([2001:470:142:3::10])\n by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)\n (Exim 4.90_1) (envelope-from <alireza.sanaee@huawei.com>)\n id 1vrzyx-0006qY-7j\n for qemu-devel@nongnu.org; Mon, 16 Feb 2026 09:53:51 -0500", "from frasgout.his.huawei.com ([185.176.79.56])\n by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)\n (Exim 4.90_1) (envelope-from <alireza.sanaee@huawei.com>)\n id 1vrzyt-0004XG-Bc\n for qemu-devel@nongnu.org; Mon, 16 Feb 2026 09:53:50 -0500", "from mail.maildlp.com (unknown [172.18.224.150])\n by frasgout.his.huawei.com (SkyGuard) with ESMTPS id 4fF5My5NB0zJ46Bv;\n Mon, 16 Feb 2026 22:52:14 +0800 (CST)", "from dubpeml500005.china.huawei.com (unknown [7.214.145.207])\n by mail.maildlp.com (Postfix) with ESMTPS id 58BAB40539;\n Mon, 16 Feb 2026 22:52:23 +0800 (CST)", "from a2303103017.china.huawei.com (10.47.67.55) by\n dubpeml500005.china.huawei.com (7.214.145.207) with Microsoft SMTP Server\n (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id\n 15.2.1544.11; Mon, 16 Feb 2026 14:52:22 +0000" ], "To": "<qemu-devel@nongnu.org>, <lizhijian@fujitsu.com>", "CC": "<anisa.su887@gmail.com>, <armbru@redhat.com>, <david@kernel.org>,\n <gourry@gourry.net>, <imammedo@redhat.com>, <jonathan.cameron@huawei.com>,\n <linuxarm@huawei.com>, <mst@redhat.com>, <nifan.cxl@gmail.com>,\n <peterx@redhat.com>, <philmd@linaro.org>, <ppbonzini@redhat.com>,\n <venkataravis@micron.com>, <xiaoguangrong.eric@gmail.com>", "Subject": "[PATCH v3 0/2] Performant CXL type 3 non-interleaved 
regions", "Date": "Mon, 16 Feb 2026 14:52:16 +0000", "Message-ID": "<20260216145219.1959-1-alireza.sanaee@huawei.com>", "X-Mailer": "git-send-email 2.51.0.windows.2", "MIME-Version": "1.0", "Content-Transfer-Encoding": "8bit", "Content-Type": "text/plain", "X-Originating-IP": "[10.47.67.55]", "X-ClientProxiedBy": "lhrpeml100009.china.huawei.com (7.191.174.83) To\n dubpeml500005.china.huawei.com (7.214.145.207)", "Received-SPF": "pass client-ip=185.176.79.56;\n envelope-from=alireza.sanaee@huawei.com; helo=frasgout.his.huawei.com", "X-Spam_score_int": "-41", "X-Spam_score": "-4.2", "X-Spam_bar": "----", "X-Spam_report": "(-4.2 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_MED=-2.3,\n RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001,\n RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001,\n SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no", "X-Spam_action": "no action", "X-BeenThere": "qemu-devel@nongnu.org", "X-Mailman-Version": "2.1.29", "Precedence": "list", "List-Id": "qemu development <qemu-devel.nongnu.org>", "List-Unsubscribe": "<https://lists.nongnu.org/mailman/options/qemu-devel>,\n <mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>", "List-Archive": "<https://lists.nongnu.org/archive/html/qemu-devel>", "List-Post": "<mailto:qemu-devel@nongnu.org>", "List-Help": "<mailto:qemu-devel-request@nongnu.org?subject=help>", "List-Subscribe": "<https://lists.nongnu.org/mailman/listinfo/qemu-devel>,\n <mailto:qemu-devel-request@nongnu.org?subject=subscribe>", "Reply-to": "Alireza Sanaee <alireza.sanaee@huawei.com>", "From": "Alireza Sanaee via qemu development <qemu-devel@nongnu.org>", "Errors-To": "qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org", "Sender": "qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org" }, "content": "Hey everyone,\n\nThis is v3 of performant CXL type 3 regions set:\n\nv2 -> v3: Addressing Zhijian Li. 
Thanks for the feedback.\nv1 -> v2: Mainly rebase.\n\nBase: On top of Jonathan's tree -> https://github.com/sarsanaee/qemu-lab/commits/performant-v3-E/\n\n==========================================================\n\nThe CXL address to device decoding logic is complex because of the need\nto correctly decode fine grained interleave. The current implementation\nprevents use with KVM where executed instructions may reside in that\nmemory and gives very slow performance even in TCG.\n\nIn many real cases non interleaved memory configurations are useful and\nfor those we can use a more conventional memory region alias allowing\nsimilar performance to other memory in the system.\n\nWhether this fast path is applicable can be established once the full\nset of HDM decoders has been committed (in whatever order the guest\ndecides to commit them). As such a check is performed on each commit /\nuncommit of HDM decoder to establish if the alias should be added or\nremoved.\n\n\nPerformance numbers:\n\nFor a read/write test with 4K block size, 256M region size, and 1 thread\nwith 100 iterations on TCG (results should be similar on KVM):\n\n - Non-interleaved region (fast path): 25-30 seconds.\n - Interleaved region (no fast path): Never finishes within 10\n minutes.\n\nTested Topologies and Region Layouts\n====================================\n\nThis series was validated across multiple CXL topology configurations,\ncovering single-device, multi-device, multi-host-bridge, and switched\nfabrics. Region creation was exercised using the `cxl` userspace tool\nwith both non-interleaved and interleaved setups.\n\nDecoder and memdev identifiers were discovered using:\n\n cxl list\n cxl list -D\n\nDecoder IDs (e.g. decoder0.0) and memdev names (mem0, mem1) are\nenvironment-specific. 
Commands below use placeholders such as\n<decoder_span_both> which should be replaced with IDs from `cxl list -D`.\n\n---------------------------------------------------------------------\n\nRegion Layout Notation\n----------------------\n\nCFMW (CXL Fixed Memory Window) is shown as a linear address space\ncontaining regions:\n\n CFMW: [ R0 | R1 | R2 ]\n\nR0, R1, R2 are regions created by `cxl create-region`.\n\nNon-interleaved region:\n\n R0 (ways=1) -> entirely on one device (mem0 or mem1)\n Fast path: APPLICABLE\n\n2-way interleaved region (g=256):\n\n R1 (ways=2, g=256) striped across devices:\n\n |mem0|mem1|mem0|mem1|mem0|mem1| ...\n 256 256 256 256 256 256 bytes\n\n Fast path: NOT APPLICABLE\n\n---------------------------------------------------------------------\n\n1) One device, one host bridge, one fixed window\n------------------------------------------------\n\nQEMU:\n\n -M q35,cxl=on,cxl-fmw.0.targets.0=cxl.0,cxl-fmw.0.size=4G\n -device pxb-cxl,id=cxl.0,bus=pcie.0,bus_nr=12\n -device cxl-rp,id=rp0,bus=cxl.0,port=0,chassis=0,slot=2\n -object memory-backend-ram,id=mem0,size=512M,share=on\n -device cxl-type3,id=dev0,bus=rp0,memdev=mem0\n\nTopology:\n\n Host\n |\n +-- CXL Host Bridge (cxl.0)\n |\n +-- Root Port (rp0)\n |\n +-- Type-3 (dev0, mem0)\n\nRegions created:\n\n cxl create-region ... -w 1 ... mem0 (Fast path: YES)\n cxl create-region ... -w 1 ... 
mem0 (Fast path: YES)\n\nLayout:\n\n CFMW: [ R0 | R1 ]\n\n R0 -> mem0 (Fast path: YES)\n R1 -> mem0 (Fast path: YES)\n\n---------------------------------------------------------------------\n\n2) One host bridge, two Type-3 devices (via two root ports)\n------------------------------------------------------------\n\nQEMU:\n\n -M q35,cxl=on,cxl-fmw.0.targets.0=cxl.0,cxl-fmw.0.size=4G\n -device pxb-cxl,id=cxl.0,bus=pcie.0,bus_nr=12\n -device cxl-rp,id=rp0,bus=cxl.0,port=0,chassis=0,slot=2\n -device cxl-rp,id=rp1,bus=cxl.0,port=1,chassis=0,slot=3\n -object memory-backend-ram,id=mem0,size=512M,share=on\n -object memory-backend-ram,id=mem1,size=512M,share=on\n -device cxl-type3,id=dev0,bus=rp0,memdev=mem0\n -device cxl-type3,id=dev1,bus=rp1,memdev=mem1\n\nTopology:\n\n Host\n |\n +-- CXL Host Bridge (cxl.0)\n |\n +-- Root Port (rp0) -- Type-3 (dev0, mem0)\n |\n +-- Root Port (rp1) -- Type-3 (dev1, mem1)\n\nRegion patterns exercised:\n\n2.1 All non-interleaved:\n R0 -> mem0 (Fast path: YES)\n R1 -> mem0 (Fast path: YES)\n R2 -> mem1 (Fast path: YES)\n R3 -> mem1 (Fast path: YES)\n\n2.2 Interleaved + local:\n R0 -> mem0/mem1 interleaved (Fast path: NO)\n R1 -> mem0 (Fast path: YES)\n\n2.3 Local + interleaved + local:\n R0 -> mem0 (Fast path: YES)\n R1 -> mem0/mem1 interleaved (Fast path: NO)\n R2 -> mem1 (Fast path: YES)\n\n---------------------------------------------------------------------\n\n3) Two host bridges, one device per host bridge\n------------------------------------------------\n\nQEMU:\n\n -M q35,cxl=on,\n cxl-fmw.0.targets.0=cxl.0,cxl-fmw.0.size=4G,\n cxl-fmw.1.targets.0=cxl.1,cxl-fmw.1.size=4G,\n cxl-fmw.2.targets.0=cxl.0,cxl-fmw.2.size=4G\n -device pxb-cxl,id=cxl.0,bus=pcie.0,bus_nr=12\n -device cxl-rp,id=rp0,bus=cxl.0,port=0,chassis=0,slot=2\n -object memory-backend-ram,id=mem0,size=512M,share=on\n -device cxl-type3,id=dev0,bus=rp0,memdev=mem0\n -device pxb-cxl,id=cxl.1,bus=pcie.0,bus_nr=13\n -device cxl-rp,id=rp1,bus=cxl.1,port=0,chassis=1,slot=2\n 
-object memory-backend-ram,id=mem1,size=512M,share=on\n -device cxl-type3,id=dev1,bus=rp1,memdev=mem1\n\nRegion patterns identical to section 2, and fast-path applicability is\nidentical per region mapping (non-interleaved: YES, interleaved: NO).\n\n---------------------------------------------------------------------\n\n4) Switch topology\n------------------\n\nQEMU:\n\n -M q35,cxl=on,cxl-fmw.0.targets.0=cxl.0,cxl-fmw.0.size=4G\n -device pxb-cxl,id=cxl.0,bus=pcie.0,bus_nr=12\n -device cxl-rp,id=rp0,bus=cxl.0,port=0,chassis=0,slot=2\n -device cxl-rp,id=rp1,bus=cxl.0,port=0,chassis=0,slot=3\n -device cxl-upstream,id=us0,bus=rp0\n -device cxl-downstream,id=ds0,bus=us0,port=0,chassis=0,slot=4\n -object memory-backend-ram,id=mem0,size=512M,share=on\n -device cxl-type3,id=dev0,bus=ds0,memdev=mem0\n\nTopology (detailed):\n\n Host\n |\n +-- CXL Host Bridge (cxl.0)\n |\n +-- Root Port (rp0)\n | |\n | +-- CXL Switch (upstream us0)\n | |\n | +-- Downstream Port (ds0) -- Type-3 (mem0)\n | |\n | +-- Downstream Port (ds1) -- Type-3 (mem1) [optional]\n +-- Root Port (rp1)\n |\n +-- More devices/switches.\n\nFast-path interpretation in this topology:\n\n If only mem0 exists:\n All regions -> Fast path: YES\n\n If mem0 and mem1 exist:\n Non-interleaved regions -> Fast path: YES\n Interleaved regions -> Fast path: NO\n\n---------------------------------------------------------------------\n\nSummary\n-------\n\nAcross all topologies, region creation, enablement, and HDM decoder\ncommit/uncommit flows were exercised. 
The fast path is enabled only when\nall decoders describe a non-interleaved mapping and is removed when any\ninterleave configuration is introduced.\n\nAlireza Sanaee (2):\n hw/cxl: Use HPA in cxl_cfmws_find_device() rather than offset in\n window.\n hw/cxl: Add a performant (and correct) path for the non interleaved\n cases\n\n hw/cxl/cxl-component-utils.c | 6 ++\n hw/cxl/cxl-host.c | 203 +++++++++++++++++++++++++++++++++--\n hw/mem/cxl_type3.c | 4 +\n include/hw/cxl/cxl.h | 1 +\n include/hw/cxl/cxl_device.h | 1 +\n 5 files changed, 206 insertions(+), 9 deletions(-)" }