[{"id":3687312,"web_url":"http://patchwork.ozlabs.org/comment/3687312/","msgid":"<d42199e8-af04-4232-a9eb-eecd2355c314@intel.com>","list_archive_url":null,"date":"2026-05-06T18:34:15","subject":"Re: [PATCH v17 11/11] Documentation: cxl: Document CXL protocol error\n handling","submitter":{"id":13225,"url":"http://patchwork.ozlabs.org/api/people/13225/","name":"Dave Jiang","email":"dave.jiang@intel.com"},"content":"On 5/5/26 10:30 AM, Terry Bowman wrote:\n> Add Documentation/driver-api/cxl/linux/protocol-error-handling.rst\n> describing the end-to-end CXL protocol error path: AER ingress, the\n> AER-CXL kfifo handoff, the cxl_core consumer worker, RCD/RCH special\n> cases, severity policy, trace events, and a source code map.\n> \n> This documents the architecture introduced by the preceding patches in\n> this series.\n> \n> This was generated by claude-opus-4.7.\n> \n> Assisted-by: Claude:claude-opus-4.7\n> Signed-off-by: Terry Bowman <terry.bowman@amd.com>\n> ---\n>  Documentation/driver-api/cxl/index.rst        |   1 +\n>  .../cxl/linux/protocol-error-handling.rst     | 440 ++++++++++++++++++\n>  2 files changed, 441 insertions(+)\n>  create mode 100644 Documentation/driver-api/cxl/linux/protocol-error-handling.rst\n> \n> diff --git a/Documentation/driver-api/cxl/index.rst b/Documentation/driver-api/cxl/index.rst\n> index 3dfae1d310ca..6861b2e5726a 100644\n> --- a/Documentation/driver-api/cxl/index.rst\n> +++ b/Documentation/driver-api/cxl/index.rst\n> @@ -42,6 +42,7 @@ that have impacts on each other.  The docs here break up configurations steps.\n>     linux/dax-driver\n>     linux/memory-hotplug\n>     linux/access-coordinates\n> +   linux/protocol-error-handling\n>  \n>  .. 
toctree::\n>     :maxdepth: 2\n> diff --git a/Documentation/driver-api/cxl/linux/protocol-error-handling.rst b/Documentation/driver-api/cxl/linux/protocol-error-handling.rst\n> new file mode 100644\n> index 000000000000..4d6f33f0ed31\n> --- /dev/null\n> +++ b/Documentation/driver-api/cxl/linux/protocol-error-handling.rst\n> @@ -0,0 +1,440 @@\n> +.. SPDX-License-Identifier: GPL-2.0\n> +\n> +==============================\n> +CXL Protocol Error Handling\n> +==============================\n> +\n> +This document describes how the kernel detects, classifies, dispatches,\n> +logs, and recovers from CXL protocol errors signaled through the PCIe\n> +Advanced Error Reporting (AER) interface. It covers both Virtual\n> +Hierarchy (VH) topologies (Root Ports, Upstream/Downstream Switch\n> +Ports, and Endpoints) and Restricted CXL Host (RCH) topologies\n> +(Root Complex Event Collectors driving Restricted CXL Devices).\n> +\n> +It is intended for kernel developers maintaining or extending\n> +``drivers/pci/pcie/aer*.c``, ``drivers/cxl/core/ras.c``, and the\n> +related plumbing in ``include/linux/aer.h``.\n> +\n> +\n> +Background\n> +==========\n> +\n> +A CXL device reports protocol-layer failures (CXL.cachemem RAS) as\n> +PCIe AER **Internal Errors**: ``PCI_ERR_COR_INTERNAL`` for correctable\n> +events and ``PCI_ERR_UNC_INTN`` for uncorrectable events. From the AER\n> +core's point of view these look like ordinary PCIe AER messages, but\n> +their semantics are CXL-specific: the actual fault information lives\n> +in CXL RAS capability registers, not in the PCIe AER status registers.\n> +\n> +Historically, native CXL.cachemem RAS handling was implemented only\n> +for CXL Endpoints and for RCH Downstream Ports. 
CXL Root Ports,\n> +Upstream Switch Ports, and Downstream Switch Ports were not covered.\n> +This left the kernel unable to log or react to protocol errors\n> +signaled by switch components.\n> +\n> +The unified CXL protocol error path closes that gap by routing every\n> +CXL Internal Error through a single producer/consumer pipeline shared\n> +by all CXL device types.\n> +\n> +\n> +Architecture overview\n> +=====================\n> +\n> +CXL protocol error handling is implemented as a distinct error plane\n> +layered on top of the existing PCIe AER infrastructure. The two planes\n> +are kept separate:\n> +\n> +* The **PCIe AER plane** continues to handle native PCIe errors\n> +  (Receiver overflows, malformed TLPs, completion timeouts, and so\n> +  on). This is unchanged.\n> +\n> +* The **CXL protocol error plane** owns CXL Internal Errors. The AER\n> +  core forwards them to ``cxl_core`` via a dedicated kfifo; ``cxl_core``\n> +  then dispatches to CE/UE handlers and drives the recovery and\n> +  panic policy.\n> +\n> +The boundary between the two planes is ``is_cxl_error()`` in\n> +``drivers/pci/pcie/aer_cxl_vh.c``, which inspects ``info->is_cxl``\n> +(set from ``pcie_is_cxl()``) together with the PCIe device type and\n> +the AER status word. When ``is_cxl_error()`` returns true the event\n> +is enqueued into the AER-CXL kfifo; otherwise the event flows through\n> +``pci_aer_handle_error()`` as before.\n> +\n> +The pipeline has three layers:\n> +\n> +1. **Producer** (``aer_cxl_vh.c``, ``aer_cxl_rch.c``) - runs in AER\n> +   IRQ/threaded context, classifies, clears the AER CE status, and\n> +   enqueues ``struct cxl_proto_err_work_data``.\n> +2. **Queue** - the AER-CXL kfifo plus a backing ``struct work_struct``.\n> +3. 
**Consumer** (``cxl_core/ras.c``) - workqueue-context worker that\n> +   resolves the CXL Port topology and dispatches to CE/UE handlers.\n> +\n> +\n> +Topologies\n> +==========\n> +\n> +Two topologies are supported, and both feed the same kfifo.\n> +\n> +Virtual Hierarchy (VH)\n> +----------------------\n> +\n> +A standard CXL VH consists of a CXL Root Port (RP), an optional CXL\n> +Upstream Switch Port (USP), one or more CXL Downstream Switch Ports\n\nI think it's clearer if you say \"an optional CXL Upstream Switch Port (USP)\nwith one or more CXL Downstream Switch Ports (DSP)\" to indicate that this is\na wholly contained component. Otherwise it reads that only the USP is\noptional.\n\nDJ\n\n> +(DSPs), and CXL Endpoints (EPs) attached to the DSPs. Each component\n> +is a regular PCIe device with a CXL DVSEC and a CXL RAS capability,\n> +and it raises Internal Errors directly to the AER subsystem via the\n> +RP's MSI/MSI-X interrupt.\n> +\n> +The VH producer is ``cxl_forward_error()`` in\n> +``drivers/pci/pcie/aer_cxl_vh.c``.\n> +\n> +Restricted CXL Host (RCH)\n> +-------------------------\n> +\n> +In the RCH topology, a Root Complex Event Collector (RCEC) aggregates\n> +errors from one or more Restricted CXL Devices (RCDs) attached as\n> +Root Complex Integrated Endpoints. The RCEC delivers the AER\n> +interrupt; the AER driver iterates the RCDs beneath it.\n> +\n> +The RCH producer is ``cxl_rch_handle_error_iter()`` in\n> +``drivers/pci/pcie/aer_cxl_rch.c``. For each RCD it finds, it calls\n> +``cxl_forward_error()`` (the same producer helper used by the VH\n> +path), so RCH events end up in the same AER-CXL kfifo as VH events.\n> +\n> +\n> +End-to-end flow\n> +===============\n> +\n> +The diagram below shows the full path from an AER interrupt through\n> +producer classification, kfifo handoff, and consumer dispatch.\n> +\n> +.. 
code-block:: text\n> +\n> +   +-------------------------------------------------------------------------+\n> +   |                  CXL Internal Error Packet Flow                         |\n> +   |    From PCIe AER Interrupt to CXL Protocol Error Handling and Logging   |\n> +   +-------------------------------------------------------------------------+\n> +\n> +      CXL device (RP / USP / DSP / EP / RCD) raises AER Internal Error\n> +      (correctable PCI_ERR_COR_INTERNAL or uncorrectable PCI_ERR_UNC_INTN)\n> +                      |\n> +                      v\n> +      +-------------------------------------------------------------+\n> +      |    PCIe Root Port AER MSI/MSI-X interrupt fires             |\n> +      +-------------------------------------------------------------+\n> +                      |\n> +      ============= drivers/pci/pcie/aer.c (AER core) =============\n> +                      |\n> +                      v\n> +           +---------------------------------+\n> +           |  aer_irq()  /  aer_isr()        |  (top + threaded handler)\n> +           +---------------------------------+\n> +                      |\n> +                      v\n> +           +---------------------------------+\n> +           |  aer_isr_one_error()            |\n> +           |  aer_isr_one_error_type()       |\n> +           +---------------------------------+\n> +                      |\n> +                      v\n> +          +------------------------------------------+\n> +          |  aer_get_device_error_info()             |\n> +          |  - reads PCI_ERR_COR_STATUS              |\n> +          |  - reads PCI_ERR_UNCOR_STATUS  (*if RP/  |\n> +          |    RCEC/DSP, or non-fatal severity)      |\n> +          |  - sets info->is_cxl = pcie_is_cxl(dev)  |\n> +          +------------------------------------------+\n> +                      |\n> +                      v\n> +           +---------------------------------+\n> +           |  
handle_error_source(dev, info) |\n> +           +---------------------------------+\n> +              |                          |\n> +              |  is_cxl_error()          +--->  pci_aer_handle_error()\n> +              |  (CXL device + Internal)        (native PCIe AER path,\n> +              v                                  not covered here)\n> +      +-------------------------------------------------------------+\n> +      | Topology dispatch within AER core:                          |\n> +      |                                                             |\n> +      |   - VH topology  (RP / USP / DSP / EP)                      |\n> +      |     -> drivers/pci/pcie/aer_cxl_vh.c                        |\n> +      |                                                             |\n> +      |   - RCH topology (RCEC iterates RCDs under it)              |\n> +      |     -> drivers/pci/pcie/aer_cxl_rch.c                       |\n> +      +-------------------------------------------------------------+\n> +           |                                            |\n> +           | VH path                            RCH path (RCEC AER)\n> +           v                                            v\n> +      ============= aer_cxl_vh.c (VH      ============= aer_cxl_rch.c (RCH\n> +                    producer) =============              producer) ==========\n> +           |                                            |\n> +           v                                            v\n> +      +-----------------------------+         +-------------------------------+\n> +      | cxl_forward_error(pdev,info)|         | cxl_rch_handle_error_iter()   |\n> +      |  - if AER_CORRECTABLE:      |         |  - iterate each RCD pdev      |\n> +      |     clear PCI_ERR_COR_STATUS|         |    beneath the RCEC           |\n> +      |  - pci_dev_get(pdev)        |         |  - call cxl_forward_error()   |\n> +      |  - build cxl_proto_err_     |         |    for each RCD              
 |\n> +      |    work_data                |         |    (same producer helper as   |\n> +      |    { pdev, severity }       |         |     the VH path uses)         |\n> +      |  - kfifo_in_spinlocked(...) |         +-------------------------------+\n> +      |  - schedule_work(...)       |                       |\n> +      +-----------------------------+                       |\n> +              |                                             |\n> +              +-----------------+---------------------------+\n> +                                |\n> +                                v\n> +                    +--------------------------+\n> +                    |     AER-CXL kfifo        |\n> +                    |     (work_struct)        |\n> +                    +--------------------------+\n> +                                |\n> +                                v\n> +      ============= drivers/cxl/core/ras.c (consumer worker) =======\n> +                                |\n> +                                v\n> +      +-------------------------------------------------------------+\n> +      | cxl_proto_err_work_fn() (workqueue handler)                 |\n> +      |   for_each_cxl_proto_err(&wd, __cxl_proto_err_work_fn)      |\n> +      +-------------------------------------------------------------+\n> +                      |\n> +                      v\n> +      +-------------------------------------------------------------+\n> +      | __cxl_proto_err_work_fn(wd)                                 |\n> +      |   port = find_cxl_port_by_dev(&pdev->dev, &dport)           |\n> +      |   cxl_handle_proto_error(pdev, port, dport, severity)       |\n> +      |   pci_dev_put(pdev)                                         |\n> +      +-------------------------------------------------------------+\n> +                      |\n> +                      v\n> +      +-------------------------------------------------------------+\n> +      | cxl_handle_proto_error()      
                              |\n> +      +-------------------------------------------------------------+\n> +           |                                            |\n> +      pci_pcie_type ==                          pci_pcie_type !=\n> +      PCI_EXP_TYPE_RC_END                       PCI_EXP_TYPE_RC_END\n> +      (RCD Endpoint)                            (VH: RP/USP/DSP/EP)\n> +           |                                            |\n> +           v                                            |\n> +      +-------------------------------------+           |\n> +      | cxl_handle_rdport_errors(pdev)      |           |\n> +      |   - process RCH Downstream Port's   |           |\n> +      |     RAS register block first        |           |\n> +      |   - cxl_handle_cor_ras() for CE     |           |\n> +      |   - cxl_handle_ras() for UE         |           |\n> +      |     (log only; does NOT panic)      |           |\n> +      +-------------------------------------+           |\n> +           |                                            |\n> +           +--------------------+-----------------------+\n> +                                |\n> +                                v\n> +                   +-----------------------------+\n> +                   | severity == AER_CORRECTABLE |\n> +                   +-----------------------------+\n> +                         |                  |\n> +                         yes                no\n> +                         v                  v\n> +            +----------------------+   +-------------------------+\n> +            | cxl_handle_cor_ras() |   | cxl_do_recovery()       |\n> +            |  - emit cxl_aer_     |   | (described below)       |\n> +            |    correctable_      |   +-------------------------+\n> +            |    error trace       |\n> +            | pcie_clear_device_   |\n> +            |   status()           |\n> +            +----------------------+\n> +\n> +                    
+-------------------------------+\n> +                    | cxl_do_recovery()             |\n> +                    |  if pci_dev_is_disconnected:  |\n> +                    |    panic(\"CXL cachemem err.\") |\n> +                    |                               |\n> +                    |  ue = cxl_handle_ras()        |\n> +                    |    -> emit                    |\n> +                    |       cxl_aer_uncorrectable_  |\n> +                    |       error trace event       |\n> +                    |                               |\n> +                    |  if (ue):                     |\n> +                    |    panic(\"CXL cachemem err.\") |\n> +                    |                               |\n> +                    |  pcie_clear_device_status()   |\n> +                    |  pci_aer_clear_nonfatal_status|\n> +                    |  pci_aer_clear_fatal_status   |\n> +                    +-------------------------------+\n> +\n> +\n> +Severity policy\n> +===============\n> +\n> +The kernel's response to a CXL protocol error depends on the AER\n> +severity reported by the device and on the result of inspecting the\n> +CXL RAS registers.\n> +\n> +Correctable Error (CE)\n> +----------------------\n> +\n> +* The AER driver clears ``PCI_ERR_COR_STATUS`` in the producer\n> +  (``cxl_forward_error()``) before enqueue, so the device is\n> +  acknowledged even if the consumer drops the event.\n> +* The consumer's ``cxl_handle_cor_ras()`` reads and clears the CXL\n> +  RAS correctable status and emits a ``cxl_aer_correctable_error``\n> +  trace event.\n> +* No recovery action is taken.\n> +\n> +Uncorrectable Error (UE), non-fatal\n> +-----------------------------------\n> +\n> +* The producer enqueues the event without clearing the AER UCE\n> +  status.\n> +* The consumer enters ``cxl_do_recovery()``.\n> +* ``cxl_handle_ras()`` reads the CXL RAS uncorrectable status and\n> +  emits a ``cxl_aer_uncorrectable_error`` trace event.\n> +* If 
``cxl_handle_ras()`` returns true (a CXL RAS UE bit was set),\n> +  the kernel panics with ``\"CXL cachemem error.\"``. CXL.cachemem\n> +  traffic cannot be safely recovered in software once corruption is\n> +  observed; continuing risks silent data loss across all devices in\n> +  an interleaved HDM region.\n> +* If ``cxl_handle_ras()`` returns false (no CXL RAS bit set, i.e.\n> +  the AER UCE was a PCIe-side issue rather than a CXL.cachemem\n> +  issue), the AER UCE status is cleared and execution continues.\n> +\n> +Uncorrectable Error (UE), fatal\n> +-------------------------------\n> +\n> +Fatal severity follows the same recovery path as non-fatal in\n> +``cxl_do_recovery()``, with one important caveat: the AER core only\n> +reads ``PCI_ERR_UNCOR_STATUS`` for Root Ports, RCECs, Downstream\n> +Ports, or non-fatal severities (see ``aer_get_device_error_info()``\n> +in ``drivers/pci/pcie/aer.c``). For a fatal UE signaled by an\n> +upstream component, PCI config reads to the source device are\n> +expected to fail, so ``UNCOR_STATUS`` is never retrieved and\n> +``info->status`` stays zero.\n> +\n> +The practical consequence: a fatal UE on an Upstream Switch Port or\n> +Endpoint is **not** classified as a CXL error by ``is_cxl_error()``.\n> +It falls through to ``pci_aer_handle_error()`` and is processed by\n> +the standard AER recovery flow. Only the CXL trace events emitted by\n> +the AER core (``aer_event``) appear; the CXL-specific\n> +``cxl_aer_uncorrectable_error`` event is not emitted on this path.\n> +\n> +Disconnect during recovery\n> +--------------------------\n> +\n> +``cxl_do_recovery()`` checks ``pci_dev_is_disconnected(pdev)`` before\n> +touching the RAS registers. 
A device disconnecting during an\n> +uncorrectable error event is itself unrecoverable, particularly when\n> +the device backs an interleaved HDM region; in that case the kernel\n> +panics directly rather than returning ``~0u`` from the readl() and\n> +masking the cause.\n> +\n> +\n> +RCD/RCH special cases\n> +=====================\n> +\n> +RCD Endpoint flow\n> +-----------------\n> +\n> +When ``cxl_handle_proto_error()`` sees ``pci_pcie_type(pdev) ==\n> +PCI_EXP_TYPE_RC_END`` (i.e. an RCD Endpoint), it calls\n> +``cxl_handle_rdport_errors()`` first. This processes the RAS state\n> +of the RCH Downstream Port that hosts the RCD before falling through\n> +to the common CE/UE dispatch on the RCD Endpoint itself.\n> +\n> +The RCH Downstream Port's RAS UE is **logged only**: it emits the\n> +trace event but does not panic. The panic decision is taken on the\n> +RCD Endpoint's own RAS in ``cxl_do_recovery()``.\n> +\n> +This split mirrors the structure of an RCH topology: the RCH dport\n> +is functionally a CXL infrastructure component (similar to a switch\n> +port), while the RCD itself is the actual CXL.cachemem source whose\n> +corruption drives the recovery decision.\n> +\n> +RCH ingress aggregation\n> +-----------------------\n> +\n> +RCH errors do not arrive on a per-RCD interrupt. The RCEC is the AER\n> +source, and the AER driver drives ``cxl_rch_handle_error_iter()`` to\n> +walk each RCD beneath it and forward an event per RCD through the\n> +shared kfifo. 
From the consumer's point of view, RCH-originated\n> +events are indistinguishable from VH events.\n> +\n> +\n> +Trace events\n> +============\n> +\n> +Two unified trace events are emitted from ``cxl_handle_cor_ras()``\n> +and ``cxl_handle_ras()`` and are used by every CXL device type and\n> +both topologies:\n> +\n> +* ``cxl_aer_correctable_error`` - emitted when a CXL RAS CE bit is\n> +  set; carries the human-readable status string.\n> +* ``cxl_aer_uncorrectable_error`` - emitted when a CXL RAS UE bit is\n> +  set; carries both the current status and the first-error pointer.\n> +\n> +Common fields:\n> +\n> +* ``device=<PCI BDF>`` - the source device (always a PCI BDF, even\n> +  for RCH paths where the trace was historically a memdev name).\n> +* ``host=<bridge>`` - the parent host bridge or PCI host BDF.\n> +* ``serial=<u64>`` - the device serial from ``pci_get_dsn()``.\n> +\n> +The ``device`` field replaces the older ``memdev`` field that earlier\n> +revisions emitted on Endpoint events. 
Userspace consumers\n> +(rasdaemon's ``ras-cxl-handler.c``) need a corresponding update to\n> +read the new field name.\n> +\n> +\n> +Source code map\n> +===============\n> +\n> +============================================  ==============================\n> +File                                          Role\n> +============================================  ==============================\n> +``drivers/pci/pcie/aer.c``                    AER core; receives the IRQ,\n> +                                              builds ``aer_err_info``,\n> +                                              dispatches to either the CXL\n> +                                              path (``is_cxl_error()``) or\n> +                                              ``pci_aer_handle_error()``.\n> +``drivers/pci/pcie/aer_cxl_vh.c``             VH producer; provides\n> +                                              ``is_cxl_error()``,\n> +                                              ``cxl_forward_error()``, the\n> +                                              AER-CXL kfifo, and the\n> +                                              consumer registration\n> +                                              helpers.\n> +``drivers/pci/pcie/aer_cxl_rch.c``            RCH producer; iterates RCDs\n> +                                              under an RCEC and forwards\n> +                                              each via\n> +                                              ``cxl_forward_error()``.\n> +``drivers/cxl/core/ras.c``                    Consumer; defines\n> +                                              ``cxl_proto_err_work_fn()``,\n> +                                              ``cxl_handle_proto_error()``,\n> +                                              ``cxl_handle_rdport_errors()``,\n> +                                              ``cxl_do_recovery()``,\n> +                                              ``cxl_handle_cor_ras()`` and\n> +                                        
      ``cxl_handle_ras()``.\n> +``include/linux/aer.h``                       Public declarations:\n> +                                              ``struct cxl_proto_err_work_data``,\n> +                                              ``cxl_proto_err_fn_t``,\n> +                                              ``cxl_register_proto_err_work()``\n> +                                              and ``for_each_cxl_proto_err()``.\n> +============================================  ==============================\n> +\n> +\n> +Limitations and future work\n> +===========================\n> +\n> +* **USP/EP fatal UCE is not classified as CXL.** As described under\n> +  `Severity policy`_, the AER core never retrieves\n> +  ``PCI_ERR_UNCOR_STATUS`` in this scenario, so ``is_cxl_error()``\n> +  cannot tag the event as CXL. The event is handled by the AER path\n> +  only. Resolving this requires either an AER-core change to attempt\n> +  a config read with link-validity gating, or a separate CXL-side\n> +  notification mechanism for upstream-signaled fatal events.\n> +* **User-defined status masks** are not yet supported. All CE and UE\n> +  status bits are reported as they appear in the RAS register.\n> +* **Port traversing in cxl_do_recovery()** is not yet implemented; a\n> +  CXL UE today is reported and acted on at the source device only,\n> +  not propagated to ancestor ports.\n> +* The RCH producer (``aer_cxl_rch.c``) currently lives under\n> +  ``drivers/pci/pcie/`` for historical reasons. 
Moving it to\n> +  ``drivers/cxl/core/ras_rch.c`` is on the roadmap.\n> +","headers":{"Return-Path":"\n <linux-pci+bounces-53936-incoming=patchwork.ozlabs.org@vger.kernel.org>","X-Original-To":["incoming@patchwork.ozlabs.org","linux-pci@vger.kernel.org"],"Delivered-To":"patchwork-incoming@legolas.ozlabs.org","Authentication-Results":["legolas.ozlabs.org;\n\tdkim=pass (2048-bit key;\n unprotected) header.d=intel.com header.i=@intel.com header.a=rsa-sha256\n header.s=Intel header.b=L6h1pyTj;\n\tdkim-atps=neutral","legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org\n (client-ip=2600:3c09:e001:a7::12fc:5321; helo=sto.lore.kernel.org;\n envelope-from=linux-pci+bounces-53936-incoming=patchwork.ozlabs.org@vger.kernel.org;\n receiver=patchwork.ozlabs.org)","smtp.subspace.kernel.org;\n\tdkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com\n header.b=\"L6h1pyTj\"","smtp.subspace.kernel.org;\n arc=none smtp.client-ip=192.198.163.12","smtp.subspace.kernel.org;\n dmarc=pass (p=none dis=none) header.from=intel.com","smtp.subspace.kernel.org;\n spf=pass smtp.mailfrom=intel.com"],"Received":["from sto.lore.kernel.org (sto.lore.kernel.org\n [IPv6:2600:3c09:e001:a7::12fc:5321])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange x25519)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4g9kYv4rr2z1y04\n\tfor <incoming@patchwork.ozlabs.org>; Thu, 07 May 2026 04:34:27 +1000 (AEST)","from smtp.subspace.kernel.org (conduit.subspace.kernel.org\n [100.90.174.1])\n\tby sto.lore.kernel.org (Postfix) with ESMTP id 1C3CD30090B5\n\tfor <incoming@patchwork.ozlabs.org>; Wed,  6 May 2026 18:34:23 +0000 (UTC)","from localhost.localdomain (localhost.localdomain [127.0.0.1])\n\tby smtp.subspace.kernel.org (Postfix) with ESMTP id 6350F3F0775;\n\tWed,  6 May 2026 18:34:21 +0000 (UTC)","from mgamail.intel.com (mgamail.intel.com [192.198.163.12])\n\t(using TLSv1.2 with cipher 
ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))\n\t(No client certificate requested)\n\tby smtp.subspace.kernel.org (Postfix) with ESMTPS id 358BF47DD6A;\n\tWed,  6 May 2026 18:34:19 +0000 (UTC)","from fmviesa001.fm.intel.com ([10.60.135.141])\n  by fmvoesa106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;\n 06 May 2026 11:34:18 -0700","from cmdeoliv-mobl4.amr.corp.intel.com (HELO [10.125.110.169])\n ([10.125.110.169])\n  by smtpauth.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;\n 06 May 2026 11:34:16 -0700"],"ARC-Seal":"i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;\n\tt=1778092461; cv=none;\n b=jEY0xGA5sQoZNulhr0uOkIlSWO+5C8dTe6feVqIJF0aXLAghoxSAI4w0hhMK/gfxPT4GfDlPpsVFsMRtyHRCDjM56CklEAjomnkle3KS8DeuMEPMfz0cj+NPIVylqIcN4CR79zUJNbHdeNSxXAy0QDCQ0wA5a/2Nsj6VXReDS7M=","ARC-Message-Signature":"i=1; a=rsa-sha256; d=subspace.kernel.org;\n\ts=arc-20240116; t=1778092461; c=relaxed/simple;\n\tbh=QDrBENxjsSz9HpYZlPbembLQuDucx2Gt2nomRrO4fQI=;\n\th=Message-ID:Date:MIME-Version:Subject:To:Cc:References:From:\n\t In-Reply-To:Content-Type;\n b=ljnnq3at85vOEe7x4C3S6gepAVyZIiOTK9aqGmFrDToro/qNIWh6SOUrf3WRnK/TCuMxyHr1wiQxfL4IhZqEyhBLQAX0vq/bbg8BE0XFo15kVYCPvz6qk1oFq6REPKZ/JUsoSKEVODCgsUvJ2THhGgja+wBUiNQW2JNcj9J4beQ=","ARC-Authentication-Results":"i=1; smtp.subspace.kernel.org;\n dmarc=pass (p=none dis=none) header.from=intel.com;\n spf=pass smtp.mailfrom=intel.com;\n dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com\n header.b=L6h1pyTj; arc=none smtp.client-ip=192.198.163.12","DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/simple;\n  d=intel.com; i=@intel.com; q=dns/txt; s=Intel;\n  t=1778092459; x=1809628459;\n  h=message-id:date:mime-version:subject:to:cc:references:\n   from:in-reply-to:content-transfer-encoding;\n  bh=QDrBENxjsSz9HpYZlPbembLQuDucx2Gt2nomRrO4fQI=;\n  b=L6h1pyTj2kYkzFpfrCrYProu2BdJ7UI5bdbyqhS/SHuCeMiqcb3P2msy\n   AYRAANiMCbdG4LUqql2RwQVyW8EJQVWHJTUGvCqQ98eepfevJN2QkcKJN\n   
b3g5IrnFlxJV37MJYHMSfODnvWPU9IuQX/sRPrY02Fwg5U+x55mWNjtdz\n   LerWJfVZb1D+fowsfeo/baW/cEtQw40KDjxWK3iol8abPl4AjS0SA2nis\n   f0VTfOf2J1/qGC4nVJn3HpEJK5/ak4lrcSGUKok4KPLCvt8y+IsULZJ4M\n   wnWdMuqot98fY6quB4JwqROuZb8rmnVVhnCdDJivcSX5NNauEfDIkxUqT\n   Q==;","X-CSE-ConnectionGUID":["LSwhBE/OT6KLVL9dq7TxGQ==","pktBvql3TUWGSUHC8UGOWA=="],"X-CSE-MsgGUID":["VsFMiQ1FRBW5naC8K4nsBQ==","Y4XOKrpiQu2NG0OJ84GsmA=="],"X-IronPort-AV":["E=McAfee;i=\"6800,10657,11778\"; a=\"82879344\"","E=Sophos;i=\"6.23,220,1770624000\";\n   d=\"scan'208\";a=\"82879344\"","E=Sophos;i=\"6.23,220,1770624000\";\n   d=\"scan'208\";a=\"259919397\""],"X-ExtLoop1":"1","Message-ID":"<d42199e8-af04-4232-a9eb-eecd2355c314@intel.com>","Date":"Wed, 6 May 2026 11:34:15 -0700","Precedence":"bulk","X-Mailing-List":"linux-pci@vger.kernel.org","List-Id":"<linux-pci.vger.kernel.org>","List-Subscribe":"<mailto:linux-pci+subscribe@vger.kernel.org>","List-Unsubscribe":"<mailto:linux-pci+unsubscribe@vger.kernel.org>","MIME-Version":"1.0","User-Agent":"Mozilla Thunderbird","Subject":"Re: [PATCH v17 11/11] Documentation: cxl: Document CXL protocol error\n handling","To":"Terry Bowman <terry.bowman@amd.com>, dave@stgolabs.net, jic23@kernel.org,\n alison.schofield@intel.com, djbw@kernel.org, bhelgaas@google.com,\n shiju.jose@huawei.com, ming.li@zohomail.com,\n Smita.KoralahalliChannabasappa@amd.com, rrichter@amd.com,\n dan.carpenter@linaro.org, PradeepVineshReddy.Kodamati@amd.com,\n lukas@wunner.de, Benjamin.Cheatham@amd.com,\n sathyanarayanan.kuppuswamy@linux.intel.com, vishal.l.verma@intel.com,\n alucerop@amd.com, ira.weiny@intel.com, corbet@lwn.net, rafael@kernel.org,\n xueshuai@linux.alibaba.com, linux-cxl@vger.kernel.org","Cc":"linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,\n linux-acpi@vger.kernel.org, linux-doc@vger.kernel.org","References":"<20260505173029.2718246-1-terry.bowman@amd.com>\n <20260505173029.2718246-12-terry.bowman@amd.com>","Content-Language":"en-US","From":"Dave Jiang 
<dave.jiang@intel.com>","In-Reply-To":"<20260505173029.2718246-12-terry.bowman@amd.com>","Content-Type":"text/plain; charset=UTF-8","Content-Transfer-Encoding":"7bit"}},{"id":3687997,"web_url":"http://patchwork.ozlabs.org/comment/3687997/","msgid":"<20260507195156.3757a20b@jic23-huawei>","list_archive_url":null,"date":"2026-05-07T18:51:56","subject":"Re: [PATCH v17 11/11] Documentation: cxl: Document CXL protocol\n error handling","submitter":{"id":10151,"url":"http://patchwork.ozlabs.org/api/people/10151/","name":"Jonathan Cameron","email":"jic23@kernel.org"},"content":"On Tue, 5 May 2026 12:30:29 -0500\nTerry Bowman <terry.bowman@amd.com> wrote:\n\n> Add Documentation/driver-api/cxl/linux/protocol-error-handling.rst\n> describing the end-to-end CXL protocol error path: AER ingress, the\n> AER-CXL kfifo handoff, the cxl_core consumer worker, RCD/RCH special\n> cases, severity policy, trace events, and a source code map.\n> \n> This documents the architecture introduced by the preceding patches in\n> this series.\n> \n> This was generated by claude-opus-4.7.\n\nMaybe too much?  I got bored reading it and stopped which is probably\nnot the best sign.\n\nA few formatting related comments inline.\n\nThanks,\n\nJ\n> \n> Assisted-by: Claude:claude-opus-4.7\n> Signed-off-by: Terry Bowman <terry.bowman@amd.com>\n> ---\n>  Documentation/driver-api/cxl/index.rst        |   1 +\n>  .../cxl/linux/protocol-error-handling.rst     | 440 ++++++++++++++++++\n>  2 files changed, 441 insertions(+)\n>  create mode 100644 Documentation/driver-api/cxl/linux/protocol-error-handling.rst\n> \n> diff --git a/Documentation/driver-api/cxl/index.rst b/Documentation/driver-api/cxl/index.rst\n> index 3dfae1d310ca..6861b2e5726a 100644\n> --- a/Documentation/driver-api/cxl/index.rst\n> +++ b/Documentation/driver-api/cxl/index.rst\n> @@ -42,6 +42,7 @@ that have impacts on each other.  
The docs here break up configurations steps.\n>     linux/dax-driver\n>     linux/memory-hotplug\n>     linux/access-coordinates\n> +   linux/protocol-error-handling\n>  \n>  .. toctree::\n>     :maxdepth: 2\n> diff --git a/Documentation/driver-api/cxl/linux/protocol-error-handling.rst b/Documentation/driver-api/cxl/linux/protocol-error-handling.rst\n> new file mode 100644\n> index 000000000000..4d6f33f0ed31\n> --- /dev/null\n> +++ b/Documentation/driver-api/cxl/linux/protocol-error-handling.rst\n> @@ -0,0 +1,440 @@\n> +.. SPDX-License-Identifier: GPL-2.0\n> +\n> +==============================\n> +CXL Protocol Error Handling\n> +==============================\n> +\n> +This document describes how the kernel detects, classifies, dispatches,\n> +logs, and recovers from CXL protocol errors signaled through the PCIe\n> +Advanced Error Reporting (AER) interface. It covers both Virtual\n> +Hierarchy (VH) topologies (Root Ports, Upstream/Downstream Switch\n> +Ports, and Endpoints) and Restricted CXL Host (RCH) topologies\n> +(Root Complex Event Collectors driving Restricted CXL Devices).\n\nOdd drifting wrapping. I thought only humans did that. I guess it's common\nenough in kernel docs maybe it learned it!  Anyhow, I think Docs are 80 char\nlimit in which case something like:\n\nThis document describes how the kernel detects, classifies, dispatches, logs,\nand recovers from CXL protocol errors signaled through the PCIe Advanced Error\nReporting (AER) interface. 
It covers both Virtual Hierarchy (VH) topologies\n(Root Ports, Upstream/Downstream Switch Ports, and Endpoints) and Restricted\nCXL Host (RCH) topologies (Root Complex Event Collectors driving Restricted\nCXL Devices).\n\nMaybe it was intentional to keep lines similar lengths and brackets on last one?\nI'm not sure..\n\n> +\n> +It is intended for kernel developers maintaining or extending\n> +``drivers/pci/pcie/aer*.c``, ``drivers/cxl/core/ras.c``, and the\n> +related plumbing in ``include/linux/aer.h``.\n> +\n> +\n> +Background\n> +==========\n> +\n> +A CXL device reports protocol-layer failures (CXL.cachemem RAS) as\n> +PCIe AER **Internal Errors**: ``PCI_ERR_COR_INTERNAL`` for correctable\n> +events and ``PCI_ERR_UNC_INTN`` for uncorrectable events. From the AER\n> +core's point of view these look like ordinary PCIe AER messages, but\n> +their semantics are CXL-specific: the actual fault information lives\n> +in CXL RAS capability registers, not in the PCIe AER status registers.\n> +\n> +Historically, native CXL.cachemem RAS handling was implemented only\n> +for CXL Endpoints and for RCH Downstream Ports. CXL Root Ports,\n> +Upstream Switch Ports, and Downstream Switch Ports were not covered.\n> +This left the kernel unable to log or react to protocol errors\n> +signaled by switch components.\n\nI'd drop the historical bit.  Not sure it adds value and these tend to\nbecome stale (like all the 'New Courts' in my local Uni. Some of those are
Some of those are\n500+ years old :)\n\n> +\n> +The unified CXL protocol error path closes that gap by routing every\n> +CXL Internal Error through a single producer/consumer pipeline shared\n> +by all CXL device types.\n\nThe unified CXL Protocol path routes every ...\n(so no historical gap - as we don't care now you fixed it ;)\n\nSimilar follows for some other parts - I might not have called them all out.\n\n> +\n> +\n> +Architecture overview\n> +=====================\n> +\n> +CXL protocol error handling is implemented as a distinct error plane\n> +layered on top of the existing PCIe AER infrastructure. The two planes\n\n(drop existing - same why do we need the history theme)\n\n> +are kept separate:\n> +\n> +* The **PCIe AER plane** continues to handle native PCIe errors\n** handles native  \n> +  (Receiver overflows, malformed TLPs, completion timeouts, and so\n> +  on). This is unchanged.\n> +\n> +* The **CXL protocol error plane** owns CXL Internal Errors. The AER\n> +  core forwards them to ``cxl_core`` via a dedicated kfifo; ``cxl_core``\n> +  then dispatches to CE/UE handlers and drives the recovery and\n> +  panic policy.\n> +\n> +The boundary between the two planes is ``is_cxl_error()`` in\n\nI think you can drop the `` and the automarkup.py magic in the kernel docs build\nwill make that :c:func::is_cxl_error or something along those lines to\nboth pretty print it and hopefully match autobuilt kernel-doc (assuming\nwe include it anywhere for cxl)\n\n\n> +===============\n> +\n> +The diagram below shows the full path from an AER interrupt through\n> +producer classification, kfifo handoff, and consumer dispatch.\n> +\n> +.. 
code-block:: text\n> +\n> +   +-------------------------------------------------------------------------+\n> +   |                  CXL Internal Error Packet Flow                         |\n> +   |    From PCIe AER Interrupt to CXL Protocol Error Handling and Logging   |\n> +   +-------------------------------------------------------------------------+\n> +\n> +      CXL device (RP / USP / DSP / EP / RCD) raises AER Internal Error\n> +      (correctable PCI_ERR_COR_INTERNAL or uncorrectable PCI_ERR_UNC_INTN)\n> +                      |\n> +                      v\n> +      +-------------------------------------------------------------+\n> +      |    PCIe Root Port AER MSI/MSI-X interrupt fires             |\n> +      +-------------------------------------------------------------+\n> +                      |\n> +      ============= drivers/pci/pcie/aer.c (AER core) =============\n> +                      |\n> +                      v\n> +           +---------------------------------+\n> +           |  aer_irq()  /  aer_isr()        |  (top + threaded handler)\n> +           +---------------------------------+\n> +                      |\n> +                      v\n> +           +---------------------------------+\n> +           |  aer_isr_one_error()            |\n> +           |  aer_isr_one_error_type()       |\n> +           +---------------------------------+\n> +                      |\n> +                      v\n> +          +------------------------------------------+\n> +          |  aer_get_device_error_info()             |\n> +          |  - reads PCI_ERR_COR_STATUS              |\n> +          |  - reads PCI_ERR_UNCOR_STATUS  (*if RP/  |\n> +          |    RCEC/DSP, or non-fatal severity)      |\n> +          |  - sets info->is_cxl = pcie_is_cxl(dev)  |\n> +          +------------------------------------------+\n> +                      |\n> +                      v\n> +           +---------------------------------+\n> +           |  
handle_error_source(dev, info) |\n> +           +---------------------------------+\n> +              |                          |\n> +              |  is_cxl_error()          +--->  pci_aer_handle_error()\n> +              |  (CXL device + Internal)        (native PCIe AER path,\n> +              v                                  not covered here)\n> +      +-------------------------------------------------------------+\n> +      | Topology dispatch within AER core:                          |\n> +      |                                                             |\n> +      |   - VH topology  (RP / USP / DSP / EP)                      |\n> +      |     -> drivers/pci/pcie/aer_cxl_vh.c                        |\n> +      |                                                             |\n> +      |   - RCH topology (RCEC iterates RCDs under it)              |\n> +      |     -> drivers/pci/pcie/aer_cxl_rch.c                       |\n> +      +-------------------------------------------------------------+\n> +           |                                            |\n> +           | VH path                            RCH path (RCEC AER)\n> +           v                                            v\n> +      ============= aer_cxl_vh.c (VH      ============= aer_cxl_rch.c (RCH\n> +                    producer) =============              producer) ==========\n> +           |                                            |\n> +           v                                            v\n> +      +-----------------------------+         +-------------------------------+\n> +      | cxl_forward_error(pdev,info)|         | cxl_rch_handle_error_iter()   |\n> +      |  - if AER_CORRECTABLE:      |         |  - iterate each RCD pdev      |\n> +      |     clear PCI_ERR_COR_STATUS|         |    beneath the RCEC           |\n> +      |  - pci_dev_get(pdev)        |         |  - call cxl_forward_error()   |\n> +      |  - build cxl_proto_err_     |         |    for each RCD              
 |\n> +      |    work_data                |         |    (same producer helper as   |\n> +      |    { pdev, severity }       |         |     the VH path uses)         |\n> +      |  - kfifo_in_spinlocked(...) |         +-------------------------------+\n> +      |  - schedule_work(...)       |                       |\n> +      +-----------------------------+                       |\n> +              |                                             |\n> +              +-----------------+---------------------------+\n> +                                |\n> +                                v\n> +                    +--------------------------+\n> +                    |     AER-CXL kfifo        |\n> +                    |     (work_struct)        |\n> +                    +--------------------------+\n> +                                |\n> +                                v\n> +      ============= drivers/cxl/core/ras.c (consumer worker) =======\n> +                                |\n> +                                v\n> +      +-------------------------------------------------------------+\n> +      | cxl_proto_err_work_fn() (workqueue handler)                 |\n> +      |   for_each_cxl_proto_err(&wd, __cxl_proto_err_work_fn)      |\n> +      +-------------------------------------------------------------+\n> +                      |\n> +                      v\n> +      +-------------------------------------------------------------+\n> +      | __cxl_proto_err_work_fn(wd)                                 |\n> +      |   port = find_cxl_port_by_dev(&pdev->dev, &dport)           |\n> +      |   cxl_handle_proto_error(pdev, port, dport, severity)       |\n> +      |   pci_dev_put(pdev)                                         |\n> +      +-------------------------------------------------------------+\n> +                      |\n> +                      v\n> +      +-------------------------------------------------------------+\n> +      | cxl_handle_proto_error()      
                              |\n> +      +-------------------------------------------------------------+\n> +           |                                            |\n> +      pci_pcie_type ==                          pci_pcie_type !=\n> +      PCI_EXP_TYPE_RC_END                       PCI_EXP_TYPE_RC_END\n> +      (RCD Endpoint)                            (VH: RP/USP/DSP/EP)\n> +           |                                            |\n> +           v                                            |\n> +      +-------------------------------------+           |\n> +      | cxl_handle_rdport_errors(pdev)      |           |\n> +      |   - process RCH Downstream Port's   |           |\n> +      |     RAS register block first        |           |\n> +      |   - cxl_handle_cor_ras() for CE     |           |\n> +      |   - cxl_handle_ras() for UE         |           |\n> +      |     (log only; does NOT panic)      |           |\n> +      +-------------------------------------+           |\n> +           |                                            |\n> +           +--------------------+-----------------------+\n> +                                |\n> +                                v\n> +                   +-----------------------------+\n> +                   | severity == AER_CORRECTABLE |\n> +                   +-----------------------------+\n> +                         |                  |\n> +                         yes                no\n> +                         v                  v\n> +            +----------------------+   +-------------------------+\n> +            | cxl_handle_cor_ras() |   | cxl_do_recovery()       |\n> +            |  - emit cxl_aer_     |   | (described below)       |\n> +            |    correctable_      |   +-------------------------+\n> +            |    error trace       |\n> +            | pcie_clear_device_   |\n> +            |   status()           |\n> +            +----------------------+\n> +\n> +                    
+-------------------------------+\n> +                    | cxl_do_recovery()             |\n> +                    |  if pci_dev_is_disconnected:  |\n> +                    |    panic(\"CXL cachemem err.\") |\n> +                    |                               |\n> +                    |  ue = cxl_handle_ras()        |\n> +                    |    -> emit                    |\n> +                    |       cxl_aer_uncorrectable_  |\n> +                    |       error trace event       |\n> +                    |                               |\n> +                    |  if (ue):                     |\n> +                    |    panic(\"CXL cachemem err.\") |\n> +                    |                               |\n> +                    |  pcie_clear_device_status()   |\n> +                    |  pci_aer_clear_nonfatal_status|\n> +                    |  pci_aer_clear_fatal_status   |\n> +                    +-------------------------------+\n\nPretty diagram but maybe far too much given we have the code?\n\n> +\n> +\n> +Severity policy\n> +===============\n> +\n> +The kernel's response to a CXL protocol error depends on the AER\n> +severity reported by the device and on the result of inspecting the\n> +CXL RAS registers.\n> +","headers":{"Return-Path":"\n <linux-pci+bounces-54120-incoming=patchwork.ozlabs.org@vger.kernel.org>","X-Original-To":["incoming@patchwork.ozlabs.org","linux-pci@vger.kernel.org"],"Delivered-To":"patchwork-incoming@legolas.ozlabs.org","Authentication-Results":["legolas.ozlabs.org;\n\tdkim=pass (2048-bit key;\n unprotected) header.d=kernel.org header.i=@kernel.org header.a=rsa-sha256\n header.s=k20201202 header.b=O2ut9QPm;\n\tdkim-atps=neutral","legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org\n (client-ip=2600:3c0a:e001:db::12fc:5321; helo=sea.lore.kernel.org;\n envelope-from=linux-pci+bounces-54120-incoming=patchwork.ozlabs.org@vger.kernel.org;\n 
receiver=patchwork.ozlabs.org)","smtp.subspace.kernel.org;\n\tdkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org\n header.b=\"O2ut9QPm\"","smtp.subspace.kernel.org;\n arc=none smtp.client-ip=10.30.226.201"],"Received":["from sea.lore.kernel.org (sea.lore.kernel.org\n [IPv6:2600:3c0a:e001:db::12fc:5321])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange x25519)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4gBLw03LN3z1y04\n\tfor <incoming@patchwork.ozlabs.org>; Fri, 08 May 2026 04:52:16 +1000 (AEST)","from smtp.subspace.kernel.org (conduit.subspace.kernel.org\n [100.90.174.1])\n\tby sea.lore.kernel.org (Postfix) with ESMTP id E9BA1301D323\n\tfor <incoming@patchwork.ozlabs.org>; Thu,  7 May 2026 18:52:11 +0000 (UTC)","from localhost.localdomain (localhost.localdomain [127.0.0.1])\n\tby smtp.subspace.kernel.org (Postfix) with ESMTP id 2426A322A1C;\n\tThu,  7 May 2026 18:52:11 +0000 (UTC)","from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org\n [10.30.226.201])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))\n\t(No client certificate requested)\n\tby smtp.subspace.kernel.org (Postfix) with ESMTPS id F1DD9274B53;\n\tThu,  7 May 2026 18:52:10 +0000 (UTC)","by smtp.kernel.org (Postfix) with ESMTPSA id 497F2C2BCB2;\n\tThu,  7 May 2026 18:52:00 +0000 (UTC)"],"ARC-Seal":"i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;\n\tt=1778179931; cv=none;\n b=NElLSfudHHqFk6/nCRYSgL19r/SDiwlmx/U7YUf9mnKGVVYbGSujGMcsOj5h9WApm/NzkxTTW1aXSEpfQFgQdcbwf2qd+s9F6NuwiUMVa/wXkxB8z9K3rEf4cJxJn7pc4rngZAnQMuB/Npy+Ma1j3LahT/GiXsaadScs+JT51WU=","ARC-Message-Signature":"i=1; a=rsa-sha256; d=subspace.kernel.org;\n\ts=arc-20240116; t=1778179931; c=relaxed/simple;\n\tbh=dPvk0Ao+9NzrVwPu/7e+NRYDILr8HDv2Ef1wxRjfB8A=;\n\th=Date:From:To:Cc:Subject:Message-ID:In-Reply-To:References:\n\t MIME-Version:Content-Type;\n 
b=Tuc7NQN3YjuxDssUd1ugQvG5vp8tCUmMFyK7ZT4nHztghS0c3sjEVYcdGtvdB1ZUULhL2a4OUoQwhT1ObGfJDidi3iqShDEoUs0c6ZLEpZ7gF+DElSnoifGfyfwpoZ+3sCFWxJ/ae2HFaZ5EOnCK63/UgmW94y91RYf2Y5UVQD8=","ARC-Authentication-Results":"i=1; smtp.subspace.kernel.org;\n dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org\n header.b=O2ut9QPm; arc=none smtp.client-ip=10.30.226.201","DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;\n\ts=k20201202; t=1778179930;\n\tbh=dPvk0Ao+9NzrVwPu/7e+NRYDILr8HDv2Ef1wxRjfB8A=;\n\th=Date:From:To:Cc:Subject:In-Reply-To:References:From;\n\tb=O2ut9QPmkIas6p5rRy1BY/vSHFQJmBRPCVUzdc+ircsAvSbD4KdxXf60vz+T3DGou\n\t 3JRiLQc5OR2j+riAjiZ+7ZxoUbmAViw1+T8wvpNLmVgls/D5wW4ReuKvlVDhuoN5Al\n\t tHJxzYE3b/ZOxK0ZlrHOYPul9/8d6vENSyeDkjAiscJ8go/OJSHUo293WFEBZ+rKIt\n\t 08ommsQyaPb1OWcIPoJ/V7pcI4ZRLW/qY0Jiw4sbo+ZpKv3gHwglkMITSsTx1eGnJc\n\t x4SA+cUFGPfjgbmucaN7aBAaqbPsE5+4YbTCHHJmPSpJfQmEGaSavBGx8FGv1zNoiw\n\t wAqdUcfFL3BTA==","Date":"Thu, 7 May 2026 19:51:56 +0100","From":"Jonathan Cameron <jic23@kernel.org>","To":"Terry Bowman <terry.bowman@amd.com>","Cc":"<dave@stgolabs.net>, <dave.jiang@intel.com>,\n <alison.schofield@intel.com>, <djbw@kernel.org>, <bhelgaas@google.com>,\n <shiju.jose@huawei.com>, <ming.li@zohomail.com>,\n <Smita.KoralahalliChannabasappa@amd.com>, <rrichter@amd.com>,\n <dan.carpenter@linaro.org>, <PradeepVineshReddy.Kodamati@amd.com>,\n <lukas@wunner.de>, <Benjamin.Cheatham@amd.com>,\n <sathyanarayanan.kuppuswamy@linux.intel.com>, <vishal.l.verma@intel.com>,\n <alucerop@amd.com>, <ira.weiny@intel.com>, <corbet@lwn.net>,\n <rafael@kernel.org>, <xueshuai@linux.alibaba.com>,\n <linux-cxl@vger.kernel.org>, <linux-kernel@vger.kernel.org>,\n <linux-pci@vger.kernel.org>, <linux-acpi@vger.kernel.org>,\n <linux-doc@vger.kernel.org>","Subject":"Re: [PATCH v17 11/11] Documentation: cxl: Document CXL protocol\n error 
handling","Message-ID":"<20260507195156.3757a20b@jic23-huawei>","In-Reply-To":"<20260505173029.2718246-12-terry.bowman@amd.com>","References":"<20260505173029.2718246-1-terry.bowman@amd.com>\n\t<20260505173029.2718246-12-terry.bowman@amd.com>","X-Mailer":"Claws Mail 4.4.0 (GTK 3.24.52; x86_64-pc-linux-gnu)","Precedence":"bulk","X-Mailing-List":"linux-pci@vger.kernel.org","List-Id":"<linux-pci.vger.kernel.org>","List-Subscribe":"<mailto:linux-pci+subscribe@vger.kernel.org>","List-Unsubscribe":"<mailto:linux-pci+unsubscribe@vger.kernel.org>","MIME-Version":"1.0","Content-Type":"text/plain; charset=US-ASCII","Content-Transfer-Encoding":"7bit"}}]