Patch Detail
get:
Show a patch.
patch:
Update a patch.
put:
Update a patch.
GET /api/patches/2196504/?format=api
{ "id": 2196504, "url": "http://patchwork.ozlabs.org/api/patches/2196504/?format=api", "web_url": "http://patchwork.ozlabs.org/project/linux-pci/patch/20260214081130.1878424-1-liusizhe5@huawei.com/", "project": { "id": 28, "url": "http://patchwork.ozlabs.org/api/projects/28/?format=api", "name": "Linux PCI development", "link_name": "linux-pci", "list_id": "linux-pci.vger.kernel.org", "list_email": "linux-pci@vger.kernel.org", "web_url": null, "scm_url": null, "webscm_url": null, "list_archive_url": "", "list_archive_url_format": "", "commit_url_format": "" }, "msgid": "<20260214081130.1878424-1-liusizhe5@huawei.com>", "list_archive_url": null, "date": "2026-02-14T08:11:30", "name": "[v3] PCI/AER: Fix missing AER logs in DPC and EDR paths", "commit_ref": null, "pull_url": null, "state": "new", "archived": false, "hash": "f5091f9b0525fca50891f8fda50a0bd510ae7c22", "submitter": { "id": 92512, "url": "http://patchwork.ozlabs.org/api/people/92512/?format=api", "name": "Sizhe Liu", "email": "liusizhe5@huawei.com" }, "delegate": null, "mbox": "http://patchwork.ozlabs.org/project/linux-pci/patch/20260214081130.1878424-1-liusizhe5@huawei.com/mbox/", "series": [ { "id": 492164, "url": "http://patchwork.ozlabs.org/api/series/492164/?format=api", "web_url": "http://patchwork.ozlabs.org/project/linux-pci/list/?series=492164", "date": "2026-02-14T08:11:30", "name": "[v3] PCI/AER: Fix missing AER logs in DPC and EDR paths", "version": 3, "mbox": "http://patchwork.ozlabs.org/series/492164/mbox/" } ], "comments": "http://patchwork.ozlabs.org/api/patches/2196504/comments/", "check": "pending", "checks": "http://patchwork.ozlabs.org/api/patches/2196504/checks/", "tags": {}, "related": [], "headers": { "Return-Path": "\n <linux-pci+bounces-47287-incoming=patchwork.ozlabs.org@vger.kernel.org>", "X-Original-To": [ "incoming@patchwork.ozlabs.org", "linux-pci@vger.kernel.org" ], "Delivered-To": "patchwork-incoming@legolas.ozlabs.org", "Authentication-Results": [ 
"legolas.ozlabs.org;\n\tdkim=pass (1024-bit key;\n unprotected) header.d=huawei.com header.i=@huawei.com header.a=rsa-sha256\n header.s=dkim header.b=311697jU;\n\tdkim-atps=neutral", "legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org\n (client-ip=172.234.253.10; helo=sea.lore.kernel.org;\n envelope-from=linux-pci+bounces-47287-incoming=patchwork.ozlabs.org@vger.kernel.org;\n receiver=patchwork.ozlabs.org)", "smtp.subspace.kernel.org;\n\tdkim=pass (1024-bit key) header.d=huawei.com header.i=@huawei.com\n header.b=\"311697jU\"", "smtp.subspace.kernel.org;\n arc=none smtp.client-ip=113.46.200.227", "smtp.subspace.kernel.org;\n dmarc=pass (p=quarantine dis=none) header.from=huawei.com", "smtp.subspace.kernel.org;\n spf=pass smtp.mailfrom=huawei.com" ], "Received": [ "from sea.lore.kernel.org (sea.lore.kernel.org [172.234.253.10])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange x25519 server-signature ECDSA (secp384r1) server-digest SHA384)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4fChZh6jGlz1xpY\n\tfor <incoming@patchwork.ozlabs.org>; Sat, 14 Feb 2026 19:11:40 +1100 (AEDT)", "from smtp.subspace.kernel.org (conduit.subspace.kernel.org\n [100.90.174.1])\n\tby sea.lore.kernel.org (Postfix) with ESMTP id D56413021722\n\tfor <incoming@patchwork.ozlabs.org>; Sat, 14 Feb 2026 08:11:38 +0000 (UTC)", "from localhost.localdomain (localhost.localdomain [127.0.0.1])\n\tby smtp.subspace.kernel.org (Postfix) with ESMTP id 5C5D9285061;\n\tSat, 14 Feb 2026 08:11:38 +0000 (UTC)", "from canpmsgout12.his.huawei.com (canpmsgout12.his.huawei.com\n [113.46.200.227])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))\n\t(No client certificate requested)\n\tby smtp.subspace.kernel.org (Postfix) with ESMTPS id BA67F217F27\n\tfor <linux-pci@vger.kernel.org>; Sat, 14 Feb 2026 08:11:34 +0000 (UTC)", "from mail.maildlp.com (unknown [172.19.163.214])\n\tby 
canpmsgout12.his.huawei.com (SkyGuard) with ESMTPS id 4fChT06SYwznTVY\n\tfor <linux-pci@vger.kernel.org>; Sat, 14 Feb 2026 16:06:44 +0800 (CST)", "from dggemv712-chm.china.huawei.com (unknown [10.1.198.32])\n\tby mail.maildlp.com (Postfix) with ESMTPS id 05C544056C\n\tfor <linux-pci@vger.kernel.org>; Sat, 14 Feb 2026 16:11:32 +0800 (CST)", "from kwepemn200012.china.huawei.com (7.202.194.135) by\n dggemv712-chm.china.huawei.com (10.1.198.32) with Microsoft SMTP Server\n (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id\n 15.2.1544.11; Sat, 14 Feb 2026 16:11:31 +0800", "from huawei.com (10.50.163.32) by kwepemn200012.china.huawei.com\n (7.202.194.135) with Microsoft SMTP Server (version=TLS1_2,\n cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.11; Sat, 14 Feb\n 2026 16:11:31 +0800" ], "ARC-Seal": "i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;\n\tt=1771056698; cv=none;\n b=CU0TI68tg/WpLm8MrX2cJX1Bw8cmR+KsIRKFLXmSV2WzVDrfvsvgAlXIbRzGkq7FdPj7ok6PsYC6mtlacVhm1g+JPfRPCI5TfdAAtI9YIFVc9X/3Z38x68E9td81CQq1Cjn2vdc7FTNLwIkNfVVbsiNTnCxYnFb383iRh2zWmIE=", "ARC-Message-Signature": "i=1; a=rsa-sha256; d=subspace.kernel.org;\n\ts=arc-20240116; t=1771056698; c=relaxed/simple;\n\tbh=tvo6rIfnqOnefm4zIHCJSMsxGA4X1br/LOQN/nXmCeI=;\n\th=From:To:CC:Subject:Date:Message-ID:MIME-Version:Content-Type;\n b=lkVySYpub3VBMpJvLqqDNOUH5JbeRX6Q+EcVF3H5NqlEiHTtTV1i+yjKT1u7GB+I0/EXsljkoYdoGrwuonkfANeKZn9yFZQE4/P2uDaXlw4Qc3VjaV/9i8wlNUA+haR6nMOSQQ40mCTESF4OqWQ8+O4cA1LRdD7JsEmQ/v8J618=", "ARC-Authentication-Results": "i=1; smtp.subspace.kernel.org;\n dmarc=pass (p=quarantine dis=none) header.from=huawei.com;\n spf=pass smtp.mailfrom=huawei.com;\n dkim=pass (1024-bit key) header.d=huawei.com header.i=@huawei.com\n header.b=311697jU; arc=none smtp.client-ip=113.46.200.227", "dkim-signature": "v=1; a=rsa-sha256; d=huawei.com; s=dkim;\n\tc=relaxed/relaxed; 
q=dns/txt;\n\th=From;\n\tbh=jhXQ2c+stQbuJbbMGNcF7xhNlI3LtCrlHs3NkwrtwLc=;\n\tb=311697jU2T1D4gxhuwxmYjhbnxTQTsZ8ihkAQGhkTTWCpfvICS38VYV1wV/MpTfiRcyQLjCx1\n\tEE2gJtWAoxKSYvFOgtaxD5QAJySIQfTl0yOoAOpm3koPfUwsE8Zgow9owVABVX3v7wVg11LaO08\n\t0mRHOY18xRSx8/pH45Ew9Ak=", "From": "Sizhe Liu <liusizhe5@huawei.com>", "To": "<bhelgaas@google.com>, <jonathan.cameron@huawei.com>,\n\t<shiju.jose@huawei.com>, <pandoh@google.com>", "CC": "<linux-pci@vger.kernel.org>, <linuxarm@huawei.com>,\n\t<prime.zeng@hisilicon.com>, <fanghao11@huawei.com>, <shenyang39@huawei.com>,\n\t<liusizhe5@huawei.com>", "Subject": "[PATCH v3] PCI/AER: Fix missing AER logs in DPC and EDR paths", "Date": "Sat, 14 Feb 2026 16:11:30 +0800", "Message-ID": "<20260214081130.1878424-1-liusizhe5@huawei.com>", "X-Mailer": "git-send-email 2.33.0", "Precedence": "bulk", "X-Mailing-List": "linux-pci@vger.kernel.org", "List-Id": "<linux-pci.vger.kernel.org>", "List-Subscribe": "<mailto:linux-pci+subscribe@vger.kernel.org>", "List-Unsubscribe": "<mailto:linux-pci+unsubscribe@vger.kernel.org>", "MIME-Version": "1.0", "Content-Transfer-Encoding": "8bit", "Content-Type": "text/plain", "X-ClientProxiedBy": "kwepems100001.china.huawei.com (7.221.188.238) To\n kwepemn200012.china.huawei.com (7.202.194.135)" }, "content": "Current DPC and EDR processing paths fail to print complete AER logs\ninformation: aer_print_error() returns early without printing logs\ndue to uninitialized ratelimit_print variables.\n\nPhenomenon:\n-- Error log abnormal\npcieport 0000:20:00.0: DPC: containment event, status: 0x1f11: unmasked uncorrectable error detected\n(------ AER logs should be printed here, but are missing ------)\nnvme nvme0: frozen state error detected, reset controller\n{4}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 0\n\nRoot cause analysis:\nIn aer_print_error(), PCIe AER errors are rate-limited via info->ratelimit_print[i],\nbut this variable is not initialized in DPC/EDR paths, leading to\nearly 
return before log printing.\n\naer_print_error() entry points and key logic:\n1) Native AER path (working correctly):\naer_isr_one_error_type(aer_err_info)\n find_source_device(aer_err_info)\n find_device_iter(aer_err_info)\n add_error_device(aer_err_info)\n pci_dev_get(aer_err_info) <-- Get refcount\n if (aer_ratelimit(aer_err_info->severity))\n info->ratelimit_print[i] = 1 <-- Init ratelimit\n aer_process_err_devices(aer_err_info)\n aer_print_error(aer_err_info)\n if (!info->ratelimit_print[i])\n return;\n __aer_print_error() <-- Print AER info\n handle_error_source(aer_err_info)\n pci_dev_put() <-- Release refcount\n\n2) DPC path (missing ratelimit init; refcount not held):\ndpc_handler()\n dpc_process_error()\n dpc_get_aer_uncorrect_severity(aer_err_info)\n aer_get_device_error_info(aer_err_info)\n aer_print_error(aer_err_info)\n if (!info->ratelimit_print[i])\n return; <-- Early return (no log)\n __aer_print_error()\n\n3) EDR path (missing ratelimit init only):\nedr_handle_event()\n acpi_dpc_port_get()\n pci_dev_get() / pci_get_domain_bus_and_slot() <-- Get refcount\n dpc_process_error()\n dpc_get_aer_uncorrect_severity(aer_err_info)\n aer_get_device_error_info(aer_err_info)\n aer_print_error(aer_err_info)\n if (!info->ratelimit_print[i])\n return; <-- Early return (no log)\n __aer_print_error()\n pci_dev_put() <-- Release refcount\n\nFix approach (minimal intrusive):\n1. Extract the initialization of info->ratelimit_print[i]\nand e_info->root_ratelimit_print into aer_ratelimit_print_init().\n2. Call aer_ratelimit_print_init() in dpc_process_error() to initialize\nratelimit variables for DPC/EDR paths.\n3. 
Add pci_dev_get()/pci_dev_put() pairs in dpc_handler() to align\nDPC path with EDR and native AER paths.\n\nNote on not using add_error_device() directly:\nhttps://lore.kernel.org/linux-pci/962a43ca-9593-430b-9a89-1591a5ae9bf9@huawei.com/\n\nTest case (AER error injection):\n- Enable DPC toggle in BIOS.\n- Inject MalfTLP (AER FATAL ERROR) to target device.\n\nTest result (normal error log):\npcieport 0000:20:00.0: DPC: containment event, status:0x1f11: unmasked uncorrectable error detected\npcieport 0000:20:00.0: PCIe Bus Error: severity=Uncorrectable (Fatal), type=Transaction Layer, (Receiver ID)\npcieport 0000:20:00.0: device [19e5:a120] error status/mask=00040000/04580000\npcieport 0000:20:00.0: [18] MalfTLP (First)\npcieport 0000:20:00.0: AER: TLP Header: 0x00000000 0x00000000 0x00000000 0x00000000\nnvme nvme0: frozen state error detected, reset controller\n{2}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 0\n\n[1] https://lore.kernel.org/linux-pci/20260127035405.712271-1-liusizhe5@huawei.com/\n[2] https://lore.kernel.org/linux-pci/20260129140103.712011-1-liusizhe5@huawei.com/\n\nFixes: a57f2bfb4a58 (\"PCI/AER: Ratelimit correctable and non-fatal error logging\")\nSigned-off-by: Sizhe Liu <liusizhe5@huawei.com>\n---\nv1 -> v2\n- Corrected the format and spelling errors in the commit log.\n\nv2 -> v3\n- Add PCI device reference count handling in the DPC path.\n\n drivers/pci/pci.h | 2 ++\n drivers/pci/pcie/aer.c | 36 ++++++++++++++++++++++++------------\n drivers/pci/pcie/dpc.c | 3 +++\n 3 files changed, 29 insertions(+), 12 deletions(-)", "diff": "diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h\nindex c8a0522e2e1f..070ef360189e 100644\n--- a/drivers/pci/pci.h\n+++ b/drivers/pci/pci.h\n@@ -840,6 +840,8 @@ struct aer_err_info {\n \n int aer_get_device_error_info(struct aer_err_info *info, int i);\n void aer_print_error(struct aer_err_info *info, int i);\n+void aer_ratelimit_print_init(struct pci_dev *dev, struct aer_err_info 
*e_info,\n+\t\t\tint idx);\n \n int pcie_read_tlp_log(struct pci_dev *dev, int where, int where2,\n \t\t unsigned int tlp_len, bool flit,\ndiff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c\nindex 2b0ac6bfdd76..f8d248e397ce 100644\n--- a/drivers/pci/pcie/aer.c\n+++ b/drivers/pci/pcie/aer.c\n@@ -972,6 +972,29 @@ void pci_print_aer(struct pci_dev *dev, int aer_severity,\n }\n EXPORT_SYMBOL_GPL(pci_print_aer);\n \n+/**\n+ * aer_ratelimit_print_init - set flag whether error message should be printed\n+ * @dev: pointer to pci_dev to be rate-limited\n+ * @e_info: pointer to aer error info\n+ * @idx: index for dev array in aer error info\n+ */\n+void aer_ratelimit_print_init(struct pci_dev *dev, struct aer_err_info *e_info,\n+\t\t\tint idx)\n+{\n+\t/*\n+\t * Ratelimit AER log messages. \"dev\" is either the source\n+\t * identified by the root's Error Source ID or it has an unmasked\n+\t * error logged in its own AER Capability. Messages are emitted\n+\t * when \"ratelimit_print[i]\" is non-zero. If we will print detail\n+\t * for a downstream device, make sure we print the Error Source ID\n+\t * from the root as well.\n+\t */\n+\tif (aer_ratelimit(dev, e_info->severity)) {\n+\t\te_info->ratelimit_print[idx] = 1;\n+\t\te_info->root_ratelimit_print = 1;\n+\t}\n+}\n+\n /**\n * add_error_device - list device to be handled\n * @e_info: pointer to error info\n@@ -987,18 +1010,7 @@ static int add_error_device(struct aer_err_info *e_info, struct pci_dev *dev)\n \te_info->dev[i] = pci_dev_get(dev);\n \te_info->error_dev_num++;\n \n-\t/*\n-\t * Ratelimit AER log messages. \"dev\" is either the source\n-\t * identified by the root's Error Source ID or it has an unmasked\n-\t * error logged in its own AER Capability. Messages are emitted\n-\t * when \"ratelimit_print[i]\" is non-zero. 
If we will print detail\n-\t * for a downstream device, make sure we print the Error Source ID\n-\t * from the root as well.\n-\t */\n-\tif (aer_ratelimit(dev, e_info->severity)) {\n-\t\te_info->ratelimit_print[i] = 1;\n-\t\te_info->root_ratelimit_print = 1;\n-\t}\n+\taer_ratelimit_print_init(dev, e_info, i);\n \treturn 0;\n }\n \ndiff --git a/drivers/pci/pcie/dpc.c b/drivers/pci/pcie/dpc.c\nindex fc18349614d7..6ee1f3daa766 100644\n--- a/drivers/pci/pcie/dpc.c\n+++ b/drivers/pci/pcie/dpc.c\n@@ -275,6 +275,7 @@ void dpc_process_error(struct pci_dev *pdev)\n \t\t\t status);\n \t\tif (dpc_get_aer_uncorrect_severity(pdev, &info) &&\n \t\t aer_get_device_error_info(&info, 0)) {\n+\t\t\taer_ratelimit_print_init(pdev, &info, 0);\n \t\t\taer_print_error(&info, 0);\n \t\t\tpci_aer_clear_nonfatal_status(pdev);\n \t\t\tpci_aer_clear_fatal_status(pdev);\n@@ -372,11 +373,13 @@ static irqreturn_t dpc_handler(int irq, void *context)\n \t\treturn IRQ_HANDLED;\n \t}\n \n+\tpci_dev_get(pdev);\n \tdpc_process_error(pdev);\n \n \t/* We configure DPC so it only triggers on ERR_FATAL */\n \tpcie_do_recovery(pdev, pci_channel_io_frozen, dpc_reset_link);\n \n+\tpci_dev_put(pdev);\n \treturn IRQ_HANDLED;\n }\n \n", "prefixes": [ "v3" ] }
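
A minimal sketch of how a client might read a patch-detail response like the one above, assuming the JSON body is already available as a string. The helper names `summarize_patch` and `changed_files` and the trimmed `SAMPLE` record are hypothetical and are not part of Patchwork; only the field names (`id`, `name`, `state`, `check`, `submitter`, `series`, `mbox`, `diff`) are taken from the response shown here.

```python
import json


def changed_files(diff: str) -> list:
    """Return the paths named on the 'diff --git' lines of a unified diff."""
    files = []
    for line in diff.splitlines():
        if line.startswith("diff --git "):
            # Header shape: "diff --git a/<path> b/<path>"; keep the b/ side.
            parts = line.split(" b/", 1)
            if len(parts) == 2:
                files.append(parts[1])
    return files


def summarize_patch(payload: str) -> dict:
    """Pick a few fields out of a patch-detail JSON document (hypothetical helper)."""
    patch = json.loads(payload)
    series = patch.get("series") or []
    return {
        "id": patch["id"],
        "name": patch["name"],
        "state": patch["state"],
        "check": patch["check"],
        "submitter": patch["submitter"]["email"],
        "series_versions": [s["version"] for s in series],
        "mbox": patch["mbox"],
        "files": changed_files(patch.get("diff", "")),
    }


# Trimmed copy of the record above, for illustration only (most fields omitted).
SAMPLE = json.dumps({
    "id": 2196504,
    "name": "[v3] PCI/AER: Fix missing AER logs in DPC and EDR paths",
    "state": "new",
    "check": "pending",
    "submitter": {"id": 92512, "name": "Sizhe Liu", "email": "liusizhe5@huawei.com"},
    "series": [{"id": 492164, "version": 3}],
    "mbox": "http://patchwork.ozlabs.org/project/linux-pci/patch/"
            "20260214081130.1878424-1-liusizhe5@huawei.com/mbox/",
    "diff": "diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h\n"
            "index c8a0522e2e1f..070ef360189e 100644\n"
            "diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c\n"
            "index 2b0ac6bfdd76..f8d248e397ce 100644\n"
            "diff --git a/drivers/pci/pcie/dpc.c b/drivers/pci/pcie/dpc.c\n"
            "index fc18349614d7..6ee1f3daa766 100644\n",
})

if __name__ == "__main__":
    print(summarize_patch(SAMPLE))
```

The sketch deliberately does no network I/O, so it can be run offline. In real use the string passed to `summarize_patch` would be the body of the `GET` request shown at the top of this page.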