Cover Letter Detail
Show a cover letter.
GET /api/covers/2217834/?format=api
{ "id": 2217834, "url": "http://patchwork.ozlabs.org/api/covers/2217834/?format=api", "web_url": "http://patchwork.ozlabs.org/project/linux-pci/cover/20260330174011.1161-1-alifm@linux.ibm.com/", "project": { "id": 28, "url": "http://patchwork.ozlabs.org/api/projects/28/?format=api", "name": "Linux PCI development", "link_name": "linux-pci", "list_id": "linux-pci.vger.kernel.org", "list_email": "linux-pci@vger.kernel.org", "web_url": null, "scm_url": null, "webscm_url": null, "list_archive_url": "", "list_archive_url_format": "", "commit_url_format": "" }, "msgid": "<20260330174011.1161-1-alifm@linux.ibm.com>", "list_archive_url": null, "date": "2026-03-30T17:40:04", "name": "[v12,0/7] Error recovery for vfio-pci devices on s390x", "submitter": { "id": 73785, "url": "http://patchwork.ozlabs.org/api/people/73785/?format=api", "name": "Farhan Ali", "email": "alifm@linux.ibm.com" }, "mbox": "http://patchwork.ozlabs.org/project/linux-pci/cover/20260330174011.1161-1-alifm@linux.ibm.com/mbox/", "series": [ { "id": 498071, "url": "http://patchwork.ozlabs.org/api/series/498071/?format=api", "web_url": "http://patchwork.ozlabs.org/project/linux-pci/list/?series=498071", "date": "2026-03-30T17:40:08", "name": "Error recovery for vfio-pci devices on s390x", "version": 12, "mbox": "http://patchwork.ozlabs.org/series/498071/mbox/" } ], "comments": "http://patchwork.ozlabs.org/api/covers/2217834/comments/", "headers": { "Return-Path": "\n <linux-pci+bounces-51480-incoming=patchwork.ozlabs.org@vger.kernel.org>", "X-Original-To": [ "incoming@patchwork.ozlabs.org", "linux-pci@vger.kernel.org" ], "Delivered-To": "patchwork-incoming@legolas.ozlabs.org", "Authentication-Results": [ "legolas.ozlabs.org;\n\tdkim=pass (2048-bit key;\n unprotected) header.d=ibm.com header.i=@ibm.com header.a=rsa-sha256\n header.s=pp1 header.b=FNuZ2JIt;\n\tdkim-atps=neutral", "legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org\n (client-ip=2600:3c0a:e001:db::12fc:5321; 
helo=sea.lore.kernel.org;\n envelope-from=linux-pci+bounces-51480-incoming=patchwork.ozlabs.org@vger.kernel.org;\n receiver=patchwork.ozlabs.org)", "smtp.subspace.kernel.org;\n\tdkim=pass (2048-bit key) header.d=ibm.com header.i=@ibm.com\n header.b=\"FNuZ2JIt\"", "smtp.subspace.kernel.org;\n arc=none smtp.client-ip=148.163.156.1", "smtp.subspace.kernel.org;\n dmarc=pass (p=none dis=none) header.from=linux.ibm.com", "smtp.subspace.kernel.org;\n spf=pass smtp.mailfrom=linux.ibm.com" ], "Received": [ "from sea.lore.kernel.org (sea.lore.kernel.org\n [IPv6:2600:3c0a:e001:db::12fc:5321])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange x25519)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4fkz8q0R15z1xrn\n\tfor <incoming@patchwork.ozlabs.org>; Tue, 31 Mar 2026 04:42:19 +1100 (AEDT)", "from smtp.subspace.kernel.org (conduit.subspace.kernel.org\n [100.90.174.1])\n\tby sea.lore.kernel.org (Postfix) with ESMTP id BA86F302D08B\n\tfor <incoming@patchwork.ozlabs.org>; Mon, 30 Mar 2026 17:40:26 +0000 (UTC)", "from localhost.localdomain (localhost.localdomain [127.0.0.1])\n\tby smtp.subspace.kernel.org (Postfix) with ESMTP id 901883DC4A9;\n\tMon, 30 Mar 2026 17:40:25 +0000 (UTC)", "from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com\n [148.163.156.1])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))\n\t(No client certificate requested)\n\tby smtp.subspace.kernel.org (Postfix) with ESMTPS id 090F53126A0;\n\tMon, 30 Mar 2026 17:40:23 +0000 (UTC)", "from pps.filterd (m0356517.ppops.net [127.0.0.1])\n\tby mx0a-001b2d01.pphosted.com (8.18.1.11/8.18.1.11) with ESMTP id\n 62UESOZK115373;\n\tMon, 30 Mar 2026 17:40:17 GMT", "from ppma23.wdc07v.mail.ibm.com\n (5d.69.3da9.ip4.static.sl-reverse.com [169.61.105.93])\n\tby mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 4d66q2yyrp-1\n\t(version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);\n\tMon, 30 Mar 2026 
17:40:16 +0000 (GMT)", "from pps.filterd (ppma23.wdc07v.mail.ibm.com [127.0.0.1])\n\tby ppma23.wdc07v.mail.ibm.com (8.18.1.2/8.18.1.2) with ESMTP id\n 62UF0txp013897;\n\tMon, 30 Mar 2026 17:40:15 GMT", "from smtprelay04.wdc07v.mail.ibm.com ([172.16.1.71])\n\tby ppma23.wdc07v.mail.ibm.com (PPS) with ESMTPS id 4d6ttkdsfk-1\n\t(version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);\n\tMon, 30 Mar 2026 17:40:15 +0000", "from smtpav04.dal12v.mail.ibm.com (smtpav04.dal12v.mail.ibm.com\n [10.241.53.103])\n\tby smtprelay04.wdc07v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id\n 62UHeEmw36438778\n\t(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK);\n\tMon, 30 Mar 2026 17:40:14 GMT", "from smtpav04.dal12v.mail.ibm.com (unknown [127.0.0.1])\n\tby IMSVA (Postfix) with ESMTP id EAC795805A;\n\tMon, 30 Mar 2026 17:40:13 +0000 (GMT)", "from smtpav04.dal12v.mail.ibm.com (unknown [127.0.0.1])\n\tby IMSVA (Postfix) with ESMTP id 217D158052;\n\tMon, 30 Mar 2026 17:40:13 +0000 (GMT)", "from IBM-D32RQW3.ibm.com (unknown [9.61.243.214])\n\tby smtpav04.dal12v.mail.ibm.com (Postfix) with ESMTP;\n\tMon, 30 Mar 2026 17:40:13 +0000 (GMT)" ], "ARC-Seal": "i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;\n\tt=1774892425; cv=none;\n b=IhgljtgDSpBI53hc5Ym4P6K0xCb7L/DeVpk727hJ2InRkJDa408JYkvbJBo3ebOVlj0ixt0rgoww+tIiCCQoGZwE77DmKJGQBCaDTpkIcTmFDGS/x+8oEweZevBCU/hgvj6cUQ1EJBAEPZwWqnBLe+f3rgBE2Q6h73jb12lhUfE=", "ARC-Message-Signature": "i=1; a=rsa-sha256; d=subspace.kernel.org;\n\ts=arc-20240116; t=1774892425; c=relaxed/simple;\n\tbh=T337hOKff8oCp8nowJzz7orVI5G6Qq8GEwjZkz5jFoc=;\n\th=From:To:Cc:Subject:Date:Message-ID:MIME-Version;\n b=LEZhP76k7QV5h0YkVrb/izbRNZbPIN+hpgPcNlPdjj+TkGgvlpjoRXrF4S+yix3pnZEU6KnXZwQmmDyvvX6RTttaLfcYsYVBhTzyyRC7vXBSWX+y2GGN5E7jmEZU1mcJYHmAQf0A29XoDvkirtJIMnPTLpE6mjVNPe8rRDRxaIE=", "ARC-Authentication-Results": "i=1; smtp.subspace.kernel.org;\n dmarc=pass (p=none dis=none) header.from=linux.ibm.com;\n spf=pass 
smtp.mailfrom=linux.ibm.com;\n dkim=pass (2048-bit key) header.d=ibm.com header.i=@ibm.com\n header.b=FNuZ2JIt; arc=none smtp.client-ip=148.163.156.1", "DKIM-Signature": "v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=cc\n\t:content-transfer-encoding:date:from:message-id:mime-version\n\t:subject:to; s=pp1; bh=65M6hJzQ8Nl9hWQP6B/+Ao7LCZRcc6emmL/qtJYSZ\n\tXs=; b=FNuZ2JIt7OSHy6j3Fx0xgoUbAhXrTVz5ObFDODLbqdKEVZjV6mmgJE81d\n\tSwD6tEuTOFiP9ACusnrEJ8Pq+nkBfuHvGud8PGAWCFN2YB5dxX/Mh5EA2GDrKk5Q\n\tqGDKjNAXoQOzQqhFA6YUU/87IQWKOJte6gA5qUGcdsjbqZBffZtTsxFfJEN5onAU\n\tblZvHmHNiWaxyQCa/3TFOqolYJchj479W7NS6KtYTgtFqLTv8b77oclX0L4P+vX8\n\tgYmW27cadAcZlVjJl6MIkCb9nBfRYaNqQrubH6eK44Hjab9DJT7QG2Yq96ljTtOJ\n\tPYL3S3SdLC0vgfLDqqaQEstC/iSVQ==", "From": "Farhan Ali <alifm@linux.ibm.com>", "To": "linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org,\n linux-pci@vger.kernel.org", "Cc": "helgaas@kernel.org, lukas@wunner.de, alex@shazbot.org, clg@redhat.com,\n kbusch@kernel.org, alifm@linux.ibm.com, schnelle@linux.ibm.com,\n mjrosato@linux.ibm.com", "Subject": "[PATCH v12 0/7] Error recovery for vfio-pci devices on s390x", "Date": "Mon, 30 Mar 2026 10:40:04 -0700", "Message-ID": "<20260330174011.1161-1-alifm@linux.ibm.com>", "X-Mailer": "git-send-email 2.43.0", "Precedence": "bulk", "X-Mailing-List": "linux-pci@vger.kernel.org", "List-Id": "<linux-pci.vger.kernel.org>", "List-Subscribe": "<mailto:linux-pci+subscribe@vger.kernel.org>", "List-Unsubscribe": "<mailto:linux-pci+unsubscribe@vger.kernel.org>", "MIME-Version": "1.0", "Content-Transfer-Encoding": "8bit", "X-TM-AS-GCONF": "00", "X-Proofpoint-GUID": "a_4fv9VptW3RBX40WnwXdKylNqoP7QmF", "X-Authority-Analysis": "v=2.4 cv=frzRpV4f c=1 sm=1 tr=0 ts=69cab580 cx=c_pps\n a=3Bg1Hr4SwmMryq2xdFQyZA==:117 a=3Bg1Hr4SwmMryq2xdFQyZA==:17\n a=Yq5XynenixoA:10 a=VkNPw1HP01LnGYTKEx00:22 a=RnoormkPH1_aCDwRdu11:22\n a=U7nrCbtTmkRpXpFmAIza:22 a=VwQbUJbxAAAA:8 a=VnNF1IyMAAAA:8\n a=H5O81k4uhyXwW133l-cA:9", "X-Proofpoint-ORIG-GUID": 
"a_4fv9VptW3RBX40WnwXdKylNqoP7QmF", "X-Proofpoint-Spam-Details-Enc": "AW1haW4tMjYwMzMwMDE0MSBTYWx0ZWRfX6Xtk7m38uJU2\n uv4VN4nrrLffQ9zNzi8IzRRs7EssBKp9l7r0QUm/5MGKLS/0B1FUHKPySqVGy9ATxRD773flPK9\n 3C36DXPr6z23W32iiHjz2ENJmwG9cIFuN72YtYFvVGHTZ2AMaYV9Bd6Eyt0n8bNauJkIqbKIxjp\n uaJkopRV2HNSGwz3EODM3nhC0E3LCYgs3xyfc2a7H7uMrZnOYhnMMqZo48wzADC653YWEQOATbn\n qB8BQ8o63NRnRd4whbrePUaTdxUTLJV6plxmbyLQOPUQwEtk+aNqImjch69SeUPMmOpZgrl1W7+\n LzIikOXbe1sSIL2DELmNPklmXvZNMZzcoZW9rY/7MtJl/p7MZBgBPozZltUE3Tjdox0KIZa0p2h\n qZkC/lk2ovGnn1s6+1zHxDYmTgP734v6yMIr2jzkEhEOXrgDy7xj+L99IsS+EL1WH2ncnqQC9Y9\n iePFgutPjTZ+oQuTLbA==", "X-Proofpoint-Virus-Version": "vendor=baseguard\n engine=ICAP:2.0.293,Aquarius:18.0.1143,Hydra:6.1.51,FMLib:17.12.100.49\n definitions=2026-03-29_05,2026-03-28_01,2025-10-01_01", "X-Proofpoint-Spam-Details": "rule=outbound_notspam policy=outbound score=0\n impostorscore=0 spamscore=0 priorityscore=1501 malwarescore=0 clxscore=1015\n lowpriorityscore=0 bulkscore=0 adultscore=0 suspectscore=0 phishscore=0\n classifier=typeunknown authscore=0 authtc= authcc= route=outbound adjust=0\n reason=mlx scancount=1 engine=8.22.0-2603050001 definitions=main-2603300141" }, "content": "Hi,\n\nThis Linux kernel patch series introduces support for error recovery for\npassthrough PCI devices on System Z (s390x). \n\nBackground\n----------\nFor PCI devices on s390x an operating system receives platform specific\nerror events from firmware rather than through AER.Today for\npassthrough/userspace devices, we don't attempt any error recovery and\nignore any error events for the devices. The passthrough/userspace devices\nare managed by the vfio-pci driver. The driver does register error handling\ncallbacks (error_detected), and on an error trigger an eventfd to\nuserspace. But we need a mechanism to notify userspace\n(QEMU/guest/userspace drivers) about the error event. \n\nProposal\n--------\nWe can expose this error information (currently only the PCI Error Code)\nvia a device feature. 
Userspace can then obtain the error information \nvia VFIO_DEVICE_FEATURE ioctl and take appropriate actions such as driving \na device reset.\n\nThis is how a typical flow for passthrough devices to a VM would work:\nFor passthrough devices to a VM, the driver bound to the device on the host \nis vfio-pci. The vfio-pci driver does support the error_detected() callback \n(vfio_pci_core_aer_err_detected()), and on a PCI error s390x recovery \ncode on the host will call the vfio-pci error_detected() callback. The \nvfio-pci error_detected() callback will notify userspace/QEMU via an \neventfd, and return PCI_ERS_RESULT_CAN_RECOVER. At this point the s390x \nerror recovery on the host will skip any further action (see patch 4) and \nlet userspace drive the error recovery.\n\nOnce userspace/QEMU is notified, it then injects this error into the VM \nso device drivers in the VM can take recovery actions. For example, for a \npassthrough NVMe device, the VM's OS NVMe driver will access the device. \nAt this point the VM's NVMe driver's error_detected() will drive the \nrecovery by returning PCI_ERS_RESULT_NEED_RESET, and the s390x error \nrecovery in the VM's OS will try to do a reset. Resets are privileged \noperations and so the VM will need intervention from QEMU to perform the \nreset. QEMU will invoke the VFIO_DEVICE_RESET ioctl to now notify the \nhost that the VM is requesting a reset of the device. The vfio-pci driver \non the host will then perform the reset on the device to recover it.\n\n\nThanks\nFarhan\n\nChangeLog\n---------\nv11 series https://lore.kernel.org/all/20260316191544.2279-1-alifm@linux.ibm.com/\n - Address Bjorn's comments from v11 (patches 1-3).\n\n - Create a common function to check config space accessibility \n (patch 2).\n\n - Address Alex's comments from v11 (patches 4, 5, 7).\n\n - Protect the mediated_recovery flag with the pending_errs_lock.\n Doing that, it made sense to squash patches 5 and 6 from v11 \n (current patch 4). 
Even though the code didn't change significantly, \n I have dropped R-b tags for it. Would appreciate another look at the\n patch (current patch 4).\n\n - Dropped arch-specific pcibios_resource_to_bus and\n pcibios_bus_to_resource as they are not needed for this series. Will address\n the issue as a standalone patch separate from this series.\n\n - Rebased on pci/next, with head at f8a1c947ccc6 (\"Merge branch 'pci/misc'\") \n\n\nv10 series https://lore.kernel.org/all/20260302203325.3826-1-alifm@linux.ibm.com/\nv10 -> v11\n - Rebase on pci/next to handle merge conflicts with patch 1.\n \n - Typo fixup in commit message (patch 4) and use guard() for mutex\n (patch 6).\n\nv9 series https://lore.kernel.org/all/20260217182257.1582-1-alifm@linux.ibm.com/\nv9 -> v10\n - Change pci_slot number to u16 (patch 1).\n\n - Avoid saving invalid config space state if config space is\n inaccessible in the device reset path. It uses the same patch as in v8\n with R-b from Niklas.\n\n - Rebase on 7.0.0-rc2\n\n\nv8 series https://lore.kernel.org/all/20260122194437.1903-1-alifm@linux.ibm.com/\nv8 -> v9\n - Avoid saving PCI config space state in reset path (patch 3) (suggested by Bjorn)\n \n - Add explicit version to struct vfio_device_feature_zpci_err (patch 7).\n\n - Rebase on 6.19\n\n\nv7 series https://lore.kernel.org/all/20260107183217.1365-1-alifm@linux.ibm.com/\nv7 -> v8\n - Rebase on 6.19-rc4\n\n - Address feedback from Niklas and Julien.\n\n\nv6 series https://lore.kernel.org/all/2c609e61-1861-4bf3-b019-a11c137d26a5@linux.ibm.com/\nv6 -> v7\n - Rebase on 6.19-rc4\n\n - Update commit message based on Niklas's suggestion (patch 3).\n\nv5 series https://lore.kernel.org/all/20251113183502.2388-1-alifm@linux.ibm.com/\nv5 -> v6\n - Rebase on 6.18 + Lukas's PCI: Universal error recoverability of\n devices series (https://lore.kernel.org/all/cover.1763483367.git.lukas@wunner.de/)\n\n - Re-work config space accessibility check to pci_dev_save_and_disable() (patch 3).\n This avoids saving the 
config space, in the reset path, if the device's config space is\n corrupted or inaccessible.\n\nv4 series https://lore.kernel.org/all/20250924171628.826-1-alifm@linux.ibm.com/\nv4 -> v5\n - Rebase on 6.18-rc5\n\n - Move bug fixes to the beginning of the series (patches 1 and 2). These patches\n were posted as a separate fixes series \nhttps://lore.kernel.org/all/a14936ac-47d6-461b-816f-0fd66f869b0f@linux.ibm.com/\n\n - Add matching pci_dev_put() for pci_get_slot() (patch 6).\n\nv3 series https://lore.kernel.org/all/20250911183307.1910-1-alifm@linux.ibm.com/\nv3 -> v4\n - Remove warn messages for each PCI capability not restored (patch 1)\n\n - Check PCI_COMMAND and PCI_STATUS register for error value instead of device id \n (patch 1)\n\n - Fix kernel crash in patch 3\n\n - Add Reviewed-by tags\n\n - Address comments from Niklas (patches 4, 5, 7)\n\n - Fix compilation error on non-s390x systems (patch 8)\n\n - Explicitly align struct vfio_device_feature_zpci_err (patch 8)\n\n\nv2 series https://lore.kernel.org/all/20250825171226.1602-1-alifm@linux.ibm.com/\nv2 -> v3\n - Patch 1 avoids saving any config space state if the device is in error\n (suggested by Alex)\n\n - Patch 2 adds additional check only for FLR reset to try other function \n reset method (suggested by Alex).\n\n - Patch 3 fixes a bug in s390 for resetting PCI devices with multiple\n functions. Creates a new flag pci_slot to allow per function slot.\n\n - Patch 4 fixes a bug in s390 for resource to bus address translation.\n\n - Rebase on 6.17-rc5\n\n\nv1 series https://lore.kernel.org/all/20250813170821.1115-1-alifm@linux.ibm.com/\nv1 -> v2\n - Patches 1 and 2 add some additional checks for FLR/PM reset to \n try other function reset method (suggested by Alex).\n\n - Patch 3 fixes a bug in s390 for resetting PCI devices with multiple\n functions.\n\n - Patch 7 adds a new device feature for zPCI devices for the VFIO_DEVICE_FEATURE \n ioctl. 
The ioctl is used by userspace to retrieve any PCI error\n information for the device (suggested by Alex).\n\n - Patch 8 adds a reset_done() callback for the vfio-pci driver, to\n restore the state of the device after a reset.\n\n - Patch 9 removes the pcie check for triggering VFIO_PCI_ERR_IRQ_INDEX.\n\n\nFarhan Ali (7):\n PCI: Allow per function PCI slots to fix slot reset on s390\n PCI: Avoid saving config space state if inaccessible\n PCI: Fail FLR when config space is inaccessible\n s390/pci: Store PCI error information for passthrough devices\n vfio-pci/zdev: Add a device feature for error information\n vfio/pci: Add a reset_done callback for vfio-pci driver\n vfio/pci: Remove the pcie check for VFIO_PCI_ERR_IRQ_INDEX\n\n arch/s390/include/asm/pci.h | 30 ++++++++\n arch/s390/pci/pci.c | 1 +\n arch/s390/pci/pci_event.c | 113 +++++++++++++++++-------------\n drivers/pci/hotplug/rpaphp_slot.c | 2 +-\n drivers/pci/pci.c | 32 ++++++++-\n drivers/pci/slot.c | 33 ++++++---\n drivers/vfio/pci/vfio_pci_core.c | 22 ++++--\n drivers/vfio/pci/vfio_pci_intrs.c | 3 +-\n drivers/vfio/pci/vfio_pci_priv.h | 9 +++\n drivers/vfio/pci/vfio_pci_zdev.c | 45 +++++++++++-\n include/linux/pci.h | 8 ++-\n include/uapi/linux/vfio.h | 18 +++++\n 12 files changed, 247 insertions(+), 69 deletions(-)" }
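
A client reading this response might want the series version and patch count encoded in the "name" field, plus the series mbox URL. The following is a minimal sketch of that, not part of Patchwork itself. It works on a trimmed copy of the response above rather than calling the server. The helper names (parse_cover_name, summarize_cover) and the regular expression are hypothetical, and treating a name with no "vN," prefix as version 1 is an assumption.

```python
import json
import re

# Trimmed copy of the response above; only the fields this sketch reads are kept.
SAMPLE = json.loads("""
{
  "id": 2217834,
  "name": "[v12,0/7] Error recovery for vfio-pci devices on s390x",
  "series": [
    {"id": 498071, "version": 12,
     "mbox": "http://patchwork.ozlabs.org/series/498071/mbox/"}
  ]
}
""")

# Hypothetical pattern for names shaped like "[v12,0/7] Title".
# The "vN," part is optional here (assumption: a first posting omits it).
NAME_RE = re.compile(
    r"^\[(?:v(?P<version>\d+),)?(?P<index>\d+)/(?P<total>\d+)\]\s*(?P<title>.*)$"
)


def parse_cover_name(name):
    """Split a cover letter name into version, index, total and title."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError("unrecognized cover letter name: %r" % name)
    return {
        # Assumption: a missing version prefix means version 1.
        "version": int(match.group("version") or 1),
        "index": int(match.group("index")),
        "total": int(match.group("total")),
        "title": match.group("title"),
    }


def summarize_cover(cover):
    """Combine the parsed name with the mbox URLs of the listed series."""
    summary = parse_cover_name(cover["name"])
    summary["series_mboxes"] = [s["mbox"] for s in cover.get("series", [])]
    return summary


summary = summarize_cover(SAMPLE)
print(summary)
```

For the response shown here, the version parsed from "name" can be cross-checked against the "version" field of the series entry, which is reported separately by the API.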