Cover Letter Detail
Show a cover letter.
GET /api/1.1/covers/2222854/?format=api
{ "id": 2222854, "url": "http://patchwork.ozlabs.org/api/1.1/covers/2222854/?format=api", "web_url": "http://patchwork.ozlabs.org/project/linux-pci/cover/20260413210608.2912-1-alifm@linux.ibm.com/", "project": { "id": 28, "url": "http://patchwork.ozlabs.org/api/1.1/projects/28/?format=api", "name": "Linux PCI development", "link_name": "linux-pci", "list_id": "linux-pci.vger.kernel.org", "list_email": "linux-pci@vger.kernel.org", "web_url": null, "scm_url": null, "webscm_url": null }, "msgid": "<20260413210608.2912-1-alifm@linux.ibm.com>", "date": "2026-04-13T21:06:01", "name": "[v13,0/7] Error recovery for vfio-pci devices on s390x", "submitter": { "id": 73785, "url": "http://patchwork.ozlabs.org/api/1.1/people/73785/?format=api", "name": "Farhan Ali", "email": "alifm@linux.ibm.com" }, "mbox": "http://patchwork.ozlabs.org/project/linux-pci/cover/20260413210608.2912-1-alifm@linux.ibm.com/mbox/", "series": [ { "id": 499754, "url": "http://patchwork.ozlabs.org/api/1.1/series/499754/?format=api", "web_url": "http://patchwork.ozlabs.org/project/linux-pci/list/?series=499754", "date": "2026-04-13T21:06:01", "name": "Error recovery for vfio-pci devices on s390x", "version": 13, "mbox": "http://patchwork.ozlabs.org/series/499754/mbox/" } ], "comments": "http://patchwork.ozlabs.org/api/covers/2222854/comments/", "headers": { "Return-Path": "\n <linux-pci+bounces-52448-incoming=patchwork.ozlabs.org@vger.kernel.org>", "X-Original-To": [ "incoming@patchwork.ozlabs.org", "linux-pci@vger.kernel.org" ], "Delivered-To": "patchwork-incoming@legolas.ozlabs.org", "Authentication-Results": [ "legolas.ozlabs.org;\n\tdkim=pass (2048-bit key;\n unprotected) header.d=ibm.com header.i=@ibm.com header.a=rsa-sha256\n header.s=pp1 header.b=G4u3k65y;\n\tdkim-atps=neutral", "legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org\n (client-ip=2600:3c15:e001:75::12fc:5321; helo=sin.lore.kernel.org;\n 
envelope-from=linux-pci+bounces-52448-incoming=patchwork.ozlabs.org@vger.kernel.org;\n receiver=patchwork.ozlabs.org)", "smtp.subspace.kernel.org;\n\tdkim=pass (2048-bit key) header.d=ibm.com header.i=@ibm.com\n header.b=\"G4u3k65y\"", "smtp.subspace.kernel.org;\n arc=none smtp.client-ip=148.163.158.5", "smtp.subspace.kernel.org;\n dmarc=pass (p=none dis=none) header.from=linux.ibm.com", "smtp.subspace.kernel.org;\n spf=pass smtp.mailfrom=linux.ibm.com" ], "Received": [ "from sin.lore.kernel.org (sin.lore.kernel.org\n [IPv6:2600:3c15:e001:75::12fc:5321])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange x25519 server-signature ECDSA (secp384r1) server-digest SHA384)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4fvg1t1FtHz1y2d\n\tfor <incoming@patchwork.ozlabs.org>; Tue, 14 Apr 2026 07:06:26 +1000 (AEST)", "from smtp.subspace.kernel.org (conduit.subspace.kernel.org\n [100.90.174.1])\n\tby sin.lore.kernel.org (Postfix) with ESMTP id 846A43016D07\n\tfor <incoming@patchwork.ozlabs.org>; Mon, 13 Apr 2026 21:06:21 +0000 (UTC)", "from localhost.localdomain (localhost.localdomain [127.0.0.1])\n\tby smtp.subspace.kernel.org (Postfix) with ESMTP id 1254E3932ED;\n\tMon, 13 Apr 2026 21:06:21 +0000 (UTC)", "from mx0b-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com\n [148.163.158.5])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))\n\t(No client certificate requested)\n\tby smtp.subspace.kernel.org (Postfix) with ESMTPS id 6216C3939C2;\n\tMon, 13 Apr 2026 21:06:19 +0000 (UTC)", "from pps.filterd (m0356516.ppops.net [127.0.0.1])\n\tby mx0a-001b2d01.pphosted.com (8.18.1.11/8.18.1.11) with ESMTP id\n 63DF937Q3213180;\n\tMon, 13 Apr 2026 21:06:12 GMT", "from ppma12.dal12v.mail.ibm.com\n (dc.9e.1632.ip4.static.sl-reverse.com [50.22.158.220])\n\tby mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 4dfbqkhgwj-1\n\t(version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 
verify=NOT);\n\tMon, 13 Apr 2026 21:06:12 +0000 (GMT)", "from pps.filterd (ppma12.dal12v.mail.ibm.com [127.0.0.1])\n\tby ppma12.dal12v.mail.ibm.com (8.18.1.2/8.18.1.2) with ESMTP id\n 63DGvZFa015158;\n\tMon, 13 Apr 2026 21:06:11 GMT", "from smtprelay01.wdc07v.mail.ibm.com ([172.16.1.68])\n\tby ppma12.dal12v.mail.ibm.com (PPS) with ESMTPS id 4dg0msf0c6-1\n\t(version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);\n\tMon, 13 Apr 2026 21:06:11 +0000", "from smtpav03.wdc07v.mail.ibm.com (smtpav03.wdc07v.mail.ibm.com\n [10.39.53.230])\n\tby smtprelay01.wdc07v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id\n 63DL6AF455837046\n\t(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK);\n\tMon, 13 Apr 2026 21:06:10 GMT", "from smtpav03.wdc07v.mail.ibm.com (unknown [127.0.0.1])\n\tby IMSVA (Postfix) with ESMTP id 261065805C;\n\tMon, 13 Apr 2026 21:06:10 +0000 (GMT)", "from smtpav03.wdc07v.mail.ibm.com (unknown [127.0.0.1])\n\tby IMSVA (Postfix) with ESMTP id E467C58054;\n\tMon, 13 Apr 2026 21:06:08 +0000 (GMT)", "from IBM-D32RQW3.ibm.com (unknown [9.61.254.131])\n\tby smtpav03.wdc07v.mail.ibm.com (Postfix) with ESMTP;\n\tMon, 13 Apr 2026 21:06:08 +0000 (GMT)" ], "ARC-Seal": "i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;\n\tt=1776114381; cv=none;\n b=DT1oQyNCHA9EMgCpeLK1CXf6BUW5sRZ0bwM/OPMl7LSQu7JJajZ80SZRf/CZOScp41blN37YBOjMT5YvBpNs9puhZoQGayeaE5+LAFxRyUlJvUkhE7wyA2Z3sH4YPNOEPF1+x637VwfXNvTM+zuqRPovc4JaLGFwWbU40rEG7Us=", "ARC-Message-Signature": "i=1; a=rsa-sha256; d=subspace.kernel.org;\n\ts=arc-20240116; t=1776114381; c=relaxed/simple;\n\tbh=905a7Gm+lEwi6tzNpIvJl4J55YjdmUP5tfjWaXKVpiE=;\n\th=From:To:Cc:Subject:Date:Message-ID:MIME-Version;\n b=LP+VNw5mSgVIft4z/KNi0We4xNiIXOBJtk7WKTp61xCC4YLxaSS1C8odsPxZTcaqfjUsAGN7Jii9/G7UOZjuiCLnTn4DAelNCMpEdY57e17cXTcwSswBU4Mj9o5znd6q0BaBnwae/EuuxEwbfOT2cDEDr7Q/odrurm9ai2fDoZ4=", "ARC-Authentication-Results": "i=1; smtp.subspace.kernel.org;\n dmarc=pass (p=none dis=none) 
header.from=linux.ibm.com;\n spf=pass smtp.mailfrom=linux.ibm.com;\n dkim=pass (2048-bit key) header.d=ibm.com header.i=@ibm.com\n header.b=G4u3k65y; arc=none smtp.client-ip=148.163.158.5", "DKIM-Signature": "v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=cc\n\t:content-transfer-encoding:date:from:message-id:mime-version\n\t:subject:to; s=pp1; bh=Kn8WQSpbxnIxLVzwP1X5Gg2yLWzbVdeqqr5fbvMYU\n\tUU=; b=G4u3k65yVjpKUehzn2QflV9RHvtP/0sCX17EMbEaUzcuboN9U34c0EazV\n\tvUP2GRIM/AElPa+XsFHBxAqRXaAXpw+MgUmNadYPTE50PTwSrJrZbb6fuNV0gEcW\n\tHHJi4ik0ZvPK/JkRjrGQEZWRvcQcg3CWxz0UhMPjlckbh7UPNzd/MZZiY5YEDVfw\n\ttwvGXTyPcxBibN2QI7zwYE7bcCS+QKsXWwwyE3gGDM5dVmpsyzjE15ADflGlMpqK\n\trGfVT+Yeq1Mat1qf1aqxsyzhAWGZ+29S4phCNsdAZcOFdhrnzY5wYqevCo1Z42Gz\n\t1q1kOLovnWSXZGlu609JJuSZT8gKQ==", "From": "Farhan Ali <alifm@linux.ibm.com>", "To": "linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org,\n linux-pci@vger.kernel.org", "Cc": "helgaas@kernel.org, lukas@wunner.de, alex@shazbot.org, clg@redhat.com,\n kbusch@kernel.org, alifm@linux.ibm.com, schnelle@linux.ibm.com,\n mjrosato@linux.ibm.com", "Subject": "[PATCH v13 0/7] Error recovery for vfio-pci devices on s390x", "Date": "Mon, 13 Apr 2026 14:06:01 -0700", "Message-ID": "<20260413210608.2912-1-alifm@linux.ibm.com>", "X-Mailer": "git-send-email 2.43.0", "Precedence": "bulk", "X-Mailing-List": "linux-pci@vger.kernel.org", "List-Id": "<linux-pci.vger.kernel.org>", "List-Subscribe": "<mailto:linux-pci+subscribe@vger.kernel.org>", "List-Unsubscribe": "<mailto:linux-pci+unsubscribe@vger.kernel.org>", "MIME-Version": "1.0", "Content-Transfer-Encoding": "8bit", "X-TM-AS-GCONF": "00", "X-Authority-Analysis": "v=2.4 cv=I+9Vgtgg c=1 sm=1 tr=0 ts=69dd5ac4 cx=c_pps\n a=bLidbwmWQ0KltjZqbj+ezA==:117 a=bLidbwmWQ0KltjZqbj+ezA==:17\n a=A5OVakUREuEA:10 a=VkNPw1HP01LnGYTKEx00:22 a=RnoormkPH1_aCDwRdu11:22\n a=Y2IxJ9c9Rs8Kov3niI8_:22 a=VwQbUJbxAAAA:8 a=VnNF1IyMAAAA:8\n a=H5O81k4uhyXwW133l-cA:9 a=O8hF6Hzn-FEA:10", "X-Proofpoint-GUID": 
"zRZ7xHMDTyvLUaDTKfIe9kg-VI_Zhb9f", "X-Proofpoint-Spam-Details-Enc": "AW1haW4tMjYwNDEzMDIwNSBTYWx0ZWRfXyQ/rvJEgXmRN\n rWk1KO585je+f1avygDFnwNzsMgzB+7IqNztNmeJhG2xSLYCdE4AXCCaBEQcZ8VTu0aNrXb8l9O\n ezs1Nwizmbl8CJzSQn/S+ZDVgIIRZ/ctb0UKEGvMoW2EDobaluw2Zt2VU/Ng/+htr6EU/Gh6RcG\n ohAqMJV2WPj35r9nNtIap/CCsY8IIIow9g+pv/77UILHua1wGMUQXHB2qH9JLlBXV5hL2wUyYIZ\n oVtUb8t1LqXAFU8ZcnA7WTp8Z5C6tNvfdnWfofxzvwjLf0diwwv4Yog/Dq5MRfnBMq0Y04YdtHn\n ZLhMm6JBjHSD4ozd8MWPdHg2x8NPmvJV9DocFyV1MIiIk19eJrJYRlCJE70QoH3c/iJq7Wmc1Ty\n SRqLPxfAFxboqcynSiFXcQwljuNn26vzsrhsFQ7Pijvxv7AZNoDvP8+WGmpUO5rz6j0w4ntgqWg\n 6T/XFVNEydd3kdDNQTA==", "X-Proofpoint-ORIG-GUID": "zRZ7xHMDTyvLUaDTKfIe9kg-VI_Zhb9f", "X-Proofpoint-Virus-Version": "vendor=baseguard\n engine=ICAP:2.0.293,Aquarius:18.0.1143,Hydra:6.1.51,FMLib:17.12.100.49\n definitions=2026-04-13_03,2026-04-13_04,2025-10-01_01", "X-Proofpoint-Spam-Details": "rule=outbound_notspam policy=outbound score=0\n priorityscore=1501 impostorscore=0 bulkscore=0 suspectscore=0 adultscore=0\n lowpriorityscore=0 spamscore=0 malwarescore=0 clxscore=1015 phishscore=0\n classifier=typeunknown authscore=0 authtc= authcc= route=outbound adjust=0\n reason=mlx scancount=1 engine=8.22.0-2604010000 definitions=main-2604130205" }, "content": "Hi,\n\nThis Linux kernel patch series introduces support for error recovery for\npassthrough PCI devices on System Z (s390x).\n\nBackground\n----------\nFor PCI devices on s390x an operating system receives platform specific\nerror events from firmware rather than through AER.Today for\npassthrough/userspace devices, we don't attempt any error recovery and\nignore any error events for the devices. The passthrough/userspace devices\nare managed by the vfio-pci driver. The driver does register error handling\ncallbacks (error_detected), and on an error trigger an eventfd to\nuserspace. 
But we need a mechanism to notify userspace\n(QEMU/guest/userspace drivers) about the error event.\n\nProposal\n--------\nWe can expose this error information (currently only the PCI Error Code)\nvia a device feature. Userspace can then obtain the error information\nvia the VFIO_DEVICE_FEATURE ioctl and take appropriate actions such as driving\na device reset.\n\nThis is how a typical flow for passthrough devices to a VM would work:\nFor passthrough devices to a VM, the driver bound to the device on the host\nis vfio-pci. The vfio-pci driver supports the error_detected() callback\n(vfio_pci_core_aer_err_detected()), and on a PCI error the s390x recovery\ncode on the host will call the vfio-pci error_detected() callback. The\nvfio-pci error_detected() callback will notify userspace/QEMU via an\neventfd, and return PCI_ERS_RESULT_CAN_RECOVER. At this point the s390x\nerror recovery on the host will skip any further action (see patch 4) and\nlet userspace drive the error recovery.\n\nOnce userspace/QEMU is notified, it then injects this error into the VM\nso device drivers in the VM can take recovery actions. For example, for a\npassthrough NVMe device, the VM's OS NVMe driver will access the device.\nAt this point the VM's NVMe driver's error_detected() will drive the\nrecovery by returning PCI_ERS_RESULT_NEED_RESET, and the s390x error\nrecovery in the VM's OS will try to do a reset. Resets are privileged\noperations and so the VM will need intervention from QEMU to perform the\nreset. QEMU will invoke the VFIO_DEVICE_RESET ioctl to notify the\nhost that the VM is requesting a reset of the device. 
The vfio-pci driver\non the host will then perform the reset on the device to recover it.\n\n\nThanks\nFarhan\n\nChangeLog\n---------\nv12 series https://lore.kernel.org/all/20260330174011.1161-1-alifm@linux.ibm.com/\nv12 -> v13\n - Add the mediated_recovery flag as part of struct zpci_ccdf_pending\n and protect the struct with pending_errs_lock (patch 4).\n\n - Move the logic for dequeuing pending errors to a helper function (patch 5).\n\n - Update device feature number for VFIO_DEVICE_FEATURE_ZPCI_ERROR (patch 5).\n\n - Rebase on linux-next with tag next-20260410\n\n\nv11 series https://lore.kernel.org/all/20260316191544.2279-1-alifm@linux.ibm.com/\nv11 -> v12\n - Address Bjorn's comments from v11 (patches 1-3).\n\n - Create a common function to check config space accessibility\n (patch 2).\n\n - Address Alex's comments from v11 (patches 4, 5, 7).\n\n - Protect the mediated_recovery flag with the pending_errs_lock.\n With that change it made sense to squash patches 5 and 6 from v11\n (current patch 4). Even though the code didn't change significantly\n I have dropped R-b tags for it. Would appreciate another look at the\n patch (current patch 4).\n\n - Dropped arch specific pcibios_resource_to_bus and\n pcibios_bus_to_resource as they are not needed for this series. Will address\n the issue as a standalone patch separate from this series.\n\n - Rebased on pci/next, with head at f8a1c947ccc6 (\"Merge branch 'pci/misc'\")\n\n\nv10 series https://lore.kernel.org/all/20260302203325.3826-1-alifm@linux.ibm.com/\nv10 -> v11\n - Rebase on pci/next to handle merge conflicts with patch 1.\n\n - Typo fixup in commit message (patch 4) and use guard() for mutex\n (patch 6).\n\nv9 series https://lore.kernel.org/all/20260217182257.1582-1-alifm@linux.ibm.com/\nv9 -> v10\n - Change pci_slot number to u16 (patch 1).\n\n - Avoid saving invalid config space state if config space is\n inaccessible in the device reset path. 
It uses the same patch as in v8\n with R-b from Niklas.\n\n - Rebase on 7.0.0-rc2\n\n\nv8 series https://lore.kernel.org/all/20260122194437.1903-1-alifm@linux.ibm.com/\nv8 -> v9\n - Avoid saving PCI config space state in reset path (patch 3) (suggested by Bjorn)\n\n - Add explicit version to struct vfio_device_feature_zpci_err (patch 7).\n\n - Rebase on 6.19\n\n\nv7 series https://lore.kernel.org/all/20260107183217.1365-1-alifm@linux.ibm.com/\nv7 -> v8\n - Rebase on 6.19-rc4\n\n - Address feedback from Niklas and Julien.\n\n\nv6 series https://lore.kernel.org/all/2c609e61-1861-4bf3-b019-a11c137d26a5@linux.ibm.com/\nv6 -> v7\n - Rebase on 6.19-rc4\n\n - Update commit message based on Niklas's suggestion (patch 3).\n\nv5 series https://lore.kernel.org/all/20251113183502.2388-1-alifm@linux.ibm.com/\nv5 -> v6\n - Rebase on 6.18 + Lukas's PCI: Universal error recoverability of\n devices series (https://lore.kernel.org/all/cover.1763483367.git.lukas@wunner.de/)\n\n - Re-work config space accessibility check to pci_dev_save_and_disable() (patch 3).\n This avoids saving the config space, in the reset path, if the device's config space is\n corrupted or inaccessible.\n\nv4 series https://lore.kernel.org/all/20250924171628.826-1-alifm@linux.ibm.com/\nv4 -> v5\n - Rebase on 6.18-rc5\n\n - Move bug fixes to the beginning of the series (patch 1 and 2). 
These patches\n were posted as a separate fixes series\nhttps://lore.kernel.org/all/a14936ac-47d6-461b-816f-0fd66f869b0f@linux.ibm.com/\n\n - Add matching pci_dev_put() for pci_get_slot() (patch 6).\n\nv3 series https://lore.kernel.org/all/20250911183307.1910-1-alifm@linux.ibm.com/\nv3 -> v4\n - Remove warn messages for each PCI capability not restored (patch 1)\n\n - Check PCI_COMMAND and PCI_STATUS registers for error value instead of device id\n (patch 1)\n\n - Fix kernel crash in patch 3\n\n - Add Reviewed-by tags\n\n - Address comments from Niklas (patches 4, 5, 7)\n\n - Fix compilation error on non-s390x systems (patch 8)\n\n - Explicitly align struct vfio_device_feature_zpci_err (patch 8)\n\n\nv2 series https://lore.kernel.org/all/20250825171226.1602-1-alifm@linux.ibm.com/\nv2 -> v3\n - Patch 1 avoids saving any config space state if the device is in error\n (suggested by Alex)\n\n - Patch 2 adds an additional check, only for FLR reset, to try other function\n reset methods (suggested by Alex).\n\n - Patch 3 fixes a bug in s390 for resetting PCI devices with multiple\n functions. Creates a new flag pci_slot to allow per-function slots.\n\n - Patch 4 fixes a bug in s390 for resource to bus address translation.\n\n - Rebase on 6.17-rc5\n\n\nv1 series https://lore.kernel.org/all/20250813170821.1115-1-alifm@linux.ibm.com/\nv1 -> v2\n - Patches 1 and 2 add some additional checks for FLR/PM reset to\n try other function reset methods (suggested by Alex).\n\n - Patch 3 fixes a bug in s390 for resetting PCI devices with multiple\n functions.\n\n - Patch 7 adds a new device feature for zPCI devices for the VFIO_DEVICE_FEATURE\n ioctl. 
The ioctl is used by userspace to retrieve any PCI error\n information for the device (suggested by Alex).\n\n - Patch 8 adds a reset_done() callback for the vfio-pci driver, to\n restore the state of the device after a reset.\n\n - Patch 9 removes the pcie check for triggering VFIO_PCI_ERR_IRQ_INDEX.\n\n\nFarhan Ali (7):\n PCI: Allow per function PCI slots to fix slot reset on s390\n PCI: Avoid saving config space state if inaccessible\n PCI: Fail FLR when config space is inaccessible\n s390/pci: Store PCI error information for passthrough devices\n vfio-pci/zdev: Add a device feature for error information\n vfio/pci: Add a reset_done callback for vfio-pci driver\n vfio/pci: Remove the pcie check for VFIO_PCI_ERR_IRQ_INDEX\n\n arch/s390/include/asm/pci.h | 33 ++++++++\n arch/s390/pci/pci.c | 1 +\n arch/s390/pci/pci_event.c | 136 +++++++++++++++++++-----------\n drivers/pci/hotplug/rpaphp_slot.c | 2 +-\n drivers/pci/pci.c | 32 ++++++-\n drivers/pci/slot.c | 33 ++++++--\n drivers/vfio/pci/vfio_pci_core.c | 22 +++--\n drivers/vfio/pci/vfio_pci_intrs.c | 3 +-\n drivers/vfio/pci/vfio_pci_priv.h | 9 ++\n drivers/vfio/pci/vfio_pci_zdev.c | 40 ++++++++-\n include/linux/pci.h | 8 +-\n include/uapi/linux/vfio.h | 20 +++++\n 12 files changed, 266 insertions(+), 73 deletions(-)" }
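
A client might consume the response above roughly as follows. This is a minimal sketch, not part of Patchwork itself: the helper names `fetch_cover` and `summarize_cover` are hypothetical, the `?format=json` query parameter is an assumption (the request shown above uses `?format=api`), and `SAMPLE` is a hand-trimmed copy of a few fields from the response.

```python
import json
from urllib.request import urlopen


def fetch_cover(cover_id, base="http://patchwork.ozlabs.org/api/1.1"):
    """Fetch one cover letter as a dict (hypothetical helper; needs network).

    Assumes the endpoint returns plain JSON when asked with ?format=json.
    """
    with urlopen(f"{base}/covers/{cover_id}/?format=json") as resp:
        return json.load(resp)


def summarize_cover(cover):
    """Build a one-line summary from the fields shown in the response above."""
    # "series" is a list; fall back to an empty dict when the list is empty.
    series = cover["series"][0] if cover.get("series") else {}
    submitter = cover["submitter"]
    return "{} | series {} v{} | {} <{}>".format(
        cover["name"],
        series.get("id", "?"),
        series.get("version", "?"),
        submitter["name"],
        submitter["email"],
    )


# Hand-trimmed subset of the response above, so the sketch runs offline.
SAMPLE = {
    "id": 2222854,
    "name": "[v13,0/7] Error recovery for vfio-pci devices on s390x",
    "submitter": {"name": "Farhan Ali", "email": "alifm@linux.ibm.com"},
    "series": [{"id": 499754, "version": 13}],
}

if __name__ == "__main__":
    print(summarize_cover(SAMPLE))
```

Calling `summarize_cover(fetch_cover(2222854))` would apply the same summary to the live record, provided the assumption about `?format=json` holds.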