From patchwork Tue Jul 26 16:10:29 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steven Sistare X-Patchwork-Id: 1660862 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: bilbo.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=oracle.com header.i=@oracle.com header.a=rsa-sha256 header.s=corp-2022-7-12 header.b=EXPyEujx; dkim-atps=neutral Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by bilbo.ozlabs.org (Postfix) with ESMTPS id 4Lsj213Slmz9s2R for ; Wed, 27 Jul 2022 02:25:29 +1000 (AEST) Received: from localhost ([::1]:41028 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1oGNN8-0003d2-Tg for incoming@patchwork.ozlabs.org; Tue, 26 Jul 2022 12:25:27 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:35256) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oGNA2-0003Ml-U0 for qemu-devel@nongnu.org; Tue, 26 Jul 2022 12:11:54 -0400 Received: from mx0b-00069f02.pphosted.com ([205.220.177.32]:19746) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oGNA0-0006DQ-Ud for qemu-devel@nongnu.org; Tue, 26 Jul 2022 12:11:54 -0400 Received: from pps.filterd (m0246631.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 26QFn0Wb031705; Tue, 26 Jul 2022 16:11:33 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references; s=corp-2022-7-12; bh=ZpMX+dkGp0nolXRTPsLQuh/GSasLYD0jX7mg8D5/X9E=; b=EXPyEujxmrWbfQpY/0Z2qu4wE4wxZtPIr9Za6uwqgsB6IlAmOlqrAvA6ih6cDJN94Nuj CdvoZCwh2bZvL2xC+UwySXPic6g9VpcjkVMhHGK8KfmDlJ6Y4b7g6GxLZyfH8doQPvCG +tzReWDMhK9WhOafiGA32N8on7qa/7lrJhrTAnUJcqH/BsadVUustCGaVYfGw0YzWFWx NyFTTrpnGZVkL+dbdCUHnQy9dM434+qvdFCIiqjhd0+ZIwVXhB2MSKt0MDsqOgL0B5i/ XjebZ5Bq4XXlokzthqJXW8Hs/GacxKUD4Vhlw3kRT0yRX8djsh3T/NmnmRa7MzENweyD fw== Received: from phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta03.appoci.oracle.com [138.1.37.129]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 3hg9a4q1c7-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 26 Jul 2022 16:11:33 +0000 Received: from pps.filterd (phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com (8.17.1.5/8.17.1.5) with ESMTP id 26QEhMox006143; Tue, 26 Jul 2022 16:11:32 GMT Received: from pps.reinject (localhost [127.0.0.1]) by phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 3hh65bq2dy-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 26 Jul 2022 16:11:32 +0000 Received: from phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 26QG5uTD023334; Tue, 26 Jul 2022 16:11:31 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTP id 3hh65bq0y5-33; Tue, 26 Jul 2022 16:11:31 +0000 From: Steve Sistare To: qemu-devel@nongnu.org Cc: Paolo Bonzini , Stefan Hajnoczi , =?utf-8?q?Marc-Andr=C3=A9_Lureau?= , =?utf-8?q?Alex_Benn=C3=A9e?= , "Dr. David Alan Gilbert" , "Michael S. Tsirkin" , Marcel Apfelbaum , Alex Williamson , "Daniel P. Berrange" , Juan Quintela , Markus Armbruster , Eric Blake , Jason Zeng , Zheng Chuan , Steve Sistare , Mark Kanda , Guoyi Tu , Peter Maydell , =?utf-8?q?Philippe_Mathieu-Daud?= =?utf-8?q?=C3=A9?= , Igor Mammedov , David Hildenbrand , John Snow , Peng Liang Subject: [PATCH V9 32/46] vfio-pci: cpr part 2 (msi) Date: Tue, 26 Jul 2022 09:10:29 -0700 Message-Id: <1658851843-236870-33-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1658851843-236870-1-git-send-email-steven.sistare@oracle.com> References: <1658851843-236870-1-git-send-email-steven.sistare@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.883,Hydra:6.0.517,FMLib:17.11.122.1 definitions=2022-07-26_04,2022-07-26_01,2022-06-22_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 mlxscore=0 malwarescore=0 phishscore=0 spamscore=0 mlxlogscore=999 suspectscore=0 bulkscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2206140000 definitions=main-2207260063 X-Proofpoint-ORIG-GUID: FeeHbKC5XTxjTxP2rkCO4LYKLyUvqpXt X-Proofpoint-GUID: FeeHbKC5XTxjTxP2rkCO4LYKLyUvqpXt Received-SPF: pass client-ip=205.220.177.32; envelope-from=steven.sistare@oracle.com; helo=mx0b-00069f02.pphosted.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Finish cpr for vfio-pci MSI/MSI-X devices by preserving eventfd's and vector state. Signed-off-by: Steve Sistare --- hw/vfio/pci.c | 119 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 118 insertions(+), 1 deletion(-) diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index b5fd2ec..1d0e8db 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -54,17 +54,47 @@ static void vfio_disable_interrupts(VFIOPCIDevice *vdev); static void vfio_mmap_set_enabled(VFIOPCIDevice *vdev, bool enabled); static void vfio_msi_disable_common(VFIOPCIDevice *vdev); +#define EVENT_FD_NAME(vdev, name) \ + g_strdup_printf("%s_%s", (vdev)->vbasedev.name, (name)) + +static void save_event_fd(VFIOPCIDevice *vdev, const char *name, int nr, + EventNotifier *ev) +{ + int fd = event_notifier_get_fd(ev); + + if (fd >= 0) { + g_autofree char *fdname = EVENT_FD_NAME(vdev, name); + + cpr_resave_fd(fdname, nr, fd); + } +} + +static int load_event_fd(VFIOPCIDevice *vdev, const char *name, int nr) +{ + g_autofree char *fdname = EVENT_FD_NAME(vdev, name); + return cpr_find_fd(fdname, nr); +} + +static void delete_event_fd(VFIOPCIDevice *vdev, const char *name, int nr) +{ + g_autofree char *fdname = EVENT_FD_NAME(vdev, name); + cpr_delete_fd(fdname, nr); +} + /* Create new or reuse existing eventfd */ static int vfio_notifier_init(VFIOPCIDevice *vdev, EventNotifier *e, const char *name, int nr) { - int fd = -1; /* placeholder until a subsequent patch */ int ret = 0; + int fd = load_event_fd(vdev, name, nr); if (fd >= 0) { event_notifier_init_fd(e, fd); } else { ret = event_notifier_init(e, 0); + if (!ret) { + save_event_fd(vdev, name, nr, e); + } } return ret; } @@ -72,6 +102,7 @@ static int vfio_notifier_init(VFIOPCIDevice *vdev, EventNotifier *e, static void vfio_notifier_cleanup(VFIOPCIDevice *vdev, EventNotifier *e, const char *name, int nr) { + delete_event_fd(vdev, name, nr); event_notifier_cleanup(e); } @@ -512,6 +543,15 @@ static int vfio_msix_vector_do_use(PCIDevice *pdev, unsigned int nr, VFIOMSIVector *vector; int ret; + /* + * Ignore the callback from msix_set_vector_notifiers during resume. + * The necessary subset of these actions is called from vfio_claim_vectors + * during post load. + */ + if (vdev->vbasedev.reused) { + return 0; + } + trace_vfio_msix_vector_do_use(vdev->vbasedev.name, nr); vector = &vdev->msi_vectors[nr]; @@ -2784,6 +2824,11 @@ static void vfio_register_err_notifier(VFIOPCIDevice *vdev) fd = event_notifier_get_fd(&vdev->err_notifier); qemu_set_fd_handler(fd, vfio_err_notifier_handler, NULL, vdev); + /* Do not alter irq_signaling during vfio_realize for cpr */ + if (vdev->vbasedev.reused) { + return; + } + if (vfio_set_irq_signaling(&vdev->vbasedev, VFIO_PCI_ERR_IRQ_INDEX, 0, VFIO_IRQ_SET_ACTION_TRIGGER, fd, &err)) { error_reportf_err(err, VFIO_MSG_PREFIX, vdev->vbasedev.name); @@ -2849,6 +2894,12 @@ static void vfio_register_req_notifier(VFIOPCIDevice *vdev) fd = event_notifier_get_fd(&vdev->req_notifier); qemu_set_fd_handler(fd, vfio_req_notifier_handler, NULL, vdev); + /* Do not alter irq_signaling during vfio_realize for cpr */ + if (vdev->vbasedev.reused) { + vdev->req_enabled = true; + return; + } + if (vfio_set_irq_signaling(&vdev->vbasedev, VFIO_PCI_REQ_IRQ_INDEX, 0, VFIO_IRQ_SET_ACTION_TRIGGER, fd, &err)) { error_reportf_err(err, VFIO_MSG_PREFIX, vdev->vbasedev.name); @@ -3357,6 +3408,46 @@ static Property vfio_pci_dev_properties[] = { DEFINE_PROP_END_OF_LIST(), }; +static void vfio_claim_vectors(VFIOPCIDevice *vdev, int nr_vectors, bool msix) +{ + int i, fd; + bool pending = false; + PCIDevice *pdev = &vdev->pdev; + + vdev->nr_vectors = nr_vectors; + vdev->msi_vectors = g_new0(VFIOMSIVector, nr_vectors); + vdev->interrupt = msix ? VFIO_INT_MSIX : VFIO_INT_MSI; + + vfio_prepare_kvm_msi_virq_batch(vdev); + + for (i = 0; i < nr_vectors; i++) { + VFIOMSIVector *vector = &vdev->msi_vectors[i]; + + fd = load_event_fd(vdev, "interrupt", i); + if (fd >= 0) { + vfio_vector_init(vdev, i); + qemu_set_fd_handler(fd, vfio_msi_interrupt, NULL, vector); + } + + if (load_event_fd(vdev, "kvm_interrupt", i) >= 0) { + vfio_add_kvm_msi_virq(vdev, vector, i, msix); + } else { + vdev->msi_vectors[i].virq = -1; + } + + if (msix && msix_is_pending(pdev, i) && msix_is_masked(pdev, i)) { + set_bit(i, vdev->msix->pending); + pending = true; + } + } + + vfio_commit_kvm_msi_virq_batch(vdev); + + if (msix) { + memory_region_set_enabled(&pdev->msix_pba_mmio, pending); + } +} + /* * The kernel may change non-emulated config bits. Exclude them from the * changed-bits check in get_pci_config_device. @@ -3375,6 +3466,29 @@ static int vfio_pci_pre_load(void *opaque) return 0; } +static int vfio_pci_post_load(void *opaque, int version_id) +{ + VFIOPCIDevice *vdev = opaque; + PCIDevice *pdev = &vdev->pdev; + int nr_vectors; + + if (msix_enabled(pdev)) { + msix_set_vector_notifiers(pdev, vfio_msix_vector_use, + vfio_msix_vector_release, NULL); + nr_vectors = vdev->msix->entries; + vfio_claim_vectors(vdev, nr_vectors, true); + + } else if (msi_enabled(pdev)) { + nr_vectors = msi_nr_vectors_allocated(pdev); + vfio_claim_vectors(vdev, nr_vectors, false); + + } else if (vfio_pci_read_config(pdev, PCI_INTERRUPT_PIN, 1)) { + assert(0); /* completed in a subsequent patch */ + } + + return 0; +} + static bool vfio_pci_needed(void *opaque) { return migrate_mode() == MIG_MODE_CPR_EXEC; @@ -3386,8 +3500,11 @@ static const VMStateDescription vfio_pci_vmstate = { .minimum_version_id = 0, .priority = MIG_PRI_VFIO_PCI, /* must load before container */ .pre_load = vfio_pci_pre_load, + .post_load = vfio_pci_post_load, .needed = vfio_pci_needed, .fields = (VMStateField[]) { + VMSTATE_PCI_DEVICE(pdev, VFIOPCIDevice), + VMSTATE_MSIX_TEST(pdev, VFIOPCIDevice, vfio_msix_present), VMSTATE_END_OF_LIST() } };