From patchwork Thu Mar 19 17:16:41 2015
X-Patchwork-Submitter: Eric Auger
X-Patchwork-Id: 452076
From: Eric Auger
To: eric.auger@st.com, eric.auger@linaro.org, qemu-devel@nongnu.org,
 alex.williamson@redhat.com, peter.maydell@linaro.org, agraf@suse.de
Date: Thu, 19 Mar 2015 17:16:41 +0000
Message-Id: <1426785402-2091-10-git-send-email-eric.auger@linaro.org>
In-Reply-To: <1426785402-2091-1-git-send-email-eric.auger@linaro.org>
References: <1426785402-2091-1-git-send-email-eric.auger@linaro.org>
Cc: kim.phillips@freescale.com, b.reynal@virtualopensystems.com,
 patches@linaro.org, a.rigo@virtualopensystems.com, pbonzini@redhat.com,
 alex.bennee@linaro.org, kvmarm@lists.cs.columbia.edu,
 christoffer.dall@linaro.org
Subject: [Qemu-devel] [PATCH v12 9/9] hw/vfio/platform: add irqfd support

This patch optimizes IRQ handling using the irqfd framework.

Instead of handling the eventfds on the user side, they are handled on
the kernel side using
- the KVM irqfd framework,
- the VFIO driver virqfd framework.

The virtual IRQ completion is trapped at the interrupt controller.
This removes the need for the fast/slow path swap.
Overall this brings significant performance improvements.
It depends on KVM irqfd support in the host kernel.

Signed-off-by: Alvise Rigo
Signed-off-by: Eric Auger
Reviewed-by: Alex Bennée
---

v10 -> v11:
- Add Alex' Reviewed-by
- introduce kvm_accel in this patch and initialize it

v5 -> v6:
- rely on kvm_irqfds_enabled() and kvm_resamplefds_enabled()
- guard KVM code with #ifdef CONFIG_KVM

v3 -> v4:
[Alvise Rigo]
Use of VFIO Platform driver v6 unmask/virqfd feature and removal
of resamplefd handler. Physical IRQ unmasking is now done in
VFIO driver.

v3:
[Eric Auger]
initial support with resamplefd handled on QEMU side since the
unmask was not supported on VFIO platform driver v5.

Conflicts:
	hw/vfio/platform.c
---
 hw/vfio/platform.c              | 96 +++++++++++++++++++++++++++++++++++++++++
 include/hw/vfio/vfio-platform.h |  2 +
 trace-events                    |  2 +
 3 files changed, 100 insertions(+)

diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
index 361e01b..c5efa6e 100644
--- a/hw/vfio/platform.c
+++ b/hw/vfio/platform.c
@@ -26,6 +26,7 @@
 #include "hw/sysbus.h"
 #include "trace.h"
 #include "hw/platform-bus.h"
+#include "sysemu/kvm.h"

 /*
  * Functions used whatever the injection method
@@ -51,6 +52,7 @@ static VFIOINTp *vfio_init_intp(VFIODevice *vbasedev,
     intp->pin = info.index;
     intp->flags = info.flags;
     intp->state = VFIO_IRQ_INACTIVE;
+    intp->kvm_accel = false;

     sysbus_init_irq(sbdev, &intp->qemuirq);

@@ -61,6 +63,13 @@ static VFIOINTp *vfio_init_intp(VFIODevice *vbasedev,
         error_report("vfio: Error: trigger event_notifier_init failed ");
         return NULL;
     }
+    /* Get an eventfd for resample/unmask */
+    ret = event_notifier_init(&intp->unmask, 0);
+    if (ret) {
+        g_free(intp);
+        error_report("vfio: Error: resample event_notifier_init failed eoi");
+        return NULL;
+    }

     QLIST_INSERT_HEAD(&vdev->intp_list, intp, next);
     return intp;
@@ -315,6 +324,82 @@ static int vfio_start_eventfd_injection(VFIOINTp *intp)
     return ret;
 }

+/*
+ * Functions used for irqfd
+ */
+
+#ifdef CONFIG_KVM
+
+/**
+ * vfio_set_resample_eventfd - sets the resamplefd for an IRQ
+ * @intp: the IRQ struct handle
+ * programs the VFIO driver to unmask this IRQ when the
+ * intp->unmask eventfd is triggered
+ */
+static int vfio_set_resample_eventfd(VFIOINTp *intp)
+{
+    VFIODevice *vbasedev = &intp->vdev->vbasedev;
+    struct vfio_irq_set *irq_set;
+    int argsz, ret;
+    int32_t *pfd;
+
+    argsz = sizeof(*irq_set) + sizeof(*pfd);
+    irq_set = g_malloc0(argsz);
+    irq_set->argsz = argsz;
+    irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_UNMASK;
+    irq_set->index = intp->pin;
+    irq_set->start = 0;
+    irq_set->count = 1;
+    pfd = (int32_t *)&irq_set->data;
+    *pfd = event_notifier_get_fd(&intp->unmask);
+    qemu_set_fd_handler(*pfd, NULL, NULL, NULL);
+    ret = ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irq_set);
+    g_free(irq_set);
+    if (ret < 0) {
+        error_report("vfio: Failed to set resample eventfd: %m");
+    }
+    return ret;
+}
+
+/**
+ * vfio_start_irqfd_injection - starts irqfd injection for an IRQ
+ * programs VFIO driver with both the trigger and resamplefd
+ * programs KVM with the gsi, trigger & resample eventfds
+ */
+static int vfio_start_irqfd_injection(VFIOINTp *intp)
+{
+    struct kvm_irqfd irqfd = {
+        .fd = event_notifier_get_fd(&intp->interrupt),
+        .resamplefd = event_notifier_get_fd(&intp->unmask),
+        .gsi = intp->virtualID,
+        .flags = KVM_IRQFD_FLAG_RESAMPLE,
+    };
+
+    if (kvm_vm_ioctl(kvm_state, KVM_IRQFD, &irqfd)) {
+        error_report("vfio: Error: Failed to assign the irqfd: %m");
+        goto fail_irqfd;
+    }
+    if (vfio_set_trigger_eventfd(intp, NULL) < 0) {
+        goto fail_vfio;
+    }
+    if (vfio_set_resample_eventfd(intp) < 0) {
+        goto fail_vfio;
+    }
+
+    intp->kvm_accel = true;
+    trace_vfio_platform_start_irqfd_injection(intp->pin, intp->virtualID,
+                                     irqfd.fd, irqfd.resamplefd);
+    return 0;
+
+fail_vfio:
+    irqfd.flags = KVM_IRQFD_FLAG_DEASSIGN;
+    kvm_vm_ioctl(kvm_state, KVM_IRQFD, &irqfd);
+fail_irqfd:
+    return -1;
+}
+
+#endif /* CONFIG_KVM */
+
 /* VFIO skeleton */

 /* not implemented yet */
@@ -555,7 +640,17 @@ static void vfio_platform_realize(DeviceState *dev, Error **errp)

     vbasedev->type = VFIO_DEVICE_TYPE_PLATFORM;
     vbasedev->ops = &vfio_platform_ops;
+
+#ifdef CONFIG_KVM
+    if (kvm_irqfds_enabled() && kvm_resamplefds_enabled() &&
+        vdev->irqfd_allowed) {
+        vdev->start_irq_fn = vfio_start_irqfd_injection;
+    } else {
+        vdev->start_irq_fn = vfio_start_eventfd_injection;
+    }
+#else
     vdev->start_irq_fn = vfio_start_eventfd_injection;
+#endif

     trace_vfio_platform_realize(vbasedev->name, vdev->compat);

@@ -646,6 +741,7 @@ static Property vfio_platform_dev_properties[] = {
     DEFINE_PROP_BOOL("x-mmap", VFIOPlatformDevice, vbasedev.allow_mmap, true),
     DEFINE_PROP_UINT32("mmap-timeout-ms", VFIOPlatformDevice,
                        mmap_timeout, 1100),
+    DEFINE_PROP_BOOL("x-irqfd", VFIOPlatformDevice, irqfd_allowed, true),
     DEFINE_PROP_END_OF_LIST(),
 };

diff --git a/include/hw/vfio/vfio-platform.h b/include/hw/vfio/vfio-platform.h
index c9ee898..25d2c46 100644
--- a/include/hw/vfio/vfio-platform.h
+++ b/include/hw/vfio/vfio-platform.h
@@ -42,6 +42,7 @@ typedef struct VFIOINTp {
     uint8_t pin; /* index */
     uint32_t virtualID; /* virtual IRQ */
     uint32_t flags; /* IRQ info flags */
+    bool kvm_accel; /* set when QEMU bypass through KVM enabled */
 } VFIOINTp;

 /* function type for routine starting IRQ propagation to the guest */
@@ -62,6 +63,7 @@ typedef struct VFIOPlatformDevice {
     /* function used to start IRQ propagation to the guest */
     start_irq_fn_t start_irq_fn;
     QemuMutex intp_mutex; /* protect the intp_list IRQ state */
+    bool irqfd_allowed; /* debug option to force irqfd on/off */
 } VFIOPlatformDevice;

 typedef struct VFIOPlatformDeviceClass {
diff --git a/trace-events b/trace-events
index c3dbfc3..31371b4 100644
--- a/trace-events
+++ b/trace-events
@@ -1571,6 +1571,8 @@ vfio_platform_populate_regions(int region_index, unsigned long flag, unsigned lo
 vfio_platform_base_device_init(char *name, int groupid) "%s belongs to group #%d"
 vfio_platform_realize(char *name, char *compat) "vfio device %s, compat = %s"
 vfio_intp_interrupt_set_pending(int index) "irq %d is set PENDING"
+vfio_platform_start_irqfd_injection(int index, int gsi, int fd, int resamplefd) "IRQ index=%d, gsi =%d, fd = %d, resamplefd = %d"
+vfio_start_eventfd_injection(int index, int fd) "IRQ index=%d, fd = %d"

 #hw/acpi/memory_hotplug.c
 mhp_acpi_invalid_slot_selected(uint32_t slot) "0x%"PRIx32
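
Note for readers unfamiliar with the trigger/resample handshake that this
patch hands over to the kernel: the following is a minimal user-space
sketch of it, built only on Linux eventfd(2). It is not QEMU, KVM or VFIO
code. The struct irq_model type and the irq_model_*() functions are
hypothetical names introduced for illustration, and the model assumes that
the physical level-sensitive IRQ is masked when it fires and stays masked
until the unmask (resample) eventfd is signalled.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/eventfd.h>

/* Hypothetical model of one forwarded level-sensitive IRQ (not a QEMU type). */
struct irq_model {
    int trigger_fd; /* plays the role of intp->interrupt */
    int unmask_fd;  /* plays the role of intp->unmask, i.e. the resamplefd */
    int masked;     /* 1 while the physical IRQ is assumed to be masked */
};

/* Create both eventfds in non-blocking mode. Returns 0 on success, -1 on error. */
int irq_model_init(struct irq_model *m)
{
    m->masked = 0;
    m->trigger_fd = eventfd(0, EFD_NONBLOCK);
    if (m->trigger_fd < 0) {
        return -1;
    }
    m->unmask_fd = eventfd(0, EFD_NONBLOCK);
    if (m->unmask_fd < 0) {
        close(m->trigger_fd);
        return -1;
    }
    return 0;
}

void irq_model_close(struct irq_model *m)
{
    close(m->trigger_fd);
    close(m->unmask_fd);
}

/*
 * Device side: the physical IRQ fires.
 * Returns 1 if it was forwarded, 0 if it is still masked, -1 on error.
 */
int irq_model_fire(struct irq_model *m)
{
    uint64_t one = 1;

    if (m->masked) {
        return 0;
    }
    m->masked = 1;
    if (write(m->trigger_fd, &one, sizeof(one)) != (ssize_t)sizeof(one)) {
        return -1;
    }
    return 1;
}

/*
 * Guest side: consume the pending trigger, then signal completion (EOI)
 * through the unmask eventfd. Returns -1 if nothing was pending.
 */
int irq_model_guest_eoi(struct irq_model *m)
{
    uint64_t count, one = 1;

    if (read(m->trigger_fd, &count, sizeof(count)) != (ssize_t)sizeof(count)) {
        return -1;
    }
    if (write(m->unmask_fd, &one, sizeof(one)) != (ssize_t)sizeof(one)) {
        return -1;
    }
    return 0;
}

/*
 * Device side: consume the unmask notification and unmask the IRQ.
 * Returns -1 if no completion was signalled.
 */
int irq_model_resample(struct irq_model *m)
{
    uint64_t count;

    if (read(m->unmask_fd, &count, sizeof(count)) != (ssize_t)sizeof(count)) {
        return -1;
    }
    m->masked = 0;
    return 0;
}

/* Walk through one complete fire / EOI / unmask cycle. Returns 0 on success. */
int irq_model_demo(void)
{
    struct irq_model m;

    if (irq_model_init(&m) < 0) {
        return -1;
    }
    assert(irq_model_fire(&m) == 1);      /* IRQ forwarded, now masked */
    assert(irq_model_fire(&m) == 0);      /* still masked, nothing forwarded */
    assert(irq_model_guest_eoi(&m) == 0); /* guest handles it and signals EOI */
    assert(irq_model_resample(&m) == 0);  /* unmask eventfd re-enables the IRQ */
    assert(irq_model_fire(&m) == 1);      /* the IRQ can fire again */
    irq_model_close(&m);
    return 0;
}
```

Calling irq_model_demo() runs one full cycle. With irqfd enabled, the roles
played here by irq_model_guest_eoi() and irq_model_resample() are taken by
KVM and the VFIO driver respectively, so QEMU no longer sits on that path.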