From patchwork Wed Dec 21 21:42:19 2011
X-Patchwork-Submitter: Alex Williamson
X-Patchwork-Id: 132736
From: Alex Williamson
To: chrisw@sous-sol.org, aik@ozlabs.ru, david@gibson.dropbear.id.au, joerg.roedel@amd.com, agraf@suse.de, benve@cisco.com, aafabbri@cisco.com, B08248@freescale.com, B07421@freescale.com, avi@redhat.com,
 kvm@vger.kernel.org, qemu-devel@nongnu.org, iommu@lists.linux-foundation.org, linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org
Date: Wed, 21 Dec 2011 14:42:19 -0700
Message-ID: <20111221214219.27028.32223.stgit@bling.home>
In-Reply-To: <20111221213019.27028.26890.stgit@bling.home>
References: <20111221213019.27028.26890.stgit@bling.home>
Subject: [Qemu-devel] [PATCH 2/5] vfio: VFIO core header

This defines both the user and bus driver APIs.

Signed-off-by: Alex Williamson
---
 Documentation/ioctl/ioctl-number.txt |    1
 include/linux/vfio.h                 |  353 ++++++++++++++++++++++++++++++++++
 2 files changed, 354 insertions(+), 0 deletions(-)
 create mode 100644 include/linux/vfio.h

diff --git a/Documentation/ioctl/ioctl-number.txt b/Documentation/ioctl/ioctl-number.txt
index af76fde..69825b0 100644
--- a/Documentation/ioctl/ioctl-number.txt
+++ b/Documentation/ioctl/ioctl-number.txt
@@ -88,6 +88,7 @@ Code  Seq#(hex)	Include File		Comments
 		and kernel/power/user.c
 '8'	all		SNP8023 advanced NIC card
+';'	64-83	linux/vfio.h
 '@'	00-0F	linux/radeonfb.h	conflict!
 '@'	00-0F	drivers/video/aty/aty128fb.c	conflict!
 'A'	00-1F	linux/apm_bios.h	conflict!
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
new file mode 100644
index 0000000..2769dfb
--- /dev/null
+++ b/include/linux/vfio.h
@@ -0,0 +1,353 @@
+/*
+ * VFIO API definition
+ *
+ * Copyright (C) 2011 Red Hat, Inc. All rights reserved.
+ * Author: Alex Williamson
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#ifndef VFIO_H
+#define VFIO_H
+
+#include
+
+#ifdef __KERNEL__	/* Internal VFIO-core/bus driver API */
+
+/**
+ * struct vfio_device_ops - VFIO bus driver device callbacks
+ *
+ * @match: Return true if buf describes the device
+ * @claim: Force driver to attach to device
+ * @open: Called when userspace receives file descriptor for device
+ * @release: Called when userspace releases file descriptor for device
+ * @read: Perform read(2) on device file descriptor
+ * @write: Perform write(2) on device file descriptor
+ * @ioctl: Perform ioctl(2) on device file descriptor, supporting VFIO_DEVICE_*
+ *         operations documented below
+ * @mmap: Perform mmap(2) on a region of the device file descriptor
+ */
+struct vfio_device_ops {
+	bool	(*match)(struct device *dev, const char *buf);
+	int	(*claim)(struct device *dev);
+	int	(*open)(void *device_data);
+	void	(*release)(void *device_data);
+	ssize_t	(*read)(void *device_data, char __user *buf,
+			size_t count, loff_t *ppos);
+	ssize_t	(*write)(void *device_data, const char __user *buf,
+			 size_t count, loff_t *size);
+	long	(*ioctl)(void *device_data, unsigned int cmd,
+			 unsigned long arg);
+	int	(*mmap)(void *device_data, struct vm_area_struct *vma);
+};
+
+/**
+ * vfio_group_add_dev() - Add a device to the vfio-core
+ *
+ * @dev: Device to add
+ * @ops: VFIO bus driver callbacks for device
+ *
+ * This registration makes the VFIO core aware of the device, creates
+ * group objects as required and exposes chardevs under /dev/vfio.
+ *
+ * Return 0 on success, errno on failure.
+ */
+extern int vfio_group_add_dev(struct device *dev,
+			      const struct vfio_device_ops *ops);
+
+/**
+ * vfio_group_del_dev() - Remove a device from the vfio-core
+ *
+ * @dev: Device to remove
+ *
+ * Remove a device previously added to the VFIO core, removing groups
+ * and chardevs as necessary.
+ */
+extern void vfio_group_del_dev(struct device *dev);
+
+/**
+ * vfio_bind_dev() - Indicate device is bound to the VFIO bus driver and
+ *                   register private data structure for ops callbacks.
+ *
+ * @dev: Device being bound
+ * @device_data: VFIO bus driver private data
+ *
+ * This registration indicates that a device previously registered with
+ * vfio_group_add_dev() is now available for use by the VFIO core. When
+ * all devices within a group are available, the group is viable and may
+ * be used by userspace drivers. Typically called from VFIO bus driver
+ * probe function.
+ *
+ * Return 0 on success, errno on failure
+ */
+extern int vfio_bind_dev(struct device *dev, void *device_data);
+
+/**
+ * vfio_unbind_dev() - Indicate device is unbinding from VFIO bus driver
+ *
+ * @dev: Device being unbound
+ *
+ * De-registration of the device previously registered with vfio_bind_dev()
+ * from VFIO. Upon completion, the device is no longer available for use by
+ * the VFIO core. Typically called from the VFIO bus driver remove function.
+ * The VFIO core will attempt to release the device from users and may take
+ * measures to free the device and/or block as necessary.
+ *
+ * Returns pointer to private device_data structure registered with
+ * vfio_bind_dev().
+ */
+extern void *vfio_unbind_dev(struct device *dev);
+
+
+/**
+ * offsetofend(TYPE, MEMBER)
+ *
+ * @TYPE: The type of the structure
+ * @MEMBER: The member within the structure to get the end offset of
+ *
+ * Simple helper macro for dealing with variable sized structures passed
+ * from user space. This allows us to easily determine if the provided
+ * structure is sized to include various fields.
+ */
+#define offsetofend(TYPE, MEMBER) ({				\
+		TYPE tmp;					\
+		offsetof(TYPE, MEMBER) + sizeof(tmp.MEMBER); })
+
+#endif /* __KERNEL__ */
+
+/* Kernel & User level defines for VFIO IOCTLs. */
+
+/*
+ * The IOCTL interface is designed for extensibility by embedding the
+ * structure length (argsz) and flags into structures passed between
+ * kernel and userspace. We therefore use the _IO() macro for these
+ * defines to avoid implicitly embedding a size into the ioctl request.
+ * As structure fields are added, argsz will increase to match and flag
+ * bits will be defined to indicate additional fields with valid data.
+ * It's *always* the caller's responsibility to indicate the size of
+ * the structure passed by setting argsz appropriately.
+ */
+
+#define VFIO_TYPE	(';')
+#define VFIO_BASE	100
+
+/* --------------- IOCTLs for GROUP file descriptors --------------- */
+
+/**
+ * VFIO_GROUP_GET_INFO - _IOR(VFIO_TYPE, VFIO_BASE + 0, struct vfio_group_info)
+ *
+ * Retrieve information about the group. Fills in provided
+ * struct vfio_group_info. Caller sets argsz.
+ */
+struct vfio_group_info {
+	__u32	argsz;
+	__u32	flags;
+#define VFIO_GROUP_FLAGS_VIABLE		(1 << 0)
+#define VFIO_GROUP_FLAGS_MM_LOCKED	(1 << 1)
+};
+
+#define VFIO_GROUP_GET_INFO		_IO(VFIO_TYPE, VFIO_BASE + 0)
+
+/**
+ * VFIO_GROUP_MERGE - _IOW(VFIO_TYPE, VFIO_BASE + 1, __s32)
+ *
+ * Merge group indicated by passed file descriptor into current group.
+ * Current group may be in use, group indicated by file descriptor
+ * cannot be in use (no open iommu or devices).
+ */
+#define VFIO_GROUP_MERGE		_IOW(VFIO_TYPE, VFIO_BASE + 1, __s32)
+
+/**
+ * VFIO_GROUP_UNMERGE - _IO(VFIO_TYPE, VFIO_BASE + 2)
+ *
+ * Remove the current group from a merged set. The current group cannot
+ * have any open devices.
+ */
+#define VFIO_GROUP_UNMERGE		_IO(VFIO_TYPE, VFIO_BASE + 2)
+
+/**
+ * VFIO_GROUP_GET_IOMMU_FD - _IO(VFIO_TYPE, VFIO_BASE + 3)
+ *
+ * Return a new file descriptor for the IOMMU object.
+ * The IOMMU object is shared among members of a merged group.
+ */
+#define VFIO_GROUP_GET_IOMMU_FD		_IO(VFIO_TYPE, VFIO_BASE + 3)
+
+/**
+ * VFIO_GROUP_GET_DEVICE_FD - _IOW(VFIO_TYPE, VFIO_BASE + 4, char)
+ *
+ * Return a new file descriptor for the device object described by
+ * the provided char array.
+ */
+#define VFIO_GROUP_GET_DEVICE_FD	_IOW(VFIO_TYPE, VFIO_BASE + 4, char)
+
+
+/* --------------- IOCTLs for IOMMU file descriptors --------------- */
+
+/**
+ * VFIO_IOMMU_GET_INFO - _IOR(VFIO_TYPE, VFIO_BASE + 5, struct vfio_iommu_info)
+ *
+ * Retrieve information about the IOMMU object. Fills in provided
+ * struct vfio_iommu_info. Caller sets argsz.
+ */
+struct vfio_iommu_info {
+	__u32	argsz;
+	__u32	flags;
+	__u64	iova_max;	/* Maximum IOVA address */
+	__u64	iova_min;	/* Minimum IOVA address */
+	__u64	pgsize_bitmap;	/* Bitmap of supported page sizes */
+};
+
+#define VFIO_IOMMU_GET_INFO		_IO(VFIO_TYPE, VFIO_BASE + 5)
+
+/**
+ * VFIO_IOMMU_MAP_DMA - _IOW(VFIO_TYPE, VFIO_BASE + 6, struct vfio_dma_map)
+ *
+ * Map process virtual addresses to IO virtual addresses using the provided
+ * struct vfio_dma_map. Caller sets argsz. READ and/or WRITE required.
+ */
+struct vfio_dma_map {
+	__u32	argsz;
+	__u32	flags;
+#define VFIO_DMA_MAP_FLAG_READ	(1 << 0)	/* readable from device */
+#define VFIO_DMA_MAP_FLAG_WRITE	(1 << 1)	/* writable from device */
+	__u64	vaddr;		/* Process virtual address */
+	__u64	iova;		/* IO virtual address */
+	__u64	size;		/* Size of mapping (bytes) */
+};
+
+#define VFIO_IOMMU_MAP_DMA		_IO(VFIO_TYPE, VFIO_BASE + 6)
+
+/**
+ * VFIO_IOMMU_UNMAP_DMA - _IOW(VFIO_TYPE, VFIO_BASE + 7, struct vfio_dma_unmap)
+ *
+ * Unmap IO virtual addresses using the provided struct vfio_dma_unmap.
+ * Caller sets argsz.
+ */
+struct vfio_dma_unmap {
+	__u32	argsz;
+	__u32	flags;
+	__u64	iova;		/* IO virtual address */
+	__u64	size;		/* Size of mapping (bytes) */
+};
+
+#define VFIO_IOMMU_UNMAP_DMA		_IO(VFIO_TYPE, VFIO_BASE + 7)
+
+
+/* --------------- IOCTLs for DEVICE file descriptors --------------- */
+
+/**
+ * VFIO_DEVICE_GET_INFO - _IOR(VFIO_TYPE, VFIO_BASE + 8,
+ *                             struct vfio_device_info)
+ *
+ * Retrieve information about the device. Fills in provided
+ * struct vfio_device_info. Caller sets argsz.
+ */
+struct vfio_device_info {
+	__u32	argsz;
+	__u32	flags;
+#define VFIO_DEVICE_FLAGS_RESET	(1 << 0)	/* Device supports reset */
+	__u32	num_regions;	/* Max region index + 1 */
+	__u32	num_irqs;	/* Max IRQ index + 1 */
+};
+
+#define VFIO_DEVICE_GET_INFO		_IO(VFIO_TYPE, VFIO_BASE + 8)
+
+/**
+ * VFIO_DEVICE_GET_REGION_INFO - _IOWR(VFIO_TYPE, VFIO_BASE + 9,
+ *                                     struct vfio_region_info)
+ *
+ * Retrieve information about a device region. Caller provides
+ * struct vfio_region_info with index value set. Caller sets argsz.
+ * Implementation of region mapping is bus driver specific. This is
+ * intended to describe MMIO, I/O port, as well as bus specific
+ * regions (ex. PCI config space). Zero sized regions may be used
+ * to describe unimplemented regions (ex. unimplemented PCI BARs).
+ */
+struct vfio_region_info {
+	__u32	argsz;
+	__u32	flags;
+#define VFIO_REGION_INFO_FLAG_MMAP	(1 << 0) /* Region supports mmap */
+#define VFIO_REGION_INFO_FLAG_RO	(1 << 1) /* Region is read-only */
+	__u32	index;		/* Region index */
+	__u32	resv;		/* Reserved for alignment */
+	__u64	size;		/* Region size (bytes) */
+	__u64	offset;		/* Region offset from start of device fd */
+};
+
+#define VFIO_DEVICE_GET_REGION_INFO	_IO(VFIO_TYPE, VFIO_BASE + 9)
+
+/**
+ * VFIO_DEVICE_GET_IRQ_INFO - _IOWR(VFIO_TYPE, VFIO_BASE + 10,
+ *                                  struct vfio_irq_info)
+ *
+ * Retrieve information about a device IRQ. Caller provides
+ * struct vfio_irq_info with index value set. Caller sets argsz.
+ * Implementation of IRQ mapping is bus driver specific. Indexes
+ * supporting multiple IRQs are primarily intended to support
+ * MSI-like interrupt blocks. Zero count irq blocks may be used
+ * to describe unimplemented interrupt types (ex. PCI MSI-X).
+ */
+struct vfio_irq_info {
+	__u32	argsz;
+	__u32	flags;
+#define VFIO_IRQ_INFO_FLAG_LEVEL	(1 << 0) /* Level (1) vs Edge (0) */
+	__u32	index;		/* IRQ index */
+	__u32	count;		/* Number of IRQs within this index */
+};
+
+#define VFIO_DEVICE_GET_IRQ_INFO	_IO(VFIO_TYPE, VFIO_BASE + 10)
+
+/**
+ * VFIO_DEVICE_SET_IRQ_EVENTFDS - _IOW(VFIO_TYPE, VFIO_BASE + 11,
+ *                                     struct vfio_irq_eventfds)
+ *
+ * Set eventfds for IRQs using the struct vfio_irq_eventfds provided.
+ * Setting the eventfds also enables the interrupt. Caller sets all fields.
+ */
+struct vfio_irq_eventfds {
+	__u32	argsz;
+	__u32	flags;
+	__u32	index;		/* IRQ index */
+	__u32	count;		/* Number of eventfds */
+	__s32	eventfds[];	/* eventfd for sub-index, -1 to unset */
+};
+
+#define VFIO_DEVICE_SET_IRQ_EVENTFDS	_IO(VFIO_TYPE, VFIO_BASE + 11)
+
+/**
+ * VFIO_DEVICE_UNMASK_IRQ - _IOW(VFIO_TYPE, VFIO_BASE + 12,
+ *                               struct vfio_unmask_irq)
+ *
+ * Unmask the IRQ described by the provided struct vfio_unmask_irq.
+ * Level triggered IRQs are masked when posted to userspace and must
+ * be unmasked to re-trigger. Caller sets all fields.
+ */
+struct vfio_unmask_irq {
+	__u32	argsz;
+	__u32	flags;
+	__u32	index;		/* IRQ index */
+	__u32	subindex;	/* Sub-index to unmask */
+};
+
+#define VFIO_DEVICE_UNMASK_IRQ		_IO(VFIO_TYPE, VFIO_BASE + 12)
+
+/**
+ * VFIO_DEVICE_SET_UNMASK_IRQ_EVENTFDS - _IOW(VFIO_TYPE, VFIO_BASE + 13,
+ *                                            struct vfio_irq_eventfds)
+ *
+ * Set eventfds to be used for unmasking IRQs using the provided
+ * struct vfio_irq_eventfds. Same semantics as VFIO_DEVICE_SET_IRQ_EVENTFDS.
+ * Caller sets all fields.
+ */
+#define VFIO_DEVICE_SET_UNMASK_IRQ_EVENTFDS	_IO(VFIO_TYPE, VFIO_BASE + 13)
+
+/**
+ * VFIO_DEVICE_RESET - _IO(VFIO_TYPE, VFIO_BASE + 14)
+ *
+ * Reset a device.
+ */
+#define VFIO_DEVICE_RESET		_IO(VFIO_TYPE, VFIO_BASE + 14)
+
+#endif /* VFIO_H */
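
The argsz convention and the offsetofend() helper described in the header may be easier to follow with a small standalone sketch. The code below is not part of this series and is only an assumption about how a handler might use them: struct demo_iommu_info is a local copy of the vfio_iommu_info layout using stdint types, demo_offsetofend() is a portable variant of the macro (no GNU statement expression), and demo_iommu_get_info() is a hypothetical handler in which memcpy() stands in for copy_from_user()/copy_to_user(). The values it reports are made up.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local copy of the vfio_iommu_info layout (stdint types, not __u32/__u64). */
struct demo_iommu_info {
	uint32_t argsz;
	uint32_t flags;
	uint64_t iova_max;
	uint64_t iova_min;
	uint64_t pgsize_bitmap;
};

/* Portable variant of offsetofend(): offset of the first byte past MEMBER. */
#define demo_offsetofend(TYPE, MEMBER) \
	(offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))

/* The fixed header every caller must supply is argsz + flags. */
static_assert(demo_offsetofend(struct demo_iommu_info, flags) == 8,
	      "header is two 32-bit words");

/*
 * Hypothetical GET_INFO handler.  Assumes iova_min was the last field of an
 * older structure revision and pgsize_bitmap was added later, so a caller
 * whose argsz stops at iova_min is still served, just without the new field.
 * Returns 0 on success, -EINVAL if argsz does not cover the older layout.
 */
static int demo_iommu_get_info(void *user_buf)
{
	const size_t minsz = demo_offsetofend(struct demo_iommu_info, iova_min);
	const size_t fullsz = demo_offsetofend(struct demo_iommu_info,
					       pgsize_bitmap);
	struct demo_iommu_info info;
	uint32_t argsz;
	size_t outsz = minsz;

	memcpy(&argsz, user_buf, sizeof(argsz));	/* copy_from_user() stand-in */
	if (argsz < minsz)
		return -EINVAL;

	memset(&info, 0, sizeof(info));
	info.argsz = argsz;
	info.iova_max = UINT64_MAX;			/* made-up value */
	info.iova_min = 0;				/* made-up value */

	/* Only touch the newer field if the caller's structure includes it. */
	if (argsz >= fullsz) {
		info.pgsize_bitmap = 1ULL << 12;	/* made-up: 4 KiB pages */
		outsz = fullsz;
	}

	memcpy(user_buf, &info, outsz);			/* copy_to_user() stand-in */
	return 0;
}
```

A caller that sets argsz to sizeof(struct demo_iommu_info) gets every field filled in; one that sets argsz to 24 (the end of iova_min) gets the older fields and its trailing bytes are left untouched; one that sets argsz to 8 is rejected with -EINVAL. Because the size travels in argsz rather than in the ioctl number, the same request value can serve all three callers.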