From patchwork Fri Apr 24 21:08:03 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Williamson X-Patchwork-Id: 464418 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4FBA114007F for ; Sat, 25 Apr 2015 07:08:29 +1000 (AEST) Received: from localhost ([::1]:46514 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Ylkpm-0008JN-R8 for incoming@patchwork.ozlabs.org; Fri, 24 Apr 2015 17:08:26 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:59027) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YlkpU-00082g-6X for qemu-devel@nongnu.org; Fri, 24 Apr 2015 17:08:10 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YlkpS-0001qU-Eo for qemu-devel@nongnu.org; Fri, 24 Apr 2015 17:08:08 -0400 Received: from mx1.redhat.com ([209.132.183.28]:38124) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YlkpS-0001ps-71 for qemu-devel@nongnu.org; Fri, 24 Apr 2015 17:08:06 -0400 Received: from int-mx14.intmail.prod.int.phx2.redhat.com (int-mx14.intmail.prod.int.phx2.redhat.com [10.5.11.27]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id t3OL841N023761 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Fri, 24 Apr 2015 17:08:04 -0400 Received: from gimli.home (ovpn-113-195.phx2.redhat.com [10.3.113.195]) by int-mx14.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id t3OL83PM027645; Fri, 24 Apr 2015 17:08:03 -0400 From: Alex Williamson To: alex.williamson@redhat.com Date: Fri, 24 Apr 2015 15:08:03 -0600 Message-ID: <20150424210703.23374.96299.stgit@gimli.home> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.68 on 10.5.11.27 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x X-Received-From: 209.132.183.28 Cc: qemu-devel@nongnu.org, kvm@vger.kernel.org Subject: [Qemu-devel] [PATCH] vfio-pci: Reset workaround for AMD Bonaire and Hawaii GPUs X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Somehow these GPUs manage not to respond to a PCI bus reset, removing our primary mechanism for resetting graphics cards. The result is that these devices typically work well for a single VM boot. If the VM is rebooted or restarted, the guest driver is not able to init the card from the dirty state, resulting in a blue screen for Windows guests. The workaround is to use a device specific reset. This is not 100% reliable though since it depends on the incoming state of the device, but it substantially improves the usability of these devices in a VM. Credit to Alex Deucher for his guidance. Signed-off-by: Alex Williamson --- hw/vfio/pci.c | 162 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 162 insertions(+) diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index ebc1e0a..baae98c 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -154,6 +154,7 @@ typedef struct VFIOPCIDevice { PCIHostDeviceAddress host; EventNotifier err_notifier; EventNotifier req_notifier; + int (*resetfn)(struct VFIOPCIDevice *); uint32_t features; #define VFIO_FEATURE_ENABLE_VGA_BIT 0 #define VFIO_FEATURE_ENABLE_VGA (1 << VFIO_FEATURE_ENABLE_VGA_BIT) @@ -3319,6 +3320,162 @@ static void vfio_unregister_req_notifier(VFIOPCIDevice *vdev) vdev->req_enabled = false; } +/* + * AMD Radeon PCI config reset, based on Linux: + * drivers/gpu/drm/radeon/ci_smc.c:ci_is_smc_running() + * drivers/gpu/drm/radeon/radeon_device.c:radeon_pci_config_reset + * drivers/gpu/drm/radeon/ci_smc.c:ci_reset_smc() + * drivers/gpu/drm/radeon/ci_smc.c:ci_stop_smc_clock() + * IDs: include/drm/drm_pciids.h + * Registers: http://cgit.freedesktop.org/~agd5f/linux/commit/?id=4e2aa447f6f0 + * + * Bonaire and Hawaii GPUs do not respond to a bus reset. This is a bug in the + * hardware that should be fixed on future ASICs. The symptom of this is that + * once the accerlated driver loads, Windows guests will bsod on subsequent + * attmpts to load the driver, such as after VM reset or shutdown/restart. To + * work around this, we do an AMD specific PCI config reset, followed by an SMC + * reset. The PCI config reset only works if SMC firmware is running, so we + * have a dependency on the state of the device as to whether this reset will + * be effective. There are still cases where we won't be able to kick the + * device into working, but this greatly improves the usability overall. The + * config reset magic is relatively common on AMD GPUs, but the setup and SMC + * poking is largely ASIC specific. + */ +static bool vfio_radeon_smc_is_running(VFIOPCIDevice *vdev) +{ + uint32_t clk, pc_c; + + /* + * Registers 200h and 204h are index and data registers for acessing + * indirect configuration registers within the device. + */ + vfio_region_write(&vdev->bars[5].region, 0x200, 0x80000004, 4); + clk = vfio_region_read(&vdev->bars[5].region, 0x204, 4); + vfio_region_write(&vdev->bars[5].region, 0x200, 0x80000370, 4); + pc_c = vfio_region_read(&vdev->bars[5].region, 0x204, 4); + + return (!(clk & 1) && (0x20100 <= pc_c)); +} + +/* + * The scope of a config reset is controlled by a mode bit in the misc register + * and a fuse, exposed as a bit in another register. The fuse is the default + * (0 = GFX, 1 = whole GPU), the misc bit is a toggle, with the forumula + * scope = !(misc ^ fuse), where the resulting scope is defined the same as + * the fuse. A truth table therefore tells us that if misc == fuse, we need + * to flip the value of the bit in the misc register. + */ +static void vfio_radeon_set_gfx_only_reset(VFIOPCIDevice *vdev) +{ + uint32_t misc, fuse; + bool a, b; + + vfio_region_write(&vdev->bars[5].region, 0x200, 0xc00c0000, 4); + fuse = vfio_region_read(&vdev->bars[5].region, 0x204, 4); + b = fuse & 64; + + vfio_region_write(&vdev->bars[5].region, 0x200, 0xc0000010, 4); + misc = vfio_region_read(&vdev->bars[5].region, 0x204, 4); + a = misc & 2; + + if (a == b) { + vfio_region_write(&vdev->bars[5].region, 0x204, misc ^ 2, 4); + vfio_region_read(&vdev->bars[5].region, 0x204, 4); /* flush */ + } +} + +static int vfio_radeon_reset(VFIOPCIDevice *vdev) +{ + PCIDevice *pdev = &vdev->pdev; + int i, ret = 0; + uint32_t data; + + /* Defer to a kernel implemented reset */ + if (vdev->vbasedev.reset_works) { + return -ENODEV; + } + + /* Enable only memory BAR access */ + vfio_pci_write_config(pdev, PCI_COMMAND, PCI_COMMAND_MEMORY, 2); + + /* Reset only works if SMC firmware is loaded and running */ + if (!vfio_radeon_smc_is_running(vdev)) { + ret = -EINVAL; + goto out; + } + + /* Make sure only the GFX function is reset */ + vfio_radeon_set_gfx_only_reset(vdev); + + /* AMD PCI config reset */ + vfio_pci_write_config(pdev, 0x7c, 0x39d5e86b, 4); + usleep(100); + + /* Read back the memory size to make sure we're out of reset */ + for (i = 0; i < 100000; i++) { + if (vfio_region_read(&vdev->bars[5].region, 0x5428, 4) != 0xffffffff) { + break; + } + usleep(1); + } + + /* Reset SMC */ + vfio_region_write(&vdev->bars[5].region, 0x200, 0x80000000, 4); + data = vfio_region_read(&vdev->bars[5].region, 0x204, 4); + data |= 1; + vfio_region_write(&vdev->bars[5].region, 0x204, data, 4); + + /* Disable SMC clock */ + vfio_region_write(&vdev->bars[5].region, 0x200, 0x80000004, 4); + data = vfio_region_read(&vdev->bars[5].region, 0x204, 4); + data |= 1; + vfio_region_write(&vdev->bars[5].region, 0x204, data, 4); + +out: + /* Restore PCI command register */ + vfio_pci_write_config(pdev, PCI_COMMAND, 0, 2); + + return ret; +} + +static void vfio_setup_resetfn(VFIOPCIDevice *vdev) +{ + PCIDevice *pdev = &vdev->pdev; + uint16_t vendor, device; + + vendor = pci_get_word(pdev->config + PCI_VENDOR_ID); + device = pci_get_word(pdev->config + PCI_DEVICE_ID); + + switch (vendor) { + case 0x1002: + switch (device) { + /* Bonaire */ + case 0x6649: /* Bonaire [FirePro W5100] */ + case 0x6650: + case 0x6651: + case 0x6658: /* Bonaire XTX [Radeon R7 260X] */ + case 0x665c: /* Bonaire XT [Radeon HD 7790/8770 / R9 260 OEM] */ + case 0x665d: /* Bonaire [Radeon R7 200 Series] */ + /* Hawaii */ + case 0x67A0: /* Hawaii XT GL [FirePro W9100] */ + case 0x67A1: /* Hawaii PRO GL [FirePro W8100] */ + case 0x67A2: + case 0x67A8: + case 0x67A9: + case 0x67AA: + case 0x67B0: /* Hawaii XT [Radeon R9 290X] */ + case 0x67B1: /* Hawaii PRO [Radeon R9 290] */ + case 0x67B8: + case 0x67B9: + case 0x67BA: + case 0x67BE: + vdev->resetfn = vfio_radeon_reset; + break; + } + break; + } +} + static int vfio_initfn(PCIDevice *pdev) { VFIOPCIDevice *vdev = DO_UPCAST(VFIOPCIDevice, pdev, pdev); @@ -3467,6 +3624,7 @@ static int vfio_initfn(PCIDevice *pdev) vfio_register_err_notifier(vdev); vfio_register_req_notifier(vdev); + vfio_setup_resetfn(vdev); return 0; @@ -3514,6 +3672,10 @@ static void vfio_pci_reset(DeviceState *dev) vfio_pci_pre_reset(vdev); + if (vdev->resetfn && !vdev->resetfn(vdev)) { + goto post_reset; + } + if (vdev->vbasedev.reset_works && (vdev->has_flr || !vdev->has_pm_reset) && !ioctl(vdev->vbasedev.fd, VFIO_DEVICE_RESET)) {