Multi GPU passthrough via VFIO

Message ID 20140207002258.GJ994@parallels.com
State New

Commit Message

Maik Broemme Feb. 7, 2014, 12:22 a.m. UTC
Hi Alex,

Alex Williamson <alex.williamson@redhat.com> wrote:
> On Thu, 2014-02-06 at 01:25 +0100, Maik Broemme wrote:
> > Hi Alex,
> > 
> > Maik Broemme <mbroemme@parallels.com> wrote:
> > > > > > > Another minor issue is that the R9 290X is not reset during shutdown of
> > > > > > > VM (neither Linux nor Windows) but it can be tricked with doing
> > > > > > > "suspend-to-ram" between two starts. That's why I use '-no-reboot' option
> > > > > > > in QEMU. The 7870 is doing the reset properly.
> > > > > > 
> > > > > > 
> > > > > > Is the NoSoftRst "-" on the 290X vs "+" on the 7870 in lspci -vvv by
> > > > > > chance?  Thanks,
> > > > > > 
> > > > > 
> > > > > Here are both. It is funny it is opposite as you described. :)
> > > > 
> > > > 
> > > > Oops, yes.  Does this help?
> > > > 
> > > > --- a/hw/misc/vfio.c
> > > > +++ b/hw/misc/vfio.c
> > > > @@ -3136,7 +3136,7 @@ static void vfio_pci_reset_handler(void *opaque)
> > > >  
> > > >      QLIST_FOREACH(group, &group_list, next) {
> > > >          QLIST_FOREACH(vdev, &group->device_list, next) {
> > > > -            if (!vdev->reset_works || (!vdev->has_flr && vdev->has_pm_reset)) {
> > > > +            if (!vdev->reset_works || !vdev->has_flr) {
> > > >                  vdev->needs_reset = true;
> > > >              }
> > > >          }
> > > > 
> > > > I can't figure out why I coded it the way that I did.  Probably overly
> > > > targeting a specific device.  Thanks,
> > > > 
> > > 
> > > > This patch works absolutely fine. After applying it to my 'qemu-git', the
> > > > device reset works flawlessly. So it would be great to push it upstream
> > > > as it looks good.
> > > 
> > 
> > > Okay, sorry. I was too fast here. It worked only the first time; now,
> > > even after a clean reboot, it no longer works as expected, and the behavior
> > > is very strange.
> > 
> > Windows:
> > 
> >   1st boot works fine - boot VGA and Windows ATI driver loaded, issue
> >       reboot and qemu stopped due to '-no-reboot'.
> > 
> > >   2nd boot works partially - boot VGA and Windows ATI driver loaded but
> > >       black screen and my system becomes terribly slow and mostly
> > >       unresponsive. Immediately after the ATI driver enables the device,
> > >       my dmesg shows the following:
> > 
> > [  159.984324] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x19@0x270
> > [  159.984340] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x1b@0x2d0
> > [  160.129036] vfio_ecap_init: 0000:02:00.0 hiding ecap 0x19@0x270
> > [  160.129049] vfio_ecap_init: 0000:02:00.0 hiding ecap 0x1b@0x2d0
> > [  172.977677] kvm: zapping shadow pages for mmio generation wraparound
> > [  173.160174] br0: port 2(tap0) entered forwarding state
> > [  175.902967] vfio-pci 0000:01:00.0: irq 46 for MSI/MSI-X
> > [  188.340430] Clocksource tsc unstable (delta = -119654611 ns)
> > [  188.340511] Switched to clocksource hpet
> > [  191.088693] hpet1: lost 12 rtc interrupts
> > [  191.926555] hpet1: lost 25 rtc interrupts
> > 
> > >   So your patch indeed fixed the reset issue of the boot VGA, but something
> > >   else is wrong now. :)
> 
> Can you try the cards separately?  If you run lspci on the device in the
> host, does it report as normal?  Often when the host gets slow and we
> get these sorts of clock issues it means the bus is fatal and we get
> timeouts trying to read from it.
> 

Okay with only one card I don't have the clock issues anymore, so we
should look into this a bit later as working reset seems more important
for now.
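
For reference, the condition change in the patch quoted above can be modelled outside QEMU. The following is a minimal sketch, not QEMU code: `struct reset_caps` and the three function names are hypothetical stand-ins that only mirror the `reset_works`, `has_flr` and `has_pm_reset` fields used in the patch. It suggests that the old and new checks differ for exactly one combination: a device whose reset works but which has neither FLR nor PM reset.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the three fields the patch reads;
 * this is NOT the real QEMU device structure. */
struct reset_caps {
    bool reset_works;
    bool has_flr;
    bool has_pm_reset;
};

/* Condition as it was before the patch. */
bool needs_reset_old(const struct reset_caps *c)
{
    return !c->reset_works || (!c->has_flr && c->has_pm_reset);
}

/* Condition as proposed in the patch. */
bool needs_reset_new(const struct reset_caps *c)
{
    return !c->reset_works || !c->has_flr;
}

/* Count how many of the 8 field combinations the two conditions
 * disagree on. */
int count_differing_cases(void)
{
    int differing = 0;

    for (int bits = 0; bits < 8; bits++) {
        struct reset_caps c = {
            .reset_works  = (bits & 1) != 0,
            .has_flr      = (bits & 2) != 0,
            .has_pm_reset = (bits & 4) != 0,
        };
        if (needs_reset_old(&c) != needs_reset_new(&c)) {
            differing++;
        }
    }
    return differing;
}
```

With the old condition such a device was never marked `needs_reset`; with the new one it is. Whether that is the R9 290X case depends on how QEMU derives `has_flr` and `has_pm_reset` from config space, which this sketch does not model.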

> > Linux (fglrx):
> > 
> >   1st boot works fine - boot VGA, fglrx loads fine and X could be
> >       started, issue reboot via SSH and qemu stopped due to
> >       '-no-reboot'.
> > 
> >   2nd boot works partially - boot VGA, fglrx loads fine but X couldn't
> >       be started and fails with:
> > 
> > [   34.265111] fglrx_pci 0000:02:00.0: irq 50 for MSI/MSI-X
> > [   34.344313] <6>[fglrx] Firegl kernel thread PID: 318
> > [   34.344400] <6>[fglrx] Firegl kernel thread PID: 319
> > [   34.344478] <6>[fglrx] Firegl kernel thread PID: 320
> > [   34.344589] <6>[fglrx] IRQ 50 Enabled
> > [   34.356105] <6>[fglrx] Reserved FB block: Shared offset:0, size:1000000 
> > [   34.356107] <6>[fglrx] Reserved FB block: Unshared offset:fac3000, size:3000 
> > [   34.356109] <6>[fglrx] Reserved FB block: Unshared offset:fac6000, size:23a000 
> > [   34.356110] <6>[fglrx] Reserved FB block: Unshared offset:7fff4000, size:c000 
> > [   34.386436] fglrx_pci 0000:01:00.0: irq 51 for MSI/MSI-X
> > [   34.490902] <6>[fglrx] Firegl kernel thread PID: 321
> > [   34.490994] <6>[fglrx] Firegl kernel thread PID: 322
> > [   34.491069] <6>[fglrx] Firegl kernel thread PID: 323
> > [   34.491166] <6>[fglrx] IRQ 51 Enabled
> > [   34.505271] <6>[fglrx] Reserved FB block: Shared offset:0, size:1000000 
> > [   34.505273] <6>[fglrx] Reserved FB block: Unshared offset:f9c3000, size:3000 
> > [   34.505274] <6>[fglrx] Reserved FB block: Unshared offset:f9c6000, size:23a000 
> > [   34.505276] <6>[fglrx] Reserved FB block: Unshared offset:fc00000, size:100000 
> > [   34.505277] <6>[fglrx] Reserved FB block: Unshared offset:fff8000, size:8000 
> > [   34.505278] <6>[fglrx] Reserved FB block: Unshared offset:ffff4000, size:c000 
> > [   34.526198] BUG: unable to handle kernel paging request at ffff880c724e8008
> > [   34.526203] IP: [<ffffffffa0399af6>] TF_PhwCIslands_PopulateAndUploadSclkMclkDPMLevels+0x96/0x3d0 [fglrx]
> > [   34.526277] PGD 1b3e067 PUD 0 
> > [   34.526279] Oops: 0002 [#1] PREEMPT SMP 
> > [   34.526282] Modules linked in: mousedev crct10dif_pclmul crct10dif_common crc32_pclmul crc32c_intel ghash_clmulni_intel ppdev aesni_intel snd_hda_codec_hdmi aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd snd_hda_intel microcode snd_hda_codec serio_raw psmouse parport_pc snd_hwdep snd_pcm parport snd_page_alloc processor snd_timer snd soundcore i2c_i801 intel_agp lpc_ich pcspkr intel_gtt i2c_core shpchp evdev fglrx(PO) amd_iommu_v2 button ext4 crc16 mbcache jbd2 atkbd libps2 virtio_blk virtio_net ahci libahci libata scsi_mod i8042 floppy serio virtio_pci virtio_ring virtio
> > [   34.526307] CPU: 1 PID: 316 Comm: X Tainted: P           O 3.13.1-2-ARCH #1
> > [   34.526309] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Bochs 01/01/2011
> > [   34.526311] task: ffff8800776e2d00 ti: ffff880037a28000 task.ti: ffff880037a28000
> > [   34.526312] RIP: 0010:[<ffffffffa0399af6>]  [<ffffffffa0399af6>] TF_PhwCIslands_PopulateAndUploadSclkMclkDPMLevels+0x96/0x3d0 [fglrx]
> > [   34.526353] RSP: 0018:ffff880037a29810  EFLAGS: 00010296
> > [   34.526354] RAX: 0000000000000001 RBX: ffff8800724e800c RCX: 0000000000000006
> > [   34.526356] RDX: 0000000000000003 RSI: 0000000000000002 RDI: ffff8800724e8264
> > [   34.526357] RBP: ffff88007b19a00c R08: 00000000000186a0 R09: 000000000001e848
> > [   34.526358] R10: 00000002fffffffd R11: 00000000ffffffff R12: 0000000000000001
> > [   34.526359] R13: ffff88007b19a00c R14: 0000000000000000 R15: ffff880037a298b0
> > [   34.526363] FS:  00007f0ba649b880(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000
> > [   34.526365] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> > [   34.526366] CR2: ffff880c724e8008 CR3: 0000000037998000 CR4: 00000000000406e0
> > [   34.526372] Stack:
> > [   34.526373]  ffff88007b19a2f4 ffff88007bffcd1c 0000000000000001 ffffffffa0322cf0
> > [   34.526375]  0000000000000000 0000000000000000 0000000000000000 ffff880077ed2c08
> > [   34.526378]  0000000000000000 ffff880077ed2c08 ffff880037a298a0 ffffffffa0327f14
> > [   34.526380] Call Trace:
> > [   34.526435]  [<ffffffffa0322cf0>] ? PHM_DispatchTable+0xf0/0x220 [fglrx]
> > [   34.526490]  [<ffffffffa0327f14>] ? PECI_NotifyDALPreAdapterClockChange+0x144/0x160 [fglrx]
> > [   34.526546]  [<ffffffffa031e321>] ? PHM_SetPowerState+0x31/0xc0 [fglrx]
> > [   34.526597]  [<ffffffffa0340a5b>] ? PSM_ApplyHardwareAttributes_Dynamic+0x9b/0xf0 [fglrx]
> > [   34.526651]  [<ffffffffa033fde9>] ? PSM_AdjustPowerState_Dynamic+0x169/0x540 [fglrx]
> > [   34.526668]  [<ffffffffa0322cf0>] ? PHM_DispatchTable+0xf0/0x220 [fglrx]
> > [   34.526668]  [<ffffffffa0342ee4>] ? PEM_ExcuteEventChain+0x64/0xe0 [fglrx]
> > [   34.526668]  [<ffffffffa0341302>] ? PEM_HandleEvent+0x92/0xd0 [fglrx]
> > [   34.526668]  [<ffffffffa03357c0>] ? PEM_CWDDEPM_NotifyEvent+0xe0/0x4d0 [fglrx]
> > [   34.526668]  [<ffffffffa0333869>] ? PP_Cwdde+0x109/0x180 [fglrx]
> > [   34.526668]  [<ffffffffa02091dc>] ? firegl_pplib_cwddepm+0xbc/0x130 [fglrx]
> > [   34.526668]  [<ffffffffa02092d9>] ? firegl_pplib_notify_event+0x89/0xd0 [fglrx]
> > [   34.526668]  [<ffffffffa020292f>] ? hal_init_gpu+0x2bf/0x480 [fglrx]
> > [   34.526668]  [<ffffffffa01dcc7b>] ? firegl_open+0x2db/0x310 [fglrx]
> > [   34.526668]  [<ffffffffa01cb287>] ? ip_firegl_open+0x17/0x20 [fglrx]
> > [   34.526668]  [<ffffffffa01ccac8>] ? firegl_stub_open+0x98/0x100 [fglrx]
> > [   34.526668]  [<ffffffff811a82bf>] ? chrdev_open+0x9f/0x1d0
> > [   34.526668]  [<ffffffff811a1967>] ? do_dentry_open+0x1b7/0x2c0
> > [   34.526668]  [<ffffffff811aed41>] ? __inode_permission+0x41/0xb0
> > [   34.526668]  [<ffffffff811a8220>] ? cdev_put+0x30/0x30
> > [   34.526668]  [<ffffffff811a1d91>] ? finish_open+0x31/0x40
> > [   34.526668]  [<ffffffff811b1b72>] ? do_last+0x572/0xe90
> > [   34.526668]  [<ffffffff811af036>] ? link_path_walk+0x236/0x8d0
> > [   34.526668]  [<ffffffff811b254b>] ? path_openat+0xbb/0x6b0
> > [   34.526668]  [<ffffffff811b3c6a>] ? do_filp_open+0x3a/0x90
> > [   34.526668]  [<ffffffff811c0567>] ? __alloc_fd+0xa7/0x130
> > [   34.526668]  [<ffffffff811a2f49>] ? do_sys_open+0x129/0x220
> > [   34.526668]  [<ffffffff811a305e>] ? SyS_open+0x1e/0x20
> > [   34.526668]  [<ffffffff8152136d>] ? system_call_fastpath+0x1a/0x1f
> > [   34.526668] Code: 8b 4a 1c 8b 93 e0 18 00 00 48 8d bb 58 02 00 00 85 d2 0f 84 63 02 00 00 f6 c2 01 0f 84 20 01 00 00 44 8b 1b 41 ff cb 4f 8d 14 5b <46> 89 44 93 08 8b 95 3c 02 00 00 48 89 d0 48 c1 e8 07 a8 01 75 
> > [   34.526668] RIP  [<ffffffffa0399af6>] TF_PhwCIslands_PopulateAndUploadSclkMclkDPMLevels+0x96/0x3d0 [fglrx]
> > [   34.526668]  RSP <ffff880037a29810>
> > [   34.526668] CR2: ffff880c724e8008
> > [   34.526668] ---[ end trace 5431e6dcf1c31dea ]---
> > [   69.317528] type=1006 audit(1391649552.046:4): pid=324 uid=0 old auid=4294967295 new auid=0 old ses=4294967295 new ses=3 res=1
> > 
> > > I know it is the binary driver, and I would also retry with the radeon
> > > one, but I believe there will be a similar crash. In my first try I just
> > > rebooted the Linux VM several times without starting X.
> > 
> > > I got it working one time without getting 'Clocksource tsc unstable', but
> > > now I'm unable to repeat it. So I believe something more is needed.
> 
> Bus resets are a mixed blessing, it returns the card to a relatively
> known state, but it's a fairly unusual event from a platform perspective
> and we have no idea what kind of quirks the host system bios might have
> in place to workaround hardware.  If the bus is not fatal you might try
> running lspci -vvv in the host at various points to see what changed.
> For instance, boot a Linux guest to text mode and see if the card is in
> the same state between first boot and second boot before starting X.
> Thanks,
> 

I tried the R9 290X separately now. You're right, the lspci -vvv output
changes between the 1st and 2nd boot, and the changes are reset if I do
"suspend-to-ram" and resume before the 3rd boot of the VM. Below is the
lspci from the 1st boot and the diffs of the lspci outputs:


After that, if I do the suspend-to-ram / resume trick, I again get the
lspci output from before the 1st boot.
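
The manual comparison of lspci -vvv output between boots can be narrowed down to the flags that flipped. Below is a minimal sketch, assuming two versions of the same lspci line with the same tokens in the same order; `diff_flags` and `next_token` are hypothetical helpers, not part of pciutils. Tokens that do not end in `+` or `-` (for example `Enable-,`) are ignored.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Copy the next whitespace-delimited token from *p into tok and advance *p.
 * Returns 0 when the end of the string is reached. */
static int next_token(const char **p, char *tok, size_t toklen)
{
    const char *s = *p;
    size_t n = 0;

    while (*s == ' ' || *s == '\t') {
        s++;
    }
    if (*s == '\0') {
        *p = s;
        return 0;
    }
    while (*s != '\0' && *s != ' ' && *s != '\t') {
        if (n + 1 < toklen) {
            tok[n++] = *s;
        }
        s++;
    }
    tok[n] = '\0';
    *p = s;
    return 1;
}

/* Walk two lspci flag lines in lockstep and list flags whose sign changed.
 * Each change is written to out as "NAME:old>new", separated by spaces.
 * Returns the number of changed flags. */
int diff_flags(const char *before, const char *after, char *out, size_t outlen)
{
    char a[64], b[64];
    int changed = 0;
    size_t used = 0;

    if (outlen > 0) {
        out[0] = '\0';
    }
    while (next_token(&before, a, sizeof a) && next_token(&after, b, sizeof b)) {
        size_t la = strlen(a), lb = strlen(b);
        char sa = a[la - 1], sb = b[lb - 1];

        /* Only compare tokens shaped like NAME+ / NAME- with the same NAME. */
        if (la != lb || la < 2) {
            continue;
        }
        if ((sa != '+' && sa != '-') || (sb != '+' && sb != '-')) {
            continue;
        }
        if (strncmp(a, b, la - 1) != 0 || sa == sb) {
            continue;
        }
        changed++;
        if (used < outlen) {
            int n = snprintf(out + used, outlen - used, "%s%.*s:%c>%c",
                             used ? " " : "", (int)(la - 1), a, sa, sb);
            if (n > 0) {
                used += (size_t)n;
            }
        }
    }
    return changed;
}
```

Fed two `Control:` lines that differ only in SERR going from `-` to `+`, it reports one change, `SERR:->+`. This is only a convenience on top of a plain diff of the saved lspci logs and does not replace it.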

> Alex
> 

--Maik

Comments

Maik Broemme Feb. 7, 2014, 6:07 p.m. UTC | #1
Hi Alex,

Maik Broemme <mbroemme@parallels.com> wrote:
> Hi Alex,
> 
> Alex Williamson <alex.williamson@redhat.com> wrote:
> > On Thu, 2014-02-06 at 01:25 +0100, Maik Broemme wrote:
> > > Hi Alex,
> > > 
> > > Maik Broemme <mbroemme@parallels.com> wrote:
> > > > > > > > Another minor issue is that the R9 290X is not reset during shutdown of
> > > > > > > > VM (neither Linux nor Windows) but it can be tricked with doing
> > > > > > > > "suspend-to-ram" between two starts. That's why I use '-no-reboot' option
> > > > > > > > in QEMU. The 7870 is doing the reset properly.
> > > > > > > 
> > > > > > > 
> > > > > > > Is the NoSoftRst "-" on the 290X vs "+" on the 7870 in lspci -vvv by
> > > > > > > chance?  Thanks,
> > > > > > > 
> > > > > > 
> > > > > > Here are both. It is funny it is opposite as you described. :)
> > > > > 
> > > > > 
> > > > > Oops, yes.  Does this help?
> > > > > 
> > > > > --- a/hw/misc/vfio.c
> > > > > +++ b/hw/misc/vfio.c
> > > > > @@ -3136,7 +3136,7 @@ static void vfio_pci_reset_handler(void *opaque)
> > > > >  
> > > > >      QLIST_FOREACH(group, &group_list, next) {
> > > > >          QLIST_FOREACH(vdev, &group->device_list, next) {
> > > > > -            if (!vdev->reset_works || (!vdev->has_flr && vdev->has_pm_reset)) {
> > > > > +            if (!vdev->reset_works || !vdev->has_flr) {
> > > > >                  vdev->needs_reset = true;
> > > > >              }
> > > > >          }
> > > > > 
> > > > > I can't figure out why I coded it the way that I did.  Probably overly
> > > > > targeting a specific device.  Thanks,
> > > > > 
> > > > 
> > > > > This patch works absolutely fine. After applying it to my 'qemu-git', the
> > > > > device reset works flawlessly. So it would be great to push it upstream
> > > > > as it looks good.
> > > > 
> > > 
> > > > Okay, sorry. I was too fast here. It worked only the first time; now,
> > > > even after a clean reboot, it no longer works as expected, and the behavior
> > > > is very strange.
> > > 
> > > Windows:
> > > 
> > >   1st boot works fine - boot VGA and Windows ATI driver loaded, issue
> > >       reboot and qemu stopped due to '-no-reboot'.
> > > 
> > > >   2nd boot works partially - boot VGA and Windows ATI driver loaded but
> > > >       black screen and my system becomes terribly slow and mostly
> > > >       unresponsive. Immediately after the ATI driver enables the device,
> > > >       my dmesg shows the following:
> > > 
> > > [  159.984324] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x19@0x270
> > > [  159.984340] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x1b@0x2d0
> > > [  160.129036] vfio_ecap_init: 0000:02:00.0 hiding ecap 0x19@0x270
> > > [  160.129049] vfio_ecap_init: 0000:02:00.0 hiding ecap 0x1b@0x2d0
> > > [  172.977677] kvm: zapping shadow pages for mmio generation wraparound
> > > [  173.160174] br0: port 2(tap0) entered forwarding state
> > > [  175.902967] vfio-pci 0000:01:00.0: irq 46 for MSI/MSI-X
> > > [  188.340430] Clocksource tsc unstable (delta = -119654611 ns)
> > > [  188.340511] Switched to clocksource hpet
> > > [  191.088693] hpet1: lost 12 rtc interrupts
> > > [  191.926555] hpet1: lost 25 rtc interrupts
> > > 
> > > >   So your patch indeed fixed the reset issue of the boot VGA, but something
> > > >   else is wrong now. :)
> > 
> > Can you try the cards separately?  If you run lspci on the device in the
> > host, does it report as normal?  Often when the host gets slow and we
> > get these sorts of clock issues it means the bus is fatal and we get
> > timeouts trying to read from it.
> > 
> 
> Okay with only one card I don't have the clock issues anymore, so we
> should look into this a bit later as working reset seems more important
> for now.
> 
> > > Linux (fglrx):
> > > 
> > >   1st boot works fine - boot VGA, fglrx loads fine and X could be
> > >       started, issue reboot via SSH and qemu stopped due to
> > >       '-no-reboot'.
> > > 
> > >   2nd boot works partially - boot VGA, fglrx loads fine but X couldn't
> > >       be started and fails with:
> > > 
> > > [   34.265111] fglrx_pci 0000:02:00.0: irq 50 for MSI/MSI-X
> > > [   34.344313] <6>[fglrx] Firegl kernel thread PID: 318
> > > [   34.344400] <6>[fglrx] Firegl kernel thread PID: 319
> > > [   34.344478] <6>[fglrx] Firegl kernel thread PID: 320
> > > [   34.344589] <6>[fglrx] IRQ 50 Enabled
> > > [   34.356105] <6>[fglrx] Reserved FB block: Shared offset:0, size:1000000 
> > > [   34.356107] <6>[fglrx] Reserved FB block: Unshared offset:fac3000, size:3000 
> > > [   34.356109] <6>[fglrx] Reserved FB block: Unshared offset:fac6000, size:23a000 
> > > [   34.356110] <6>[fglrx] Reserved FB block: Unshared offset:7fff4000, size:c000 
> > > [   34.386436] fglrx_pci 0000:01:00.0: irq 51 for MSI/MSI-X
> > > [   34.490902] <6>[fglrx] Firegl kernel thread PID: 321
> > > [   34.490994] <6>[fglrx] Firegl kernel thread PID: 322
> > > [   34.491069] <6>[fglrx] Firegl kernel thread PID: 323
> > > [   34.491166] <6>[fglrx] IRQ 51 Enabled
> > > [   34.505271] <6>[fglrx] Reserved FB block: Shared offset:0, size:1000000 
> > > [   34.505273] <6>[fglrx] Reserved FB block: Unshared offset:f9c3000, size:3000 
> > > [   34.505274] <6>[fglrx] Reserved FB block: Unshared offset:f9c6000, size:23a000 
> > > [   34.505276] <6>[fglrx] Reserved FB block: Unshared offset:fc00000, size:100000 
> > > [   34.505277] <6>[fglrx] Reserved FB block: Unshared offset:fff8000, size:8000 
> > > [   34.505278] <6>[fglrx] Reserved FB block: Unshared offset:ffff4000, size:c000 
> > > [   34.526198] BUG: unable to handle kernel paging request at ffff880c724e8008
> > > [   34.526203] IP: [<ffffffffa0399af6>] TF_PhwCIslands_PopulateAndUploadSclkMclkDPMLevels+0x96/0x3d0 [fglrx]
> > > [   34.526277] PGD 1b3e067 PUD 0 
> > > [   34.526279] Oops: 0002 [#1] PREEMPT SMP 
> > > [   34.526282] Modules linked in: mousedev crct10dif_pclmul crct10dif_common crc32_pclmul crc32c_intel ghash_clmulni_intel ppdev aesni_intel snd_hda_codec_hdmi aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd snd_hda_intel microcode snd_hda_codec serio_raw psmouse parport_pc snd_hwdep snd_pcm parport snd_page_alloc processor snd_timer snd soundcore i2c_i801 intel_agp lpc_ich pcspkr intel_gtt i2c_core shpchp evdev fglrx(PO) amd_iommu_v2 button ext4 crc16 mbcache jbd2 atkbd libps2 virtio_blk virtio_net ahci libahci libata scsi_mod i8042 floppy serio virtio_pci virtio_ring virtio
> > > [   34.526307] CPU: 1 PID: 316 Comm: X Tainted: P           O 3.13.1-2-ARCH #1
> > > [   34.526309] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Bochs 01/01/2011
> > > [   34.526311] task: ffff8800776e2d00 ti: ffff880037a28000 task.ti: ffff880037a28000
> > > [   34.526312] RIP: 0010:[<ffffffffa0399af6>]  [<ffffffffa0399af6>] TF_PhwCIslands_PopulateAndUploadSclkMclkDPMLevels+0x96/0x3d0 [fglrx]
> > > [   34.526353] RSP: 0018:ffff880037a29810  EFLAGS: 00010296
> > > [   34.526354] RAX: 0000000000000001 RBX: ffff8800724e800c RCX: 0000000000000006
> > > [   34.526356] RDX: 0000000000000003 RSI: 0000000000000002 RDI: ffff8800724e8264
> > > [   34.526357] RBP: ffff88007b19a00c R08: 00000000000186a0 R09: 000000000001e848
> > > [   34.526358] R10: 00000002fffffffd R11: 00000000ffffffff R12: 0000000000000001
> > > [   34.526359] R13: ffff88007b19a00c R14: 0000000000000000 R15: ffff880037a298b0
> > > [   34.526363] FS:  00007f0ba649b880(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000
> > > [   34.526365] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> > > [   34.526366] CR2: ffff880c724e8008 CR3: 0000000037998000 CR4: 00000000000406e0
> > > [   34.526372] Stack:
> > > [   34.526373]  ffff88007b19a2f4 ffff88007bffcd1c 0000000000000001 ffffffffa0322cf0
> > > [   34.526375]  0000000000000000 0000000000000000 0000000000000000 ffff880077ed2c08
> > > [   34.526378]  0000000000000000 ffff880077ed2c08 ffff880037a298a0 ffffffffa0327f14
> > > [   34.526380] Call Trace:
> > > [   34.526435]  [<ffffffffa0322cf0>] ? PHM_DispatchTable+0xf0/0x220 [fglrx]
> > > [   34.526490]  [<ffffffffa0327f14>] ? PECI_NotifyDALPreAdapterClockChange+0x144/0x160 [fglrx]
> > > [   34.526546]  [<ffffffffa031e321>] ? PHM_SetPowerState+0x31/0xc0 [fglrx]
> > > [   34.526597]  [<ffffffffa0340a5b>] ? PSM_ApplyHardwareAttributes_Dynamic+0x9b/0xf0 [fglrx]
> > > [   34.526651]  [<ffffffffa033fde9>] ? PSM_AdjustPowerState_Dynamic+0x169/0x540 [fglrx]
> > > [   34.526668]  [<ffffffffa0322cf0>] ? PHM_DispatchTable+0xf0/0x220 [fglrx]
> > > [   34.526668]  [<ffffffffa0342ee4>] ? PEM_ExcuteEventChain+0x64/0xe0 [fglrx]
> > > [   34.526668]  [<ffffffffa0341302>] ? PEM_HandleEvent+0x92/0xd0 [fglrx]
> > > [   34.526668]  [<ffffffffa03357c0>] ? PEM_CWDDEPM_NotifyEvent+0xe0/0x4d0 [fglrx]
> > > [   34.526668]  [<ffffffffa0333869>] ? PP_Cwdde+0x109/0x180 [fglrx]
> > > [   34.526668]  [<ffffffffa02091dc>] ? firegl_pplib_cwddepm+0xbc/0x130 [fglrx]
> > > [   34.526668]  [<ffffffffa02092d9>] ? firegl_pplib_notify_event+0x89/0xd0 [fglrx]
> > > [   34.526668]  [<ffffffffa020292f>] ? hal_init_gpu+0x2bf/0x480 [fglrx]
> > > [   34.526668]  [<ffffffffa01dcc7b>] ? firegl_open+0x2db/0x310 [fglrx]
> > > [   34.526668]  [<ffffffffa01cb287>] ? ip_firegl_open+0x17/0x20 [fglrx]
> > > [   34.526668]  [<ffffffffa01ccac8>] ? firegl_stub_open+0x98/0x100 [fglrx]
> > > [   34.526668]  [<ffffffff811a82bf>] ? chrdev_open+0x9f/0x1d0
> > > [   34.526668]  [<ffffffff811a1967>] ? do_dentry_open+0x1b7/0x2c0
> > > [   34.526668]  [<ffffffff811aed41>] ? __inode_permission+0x41/0xb0
> > > [   34.526668]  [<ffffffff811a8220>] ? cdev_put+0x30/0x30
> > > [   34.526668]  [<ffffffff811a1d91>] ? finish_open+0x31/0x40
> > > [   34.526668]  [<ffffffff811b1b72>] ? do_last+0x572/0xe90
> > > [   34.526668]  [<ffffffff811af036>] ? link_path_walk+0x236/0x8d0
> > > [   34.526668]  [<ffffffff811b254b>] ? path_openat+0xbb/0x6b0
> > > [   34.526668]  [<ffffffff811b3c6a>] ? do_filp_open+0x3a/0x90
> > > [   34.526668]  [<ffffffff811c0567>] ? __alloc_fd+0xa7/0x130
> > > [   34.526668]  [<ffffffff811a2f49>] ? do_sys_open+0x129/0x220
> > > [   34.526668]  [<ffffffff811a305e>] ? SyS_open+0x1e/0x20
> > > [   34.526668]  [<ffffffff8152136d>] ? system_call_fastpath+0x1a/0x1f
> > > [   34.526668] Code: 8b 4a 1c 8b 93 e0 18 00 00 48 8d bb 58 02 00 00 85 d2 0f 84 63 02 00 00 f6 c2 01 0f 84 20 01 00 00 44 8b 1b 41 ff cb 4f 8d 14 5b <46> 89 44 93 08 8b 95 3c 02 00 00 48 89 d0 48 c1 e8 07 a8 01 75 
> > > [   34.526668] RIP  [<ffffffffa0399af6>] TF_PhwCIslands_PopulateAndUploadSclkMclkDPMLevels+0x96/0x3d0 [fglrx]
> > > [   34.526668]  RSP <ffff880037a29810>
> > > [   34.526668] CR2: ffff880c724e8008
> > > [   34.526668] ---[ end trace 5431e6dcf1c31dea ]---
> > > [   69.317528] type=1006 audit(1391649552.046:4): pid=324 uid=0 old auid=4294967295 new auid=0 old ses=4294967295 new ses=3 res=1
> > > 
> > > > I know it is the binary driver, and I would also retry with the radeon
> > > > one, but I believe there will be a similar crash. In my first try I just
> > > > rebooted the Linux VM several times without starting X.
> > > 
> > > > I got it working one time without getting 'Clocksource tsc unstable', but
> > > > now I'm unable to repeat it. So I believe something more is needed.
> > 
> > Bus resets are a mixed blessing, it returns the card to a relatively
> > known state, but it's a fairly unusual event from a platform perspective
> > and we have no idea what kind of quirks the host system bios might have
> > in place to workaround hardware.  If the bus is not fatal you might try
> > running lspci -vvv in the host at various points to see what changed.
> > For instance, boot a Linux guest to text mode and see if the card is in
> > the same state between first boot and second boot before starting X.
> > Thanks,
> > 
> 
> I tried the R9 290X separately now. You're right, the lspci -vvv output
> changes between the 1st and 2nd boot, and the changes are reset if I do
> "suspend-to-ram" and resume before the 3rd boot of the VM. Below is the
> lspci from the 1st boot and the diffs of the lspci outputs:
> 
> --- 001-lspci.290x.before.1st.log	2014-02-07 01:13:41.498827928 +0100
> +++ 002-lspci.290x.during.1st.before.X.log	2014-02-07 01:14:47.984612423 +0100
> @@ -1,6 +1,6 @@
>  01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Hawaii XT [Radeon HD 8970] (prog-if 00 [VGA controller])
>  	Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device 0b00
> -	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
> +	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
>  	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>  	Latency: 0, Cache Line Size: 64 bytes
>  	Interrupt: pin A routed to IRQ 18
> @@ -19,7 +19,7 @@
>  		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
>  			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
>  			MaxPayload 128 bytes, MaxReadReq 512 bytes
> -		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
> +		DevSta:	CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
>  		LnkCap:	Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
>  			ClockPM- Surprise- LLActRep- BwNot-
>  		LnkCtl:	ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
> @@ -39,13 +39,13 @@
>  		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>  		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>  		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
> -		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
> +		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
>  		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
>  		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
>  	Capabilities: [270 v1] #19
>  	Capabilities: [2b0 v1] Address Translation Service (ATS)
>  		ATSCap:	Invalidate Queue Depth: 00
> -		ATSCtl:	Enable-, Smallest Translation Unit: 00
> +		ATSCtl:	Enable+, Smallest Translation Unit: 00
>  	Capabilities: [2c0 v1] #13
>  	Capabilities: [2d0 v1] #1b
>  	Kernel driver in use: vfio-pci
> 
> --- 002-lspci.290x.during.1st.before.X.log	2014-02-07 01:14:47.984612423 +0100
> +++ 003-lspci.290x.during.1st.after.X.log	2014-02-07 01:16:29.644846503 +0100
> @@ -1,9 +1,9 @@
>  01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Hawaii XT [Radeon HD 8970] (prog-if 00 [VGA controller])
>  	Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device 0b00
> -	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
> +	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
>  	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>  	Latency: 0, Cache Line Size: 64 bytes
> -	Interrupt: pin A routed to IRQ 18
> +	Interrupt: pin A routed to IRQ 47
>  	Region 0: Memory at c0000000 (64-bit, prefetchable) [size=256M]
>  	Region 2: Memory at df800000 (64-bit, prefetchable) [size=8M]
>  	Region 4: I/O ports at be00 [size=256]
> @@ -17,14 +17,14 @@
>  		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
>  			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
>  		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
> -			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
> +			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
>  			MaxPayload 128 bytes, MaxReadReq 512 bytes
> -		DevSta:	CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
> +		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
>  		LnkCap:	Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
>  			ClockPM- Surprise- LLActRep- BwNot-
>  		LnkCtl:	ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
>  			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> -		LnkSta:	Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> +		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
>  		DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
>  		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
>  		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
> @@ -32,8 +32,8 @@
>  			 Compliance De-emphasis: -6dB
>  		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
>  			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
> -	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
> -		Address: 0000000000000000  Data: 0000
> +	Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
> +		Address: 00000000fee00000  Data: 0000
>  	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
>  	Capabilities: [150 v2] Advanced Error Reporting
>  		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> 
> Now I stopped X and powered down the VM and started 2nd cycle:
> 
> --- 003-lspci.290x.during.1st.after.X.log	2014-02-07 01:16:29.644846503 +0100
> +++ 004-lspci.290x.before.2nd.log	2014-02-07 01:16:50.966611282 +0100
> @@ -1,9 +1,9 @@
>  01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Hawaii XT [Radeon HD 8970] (prog-if 00 [VGA controller])
>  	Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device 0b00
> -	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
> +	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
>  	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>  	Latency: 0, Cache Line Size: 64 bytes
> -	Interrupt: pin A routed to IRQ 47
> +	Interrupt: pin A routed to IRQ 18
>  	Region 0: Memory at c0000000 (64-bit, prefetchable) [size=256M]
>  	Region 2: Memory at df800000 (64-bit, prefetchable) [size=8M]
>  	Region 4: I/O ports at be00 [size=256]
> @@ -17,7 +17,7 @@
>  		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
>  			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
>  		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
> -			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
> +			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
>  			MaxPayload 128 bytes, MaxReadReq 512 bytes
>  		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
>  		LnkCap:	Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
> @@ -32,7 +32,7 @@
>  			 Compliance De-emphasis: -6dB
>  		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
>  			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
> -	Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
> +	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
>  		Address: 00000000fee00000  Data: 0000
>  	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
>  	Capabilities: [150 v2] Advanced Error Reporting
> @@ -45,7 +45,7 @@
>  	Capabilities: [270 v1] #19
>  	Capabilities: [2b0 v1] Address Translation Service (ATS)
>  		ATSCap:	Invalidate Queue Depth: 00
> -		ATSCtl:	Enable+, Smallest Translation Unit: 00
> +		ATSCtl:	Enable-, Smallest Translation Unit: 00
>  	Capabilities: [2c0 v1] #13
>  	Capabilities: [2d0 v1] #1b
>  	Kernel driver in use: vfio-pci
> 
> --- 004-lspci.290x.before.2nd.log	2014-02-07 01:16:50.966611282 +0100
> +++ 005-lspci.290x.during.2nd.before.X.log	2014-02-07 01:17:55.571676376 +0100
> @@ -1,6 +1,6 @@
>  01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Hawaii XT [Radeon HD 8970] (prog-if 00 [VGA controller])
>  	Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device 0b00
> -	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
> +	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
>  	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>  	Latency: 0, Cache Line Size: 64 bytes
>  	Interrupt: pin A routed to IRQ 18
> @@ -19,12 +19,12 @@
>  		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
>  			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
>  			MaxPayload 128 bytes, MaxReadReq 512 bytes
> -		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
> +		DevSta:	CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
>  		LnkCap:	Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
>  			ClockPM- Surprise- LLActRep- BwNot-
>  		LnkCtl:	ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
>  			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> -		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> +		LnkSta:	Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
>  		DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
>  		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
>  		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
> @@ -33,7 +33,7 @@
>  		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
>  			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
>  	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
> -		Address: 00000000fee00000  Data: 0000
> +		Address: 0000000000000000  Data: 0000
>  	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
>  	Capabilities: [150 v2] Advanced Error Reporting
>  		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> @@ -45,7 +45,7 @@
>  	Capabilities: [270 v1] #19
>  	Capabilities: [2b0 v1] Address Translation Service (ATS)
>  		ATSCap:	Invalidate Queue Depth: 00
> -		ATSCtl:	Enable-, Smallest Translation Unit: 00
> +		ATSCtl:	Enable+, Smallest Translation Unit: 00
>  	Capabilities: [2c0 v1] #13
>  	Capabilities: [2d0 v1] #1b
>  	Kernel driver in use: vfio-pci
> 
> --- 005-lspci.290x.during.2nd.before.X.log	2014-02-07 01:17:55.571676376 +0100
> +++ 006-lspci.290x.during.2nd.after.X.crash.log	2014-02-07 01:18:16.996855362 +0100
> @@ -1,9 +1,9 @@
>  01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Hawaii XT [Radeon HD 8970] (prog-if 00 [VGA controller])
>  	Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device 0b00
> -	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
> +	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
>  	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>  	Latency: 0, Cache Line Size: 64 bytes
> -	Interrupt: pin A routed to IRQ 18
> +	Interrupt: pin A routed to IRQ 47
>  	Region 0: Memory at c0000000 (64-bit, prefetchable) [size=256M]
>  	Region 2: Memory at df800000 (64-bit, prefetchable) [size=8M]
>  	Region 4: I/O ports at be00 [size=256]
> @@ -17,9 +17,9 @@
>  		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
>  			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
>  		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
> -			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
> +			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
>  			MaxPayload 128 bytes, MaxReadReq 512 bytes
> -		DevSta:	CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
> +		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
>  		LnkCap:	Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
>  			ClockPM- Surprise- LLActRep- BwNot-
>  		LnkCtl:	ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
> @@ -32,8 +32,8 @@
>  			 Compliance De-emphasis: -6dB
>  		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
>  			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
> -	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
> -		Address: 0000000000000000  Data: 0000
> +	Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
> +		Address: 00000000fee00000  Data: 0000
>  	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
>  	Capabilities: [150 v2] Advanced Error Reporting
>  		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> 
> What is interesting is the diff between the 1st and 2nd boot, taken with
> lspci before the VM is started each time. The only differences between
> the 1st start and the 2nd start are:
> 
> --- 001-lspci.290x.before.1st.log	2014-02-07 01:13:41.498827928 +0100
> +++ 004-lspci.290x.before.2nd.log	2014-02-07 01:16:50.966611282 +0100
> @@ -24,7 +24,7 @@
>  			ClockPM- Surprise- LLActRep- BwNot-
>  		LnkCtl:	ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
>  			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> -		LnkSta:	Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> +		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
>  		DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
>  		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
>  		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
> @@ -33,13 +33,13 @@
>  		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
>  			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
>  	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
> -		Address: 0000000000000000  Data: 0000
> +		Address: 00000000fee00000  Data: 0000
>  	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
>  	Capabilities: [150 v2] Advanced Error Reporting
>  		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>  		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>  		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
> -		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
> +		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
>  		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
>  		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
>  	Capabilities: [270 v1] #19
> 
> After that, if I do the suspend-to-ram / resume trick, the lspci output
> is again the same as before the 1st boot.
> 

Another workaround where your patch works fine is to do the following:

  #1 Start VM
  #2 Start X
  #3 Stop X
  #4 rmmod fglrx
  #5 poweroff

After this I'm able to restart the VM as many times as I want with boot
VGA, fglrx and X. Obviously, if the VM crashes I still need the
"suspend-to-ram" / resume workaround. It looks like fglrx properly
disables the device when it is unloaded:

[   36.081197] <6>[fglrx] IRQ 48 Disabled
[   36.096488] <6>[fglrx] module unloaded - fglrx 13.35.5 [Jan 29 2014]

Should I retry with the radeon driver, or with VFIO debug enabled?

> > Alex
> > 
> 
> --Maik
> 

--Maik
Patch

--- 001-lspci.290x.before.1st.log	2014-02-07 01:13:41.498827928 +0100
+++ 002-lspci.290x.during.1st.before.X.log	2014-02-07 01:14:47.984612423 +0100
@@ -1,6 +1,6 @@ 
 01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Hawaii XT [Radeon HD 8970] (prog-if 00 [VGA controller])
 	Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device 0b00
-	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
+	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
 	Latency: 0, Cache Line Size: 64 bytes
 	Interrupt: pin A routed to IRQ 18
@@ -19,7 +19,7 @@ 
 		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
 			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
 			MaxPayload 128 bytes, MaxReadReq 512 bytes
-		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
+		DevSta:	CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
 		LnkCap:	Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
 			ClockPM- Surprise- LLActRep- BwNot-
 		LnkCtl:	ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
@@ -39,13 +39,13 @@ 
 		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
 		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
 		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
-		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
+		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
 		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
 		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
 	Capabilities: [270 v1] #19
 	Capabilities: [2b0 v1] Address Translation Service (ATS)
 		ATSCap:	Invalidate Queue Depth: 00
-		ATSCtl:	Enable-, Smallest Translation Unit: 00
+		ATSCtl:	Enable+, Smallest Translation Unit: 00
 	Capabilities: [2c0 v1] #13
 	Capabilities: [2d0 v1] #1b
 	Kernel driver in use: vfio-pci

--- 002-lspci.290x.during.1st.before.X.log	2014-02-07 01:14:47.984612423 +0100
+++ 003-lspci.290x.during.1st.after.X.log	2014-02-07 01:16:29.644846503 +0100
@@ -1,9 +1,9 @@ 
 01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Hawaii XT [Radeon HD 8970] (prog-if 00 [VGA controller])
 	Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device 0b00
-	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
+	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
 	Latency: 0, Cache Line Size: 64 bytes
-	Interrupt: pin A routed to IRQ 18
+	Interrupt: pin A routed to IRQ 47
 	Region 0: Memory at c0000000 (64-bit, prefetchable) [size=256M]
 	Region 2: Memory at df800000 (64-bit, prefetchable) [size=8M]
 	Region 4: I/O ports at be00 [size=256]
@@ -17,14 +17,14 @@ 
 		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
 			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
 		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
-			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
+			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
 			MaxPayload 128 bytes, MaxReadReq 512 bytes
-		DevSta:	CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
+		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
 		LnkCap:	Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
 			ClockPM- Surprise- LLActRep- BwNot-
 		LnkCtl:	ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
 			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
-		LnkSta:	Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
+		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
 		DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
 		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
 		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
@@ -32,8 +32,8 @@ 
 			 Compliance De-emphasis: -6dB
 		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
 			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
-	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
-		Address: 0000000000000000  Data: 0000
+	Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
+		Address: 00000000fee00000  Data: 0000
 	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
 	Capabilities: [150 v2] Advanced Error Reporting
 		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-

Next I stopped X, powered down the VM and started the 2nd cycle:

--- 003-lspci.290x.during.1st.after.X.log	2014-02-07 01:16:29.644846503 +0100
+++ 004-lspci.290x.before.2nd.log	2014-02-07 01:16:50.966611282 +0100
@@ -1,9 +1,9 @@ 
 01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Hawaii XT [Radeon HD 8970] (prog-if 00 [VGA controller])
 	Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device 0b00
-	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
+	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
 	Latency: 0, Cache Line Size: 64 bytes
-	Interrupt: pin A routed to IRQ 47
+	Interrupt: pin A routed to IRQ 18
 	Region 0: Memory at c0000000 (64-bit, prefetchable) [size=256M]
 	Region 2: Memory at df800000 (64-bit, prefetchable) [size=8M]
 	Region 4: I/O ports at be00 [size=256]
@@ -17,7 +17,7 @@ 
 		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
 			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
 		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
-			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
+			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
 			MaxPayload 128 bytes, MaxReadReq 512 bytes
 		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
 		LnkCap:	Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
@@ -32,7 +32,7 @@ 
 			 Compliance De-emphasis: -6dB
 		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
 			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
-	Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
+	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
 		Address: 00000000fee00000  Data: 0000
 	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
 	Capabilities: [150 v2] Advanced Error Reporting
@@ -45,7 +45,7 @@ 
 	Capabilities: [270 v1] #19
 	Capabilities: [2b0 v1] Address Translation Service (ATS)
 		ATSCap:	Invalidate Queue Depth: 00
-		ATSCtl:	Enable+, Smallest Translation Unit: 00
+		ATSCtl:	Enable-, Smallest Translation Unit: 00
 	Capabilities: [2c0 v1] #13
 	Capabilities: [2d0 v1] #1b
 	Kernel driver in use: vfio-pci

--- 004-lspci.290x.before.2nd.log	2014-02-07 01:16:50.966611282 +0100
+++ 005-lspci.290x.during.2nd.before.X.log	2014-02-07 01:17:55.571676376 +0100
@@ -1,6 +1,6 @@ 
 01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Hawaii XT [Radeon HD 8970] (prog-if 00 [VGA controller])
 	Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device 0b00
-	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
+	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
 	Latency: 0, Cache Line Size: 64 bytes
 	Interrupt: pin A routed to IRQ 18
@@ -19,12 +19,12 @@ 
 		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
 			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
 			MaxPayload 128 bytes, MaxReadReq 512 bytes
-		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
+		DevSta:	CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
 		LnkCap:	Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
 			ClockPM- Surprise- LLActRep- BwNot-
 		LnkCtl:	ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
 			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
-		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
+		LnkSta:	Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
 		DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
 		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
 		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
@@ -33,7 +33,7 @@ 
 		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
 			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
 	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
-		Address: 00000000fee00000  Data: 0000
+		Address: 0000000000000000  Data: 0000
 	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
 	Capabilities: [150 v2] Advanced Error Reporting
 		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
@@ -45,7 +45,7 @@ 
 	Capabilities: [270 v1] #19
 	Capabilities: [2b0 v1] Address Translation Service (ATS)
 		ATSCap:	Invalidate Queue Depth: 00
-		ATSCtl:	Enable-, Smallest Translation Unit: 00
+		ATSCtl:	Enable+, Smallest Translation Unit: 00
 	Capabilities: [2c0 v1] #13
 	Capabilities: [2d0 v1] #1b
 	Kernel driver in use: vfio-pci

--- 005-lspci.290x.during.2nd.before.X.log	2014-02-07 01:17:55.571676376 +0100
+++ 006-lspci.290x.during.2nd.after.X.crash.log	2014-02-07 01:18:16.996855362 +0100
@@ -1,9 +1,9 @@ 
 01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Hawaii XT [Radeon HD 8970] (prog-if 00 [VGA controller])
 	Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device 0b00
-	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
+	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
 	Latency: 0, Cache Line Size: 64 bytes
-	Interrupt: pin A routed to IRQ 18
+	Interrupt: pin A routed to IRQ 47
 	Region 0: Memory at c0000000 (64-bit, prefetchable) [size=256M]
 	Region 2: Memory at df800000 (64-bit, prefetchable) [size=8M]
 	Region 4: I/O ports at be00 [size=256]
@@ -17,9 +17,9 @@ 
 		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
 			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
 		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
-			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
+			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
 			MaxPayload 128 bytes, MaxReadReq 512 bytes
-		DevSta:	CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
+		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
 		LnkCap:	Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
 			ClockPM- Surprise- LLActRep- BwNot-
 		LnkCtl:	ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
@@ -32,8 +32,8 @@ 
 			 Compliance De-emphasis: -6dB
 		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
 			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
-	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
-		Address: 0000000000000000  Data: 0000
+	Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
+		Address: 00000000fee00000  Data: 0000
 	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
 	Capabilities: [150 v2] Advanced Error Reporting
 		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
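
The LnkSta lines in the diffs above switch between "5GT/s, Width x16" and
"2.5GT/s, Width x1", while LnkCap stays at "8GT/s, Width x16". A minimal
Python sketch for pulling those numbers out of saved lspci lines follows.
The helper names (link_state, below_capability) are made up for this
illustration and are not part of lspci or QEMU. A negotiated link below
the capability may simply be link power management, so the result is only
a hint, not proof of a failed reset.

```python
import re

# Matches the "Speed <n>GT/s, Width x<n>" part of an lspci LnkCap/LnkSta line.
LINK_RE = re.compile(r"Speed\s+([\d.]+)GT/s,\s+Width\s+x(\d+)")


def link_state(line):
    """Return (speed in GT/s, lane width) from an lspci link line, or None."""
    match = LINK_RE.search(line)
    if match is None:
        return None
    return float(match.group(1)), int(match.group(2))


def below_capability(lnkcap_line, lnksta_line):
    """Return True if the negotiated speed or width is below the capability.

    Returns None if either line cannot be parsed.
    """
    cap = link_state(lnkcap_line)
    sta = link_state(lnksta_line)
    if cap is None or sta is None:
        return None
    return sta[0] < cap[0] or sta[1] < cap[1]
```

Feeding it the LnkCap line and either LnkSta variant from the logs reports
both states as below the advertised 8GT/s x16, so the useful comparison is
between the LnkSta values of two logs rather than against LnkCap alone.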

Interesting is the diff between 1st and 2nd boot, so if I do the lspci
prior to the booting. The only difference between 1st start and 2nd
start are:

--- 001-lspci.290x.before.1st.log	2014-02-07 01:13:41.498827928 +0100
+++ 004-lspci.290x.before.2nd.log	2014-02-07 01:16:50.966611282 +0100
@@ -24,7 +24,7 @@ 
 			ClockPM- Surprise- LLActRep- BwNot-
 		LnkCtl:	ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
 			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
-		LnkSta:	Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
+		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
 		DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
 		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
 		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
@@ -33,13 +33,13 @@ 
 		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
 			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
 	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
-		Address: 0000000000000000  Data: 0000
+		Address: 00000000fee00000  Data: 0000
 	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
 	Capabilities: [150 v2] Advanced Error Reporting
 		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
 		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
 		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
-		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
+		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
 		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
 		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
 	Capabilities: [270 v1] #19
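
The before/after comparison above was done by diffing saved lspci logs by
hand. As a rough sketch, the toggled "Name+" / "Name-" flags on a pair of
lines can also be extracted with a few lines of Python. The function names
(parse_flags, flag_changes) are hypothetical helpers written for this
example, and the parsing assumes the whitespace-separated flag layout seen
in the logs in this thread.

```python
def parse_flags(line):
    """Return {flag_name: bool} for every 'Name+' / 'Name-' token in a line.

    Tokens are split on whitespace; a trailing ',' or ';' is ignored.
    A bare '+' or '-' (for example a diff marker) is skipped.
    """
    flags = {}
    for token in line.split():
        token = token.rstrip(",;")
        if len(token) > 1 and token[-1] in "+-":
            flags[token[:-1]] = token.endswith("+")
    return flags


def flag_changes(old_line, new_line):
    """Return {flag_name: (old_state, new_state)} for flags that differ."""
    old = parse_flags(old_line)
    new = parse_flags(new_line)
    return {
        name: (old[name], new[name])
        for name in old
        if name in new and old[name] != new[name]
    }


if __name__ == "__main__":
    before = "CESta:\tRxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-"
    after = "CESta:\tRxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+"
    print(flag_changes(before, after))  # {'NonFatalErr': (False, True)}
```

Value fields such as the MSI address or the LnkSta speed are not flags and
are deliberately ignored here; they still need a plain diff.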