From patchwork Mon Mar 31 18:14:15 2014
X-Patchwork-Submitter: Cole Robinson
X-Patchwork-Id: 335537
Message-ID: <5339B077.4040707@redhat.com>
Date: Mon, 31 Mar 2014 14:14:15 -0400
From: Cole Robinson
To: Andreas Färber
Cc: Paolo Bonzini, "Michael S. Tsirkin", qemu-devel, Gerd Hoffmann
References: <53387E35.3010909@redhat.com> <533899F1.1030808@suse.de>
In-Reply-To: <533899F1.1030808@suse.de>
Subject: Re: [Qemu-devel] 2.0 regression: loadvm assertion with ehci + tablet

On 03/30/2014 06:25 PM, Andreas Färber wrote:
> Hi,
>
> On 30.03.2014 22:27, Cole Robinson wrote:
>> With git master, loadvm hits an assert failure if using ehci and usb tablet.
>> Steps to reproduce:
>>
>> $ qemu-img create -f qcow2 foo.qcow2 10G
>> $ ./x86_64-softmmu/qemu-system-x86_64 \
>>     -enable-kvm -m 4096 \
>>     -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x5.0x7 \
>>     -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x5 \
>>     -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x5.0x1 \
>>     -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x5.0x2 \
>>     -device usb-tablet,id=input0 \
>>     -hda foo.qcow2 \
>>     -cdrom Fedora-20-x86_64-Live-Desktop.iso \
>>     -boot d -monitor stdio
>>
>>
>> (qemu) savevm foo
>> (qemu) loadvm foo
>> qemu-system-x86_64: hw/pci/pci.c:250: pcibus_reset: Assertion
>> `bus->irq_count[i] == 0' failed.
>>
>> The relevant backtrace bits for the assertion:
>>
>> #4  0x00007f8f7241971e in pcibus_reset (qbus=0x7f8f74082fd0)
>>     at hw/pci/pci.c:250
>> #5  0x00007f8f723bd36d in qbus_reset_one (bus=0x7f8f74082fd0,
>>     opaque=) at hw/core/qdev.c:249
>> #6  0x00007f8f723bec88 in qdev_walk_children (dev=0x7f8f73efb320,
>>     pre_devfn=0x0, pre_busfn=0x0, post_devfn=0x7f8f723bf4f0 ,
>>     post_busfn=0x7f8f723bd320 , opaque=0x0)
>>     at hw/core/qdev.c:403
>> #7  0x00007f8f723bedb8 in qbus_walk_children (bus=0x7f8f740706e0,
>>     pre_devfn=0x0, pre_busfn=0x0, post_devfn=0x7f8f723bf4f0 ,
>>     post_busfn=0x7f8f723bd320 , opaque=0x0)
>>     at hw/core/qdev.c:369
>> #8  0x00007f8f724f5c5d in qemu_devices_reset () at vl.c:1867
>> #9  qemu_system_reset (report=report@entry=false) at vl.c:1880
>> #10 0x00007f8f7256dba2 in load_vmstate (name=name@entry=0x7f8f7417a160 "foo")
>>     at /home/crobinso/src/qemu/savevm.c:1098
>>
>> The 'cause' is this:
>>
>> #0  ehci_detach (port=0x555556436968) at hw/usb/hcd-ehci.c:810
>> #1  0x0000555555727b5e in usb_detach (port=port@entry=0x555556436968)
>>     at hw/usb/core.c:49
>> #2  0x0000555555736bf3 in ehci_reset (opaque=0x5555564364d8)
>>     at hw/usb/hcd-ehci.c:941
>> #3  0x00005555557e1fcd in qemu_devices_reset () at vl.c:1867
>> #4  qemu_system_reset (report=report@entry=false) at vl.c:1880
>> #5  0x0000555555859f12 in load_vmstate (name=name@entry=0x555556458210 "foo")
>>     at /home/crobinso/src/qemu/savevm.c:1098
>>
>> ehci_reset calls usb_detach which sets pcibus->irq_count[3] = 1. pcibus_reset
>> runs and hits the assertion. But I don't understand this stuff enough to
>> determine what's actually wrong here :)
>>
>> I bisected the issue to:
>>
>> commit 31b030d4abc5bea89c2b33b39d3b302836f6b6ee
>> Author: Andreas Färber
>> Date:   Wed Sep 4 01:29:02 2013 +0200
>>
>>     cputlb: Change tlb_flush_page() argument to CPUState
>>
>>     Signed-off-by: Andreas Färber
>>
>> ...and then I double checked it since that sounds unrelated. Same result.
>
> You are running into an unrelated migration bug:
> http://git.qemu.org/?p=qemu.git;a=commit;h=c01a71c1a56fa27f43449ff59e5d03b2483658a2
>
> Sorry about that. You'll need to patch -p1 the above commit on top of
> each git-bisect commit to find the actual breakage if the above commit
> is already bad (can't test right now).
>

Indeed, that seemed to be messing up my search, thanks. So the real culprit is:

commit 9bdbbfc3a04c28dc43af5afffb32066623cb0022
Author: Paolo Bonzini
Date:   Fri Dec 6 17:54:25 2013 +0100

    pci: clean up resetting of IRQs

    pci_device_reset will deassert the INTX pins, and this will make the
    irq_count array all-zeroes. Check that this is the case, and remove
    the existing loop which might even unsync irq_count and irq_state.

That commit is what adds the assert. Looking at pci_device_reset, there is an
issue:

    dev->irq_state = 0;
    pci_update_irq_status(dev);
    pci_device_deassert_intx(dev);

irq_state is cleared before pci_device_deassert_intx is called.
pci_device_deassert_intx tries to clear all irqs via pci_irq_handler, but
pci_irq_handler exits without taking any action if the requested irq level
matches what we already track in irq_state. Since irq_state is already 0,
pci_device_deassert_intx is basically a no-op. Any interrupt with level=1 is
never cleared, so its irq_count entry stays nonzero, which is the case with
the usb tablet after usb_detach.

This fixes things for me, but I have no idea if it's the proper fix:

- Cole

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 8f722dd..1912dfb 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -189,9 +189,9 @@ static void pci_do_device_reset(PCIDevice *dev)
 {
     int r;
 
+    pci_device_deassert_intx(dev);
     dev->irq_state = 0;
     pci_update_irq_status(dev);
-    pci_device_deassert_intx(dev);
     /* Clear all writable bits */
     pci_word_test_and_clear_mask(dev->config + PCI_COMMAND,
                                  pci_get_word(dev->wmask + PCI_COMMAND) |
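
To make the ordering problem described above easier to see in isolation, here
is a toy model in plain C. It is only a sketch under stated assumptions, not
QEMU code: ToyBus, ToyDev and every toy_* function are hypothetical stand-ins,
and the handler keeps only the "return early if the level is unchanged"
behaviour described in the analysis. Everything else the real
pci_irq_handler may do is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NUM_PINS 4

/* Hypothetical stand-ins for the bus and device; not the real QEMU structs. */
typedef struct {
    int irq_count[TOY_NUM_PINS];  /* per-pin count of asserted lines */
} ToyBus;

typedef struct {
    ToyBus *bus;
    uint8_t irq_state;            /* bit n set => pin n is tracked as asserted */
} ToyDev;

/* Models the early exit: nothing happens if the level is already tracked. */
void toy_irq_handler(ToyDev *dev, int pin, int level)
{
    assert(pin >= 0 && pin < TOY_NUM_PINS);
    int current = (dev->irq_state >> pin) & 1;
    int change = level - current;
    if (change == 0) {
        return;
    }
    dev->irq_state = (uint8_t)((dev->irq_state & ~(1u << pin)) |
                               ((unsigned)level << pin));
    dev->bus->irq_count[pin] += change;
}

/* Lowers every pin through the handler. */
void toy_deassert_intx(ToyDev *dev)
{
    for (int pin = 0; pin < TOY_NUM_PINS; pin++) {
        toy_irq_handler(dev, pin, 0);
    }
}

/* Order before the patch: state is wiped first, so the deassert sees no change. */
void toy_reset_old(ToyDev *dev)
{
    dev->irq_state = 0;
    toy_deassert_intx(dev);
}

/* Order after the patch: deassert while irq_state still records the raised pin. */
void toy_reset_new(ToyDev *dev)
{
    toy_deassert_intx(dev);
    dev->irq_state = 0;
}

/* Raises one pin, resets the device, and returns what is left in irq_count. */
int toy_count_after_reset(bool patched, int pin)
{
    ToyBus bus = { {0} };
    ToyDev dev = { &bus, 0 };

    toy_irq_handler(&dev, pin, 1);
    if (patched) {
        toy_reset_new(&dev);
    } else {
        toy_reset_old(&dev);
    }
    return bus.irq_count[pin];
}
```

In this model, toy_count_after_reset(false, 3) leaves a count of 1 behind,
which corresponds to the state the pcibus_reset assertion rejects, while
toy_count_after_reset(true, 3) returns 0. Whether the reordering is the right
fix for the real code is still the open question from the mail.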