Message ID | 4F37F910.5030400@codemonkey.ws |
---|---|
State | New |
Headers | show |
On Sun, Feb 12, 2012 at 11:38:24AM -0600, Anthony Liguori wrote: > On 02/12/2012 11:31 AM, Michael S. Tsirkin wrote: > >On Sun, Feb 12, 2012 at 07:07:43PM +0200, Michael S. Tsirkin wrote: > >>I got this assert when working on qemu: pci hotplug > >>callback failed so qdev_free was called. > >> > >>(gdb) where > >>#0 0x00007ffff5fa1905 in raise () from /lib64/libc.so.6 > >>#1 0x00007ffff5fa30e5 in abort () from /lib64/libc.so.6 > >>#2 0x00007ffff7413a7f in g_assertion_message () from > >>/lib64/libglib-2.0.so.0 > >>#3 0x00007ffff7414020 in g_assertion_message_expr () from > >>/lib64/libglib-2.0.so.0 > >>#4 0x00007ffff7e452a9 in object_delete (obj=0x7ffff9124e60) at > >>qom/object.c:375 > >>#5 0x00007ffff7e2f5d4 in qdev_free (dev=0x7ffff9124e60) > >> at /home/mst/scm/qemu/hw/qdev.c:250 > >>#6 qdev_init (dev=0x7ffff9124e60) at /home/mst/scm/qemu/hw/qdev.c:149 > >>#7 0x00007ffff7e2a7fe in qdev_device_add (opts=0x7ffff8b0d3a0) > >> at /home/mst/scm/qemu/hw/qdev-monitor.c:473 > >>#8 0x00007ffff7e06da9 in device_init_func (opts=<value optimized out>, > >> opaque=<value optimized out>) at /home/mst/scm/qemu/vl.c:1754 > >>#9 0x00007ffff7e3737a in qemu_opts_foreach (list=<value optimized out>, > >>func= > >> 0x7ffff7e06d90<device_init_func>, opaque=0x0, > >> abort_on_failure=<value optimized out>) at qemu-option.c:1048 > >>#10 0x00007ffff7e09cdb in main (argc=<value optimized out>, argv=<value > >>optimized out>, > >> envp=<value optimized out>) at /home/mst/scm/qemu/vl.c:3407 > >>(gdb) frame 6 > >>#6 qdev_init (dev=0x7ffff9124e60) at /home/mst/scm/qemu/hw/qdev.c:149 > >>149 qdev_free(dev); > >> > >>The problems seems to be that > >>pci_qdev_init calls do_pci_unregister_device on > >>hotplug error which will free the device twice? > > > >Here's a reproducer to a similar error in property parsing: > > > >qemu-system-x86_64 -enable-kvm -m 1G -drive file=/home/mst/rhel6.qcow2 > >-netdev user,id=bar -net > >nic,netdev=bar,model=e1000,macaddr=52:54:00:12:34:57 -redir > >tcp:8022::22 -device virtio-net-pci,netdev=foo,mac=5854:00:12:34:56 > >-netdev > >tap,id=foo,ifname=msttap0,script=/home/mst/ifup,downscript=no,vhost=on > >-vnc :1 -monitor stdio > > Here's the fix. I need to do some regression testing and then I'll > post as a proper top-level patch. > > Thanks for the report. > > Regards, > > Anthony Liguori > > > > > > > > >>-- > >>MST > > >From b7fc6f1eb7c5e041eac7d610061a1be950707e5b Mon Sep 17 00:00:00 2001 > From: Anthony Liguori <aliguori@us.ibm.com> > Date: Sun, 12 Feb 2012 11:36:24 -0600 > Subject: [PATCH] device_add: don't add a /peripheral link until init is complete > > Otherwise we end up with a dangling reference which causes qdev_free() to fail. > > Reported-by: Michael Tsirkin <mst@redhat.com> > Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> This handles the option parsing but what about hotplug failures (when bus->hotplug returns an error)? > --- > hw/qdev-monitor.c | 18 ++++++++++-------- > 1 files changed, 10 insertions(+), 8 deletions(-) > > diff --git a/hw/qdev-monitor.c b/hw/qdev-monitor.c > index 49f13ca..a310cc7 100644 > --- a/hw/qdev-monitor.c > +++ b/hw/qdev-monitor.c > @@ -457,6 +457,16 @@ DeviceState *qdev_device_add(QemuOpts *opts) > id = qemu_opts_id(opts); > if (id) { > qdev->id = id; > + } > + if (qemu_opt_foreach(opts, set_property, qdev, 1) != 0) { > + qdev_free(qdev); > + return NULL; > + } > + if (qdev_init(qdev) < 0) { > + qerror_report(QERR_DEVICE_INIT_FAILED, driver); > + return NULL; > + } > + if (qdev->id) { > object_property_add_child(qdev_get_peripheral(), qdev->id, > OBJECT(qdev), NULL); > } else { > @@ -466,14 +476,6 @@ DeviceState *qdev_device_add(QemuOpts *opts) > OBJECT(qdev), NULL); > g_free(name); > } > - if (qemu_opt_foreach(opts, set_property, qdev, 1) != 0) { > - qdev_free(qdev); > - return NULL; > - } > - if (qdev_init(qdev) < 0) { > - qerror_report(QERR_DEVICE_INIT_FAILED, driver); > - return NULL; > - } > qdev->opts = opts; > return qdev; > } > -- > 1.7.4.1 >
On 02/12/2012 11:57 AM, Michael S. Tsirkin wrote: > On Sun, Feb 12, 2012 at 11:38:24AM -0600, Anthony Liguori wrote: >> From: Anthony Liguori<aliguori@us.ibm.com> >> Date: Sun, 12 Feb 2012 11:36:24 -0600 >> Subject: [PATCH] device_add: don't add a /peripheral link until init is complete >> >> Otherwise we end up with a dangling reference which causes qdev_free() to fail. >> >> Reported-by: Michael Tsirkin<mst@redhat.com> >> Signed-off-by: Anthony Liguori<aliguori@us.ibm.com> > > This handles the option parsing but what about hotplug > failures (when bus->hotplug returns an error)? Sorry, I don't follow. The assert you reported was that object_free() noted a reference count of !0 which indicates something else was holding the reference to the object. In this case, it was the child link in /peripheral. By delaying creating the link in /peripheral, we eliminate the problem completely. BTW, the explicit calls to do_pci_unregister are redundant. finalize() will be called during cleanup which means exit() will be invoked (which already calls do_pci_unregister). I'm not sure why this isn't failing more aggressively but it looks clearly wrong to me. Regards, Anthony Liguori
On Sun, Feb 12, 2012 at 02:04:29PM -0600, Anthony Liguori wrote: > On 02/12/2012 11:57 AM, Michael S. Tsirkin wrote: > >On Sun, Feb 12, 2012 at 11:38:24AM -0600, Anthony Liguori wrote: > >>From: Anthony Liguori<aliguori@us.ibm.com> > >>Date: Sun, 12 Feb 2012 11:36:24 -0600 > >>Subject: [PATCH] device_add: don't add a /peripheral link until init is complete > >> > >>Otherwise we end up with a dangling reference which causes qdev_free() to fail. > >> > >>Reported-by: Michael Tsirkin<mst@redhat.com> > >>Signed-off-by: Anthony Liguori<aliguori@us.ibm.com> > > > >This handles the option parsing but what about hotplug > >failures (when bus->hotplug returns an error)? > > Sorry, I don't follow. > > The assert you reported was that object_free() noted a reference > count of !0 which indicates something else was holding the reference > to the object. In this case, it was the child link in /peripheral. > > By delaying creating the link in /peripheral, we eliminate the problem completely. Th other problem was internal in pci which calls ->hostplug during initialization. This doesn't seem affected? But I didn't try, maybe I misundertand. > BTW, the explicit calls to do_pci_unregister are redundant. > finalize() will be called during cleanup which means exit() will be > invoked (which already calls do_pci_unregister). I'm not sure why > this isn't failing more aggressively but it looks clearly wrong to > me. > > Regards, > > Anthony Liguori Me too. Want to try to drop them?
On 02/12/2012 02:15 PM, Michael S. Tsirkin wrote: > On Sun, Feb 12, 2012 at 02:04:29PM -0600, Anthony Liguori wrote: >> On 02/12/2012 11:57 AM, Michael S. Tsirkin wrote: >>> On Sun, Feb 12, 2012 at 11:38:24AM -0600, Anthony Liguori wrote: >>>> From: Anthony Liguori<aliguori@us.ibm.com> >>>> Date: Sun, 12 Feb 2012 11:36:24 -0600 >>>> Subject: [PATCH] device_add: don't add a /peripheral link until init is complete >>>> >>>> Otherwise we end up with a dangling reference which causes qdev_free() to fail. >>>> >>>> Reported-by: Michael Tsirkin<mst@redhat.com> >>>> Signed-off-by: Anthony Liguori<aliguori@us.ibm.com> >>> >>> This handles the option parsing but what about hotplug >>> failures (when bus->hotplug returns an error)? >> >> Sorry, I don't follow. >> >> The assert you reported was that object_free() noted a reference >> count of !0 which indicates something else was holding the reference >> to the object. In this case, it was the child link in /peripheral. >> >> By delaying creating the link in /peripheral, we eliminate the problem completely. > > Th other problem was internal in pci which calls ->hostplug > during initialization. This doesn't seem affected? > But I didn't try, maybe I misundertand. Yeah, from qdev's perspective it's all just init failing. hotplug is entirely a PCI concept. > >> BTW, the explicit calls to do_pci_unregister are redundant. >> finalize() will be called during cleanup which means exit() will be >> invoked (which already calls do_pci_unregister). I'm not sure why >> this isn't failing more aggressively but it looks clearly wrong to >> me. >> >> Regards, >> >> Anthony Liguori > > Me too. Want to try to drop them? Yeah, I'll make this a two patch series. Regards, Anthony Liguori >
On Sun, Feb 12, 2012 at 02:19:19PM -0600, Anthony Liguori wrote: > On 02/12/2012 02:15 PM, Michael S. Tsirkin wrote: > >On Sun, Feb 12, 2012 at 02:04:29PM -0600, Anthony Liguori wrote: > >>On 02/12/2012 11:57 AM, Michael S. Tsirkin wrote: > >>>On Sun, Feb 12, 2012 at 11:38:24AM -0600, Anthony Liguori wrote: > >>>>From: Anthony Liguori<aliguori@us.ibm.com> > >>>>Date: Sun, 12 Feb 2012 11:36:24 -0600 > >>>>Subject: [PATCH] device_add: don't add a /peripheral link until init is complete > >>>> > >>>>Otherwise we end up with a dangling reference which causes qdev_free() to fail. > >>>> > >>>>Reported-by: Michael Tsirkin<mst@redhat.com> > >>>>Signed-off-by: Anthony Liguori<aliguori@us.ibm.com> > >>> > >>>This handles the option parsing but what about hotplug > >>>failures (when bus->hotplug returns an error)? > >> > >>Sorry, I don't follow. > >> > >>The assert you reported was that object_free() noted a reference > >>count of !0 which indicates something else was holding the reference > >>to the object. In this case, it was the child link in /peripheral. > >> > >>By delaying creating the link in /peripheral, we eliminate the problem completely. > > > >Th other problem was internal in pci which calls ->hostplug > >during initialization. This doesn't seem affected? > >But I didn't try, maybe I misundertand. > > Yeah, from qdev's perspective it's all just init failing. hotplug > is entirely a PCI concept. > > > > >>BTW, the explicit calls to do_pci_unregister are redundant. > >>finalize() will be called during cleanup which means exit() will be > >>invoked (which already calls do_pci_unregister). I'm not sure why > >>this isn't failing more aggressively but it looks clearly wrong to > >>me. > >> > >>Regards, > >> > >>Anthony Liguori > > > >Me too. Want to try to drop them? > > Yeah, I'll make this a two patch series. > > Regards, > > Anthony Liguori I also see this: device_add virtio-net-pci,netdev=foo,mac=52:54:00:12:34:56,id=bla device_del bla *** glibc detected *** /home/mst/qemu-test/bin/qemu-system-x86_64: corrupted double-linked list: 0x00007fae434565a0 *** Am I alone?
On Mon, Feb 13, 2012 at 02:17:35AM +0200, Michael S. Tsirkin wrote: > On Sun, Feb 12, 2012 at 02:19:19PM -0600, Anthony Liguori wrote: > > On 02/12/2012 02:15 PM, Michael S. Tsirkin wrote: > > >On Sun, Feb 12, 2012 at 02:04:29PM -0600, Anthony Liguori wrote: > > >>On 02/12/2012 11:57 AM, Michael S. Tsirkin wrote: > > >>>On Sun, Feb 12, 2012 at 11:38:24AM -0600, Anthony Liguori wrote: > > >>>>From: Anthony Liguori<aliguori@us.ibm.com> > > >>>>Date: Sun, 12 Feb 2012 11:36:24 -0600 > > >>>>Subject: [PATCH] device_add: don't add a /peripheral link until init is complete > > >>>> > > >>>>Otherwise we end up with a dangling reference which causes qdev_free() to fail. > > >>>> > > >>>>Reported-by: Michael Tsirkin<mst@redhat.com> > > >>>>Signed-off-by: Anthony Liguori<aliguori@us.ibm.com> > > >>> > > >>>This handles the option parsing but what about hotplug > > >>>failures (when bus->hotplug returns an error)? > > >> > > >>Sorry, I don't follow. > > >> > > >>The assert you reported was that object_free() noted a reference > > >>count of !0 which indicates something else was holding the reference > > >>to the object. In this case, it was the child link in /peripheral. > > >> > > >>By delaying creating the link in /peripheral, we eliminate the problem completely. > > > > > >Th other problem was internal in pci which calls ->hostplug > > >during initialization. This doesn't seem affected? > > >But I didn't try, maybe I misundertand. > > > > Yeah, from qdev's perspective it's all just init failing. hotplug > > is entirely a PCI concept. > > > > > > > >>BTW, the explicit calls to do_pci_unregister are redundant. > > >>finalize() will be called during cleanup which means exit() will be > > >>invoked (which already calls do_pci_unregister). I'm not sure why > > >>this isn't failing more aggressively but it looks clearly wrong to > > >>me. > > >> > > >>Regards, > > >> > > >>Anthony Liguori > > > > > >Me too. Want to try to drop them? > > > > Yeah, I'll make this a two patch series. > > > > Regards, > > > > Anthony Liguori > > > I also see this: > > device_add virtio-net-pci,netdev=foo,mac=52:54:00:12:34:56,id=bla > device_del bla > *** glibc detected *** /home/mst/qemu-test/bin/qemu-system-x86_64: > corrupted double-linked list: 0x00007fae434565a0 *** > > Am I alone? > > Tried some tracing: I set breakpoint at g_free and gave the device_del command. (gdb) b g_free Breakpoint 3 at 0x7ffff73f55b0 (gdb) c Continuing. Program received signal SIGTSTP, Stopped (user). [Switching to Thread 0x7fffaf1a3700 (LWP 22749)] 0x00007ffff6ae074b in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 (gdb) c Continuing. Program received signal SIGTSTP, Stopped (user). [Switching to Thread 0x7ffff4b28700 (LWP 22727)] 0x00007ffff6ae03cc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 (gdb) continue Continuing. Program received signal SIGTSTP, Stopped (user). [Switching to Thread 0x7ffff7cb3700 (LWP 22712)] 0x00007ffff604c383 in select () from /lib64/libc.so.6 (gdb) continue Continuing. [Thread 0x7fffaf1a3700 (LWP 22749) exited] Breakpoint 3, 0x00007ffff73f55b0 in g_free () from /lib64/libglib-2.0.so.0 (gdb) continue Continuing. (qemu) device_del bla Breakpoint 3, 0x00007ffff73f55b0 in g_free () from /lib64/libglib-2.0.so.0 (gdb) where #0 0x00007ffff73f55b0 in g_free () from /lib64/libglib-2.0.so.0 #1 0x00007ffff7ec1d60 in monitor_parse_command (mon=0x7ffff8b207a0, cmdline=<value optimized out>, qdict=0x7ffff9161660) at /home/mst/scm/qemu/monitor.c:3724 #2 0x00007ffff7ec2684 in handle_user_command (mon=0x7ffff8b207a0, cmdline= 0x7ffff8c86a30 "device_del bla") at /home/mst/scm/qemu/monitor.c:3803 #3 0x00007ffff7ec2b6e in monitor_command_cb (mon=0x7ffff8b207a0, cmdline=<value optimized out>, opaque=<value optimized out>) at /home/mst/scm/qemu/monitor.c:4436 #4 0x00007ffff7e463ed in readline_handle_byte (rs=0x7ffff8c86a30, ch=<value optimized out>) at readline.c:370 #5 0x00007ffff7ec28b8 in monitor_read (opaque=<value optimized out>, buf= 0x7fffffffce20 "\r\023\f\366\377\177", size=1) at /home/mst/scm/qemu/monitor.c:4422 #6 0x00007ffff7e313bb in qemu_chr_be_write (opaque=0x7ffff8b0ca00) at qemu-char.c:163 #7 fd_chr_read (opaque=0x7ffff8b0ca00) at qemu-char.c:587 #8 0x00007ffff7d7c967 in qemu_iohandler_poll (readfds=0x7fffffffdfb0, writefds= 0x7fffffffdf30, xfds=<value optimized out>, ret=<value optimized out>) at iohandler.c:121 #9 0x00007ffff7e1085f in main_loop_wait (nonblocking=<value optimized out>) at main-loop.c:464 #10 0x00007ffff7e09284 in main_loop (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>) at /home/mst/scm/qemu/vl.c:1482 #11 main (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>) at /home/mst/scm/qemu/vl.c:3525 (gdb) frame 1 #1 0x00007ffff7ec1d60 in monitor_parse_command (mon=0x7ffff8b207a0, cmdline=<value optimized out>, qdict=0x7ffff9161660) at /home/mst/scm/qemu/monitor.c:3724 3724 g_free(key); (gdb) p key $1 = 0x7ffff9166820 "id" (gdb) continue Continuing. Breakpoint 3, 0x00007ffff73f55b0 in g_free () from /lib64/libglib-2.0.so.0 (gdb) where #0 0x00007ffff73f55b0 in g_free () from /lib64/libglib-2.0.so.0 #1 0x00007ffff7e42fd4 in object_property_del (obj=0x7ffff8ca79d0, name=<value optimized out>, errp=<value optimized out>) at qom/object.c:629 #2 0x00007ffff7e4308d in object_property_del_child (obj=0x7ffff9123e50) at qom/object.c:310 #3 object_unparent (obj=0x7ffff9123e50) at qom/object.c:318 #4 0x00007ffff7de0a0d in pci_unplug_device (qdev=<value optimized out>) at /home/mst/scm/qemu/hw/pci.c:1521 #5 0x00007ffff7ec26b5 in handle_user_command (mon=0x7ffff8b207a0, cmdline=<value optimized out>) at /home/mst/scm/qemu/monitor.c:3813 #6 0x00007ffff7ec2b6e in monitor_command_cb (mon=0x7ffff8b207a0, cmdline=<value optimized out>, opaque=<value optimized out>) at /home/mst/scm/qemu/monitor.c:4436 #7 0x00007ffff7e463ed in readline_handle_byte (rs=0x7ffff8c86a30, ch=<value optimized out>) at readline.c:370 #8 0x00007ffff7ec28b8 in monitor_read (opaque=<value optimized out>, buf= 0x7fffffffce20 "\r\023\f\366\377\177", size=1) at /home/mst/scm/qemu/monitor.c:4422 #9 0x00007ffff7e313bb in qemu_chr_be_write (opaque=0x7ffff8b0ca00) at qemu-char.c:163 #10 fd_chr_read (opaque=0x7ffff8b0ca00) at qemu-char.c:587 #11 0x00007ffff7d7c967 in qemu_iohandler_poll (readfds=0x7fffffffdfb0, writefds= 0x7fffffffdf30, xfds=<value optimized out>, ret=<value optimized out>) at iohandler.c:121 #12 0x00007ffff7e1085f in main_loop_wait (nonblocking=<value optimized out>) at main-loop.c:464 #13 0x00007ffff7e09284 in main_loop (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>) at /home/mst/scm/qemu/vl.c:1482 #14 main (argc=<value optimized out>, argv=<value optimized out>, ---Type <return> to continue, or q <return> to quit--- envp=<value optimized out>) at /home/mst/scm/qemu/vl.c:3525 (gdb) frame 1 #1 0x00007ffff7e42fd4 in object_property_del (obj=0x7ffff8ca79d0, name=<value optimized out>, errp=<value optimized out>) at qom/object.c:629 629 g_free(prop->name); (gdb) p prop->name $2 = (gchar *) 0x7ffff9164510 "bla" (gdb) continue Continuing. Breakpoint 3, 0x00007ffff73f55b0 in g_free () from /lib64/libglib-2.0.so.0 (gdb) frame 1 #1 0x00007ffff7e42fdd in object_property_del (obj=0x7ffff8ca79d0, name=<value optimized out>, errp=<value optimized out>) at qom/object.c:630 630 g_free(prop->type); (gdb) p prop->type $3 = (gchar *) 0x7ffff9164580 "child<virtio-net-pci>" (gdb) continue Continuing. Breakpoint 3, 0x00007ffff73f55b0 in g_free () from /lib64/libglib-2.0.so.0 (gdb) frame 1 #1 0x00007ffff7e4308d in object_property_del_child (obj=0x7ffff9123e50) at qom/object.c:310 310 object_property_del(obj, prop->name, errp); (gdb) p prop->name $4 = (gchar *) 0x7ffff9164510 "\020h\026\371\377\177" >>>>>>>>>>>>>>>>>>>>>>>>>>>> It seems clear that there is at least a use after free here. >>>>>>>>>>>>>>>>>>>>>>>>>>>> This is not an immediate source of the crash, however.
From b7fc6f1eb7c5e041eac7d610061a1be950707e5b Mon Sep 17 00:00:00 2001 From: Anthony Liguori <aliguori@us.ibm.com> Date: Sun, 12 Feb 2012 11:36:24 -0600 Subject: [PATCH] device_add: don't add a /peripheral link until init is complete Otherwise we end up with a dangling reference which causes qdev_free() to fail. Reported-by: Michael Tsirkin <mst@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> --- hw/qdev-monitor.c | 18 ++++++++++-------- 1 files changed, 10 insertions(+), 8 deletions(-) diff --git a/hw/qdev-monitor.c b/hw/qdev-monitor.c index 49f13ca..a310cc7 100644 --- a/hw/qdev-monitor.c +++ b/hw/qdev-monitor.c @@ -457,6 +457,16 @@ DeviceState *qdev_device_add(QemuOpts *opts) id = qemu_opts_id(opts); if (id) { qdev->id = id; + } + if (qemu_opt_foreach(opts, set_property, qdev, 1) != 0) { + qdev_free(qdev); + return NULL; + } + if (qdev_init(qdev) < 0) { + qerror_report(QERR_DEVICE_INIT_FAILED, driver); + return NULL; + } + if (qdev->id) { object_property_add_child(qdev_get_peripheral(), qdev->id, OBJECT(qdev), NULL); } else { @@ -466,14 +476,6 @@ DeviceState *qdev_device_add(QemuOpts *opts) OBJECT(qdev), NULL); g_free(name); } - if (qemu_opt_foreach(opts, set_property, qdev, 1) != 0) { - qdev_free(qdev); - return NULL; - } - if (qdev_init(qdev) < 0) { - qerror_report(QERR_DEVICE_INIT_FAILED, driver); - return NULL; - } qdev->opts = opts; return qdev; } -- 1.7.4.1