Message ID | 1423572769-4238-1-git-send-email-pbonzini@redhat.com |
---|---|
State | New |
Headers | show |
On Tue, 2015-02-10 at 13:52 +0100, Paolo Bonzini wrote: > address_space_destroy_dispatch is called from an RCU callback and hence > outside the iothread mutex (BQL). However, after address_space_destroy > no new accesses can hit the destroyed AddressSpace so it is not necessary > to observe changes to the memory map. Move the memory_listener_unregister > call earlier, to make it thread-safe again. > > Reported-by: Alex Williamson <alex.williamson@redhat.com> > Fixes: 374f2981d1f10bc4307f250f24b2a7ddb9b14be0 > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> > --- > exec.c | 6 +++++- > include/exec/memory-internal.h | 1 + > memory.c | 1 + > 3 files changed, 7 insertions(+), 1 deletion(-) Seems to fix it, Thanks! Tested-by: Alex Williamson <alex.williamson@redhat.com> > diff --git a/exec.c b/exec.c > index 6b79ad1..6dff7bc 100644 > --- a/exec.c > +++ b/exec.c > @@ -2059,11 +2059,15 @@ void address_space_init_dispatch(AddressSpace *as) > memory_listener_register(&as->dispatch_listener, as); > } > > +void address_space_unregister(AddressSpace *as) > +{ > + memory_listener_unregister(&as->dispatch_listener); > +} > + > void address_space_destroy_dispatch(AddressSpace *as) > { > AddressSpaceDispatch *d = as->dispatch; > > - memory_listener_unregister(&as->dispatch_listener); > g_free(d); > as->dispatch = NULL; > } > diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h > index 25c43c0..fb467ac 100644 > --- a/include/exec/memory-internal.h > +++ b/include/exec/memory-internal.h > @@ -23,6 +23,7 @@ > typedef struct AddressSpaceDispatch AddressSpaceDispatch; > > void address_space_init_dispatch(AddressSpace *as); > +void address_space_unregister(AddressSpace *as); > void address_space_destroy_dispatch(AddressSpace *as); > > extern const MemoryRegionOps unassigned_mem_ops; > diff --git a/memory.c b/memory.c > index 9b91243..130152c 100644 > --- a/memory.c > +++ b/memory.c > @@ -1978,6 +1978,7 @@ void address_space_destroy(AddressSpace *as) > as->root = NULL; > memory_region_transaction_commit(); > QTAILQ_REMOVE(&address_spaces, as, address_spaces_link); > + address_space_unregister(as); > > /* At this point, as->dispatch and as->current_map are dummy > * entries that the guest should never use. Wait for the old
On 10/02/2015 18:11, Alex Williamson wrote: > > Reported-by: Alex Williamson <alex.williamson@redhat.com> > > Fixes: 374f2981d1f10bc4307f250f24b2a7ddb9b14be0 > > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> > > --- > > exec.c | 6 +++++- > > include/exec/memory-internal.h | 1 + > > memory.c | 1 + > > 3 files changed, 7 insertions(+), 1 deletion(-) > > Seems to fix it, Thanks! > > Tested-by: Alex Williamson <alex.williamson@redhat.com> Would you just include it in your next pull request? Mine will wait for a week or two, and it's nice to keep history more bisectable. Paolo
On Tue, 2015-02-10 at 18:13 +0100, Paolo Bonzini wrote: > > On 10/02/2015 18:11, Alex Williamson wrote: > > > Reported-by: Alex Williamson <alex.williamson@redhat.com> > > > Fixes: 374f2981d1f10bc4307f250f24b2a7ddb9b14be0 > > > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> > > > --- > > > exec.c | 6 +++++- > > > include/exec/memory-internal.h | 1 + > > > memory.c | 1 + > > > 3 files changed, 7 insertions(+), 1 deletion(-) > > > > Seems to fix it, Thanks! > > > > Tested-by: Alex Williamson <alex.williamson@redhat.com> > > Would you just include it in your next pull request? Mine will wait for > a week or two, and it's nice to keep history more bisectable. Sure, good idea. Thanks, Alex
Quoting Paolo Bonzini (2015-02-10 06:52:49) > address_space_destroy_dispatch is called from an RCU callback and hence > outside the iothread mutex (BQL). However, after address_space_destroy > no new accesses can hit the destroyed AddressSpace so it is not necessary > to observe changes to the memory map. Move the memory_listener_unregister > call earlier, to make it thread-safe again. > > Reported-by: Alex Williamson <alex.williamson@redhat.com> > Fixes: 374f2981d1f10bc4307f250f24b2a7ddb9b14be0 > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Prior to this patch I was seeing segfaults in various parts of memory listener register/unregister path running a workload that rapidly hot plugs/unplugs a sizeable number of devices, which seems to be addressed with this patch applied. But now I'm seeing a less frequent segfault in the RCU thread when running the same workload: Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x3fffb689ec20 (LWP 26230)] call_rcu_thread (opaque=<optimized out>) at /home/mdroth/w/qemu.git/util/rcu.c:250 250 node->func(node); (gdb) bt #0 call_rcu_thread (opaque=<optimized out>) at /home/mdroth/w/qemu.git/util/rcu.c:250 #1 0x00003fffb787c29c in .start_thread () from /lib64/libpthread.so.0 #2 0x00003fffb779cd30 in .__clone () from /lib64/libc.so.6 (gdb) ptype node type = struct rcu_head { struct rcu_head *next; RCUCBFunc *func; } * (gdb) print node $1 = (struct rcu_head *) 0x11189a68 (gdb) print node->func $2 = (RCUCBFunc *) 0x0 (gdb) print node->next $3 = (struct rcu_head *) 0x3fff9800d4f0 I've seen it on both x86 and pseries (with spapr hotplug patches applied), and have only seen it occur at this spot. AFAICT node->func is only set via 1 of: call_rcu(old_view, flatview_unref, rcu); call_rcu(as, do_address_space_destroy, rcu); so it shouldn't ever be NULL... and there's a wmb after node->func is set, prior to the node being made available to the RCU thread via enqueue(), so that doesn't seem to be the issue. I think the node in this case is a FlatView*, if that helps narrow it down: (gdb) print ((AddressSpace *)(0x3fff9800d4f0))->name $5 = 0x100000000 <Address 0x100000000 out of bounds> (gdb) print ((FlatView *)(0x3fff9800d4f0))->ref $6 = 1 (gdb) print ((FlatView *)(0x3fff9800d4f0))->nr $7 = 34 (gdb) print ((FlatView *)(0x3fff9800d4f0))->nr_allocated $8 = 40 (gdb) The workload is basically this, run in a tight loop: device_add virtio-net-pci,id=0 sleep .5 ... device_add virtio-net-pci,id=14 sleep .5 sleep 3 device_del 0 ... device_del 14 Let me know if there's anything else I can do to narrow it down further. > --- > exec.c | 6 +++++- > include/exec/memory-internal.h | 1 + > memory.c | 1 + > 3 files changed, 7 insertions(+), 1 deletion(-) > > diff --git a/exec.c b/exec.c > index 6b79ad1..6dff7bc 100644 > --- a/exec.c > +++ b/exec.c > @@ -2059,11 +2059,15 @@ void address_space_init_dispatch(AddressSpace *as) > memory_listener_register(&as->dispatch_listener, as); > } > > +void address_space_unregister(AddressSpace *as) > +{ > + memory_listener_unregister(&as->dispatch_listener); > +} > + > void address_space_destroy_dispatch(AddressSpace *as) > { > AddressSpaceDispatch *d = as->dispatch; > > - memory_listener_unregister(&as->dispatch_listener); > g_free(d); > as->dispatch = NULL; > } > diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h > index 25c43c0..fb467ac 100644 > --- a/include/exec/memory-internal.h > +++ b/include/exec/memory-internal.h > @@ -23,6 +23,7 @@ > typedef struct AddressSpaceDispatch AddressSpaceDispatch; > > void address_space_init_dispatch(AddressSpace *as); > +void address_space_unregister(AddressSpace *as); > void address_space_destroy_dispatch(AddressSpace *as); > > extern const MemoryRegionOps unassigned_mem_ops; > diff --git a/memory.c b/memory.c > index 9b91243..130152c 100644 > --- a/memory.c > +++ b/memory.c > @@ -1978,6 +1978,7 @@ void address_space_destroy(AddressSpace *as) > as->root = NULL; > memory_region_transaction_commit(); > QTAILQ_REMOVE(&address_spaces, as, address_spaces_link); > + address_space_unregister(as); > > /* At this point, as->dispatch and as->current_map are dummy > * entries that the guest should never use. Wait for the old > -- > 1.8.3.1
On 11/02/2015 06:13, Michael Roth wrote: > (gdb) print node > $1 = (struct rcu_head *) 0x11189a68 > (gdb) print node->func > $2 = (RCUCBFunc *) 0x0 > (gdb) print node->next > $3 = (struct rcu_head *) 0x3fff9800d4f0 > > I've seen it on both x86 and pseries (with spapr hotplug patches applied), and > have only seen it occur at this spot. > > AFAICT node->func is only set via 1 of: > > call_rcu(old_view, flatview_unref, rcu); > call_rcu(as, do_address_space_destroy, rcu); > > so it shouldn't ever be NULL... and there's a wmb after node->func is set, > prior to the node being made available to the RCU thread via enqueue(), so > that doesn't seem to be the issue. > > I think the node in this case is a FlatView*, if that helps narrow it down: > > (gdb) print ((AddressSpace *)(0x3fff9800d4f0))->name > $5 = 0x100000000 <Address 0x100000000 out of bounds> This is node->next, not node. The weird address looks almost like node == &dummy. I'll try to reproduce. Paolo
Quoting Paolo Bonzini (2015-02-11 01:30:00) > On 11/02/2015 06:13, Michael Roth wrote: > > (gdb) print node > > $1 = (struct rcu_head *) 0x11189a68 > > (gdb) print node->func > > $2 = (RCUCBFunc *) 0x0 > > (gdb) print node->next > > $3 = (struct rcu_head *) 0x3fff9800d4f0 > > > > I've seen it on both x86 and pseries (with spapr hotplug patches applied), and > > have only seen it occur at this spot. > > > > AFAICT node->func is only set via 1 of: > > > > call_rcu(old_view, flatview_unref, rcu); > > call_rcu(as, do_address_space_destroy, rcu); > > > > so it shouldn't ever be NULL... and there's a wmb after node->func is set, > > prior to the node being made available to the RCU thread via enqueue(), so > > that doesn't seem to be the issue. > > > > I think the node in this case is a FlatView*, if that helps narrow it down: > > > > (gdb) print ((AddressSpace *)(0x3fff9800d4f0))->name > > $5 = 0x100000000 <Address 0x100000000 out of bounds> > > This is node->next, not node. The weird address looks almost like node > == &dummy. I'll try to reproduce. Doh, sorry, not sure why I started looking at that address. node looks to be an AddressSpace* by way of do_pci_register_device(): (gdb) print node $21 = (struct rcu_head *) 0x11189a68 (gdb) print ((PCIDevice *)(0x11189860))->name $22 = "virtio-net-pci", '\000' <repeats 49 times> (gdb) print ((AddressSpace *)(node))->root->name $13 = 0x1117a410 "bus master" (gdb) print ((PCIDevice *)(0x11189860))->devfn $23 = 64 Is this state unexpected? (gdb) print ((DeviceState *)(0x11189860))->realized $24 = true (gdb) print ((Object *)(0x11189860))->ref $25 = 4 (gdb) Since the AddressSpace is a fields of PCIDevice, do we maybe need to make sure it's refcount doesn't drop to 0 prior to the RCU callback taking place? > > Paolo
On 11/02/2015 14:16, Michael Roth wrote: > Since the AddressSpace is a fields of PCIDevice, do we maybe need to make > sure it's refcount doesn't drop to 0 prior to the RCU callback taking > place? Yes, but then the problem is that objects are finalized outside the big QEMU lock. This is a pretty large can of worms, because finalizing an object can in turn finalize other objects, which means a lot of mutexes have to be introduced. I'll think a bit more about it. Paolo
diff --git a/exec.c b/exec.c index 6b79ad1..6dff7bc 100644 --- a/exec.c +++ b/exec.c @@ -2059,11 +2059,15 @@ void address_space_init_dispatch(AddressSpace *as) memory_listener_register(&as->dispatch_listener, as); } +void address_space_unregister(AddressSpace *as) +{ + memory_listener_unregister(&as->dispatch_listener); +} + void address_space_destroy_dispatch(AddressSpace *as) { AddressSpaceDispatch *d = as->dispatch; - memory_listener_unregister(&as->dispatch_listener); g_free(d); as->dispatch = NULL; } diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h index 25c43c0..fb467ac 100644 --- a/include/exec/memory-internal.h +++ b/include/exec/memory-internal.h @@ -23,6 +23,7 @@ typedef struct AddressSpaceDispatch AddressSpaceDispatch; void address_space_init_dispatch(AddressSpace *as); +void address_space_unregister(AddressSpace *as); void address_space_destroy_dispatch(AddressSpace *as); extern const MemoryRegionOps unassigned_mem_ops; diff --git a/memory.c b/memory.c index 9b91243..130152c 100644 --- a/memory.c +++ b/memory.c @@ -1978,6 +1978,7 @@ void address_space_destroy(AddressSpace *as) as->root = NULL; memory_region_transaction_commit(); QTAILQ_REMOVE(&address_spaces, as, address_spaces_link); + address_space_unregister(as); /* At this point, as->dispatch and as->current_map are dummy * entries that the guest should never use. Wait for the old
address_space_destroy_dispatch is called from an RCU callback and hence outside the iothread mutex (BQL). However, after address_space_destroy no new accesses can hit the destroyed AddressSpace so it is not necessary to observe changes to the memory map. Move the memory_listener_unregister call earlier, to make it thread-safe again. Reported-by: Alex Williamson <alex.williamson@redhat.com> Fixes: 374f2981d1f10bc4307f250f24b2a7ddb9b14be0 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- exec.c | 6 +++++- include/exec/memory-internal.h | 1 + memory.c | 1 + 3 files changed, 7 insertions(+), 1 deletion(-)