Message ID | 20180705181148.26871-1-clg@kaod.org |
---|---|
State | New |
Headers | show |
Series | pci: remove pci_del_option_rom() | expand |
On Thu, Jul 05, 2018 at 08:11:48PM +0200, Cédric Le Goater wrote: > PCI devices needing a ROM allocate an optional MemoryRegion with > pci_add_option_rom(). pci_del_option_rom() does the cleanup when the > device is destroyed. The only action taken by this routine is to call > vmstate_unregister_ram() which clears the id string of the optional > ROM RAMBlock and now, also flags the RAMBlock as non-migratable. This > was recently added by commit b895de502717 ("migration: discard > non-migratable RAMBlocks"), . > > VFIO devices do their own allocation of the PCI ROM region. It is > initialized in vfio_pci_size_rom() in which the PCI attribute > 'has_rom' is set to true but the RAMBlock of the ROM region is not > allocated. When the associated PCI device is deleted, > pci_del_option_rom() calls vmstate_unregister_ram() which tries to > flag a NULL RAMBlock because 'has_rom' is set, leading to a SEGV . > > The use of vmstate_unregister_ram() in the PCI device was added in > commit b0e56e0b63f350691b52d3e75e89bb64143fbeff ("unset RAMBlock idstr > when unregister MemoryRegion") I don't see it in that commit. I think it was part of the original split by Avi. > and from the archive in > http://lists.gnu.org/archive/html/qemu-devel/2014-04/msg00282.html, it > seems that it was trying to fix a reference count issue. > > vmstate_unregister_ram() being a work around, let's remove it to fix > the current SEGV issue > and let's try to find a fix for the initial ref > count issue if we can reproduce. > > Signed-off-by: Cédric Le Goater <clg@kaod.org> What kind of testing did you do on this patch? Could you include that info in the commit log pls? I think you need to at least add/remove some devices, then migrate. > --- > hw/pci/pci.c | 11 ----------- > 1 file changed, 11 deletions(-) > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c > index 80bc45930dee..78bf74e19f22 100644 > --- a/hw/pci/pci.c > +++ b/hw/pci/pci.c > @@ -191,7 +191,6 @@ static PCIBus *pci_find_bus_nr(PCIBus *bus, int bus_num); > static void pci_update_mappings(PCIDevice *d); > static void pci_irq_handler(void *opaque, int irq_num, int level); > static void pci_add_option_rom(PCIDevice *pdev, bool is_default_rom, Error **); > -static void pci_del_option_rom(PCIDevice *pdev); > > static uint16_t pci_default_sub_vendor_id = PCI_SUBVENDOR_ID_REDHAT_QUMRANET; > static uint16_t pci_default_sub_device_id = PCI_SUBDEVICE_ID_QEMU; > @@ -1096,7 +1095,6 @@ static void pci_qdev_unrealize(DeviceState *dev, Error **errp) > PCIDeviceClass *pc = PCI_DEVICE_GET_CLASS(pci_dev); > > pci_unregister_io_regions(pci_dev); > - pci_del_option_rom(pci_dev); > > if (pc->exit) { > pc->exit(pci_dev); > @@ -2262,15 +2260,6 @@ static void pci_add_option_rom(PCIDevice *pdev, bool is_default_rom, > pci_register_bar(pdev, PCI_ROM_SLOT, 0, &pdev->rom); > } > > -static void pci_del_option_rom(PCIDevice *pdev) > -{ > - if (!pdev->has_rom) > - return; > - > - vmstate_unregister_ram(&pdev->rom, &pdev->qdev); > - pdev->has_rom = false; > -} > - > /* > * On success, pci_add_capability() returns a positive value > * that the offset of the pci capability. > -- > 2.13.6
On Thu, 5 Jul 2018 20:11:48 +0200 Cédric Le Goater <clg@kaod.org> wrote: > PCI devices needing a ROM allocate an optional MemoryRegion with > pci_add_option_rom(). pci_del_option_rom() does the cleanup when the > device is destroyed. The only action taken by this routine is to call > vmstate_unregister_ram() which clears the id string of the optional > ROM RAMBlock and now, also flags the RAMBlock as non-migratable. This > was recently added by commit b895de502717 ("migration: discard > non-migratable RAMBlocks"), . > > VFIO devices do their own allocation of the PCI ROM region. It is > initialized in vfio_pci_size_rom() in which the PCI attribute > 'has_rom' is set to true but the RAMBlock of the ROM region is not > allocated. When the associated PCI device is deleted, > pci_del_option_rom() calls vmstate_unregister_ram() which tries to > flag a NULL RAMBlock because 'has_rom' is set, leading to a SEGV . > > The use of vmstate_unregister_ram() in the PCI device was added in > commit b0e56e0b63f350691b52d3e75e89bb64143fbeff ("unset RAMBlock idstr > when unregister MemoryRegion") and from the archive in > http://lists.gnu.org/archive/html/qemu-devel/2014-04/msg00282.html, it > seems that it was trying to fix a reference count issue. > > vmstate_unregister_ram() being a work around, let's remove it to fix > the current SEGV issue and let's try to find a fix for the initial ref > count issue if we can reproduce. > > Signed-off-by: Cédric Le Goater <clg@kaod.org> > --- > hw/pci/pci.c | 11 ----------- > 1 file changed, 11 deletions(-) Looking back through git history, I think vfio sets PCIDevice.has_rom because we needed that to have memory_region_destroy() called, but that changed with Paolo's: 469b046ead06 ("memory: remove memory_region_destroy") Now the MemoryRegion gets freed automagically, so maybe the better option is that vfio-pci should not set has_rom to keep it out of this path. I don't see that has_rom serves any other purpose. Thanks, Alex
On 07/05/2018 09:11 PM, Michael S. Tsirkin wrote: > On Thu, Jul 05, 2018 at 08:11:48PM +0200, Cédric Le Goater wrote: >> PCI devices needing a ROM allocate an optional MemoryRegion with >> pci_add_option_rom(). pci_del_option_rom() does the cleanup when the >> device is destroyed. The only action taken by this routine is to call >> vmstate_unregister_ram() which clears the id string of the optional >> ROM RAMBlock and now, also flags the RAMBlock as non-migratable. This >> was recently added by commit b895de502717 ("migration: discard >> non-migratable RAMBlocks"), . >> >> VFIO devices do their own allocation of the PCI ROM region. It is >> initialized in vfio_pci_size_rom() in which the PCI attribute >> 'has_rom' is set to true but the RAMBlock of the ROM region is not >> allocated. When the associated PCI device is deleted, >> pci_del_option_rom() calls vmstate_unregister_ram() which tries to >> flag a NULL RAMBlock because 'has_rom' is set, leading to a SEGV . >> >> The use of vmstate_unregister_ram() in the PCI device was added in >> commit b0e56e0b63f350691b52d3e75e89bb64143fbeff ("unset RAMBlock idstr >> when unregister MemoryRegion") > > I don't see it in that commit. I think it was part of the original > split by Avi. ok. That was pointed out to me by Paolo. It is hard to track all the changes. >> and from the archive in >> http://lists.gnu.org/archive/html/qemu-devel/2014-04/msg00282.html, it >> seems that it was trying to fix a reference count issue. >> >> vmstate_unregister_ram() being a work around, let's remove it to fix >> the current SEGV issue >> and let's try to find a fix for the initial ref >> count issue if we can reproduce. >> >> Signed-off-by: Cédric Le Goater <clg@kaod.org> > > What kind of testing did you do on this patch? Could you include > that info in the commit log pls? > > I think you need to at least add/remove some devices, then migrate. ok, for next round. I plugged/unplugged a : 0034:01:00.0 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx] and then migrated. Here is the original segv backtrace: I caught this bug while deleting a passthrough device from a pseries machine. Here is the stack: #0 qemu_ram_unset_migratable (rb=0x0) at /home/legoater/work/qemu/qemu-xive-3.0.git/exec.c:1994 #1 0x000000010072def0 in vmstate_unregister_ram (mr=0x101796af0, dev=<optimized out>) #2 0x0000000100694e5c in pci_del_option_rom (pdev=0x101796330) #3 pci_qdev_unrealize (dev=<optimized out>, errp=<optimized out>) #4 0x00000001005ff910 in device_set_realized (obj=0x101796330, value=<optimized out>, errp=0x0) #5 0x00000001007a487c in property_set_bool (obj=0x101796330, v=<optimized out>, name=<optimized out>, #6 0x00000001007a7878 in object_property_set (obj=0x101796330, v=0x7fff70033110, #7 0x00000001007aaf1c in object_property_set_qobject (obj=0x101796330, value=<optimized out>, #8 0x00000001007a7b90 in object_property_set_bool (obj=0x101796330, value=<optimized out>, #9 0x00000001005fcdd8 in device_unparent (obj=0x101796330) #10 0x00000001007a6dd0 in object_finalize_child_property (obj=<optimized out>, name=<optimized out>, #11 0x00000001007a50c0 in object_property_del_child (obj=0x10111f800, child=0x101796330, #12 0x0000000100425cc0 in spapr_phb_remove_pci_device_cb (dev=0x101796330) #13 0x0000000100427974 in spapr_drc_release (drc=0x1017e2df0) #14 0x0000000100429098 in spapr_drc_detach (drc=0x1017e2df0) #15 0x00000001004294e0 in drc_isolate_physical (drc=0x1017e2df0) #16 0x000000010042a50c in rtas_set_isolation_state (state=0, idx=<optimized out>) C. > >> --- >> hw/pci/pci.c | 11 ----------- >> 1 file changed, 11 deletions(-) >> >> diff --git a/hw/pci/pci.c b/hw/pci/pci.c >> index 80bc45930dee..78bf74e19f22 100644 >> --- a/hw/pci/pci.c >> +++ b/hw/pci/pci.c >> @@ -191,7 +191,6 @@ static PCIBus *pci_find_bus_nr(PCIBus *bus, int bus_num); >> static void pci_update_mappings(PCIDevice *d); >> static void pci_irq_handler(void *opaque, int irq_num, int level); >> static void pci_add_option_rom(PCIDevice *pdev, bool is_default_rom, Error **); >> -static void pci_del_option_rom(PCIDevice *pdev); >> >> static uint16_t pci_default_sub_vendor_id = PCI_SUBVENDOR_ID_REDHAT_QUMRANET; >> static uint16_t pci_default_sub_device_id = PCI_SUBDEVICE_ID_QEMU; >> @@ -1096,7 +1095,6 @@ static void pci_qdev_unrealize(DeviceState *dev, Error **errp) >> PCIDeviceClass *pc = PCI_DEVICE_GET_CLASS(pci_dev); >> >> pci_unregister_io_regions(pci_dev); >> - pci_del_option_rom(pci_dev); >> >> if (pc->exit) { >> pc->exit(pci_dev); >> @@ -2262,15 +2260,6 @@ static void pci_add_option_rom(PCIDevice *pdev, bool is_default_rom, >> pci_register_bar(pdev, PCI_ROM_SLOT, 0, &pdev->rom); >> } >> >> -static void pci_del_option_rom(PCIDevice *pdev) >> -{ >> - if (!pdev->has_rom) >> - return; >> - >> - vmstate_unregister_ram(&pdev->rom, &pdev->qdev); >> - pdev->has_rom = false; >> -} >> - >> /* >> * On success, pci_add_capability() returns a positive value >> * that the offset of the pci capability. >> -- >> 2.13.6
On 05/07/2018 21:30, Alex Williamson wrote: > On Thu, 5 Jul 2018 20:11:48 +0200 > Cédric Le Goater <clg@kaod.org> wrote: > >> PCI devices needing a ROM allocate an optional MemoryRegion with >> pci_add_option_rom(). pci_del_option_rom() does the cleanup when the >> device is destroyed. The only action taken by this routine is to call >> vmstate_unregister_ram() which clears the id string of the optional >> ROM RAMBlock and now, also flags the RAMBlock as non-migratable. This >> was recently added by commit b895de502717 ("migration: discard >> non-migratable RAMBlocks"), . >> >> VFIO devices do their own allocation of the PCI ROM region. It is >> initialized in vfio_pci_size_rom() in which the PCI attribute >> 'has_rom' is set to true but the RAMBlock of the ROM region is not >> allocated. When the associated PCI device is deleted, >> pci_del_option_rom() calls vmstate_unregister_ram() which tries to >> flag a NULL RAMBlock because 'has_rom' is set, leading to a SEGV . >> >> The use of vmstate_unregister_ram() in the PCI device was added in >> commit b0e56e0b63f350691b52d3e75e89bb64143fbeff ("unset RAMBlock idstr >> when unregister MemoryRegion") and from the archive in >> http://lists.gnu.org/archive/html/qemu-devel/2014-04/msg00282.html, it >> seems that it was trying to fix a reference count issue. >> >> vmstate_unregister_ram() being a work around, let's remove it to fix >> the current SEGV issue and let's try to find a fix for the initial ref >> count issue if we can reproduce. >> >> Signed-off-by: Cédric Le Goater <clg@kaod.org> >> --- >> hw/pci/pci.c | 11 ----------- >> 1 file changed, 11 deletions(-) > > Looking back through git history, I think vfio sets PCIDevice.has_rom > because we needed that to have memory_region_destroy() called, but that > changed with Paolo's: > > 469b046ead06 ("memory: remove memory_region_destroy") > > Now the MemoryRegion gets freed automagically, so maybe the better > option is that vfio-pci should not set has_rom to keep it out of this > path. I don't see that has_rom serves any other purpose. Thanks, That would work for me too! Paolo
On Thu, Jul 05, 2018 at 11:44:44PM +0200, Paolo Bonzini wrote: > On 05/07/2018 21:30, Alex Williamson wrote: > > On Thu, 5 Jul 2018 20:11:48 +0200 > > Cédric Le Goater <clg@kaod.org> wrote: > > > >> PCI devices needing a ROM allocate an optional MemoryRegion with > >> pci_add_option_rom(). pci_del_option_rom() does the cleanup when the > >> device is destroyed. The only action taken by this routine is to call > >> vmstate_unregister_ram() which clears the id string of the optional > >> ROM RAMBlock and now, also flags the RAMBlock as non-migratable. This > >> was recently added by commit b895de502717 ("migration: discard > >> non-migratable RAMBlocks"), . > >> > >> VFIO devices do their own allocation of the PCI ROM region. It is > >> initialized in vfio_pci_size_rom() in which the PCI attribute > >> 'has_rom' is set to true but the RAMBlock of the ROM region is not > >> allocated. When the associated PCI device is deleted, > >> pci_del_option_rom() calls vmstate_unregister_ram() which tries to > >> flag a NULL RAMBlock because 'has_rom' is set, leading to a SEGV . > >> > >> The use of vmstate_unregister_ram() in the PCI device was added in > >> commit b0e56e0b63f350691b52d3e75e89bb64143fbeff ("unset RAMBlock idstr > >> when unregister MemoryRegion") and from the archive in > >> http://lists.gnu.org/archive/html/qemu-devel/2014-04/msg00282.html, it > >> seems that it was trying to fix a reference count issue. > >> > >> vmstate_unregister_ram() being a work around, let's remove it to fix > >> the current SEGV issue and let's try to find a fix for the initial ref > >> count issue if we can reproduce. > >> > >> Signed-off-by: Cédric Le Goater <clg@kaod.org> > >> --- > >> hw/pci/pci.c | 11 ----------- > >> 1 file changed, 11 deletions(-) > > > > Looking back through git history, I think vfio sets PCIDevice.has_rom > > because we needed that to have memory_region_destroy() called, but that > > changed with Paolo's: > > > > 469b046ead06 ("memory: remove memory_region_destroy") > > > > Now the MemoryRegion gets freed automagically, so maybe the better > > option is that vfio-pci should not set has_rom to keep it out of this > > path. I don't see that has_rom serves any other purpose. Thanks, > > That would work for me too! Paolo, A question about memory region auto destruction (which might not related to this patch): I see that we have object_property_add_child() in memory_region_do_init() to achieve the auto destruction but only if the "name" of memory region is specified. Could we just do that unconditionally (though we might of course need to generate some of the names), or is there a reason not to do so? Also, not sure whether it's good we add some more comments to memory_region_do_init() to mention about its auto destruction capability, and if the "name" check is needed, possibly we comment that as well (I can do that after I understand it well, if you like)? Thanks,
On 06/07/2018 03:51, Peter Xu wrote: > > A question about memory region auto destruction (which might not > related to this patch): I see that we have object_property_add_child() > in memory_region_do_init() to achieve the auto destruction but only if > the "name" of memory region is specified. Could we just do that > unconditionally (though we might of course need to generate some of > the names), or is there a reason not to do so? I'm not sure actually if there are still regions without a name... Paolo
On Fri, Jul 06, 2018 at 05:55:23PM +0200, Paolo Bonzini wrote: > On 06/07/2018 03:51, Peter Xu wrote: > > > > A question about memory region auto destruction (which might not > > related to this patch): I see that we have object_property_add_child() > > in memory_region_do_init() to achieve the auto destruction but only if > > the "name" of memory region is specified. Could we just do that > > unconditionally (though we might of course need to generate some of > > the names), or is there a reason not to do so? > > I'm not sure actually if there are still regions without a name... > > Paolo Answer to Peter's question would be a yes then? With all the autodestruct I'm unsure when is calling vmstate_unregister_ram appropriate. Is it necessary to invoke that from pci any longer?
On 07/06/2018 06:25 PM, Michael S. Tsirkin wrote: > On Fri, Jul 06, 2018 at 05:55:23PM +0200, Paolo Bonzini wrote: >> On 06/07/2018 03:51, Peter Xu wrote: >>> >>> A question about memory region auto destruction (which might not >>> related to this patch): I see that we have object_property_add_child() >>> in memory_region_do_init() to achieve the auto destruction but only if >>> the "name" of memory region is specified. Could we just do that >>> unconditionally (though we might of course need to generate some of >>> the names), or is there a reason not to do so? >> >> I'm not sure actually if there are still regions without a name... >> >> Paolo > > Answer to Peter's question would be a yes then? > > With all the autodestruct I'm unsure when is calling vmstate_unregister_ram appropriate. > Is it necessary to invoke that from pci any longer? That could be another patch though. This is trying to fix vfio-pci unplug. C.
On 06/07/2018 18:25, Michael S. Tsirkin wrote: > On Fri, Jul 06, 2018 at 05:55:23PM +0200, Paolo Bonzini wrote: >> On 06/07/2018 03:51, Peter Xu wrote: >>> >>> A question about memory region auto destruction (which might not >>> related to this patch): I see that we have object_property_add_child() >>> in memory_region_do_init() to achieve the auto destruction but only if >>> the "name" of memory region is specified. Could we just do that >>> unconditionally (though we might of course need to generate some of >>> the names), or is there a reason not to do so? >> >> I'm not sure actually if there are still regions without a name... >> >> Paolo > > Answer to Peter's question would be a yes then? > > With all the autodestruct I'm unsure when is calling vmstate_unregister_ram appropriate. > Is it necessary to invoke that from pci any longer? > I think vmstate_unregister_ram is not necessary at all. This patch, or Alex's suggestion, are smaller changes in that direction---more suitable as we're closer to the release. Paolo
On Fri, Jul 06, 2018 at 07:06:36PM +0200, Paolo Bonzini wrote: > On 06/07/2018 18:25, Michael S. Tsirkin wrote: > > On Fri, Jul 06, 2018 at 05:55:23PM +0200, Paolo Bonzini wrote: > >> On 06/07/2018 03:51, Peter Xu wrote: > >>> > >>> A question about memory region auto destruction (which might not > >>> related to this patch): I see that we have object_property_add_child() > >>> in memory_region_do_init() to achieve the auto destruction but only if > >>> the "name" of memory region is specified. Could we just do that > >>> unconditionally (though we might of course need to generate some of > >>> the names), or is there a reason not to do so? > >> > >> I'm not sure actually if there are still regions without a name... > >> > >> Paolo > > > > Answer to Peter's question would be a yes then? > > > > With all the autodestruct I'm unsure when is calling vmstate_unregister_ram appropriate. > > Is it necessary to invoke that from pci any longer? > > > > I think vmstate_unregister_ram is not necessary at all. This patch, or > Alex's suggestion, are smaller changes in that direction---more suitable > as we're closer to the release. > > Paolo Oh absolutely. I was just wandering what am I missing. Cédric would you be interested in posting a patch removing vmstate_unregister_ram after release? You can do a series starting with this one.
diff --git a/hw/pci/pci.c b/hw/pci/pci.c index 80bc45930dee..78bf74e19f22 100644 --- a/hw/pci/pci.c +++ b/hw/pci/pci.c @@ -191,7 +191,6 @@ static PCIBus *pci_find_bus_nr(PCIBus *bus, int bus_num); static void pci_update_mappings(PCIDevice *d); static void pci_irq_handler(void *opaque, int irq_num, int level); static void pci_add_option_rom(PCIDevice *pdev, bool is_default_rom, Error **); -static void pci_del_option_rom(PCIDevice *pdev); static uint16_t pci_default_sub_vendor_id = PCI_SUBVENDOR_ID_REDHAT_QUMRANET; static uint16_t pci_default_sub_device_id = PCI_SUBDEVICE_ID_QEMU; @@ -1096,7 +1095,6 @@ static void pci_qdev_unrealize(DeviceState *dev, Error **errp) PCIDeviceClass *pc = PCI_DEVICE_GET_CLASS(pci_dev); pci_unregister_io_regions(pci_dev); - pci_del_option_rom(pci_dev); if (pc->exit) { pc->exit(pci_dev); @@ -2262,15 +2260,6 @@ static void pci_add_option_rom(PCIDevice *pdev, bool is_default_rom, pci_register_bar(pdev, PCI_ROM_SLOT, 0, &pdev->rom); } -static void pci_del_option_rom(PCIDevice *pdev) -{ - if (!pdev->has_rom) - return; - - vmstate_unregister_ram(&pdev->rom, &pdev->qdev); - pdev->has_rom = false; -} - /* * On success, pci_add_capability() returns a positive value * that the offset of the pci capability.
PCI devices needing a ROM allocate an optional MemoryRegion with pci_add_option_rom(). pci_del_option_rom() does the cleanup when the device is destroyed. The only action taken by this routine is to call vmstate_unregister_ram() which clears the id string of the optional ROM RAMBlock and now, also flags the RAMBlock as non-migratable. This was recently added by commit b895de502717 ("migration: discard non-migratable RAMBlocks"), . VFIO devices do their own allocation of the PCI ROM region. It is initialized in vfio_pci_size_rom() in which the PCI attribute 'has_rom' is set to true but the RAMBlock of the ROM region is not allocated. When the associated PCI device is deleted, pci_del_option_rom() calls vmstate_unregister_ram() which tries to flag a NULL RAMBlock because 'has_rom' is set, leading to a SEGV . The use of vmstate_unregister_ram() in the PCI device was added in commit b0e56e0b63f350691b52d3e75e89bb64143fbeff ("unset RAMBlock idstr when unregister MemoryRegion") and from the archive in http://lists.gnu.org/archive/html/qemu-devel/2014-04/msg00282.html, it seems that it was trying to fix a reference count issue. vmstate_unregister_ram() being a work around, let's remove it to fix the current SEGV issue and let's try to find a fix for the initial ref count issue if we can reproduce. Signed-off-by: Cédric Le Goater <clg@kaod.org> --- hw/pci/pci.c | 11 ----------- 1 file changed, 11 deletions(-)