Message ID | 20181225140449.15786-10-fli@suse.com |
---|---|
State | New |
Headers | show |
Series | [for-4.0,v9,01/16] Fix segmentation fault when qemu_signal_init fails | expand |
Fei Li <fli@suse.com> writes: > Utilize the existed errp to propagate the error instead of the > temporary &error_abort. > > Cc: Markus Armbruster <armbru@redhat.com> > Cc: Jiri Slaby <jslaby@suse.cz> > Signed-off-by: Fei Li <fli@suse.com> > --- > hw/misc/edu.c | 7 ++++--- > 1 file changed, 4 insertions(+), 3 deletions(-) > > diff --git a/hw/misc/edu.c b/hw/misc/edu.c > index 3f4ba7ded3..011fe6e0b7 100644 > --- a/hw/misc/edu.c > +++ b/hw/misc/edu.c > @@ -356,9 +356,10 @@ static void pci_edu_realize(PCIDevice *pdev, Error **errp) > > qemu_mutex_init(&edu->thr_mutex); > qemu_cond_init(&edu->thr_cond); > - /* TODO: let the further caller handle the error instead of abort() here */ > - qemu_thread_create(&edu->thread, "edu", edu_fact_thread, > - edu, QEMU_THREAD_JOINABLE, &error_abort); > + if (!qemu_thread_create(&edu->thread, "edu", edu_fact_thread, > + edu, QEMU_THREAD_JOINABLE, errp)) { > + return; You need to clean up everything that got initialized so far. You might want to call qemu_thread_create() earlier so you have less to clean up. > + } > > memory_region_init_io(&edu->mmio, OBJECT(edu), &edu_mmio_ops, edu, > "edu-mmio", 1 * MiB); pci_register_bar(pdev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY, &edu->mmio); } static void pci_edu_uninit(PCIDevice *pdev) { EduState *edu = EDU(pdev); qemu_mutex_lock(&edu->thr_mutex); edu->stopping = true; qemu_mutex_unlock(&edu->thr_mutex); qemu_cond_signal(&edu->thr_cond); qemu_thread_join(&edu->thread); qemu_cond_destroy(&edu->thr_cond); qemu_mutex_destroy(&edu->thr_mutex); timer_del(&edu->dma_timer); } Preexisting: pci_edu_uninit() neglects to call msi_uninit(). Jiri?
On 07. 01. 19, 18:29, Markus Armbruster wrote: > static void pci_edu_uninit(PCIDevice *pdev) > { > EduState *edu = EDU(pdev); > > qemu_mutex_lock(&edu->thr_mutex); > edu->stopping = true; > qemu_mutex_unlock(&edu->thr_mutex); > qemu_cond_signal(&edu->thr_cond); > qemu_thread_join(&edu->thread); > > qemu_cond_destroy(&edu->thr_cond); > qemu_mutex_destroy(&edu->thr_mutex); > > timer_del(&edu->dma_timer); > } > > Preexisting: pci_edu_uninit() neglects to call msi_uninit(). Jiri?\ I don't know, the MSI support was added in: commit eabb5782f70b4a10975b24ccd7129929a05ac932 Author: Peter Xu <peterx@redhat.com> Date: Wed Sep 28 21:03:39 2016 +0800 hw/misc/edu: support MSI interrupt Hence CCing Peter. thanks,
On Tue, Jan 08, 2019 at 07:14:11AM +0100, Jiri Slaby wrote: > On 07. 01. 19, 18:29, Markus Armbruster wrote: > > static void pci_edu_uninit(PCIDevice *pdev) > > { > > EduState *edu = EDU(pdev); > > > > qemu_mutex_lock(&edu->thr_mutex); > > edu->stopping = true; > > qemu_mutex_unlock(&edu->thr_mutex); > > qemu_cond_signal(&edu->thr_cond); > > qemu_thread_join(&edu->thread); > > > > qemu_cond_destroy(&edu->thr_cond); > > qemu_mutex_destroy(&edu->thr_mutex); > > > > timer_del(&edu->dma_timer); > > } > > > > Preexisting: pci_edu_uninit() neglects to call msi_uninit(). Jiri?\ > > I don't know, the MSI support was added in: > commit eabb5782f70b4a10975b24ccd7129929a05ac932 > Author: Peter Xu <peterx@redhat.com> > Date: Wed Sep 28 21:03:39 2016 +0800 > > hw/misc/edu: support MSI interrupt > > Hence CCing Peter. Hi, Jiri, Markus, Fei, IMHO msi_uninit() is optional since it only operates on the config space of the device to remove the capability or fix up the flags without really doing any real destruction of objects so nothing will be leaked (unlike msix_uninit, which should be required). But I do agree that calling msi_uninit() could be even nicer here. Anyone would like to post a patch? Or should I? Thanks,
Peter Xu <peterx@redhat.com> writes: > On Tue, Jan 08, 2019 at 07:14:11AM +0100, Jiri Slaby wrote: >> On 07. 01. 19, 18:29, Markus Armbruster wrote: >> > static void pci_edu_uninit(PCIDevice *pdev) >> > { >> > EduState *edu = EDU(pdev); >> > >> > qemu_mutex_lock(&edu->thr_mutex); >> > edu->stopping = true; >> > qemu_mutex_unlock(&edu->thr_mutex); >> > qemu_cond_signal(&edu->thr_cond); >> > qemu_thread_join(&edu->thread); >> > >> > qemu_cond_destroy(&edu->thr_cond); >> > qemu_mutex_destroy(&edu->thr_mutex); >> > >> > timer_del(&edu->dma_timer); >> > } >> > >> > Preexisting: pci_edu_uninit() neglects to call msi_uninit(). Jiri?\ >> >> I don't know, the MSI support was added in: >> commit eabb5782f70b4a10975b24ccd7129929a05ac932 >> Author: Peter Xu <peterx@redhat.com> >> Date: Wed Sep 28 21:03:39 2016 +0800 >> >> hw/misc/edu: support MSI interrupt >> >> Hence CCing Peter. > > Hi, Jiri, Markus, Fei, > > IMHO msi_uninit() is optional since it only operates on the config > space of the device to remove the capability or fix up the flags > without really doing any real destruction of objects so nothing will > be leaked (unlike msix_uninit, which should be required). Michael, Marcel, is neglecting to call msi_uninit() okay, a harmless bug, or a harmful bug? > But I do > agree that calling msi_uninit() could be even nicer here. > > Anyone would like to post a patch? Or should I? Please coordinate fixing this with Fei Li.
在 2019/1/8 下午4:43, Markus Armbruster 写道: > Peter Xu <peterx@redhat.com> writes: > >> On Tue, Jan 08, 2019 at 07:14:11AM +0100, Jiri Slaby wrote: >>> On 07. 01. 19, 18:29, Markus Armbruster wrote: >>>> static void pci_edu_uninit(PCIDevice *pdev) >>>> { >>>> EduState *edu = EDU(pdev); >>>> >>>> qemu_mutex_lock(&edu->thr_mutex); >>>> edu->stopping = true; >>>> qemu_mutex_unlock(&edu->thr_mutex); >>>> qemu_cond_signal(&edu->thr_cond); >>>> qemu_thread_join(&edu->thread); >>>> >>>> qemu_cond_destroy(&edu->thr_cond); >>>> qemu_mutex_destroy(&edu->thr_mutex); >>>> >>>> timer_del(&edu->dma_timer); >>>> } >>>> >>>> Preexisting: pci_edu_uninit() neglects to call msi_uninit(). Jiri?\ >>> I don't know, the MSI support was added in: >>> commit eabb5782f70b4a10975b24ccd7129929a05ac932 >>> Author: Peter Xu <peterx@redhat.com> >>> Date: Wed Sep 28 21:03:39 2016 +0800 >>> >>> hw/misc/edu: support MSI interrupt >>> >>> Hence CCing Peter. >> Hi, Jiri, Markus, Fei, >> >> IMHO msi_uninit() is optional since it only operates on the config >> space of the device to remove the capability or fix up the flags >> without really doing any real destruction of objects so nothing will >> be leaked (unlike msix_uninit, which should be required). > Michael, Marcel, is neglecting to call msi_uninit() okay, a harmless > bug, or a harmful bug? Kindly ping. :) If corresponding change is needed, I'd like to do the update in the next version. >> But I do >> agree that calling msi_uninit() could be even nicer here. >> >> Anyone would like to post a patch? Or should I? > Please coordinate fixing this with Fei Li. Have a nice day, thanks all Fei
On Thu, Jan 10, 2019 at 09:29:38PM +0800, Fei Li wrote: > > 在 2019/1/8 下午4:43, Markus Armbruster 写道: > > Peter Xu <peterx@redhat.com> writes: > > > > > On Tue, Jan 08, 2019 at 07:14:11AM +0100, Jiri Slaby wrote: > > > > On 07. 01. 19, 18:29, Markus Armbruster wrote: > > > > > static void pci_edu_uninit(PCIDevice *pdev) > > > > > { > > > > > EduState *edu = EDU(pdev); > > > > > > > > > > qemu_mutex_lock(&edu->thr_mutex); > > > > > edu->stopping = true; > > > > > qemu_mutex_unlock(&edu->thr_mutex); > > > > > qemu_cond_signal(&edu->thr_cond); > > > > > qemu_thread_join(&edu->thread); > > > > > > > > > > qemu_cond_destroy(&edu->thr_cond); > > > > > qemu_mutex_destroy(&edu->thr_mutex); > > > > > > > > > > timer_del(&edu->dma_timer); > > > > > } > > > > > > > > > > Preexisting: pci_edu_uninit() neglects to call msi_uninit(). Jiri?\ > > > > I don't know, the MSI support was added in: > > > > commit eabb5782f70b4a10975b24ccd7129929a05ac932 > > > > Author: Peter Xu <peterx@redhat.com> > > > > Date: Wed Sep 28 21:03:39 2016 +0800 > > > > > > > > hw/misc/edu: support MSI interrupt > > > > > > > > Hence CCing Peter. > > > Hi, Jiri, Markus, Fei, > > > > > > IMHO msi_uninit() is optional since it only operates on the config > > > space of the device to remove the capability or fix up the flags > > > without really doing any real destruction of objects so nothing will > > > be leaked (unlike msix_uninit, which should be required). > > Michael, Marcel, is neglecting to call msi_uninit() okay, a harmless > > bug, or a harmful bug? > > Kindly ping. :) > > If corresponding change is needed, I'd like to do the update in the next > version. Fei, If you're going to post the edu patch, please post it as a standalone patch. More patches mean harder that the series could be accepted quickly. So it would be good to split patches sometimes especially if they are irrelevant. Regards,
在 2019/1/11 上午10:49, Peter Xu 写道: > On Thu, Jan 10, 2019 at 09:29:38PM +0800, Fei Li wrote: >> 在 2019/1/8 下午4:43, Markus Armbruster 写道: >>> Peter Xu <peterx@redhat.com> writes: >>> >>>> On Tue, Jan 08, 2019 at 07:14:11AM +0100, Jiri Slaby wrote: >>>>> On 07. 01. 19, 18:29, Markus Armbruster wrote: >>>>>> static void pci_edu_uninit(PCIDevice *pdev) >>>>>> { >>>>>> EduState *edu = EDU(pdev); >>>>>> >>>>>> qemu_mutex_lock(&edu->thr_mutex); >>>>>> edu->stopping = true; >>>>>> qemu_mutex_unlock(&edu->thr_mutex); >>>>>> qemu_cond_signal(&edu->thr_cond); >>>>>> qemu_thread_join(&edu->thread); >>>>>> >>>>>> qemu_cond_destroy(&edu->thr_cond); >>>>>> qemu_mutex_destroy(&edu->thr_mutex); >>>>>> >>>>>> timer_del(&edu->dma_timer); >>>>>> } >>>>>> >>>>>> Preexisting: pci_edu_uninit() neglects to call msi_uninit(). Jiri?\ >>>>> I don't know, the MSI support was added in: >>>>> commit eabb5782f70b4a10975b24ccd7129929a05ac932 >>>>> Author: Peter Xu <peterx@redhat.com> >>>>> Date: Wed Sep 28 21:03:39 2016 +0800 >>>>> >>>>> hw/misc/edu: support MSI interrupt >>>>> >>>>> Hence CCing Peter. >>>> Hi, Jiri, Markus, Fei, >>>> >>>> IMHO msi_uninit() is optional since it only operates on the config >>>> space of the device to remove the capability or fix up the flags >>>> without really doing any real destruction of objects so nothing will >>>> be leaked (unlike msix_uninit, which should be required). >>> Michael, Marcel, is neglecting to call msi_uninit() okay, a harmless >>> bug, or a harmful bug? >> Kindly ping. :) >> >> If corresponding change is needed, I'd like to do the update in the next >> version. > Fei, > > If you're going to post the edu patch, please post it as a standalone > patch. More patches mean harder that the series could be accepted > quickly. So it would be good to split patches sometimes especially if > they are irrelevant. > > Regards, > Ok, thanks for this helpful suggestion. Will send this patch alone in the next version. :) Have a nice day Fei
在 2019/1/8 上午1:29, Markus Armbruster 写道: > Fei Li <fli@suse.com> writes: > >> Utilize the existed errp to propagate the error instead of the >> temporary &error_abort. >> >> Cc: Markus Armbruster <armbru@redhat.com> >> Cc: Jiri Slaby <jslaby@suse.cz> >> Signed-off-by: Fei Li <fli@suse.com> >> --- >> hw/misc/edu.c | 7 ++++--- >> 1 file changed, 4 insertions(+), 3 deletions(-) >> >> diff --git a/hw/misc/edu.c b/hw/misc/edu.c >> index 3f4ba7ded3..011fe6e0b7 100644 >> --- a/hw/misc/edu.c >> +++ b/hw/misc/edu.c >> @@ -356,9 +356,10 @@ static void pci_edu_realize(PCIDevice *pdev, Error **errp) >> >> qemu_mutex_init(&edu->thr_mutex); >> qemu_cond_init(&edu->thr_cond); >> - /* TODO: let the further caller handle the error instead of abort() here */ >> - qemu_thread_create(&edu->thread, "edu", edu_fact_thread, >> - edu, QEMU_THREAD_JOINABLE, &error_abort); >> + if (!qemu_thread_create(&edu->thread, "edu", edu_fact_thread, >> + edu, QEMU_THREAD_JOINABLE, errp)) { >> + return; > You need to clean up everything that got initialized so far. You might > want to call qemu_thread_create() earlier so you have less to clean up. Just to make sure about how to do the cleanup. I notice that in device_set_realized(), the current code does not call "dc->unrealize(dev, NULL);" when dc->realize() fails. if (dc->realize) { dc->realize(dev, &local_err); } if (local_err != NULL) { goto fail; } Is this on purpose? (Maybe due to some devices' realize() do their own cleanup when fails? Sorry for the unsure, it is such a common function that I did not check all. :( ) Or else, I prefer to do the cleanup in a unified manner, e.g. call "dc->unrealize(dev, NULL);" which is the pci_qdev_unrealize() for pci devices. Have a nice day Fei
Fei Li <lifei1214@126.com> writes: > Just to make sure about how to do the cleanup. I notice that in device_set_realized(), > the current code does not call "dc->unrealize(dev, NULL);" when dc->realize() fails. > > if (dc->realize) { > dc->realize(dev, &local_err); > } > > if (local_err != NULL) { > goto fail; > } > > Is this on purpose? (Maybe due to some devices' realize() do their own cleanup > when fails? Sorry for the unsure, it is such a common function that I did not > check all. :( ) Or else, I prefer to do the cleanup in a unified manner, e.g. call "dc->unrealize(dev, NULL);" which is the pci_qdev_unrealize() for pci devices. Yes, this is on purpose. When a realize() method fails, it must revert everything it has done so far. Results in sane "either succeed completely, or fail and do nothing" semantics.
在 2019/1/14 下午8:36, Markus Armbruster 写道: > Fei Li <lifei1214@126.com> writes: > >> Just to make sure about how to do the cleanup. I notice that in device_set_realized(), >> the current code does not call "dc->unrealize(dev, NULL);" when dc->realize() fails. Sorry that I am still uncertain.. I guess the code below I pasted was misleading, actually I want to stress the *dc->unrealize() is not called when dc->realize() fails* and the incomplete below "goto fail" does not include the dc->unrealize(), but instead the dc->unrealize() is included in later child_realize_fail: & post_realize_fail:. Emm, IMHO, I think when dc->realize() fails, the dc->unrealize() is either should be called in the common function: device_set_realized() in a unified way, that is if (local_err != NULL) { + if (dc->unrealize) { + dc->unrealize(dev, local_err); + } goto fail; } or do the unrealize() locally for each device earlier when dc->realize() fails. But I checked several dc->realize() function, they did not call unrealize() when fails. Besides, it may mean verbose code if unrealize() locally. Thus I think the above way is the right way to do the cleanup when realize() fails. >> >> if (dc->realize) { >> dc->realize(dev, &local_err); >> } >> >> if (local_err != NULL) { >> goto fail; >> } >> >> Is this on purpose? (Maybe due to some devices' realize() do their own cleanup >> when fails? Sorry for the unsure, it is such a common function that I did not >> check all. :( ) Or else, I prefer to do the cleanup in a unified manner, e.g. call "dc->unrealize(dev, NULL);" which is the pci_qdev_unrealize() for pci devices. > Yes, this is on purpose. > > When a realize() method fails, it must revert everything it has done so > far. Results in sane "either succeed completely, or fail and do > nothing" semantics. Have a nice day, thanks Fei
Fei Li <lifei1214@126.com> writes: > 在 2019/1/14 下午8:36, Markus Armbruster 写道: >> Fei Li <lifei1214@126.com> writes: >> >>> Just to make sure about how to do the cleanup. I notice that in device_set_realized(), >>> the current code does not call "dc->unrealize(dev, NULL);" when dc->realize() fails. > Sorry that I am still uncertain.. I guess the code below I pasted was > misleading, > actually I want to stress the *dc->unrealize() is not called when > dc->realize() fails* > and the incomplete below "goto fail" does not include the dc->unrealize(), > but instead the dc->unrealize() is included in later > child_realize_fail: & post_realize_fail:. > > > Emm, IMHO, I think when dc->realize() fails, the dc->unrealize() is > either should be > called in the common function: device_set_realized() in a unified way, > that is > > if (local_err != NULL) { > + if (dc->unrealize) { > + dc->unrealize(dev, local_err); > + } > goto fail; > } > > or do the unrealize() locally for each device earlier when > dc->realize() fails. > > But I checked several dc->realize() function, they did not call unrealize() > when fails. Besides, it may mean verbose code if unrealize() locally. > Thus I think the above way is the right way to do the cleanup when > realize() fails. The realize() method is specified to either succeed completely or fail completely, i.e. fail and do nothing. The "either succeed completely or fail completely" aspect of the specification is sane and perfectly ordinary. How a concrete implementation accomplishes "fail completely" is up to the implementation. An implementation may choose to structure its FOO_realize() and FOO_unrealize() in a way that lets FOO_realize() call FOO_unrealize() to clean up on failure. An implementation may also choose to clean up differently. This freedom of choice is by design. Changing the specification now would involve auditing and updating all realize() and unrealize() methods. Not going to happen without an extremely compelling reason. >>> >>> if (dc->realize) { >>> dc->realize(dev, &local_err); >>> } >>> >>> if (local_err != NULL) { >>> goto fail; >>> } >>> >>> Is this on purpose? (Maybe due to some devices' realize() do their own cleanup >>> when fails? Sorry for the unsure, it is such a common function that I did not >>> check all. :( ) Or else, I prefer to do the cleanup in a unified manner, e.g. call "dc->unrealize(dev, NULL);" which is the pci_qdev_unrealize() for pci devices. >> Yes, this is on purpose. >> >> When a realize() method fails, it must revert everything it has done so >> far. Results in sane "either succeed completely, or fail and do >> nothing" semantics. > > Have a nice day, thanks > > Fei
在 2019/1/15 下午8:55, Markus Armbruster 写道: > Fei Li <lifei1214@126.com> writes: > >> 在 2019/1/14 下午8:36, Markus Armbruster 写道: >>> Fei Li <lifei1214@126.com> writes: >>> >>>> Just to make sure about how to do the cleanup. I notice that in device_set_realized(), >>>> the current code does not call "dc->unrealize(dev, NULL);" when dc->realize() fails. >> Sorry that I am still uncertain.. I guess the code below I pasted was >> misleading, >> actually I want to stress the *dc->unrealize() is not called when >> dc->realize() fails* >> and the incomplete below "goto fail" does not include the dc->unrealize(), >> but instead the dc->unrealize() is included in later >> child_realize_fail: & post_realize_fail:. >> >> >> Emm, IMHO, I think when dc->realize() fails, the dc->unrealize() is >> either should be >> called in the common function: device_set_realized() in a unified way, >> that is >> >> if (local_err != NULL) { >> + if (dc->unrealize) { >> + dc->unrealize(dev, local_err); >> + } >> goto fail; >> } >> >> or do the unrealize() locally for each device earlier when >> dc->realize() fails. >> >> But I checked several dc->realize() function, they did not call unrealize() >> when fails. Besides, it may mean verbose code if unrealize() locally. >> Thus I think the above way is the right way to do the cleanup when >> realize() fails. > The realize() method is specified to either succeed completely or fail > completely, i.e. fail and do nothing. The "either succeed completely or > fail completely" aspect of the specification is sane and perfectly > ordinary. > > How a concrete implementation accomplishes "fail completely" is up to > the implementation. > > An implementation may choose to structure its FOO_realize() and > FOO_unrealize() in a way that lets FOO_realize() call FOO_unrealize() to > clean up on failure. > > An implementation may also choose to clean up differently. > > This freedom of choice is by design. > > Changing the specification now would involve auditing and updating all > realize() and unrealize() methods. Not going to happen without an > extremely compelling reason. Ok, now I see. Thanks for the detail explanation. :) Will add the cleanup in the next version. Have a nice day Fei >>>> if (dc->realize) { >>>> dc->realize(dev, &local_err); >>>> } >>>> >>>> if (local_err != NULL) { >>>> goto fail; >>>> } >>>> >>>> Is this on purpose? (Maybe due to some devices' realize() do their own cleanup >>>> when fails? Sorry for the unsure, it is such a common function that I did not >>>> check all. :( ) Or else, I prefer to do the cleanup in a unified manner, e.g. call "dc->unrealize(dev, NULL);" which is the pci_qdev_unrealize() for pci devices. >>> Yes, this is on purpose. >>> >>> When a realize() method fails, it must revert everything it has done so >>> far. Results in sane "either succeed completely, or fail and do >>> nothing" semantics. >> Have a nice day, thanks >> >> Fei
diff --git a/hw/misc/edu.c b/hw/misc/edu.c index 3f4ba7ded3..011fe6e0b7 100644 --- a/hw/misc/edu.c +++ b/hw/misc/edu.c @@ -356,9 +356,10 @@ static void pci_edu_realize(PCIDevice *pdev, Error **errp) qemu_mutex_init(&edu->thr_mutex); qemu_cond_init(&edu->thr_cond); - /* TODO: let the further caller handle the error instead of abort() here */ - qemu_thread_create(&edu->thread, "edu", edu_fact_thread, - edu, QEMU_THREAD_JOINABLE, &error_abort); + if (!qemu_thread_create(&edu->thread, "edu", edu_fact_thread, + edu, QEMU_THREAD_JOINABLE, errp)) { + return; + } memory_region_init_io(&edu->mmio, OBJECT(edu), &edu_mmio_ops, edu, "edu-mmio", 1 * MiB);
Utilize the existed errp to propagate the error instead of the temporary &error_abort. Cc: Markus Armbruster <armbru@redhat.com> Cc: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Fei Li <fli@suse.com> --- hw/misc/edu.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)