qdev: free qemu-opts when the QOM path goes away

Message ID 1445253099-16894-1-git-send-email-pbonzini@redhat.com
State New

Commit Message

Paolo Bonzini Oct. 19, 2015, 11:11 a.m. UTC
Otherwise there is a race where the DEVICE_DELETED event has been sent but
attempts to reuse the ID will fail.

Reported-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/core/qdev.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Paolo Bonzini Nov. 4, 2015, 3:53 p.m. UTC | #1
On 19/10/2015 13:11, Paolo Bonzini wrote:
> Otherwise there is a race where the DEVICE_DELETED event has been sent but
> attempts to reuse the ID will fail.
> 
> Reported-by: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Ping?

Paolo

> ---
>  hw/core/qdev.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/core/qdev.c b/hw/core/qdev.c
> index 4ab04aa..92bd8bb 100644
> --- a/hw/core/qdev.c
> +++ b/hw/core/qdev.c
> @@ -1203,7 +1203,6 @@ static void device_finalize(Object *obj)
>      NamedGPIOList *ngl, *next;
>  
>      DeviceState *dev = DEVICE(obj);
> -    qemu_opts_del(dev->opts);
>  
>      QLIST_FOREACH_SAFE(ngl, &dev->gpios, node, next) {
>          QLIST_REMOVE(ngl, node);
> @@ -1251,6 +1250,9 @@ static void device_unparent(Object *obj)
>          qapi_event_send_device_deleted(!!dev->id, dev->id, path, &error_abort);
>          g_free(path);
>      }
> +
> +    qemu_opts_del(dev->opts);
> +    dev->opts = NULL;
>  }
>  
>  static void device_class_init(ObjectClass *class, void *data)
>
Markus Armbruster Nov. 4, 2015, 6:34 p.m. UTC | #2
Paolo Bonzini <pbonzini@redhat.com> writes:

> Otherwise there is a race where the DEVICE_DELETED event has been sent but
> attempts to reuse the ID will fail.
>
> Reported-by: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Let's see whether I understand this.

> ---
>  hw/core/qdev.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/hw/core/qdev.c b/hw/core/qdev.c
> index 4ab04aa..92bd8bb 100644
> --- a/hw/core/qdev.c
> +++ b/hw/core/qdev.c
> @@ -1203,7 +1203,6 @@ static void device_finalize(Object *obj)
>      NamedGPIOList *ngl, *next;
>  
>      DeviceState *dev = DEVICE(obj);
> -    qemu_opts_del(dev->opts);
>  
>      QLIST_FOREACH_SAFE(ngl, &dev->gpios, node, next) {
>          QLIST_REMOVE(ngl, node);
> @@ -1251,6 +1250,9 @@ static void device_unparent(Object *obj)
>          qapi_event_send_device_deleted(!!dev->id, dev->id, path, &error_abort);

DEVICE_DELETED sent here.

>          g_free(path);
>      }
> +
> +    qemu_opts_del(dev->opts);
> +    dev->opts = NULL;
>  }
>  
>  static void device_class_init(ObjectClass *class, void *data)

object_finalize_child_property() runs during unplug:

    static void object_finalize_child_property(Object *obj, const char *name,
                                               void *opaque)
    {
        Object *child = opaque;

        if (child->class->unparent) {
            (child->class->unparent)(child);  <--- calls device_unparent()
        }
        child->parent = NULL;
        object_unref(child);                  <--- calls device_finalize()
    }

device_unparent() sends DEVICE_DELETED, but dev->opts only gets deleted
later, in device_finalize().  If the client tries to reuse the ID in the
meantime, it fails.
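
To make the window concrete, here is a toy model of the two orderings.
It is only a sketch: live_ids, id_claim(), id_release(), ToyDevice and
reuse_in_window() are made-up names standing in for the QemuOpts ID
namespace and the device lifecycle, not QEMU code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for the ID namespace kept by QemuOpts (hypothetical). */
#define MAX_IDS 8
static const char *live_ids[MAX_IDS];

/* Claim an ID; fails if it is already taken or the table is full. */
static bool id_claim(const char *id)
{
    int free_slot = -1;

    for (int i = 0; i < MAX_IDS; i++) {
        if (live_ids[i] && strcmp(live_ids[i], id) == 0) {
            return false;
        }
        if (!live_ids[i] && free_slot < 0) {
            free_slot = i;
        }
    }
    if (free_slot < 0) {
        return false;
    }
    live_ids[free_slot] = id;
    return true;
}

/* Release an ID; a NULL id is a no-op. */
static void id_release(const char *id)
{
    if (!id) {
        return;
    }
    for (int i = 0; i < MAX_IDS; i++) {
        if (live_ids[i] && strcmp(live_ids[i], id) == 0) {
            live_ids[i] = NULL;
        }
    }
}

typedef struct ToyDevice {
    const char *opts_id;      /* models dev->opts */
    bool deleted_event_sent;  /* models DEVICE_DELETED having been sent */
} ToyDevice;

/* patched == true models the ordering proposed by the patch. */
static void toy_unparent(ToyDevice *dev, bool patched)
{
    dev->deleted_event_sent = true;
    if (patched) {
        id_release(dev->opts_id);
        dev->opts_id = NULL;
    }
}

static void toy_finalize(ToyDevice *dev)
{
    id_release(dev->opts_id);  /* no-op if unparent already released it */
    dev->opts_id = NULL;
}

/* Can a client reuse the ID between unparent and finalize? */
static bool reuse_in_window(bool patched)
{
    ToyDevice dev = { .opts_id = "disk0", .deleted_event_sent = false };
    bool claimed, reused;

    for (int i = 0; i < MAX_IDS; i++) {
        live_ids[i] = NULL;
    }
    claimed = id_claim(dev.opts_id);
    assert(claimed);
    (void)claimed;

    toy_unparent(&dev, patched);
    reused = id_claim("disk0");  /* client reacts to the event */
    toy_finalize(&dev);
    return reused;
}
```

reuse_in_window(false) models the current code and reports that the
reuse fails; reuse_in_window(true) models the patch and reports success.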

Two remarks:

1. Wouldn't it be cleaner to delete dev->opts *before* sending
   DEVICE_DELETED?  Like this:

    +++ b/hw/core/qdev.c
    @@ -1244,6 +1244,9 @@ static void device_unparent(Object *obj)
             dev->parent_bus = NULL;
         }

    +    qemu_opts_del(dev->opts);
    +    dev->opts = NULL;
    +
         /* Only send event if the device had been completely realized */
         if (dev->pending_deleted_event) {
             gchar *path = object_get_canonical_path(OBJECT(dev));

2. If the device is a block device, then unplugging it also deletes its
   backend (ugly wart we keep for backward compatibility; *not* for
   blockdev-add, though).  This backend also has a QemuOpts.  It gets
   deleted in drive_info_del().  Just like device_finalize(), it runs
   within object_unref(), i.e. after DEVICE_DELETED is sent.  Same race,
   different ID, or am I missing something?

   See also https://bugzilla.redhat.com/show_bug.cgi?id=1256044
Andreas Färber Nov. 5, 2015, 12:06 p.m. UTC | #3
Am 04.11.2015 um 19:34 schrieb Markus Armbruster:
> Paolo Bonzini <pbonzini@redhat.com> writes:
> 
>> Otherwise there is a race where the DEVICE_DELETED event has been sent but
>> attempts to reuse the ID will fail.
>>
>> Reported-by: Michael S. Tsirkin <mst@redhat.com>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> 
> Let's see whether I understand this.
> 
>> ---
>>  hw/core/qdev.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/core/qdev.c b/hw/core/qdev.c
>> index 4ab04aa..92bd8bb 100644
>> --- a/hw/core/qdev.c
>> +++ b/hw/core/qdev.c
>> @@ -1203,7 +1203,6 @@ static void device_finalize(Object *obj)
>>      NamedGPIOList *ngl, *next;
>>  
>>      DeviceState *dev = DEVICE(obj);
>> -    qemu_opts_del(dev->opts);
>>  
>>      QLIST_FOREACH_SAFE(ngl, &dev->gpios, node, next) {
>>          QLIST_REMOVE(ngl, node);
>> @@ -1251,6 +1250,9 @@ static void device_unparent(Object *obj)
>>          qapi_event_send_device_deleted(!!dev->id, dev->id, path, &error_abort);
> 
> DEVICE_DELETED sent here.
> 
>>          g_free(path);
>>      }
>> +
>> +    qemu_opts_del(dev->opts);
>> +    dev->opts = NULL;
>>  }
>>  
>>  static void device_class_init(ObjectClass *class, void *data)
> 
> object_finalize_child_property() runs during unplug:
> 
>     static void object_finalize_child_property(Object *obj, const char *name,
>                                                void *opaque)
>     {
>         Object *child = opaque;
> 
>         if (child->class->unparent) {
>             (child->class->unparent)(child);  <--- calls device_unparent()
>         }
>         child->parent = NULL;
>         object_unref(child);                  <--- calls device_finalize()
>     }
> 
> device_unparent() sends DEVICE_DELETED, but dev->opts gets only deleted
> later, in device_finalize.  If the client tries to reuse the ID in the
> meantime, it fails.
> 
> Two remarks:
> 
> 1. Wouldn't it be cleaner to delete dev->opts *before* sending
>    DEVICE_DELETED?  Like this:
> 
>     +++ b/hw/core/qdev.c
>     @@ -1244,6 +1244,9 @@ static void device_unparent(Object *obj)
>              dev->parent_bus = NULL;
>          }
> 
>     +    qemu_opts_del(dev->opts);
>     +    dev->opts = NULL;
>     +
>          /* Only send event if the device had been completely realized */
>          if (dev->pending_deleted_event) {
>              gchar *path = object_get_canonical_path(OBJECT(dev));

To me this proposal sounds sane, but I did not get to tracing the code
flow here. Paolo, which approach do you prefer and why?

> 2. If the device is a block device, then unplugging it also deletes its
>    backend (ugly wart we keep for backward compatibility; *not* for
>    blockdev-add, though).  This backend also has a QemuOpts.  It gets
>    deleted in drive_info_del().  Just like device_finalize(), it runs
>    within object_unref(), i.e. after DEVICE_DELETED is sent.  Same race,
>    different ID, or am I missing something?
> 
>    See also https://bugzilla.redhat.com/show_bug.cgi?id=1256044

If we can leave this patch decoupled from block layer and decide soonish
on the desired approach, I'd be happy to include it in my upcoming
qom-devices pull.

Regards,
Andreas
Paolo Bonzini Nov. 5, 2015, 12:21 p.m. UTC | #4
On 05/11/2015 13:06, Andreas Färber wrote:
> > 1. Wouldn't it be cleaner to delete dev->opts *before* sending
> >    DEVICE_DELETED?  Like this:
> > 
> >     +++ b/hw/core/qdev.c
> >     @@ -1244,6 +1244,9 @@ static void device_unparent(Object *obj)
> >              dev->parent_bus = NULL;
> >          }
> > 
> >     +    qemu_opts_del(dev->opts);
> >     +    dev->opts = NULL;
> >     +
> >          /* Only send event if the device had been completely realized */
> >          if (dev->pending_deleted_event) {
> >              gchar *path = object_get_canonical_path(OBJECT(dev));
> 
> To me this proposal sounds sane, but I did not get to tracing the code
> flow here. Paolo, which approach do you prefer and why?

It doesn't really matter, because the BQL is being held here.

On the other hand, if the opts are deleted in finalize, there is an
arbitrary delay because finalize is typically called after a
synchronize_rcu period.
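
As a rough illustration of that delay, here is a toy deferral queue.
It is a sketch under assumptions: defer_call() and
grace_period_elapsed() are invented helpers that mimic "run this after
a grace period"; they are not QEMU's RCU API, and ToyDev is not
DeviceState.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*DeferredFn)(void *opaque);

#define MAX_DEFERRED 4
static struct {
    DeferredFn fn;
    void *opaque;
} deferred[MAX_DEFERRED];
static size_t n_deferred;

/* Queue fn to run once the simulated grace period has elapsed. */
static bool defer_call(DeferredFn fn, void *opaque)
{
    if (n_deferred == MAX_DEFERRED) {
        return false;
    }
    deferred[n_deferred].fn = fn;
    deferred[n_deferred].opaque = opaque;
    n_deferred++;
    return true;
}

/* Simulated end of the grace period: run everything queued so far. */
static void grace_period_elapsed(void)
{
    for (size_t i = 0; i < n_deferred; i++) {
        deferred[i].fn(deferred[i].opaque);
    }
    n_deferred = 0;
}

typedef struct ToyDev {
    bool id_in_use;  /* models the ID still being held by dev->opts */
} ToyDev;

static void toy_release_id(void *opaque)
{
    ToyDev *dev = opaque;

    dev->id_in_use = false;
}

/* Current code: the ID is released by the deferred finalizer. */
static void toy_unplug_old(ToyDev *dev)
{
    bool queued = defer_call(toy_release_id, dev);

    assert(queued);
    (void)queued;
}

/* With the patch: the ID is released synchronously, in unparent. */
static void toy_unplug_patched(ToyDev *dev)
{
    toy_release_id(dev);
}
```

After toy_unplug_old() the ID stays in use until
grace_period_elapsed() runs, however long that takes; after
toy_unplug_patched() it is free as soon as the call returns.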

>> > 2. If the device is a block device, then unplugging it also deletes its
>> >    backend (ugly wart we keep for backward compatibility; *not* for
>> >    blockdev-add, though).  This backend also has a QemuOpts.  It gets
>> >    deleted in drive_info_del().  Just like device_finalize(), it runs
>> >    within object_unref(), i.e. after DEVICE_DELETED is sent.  Same race,
>> >    different ID, or am I missing something?
>> > 
>> >    See also https://bugzilla.redhat.com/show_bug.cgi?id=1256044
>
> If we can leave this patch decoupled from block layer and decide soonish
> on the desired approach, I'd be happy to include it in my upcoming
> qom-devices pull.

I agree with you, the block layer bug is separate.

Paolo
Markus Armbruster Nov. 5, 2015, 12:47 p.m. UTC | #5
Paolo Bonzini <pbonzini@redhat.com> writes:

> On 05/11/2015 13:06, Andreas Färber wrote:
>> > 1. Wouldn't it be cleaner to delete dev->opts *before* sending
>> >    DEVICE_DELETED?  Like this:
>> > 
>> >     +++ b/hw/core/qdev.c
>> >     @@ -1244,6 +1244,9 @@ static void device_unparent(Object *obj)
>> >              dev->parent_bus = NULL;
>> >          }
>> > 
>> >     +    qemu_opts_del(dev->opts);
>> >     +    dev->opts = NULL;
>> >     +
>> >          /* Only send event if the device had been completely realized */
>> >          if (dev->pending_deleted_event) {
>> >              gchar *path = object_get_canonical_path(OBJECT(dev));
>> 
>> To me this proposal sounds sane, but I did not get to tracing the code
>> flow here. Paolo, which approach do you prefer and why?
>
> It doesn't really matter, because the BQL is being held here.
>
> On the other hand, if the opts are deleted in finalize, there is an
> arbitrary delay because finalize is typically called after a
> synchronize_rcu period.
>
>>> > 2. If the device is a block device, then unplugging it also deletes its
>>> >    backend (ugly wart we keep for backward compatibility; *not* for
>>> >    blockdev-add, though).  This backend also has a QemuOpts.  It gets
>>> >    deleted in drive_info_del().  Just like device_finalize(), it runs
>>> >    within object_unref(), i.e. after DEVICE_DELETED is sent.  Same race,
>>> >    different ID, or am I missing something?
>>> > 
>>> >    See also https://bugzilla.redhat.com/show_bug.cgi?id=1256044
>>
>> If we can leave this patch decoupled from block layer and decide soonish
>> on the desired approach, I'd be happy to include it in my upcoming
>> qom-devices pull.
>
> I agree with you, the block layer bug is separate.

Related, but clearly separate.  Mentioning it in the commit message
would be nice, though.
Paolo Bonzini Jan. 8, 2016, 6:17 p.m. UTC | #6
On 05/11/2015 13:06, Andreas Färber wrote:
> Am 04.11.2015 um 19:34 schrieb Markus Armbruster:
>> Paolo Bonzini <pbonzini@redhat.com> writes:
>>
>>> Otherwise there is a race where the DEVICE_DELETED event has been sent but
>>> attempts to reuse the ID will fail.
>>>
>>> Reported-by: Michael S. Tsirkin <mst@redhat.com>
>>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>>
>> Let's see whether I understand this.
>>
>>> ---
>>>  hw/core/qdev.c | 4 +++-
>>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/hw/core/qdev.c b/hw/core/qdev.c
>>> index 4ab04aa..92bd8bb 100644
>>> --- a/hw/core/qdev.c
>>> +++ b/hw/core/qdev.c
>>> @@ -1203,7 +1203,6 @@ static void device_finalize(Object *obj)
>>>      NamedGPIOList *ngl, *next;
>>>  
>>>      DeviceState *dev = DEVICE(obj);
>>> -    qemu_opts_del(dev->opts);
>>>  
>>>      QLIST_FOREACH_SAFE(ngl, &dev->gpios, node, next) {
>>>          QLIST_REMOVE(ngl, node);
>>> @@ -1251,6 +1250,9 @@ static void device_unparent(Object *obj)
>>>          qapi_event_send_device_deleted(!!dev->id, dev->id, path, &error_abort);
>>
>> DEVICE_DELETED sent here.
>>
>>>          g_free(path);
>>>      }
>>> +
>>> +    qemu_opts_del(dev->opts);
>>> +    dev->opts = NULL;
>>>  }
>>>  
>>>  static void device_class_init(ObjectClass *class, void *data)
>>
>> object_finalize_child_property() runs during unplug:
>>
>>     static void object_finalize_child_property(Object *obj, const char *name,
>>                                                void *opaque)
>>     {
>>         Object *child = opaque;
>>
>>         if (child->class->unparent) {
>>             (child->class->unparent)(child);  <--- calls device_unparent()
>>         }
>>         child->parent = NULL;
>>         object_unref(child);                  <--- calls device_finalize()
>>     }
>>
>> device_unparent() sends DEVICE_DELETED, but dev->opts gets only deleted
>> later, in device_finalize.  If the client tries to reuse the ID in the
>> meantime, it fails.
>>
>> Two remarks:
>>
>> 1. Wouldn't it be cleaner to delete dev->opts *before* sending
>>    DEVICE_DELETED?  Like this:
>>
>>     +++ b/hw/core/qdev.c
>>     @@ -1244,6 +1244,9 @@ static void device_unparent(Object *obj)
>>              dev->parent_bus = NULL;
>>          }
>>
>>     +    qemu_opts_del(dev->opts);
>>     +    dev->opts = NULL;
>>     +
>>          /* Only send event if the device had been completely realized */
>>          if (dev->pending_deleted_event) {
>>              gchar *path = object_get_canonical_path(OBJECT(dev));
> 
> To me this proposal sounds sane, but I did not get to tracing the code
> flow here. Paolo, which approach do you prefer and why?
> 
>> 2. If the device is a block device, then unplugging it also deletes its
>>    backend (ugly wart we keep for backward compatibility; *not* for
>>    blockdev-add, though).  This backend also has a QemuOpts.  It gets
>>    deleted in drive_info_del().  Just like device_finalize(), it runs
>>    within object_unref(), i.e. after DEVICE_DELETED is sent.  Same race,
>>    different ID, or am I missing something?
>>
>>    See also https://bugzilla.redhat.com/show_bug.cgi?id=1256044
> 
> If we can leave this patch decoupled from block layer and decide soonish
> on the desired approach, I'd be happy to include it in my upcoming
> qom-devices pull.

Ping?

Paolo
Andreas Färber Jan. 15, 2016, 5:03 p.m. UTC | #7
Am 05.11.2015 um 13:47 schrieb Markus Armbruster:
> Paolo Bonzini <pbonzini@redhat.com> writes:
>> On 05/11/2015 13:06, Andreas Färber wrote:
>>>> 1. Wouldn't it be cleaner to delete dev->opts *before* sending
>>>>    DEVICE_DELETED?  Like this:
>>>>
>>>>     +++ b/hw/core/qdev.c
>>>>     @@ -1244,6 +1244,9 @@ static void device_unparent(Object *obj)
>>>>              dev->parent_bus = NULL;
>>>>          }
>>>>
>>>>     +    qemu_opts_del(dev->opts);
>>>>     +    dev->opts = NULL;
>>>>     +
>>>>          /* Only send event if the device had been completely realized */
>>>>          if (dev->pending_deleted_event) {
>>>>              gchar *path = object_get_canonical_path(OBJECT(dev));
>>>
>>> To me this proposal sounds sane, but I did not get to tracing the code
>>> flow here. Paolo, which approach do you prefer and why?
>>
>> It doesn't really matter, because the BQL is being held here.
>>
>> On the other hand, if the opts are deleted in finalize, there is an
>> arbitrary delay because finalize is typically called after a
>> synchronize_rcu period.
>>
>>>>> 2. If the device is a block device, then unplugging it also deletes its
>>>>>    backend (ugly wart we keep for backward compatibility; *not* for
>>>>>    blockdev-add, though).  This backend also has a QemuOpts.  It gets
>>>>>    deleted in drive_info_del().  Just like device_finalize(), it runs
>>>>>    within object_unref(), i.e. after DEVICE_DELETED is sent.  Same race,
>>>>>    different ID, or am I missing something?
>>>>>
>>>>>    See also https://bugzilla.redhat.com/show_bug.cgi?id=1256044
>>>
>>> If we can leave this patch decoupled from block layer and decide soonish
>>> on the desired approach, I'd be happy to include it in my upcoming
>>> qom-devices pull.
>>
>> I agree with you, the block layer bug is separate.
> 
> Related, but clearly separate.  Mentioning it in the commit message
> would be nice, though.

Paolo, pong: I gathered that I should queue the original patch without
Markus's proposed change, correct? And do you want to add some sentence
to the commit message as requested by Markus?

Regards,
Andreas
Paolo Bonzini Jan. 15, 2016, 5:16 p.m. UTC | #8
On 15/01/2016 18:03, Andreas Färber wrote:
> Am 05.11.2015 um 13:47 schrieb Markus Armbruster:
>> Paolo Bonzini <pbonzini@redhat.com> writes:
>>> On 05/11/2015 13:06, Andreas Färber wrote:
>>>>> 1. Wouldn't it be cleaner to delete dev->opts *before* sending
>>>>>    DEVICE_DELETED?  Like this:
>>>>>
>>>>>     +++ b/hw/core/qdev.c
>>>>>     @@ -1244,6 +1244,9 @@ static void device_unparent(Object *obj)
>>>>>              dev->parent_bus = NULL;
>>>>>          }
>>>>>
>>>>>     +    qemu_opts_del(dev->opts);
>>>>>     +    dev->opts = NULL;
>>>>>     +
>>>>>          /* Only send event if the device had been completely realized */
>>>>>          if (dev->pending_deleted_event) {
>>>>>              gchar *path = object_get_canonical_path(OBJECT(dev));
>>>>
>>>> To me this proposal sounds sane, but I did not get to tracing the code
>>>> flow here. Paolo, which approach do you prefer and why?
>>>
>>> It doesn't really matter, because the BQL is being held here.
>>>
>>> On the other hand, if the opts are deleted in finalize, there is an
>>> arbitrary delay because finalize is typically called after a
>>> synchronize_rcu period.
>>>
>>>>>> 2. If the device is a block device, then unplugging it also deletes its
>>>>>>    backend (ugly wart we keep for backward compatibility; *not* for
>>>>>>    blockdev-add, though).  This backend also has a QemuOpts.  It gets
>>>>>>    deleted in drive_info_del().  Just like device_finalize(), it runs
>>>>>>    within object_unref(), i.e. after DEVICE_DELETED is sent.  Same race,
>>>>>>    different ID, or am I missing something?
>>>>>>
>>>>>>    See also https://bugzilla.redhat.com/show_bug.cgi?id=1256044
>>>>
>>>> If we can leave this patch decoupled from block layer and decide soonish
>>>> on the desired approach, I'd be happy to include it in my upcoming
>>>> qom-devices pull.
>>>
>>> I agree with you, the block layer bug is separate.
>>
>> Related, but clearly separate.  Mentioning it in the commit message
>> would be nice, though.
> 
> Paolo, pong: I gathered that I should queue the original patch without
> Markus's proposed change, correct? And do you want to add some sentence
> to the commit message as requested by Markus?

Yes, thanks:

----
Note that similar races exist for other QemuOpts, which this patch
does not attempt to fix.

For example, if the device is a block device, then unplugging it also
deletes its backend.  However, this backend's QemuOpts gets deleted in
drive_info_del(), which is only called when properties are
destroyed.  Just like device_finalize(), drive_info_del() is called
some time after DEVICE_DELETED is sent.  A separate patch series has
been sent to plug this other bug.  Character devices also have yet to
be fixed.
-----

Paolo
Andreas Färber Jan. 15, 2016, 5:36 p.m. UTC | #9
Am 15.01.2016 um 18:16 schrieb Paolo Bonzini:
> On 15/01/2016 18:03, Andreas Färber wrote:
>> Am 05.11.2015 um 13:47 schrieb Markus Armbruster:
>>> Paolo Bonzini <pbonzini@redhat.com> writes:
>>>> On 05/11/2015 13:06, Andreas Färber wrote:
>>>>>> 1. Wouldn't it be cleaner to delete dev->opts *before* sending
>>>>>>    DEVICE_DELETED?  Like this:
>>>>>>
>>>>>>     +++ b/hw/core/qdev.c
>>>>>>     @@ -1244,6 +1244,9 @@ static void device_unparent(Object *obj)
>>>>>>              dev->parent_bus = NULL;
>>>>>>          }
>>>>>>
>>>>>>     +    qemu_opts_del(dev->opts);
>>>>>>     +    dev->opts = NULL;
>>>>>>     +
>>>>>>          /* Only send event if the device had been completely realized */
>>>>>>          if (dev->pending_deleted_event) {
>>>>>>              gchar *path = object_get_canonical_path(OBJECT(dev));
>>>>>
>>>>> To me this proposal sounds sane, but I did not get to tracing the code
>>>>> flow here. Paolo, which approach do you prefer and why?
>>>>
>>>> It doesn't really matter, because the BQL is being held here.
>>>>
>>>> On the other hand, if the opts are deleted in finalize, there is an
>>>> arbitrary delay because finalize is typically called after a
>>>> synchronize_rcu period.
>>>>
>>>>>>> 2. If the device is a block device, then unplugging it also deletes its
>>>>>>>    backend (ugly wart we keep for backward compatibility; *not* for
>>>>>>>    blockdev-add, though).  This backend also has a QemuOpts.  It gets
>>>>>>>    deleted in drive_info_del().  Just like device_finalize(), it runs
>>>>>>>    within object_unref(), i.e. after DEVICE_DELETED is sent.  Same race,
>>>>>>>    different ID, or am I missing something?
>>>>>>>
>>>>>>>    See also https://bugzilla.redhat.com/show_bug.cgi?id=1256044
>>>>>
>>>>> If we can leave this patch decoupled from block layer and decide soonish
>>>>> on the desired approach, I'd be happy to include it in my upcoming
>>>>> qom-devices pull.
>>>>
>>>> I agree with you, the block layer bug is separate.
>>>
>>> Related, but clearly separate.  Mentioning it in the commit message
>>> would be nice, though.
>>
>> Paolo, pong: I gathered that I should queue the original patch without
>> Markus's proposed change, correct? And do you want to add some sentence
>> to the commit message as requested by Markus?
> 
> Yes, thanks:
> 
> ----
> Note that similar races exist for other QemuOpts, which this patch
> does not attempt to fix.
> 
> For example, if the device is a block device, then unplugging it also
> deletes its backend.  However, this backend's QemuOpts gets deleted in
> drive_info_del(), which is only called when properties are
> destroyed.  Just like device_finalize(), drive_info_del() is called
> some time after DEVICE_DELETED is sent.  A separate patch series has
> been sent to plug this other bug.  Character devices also have yet to
> be fixed.
> -----

Thanks, queued on qom-next with that addition:
https://github.com/afaerber/qemu-cpu/commits/qom-next

I'll leave Markus and others time until Monday for *-by or comments, but
I really need to get out my pull with Daniel's class properties.

Regards,
Andreas
Markus Armbruster Jan. 18, 2016, 9:45 a.m. UTC | #10
Paolo Bonzini <pbonzini@redhat.com> writes:

> On 15/01/2016 18:03, Andreas Färber wrote:
>> Am 05.11.2015 um 13:47 schrieb Markus Armbruster:
>>> Paolo Bonzini <pbonzini@redhat.com> writes:
>>>> On 05/11/2015 13:06, Andreas Färber wrote:
>>>>>> 1. Wouldn't it be cleaner to delete dev->opts *before* sending
>>>>>>    DEVICE_DELETED?  Like this:
>>>>>>
>>>>>>     +++ b/hw/core/qdev.c
>>>>>>     @@ -1244,6 +1244,9 @@ static void device_unparent(Object *obj)
>>>>>>              dev->parent_bus = NULL;
>>>>>>          }
>>>>>>
>>>>>>     +    qemu_opts_del(dev->opts);
>>>>>>     +    dev->opts = NULL;
>>>>>>     +
>>>>>>          /* Only send event if the device had been completely realized */
>>>>>>          if (dev->pending_deleted_event) {
>>>>>>              gchar *path = object_get_canonical_path(OBJECT(dev));
>>>>>
>>>>> To me this proposal sounds sane, but I did not get to tracing the code
>>>>> flow here. Paolo, which approach do you prefer and why?
>>>>
>>>> It doesn't really matter, because the BQL is being held here.
>>>>
>>>> On the other hand, if the opts are deleted in finalize, there is an
>>>> arbitrary delay because finalize is typically called after a
>>>> synchronize_rcu period.
>>>>
>>>>>>> 2. If the device is a block device, then unplugging it also deletes its
>>>>>>>    backend (ugly wart we keep for backward compatibility; *not* for
>>>>>>>    blockdev-add, though).  This backend also has a QemuOpts.  It gets
>>>>>>>    deleted in drive_info_del().  Just like device_finalize(), it runs
>>>>>>>    within object_unref(), i.e. after DEVICE_DELETED is sent.  Same race,
>>>>>>>    different ID, or am I missing something?
>>>>>>>
>>>>>>>    See also https://bugzilla.redhat.com/show_bug.cgi?id=1256044
>>>>>
>>>>> If we can leave this patch decoupled from block layer and decide soonish
>>>>> on the desired approach, I'd be happy to include it in my upcoming
>>>>> qom-devices pull.
>>>>
>>>> I agree with you, the block layer bug is separate.
>>>
>>> Related, but clearly separate.  Mentioning it in the commit message
>>> would be nice, though.
>> 
>> Paolo, pong: I gathered that I should queue the original patch without
>> Markus's proposed change, correct? And do you want to add some sentence
>> to the commit message as requested by Markus?
>
> Yes, thanks:
>
> ----
> Note that similar races exist for other QemuOpts, which this patch
> does not attempt to fix.
>
> For example, if the device is a block device, then unplugging it also
> deletes its backend.  However, this backend's QemuOpts gets deleted in
> drive_info_del(), which is only called when properties are
> destroyed.  Just like device_finalize(), drive_info_del() is called
> some time after DEVICE_DELETED is sent.  A separate patch series has
> been sent to plug this other bug.  Character devices also have yet to
> be fixed.

With this addition to the commit message:

Reviewed-by: Markus Armbruster <armbru@redhat.com>

Patch

diff --git a/hw/core/qdev.c b/hw/core/qdev.c
index 4ab04aa..92bd8bb 100644
--- a/hw/core/qdev.c
+++ b/hw/core/qdev.c
@@ -1203,7 +1203,6 @@  static void device_finalize(Object *obj)
     NamedGPIOList *ngl, *next;
 
     DeviceState *dev = DEVICE(obj);
-    qemu_opts_del(dev->opts);
 
     QLIST_FOREACH_SAFE(ngl, &dev->gpios, node, next) {
         QLIST_REMOVE(ngl, node);
@@ -1251,6 +1250,9 @@  static void device_unparent(Object *obj)
         qapi_event_send_device_deleted(!!dev->id, dev->id, path, &error_abort);
         g_free(path);
     }
+
+    qemu_opts_del(dev->opts);
+    dev->opts = NULL;
 }
 
 static void device_class_init(ObjectClass *class, void *data)