diff mbox series

[06/10] device-core: use atomic_set on .realized property

Message ID 20200925172604.2142227-7-pbonzini@redhat.com
State New
Headers show
Series Fix scsi devices plug/unplug races w.r.t virtio-scsi iothread | expand

Commit Message

Paolo Bonzini Sept. 25, 2020, 5:26 p.m. UTC
From: Maxim Levitsky <mlevitsk@redhat.com>

Some code might race with placement of new devices on a bus.
We currently first place a (unrealized) device on the bus
and then realize it.

As a workaround, users that scan the child device list, can
check the realized property to see if it is safe to access such a device.
Use an atomic write here too to aid with this.

A separate discussion is what to do with devices that are unrealized:
It looks like for this case we only call the hotplug handler's unplug
callback and its up to it to unrealize the device.
An atomic operation doesn't cause harm for this code path though.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200913160259.32145-6-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/core/qdev.c         | 19 ++++++++++++++++++-
 include/hw/qdev-core.h |  3 +++
 2 files changed, 21 insertions(+), 1 deletion(-)

Comments

Maxim Levitsky Sept. 30, 2020, 2:31 p.m. UTC | #1
On Fri, 2020-09-25 at 13:26 -0400, Paolo Bonzini wrote:
> From: Maxim Levitsky <mlevitsk@redhat.com>
> 
> Some code might race with placement of new devices on a bus.
> We currently first place a (unrealized) device on the bus
> and then realize it.
> 
> As a workaround, users that scan the child device list, can
> check the realized property to see if it is safe to access such a device.
> Use an atomic write here too to aid with this.
> 
> A separate discussion is what to do with devices that are unrealized:
> It looks like for this case we only call the hotplug handler's unplug
> callback and its up to it to unrealize the device.
> An atomic operation doesn't cause harm for this code path though.
> 
> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> Message-Id: <20200913160259.32145-6-mlevitsk@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  hw/core/qdev.c         | 19 ++++++++++++++++++-
>  include/hw/qdev-core.h |  3 +++
>  2 files changed, 21 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/core/qdev.c b/hw/core/qdev.c
> index 59e5e710b7..fc4daa36fa 100644
> --- a/hw/core/qdev.c
> +++ b/hw/core/qdev.c
> @@ -946,7 +946,25 @@ static void device_set_realized(Object *obj, bool value, Error **errp)
>              }
>         }
>  
> +       qatomic_store_release(&dev->realized, value);
> +
>      } else if (!value && dev->realized) {
> +
> +        /*
> +         * Change the value so that any concurrent users are aware
> +         * that the device is going to be unrealized
> +         *
> +         * TODO: change .realized property to enum that states
> +         * each phase of the device realization/unrealization
> +         */
> +
> +        qatomic_set(&dev->realized, value);
> +        /*
> +         * Ensure that concurrent users see this update prior to
> +         * any other changes done by unrealize.
> +         */
> +        smp_wmb();
I''l probably never fully understand where to use read/write/full barrier.
If I understand corrctly, read barrier prevents reads done by this thread to be reordered,
by the CPU and write barrier prevents writes done by this CPU to be re-ordered.
Both (depending on the macro) usually imply compiler barrier (to avoid compilier re-ordering
stuff...)



> +
>          QLIST_FOREACH(bus, &dev->child_bus, sibling) {
>              qbus_unrealize(bus);
>          }
> @@ -961,7 +979,6 @@ static void device_set_realized(Object *obj, bool value, Error **errp)
>      }
>  
>      assert(local_err == NULL);
> -    dev->realized = value;
>      return;
>  
>  child_realize_fail:
> diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
> index 8067497074..39490e76ee 100644
> --- a/include/hw/qdev-core.h
> +++ b/include/hw/qdev-core.h
> @@ -163,6 +163,9 @@ struct NamedClockList {
>  /**
>   * DeviceState:
>   * @realized: Indicates whether the device has been fully constructed.
> + *            When accessed without the iothread mutex, consider using
> + *            qatomic_load_acquire() before accessing any other field in
> + *            the device.
This sounds way better that what I wrote. As you probably noticed recently
(but not this patchset yet) I started to review the spelling/comments
more throughfully, so hopefully we see less of stuff that I myself read
and wonder why I did that many spelling/grammar mistakes.... :-)

>   * @reset: ResettableState for the device; handled by Resettable interface.
>   *
>   * This structure should not be accessed directly.  We declare it here


Best regards,
	Maxim Levitsky
Paolo Bonzini Sept. 30, 2020, 5:44 p.m. UTC | #2
On 30/09/20 16:31, Maxim Levitsky wrote:
>> +
>> +        qatomic_set(&dev->realized, value);
>> +        /*
>> +         * Ensure that concurrent users see this update prior to
>> +         * any other changes done by unrealize.
>> +         */
>> +        smp_wmb();
>
> I''l probably never fully understand where to use read/write/full barrier.
> If I understand corrctly, read barrier prevents reads done by this thread to be reordered,
> by the CPU and write barrier prevents writes done by this CPU to be re-ordered.

I must say that the above is not really satisfactory.  The right thing
to do would be to say which changes are done by unrealize; then you
should make sure that *after* reading something that unrealize could
undo you check if dev->realized is still true.

scsi_device_find is one such case, but I'm not convinced it is enough.

Paolo

> Both (depending on the macro) usually imply compiler barrier (to avoid compilier re-ordering
> stuff...)
Maxim Levitsky Sept. 30, 2020, 6:03 p.m. UTC | #3
On Wed, 2020-09-30 at 19:44 +0200, Paolo Bonzini wrote:
> On 30/09/20 16:31, Maxim Levitsky wrote:
> > > +
> > > +        qatomic_set(&dev->realized, value);
> > > +        /*
> > > +         * Ensure that concurrent users see this update prior to
> > > +         * any other changes done by unrealize.
> > > +         */
> > > +        smp_wmb();
> > 
> > I''l probably never fully understand where to use read/write/full barrier.
> > If I understand corrctly, read barrier prevents reads done by this thread to be reordered,
> > by the CPU and write barrier prevents writes done by this CPU to be re-ordered.
> 
> I must say that the above is not really satisfactory.  The right thing
> to do would be to say which changes are done by unrealize; then you
> should make sure that *after* reading something that unrealize could
> undo you check if dev->realized is still true.
I didn't fully understand this to be honest.

I just wanted to explain what I know and what I don't know about hardware barriers.

I know that read barriers should be paired with write barriers, like if one CPU has a write barrier,
which ensures the orders of writes to two memory locations, the other CPU can then use a read barrier to ensure
that it sees those writes in this order.

I thus think that reads of dev->realized should be paired with read barrier, 
but here a full memory barrier isn't really needed.

Best regards,
	Maxim Levitsky

> 
> scsi_device_find is one such case, but I'm not convinced it is enough.
> 
> Paolo
> 
> > Both (depending on the macro) usually imply compiler barrier (to avoid compilier re-ordering
> > stuff...)
diff mbox series

Patch

diff --git a/hw/core/qdev.c b/hw/core/qdev.c
index 59e5e710b7..fc4daa36fa 100644
--- a/hw/core/qdev.c
+++ b/hw/core/qdev.c
@@ -946,7 +946,25 @@  static void device_set_realized(Object *obj, bool value, Error **errp)
             }
        }
 
+       qatomic_store_release(&dev->realized, value);
+
     } else if (!value && dev->realized) {
+
+        /*
+         * Change the value so that any concurrent users are aware
+         * that the device is going to be unrealized
+         *
+         * TODO: change .realized property to enum that states
+         * each phase of the device realization/unrealization
+         */
+
+        qatomic_set(&dev->realized, value);
+        /*
+         * Ensure that concurrent users see this update prior to
+         * any other changes done by unrealize.
+         */
+        smp_wmb();
+
         QLIST_FOREACH(bus, &dev->child_bus, sibling) {
             qbus_unrealize(bus);
         }
@@ -961,7 +979,6 @@  static void device_set_realized(Object *obj, bool value, Error **errp)
     }
 
     assert(local_err == NULL);
-    dev->realized = value;
     return;
 
 child_realize_fail:
diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index 8067497074..39490e76ee 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -163,6 +163,9 @@  struct NamedClockList {
 /**
  * DeviceState:
  * @realized: Indicates whether the device has been fully constructed.
+ *            When accessed without the iothread mutex, consider using
+ *            qatomic_load_acquire() before accessing any other field in
+ *            the device.
  * @reset: ResettableState for the device; handled by Resettable interface.
  *
  * This structure should not be accessed directly.  We declare it here