[v3,2/2] QEMUBH: make AioContext's bh re-entrant

Message ID

CAJnKYQ=Sj99wh_M1zVpX3V6TyDbZedF0kTLQ9phgRttz6e8Wrg@mail.gmail.com

State

New

Headers

MIME-Version: 1.0
In-Reply-To: <51C2BA6B.2050706@redhat.com>
References: <1371675569-6516-1-git-send-email-pingfank@linux.vnet.ibm.com>
	<1371675569-6516-3-git-send-email-pingfank@linux.vnet.ibm.com>
	<20130620073924.GA14255@stefanha-thinkpad.redhat.com>
	<51C2BA6B.2050706@redhat.com>
From: liu ping fan <qemulist@gmail.com>
Date: Thu, 20 Jun 2013 17:41:09 +0800
Message-ID: <CAJnKYQ=Sj99wh_M1zVpX3V6TyDbZedF0kTLQ9phgRttz6e8Wrg@mail.gmail.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Content-Type: text/plain; charset=ISO-8859-1
Cc: Kevin Wolf <kwolf@redhat.com>, Anthony Liguori <anthony@codemonkey.ws>, 
	Liu Ping Fan <pingfank@linux.vnet.ibm.com>,
	qemu-devel@nongnu.org, Stefan Hajnoczi <stefanha@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v3 2/2] QEMUBH: make AioContext's bh
	re-entrant
Precedence: list
Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org
Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org

Commit Message

pingfan liu June 20, 2013, 9:41 a.m. UTC

On Thu, Jun 20, 2013 at 4:16 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 20/06/2013 09:39, Stefan Hajnoczi wrote:
>> qemu_bh_cancel() and qemu_bh_delete() are not modified by this patch.
>>
>> It seems that calling them from a thread is a little risky because there
>> is no guarantee that the BH is no longer invoked after a thread calls
>> these functions.
>>
>> I think that's worth a comment or do you want them to take the lock so
>> they become safe?
>
> Taking the lock wouldn't help.  The invoking loop of aio_bh_poll runs
> lockless.  I think a comment is better.
>
> qemu_bh_cancel is inherently not thread-safe, there's not much you can
> do about it.
>
> qemu_bh_delete is safe as long as you wait for the bottom half to stop
> before deleting the containing object.  Once we have RCU, deletion of
> QOM objects will be RCU-protected.  Hence, a simple way could be to put
> the first part of aio_bh_poll() within rcu_read_lock/unlock.
>
In fact, I have an idea for this: introduce another member in QEMUBH,
an Object that will be referenced for the cb, and then leave everything
to the refcnt mechanism.
For qemu_bh_cancel(), I have not figured out whether it is important
to sync with the caller.
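
A minimal standalone sketch of the refcount idea, assuming C11 <stdatomic.h> instead of QEMU's own atomic macros. Obj, BH, obj_ref(), obj_unref(), bh_schedule(), bh_cancel(), bh_poll_one() and count_cb() are hypothetical stand-ins, not the real QEMU types or functions. Unlike the diff as posted, the sketch uses the value returned by the exchange to decide who won the 0 -> 1 transition, and it drops the reference when a pending BH is cancelled. It does not address a callback that is already running on another thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for QEMU's Object and QEMUBH (assumptions). */
typedef struct Obj {
    atomic_int refcnt;
} Obj;

typedef struct BH {
    atomic_int scheduled;
    Obj *obj;                    /* owner kept alive while the BH is pending */
    void (*cb)(void *opaque);
    void *opaque;
} BH;

static void obj_ref(Obj *o)
{
    atomic_fetch_add(&o->refcnt, 1);
}

static void obj_unref(Obj *o)
{
    int prev = atomic_fetch_sub(&o->refcnt, 1);
    assert(prev > 0);            /* catch an unbalanced unref */
}

/* Example callback: counts how often it ran. */
static void count_cb(void *opaque)
{
    ++*(int *)opaque;
}

/* Returns 1 if this call scheduled the BH, 0 if it was already pending. */
static int bh_schedule(BH *bh)
{
    if (bh->obj) {
        obj_ref(bh->obj);        /* take the ref before publishing the BH */
    }
    /* atomic_exchange() returns the previous value of the flag. */
    if (atomic_exchange(&bh->scheduled, 1)) {
        if (bh->obj) {
            obj_unref(bh->obj);  /* already pending: drop the extra ref */
        }
        return 0;
    }
    return 1;
}

/* Returns 1 if a pending run was cancelled, 0 if nothing was pending. */
static int bh_cancel(BH *bh)
{
    if (atomic_exchange(&bh->scheduled, 0)) {
        if (bh->obj) {
            obj_unref(bh->obj);  /* the pending run will not happen */
        }
        return 1;
    }
    return 0;
}

/* Poller side: run the BH once if it was pending. */
static int bh_poll_one(BH *bh)
{
    if (!atomic_exchange(&bh->scheduled, 0)) {
        return 0;
    }
    bh->cb(bh->opaque);
    if (bh->obj) {
        obj_unref(bh->obj);      /* release the ref taken at schedule time */
    }
    return 1;
}
```

Taking the reference before the exchange is a design choice of this sketch: it avoids a window in which the poller could run the callback and drop a reference that the scheduler has not taken yet.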

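diff --git a/async.c b/async.c
index 4b17eb7..60c35a1 100644
--- a/async.c
+++ b/async.c
@@ -61,6 +61,7 @@ int aio_bh_poll(AioContext *ctx)
 {
     QEMUBH *bh, **bhp, *next;
     int ret;
+    int sched;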

     ctx->walking_bh++;

@@ -69,8 +70,10 @@ int aio_bh_poll(AioContext *ctx)
         /* Make sure fetching bh before accessing its members */
         smp_read_barrier_depends();
         next = bh->next;
-        if (!bh->deleted && bh->scheduled) {
-            bh->scheduled = 0;
+        sched = 0;
+        atomic_xchg(&bh->scheduled, sched);
+        if (!bh->deleted && sched) {
+            //bh->scheduled = 0;
             if (!bh->idle)
                 ret = 1;
             bh->idle = 0;
@@ -79,6 +82,9 @@ int aio_bh_poll(AioContext *ctx)
              */
             smp_rmb();
             bh->cb(bh->opaque);
+            if (bh->obj) {
+                object_unref(bh->obj);
+            }
         }
     }

@@ -105,8 +111,12 @@ int aio_bh_poll(AioContext *ctx)

 void qemu_bh_schedule_idle(QEMUBH *bh)
 {
-    if (bh->scheduled)
+    int sched = 1;
+
+    atomic_xchg( &bh->scheduled, sched);
+    if (sched) {
         return;
+    }
     /* Make sure any writes that are needed by the callback are done
      * before the locations are read in the aio_bh_poll.
      */
@@ -117,25 +127,46 @@ void qemu_bh_schedule_idle(QEMUBH *bh)

 void qemu_bh_schedule(QEMUBH *bh)
 {
-    if (bh->scheduled)
+    int sched = 1;
+
+    atomic_xchg( &bh->scheduled, sched);
+    if (sched) {
         return;
+    }
     /* Make sure any writes that are needed by the callback are done
      * before the locations are read in the aio_bh_poll.
      */
     smp_wmb();
     bh->scheduled = 1;
+    if (bh->obj) {
+        object_ref(bh->obj);
+    }
     bh->idle = 0;
     aio_notify(bh->ctx);
 }

 void qemu_bh_cancel(QEMUBH *bh)
 {
-    bh->scheduled = 0;
+    int sched = 0;
+
+    atomic_xchg( &bh->scheduled, sched);
+    if (sched) {
+        if (bh->obj) {
+            object_ref(bh->obj);
+        }
+    }
 }

 void qemu_bh_delete(QEMUBH *bh)
 {
-    bh->scheduled = 0;
+    int sched = 0;
+
+    atomic_xchg( &bh->scheduled, sched);
+    if (sched) {
+        if (bh->obj) {
+            object_ref(bh->obj);
+        }
+    }
     bh->deleted = 1;
 }

Regards,
Pingfan
>> The other thing I'm unclear on is the ->idle assignment followed
>> immediately by a ->scheduled assignment.  Without memory barriers
>> aio_bh_poll() isn't guaranteed to get an ordered view of these updates:
>> it may see an idle BH as a regular scheduled BH because ->idle is still
>> 0.
>
> Right.  You need to order ->idle writes before ->scheduled writes, and
> add memory barriers, or alternatively use two bits in ->scheduled so
> that you can assign both atomically.
>
> Paolo
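
The "two bits in ->scheduled" alternative from the quoted reply can be sketched as below. This is only an illustration under assumptions: it uses C11 <stdatomic.h>, and FlagBH, BH_SCHEDULED, BH_IDLE, fbh_schedule(), fbh_schedule_idle() and fbh_take() are hypothetical names, not part of async.c. Because both bits live in one word, the poller can never observe "scheduled" without the matching "idle" state.

```c
#include <stdatomic.h>

/* Hypothetical flag layout: both states share one atomic word. */
enum {
    BH_SCHEDULED = 1u << 0,
    BH_IDLE      = 1u << 1,
};

typedef struct FlagBH {
    atomic_uint flags;
} FlagBH;

/* Publish the given flags only if the BH is not pending (0 -> flags). */
static int fbh_try_schedule(FlagBH *bh, unsigned flags)
{
    unsigned expected = 0;
    return atomic_compare_exchange_strong(&bh->flags, &expected, flags);
}

/* Returns 1 if newly scheduled as a regular BH, 0 if already pending. */
static int fbh_schedule(FlagBH *bh)
{
    return fbh_try_schedule(bh, BH_SCHEDULED);
}

/* Returns 1 if newly scheduled as an idle BH, 0 if already pending. */
static int fbh_schedule_idle(FlagBH *bh)
{
    return fbh_try_schedule(bh, BH_SCHEDULED | BH_IDLE);
}

/* Poller side: fetch and clear both bits in a single atomic step. */
static unsigned fbh_take(FlagBH *bh)
{
    return atomic_exchange(&bh->flags, 0);
}
```

A poller would call fbh_take() and test the returned value for BH_SCHEDULED and BH_IDLE, so no separate barrier between an ->idle write and a ->scheduled write is needed in this sketch.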

Comments

Paolo Bonzini June 20, 2013, 9:45 a.m. UTC | #1

On 20/06/2013 11:41, liu ping fan wrote:
> On Thu, Jun 20, 2013 at 4:16 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> On 20/06/2013 09:39, Stefan Hajnoczi wrote:
>>> qemu_bh_cancel() and qemu_bh_delete() are not modified by this patch.
>>>
>>> It seems that calling them from a thread is a little risky because there
>>> is no guarantee that the BH is no longer invoked after a thread calls
>>> these functions.
>>>
>>> I think that's worth a comment or do you want them to take the lock so
>>> they become safe?
>>
>> Taking the lock wouldn't help.  The invoking loop of aio_bh_poll runs
>> lockless.  I think a comment is better.
>>
>> qemu_bh_cancel is inherently not thread-safe, there's not much you can
>> do about it.
>>
>> qemu_bh_delete is safe as long as you wait for the bottom half to stop
>> before deleting the containing object.  Once we have RCU, deletion of
>> QOM objects will be RCU-protected.  Hence, a simple way could be to put
>> the first part of aio_bh_poll() within rcu_read_lock/unlock.
>>
> In fact, I have an idea for this: introduce another member in QEMUBH,
> an Object that will be referenced for the cb, and then leave everything
> to the refcnt mechanism.
> For qemu_bh_cancel(), I have not figured out whether it is important
> to sync with the caller.

This is a separate patch anyway... and there is a long discussion to
have first, too. :)

Let's concentrate on one thing at a time.

Paolo

> diff --git a/async.c b/async.c
> index 4b17eb7..60c35a1 100644
> --- a/async.c
> +++ b/async.c
> @@ -61,6 +61,7 @@ int aio_bh_poll(AioContext *ctx)
>  {
>      QEMUBH *bh, **bhp, *next;
>      int ret;
> +    int sched;
> 
>      ctx->walking_bh++;
> 
> @@ -69,8 +70,10 @@ int aio_bh_poll(AioContext *ctx)
>          /* Make sure fetching bh before accessing its members */
>          smp_read_barrier_depends();
>          next = bh->next;
> -        if (!bh->deleted && bh->scheduled) {
> -            bh->scheduled = 0;
> +        sched = 0;
> +        atomic_xchg(&bh->scheduled, sched);

This is expensive.

> +        if (!bh->deleted && sched) {
> +            //bh->scheduled = 0;
>              if (!bh->idle)
>                  ret = 1;
>              bh->idle = 0;
> @@ -79,6 +82,9 @@ int aio_bh_poll(AioContext *ctx)
>               */
>              smp_rmb();
>              bh->cb(bh->opaque);
> +            if (bh->obj) {
> +                object_unref(bh->obj);
> +            }
>          }
>      }
> 
> @@ -105,8 +111,12 @@ int aio_bh_poll(AioContext *ctx)
> 
>  void qemu_bh_schedule_idle(QEMUBH *bh)
>  {
> -    if (bh->scheduled)
> +    int sched = 1;
> +
> +    atomic_xchg( &bh->scheduled, sched);
> +    if (sched) {
>          return;
> +    }
>      /* Make sure any writes that are needed by the callback are done
>       * before the locations are read in the aio_bh_poll.
>       */
> @@ -117,25 +127,46 @@ void qemu_bh_schedule_idle(QEMUBH *bh)
> 
>  void qemu_bh_schedule(QEMUBH *bh)
>  {
> -    if (bh->scheduled)
> +    int sched = 1;
> +
> +    atomic_xchg( &bh->scheduled, sched);
> +    if (sched) {
>          return;
> +    }
>      /* Make sure any writes that are needed by the callback are done
>       * before the locations are read in the aio_bh_poll.
>       */
>      smp_wmb();
>      bh->scheduled = 1;
> +    if (bh->obj) {
> +        object_ref(bh->obj);
> +    }
>      bh->idle = 0;
>      aio_notify(bh->ctx);
>  }
> 
>  void qemu_bh_cancel(QEMUBH *bh)
>  {
> -    bh->scheduled = 0;
> +    int sched = 0;
> +
> +    atomic_xchg( &bh->scheduled, sched);
> +    if (sched) {
> +        if (bh->obj) {
> +            object_ref(bh->obj);
> +        }
> +    }
>  }
> 
>  void qemu_bh_delete(QEMUBH *bh)
>  {
> -    bh->scheduled = 0;
> +    int sched = 0;
> +
> +    atomic_xchg( &bh->scheduled, sched);
> +    if (sched) {
> +        if (bh->obj) {
> +            object_ref(bh->obj);
> +        }
> +    }
>      bh->deleted = 1;
>  }
> 
> Regards,
> Pingfan
>>> The other thing I'm unclear on is the ->idle assignment followed
>>> immediately by a ->scheduled assignment.  Without memory barriers
>>> aio_bh_poll() isn't guaranteed to get an ordered view of these updates:
>>> it may see an idle BH as a regular scheduled BH because ->idle is still
>>> 0.
>>
>> Right.  You need to order ->idle writes before ->scheduled writes, and
>> add memory barriers, or alternatively use two bits in ->scheduled so
>> that you can assign both atomically.
>>
>> Paolo

pingfan liu June 21, 2013, 4:35 a.m. UTC | #2

[...]
>>>
>>> qemu_bh_delete is safe as long as you wait for the bottom half to stop
>>> before deleting the containing object.  Once we have RCU, deletion of
>>> QOM objects will be RCU-protected.  Hence, a simple way could be to put
>>> the first part of aio_bh_poll() within rcu_read_lock/unlock.
>>>
>> In fact, I have an idea for this: introduce another member in QEMUBH,
>> an Object that will be referenced for the cb, and then leave everything
>> to the refcnt mechanism.
>> For qemu_bh_cancel(), I have not figured out whether it is important
>> to sync with the caller.
>
> This is a separate patch anyway... and there is a long discussion to
> have first, too. :)
>
> Let's concentrate on one thing at a time.
>
Yes, I will do that.

Regards,
Pingfan

> Paolo
>
>> diff --git a/async.c b/async.c
>> index 4b17eb7..60c35a1 100644
>> --- a/async.c
>> +++ b/async.c
>> @@ -61,6 +61,7 @@ int aio_bh_poll(AioContext *ctx)
>>  {
>>      QEMUBH *bh, **bhp, *next;
>>      int ret;
>> +    int sched;
>>
>>      ctx->walking_bh++;
>>
>> @@ -69,8 +70,10 @@ int aio_bh_poll(AioContext *ctx)
>>          /* Make sure fetching bh before accessing its members */
>>          smp_read_barrier_depends();
>>          next = bh->next;
>> -        if (!bh->deleted && bh->scheduled) {
>> -            bh->scheduled = 0;
>> +        sched = 0;
>> +        atomic_xchg(&bh->scheduled, sched);
>
> This is expensive.
>
>> +        if (!bh->deleted && sched) {
>> +            //bh->scheduled = 0;
>>              if (!bh->idle)
>>                  ret = 1;
>>              bh->idle = 0;
>> @@ -79,6 +82,9 @@ int aio_bh_poll(AioContext *ctx)
>>               */
>>              smp_rmb();
>>              bh->cb(bh->opaque);
>> +            if (bh->obj) {
>> +                object_unref(bh->obj);
>> +            }
>>          }
>>      }
>>
>> @@ -105,8 +111,12 @@ int aio_bh_poll(AioContext *ctx)
>>
>>  void qemu_bh_schedule_idle(QEMUBH *bh)
>>  {
>> -    if (bh->scheduled)
>> +    int sched = 1;
>> +
>> +    atomic_xchg( &bh->scheduled, sched);
>> +    if (sched) {
>>          return;
>> +    }
>>      /* Make sure any writes that are needed by the callback are done
>>       * before the locations are read in the aio_bh_poll.
>>       */
>> @@ -117,25 +127,46 @@ void qemu_bh_schedule_idle(QEMUBH *bh)
>>
>>  void qemu_bh_schedule(QEMUBH *bh)
>>  {
>> -    if (bh->scheduled)
>> +    int sched = 1;
>> +
>> +    atomic_xchg( &bh->scheduled, sched);
>> +    if (sched) {
>>          return;
>> +    }
>>      /* Make sure any writes that are needed by the callback are done
>>       * before the locations are read in the aio_bh_poll.
>>       */
>>      smp_wmb();
>>      bh->scheduled = 1;
>> +    if (bh->obj) {
>> +        object_ref(bh->obj);
>> +    }
>>      bh->idle = 0;
>>      aio_notify(bh->ctx);
>>  }
>>
>>  void qemu_bh_cancel(QEMUBH *bh)
>>  {
>> -    bh->scheduled = 0;
>> +    int sched = 0;
>> +
>> +    atomic_xchg( &bh->scheduled, sched);
>> +    if (sched) {
>> +        if (bh->obj) {
>> +            object_ref(bh->obj);
>> +        }
>> +    }
>>  }
>>
>>  void qemu_bh_delete(QEMUBH *bh)
>>  {
>> -    bh->scheduled = 0;
>> +    int sched = 0;
>> +
>> +    atomic_xchg( &bh->scheduled, sched);
>> +    if (sched) {
>> +        if (bh->obj) {
>> +            object_ref(bh->obj);
>> +        }
>> +    }
>>      bh->deleted = 1;
>>  }
>>
>> Regards,
>> Pingfan
>>>> The other thing I'm unclear on is the ->idle assignment followed
>>>> immediately by a ->scheduled assignment.  Without memory barriers
>>>> aio_bh_poll() isn't guaranteed to get an ordered view of these updates:
>>>> it may see an idle BH as a regular scheduled BH because ->idle is still
>>>> 0.
>>>
>>> Right.  You need to order ->idle writes before ->scheduled writes, and
>>> add memory barriers, or alternatively use two bits in ->scheduled so
>>> that you can assign both atomically.
>>>
>>> Paolo
>

