Patchwork [v2,4/4] coroutine: pool coroutines to speed up creation

Submitter Stefan Hajnoczi
Date May 12, 2011, 9:54 a.m.
Message ID <1305194086-9832-5-git-send-email-stefanha@linux.vnet.ibm.com>
Permalink /patch/95275/
State New

Comments

Stefan Hajnoczi - May 12, 2011, 9:54 a.m.
This patch speeds up coroutine creation by reusing freed coroutines.
When a coroutine terminates it is placed in the pool instead of having
its resources freed.  The next time a coroutine is created it can be
taken straight from the pool and requires no initialization.

Performance results on an Intel Core2 Duo T9400 (2.53GHz) for
./check-coroutine --benchmark-lifecycle 20000000:

  No pooling:    19.5 sec
  With pooling:   1.1 sec

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
---
 check-coroutine.c    |    2 ++
 qemu-coroutine-int.h |    2 ++
 qemu-coroutine.c     |   49 ++++++++++++++++++++++++++++++++++++++++++++++---
 qemu-coroutine.h     |    9 +++++++++
 vl.c                 |    2 ++
 5 files changed, 61 insertions(+), 3 deletions(-)
Kevin Wolf - May 12, 2011, 10:13 a.m.
On 12.05.2011 11:54, Stefan Hajnoczi wrote:
> [...]
> diff --git a/check-coroutine.c b/check-coroutine.c
> index 5a42c49..223c50c 100644
> --- a/check-coroutine.c
> +++ b/check-coroutine.c
> @@ -218,6 +218,8 @@ int main(int argc, char **argv)
>      };
>      int i;
>  
> +    qemu_coroutine_init();

Can we use module_init instead of adding an explicit call to main()?
This would prevent forgetting to add it in qemu-img and qemu-io like in
this patch.

Kevin
Stefan Hajnoczi - May 12, 2011, 10:22 a.m.
On Thu, May 12, 2011 at 11:13 AM, Kevin Wolf <kwolf@redhat.com> wrote:
> On 12.05.2011 11:54, Stefan Hajnoczi wrote:
>> [...]
>> diff --git a/check-coroutine.c b/check-coroutine.c
>> index 5a42c49..223c50c 100644
>> --- a/check-coroutine.c
>> +++ b/check-coroutine.c
>> @@ -218,6 +218,8 @@ int main(int argc, char **argv)
>>      };
>>      int i;
>>
>> +    qemu_coroutine_init();
>
> Can we use module_init instead of adding an explicit call to main()?
> This would prevent forgetting to add it in qemu-img and qemu-io like in
> this patch.

module_init what? :)  qemu-img/qemu-io only init MODULE_INIT_BLOCK so
we'd have to modify them anyway.

I don't want to add qemu-img/qemu-io things yet because we don't have
a block layer user for coroutines yet.  The qcow2 patches should
contain these changes.

Stefan
Kevin Wolf - May 12, 2011, 10:38 a.m.
On 12.05.2011 12:22, Stefan Hajnoczi wrote:
> On Thu, May 12, 2011 at 11:13 AM, Kevin Wolf <kwolf@redhat.com> wrote:
>> On 12.05.2011 11:54, Stefan Hajnoczi wrote:
>>> [...]
>>> diff --git a/check-coroutine.c b/check-coroutine.c
>>> index 5a42c49..223c50c 100644
>>> --- a/check-coroutine.c
>>> +++ b/check-coroutine.c
>>> @@ -218,6 +218,8 @@ int main(int argc, char **argv)
>>>      };
>>>      int i;
>>>
>>> +    qemu_coroutine_init();
>>
>> Can we use module_init instead of adding an explicit call to main()?
>> This would prevent forgetting to add it in qemu-img and qemu-io like in
>> this patch.
> 
> module_init what? :)  qemu-img/qemu-io only init MODULE_INIT_BLOCK so
> we'd have to modify them anyway.

Right... I was thinking of block, but in fact coroutines are not limited
to the block layer, so we would abuse it.

> I don't want to add qemu-img/qemu-io things yet because we don't have
> a block layer user for coroutines yet.  The qcow2 patches should
> contain these changes.

I hope we won't forget it. A missing atexit isn't a very obvious bug.

Kevin
Paolo Bonzini - May 12, 2011, 11:12 a.m.
On 05/12/2011 12:38 PM, Kevin Wolf wrote:
>> I don't want to add qemu-img/qemu-io things yet because we don't have
>> a block layer user for coroutines yet.  The qcow2 patches should
>> contain these changes.
>
> I hope we won't forget it. A missing atexit isn't a very obvious bug.

I was going to reply that this atexit is actually useless, since memory 
is freed automatically at exit?

Paolo
Stefan Hajnoczi - May 12, 2011, 11:15 a.m.
On Thu, May 12, 2011 at 12:12 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 05/12/2011 12:38 PM, Kevin Wolf wrote:
>>>
>>> I don't want to add qemu-img/qemu-io things yet because we don't have
>>> a block layer user for coroutines yet.  The qcow2 patches should
>>> contain these changes.
>>
>> I hope we won't forget it. A missing atexit isn't a very obvious bug.
>
> I was going to reply that this atexit is actually useless, since memory is
> freed automatically at exit?

It's just for completeness to make tools like valgrind happy.  Sure,
the kernel will reclaim memory and we're just burning CPU by freeing
this stuff ;).

Stefan
Paolo Bonzini - May 12, 2011, 11:18 a.m.
On 05/12/2011 01:15 PM, Stefan Hajnoczi wrote:
> It's just for completeness to make tools like valgrind happy.  Sure,
> the kernel will reclaim memory and we're just burning CPU by freeing
> this stuff ;).

But valgrind will not complain about reachable memory still allocated at 
exit, at least not with the default command-line options.

Paolo
Stefan Hajnoczi - May 12, 2011, 11:52 a.m.
On Thu, May 12, 2011 at 12:18 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 05/12/2011 01:15 PM, Stefan Hajnoczi wrote:
>>
>> It's just for completeness to make tools like valgrind happy.  Sure,
>> the kernel will reclaim memory and we're just burning CPU by freeing
>> this stuff ;).
>
> But valgrind will not complain about reachable memory still allocated at
> exit, at least not with the default command-line options.

/me is outsmarted by valgrind

Then I will remove pool freeing in v3.

Stefan

Patch

diff --git a/check-coroutine.c b/check-coroutine.c
index 5a42c49..223c50c 100644
--- a/check-coroutine.c
+++ b/check-coroutine.c
@@ -218,6 +218,8 @@ int main(int argc, char **argv)
     };
     int i;
 
+    qemu_coroutine_init();
+
     if (argc == 3 && strcmp(argv[1], "--benchmark-lifecycle") == 0) {
         benchmark_lifecycle(argv[2]);
         return EXIT_SUCCESS;
diff --git a/qemu-coroutine-int.h b/qemu-coroutine-int.h
index 71c6ee9..b264881 100644
--- a/qemu-coroutine-int.h
+++ b/qemu-coroutine-int.h
@@ -45,6 +45,8 @@ struct Coroutine {
     void *data;
     CoroutineEntry *entry;
 
+    QLIST_ENTRY(Coroutine) pool_next;
+
     jmp_buf env;
 };
 
diff --git a/qemu-coroutine.c b/qemu-coroutine.c
index 80bed14..cff5a4f 100644
--- a/qemu-coroutine.c
+++ b/qemu-coroutine.c
@@ -24,16 +24,35 @@ 
 #include "qemu-coroutine.h"
 #include "qemu-coroutine-int.h"
 
+enum {
+    /* Maximum free pool size prevents holding too many freed coroutines */
+    POOL_MAX_SIZE = 64,
+};
+
+static QLIST_HEAD(, Coroutine) pool = QLIST_HEAD_INITIALIZER(&pool);
+static unsigned int pool_size;
 static __thread Coroutine leader;
 static __thread Coroutine *current;
 
-static void coroutine_terminate(Coroutine *co)
+static void coroutine_delete(Coroutine *co)
 {
-    trace_qemu_coroutine_terminate(co);
     qemu_free(co->stack);
     qemu_free(co);
 }
 
+static void coroutine_terminate(Coroutine *co)
+{
+    trace_qemu_coroutine_terminate(co);
+
+    if (pool_size < POOL_MAX_SIZE) {
+        QLIST_INSERT_HEAD(&pool, co, pool_next);
+        co->caller = NULL;
+        pool_size++;
+    } else {
+        coroutine_delete(co);
+    }
+}
+
 static Coroutine *coroutine_new(void)
 {
     const size_t stack_size = 4 << 20;
@@ -49,7 +68,15 @@ Coroutine *qemu_coroutine_create(CoroutineEntry *entry)
 {
     Coroutine *co;
 
-    co = coroutine_new();
+    co = QLIST_FIRST(&pool);
+
+    if (co) {
+        QLIST_REMOVE(co, pool_next);
+        pool_size--;
+    } else {
+        co = coroutine_new();
+    }
+
     co->entry = entry;
 
     return co;
@@ -123,3 +150,19 @@ void *coroutine_fn qemu_coroutine_yield(void)
     self->caller = NULL;
     return coroutine_swap(self, to, NULL);
 }
+
+static void coroutine_free_pool(void)
+{
+    Coroutine *co;
+    Coroutine *tmp;
+
+    QLIST_FOREACH_SAFE(co, &pool, pool_next, tmp) {
+        coroutine_delete(co);
+    }
+    pool_size = 0;
+}
+
+void qemu_coroutine_init(void)
+{
+    atexit(coroutine_free_pool);
+}
diff --git a/qemu-coroutine.h b/qemu-coroutine.h
index b79b4bf..0fae0a4 100644
--- a/qemu-coroutine.h
+++ b/qemu-coroutine.h
@@ -17,6 +17,15 @@ 
 #include <stdbool.h>
 
 /**
+ * Initialize coroutine implementation
+ *
+ * This function should be called when the program is starting up.  It installs
+ * an atexit(3) handler which will free pooled coroutines when the program
+ * exits.  This keeps valgrind output clean.
+ */
+void qemu_coroutine_init(void);
+
+/**
  * Mark a function that executes in coroutine context
  *
  * Functions that execute in coroutine context cannot be called directly from
diff --git a/vl.c b/vl.c
index 6b9a2f6..7092c9f 100644
--- a/vl.c
+++ b/vl.c
@@ -145,6 +145,7 @@ int main(int argc, char **argv)
 #include "qemu-config.h"
 #include "qemu-objects.h"
 #include "qemu-options.h"
+#include "qemu-coroutine.h"
 #ifdef CONFIG_VIRTFS
 #include "fsdev/qemu-fsdev.h"
 #endif
@@ -1975,6 +1976,7 @@ int main(int argc, char **argv, char **envp)
     init_clocks();
 
     qemu_cache_utils_init(envp);
+    qemu_coroutine_init();
 
     QLIST_INIT (&vm_change_state_head);
     os_setup_early_signal_handling();