Patchwork Fix performance regression in qemu_get_ram_ptr

Submitter Vincent Palatin
Date March 10, 2011, 8:47 p.m.
Message ID <1299790066-768-1-git-send-email-vpalatin@chromium.org>
Permalink /patch/86335/
State New

Comments

Vincent Palatin - March 10, 2011, 8:47 p.m.
When commit f471a17e9d869df3c6573f7ec02c4725676d6f3a converted the
ram_blocks structure to QLIST, it also removed the conditional check that
guarded moving the current block to the beginning of the list.

In the common use case where ram_blocks has a few blocks and only one of
them is frequently accessed (the main RAM), this hurts performance: the
needless list operations are performed on every call, and this function is
on a really hot path.

On my machine emulation (ARM on amd64), this patch reduces the
percentage of CPU time spent in qemu_get_ram_ptr from 6.3% to 2.1% in the
profiling of a full boot.

Signed-off-by: Vincent Palatin <vpalatin@chromium.org>
---
 exec.c |    7 +++++--
 1 files changed, 5 insertions(+), 2 deletions(-)
Alex Williamson - March 10, 2011, 9:14 p.m.
On Thu, 2011-03-10 at 15:47 -0500, Vincent Palatin wrote:
> When commit f471a17e9d869df3c6573f7ec02c4725676d6f3a converted the
> ram_blocks structure to QLIST, it also removed the conditional check that
> guarded moving the current block to the beginning of the list.
> 
> In the common use case where ram_blocks has a few blocks and only one of
> them is frequently accessed (the main RAM), this hurts performance: the
> needless list operations are performed on every call, and this function is
> on a really hot path.
> 
> On my machine emulation (ARM on amd64), this patch reduces the
> percentage of CPU time spent in qemu_get_ram_ptr from 6.3% to 2.1% in the
> profiling of a full boot.
> 
> Signed-off-by: Vincent Palatin <vpalatin@chromium.org>
> ---
>  exec.c |    7 +++++--
>  1 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/exec.c b/exec.c
> index d611100..81f08b7 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -2957,8 +2957,11 @@ void *qemu_get_ram_ptr(ram_addr_t addr)
>  
>      QLIST_FOREACH(block, &ram_list.blocks, next) {
>          if (addr - block->offset < block->length) {
> -            QLIST_REMOVE(block, next);
> -            QLIST_INSERT_HEAD(&ram_list.blocks, block, next);
> +            /* Move this entry to the start of the list.  */
> +            if (block != QLIST_FIRST(&ram_list.blocks)) {
> +                QLIST_REMOVE(block, next);
> +                QLIST_INSERT_HEAD(&ram_list.blocks, block, next);
> +            }
>              return block->host + (addr - block->offset);
>          }
>      }

Looks good

Acked-by: Alex Williamson <alex.williamson@redhat.com>
Chris Wright - March 10, 2011, 9:52 p.m.
* Vincent Palatin (vpalatin@chromium.org) wrote:
> When commit f471a17e9d869df3c6573f7ec02c4725676d6f3a converted the
> ram_blocks structure to QLIST, it also removed the conditional check that
> guarded moving the current block to the beginning of the list.

Nice catch.

> In the common use case where ram_blocks has a few blocks and only one of
> them is frequently accessed (the main RAM), this hurts performance: the
> needless list operations are performed on every call, and this function is
> on a really hot path.
> 
> On my machine emulation (ARM on amd64), this patch reduces the
> percentage of CPU time spent in qemu_get_ram_ptr from 6.3% to 2.1% in the
> profiling of a full boot.

Hopefully this is back on par with before the QLIST switchover.

> Signed-off-by: Vincent Palatin <vpalatin@chromium.org>

Acked-by: Chris Wright <chrisw@redhat.com>

> ---
>  exec.c |    7 +++++--
>  1 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/exec.c b/exec.c
> index d611100..81f08b7 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -2957,8 +2957,11 @@ void *qemu_get_ram_ptr(ram_addr_t addr)
>  
>      QLIST_FOREACH(block, &ram_list.blocks, next) {
>          if (addr - block->offset < block->length) {
> -            QLIST_REMOVE(block, next);
> -            QLIST_INSERT_HEAD(&ram_list.blocks, block, next);
> +            /* Move this entry to the start of the list.  */
> +            if (block != QLIST_FIRST(&ram_list.blocks)) {
> +                QLIST_REMOVE(block, next);
> +                QLIST_INSERT_HEAD(&ram_list.blocks, block, next);
> +            }

Pretty close to self-documenting code now.  Not sure if it's subtle enough
to warrant changing the comment to something like:
 
  /* Move block to head of list if it's not there already */

thanks,
-chris
Anthony Liguori - March 10, 2011, 11:17 p.m.
On 03/10/2011 02:47 PM, Vincent Palatin wrote:
> When commit f471a17e9d869df3c6573f7ec02c4725676d6f3a converted the
> ram_blocks structure to QLIST, it also removed the conditional check that
> guarded moving the current block to the beginning of the list.
>
> In the common use case where ram_blocks has a few blocks and only one of
> them is frequently accessed (the main RAM), this hurts performance: the
> needless list operations are performed on every call, and this function is
> on a really hot path.
>
> On my machine emulation (ARM on amd64), this patch reduces the
> percentage of CPU time spent in qemu_get_ram_ptr from 6.3% to 2.1% in the
> profiling of a full boot.
>
> Signed-off-by: Vincent Palatin <vpalatin@chromium.org>

Applied.  Thanks.

Regards,

Anthony Liguori

> ---
>   exec.c |    7 +++++--
>   1 files changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/exec.c b/exec.c
> index d611100..81f08b7 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -2957,8 +2957,11 @@ void *qemu_get_ram_ptr(ram_addr_t addr)
>
>       QLIST_FOREACH(block, &ram_list.blocks, next) {
>           if (addr - block->offset < block->length) {
> -            QLIST_REMOVE(block, next);
> -            QLIST_INSERT_HEAD(&ram_list.blocks, block, next);
> +            /* Move this entry to the start of the list.  */
> +            if (block != QLIST_FIRST(&ram_list.blocks)) {
> +                QLIST_REMOVE(block, next);
> +                QLIST_INSERT_HEAD(&ram_list.blocks, block, next);
> +            }
>               return block->host + (addr - block->offset);
>           }
>       }

Patch

diff --git a/exec.c b/exec.c
index d611100..81f08b7 100644
--- a/exec.c
+++ b/exec.c
@@ -2957,8 +2957,11 @@  void *qemu_get_ram_ptr(ram_addr_t addr)
 
     QLIST_FOREACH(block, &ram_list.blocks, next) {
         if (addr - block->offset < block->length) {
-            QLIST_REMOVE(block, next);
-            QLIST_INSERT_HEAD(&ram_list.blocks, block, next);
+            /* Move this entry to the start of the list.  */
+            if (block != QLIST_FIRST(&ram_list.blocks)) {
+                QLIST_REMOVE(block, next);
+                QLIST_INSERT_HEAD(&ram_list.blocks, block, next);
+            }
             return block->host + (addr - block->offset);
         }
     }
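
For readers without the QEMU tree at hand, the guarded move-to-front lookup in the patch can be sketched as a standalone example. This is only an illustration under stated assumptions, not the QEMU code: `Block`, `BlockList`, `list_insert_head`, `list_remove` and `lookup` are hypothetical stand-ins for RAMBlock, ram_list and the QLIST macros, the `moves` counter exists only to make the saved work visible, and returning NULL on a miss is a choice made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for QEMU's RAMBlock; only the fields used here. */
typedef struct Block {
    uint64_t offset;      /* first address covered by the block */
    uint64_t length;      /* size of the block in bytes */
    uint8_t *host;        /* host memory backing the block */
    struct Block *prev;
    struct Block *next;
} Block;

/* Hypothetical stand-in for ram_list. */
typedef struct BlockList {
    Block *first;
    unsigned long moves;  /* number of relinks, for illustration only */
} BlockList;

static void list_insert_head(BlockList *list, Block *b)
{
    b->prev = NULL;
    b->next = list->first;
    if (list->first) {
        list->first->prev = b;
    }
    list->first = b;
}

static void list_remove(BlockList *list, Block *b)
{
    if (b->prev) {
        b->prev->next = b->next;
    } else {
        list->first = b->next;
    }
    if (b->next) {
        b->next->prev = b->prev;
    }
    b->prev = b->next = NULL;
}

/* Returns the host pointer for addr, or NULL if no block covers it. */
static uint8_t *lookup(BlockList *list, uint64_t addr)
{
    assert(list != NULL);
    for (Block *b = list->first; b != NULL; b = b->next) {
        /* Unsigned subtraction wraps around when addr < offset, so this
         * single comparison checks both the lower and the upper bound. */
        if (addr - b->offset < b->length) {
            /* Relink only when the block is not already at the head. */
            if (b != list->first) {
                list_remove(list, b);
                list_insert_head(list, b);
                list->moves++;
            }
            return b->host + (addr - b->offset);
        }
    }
    return NULL;
}
```

With two blocks in the list, repeated lookups that hit the block already at the head leave `moves` unchanged, which is the case the patch optimizes for; only a hit on a different block pays for the remove and insert.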