diff mbox

[v1,2/2] reduce qemu's heap Rss size from 12252kB to 2752KB

Message ID 53522014-8f70-9358-7d60-8b6c491c9611@redhat.com
State New
Headers show

Commit Message

Paolo Bonzini March 10, 2017, 9:14 a.m. UTC
On 10/03/2017 16:14, Yang Zhong wrote:
> There is no need to delete subregion and do memory begin/commit for
> subpage in memory_region_finalize().
> 
> This patch is from Anthony Xu <anthony.xu@intel.com>.
> 
> Signed-off-by: Yang Zhong <yang.zhong@intel.com>
> ---
>  memory.c | 13 +++++++------
>  1 file changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/memory.c b/memory.c
> index 284894b..3e9bfff 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -1505,13 +1505,14 @@ static void memory_region_finalize(Object *obj)
>       * and cause an infinite loop.
>       */
>      mr->enabled = false;
> -    memory_region_transaction_begin();
> -    while (!QTAILQ_EMPTY(&mr->subregions)) {
> -        MemoryRegion *subregion = QTAILQ_FIRST(&mr->subregions);
> -        memory_region_del_subregion(mr, subregion);
> +    if (!mr->subpage) {
> +        memory_region_transaction_begin();
> +        while (!QTAILQ_EMPTY(&mr->subregions)) {
> +            MemoryRegion *subregion = QTAILQ_FIRST(&mr->subregions);
> +            memory_region_del_subregion(mr, subregion);
> +        }
> +        memory_region_transaction_commit();

Subpages never have subregions, so the loop never runs.  The
begin/commit pair then becomes:

    ++memory_region_transaction_depth;
    --memory_region_transaction_depth;
    if (!memory_region_transaction_depth) {
        if (memory_region_update_pending) {
            ...
        } else if (ioeventfd_update_pending) {
            ...
        }
        // memory_region_clear_pending()
        memory_region_update_pending = false;
        ioeventfd_update_pending = false;
   }

If memory_region_transaction_depth is > 0 the begin/commit pair does
nothing.

But if memory_region_transaction_depth is == 0, there should be no
update pending because the loop has never run.  So I don't see what your
patch can change.

Of course there could be an update pending because of a bug elsewhere,
but I tried adding this patch:


and at least a basic qemu-system-x86_64 run started just fine.  So why
does this patch make a difference?

Paolo

>      }
> -    memory_region_transaction_commit();
> -

Comments

Xu, Anthony March 10, 2017, 5:40 p.m. UTC | #1
> -----Original Message-----

> From: Paolo Bonzini [mailto:pbonzini@redhat.com]

> Sent: Friday, March 10, 2017 1:14 AM

> To: Zhong, Yang <yang.zhong@intel.com>; qemu-devel@nongnu.org

> Cc: Xu, Anthony <anthony.xu@intel.com>; Peng, Chao P

> <chao.p.peng@intel.com>

> Subject: Re: [PATCH v1 2/2] reduce qemu's heap Rss size from 12252kB to

> 2752KB

> 

> 

> 

> On 10/03/2017 16:14, Yang Zhong wrote:

> > There is no need to delete subregion and do memory begin/commit for

> > subpage in memory_region_finalize().

> >

> > This patch is from Anthony Xu <anthony.xu@intel.com>.

> >

> > Signed-off-by: Yang Zhong <yang.zhong@intel.com>

> > ---

> >  memory.c | 13 +++++++------

> >  1 file changed, 7 insertions(+), 6 deletions(-)

> >

> > diff --git a/memory.c b/memory.c

> > index 284894b..3e9bfff 100644

> > --- a/memory.c

> > +++ b/memory.c

> > @@ -1505,13 +1505,14 @@ static void memory_region_finalize(Object

> *obj)

> >       * and cause an infinite loop.

> >       */

> >      mr->enabled = false;

> > -    memory_region_transaction_begin();

> > -    while (!QTAILQ_EMPTY(&mr->subregions)) {

> > -        MemoryRegion *subregion = QTAILQ_FIRST(&mr->subregions);

> > -        memory_region_del_subregion(mr, subregion);

> > +    if (!mr->subpage) {

> > +        memory_region_transaction_begin();

> > +        while (!QTAILQ_EMPTY(&mr->subregions)) {

> > +            MemoryRegion *subregion = QTAILQ_FIRST(&mr->subregions);

> > +            memory_region_del_subregion(mr, subregion);

> > +        }

> > +        memory_region_transaction_commit();

> 

> Subpages never have subregions, so the loop never runs.  The begin/commit

> pair then becomes:

> 

>     ++memory_region_transaction_depth;

>     --memory_region_transaction_depth;

>     if (!memory_region_transaction_depth) {

>         if (memory_region_update_pending) {

>             ...

>         } else if (ioeventfd_update_pending) {

>             ...

>         }

>         // memory_region_clear_pending()

>         memory_region_update_pending = false;

>         ioeventfd_update_pending = false;

>    }

> 

> If memory_region_transaction_depth is > 0 the begin/commit pair does

> nothing.

> 

> But if memory_region_transaction_depth is == 0, there should be no update

> pending because the loop has never run.  So I don't see what your patch can

> change.


As I mentioned in PATCH1, this patch is used to fix an issue after we remove
the global lock in RCU callback. After global lock is removed, other thread may 
set up update pending, so memory_region_transaction_commit 
may try to rebuild PhysPageMap even the loop doesn’t run, other thread may try to 
rebuild PhysPageMap at the same time, it is a race condition.
subpage MemoryRegion is a specific MemoryRegion, it doesn't belong to any address 
space, it is only used to handle subpage. We may use a new structure other than 
MemoryRegion to handle subpage to make the logic more clearer. After the change, 
RCU callback will not free any MemoryRegion. 


Anthony

> 

> Of course there could be an update pending because of a bug elsewhere,

> but I tried adding this patch:

> 

> diff --git a/memory.c b/memory.c

> index 284894b..2208a21 100644

> --- a/memory.c

> +++ b/memory.c

> @@ -903,6 +903,10 @@ static void

> address_space_update_topology(AddressSpace *as)  void

> memory_region_transaction_begin(void)

>  {

>      qemu_flush_coalesced_mmio_buffer();

> +    if (!memory_region_transaction_depth) {

> +        assert(!memory_region_update_pending);

> +        assert(!ioeventfd_update_pending);

> +    }

>      ++memory_region_transaction_depth;

>  }

> 

> and at least a basic qemu-system-x86_64 run started just fine.  So why does

> this patch make a difference?

> 

> Paolo

> 

> >      }

> > -    memory_region_transaction_commit();

> > -
Paolo Bonzini March 11, 2017, 5:04 p.m. UTC | #2
> > Subpages never have subregions, so the loop never runs.  The begin/commit
> > pair then becomes:
> > 
> >     ++memory_region_transaction_depth;
> >     --memory_region_transaction_depth;
> >     if (!memory_region_transaction_depth) {
> >         if (memory_region_update_pending) {
> >             ...
> >         } else if (ioeventfd_update_pending) {
> >             ...
> >         }
> >         // memory_region_clear_pending()
> >         memory_region_update_pending = false;
> >         ioeventfd_update_pending = false;
> >    }
> > 
> > If memory_region_transaction_depth is > 0 the begin/commit pair does
> > nothing.
> > 
> > But if memory_region_transaction_depth is == 0, there should be no update
> > pending because the loop has never run.  So I don't see what your patch can
> > change.
> 
> As I mentioned in PATCH1, this patch is used to fix an issue after we remove
> the global lock in RCU callback. After global lock is removed, other thread
> may set up update pending, so memory_region_transaction_commit
> may try to rebuild PhysPageMap even the loop doesn’t run, other thread may
> try to rebuild PhysPageMap at the same time, it is a race condition.
> subpage MemoryRegion is a specific MemoryRegion, it doesn't belong to any
> address space, it is only used to handle subpage. We may use a new structure
> other than MemoryRegion to handle subpage to make the logic more clearer. After
> the change, RCU callback will not free any MemoryRegion.

This is not true.  Try hot-unplugging a device.

I'm all for reducing the scope of the global QEMU lock, but this needs a plan
and a careful analysis of the involved data structures across _all_ instance_finalize
implementations.

Paolo
diff mbox

Patch

diff --git a/memory.c b/memory.c
index 284894b..2208a21 100644
--- a/memory.c
+++ b/memory.c
@@ -903,6 +903,10 @@  static void
address_space_update_topology(AddressSpace *as)
 void memory_region_transaction_begin(void)
 {
     qemu_flush_coalesced_mmio_buffer();
+    if (!memory_region_transaction_depth) {
+        assert(!memory_region_update_pending);
+        assert(!ioeventfd_update_pending);
+    }
     ++memory_region_transaction_depth;
 }