
[v1,2/2] reduce qemu's heap RSS size from 12252KB to 2752KB

Message ID 4712D8F4B26E034E80552F30A67BE0B1A13045@ORSMSX112.amr.corp.intel.com
State New
Headers show

Commit Message

Xu, Anthony March 14, 2017, 5:14 a.m. UTC
> > > Subpages never have subregions, so the loop never runs.  The begin/commit
> > > pair then becomes:
> > >
> > >     ++memory_region_transaction_depth;
> > >     --memory_region_transaction_depth;
> > >     if (!memory_region_transaction_depth) {
> > >         if (memory_region_update_pending) {
> > >             ...
> > >         } else if (ioeventfd_update_pending) {
> > >             ...
> > >         }
> > >         // memory_region_clear_pending()
> > >         memory_region_update_pending = false;
> > >         ioeventfd_update_pending = false;
> > >     }
> > >
> > > If memory_region_transaction_depth is > 0 the begin/commit pair does
> > > nothing.

> > >

> > > But if memory_region_transaction_depth is == 0, there should be no update
> > > pending because the loop has never run.  So I don't see what your patch can
> > > change.

> >

> > As I mentioned in PATCH 1, this patch is used to fix an issue after we remove
> > the global lock in the RCU callback. After the global lock is removed, another
> > thread may set up a pending update, so memory_region_transaction_commit
> > may try to rebuild PhysPageMap even though the loop doesn't run, while another
> > thread tries to rebuild PhysPageMap at the same time; it is a race condition.
> > The subpage MemoryRegion is a special MemoryRegion: it doesn't belong to any
> > address space and is only used to handle subpages. We may use a new structure
> > other than MemoryRegion to handle subpages to make the logic clearer. After
> > the change, the RCU callback will not free any MemoryRegion.

> 

> This is not true.  Try hot-unplugging a device.


You are right: hot-unplugging does cause a memory region (not a subpage one) to be freed in the RCU thread.

I tried PCI device hot plug/unplug.
When plugging a device,
handle_hmp_command->qmp_device_add->qdev_device_add->
virtio_pci_dc_realize->pci_qdev_realize->virtio_pci_realize->
virtio_device_realize->virtio_bus_device_plugged->virtio_pci_device_plugged
are called.
When unplugging a device,
kvm_cpu_exec->memory_region_write_accessor->pci_write->
virtio_device_unrealize->virtio_pci_device_unplugged
are called.

Memory region addition and removal happen in virtio_pci_device_plugged
and virtio_pci_device_unplugged respectively. Memory region operations need
to acquire the global lock, but none of them happen in the RCU thread.

When memory_region_finalize is called, the memory region has already been removed
from the address space (in virtio_pci_device_unplugged), and both mr->subregions
and mr->coalesced are empty. That makes sense to me: by the time memory_region_finalize
is called, the memory region is no longer in use.
Please correct me if I'm wrong here; I only tried PCI device hot plug/unplug.

If the above assumption is correct, it seems we don't need the global lock for
memory region reclamation in the RCU thread. Please let me know if any other
memory reclamation in the RCU thread needs the global lock.

Under that assumption, I have the patch below to remove the global lock in the
RCU thread. I tested VM boot, reboot, shutdown, and PCI device hot plug/unplug.
Please review the patch.
If there are no further issues, I will send out an official patch later.

Thanks,
Anthony

> 

> I'm all for reducing the scope of the global QEMU lock, 

Thanks,


> but this needs a plan and a careful analysis of the involved data
> structures across _all_ instance_finalize implementations.

Agreed.

The functions below are registered as RCU callbacks:
address_space_dispatch_free
do_address_space_destroy
flatview_unref
reclaim_ramblock
qht_map_destroy
migration_bitmap_free

The first three are address space related and should work without the global lock
per the above analysis. The rest are very simple and don't seem to need the global lock.




> 

> Paolo

Comments

Paolo Bonzini March 14, 2017, 10:14 a.m. UTC | #1
On 14/03/2017 06:14, Xu, Anthony wrote:
> The functions below are registered as RCU callbacks:
> address_space_dispatch_free
> do_address_space_destroy
> flatview_unref
> reclaim_ramblock
> qht_map_destroy
> migration_bitmap_free
>
> The first three are address space related and should work without the global lock
> per the above analysis. The rest are very simple and don't seem to need the global lock.

flatview_unref can call object_unref and thus reach:

- all QOM instance_finalize callbacks

- all QOM property release callbacks

Of the QOM property release callbacks, the most important ones are
release_drive (which calls blockdev_auto_del and blk_detach_dev) and
release_chr (which calls qemu_chr_fe_deinit).

Your patch is incorrect, sorry.  If it were that simple, it would have
been done already...

Paolo

Patch

diff --git a/memory.c b/memory.c
index 6c58373..43e06e9 100644
--- a/memory.c
+++ b/memory.c
@@ -1503,15 +1503,9 @@  static void memory_region_finalize(Object *obj)
      * and cause an infinite loop.
      */
     mr->enabled = false;
-    memory_region_transaction_begin();
-    while (!QTAILQ_EMPTY(&mr->subregions)) {
-        MemoryRegion *subregion = QTAILQ_FIRST(&mr->subregions);
-        memory_region_del_subregion(mr, subregion);
-    }
-    memory_region_transaction_commit();
-
+    assert(QTAILQ_EMPTY(&mr->subregions));
     mr->destructor(mr);
-    memory_region_clear_coalescing(mr);
+    assert(QTAILQ_EMPTY(&mr->coalesced));
     g_free((char *)mr->name);
     g_free(mr->ioeventfds);
 }
diff --git a/util/rcu.c b/util/rcu.c
index 9adc5e4..51e0248 100644
--- a/util/rcu.c
+++ b/util/rcu.c
@@ -254,24 +254,20 @@  static void *call_rcu_thread(void *opaque)

         atomic_sub(&rcu_call_count, n);
         synchronize_rcu();
-        qemu_mutex_lock_iothread();
         while (n > 0) {
             node = try_dequeue();
             while (!node) {
-                qemu_mutex_unlock_iothread();
                 qemu_event_reset(&rcu_call_ready_event);
                 node = try_dequeue();
                 if (!node) {
                     qemu_event_wait(&rcu_call_ready_event);
                     node = try_dequeue();
                 }
-                qemu_mutex_lock_iothread();
             }

             n--;
             node->func(node);
         }
-        qemu_mutex_unlock_iothread();
     }
     abort();
 }