[v2,0/5] migration: Downtime tracepoints

Message ID 20231030163346.765724-1-peterx@redhat.com

Peter Xu Oct. 30, 2023, 4:33 p.m. UTC
v2:
- Added two more patches (patches 4 and 5) to add the checkpoints too.  This
  merges Joao's series into the tracepoints and extends them to dest QEMU.
  - Patch 5: Prefixed checkpoints with "src-" and "dst-"

This small series aims to improve QEMU's downtime analysis, similar to
what Joao proposed here:

  https://lore.kernel.org/r/20230926161841.98464-1-joao.m.martins@oracle.com

But with a few enhancements:

  - Nothing is exported to QAPI yet; everything is a tracepoint so far

  - Besides the major checkpoints, finer granularity is available through
    downtime measurements for each vmstate (in microseconds, to be
    accurate).

  - Trace dest QEMU too, for both the checkpoints and the vmsd load()s

For the last bullet: consider the case where a device's save() is very
fast while its load() is very slow.  Both contribute to the total
downtime, but not as a simple sum: while src QEMU is save()ing device1,
dst QEMU can logically be load()ing device2, so the two can run in
parallel.  The only way to figure out all the components of the downtime
is therefore to record both sides.
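
To illustrate why the two sides do not simply add up, here is a toy model
(not QEMU code).  It assumes dst can start load()ing a device as soon as src
has finished save()ing it and dst has finished the previous load(), and it
ignores transfer time.  The function name and the per-device timings are
made up for the example.

```python
def pipelined_downtime(save_us, load_us):
    """Toy model of overlapping save()/load(), all times in microseconds.

    Assumption: dst may load device i only after src has saved device i
    and dst has finished loading device i-1.  Network time is ignored.
    """
    save_done = 0  # time at which src finished saving the current device
    load_done = 0  # time at which dst finished loading the current device
    for s, l in zip(save_us, load_us):
        save_done += s
        # dst waits for whichever finishes later: the data or its own queue
        load_done = max(save_done, load_done) + l
    return load_done


# Hypothetical timings: fast save() everywhere, one very slow load()
save_us = [10, 10, 10]
load_us = [5, 500, 5]

total = pipelined_downtime(save_us, load_us)
naive = sum(save_us) + sum(load_us)
print(total, naive)  # 525 540
```

In this made-up case the slow load() dominates, and only the dst-side
measurements would reveal it.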

Please have a look, thanks.

Peter Xu (5):
  migration: Set downtime_start even for postcopy
  migration: Add migration_downtime_start|end() helpers
  migration: Add per vmstate downtime tracepoints
  migration: migration_stop_vm() helper
  migration: Add tracepoints for downtime checkpoints

 migration/migration.h  |  2 ++
 migration/migration.c  | 63 +++++++++++++++++++++++++++++++-----------
 migration/savevm.c     | 63 ++++++++++++++++++++++++++++++++++++------
 migration/trace-events |  4 ++-
 4 files changed, 106 insertions(+), 26 deletions(-)