@@ -298,7 +298,7 @@ In most migration scenarios there is only a single data path that runs
from the source VM to the destination, typically along a single fd (although
possibly with another fd or similar for some fast way of throwing pages across).
-However, some uses need two way communication; in particular the Postcopy
+However, some uses need two way communication; in particular the RAM postcopy
destination needs to be able to request pages on demand from the source.
For these scenarios there is a 'return path' from the destination to the source;
@@ -321,32 +321,62 @@ the amount of migration traffic and time it takes, the down side is that during
the postcopy phase, a failure of *either* side or the network connection causes
the guest to be lost.
-In postcopy the destination CPUs are started before all the memory has been
+== Types of state data ==
+The state data to be migrated may be divided into three groups:
+ 1. Precopy only - data that _must_ be transferred before the destination
+    CPUs are started.
+ 2. Compatible - data that may be transferred in both the precopy and
+    postcopy phases (RAM).
+ 3. Postcopy only - data that _must_ be transferred after the destination
+    CPUs are started (dirty bitmaps).
+
+Note: any type of data may also be transferred in the stopped state, when
+both the source and destination are stopped.
+
+The postcopy phase starts after the destination CPUs are started (and after
+the stopped phase, of course), if the following conditions are met:
+ 1. Some postcopy migration capabilities are turned on.
+ 2. The current state data to be transferred is too large to be transferred
+    in the stopped state.
+ 3. The current precopy-only data is small enough to be transferred in the
+    stopped state.
+ 4. One of the following (or both):
+ 4a. Postcopy is forced by migrate_start_postcopy.
+ 4b. The state data which _may_ be transferred in precopy
+     (= precopy-only + compatible) is small enough to be transferred
+     in the stopped state.
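The interplay of these conditions can be sketched as follows. This is a
hypothetical illustration, not QEMU's actual code: the function and parameter
names are invented, and "budget" stands in for the amount of data that can be
transferred within the allowed downtime.

```python
def should_start_postcopy(postcopy_caps_on, precopy_only, compatible,
                          postcopy_only, stopped_state_budget,
                          postcopy_forced):
    """Sketch of the conditions above (hypothetical names, sizes in
    bytes; the budget is how much fits into the allowed downtime)."""
    # 1. Some postcopy capability must be enabled.
    if not postcopy_caps_on:
        return False
    # 2. All remaining data together is too large for the stopped state.
    if precopy_only + compatible + postcopy_only <= stopped_state_budget:
        return False
    # 3. Precopy-only data must fit into the stopped state.
    if precopy_only > stopped_state_budget:
        return False
    # 4a/4b. Postcopy is forced, or everything that may go in precopy
    # already fits in the stopped state, leaving only postcopy-only data
    # (e.g. dirty bitmaps) to transfer after the CPUs start.
    return postcopy_forced or (precopy_only + compatible
                               <= stopped_state_budget)
```

Note how 4b captures the "converged" case: precopy has done its job, and the
postcopy phase exists only to carry the postcopy-only data.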
+
+== migrate_start_postcopy ==
+
+Issuing the 'migrate_start_postcopy' command during precopy migration causes
+the transition from precopy to postcopy. It can be issued immediately after
+migration is started or at any time later on. Issuing it after the end of a
+migration is harmless. This command is not guaranteed to cause an immediate
+start of the destination CPUs and switch to postcopy (see the conditions
+above).
+
+Note: During the postcopy phase, the bandwidth limits set using
+migrate_set_speed are ignored.
+
+Most postcopy-related details are explained in the 'RAM Postcopy' section,
+as RAM postcopy was the first postcopy mechanism in QEMU and dictated the
+overall architecture.
+
+== RAM Postcopy ==
+In RAM postcopy the destination CPUs are started before all the memory has been
transferred, and accesses to pages that are yet to be transferred cause
a fault that's translated by QEMU into a request to the source QEMU.
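The fault-to-request flow can be sketched in miniature. This is a hypothetical
simplification, not QEMU's implementation (which uses the kernel's userfault
mechanism): `received`, `source_fetch_page` and `on_page_fault` are invented
names, and the 4 KiB page size is an assumption.

```python
# Destination-side demand paging, heavily simplified: a fault on a
# not-yet-received page becomes a request to the source, whose reply is
# placed into guest RAM before the vCPU resumes.
PAGE_SIZE = 4096         # assumed page size
received = {}            # page address -> page contents

def source_fetch_page(addr):
    # Stand-in for the source side reading the page from guest RAM and
    # sending it back over the migration stream.
    return bytes(PAGE_SIZE)

def on_page_fault(addr):
    """Called when a destination vCPU touches a missing page."""
    if addr not in received:
        # Ask the source over the return path and place the page.
        received[addr] = source_fetch_page(addr)
    return received[addr]
```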
-Postcopy can be combined with precopy (i.e. normal migration) so that if precopy
-doesn't finish in a given time the switch is made to postcopy.
+RAM postcopy can be combined with precopy (i.e. normal migration) so that if
+precopy doesn't finish in a given time the switch is made to postcopy.
-=== Enabling postcopy ===
+=== Enabling RAM postcopy ===
-To enable postcopy, issue this command on the monitor prior to the
+To enable RAM postcopy, issue this command on the monitor prior to the
start of migration:
migrate_set_capability x-postcopy-ram on
-The normal commands are then used to start a migration, which is still
-started in precopy mode. Issuing:
-
-migrate_start_postcopy
-
-will now cause the transition from precopy to postcopy.
-It can be issued immediately after migration is started or any
-time later on. Issuing it after the end of a migration is harmless.
-
-Note: During the postcopy phase, the bandwidth limits set using
-migrate_set_speed is ignored (to avoid delaying requested pages that
-the destination is waiting for).
+Then, to switch to postcopy, the 'migrate_start_postcopy' command may be
+used (see above).
=== Postcopy device transfer ===
@@ -482,3 +512,27 @@ request for a page that has already been sent is ignored. Duplicate requests
such as this can happen as a page is sent at about the same time the
destination accesses it.
+
+== Block dirty bitmaps postcopy ==
+
+Postcopy is a good fit for migrating dirty bitmaps, as they are not critical
+data: if postcopy fails, the bitmaps are simply dropped and the next backup
+will be a full one instead of an incremental one, with no worse consequences.
+
+Note that bitmap postcopy does not imply RAM postcopy: if only the
+postcopy-bitmaps migration capability is enabled, RAM is migrated as usual
+in precopy. Also, unlike RAM postcopy, block dirty bitmap migration does not
+use the return path.
+
+Dirty bitmap migration state data is postcopy-only (see above), so it is
+migrated only in the stopped state or in the postcopy phase.
+
+Only named dirty bitmaps associated with root nodes and non-root named nodes
+are migrated. If the destination QEMU already contains a dirty bitmap with
+the same name as a migrated bitmap (for the same node), the migration
+proceeds if their granularities match; otherwise an error is generated. If
+the destination QEMU does not contain such a bitmap, it is created.
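The destination-side rule above amounts to a three-way decision per bitmap.
A minimal sketch, with invented names (`accept_bitmap` and its parameters are
not QEMU identifiers):

```python
def accept_bitmap(existing, name, granularity):
    """Apply the rule above for one node: reuse a same-named bitmap if
    granularities match, error out if they differ, create it otherwise.
    'existing' maps bitmap name -> granularity on the destination."""
    if name in existing:
        if existing[name] != granularity:
            # Same name, different granularity: migration fails.
            raise ValueError("granularity mismatch for bitmap " + name)
        return "reused"
    # No such bitmap on the destination: create it.
    existing[name] = granularity
    return "created"
```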
+
+The migration protocol is specified (and implemented) in
+migration/block-dirty-bitmap.c.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---

This is a documentation draft for the feature. There is nothing here about
the bitmap migration protocol; it is documented in
migration/block-dirty-bitmap.c.

The capability name differs from the other patches: here it is
postcopy-bitmaps, while in the patches it is dirty-bitmaps. In the following
series it will be fixed to postcopy-bitmaps or x-postcopy-bitmaps (more like
RAM).

 docs/migration.txt | 90 +++++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 72 insertions(+), 18 deletions(-)