[COLO-Frame,v16,21/35] qmp event: Add COLO_EXIT event to notify users while exited from COLO

Message ID 1460096797-14916-22-git-send-email-zhang.zhanghailiang@huawei.com
State New

Commit Message

Zhanghailiang April 8, 2016, 6:26 a.m. UTC
If some errors happen during VM's COLO FT stage, it's important to notify the users
of this event. Together with 'x_colo_lost_heartbeat', users can intervene in COLO's
failover work immediately.
If users don't want to get involved in COLO's failover verdict,
it is still necessary to notify users that we exited COLO mode.

Cc: Markus Armbruster <armbru@redhat.com>
Cc: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
---
v13:
- Remove optional 'error' string for this event.
  (I doubted it was useful for users, since users shouldn't
   interpret it and can't depend on it to decide what happened
   exactly. Besides, it is really hard to organize.)
- Remove unused 'unknown' member for enum COLOExitReason.
  (Eric's suggestion)
- Fix comment for COLO_EXIT
v11:
- Fix several typos found by Eric
---
 docs/qmp-events.txt | 16 ++++++++++++++++
 migration/colo.c    | 20 ++++++++++++++++++++
 qapi-schema.json    | 14 ++++++++++++++
 qapi/event.json     | 15 +++++++++++++++
 4 files changed, 65 insertions(+)

Comments

Eric Blake April 22, 2016, 2:25 p.m. UTC | #1
On 04/08/2016 12:26 AM, zhanghailiang wrote:
> If some errors happen during VM's COLO FT stage, it's important to notify the users
> of this event. Together with 'x_colo_lost_heartbeat', users can intervene in COLO's
> failover work immediately.
> If users don't want to get involved in COLO's failover verdict,
> it is still necessary to notify users that we exited COLO mode.
> 
> Cc: Markus Armbruster <armbru@redhat.com>
> Cc: Michael Roth <mdroth@linux.vnet.ibm.com>
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> ---

> +++ b/migration/colo.c
> @@ -18,6 +18,7 @@
>  #include "qemu/error-report.h"
>  #include "qapi/error.h"
>  #include "migration/failover.h"
> +#include "qapi-event.h"
>  
>  /* colo buffer */
>  #define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)
> @@ -368,6 +369,18 @@ out:
>      if (local_err) {
>          error_report_err(local_err);
>      }
> +    /*
> +    * There are only two reasons we can go here, something error happened,
> +    * Or users triggered failover.

s/something/some/
s/Or users/or the user/

Otherwise looks fine. As comments are minor fixes,
Reviewed-by: Eric Blake <eblake@redhat.com>
Zhanghailiang April 25, 2016, 9:33 a.m. UTC | #2
On 2016/4/22 22:25, Eric Blake wrote:
> On 04/08/2016 12:26 AM, zhanghailiang wrote:
>> If some errors happen during VM's COLO FT stage, it's important to notify the users
>> of this event. Together with 'x_colo_lost_heartbeat', users can intervene in COLO's
>> failover work immediately.
>> If users don't want to get involved in COLO's failover verdict,
>> it is still necessary to notify users that we exited COLO mode.
>>
>> Cc: Markus Armbruster <armbru@redhat.com>
>> Cc: Michael Roth <mdroth@linux.vnet.ibm.com>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>> ---
>
>> +++ b/migration/colo.c
>> @@ -18,6 +18,7 @@
>>   #include "qemu/error-report.h"
>>   #include "qapi/error.h"
>>   #include "migration/failover.h"
>> +#include "qapi-event.h"
>>
>>   /* colo buffer */
>>   #define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)
>> @@ -368,6 +369,18 @@ out:
>>       if (local_err) {
>>           error_report_err(local_err);
>>       }
>> +    /*
>> +    * There are only two reasons we can go here, something error happened,
>> +    * Or users triggered failover.
>
> s/something/some/
> s/Or users/or the user/
>
> Otherwise looks fine. As comments are minor fixes,
> Reviewed-by: Eric Blake <eblake@redhat.com>
>

Thanks very much, I'll fix them in next version.

Hailiang
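
Both "out:" paths in migration/colo.c choose the exit reason the same way: an
active failover request means the exit was requested, and anything else is
reported as an error. A minimal standalone sketch of that mapping follows. The
Demo* types and demo_* helpers are hypothetical stand-ins so the snippet
compiles outside the QEMU tree; they are not the QAPI-generated COLOMode and
COLOExitReason types or qapi_event_send_colo_exit(), and the formatted string
only mimics the "data" member from the docs/qmp-events.txt example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the QAPI-generated enums (assumption). */
typedef enum { DEMO_MODE_PRIMARY, DEMO_MODE_SECONDARY } DemoColoMode;
typedef enum { DEMO_REASON_REQUEST, DEMO_REASON_ERROR } DemoExitReason;

/*
 * An active failover request means the exit was asked for;
 * every other way of reaching the exit path counts as an error.
 */
static DemoExitReason demo_exit_reason(bool failover_active)
{
    return failover_active ? DEMO_REASON_REQUEST : DEMO_REASON_ERROR;
}

static const char *demo_mode_str(DemoColoMode mode)
{
    return mode == DEMO_MODE_PRIMARY ? "primary" : "secondary";
}

static const char *demo_reason_str(DemoExitReason reason)
{
    return reason == DEMO_REASON_REQUEST ? "request" : "error";
}

/*
 * Format an illustrative "data" payload into buf.
 * Returns the value of snprintf(), i.e. the untruncated length.
 */
static int demo_format_colo_exit(char *buf, size_t len,
                                 DemoColoMode mode, bool failover_active)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "{\"mode\": \"%s\", \"reason\": \"%s\"}",
                    demo_mode_str(mode),
                    demo_reason_str(demo_exit_reason(failover_active)));
}
```

With the mapping in one helper, each exit path would need a single call that
passes its own mode and the current failover state, instead of repeating the
if/else pair.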
Patch

diff --git a/docs/qmp-events.txt b/docs/qmp-events.txt
index fa7574d..77c634e 100644
--- a/docs/qmp-events.txt
+++ b/docs/qmp-events.txt
@@ -184,6 +184,22 @@  Example:
 Note: The "ready to complete" status is always reset by a BLOCK_JOB_ERROR
 event.
 
+COLO_EXIT
+---------
+
+Emitted when VM finishes COLO mode due to some errors happening or
+at the request of users.
+
+Data:
+
+ - "mode": COLO mode, primary or secondary side (json-string)
+ - "reason": the exit reason, internal error or external request. (json-string)
+
+Example:
+
+{"timestamp": {"seconds": 2032141960, "microseconds": 417172},
+ "event": "COLO_EXIT", "data": {"mode": "primary", "reason": "request" } }
+
 DEVICE_DELETED
 --------------
 
diff --git a/migration/colo.c b/migration/colo.c
index b3d88ef..42f7983 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -18,6 +18,7 @@ 
 #include "qemu/error-report.h"
 #include "qapi/error.h"
 #include "migration/failover.h"
+#include "qapi-event.h"
 
 /* colo buffer */
 #define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)
@@ -368,6 +369,18 @@  out:
     if (local_err) {
         error_report_err(local_err);
     }
+    /*
+    * There are only two reasons we can go here, something error happened,
+    * Or users triggered failover.
+    */
+    if (!failover_request_is_active()) {
+        qapi_event_send_colo_exit(COLO_MODE_PRIMARY,
+                                  COLO_EXIT_REASON_ERROR, NULL);
+    } else {
+        qapi_event_send_colo_exit(COLO_MODE_PRIMARY,
+                                  COLO_EXIT_REASON_REQUEST, NULL);
+    }
+
     qsb_free(buffer);
     buffer = NULL;
 
@@ -531,6 +544,13 @@  out:
     if (local_err) {
         error_report_err(local_err);
     }
+    if (!failover_request_is_active()) {
+        qapi_event_send_colo_exit(COLO_MODE_SECONDARY,
+                                  COLO_EXIT_REASON_ERROR, NULL);
+    } else {
+        qapi_event_send_colo_exit(COLO_MODE_SECONDARY,
+                                  COLO_EXIT_REASON_REQUEST, NULL);
+    }
 
     if (fb) {
         qemu_fclose(fb);
diff --git a/qapi-schema.json b/qapi-schema.json
index 0c348fb..719fcde 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -776,6 +776,20 @@ 
   'data': [ 'unknown', 'primary', 'secondary'] }
 
 ##
+# @COLOExitReason
+#
+# The reason for a COLO exit
+#
+# @request: COLO exit is due to an external request
+#
+# @error: COLO exit is due to an internal error
+#
+# Since: 2.7
+##
+{ 'enum': 'COLOExitReason',
+  'data': [ 'request', 'error' ] }
+
+##
 # @x-colo-lost-heartbeat
 #
 # Tell qemu that heartbeat is lost, request it to do takeover procedures.
diff --git a/qapi/event.json b/qapi/event.json
index 8642052..09542fa 100644
--- a/qapi/event.json
+++ b/qapi/event.json
@@ -268,6 +268,21 @@ 
   'data': { 'pass': 'int' } }
 
 ##
+# @COLO_EXIT
+#
+# Emitted when VM finishes COLO mode due to some errors happening or
+# at the request of users.
+#
+# @mode: which COLO mode the VM was in when it exited.
+#
+# @reason: describes the reason for the COLO exit.
+#
+# Since: 2.7
+##
+{ 'event': 'COLO_EXIT',
+  'data': {'mode': 'COLOMode', 'reason': 'COLOExitReason' } }
+
+##
 # @ACPI_DEVICE_OST
 #
 # Emitted when guest executes ACPI _OST method.