
[5/5] disk_deadlines: add info disk-deadlines option

Message ID 1441699228-25767-6-git-send-email-den@openvz.org
State New

Commit Message

Denis V. Lunev Sept. 8, 2015, 8 a.m. UTC
From: Raushaniya Maksudova <rmaksudova@virtuozzo.com>

This patch adds an "info disk-deadlines" qemu-monitor command that prints
a dump of all disk requests which caused a disk deadline in the guest OS
since the very start of the virtual machine:

   disk_id  type       size total_time        start_time
.--------------------------------------------------------
  ide0-hd1 FLUSH         0b 46.403s     22232930059574ns
  ide0-hd1 FLUSH         0b 57.591s     22451499241285ns
  ide0-hd1 FLUSH         0b 103.482s    22574100547397ns

Signed-off-by: Raushaniya Maksudova <rmaksudova@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Markus Armbruster <armbru@redhat.com>
CC: Luiz Capitulino <lcapitulino@redhat.com>
---
 block/disk-deadlines.c         | 85 +++++++++++++++++++++++++++++++++++++++++-
 blockdev.c                     |  3 +-
 hmp.c                          | 37 ++++++++++++++++++
 hmp.h                          |  1 +
 include/block/disk-deadlines.h |  4 +-
 monitor.c                      |  7 ++++
 qapi-schema.json               | 33 ++++++++++++++++
 7 files changed, 167 insertions(+), 3 deletions(-)

Comments

Eric Blake Sept. 8, 2015, 4:20 p.m. UTC | #1
On 09/08/2015 02:00 AM, Denis V. Lunev wrote:
> From: Raushaniya Maksudova <rmaksudova@virtuozzo.com>
> 
> This patch adds an "info disk-deadlines" qemu-monitor command that prints
> a dump of all disk requests which caused a disk deadline in the guest OS
> since the very start of the virtual machine:
> 
>    disk_id  type       size total_time        start_time
> .--------------------------------------------------------
>   ide0-hd1 FLUSH         0b 46.403s     22232930059574ns
>   ide0-hd1 FLUSH         0b 57.591s     22451499241285ns
>   ide0-hd1 FLUSH         0b 103.482s    22574100547397ns
> 
> Signed-off-by: Raushaniya Maksudova <rmaksudova@virtuozzo.com>
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Stefan Hajnoczi <stefanha@redhat.com>
> CC: Kevin Wolf <kwolf@redhat.com>
> CC: Markus Armbruster <armbru@redhat.com>
> CC: Luiz Capitulino <lcapitulino@redhat.com>
> ---

qapi interface review only:


> +++ b/qapi-schema.json
> @@ -3808,3 +3808,36 @@
>  
>  # Rocker ethernet network switch
>  { 'include': 'qapi/rocker.json' }
> +
> +## @DiskDeadlinesInfo
> +#
> +# Contains info about late requests which caused VM stopping
> +#
> +# @disk-id: name of disk (unique for each disk)

Mark this with '#optional', and maybe describe why it would be missing.
Does this correspond to the BDS node name where the deadline expired,
in which case 'node' might be a nicer name than 'disk-id'?

> +#
> +# @type: type of request could be READ, WRITE or FLUSH

Likewise for using #optional. Please make this an enum type, not an
open-coded string.
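
For illustration, such an enum might be sketched in qapi-schema.json roughly as
follows (a sketch only; the type name is illustrative, not from the patch):

```
##
# @DiskDeadlineRequestType:
#
# Type of the request that missed its deadline.
#
# Since: 2.5
##
{ 'enum': 'DiskDeadlineRequestType',
  'data': [ 'read', 'write', 'flush' ] }
```

The '*type' member of DiskDeadlinesInfo would then reference
'DiskDeadlineRequestType' instead of 'str', letting the QAPI generator reject
unknown values.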

> +#
> +# @size: size in bytes

of the failed request? Should you also mention which offset the failed
request started at?

> +#
> +# @total-time-ns: total time of request execution
> +#
> +# @start-time-ns: indicates the start of request execution
> +#
> +# Since: 2.5
> +##
> +{ 'struct': 'DiskDeadlinesInfo',
> +  'data'  : { '*disk-id': 'str',
> +              '*type': 'str',
> +              'size': 'uint64',
> +              'total-time-ns': 'uint64',
> +              'start-time-ns': 'uint64' } }
> +##
> +# @query-disk-deadlines:
> +#
> +# Returns information about last late disk requests.
> +#
> +# Returns: a list of @DiskDeadlinesInfo
> +#
> +# Since: 2.5
> +##
> +{ 'command': 'query-disk-deadlines', 'returns': ['DiskDeadlinesInfo'] }

Should it be possible to filter to deadlines missed for a specific node,
by having an optional node-name argument?
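
One possible shape for such a filtered command (a sketch; the argument name is
illustrative):

```
{ 'command': 'query-disk-deadlines',
  'data': { '*node': 'str' },
  'returns': ['DiskDeadlinesInfo'] }
```

With the argument omitted, the command would keep its current behavior and
return entries for all disks.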

Should any of the existing query-block or similar commands be modified
to make it obvious that there are missed deadline stats, and that it
would be useful to call query-disk-deadlines to learn more about them?
Eric Blake Sept. 8, 2015, 4:26 p.m. UTC | #2
On 09/08/2015 10:20 AM, Eric Blake wrote:
> On 09/08/2015 02:00 AM, Denis V. Lunev wrote:
>> From: Raushaniya Maksudova <rmaksudova@virtuozzo.com>
>>
>> This patch adds an "info disk-deadlines" qemu-monitor command that prints
>> a dump of all disk requests which caused a disk deadline in the guest OS
>> since the very start of the virtual machine:
>>

> 
> qapi interface review only:
> 

> Should it be possible to filter to deadlines missed for a specific node,
> by having an optional node-name argument?
> 
> Should any of the existing query-block or similar commands be modified
> to make it obvious that there are missed deadline stats, and that it
> would be useful to call query-disk-deadlines to learn more about them?

Also, should there be an event raised when a timeout occurs, so that
management doesn't have to poll this API?
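
Such an event might be declared along these lines (a sketch; the event name and
members are illustrative, not from the patch):

```
##
# @DISK_DEADLINE_EXPIRED:
#
# Emitted when a disk request misses its deadline and the VM is stopped.
#
# @disk-id: name of the disk with the late request
#
# Since: 2.5
##
{ 'event': 'DISK_DEADLINE_EXPIRED',
  'data': { 'disk-id': 'str' } }
```

Management software could then subscribe to the event and call
query-disk-deadlines only when it fires.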
Denis V. Lunev Sept. 10, 2015, 6:53 p.m. UTC | #3
On 09/08/2015 07:26 PM, Eric Blake wrote:
> On 09/08/2015 10:20 AM, Eric Blake wrote:
>> On 09/08/2015 02:00 AM, Denis V. Lunev wrote:
>>> From: Raushaniya Maksudova <rmaksudova@virtuozzo.com>
>>>
>>> This patch adds an "info disk-deadlines" qemu-monitor command that prints
>>> a dump of all disk requests which caused a disk deadline in the guest OS
>>> since the very start of the virtual machine:
>>>
>> qapi interface review only:
>>
>> Should it be possible to filter to deadlines missed for a specific node,
>> by having an optional node-name argument?
>>
>> Should any of the existing query-block or similar commands be modified
>> to make it obvious that there are missed deadline stats, and that it
>> would be useful to call query-disk-deadlines to learn more about them?
> Also, should there be an event raised when a timeout occurs, so that
> management doesn't have to poll this API?
>
ok, this seems like a nice addition
Denis V. Lunev Sept. 10, 2015, 7:13 p.m. UTC | #4
On 09/08/2015 07:20 PM, Eric Blake wrote:
> On 09/08/2015 02:00 AM, Denis V. Lunev wrote:
>> From: Raushaniya Maksudova <rmaksudova@virtuozzo.com>
>>
>> This patch adds an "info disk-deadlines" qemu-monitor command that prints
>> a dump of all disk requests which caused a disk deadline in the guest OS
>> since the very start of the virtual machine:
>>
>>     disk_id  type       size total_time        start_time
>> .--------------------------------------------------------
>>    ide0-hd1 FLUSH         0b 46.403s     22232930059574ns
>>    ide0-hd1 FLUSH         0b 57.591s     22451499241285ns
>>    ide0-hd1 FLUSH         0b 103.482s    22574100547397ns
>>
>> Signed-off-by: Raushaniya Maksudova <rmaksudova@virtuozzo.com>
>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>> CC: Stefan Hajnoczi <stefanha@redhat.com>
>> CC: Kevin Wolf <kwolf@redhat.com>
>> CC: Markus Armbruster <armbru@redhat.com>
>> CC: Luiz Capitulino <lcapitulino@redhat.com>
>> ---
> qapi interface review only:
>
>
>> +++ b/qapi-schema.json
>> @@ -3808,3 +3808,36 @@
>>   
>>   # Rocker ethernet network switch
>>   { 'include': 'qapi/rocker.json' }
>> +
>> +## @DiskDeadlinesInfo
>> +#
>> +# Contains info about late requests which caused VM stopping
>> +#
>> +# @disk-id: name of disk (unique for each disk)
> Mark this with '#optional', and maybe describe why it would be missing.
>   Does this correspond to the BDS node name where the deadline expired,
> in which case 'node' might be a nicer name than 'disk-id'?
As far as I understand the code, this is not the BDS node name. The
name is bound to the name of a hardware device, under which we can
have several block drivers at the moment. There is no query by this
name yet, and the identifier collected is good enough for us for now
to understand and debug the code.

The exact name is still to be defined, though. Originally I wanted to
bind it to the name of the device the deadlines are attached to, as it
would not be good to calculate them in each BDS.

Anyway, the exact meaning of this 'id' will be defined once we decide
on the proper attach point, whether that will be generic block code, a
filter driver, or something else.

Does this sound good to you? All your suggestions are welcome.

>> +#
>> +# @type: type of request could be READ, WRITE or FLUSH
> Likewise for using #optional. Please make this an enum type, not an
> open-coded string.
ok

>> +#
>> +# @size: size in bytes
> of the failed request? Should you also mention which offset the failed
> request started at?
I'll add this. This info is not accessible in the stats, which is why it
was not added. If the code becomes part of the block layer or lives in
a filter driver, that will not be a problem.

>> +#
>> +# @total-time-ns: total time of request execution
>> +#
>> +# @start-time-ns: indicates the start of request execution
>> +#
>> +# Since: 2.5
>> +##
>> +{ 'struct': 'DiskDeadlinesInfo',
>> +  'data'  : { '*disk-id': 'str',
>> +              '*type': 'str',
>> +              'size': 'uint64',
>> +              'total-time-ns': 'uint64',
>> +              'start-time-ns': 'uint64' } }
>> +##
>> +# @query-disk-deadlines:
>> +#
>> +# Returns information about last late disk requests.
>> +#
>> +# Returns: a list of @DiskDeadlinesInfo
>> +#
>> +# Since: 2.5
>> +##
>> +{ 'command': 'query-disk-deadlines', 'returns': ['DiskDeadlinesInfo'] }
> Should it be possible to filter to deadlines missed for a specific node,
> by having an optional node-name argument?
ok, this seems quite reasonable.

> Should any of the existing query-block or similar commands be modified
> to make it obvious that there are missed deadline stats, and that it
> would be useful to call query-disk-deadlines to learn more about them?
>
What do you think about also providing the list of pending requests via a
similar API? Would it be useful for others?

Den

Patch

diff --git a/block/disk-deadlines.c b/block/disk-deadlines.c
index acb44bc..6f76e4f 100644
--- a/block/disk-deadlines.c
+++ b/block/disk-deadlines.c
@@ -26,6 +26,7 @@ 
 #include "block/accounting.h"
 #include "sysemu/sysemu.h"
 #include "qemu/atomic.h"
+#include "qmp-commands.h"
 
 /*
  * Number of late requests which were not completed in time
@@ -41,6 +42,80 @@  typedef struct RequestInfo {
     int64_t expire_time;
 } RequestInfo;
 
+const char *types[] = {
+    "READ",
+    "WRITE",
+    "FLUSH",
+    "IOTYPE",
+};
+
+typedef struct Log {
+    GSList *head;
+    pthread_mutex_t mtx;
+} Log;
+
+Log ddinfo_list = {
+    NULL,
+    PTHREAD_MUTEX_INITIALIZER,
+};
+
+static void copy_disk_deadlines_info(DiskDeadlinesInfo *ddinfo_new,
+                                      DiskDeadlinesInfo *ddinfo_old)
+{
+    ddinfo_new->total_time_ns = ddinfo_old->total_time_ns;
+    ddinfo_new->start_time_ns = ddinfo_old->start_time_ns;
+    ddinfo_new->size = ddinfo_old->size;
+    ddinfo_new->type = g_strdup(ddinfo_old->type);
+    ddinfo_new->has_type = !!ddinfo_new->type;
+    ddinfo_new->disk_id = g_strdup(ddinfo_old->disk_id);
+    ddinfo_new->has_disk_id = !!ddinfo_new->disk_id;
+}
+
+static void fill_disk_deadlines_info(DiskDeadlinesInfo *ddinfo,
+                                     BlockAcctCookie *cookie,
+                                     DiskDeadlines *disk_deadlines)
+{
+    ddinfo->total_time_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) -
+                            cookie->start_time_ns;
+    ddinfo->start_time_ns = cookie->start_time_ns;
+    ddinfo->size = cookie->bytes;
+    ddinfo->type = g_strdup(types[cookie->type]);
+    ddinfo->has_type = !!ddinfo->type;
+    ddinfo->disk_id = g_strdup(disk_deadlines->disk_id);
+    ddinfo->has_disk_id = !!ddinfo->disk_id;
+}
+
+DiskDeadlinesInfoList *qmp_query_disk_deadlines(Error **errp)
+{
+    DiskDeadlinesInfoList *list = NULL, *entry;
+    DiskDeadlinesInfo *ddinfo;
+    GSList *curr = ddinfo_list.head;
+
+    pthread_mutex_lock(&ddinfo_list.mtx);
+    for (curr = ddinfo_list.head; curr; curr = g_slist_next(curr)) {
+        ddinfo = g_new(DiskDeadlinesInfo, 1);
+        copy_disk_deadlines_info(ddinfo, curr->data);
+
+        entry = g_new(DiskDeadlinesInfoList, 1);
+        entry->value = ddinfo;
+        entry->next = list;
+        list = entry;
+    }
+    pthread_mutex_unlock(&ddinfo_list.mtx);
+    return list;
+}
+
+static void log_disk_deadlines_info(BlockAcctCookie *cookie,
+                                    DiskDeadlines *disk_deadlines)
+{
+    DiskDeadlinesInfo *data = g_new(DiskDeadlinesInfo, 1);
+    fill_disk_deadlines_info(data, cookie, disk_deadlines);
+
+    pthread_mutex_lock(&ddinfo_list.mtx);
+    ddinfo_list.head = g_slist_prepend(ddinfo_list.head, data);
+    pthread_mutex_unlock(&ddinfo_list.mtx);
+}
+
 static gint compare(gconstpointer a, gconstpointer b)
 {
     return (int64_t)a - (int64_t)b;
@@ -122,7 +197,8 @@  static void disk_deadlines_callback(void *opaque)
     }
 }
 
-void disk_deadlines_init(DiskDeadlines *disk_deadlines, bool enabled)
+void disk_deadlines_init(DiskDeadlines *disk_deadlines, bool enabled,
+                         const char *disk_id)
 {
     disk_deadlines->enabled = enabled;
     if (!disk_deadlines->enabled) {
@@ -139,6 +215,7 @@  void disk_deadlines_init(DiskDeadlines *disk_deadlines, bool enabled)
 
     pthread_mutex_init(&disk_deadlines->mtx_tree, NULL);
     disk_deadlines->expired_tree = false;
+    disk_deadlines->disk_id = g_strdup(disk_id);
     disk_deadlines->request_timer = timer_new_ns(QEMU_CLOCK_REALTIME,
                                                  disk_deadlines_callback,
                                                  (void *)disk_deadlines);
@@ -165,6 +242,7 @@  void insert_request(DiskDeadlines *disk_deadlines, void *request)
 void remove_request(DiskDeadlines *disk_deadlines, void *request)
 {
     bool need_vmstart = false;
+    bool need_log_disk_deadline = false;
     RequestInfo request_info = {
         .cookie = request,
         .expire_time = 0,
@@ -176,6 +254,7 @@  void remove_request(DiskDeadlines *disk_deadlines, void *request)
     g_tree_remove(disk_deadlines->requests_tree,
                   (int64_t *)request_info.expire_time);
 
+    need_log_disk_deadline = disk_deadlines->expired_tree;
     /*
      * If tree is empty, but marked as expired, then one needs to
      * unset "expired_tree" flag and check whether VM can be resumed
@@ -194,4 +273,8 @@  void remove_request(DiskDeadlines *disk_deadlines, void *request)
     if (need_vmstart) {
         qemu_system_vmstart_request();
     }
+
+    if (need_log_disk_deadline) {
+        log_disk_deadlines_info(request, disk_deadlines);
+    }
 }
diff --git a/blockdev.c b/blockdev.c
index 6cd9c6e..9a38c43 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -561,7 +561,8 @@  static BlockBackend *blockdev_init(const char *file, QDict *bs_opts,
 
     bs->detect_zeroes = detect_zeroes;
 
-    disk_deadlines_init(&bs->stats.disk_deadlines, disk_deadlines);
+    disk_deadlines_init(&bs->stats.disk_deadlines, disk_deadlines,
+                        qemu_opts_id(opts));
 
     bdrv_set_on_error(bs, on_read_error, on_write_error);
 
diff --git a/hmp.c b/hmp.c
index 3f807b7..2c3660a 100644
--- a/hmp.c
+++ b/hmp.c
@@ -850,6 +850,43 @@  void hmp_info_tpm(Monitor *mon, const QDict *qdict)
     qapi_free_TPMInfoList(info_list);
 }
 
+static double nano_to_seconds(int64_t value)
+{
+    return ((double)value)/1000000000.0;
+}
+
+void hmp_info_disk_deadlines(Monitor *mon, const QDict *qdict)
+{
+    int i;
+    DiskDeadlinesInfoList *ddinfo_list, *curr;
+
+    ddinfo_list = qmp_query_disk_deadlines(NULL);
+    if (!ddinfo_list) {
+        monitor_printf(mon, "No disk deadlines occurred\n");
+        return;
+    }
+
+    monitor_printf(mon, "\n%10s %5s %10s %-10s %17s\n",
+                        "disk_id", "type", "size",
+                        "total_time", "start_time");
+
+    /* Print line-delimiter */
+    for (i = 0; i < 3; i++) {
+        monitor_printf(mon, "-------------------");
+    }
+
+    for (curr = ddinfo_list; curr != NULL; curr = curr->next) {
+        monitor_printf(mon, "\n%10s %5s %9"PRIu64"b %-6.3lfs %18"PRIu64"ns",
+                       curr->value->has_disk_id ? curr->value->disk_id : "",
+                       curr->value->has_type ? curr->value->type : "",
+                       curr->value->size,
+                       nano_to_seconds(curr->value->total_time_ns),
+                       curr->value->start_time_ns);
+    }
+    monitor_printf(mon, "\n");
+    qapi_free_DiskDeadlinesInfoList(ddinfo_list);
+}
+
 void hmp_quit(Monitor *mon, const QDict *qdict)
 {
     monitor_suspend(mon);
diff --git a/hmp.h b/hmp.h
index 81656c3..8fe0150 100644
--- a/hmp.h
+++ b/hmp.h
@@ -38,6 +38,7 @@  void hmp_info_spice(Monitor *mon, const QDict *qdict);
 void hmp_info_balloon(Monitor *mon, const QDict *qdict);
 void hmp_info_pci(Monitor *mon, const QDict *qdict);
 void hmp_info_block_jobs(Monitor *mon, const QDict *qdict);
+void hmp_info_disk_deadlines(Monitor *mon, const QDict *qdict);
 void hmp_info_tpm(Monitor *mon, const QDict *qdict);
 void hmp_info_iothreads(Monitor *mon, const QDict *qdict);
 void hmp_quit(Monitor *mon, const QDict *qdict);
diff --git a/include/block/disk-deadlines.h b/include/block/disk-deadlines.h
index 9672aff..d9b4143 100644
--- a/include/block/disk-deadlines.h
+++ b/include/block/disk-deadlines.h
@@ -34,12 +34,14 @@ 
 typedef struct DiskDeadlines {
     bool enabled;
     bool expired_tree;
+    char *disk_id;
     pthread_mutex_t mtx_tree;
     GTree *requests_tree;
     QEMUTimer *request_timer;
 } DiskDeadlines;
 
-void disk_deadlines_init(DiskDeadlines *disk_deadlines, bool enabled);
+void disk_deadlines_init(DiskDeadlines *disk_deadlines, bool enabled,
+                         const char *disk_id);
 void insert_request(DiskDeadlines *disk_deadlines, void *request);
 void remove_request(DiskDeadlines *disk_deadlines, void *request);
 
diff --git a/monitor.c b/monitor.c
index 5455ab9..065effa 100644
--- a/monitor.c
+++ b/monitor.c
@@ -2898,6 +2898,13 @@  static mon_cmd_t info_cmds[] = {
     },
 #endif
     {
+        .name       = "disk-deadlines",
+        .args_type  = "",
+        .params     = "",
+        .help       = "show dump of late disk requests",
+        .mhandler.cmd = hmp_info_disk_deadlines,
+    },
+    {
         .name       = NULL,
     },
 };
diff --git a/qapi-schema.json b/qapi-schema.json
index 67fef37..ffc1445 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -3808,3 +3808,36 @@ 
 
 # Rocker ethernet network switch
 { 'include': 'qapi/rocker.json' }
+
+## @DiskDeadlinesInfo
+#
+# Contains info about late requests which caused VM stopping
+#
+# @disk-id: name of disk (unique for each disk)
+#
+# @type: type of request could be READ, WRITE or FLUSH
+#
+# @size: size in bytes
+#
+# @total-time-ns: total time of request execution
+#
+# @start-time-ns: indicates the start of request execution
+#
+# Since: 2.5
+##
+{ 'struct': 'DiskDeadlinesInfo',
+  'data'  : { '*disk-id': 'str',
+              '*type': 'str',
+              'size': 'uint64',
+              'total-time-ns': 'uint64',
+              'start-time-ns': 'uint64' } }
+##
+# @query-disk-deadlines:
+#
+# Returns information about last late disk requests.
+#
+# Returns: a list of @DiskDeadlinesInfo
+#
+# Since: 2.5
+##
+{ 'command': 'query-disk-deadlines', 'returns': ['DiskDeadlinesInfo'] }