Patchwork [v2,14/45] block: introduce block job error

login
register
mail settings
Submitter Paolo Bonzini
Date Sept. 26, 2012, 3:56 p.m.
Message ID <1348675011-8794-15-git-send-email-pbonzini@redhat.com>
Download mbox | patch
Permalink /patch/187166/
State New
Headers show

Comments

Paolo Bonzini - Sept. 26, 2012, 3:56 p.m.
The following behaviors are possible:

'report': The behavior is the same as in 1.1.  An I/O error,
respectively during a read or a write, will complete the job immediately
with an error code.

'ignore': An I/O error, respectively during a read or a write, will be
ignored.  For streaming, the job will complete with an error and the
backing file will be left in place.  For mirroring, the sector will be
marked again as dirty and re-examined later.

'stop': The job will be paused and the job iostatus will be set to
failed or nospace, while the VM will keep running.  This can only be
specified if the block device has rerror=stop and werror=stop or enospc.

'enospc': Behaves as 'stop' for ENOSPC errors, 'report' for others.

In all cases, even for 'report', the I/O error is reported as a QMP
event BLOCK_JOB_ERROR, with the same arguments as BLOCK_IO_ERROR.

It is possible that while stopping the VM a BLOCK_IO_ERROR event will be
reported and will clobber the event from BLOCK_JOB_ERROR, or vice versa.
This is not really avoidable since stopping the VM completes all pending
I/O requests.  In fact, it is already possible now that a series of
BLOCK_IO_ERROR events are reported with rerror=stop, because vm_stop
calls bdrv_drain_all and this can generate further errors.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
        v1->v2: introduced block_job_iostatus_reset.  Removed sorting
        of iostatus values with "failed" overriding "nospace" but not
        vice versa.  Documented that block-job-resume clears the
        iostatus field.  Always set errors on the block job even if
        they happen on the target; this removes the need to expose
        the target's BlockInfo in "query-blockjobs".

 QMP/qmp-events.txt | 22 ++++++++++++++++++++
 block.c            |  9 ++++----
 block_int.h        |  4 ++++
 blockjob.c         | 61 ++++++++++++++++++++++++++++++++++++++++++++++++------
 blockjob.h         | 25 ++++++++++++++++++++++
 monitor.c          |  1 +
 monitor.h          |  1 +
 qapi-schema.json   |  7 ++++++-
 8 file modificati, 119 inserzioni(+), 11 rimozioni(-)
Eric Blake - Sept. 26, 2012, 7:10 p.m.
On 09/26/2012 09:56 AM, Paolo Bonzini wrote:
> The following behaviors are possible:
> 
> 'report': The behavior is the same as in 1.1.  An I/O error,
> respectively during a read or a write, will complete the job immediately
> with an error code.
> 
> 'ignore': An I/O error, respectively during a read or a write, will be
> ignored.  For streaming, the job will complete with an error and the
> backing file will be left in place.  For mirroring, the sector will be
> marked again as dirty and re-examined later.
> 
> 'stop': The job will be paused and the job iostatus will be set to
> failed or nospace, while the VM will keep running.  This can only be
> specified if the block device has rerror=stop and werror=stop or enospc.
> 
> 'enospc': Behaves as 'stop' for ENOSPC errors, 'report' for others.
> 

> +Emitted when a block job encounters an error.
> +
> +Data:
> +
> +- "device": device name (json-string)
> +- "operation": I/O operation (json-string, "read" or "write")

For symmetry with BLOCK_JOB_{CANCELLED,COMPLETED}, you also need:
- "type":     Job type ("stream" for image streaming, json-string)

Libvirt would like to key off of the 'type' field for all three events.
 Besides, if management issues several block commands in a row, and only
then starts processing the pending event queue, it would be nice to know
whether the error stemmed from a 'stream', 'mirror', or (when combined
with Jeff's patches) 'commit' job.


> +++ b/qapi-schema.json
> @@ -1127,11 +1127,14 @@
>  #
>  # @speed: the rate limit, bytes per second
>  #
> +# @io-status: the status of the job (since 1.2)

1.3
Eric Blake - Sept. 26, 2012, 7:27 p.m.
On 09/26/2012 01:10 PM, Eric Blake wrote:

>> +Emitted when a block job encounters an error.
>> +
>> +Data:
>> +
>> +- "device": device name (json-string)
>> +- "operation": I/O operation (json-string, "read" or "write")
> 
> For symmetry with BLOCK_JOB_{CANCELLED,COMPLETED}, you also need:
> - "type":     Job type ("stream" for image streaming, json-string)
> 
> Libvirt would like to key off of the 'type' field for all three events.
>  Besides, if management issues several block commands in a row, and only
> then starts processing the pending event queue, it would be nice to know
> whether the error stemmed from a 'stream', 'mirror', or (when combined
> with Jeff's patches) 'commit' job.

For that matter, maybe qapi-schema.json should define an enum:

{ 'enum': 'BlockJobType',
  'data': [ 'stream', 'mirror', 'commit' ] }

and have the job type listed throughout the various QMP calls as a
member of that enum, rather than open-coded strings, to make it easier
the next time we add a job type.
Paolo Bonzini - Sept. 27, 2012, 9:24 a.m.
Il 26/09/2012 21:10, Eric Blake ha scritto:
>> > +- "device": device name (json-string)
>> > +- "operation": I/O operation (json-string, "read" or "write")
> For symmetry with BLOCK_JOB_{CANCELLED,COMPLETED}, you also need:
> - "type":     Job type ("stream" for image streaming, json-string)
> 
> Libvirt would like to key off of the 'type' field for all three events.
>  Besides, if management issues several block commands in a row, and only
> then starts processing the pending event queue, it would be nice to know
> whether the error stemmed from a 'stream', 'mirror', or (when combined
> with Jeff's patches) 'commit' job.
> 
> 

Let's add it as a follow-up.

Paolo
Kevin Wolf - Sept. 27, 2012, 1:41 p.m.
Am 26.09.2012 17:56, schrieb Paolo Bonzini:
> The following behaviors are possible:
> 
> 'report': The behavior is the same as in 1.1.  An I/O error,
> respectively during a read or a write, will complete the job immediately
> with an error code.
> 
> 'ignore': An I/O error, respectively during a read or a write, will be
> ignored.  For streaming, the job will complete with an error and the
> backing file will be left in place.  For mirroring, the sector will be
> marked again as dirty and re-examined later.
> 
> 'stop': The job will be paused and the job iostatus will be set to
> failed or nospace, while the VM will keep running.  This can only be
> specified if the block device has rerror=stop and werror=stop or enospc.

Why is that? I don't see the dependency on rerror/werror in the code,
and documentation doesn't mention it either.

> 'enospc': Behaves as 'stop' for ENOSPC errors, 'report' for others.
> 
> In all cases, even for 'report', the I/O error is reported as a QMP
> event BLOCK_JOB_ERROR, with the same arguments as BLOCK_IO_ERROR.
> 
> It is possible that while stopping the VM a BLOCK_IO_ERROR event will be
> reported and will clobber the event from BLOCK_JOB_ERROR, or vice versa.
> This is not really avoidable since stopping the VM completes all pending
> I/O requests.  In fact, it is already possible now that a series of
> BLOCK_IO_ERROR events are reported with rerror=stop, because vm_stop
> calls bdrv_drain_all and this can generate further errors.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>         v1->v2: introduced block_job_iostatus_reset.  Removed sorting
>         of iostatus values with "failed" overriding "nospace" but not
>         vice versa.  Documented that block-job-resume clears the
>         iostatus field.  Always set errors on the block job even if
>         they happen on the target; this removes the need to expose
>         the target's BlockInfo in "query-blockjobs".

> +BlockErrorAction block_job_error_action(BlockJob *job, BlockDriverState *bs,
> +                                        BlockdevOnError on_err,
> +                                        int is_read, int error)
> +{
> +    BlockErrorAction action;
> +
> +    switch (on_err) {
> +    case BLOCKDEV_ON_ERROR_ENOSPC:
> +        action = (error == ENOSPC) ? BDRV_ACTION_STOP : BDRV_ACTION_REPORT;
> +        break;
> +    case BLOCKDEV_ON_ERROR_STOP:
> +        action = BDRV_ACTION_STOP;
> +        break;
> +    case BLOCKDEV_ON_ERROR_REPORT:
> +        action = BDRV_ACTION_REPORT;
> +        break;
> +    case BLOCKDEV_ON_ERROR_IGNORE:
> +        action = BDRV_ACTION_IGNORE;
> +        break;
> +    default:
> +        abort();
> +    }

Isn't this a duplication of bdrv_get_error_action()?

Kevin
Paolo Bonzini - Sept. 27, 2012, 2:50 p.m.
Il 27/09/2012 15:41, Kevin Wolf ha scritto:
>> > +BlockErrorAction block_job_error_action(BlockJob *job, BlockDriverState *bs,
>> > +                                        BlockdevOnError on_err,
>> > +                                        int is_read, int error)
>> > +{
>> > +    BlockErrorAction action;
>> > +
>> > +    switch (on_err) {
>> > +    case BLOCKDEV_ON_ERROR_ENOSPC:
>> > +        action = (error == ENOSPC) ? BDRV_ACTION_STOP : BDRV_ACTION_REPORT;
>> > +        break;
>> > +    case BLOCKDEV_ON_ERROR_STOP:
>> > +        action = BDRV_ACTION_STOP;
>> > +        break;
>> > +    case BLOCKDEV_ON_ERROR_REPORT:
>> > +        action = BDRV_ACTION_REPORT;
>> > +        break;
>> > +    case BLOCKDEV_ON_ERROR_IGNORE:
>> > +        action = BDRV_ACTION_IGNORE;
>> > +        break;
>> > +    default:
>> > +        abort();
>> > +    }
> Isn't this a duplication of bdrv_get_error_action()?

bdrv_get_error_action() has this:

BlockdevOnError on_err = is_read ? bs->on_read_error : bs->on_write_error;

It can use some refactoring to commonize the switch statement, but
it's not a direct replacement.

Paolo

Patch

diff --git a/QMP/qmp-events.txt b/QMP/qmp-events.txt
index 2878058..ae19db2 100644
--- a/QMP/qmp-events.txt
+++ b/QMP/qmp-events.txt
@@ -94,6 +94,28 @@  Example:
                "speed": 0 },
      "timestamp": { "seconds": 1267061043, "microseconds": 959568 } }
 
+BLOCK_JOB_ERROR
+---------------
+
+Emitted when a block job encounters an error.
+
+Data:
+
+- "device": device name (json-string)
+- "operation": I/O operation (json-string, "read" or "write")
+- "action": action that has been taken, it's one of the following (json-string):
+    "ignore": error has been ignored, the job may fail later
+    "report": error will be reported and the job canceled
+    "stop": error caused job to be paused
+
+Example:
+
+{ "event": "BLOCK_JOB_ERROR",
+    "data": { "device": "ide0-hd1",
+              "operation": "write",
+              "action": "stop" },
+    "timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
+
 DEVICE_TRAY_MOVED
 -----------------
 
diff --git a/block.c b/block.c
index b7b04aa..83a695b 100644
--- a/block.c
+++ b/block.c
@@ -1387,8 +1387,9 @@  void bdrv_set_dev_ops(BlockDriverState *bs, const BlockDevOps *ops,
     }
 }
 
-static void bdrv_emit_qmp_error_event(const BlockDriverState *bdrv,
-                                      BlockErrorAction action, bool is_read)
+void bdrv_emit_qmp_error_event(const BlockDriverState *bdrv,
+                               enum MonitorEvent ev,
+                               BlockErrorAction action, bool is_read)
 {
     QObject *data;
     const char *action_str;
@@ -1411,7 +1412,7 @@  static void bdrv_emit_qmp_error_event(const BlockDriverState *bdrv,
                               bdrv->device_name,
                               action_str,
                               is_read ? "read" : "write");
-    monitor_protocol_event(QEVENT_BLOCK_IO_ERROR, data);
+    monitor_protocol_event(ev, data);
 
     qobject_decref(data);
 }
@@ -2370,7 +2371,7 @@  void bdrv_error_action(BlockDriverState *bs, BlockErrorAction action,
                        bool is_read, int error)
 {
     assert(error >= 0);
-    bdrv_emit_qmp_error_event(bs, action, is_read);
+    bdrv_emit_qmp_error_event(bs, QEVENT_BLOCK_IO_ERROR, action, is_read);
     if (action == BDRV_ACTION_STOP) {
         vm_stop(RUN_STATE_IO_ERROR);
         bdrv_iostatus_set_err(bs, error);
diff --git a/block_int.h b/block_int.h
index db487eb..5257b10 100644
--- a/block_int.h
+++ b/block_int.h
@@ -31,6 +31,7 @@ 
 #include "qemu-timer.h"
 #include "qapi-types.h"
 #include "qerror.h"
+#include "monitor.h"
 
 #define BLOCK_FLAG_ENCRYPT          1
 #define BLOCK_FLAG_COMPAT6          4
@@ -286,6 +287,9 @@  void bdrv_set_io_limits(BlockDriverState *bs,
 #ifdef _WIN32
 int is_windows_drive(const char *filename);
 #endif
+void bdrv_emit_qmp_error_event(const BlockDriverState *bdrv,
+                               enum MonitorEvent ev,
+                               BlockErrorAction action, bool is_read);
 
 /**
  * stream_start:
diff --git a/blockjob.c b/blockjob.c
index 884bd2b..5dd9c1e 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -112,6 +112,7 @@  bool block_job_is_paused(BlockJob *job)
 void block_job_resume(BlockJob *job)
 {
     job->paused = false;
+    block_job_iostatus_reset(job);
     if (job->co && !job->busy) {
         qemu_coroutine_enter(job->co, NULL);
     }
@@ -128,6 +129,11 @@  bool block_job_is_cancelled(BlockJob *job)
     return job->cancelled;
 }
 
+void block_job_iostatus_reset(BlockJob *job)
+{
+    job->iostatus = BLOCK_DEVICE_IO_STATUS_OK;
+}
+
 struct BlockCancelData {
     BlockJob *job;
     BlockDriverCompletionFunc *cb;
@@ -189,11 +195,54 @@  void block_job_sleep_ns(BlockJob *job, QEMUClock *clock, int64_t ns)
 BlockJobInfo *block_job_query(BlockJob *job)
 {
     BlockJobInfo *info = g_new0(BlockJobInfo, 1);
-    info->type   = g_strdup(job->job_type->job_type);
-    info->device = g_strdup(bdrv_get_device_name(job->bs));
-    info->len    = job->len;
-    info->paused = job->paused;
-    info->offset = job->offset;
-    info->speed  = job->speed;
+    info->type      = g_strdup(job->job_type->job_type);
+    info->device    = g_strdup(bdrv_get_device_name(job->bs));
+    info->len       = job->len;
+    info->paused    = job->paused;
+    info->offset    = job->offset;
+    info->speed     = job->speed;
+    info->io_status = job->iostatus;
     return info;
 }
+
+static void block_job_iostatus_set_err(BlockJob *job, int error)
+{
+    if (job->iostatus == BLOCK_DEVICE_IO_STATUS_OK) {
+        job->iostatus = error == ENOSPC ? BLOCK_DEVICE_IO_STATUS_NOSPACE :
+                                          BLOCK_DEVICE_IO_STATUS_FAILED;
+    }
+}
+
+
+BlockErrorAction block_job_error_action(BlockJob *job, BlockDriverState *bs,
+                                        BlockdevOnError on_err,
+                                        int is_read, int error)
+{
+    BlockErrorAction action;
+
+    switch (on_err) {
+    case BLOCKDEV_ON_ERROR_ENOSPC:
+        action = (error == ENOSPC) ? BDRV_ACTION_STOP : BDRV_ACTION_REPORT;
+        break;
+    case BLOCKDEV_ON_ERROR_STOP:
+        action = BDRV_ACTION_STOP;
+        break;
+    case BLOCKDEV_ON_ERROR_REPORT:
+        action = BDRV_ACTION_REPORT;
+        break;
+    case BLOCKDEV_ON_ERROR_IGNORE:
+        action = BDRV_ACTION_IGNORE;
+        break;
+    default:
+        abort();
+    }
+    bdrv_emit_qmp_error_event(job->bs, QEVENT_BLOCK_JOB_ERROR, action, is_read);
+    if (action == BDRV_ACTION_STOP) {
+        block_job_pause(job);
+        block_job_iostatus_set_err(job, error);
+        if (bs != job->bs) {
+            bdrv_iostatus_set_err(bs, error);
+        }
+    }
+    return action;
+}
diff --git a/blockjob.h b/blockjob.h
index a0d1b5c..2070a1b 100644
--- a/blockjob.h
+++ b/blockjob.h
@@ -82,6 +82,9 @@  struct BlockJob {
      */
     bool busy;
 
+    /** Status that is published by the query-block-jobs QMP API */
+    BlockDeviceIoStatus iostatus;
+
     /** Offset that is published by the query-block-jobs QMP API */
     int64_t offset;
 
@@ -216,4 +219,26 @@  bool block_job_is_paused(BlockJob *job);
  */
 int block_job_cancel_sync(BlockJob *job);
 
+/**
+ * block_job_iostatus_reset:
+ * @job: The job whose I/O status should be reset.
+ *
+ * Reset I/O status on @job.
+ */
+void block_job_iostatus_reset(BlockJob *job);
+
+/**
+ * block_job_error_action:
+ * @job: The job to signal an error for.
+ * @bs: The block device on which to set an I/O error.
+ * @on_err: The error action setting.
+ * @is_read: Whether the operation was a read.
+ * @error: The error that was reported.
+ *
+ * Report an I/O error for a block job and possibly stop the VM.  Return the
+ * action that was selected based on @on_err and @error.
+ */
+BlockErrorAction block_job_error_action(BlockJob *job, BlockDriverState *bs,
+                                        BlockdevOnError on_err,
+                                        int is_read, int error);
 #endif
diff --git a/monitor.c b/monitor.c
index 67064e2..d4bd5fe 100644
--- a/monitor.c
+++ b/monitor.c
@@ -450,6 +450,7 @@  static const char *monitor_event_names[] = {
     [QEVENT_SPICE_DISCONNECTED] = "SPICE_DISCONNECTED",
     [QEVENT_BLOCK_JOB_COMPLETED] = "BLOCK_JOB_COMPLETED",
     [QEVENT_BLOCK_JOB_CANCELLED] = "BLOCK_JOB_CANCELLED",
+    [QEVENT_BLOCK_JOB_ERROR] = "BLOCK_JOB_ERROR",
     [QEVENT_DEVICE_TRAY_MOVED] = "DEVICE_TRAY_MOVED",
     [QEVENT_SUSPEND] = "SUSPEND",
     [QEVENT_SUSPEND_DISK] = "SUSPEND_DISK",
diff --git a/monitor.h b/monitor.h
index 64c1561..43040af 100644
--- a/monitor.h
+++ b/monitor.h
@@ -38,6 +38,7 @@  typedef enum MonitorEvent {
     QEVENT_SPICE_DISCONNECTED,
     QEVENT_BLOCK_JOB_COMPLETED,
     QEVENT_BLOCK_JOB_CANCELLED,
+    QEVENT_BLOCK_JOB_ERROR,
     QEVENT_DEVICE_TRAY_MOVED,
     QEVENT_SUSPEND,
     QEVENT_SUSPEND_DISK,
diff --git a/qapi-schema.json b/qapi-schema.json
index 43d3345..6e4b5b7 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1127,11 +1127,14 @@ 
 #
 # @speed: the rate limit, bytes per second
 #
+# @io-status: the status of the job (since 1.2)
+#
 # Since: 1.1
 ##
 { 'type': 'BlockJobInfo',
   'data': {'type': 'str', 'device': 'str', 'len': 'int',
-           'offset': 'int', 'paused': 'bool', 'speed': 'int'} }
+           'offset': 'int', 'paused': 'bool', 'speed': 'int',
+           'io-status': 'BlockDeviceIoStatus'} }
 
 ##
 # @query-block-jobs:
@@ -1919,6 +1922,8 @@ 
 # operation.  It is an error to call this command if no operation is in
 # progress.  Resuming an already running job is not an error.
 #
+# This command also clears the error status of the job.
+#
 # @device: the device name
 #
 # Returns: Nothing on success