
block/raw-posix: fix launching with failed disks

Message ID 1425509290-14048-1-git-send-email-stefanha@redhat.com
State New

Commit Message

Stefan Hajnoczi March 4, 2015, 10:48 p.m. UTC
Since commit c25f53b06eba1575d5d0e92a0132455c97825b83 ("raw: Probe
required direct I/O alignment") QEMU has failed to launch if image files
produce I/O errors.

Previously, QEMU would launch successfully and the guest would see the
errors when attempting I/O.

This is a regression and may prevent multipath I/O inside the guest,
where QEMU must launch and let the guest figure out by itself which
disks are online.

Tweak the alignment probing code in raw-posix.c to explicitly look for
EINVAL on Linux instead of bailing.  The kernel refuses misaligned
requests with this error code and other error codes can be ignored.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/raw-posix.c | 29 +++++++++++++++++++++++++++--
 1 file changed, 27 insertions(+), 2 deletions(-)
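
The error handling described in the commit message can be sketched as a standalone decision function. This is only an illustration, not code from the patch: the name `probe_counts_as_aligned` is hypothetical, and the function takes the `pread()` return value and `errno` as plain arguments so the logic can be exercised without a real failing disk.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical helper (not part of the patch): decide whether an alignment
 * probe read should be treated as "aligned", given the value returned by
 * pread() and the errno observed right after the call.
 */
bool probe_counts_as_aligned(ssize_t ret, int err)
{
    if (ret >= 0) {
        /* The read succeeded, so the buffer and length were accepted. */
        return true;
    }

#ifdef __linux__
    /* On Linux, only EINVAL signals a misaligned O_DIRECT request.  Any
     * other error (for example EIO from a failed drive) says nothing about
     * alignment, so the probe accepts the candidate value.
     */
    if (err != EINVAL) {
        return true;
    }
#else
    (void)err; /* Other platforms: every failed read counts as misaligned. */
#endif

    return false;
}
```

With this split, a read that fails with `EIO` on Linux no longer makes the probe reject every candidate alignment, which is what previously prevented QEMU from launching.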

Comments

Kevin Wolf March 5, 2015, 12:53 p.m. UTC | #1
On 04.03.2015 at 23:48, Stefan Hajnoczi wrote:
> Since commit c25f53b06eba1575d5d0e92a0132455c97825b83 ("raw: Probe
> required direct I/O alignment") QEMU has failed to launch if image files
> produce I/O errors.
> 
> Previously, QEMU would launch successfully and the guest would see the
> errors when attempting I/O.
> 
> This is a regression and may prevent multipath I/O inside the guest,
> where QEMU must launch and let the guest figure out by itself which
> disks are online.
> 
> Tweak the alignment probing code in raw-posix.c to explicitly look for
> EINVAL on Linux instead of bailing.  The kernel refuses misaligned
> requests with this error code and other error codes can be ignored.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

This seems to conflict with the geometry series. Please rebase on the
current block branch.

Also, I would be surprised if this had been working by design. It's
probably more by chance. If we want to make this a supported case, we
need to add a qemu-iotests case, as this seems to be easy to break
accidentally.

Kevin

Stefan Hajnoczi March 5, 2015, 5:45 p.m. UTC | #2
On Thu, Mar 05, 2015 at 01:53:57PM +0100, Kevin Wolf wrote:
> On 04.03.2015 at 23:48, Stefan Hajnoczi wrote:
> > Since commit c25f53b06eba1575d5d0e92a0132455c97825b83 ("raw: Probe
> > required direct I/O alignment") QEMU has failed to launch if image files
> > produce I/O errors.
> > 
> > Previously, QEMU would launch successfully and the guest would see the
> > errors when attempting I/O.
> > 
> > This is a regression and may prevent multipath I/O inside the guest,
> > where QEMU must launch and let the guest figure out by itself which
> > disks are online.
> > 
> > Tweak the alignment probing code in raw-posix.c to explicitly look for
> > EINVAL on Linux instead of bailing.  The kernel refuses misaligned
> > requests with this error code and other error codes can be ignored.
> > 
> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> 
> This seems to conflict with the geometry series. Please rebase on the
> current block branch.
> 
> Also, I would be surprised if this had been working by design. It's
> probably more by chance. If we want to make this a supported case, we
> need to add a qemu-iotests case, as this seems to be easy to break
> accidentally.

Will send v2.

Stefan

Patch

diff --git a/block/raw-posix.c b/block/raw-posix.c
index b5f077a..6eb3925 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -218,6 +218,31 @@  static int raw_normalize_devicepath(const char **filename)
 }
 #endif
 
+/* Check if read is allowed with given memory buffer and length.
+ *
+ * This function is used to check O_DIRECT memory buffer and request alignment.
+ */
+static bool raw_is_io_aligned(int fd, void *buf, size_t len)
+{
+    ssize_t ret = pread(fd, buf, len, 0);
+
+    if (ret >= 0) {
+        return true;
+    }
+
+#ifdef __linux__
+    /* The Linux kernel returns EINVAL for misaligned O_DIRECT reads.  Ignore
+     * other errors (e.g. real I/O error), which could happen on a failed
+     * drive, since we only care about probing alignment.
+     */
+    if (errno != EINVAL) {
+        return true;
+    }
+#endif
+
+    return false;
+}
+
 static void raw_probe_alignment(BlockDriverState *bs, int fd, Error **errp)
 {
     BDRVRawState *s = bs->opaque;
@@ -267,7 +292,7 @@  static void raw_probe_alignment(BlockDriverState *bs, int fd, Error **errp)
         size_t align;
         buf = qemu_memalign(MAX_BLOCKSIZE, 2 * MAX_BLOCKSIZE);
         for (align = 512; align <= MAX_BLOCKSIZE; align <<= 1) {
-            if (pread(fd, buf + align, MAX_BLOCKSIZE, 0) >= 0) {
+            if (raw_is_io_aligned(fd, buf + align, MAX_BLOCKSIZE)) {
                 s->buf_align = align;
                 break;
             }
@@ -279,7 +304,7 @@  static void raw_probe_alignment(BlockDriverState *bs, int fd, Error **errp)
         size_t align;
         buf = qemu_memalign(s->buf_align, MAX_BLOCKSIZE);
         for (align = 512; align <= MAX_BLOCKSIZE; align <<= 1) {
-            if (pread(fd, buf, align, 0) >= 0) {
+            if (raw_is_io_aligned(fd, buf, align)) {
                 bs->request_alignment = align;
                 break;
             }