diff mbox

[v5,08/45] Return path: socket_writev_buffer: Block even on non-blocking fd's

Message ID 1424883128-9841-9-git-send-email-dgilbert@redhat.com
State New
Headers show

Commit Message

Dr. David Alan Gilbert Feb. 25, 2015, 4:51 p.m. UTC
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

The return path uses a non-blocking fd so as not to block waiting
for the (possibly broken) destination to finish returning a message,
however we still want outbound data to behave in the same way and block.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/qemu-file-unix.c | 41 ++++++++++++++++++++++++++++++++++++-----
 1 file changed, 36 insertions(+), 5 deletions(-)

Comments

David Gibson March 10, 2015, 2:56 a.m. UTC | #1
On Wed, Feb 25, 2015 at 04:51:31PM +0000, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> 
> The return path uses a non-blocking fd so as not to block waiting
> for the (possibly broken) destination to finish returning a message,
> however we still want outbound data to behave in the same way and block.

It's not clear to me from this description exactly where the situation
is that you need to write to the non-blocking socket.  Is it on the
source or the destination?  If the source, why are you writing to the
return path?  If the destination, why are you marking the outgoing
return path as non-blocking?

> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
>  migration/qemu-file-unix.c | 41 ++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 36 insertions(+), 5 deletions(-)
> 
> diff --git a/migration/qemu-file-unix.c b/migration/qemu-file-unix.c
> index 50291cf..218dbd0 100644
> --- a/migration/qemu-file-unix.c
> +++ b/migration/qemu-file-unix.c
> @@ -39,12 +39,43 @@ static ssize_t socket_writev_buffer(void *opaque, struct iovec *iov, int iovcnt,
>      QEMUFileSocket *s = opaque;
>      ssize_t len;
>      ssize_t size = iov_size(iov, iovcnt);
> +    ssize_t offset = 0;
> +    int     err;
>  
> -    len = iov_send(s->fd, iov, iovcnt, 0, size);
> -    if (len < size) {
> -        len = -socket_error();
> -    }
> -    return len;
> +    while (size > 0) {
> +        len = iov_send(s->fd, iov, iovcnt, offset, size);
> +
> +        if (len > 0) {
> +            size -= len;
> +            offset += len;
> +        }
> +
> +        if (size > 0) {
> +            err = socket_error();
> +
> +            if (err != EAGAIN) {
> +                error_report("socket_writev_buffer: Got err=%d for (%zd/%zd)",
> +                             err, size, len);
> +                /*
> +                 * If I've already sent some but only just got the error, I
> +                 * could return the amount validly sent so far and wait for the
> +                 * next call to report the error, but I'd rather flag the error
> +                 * immediately.
> +                 */
> +                return -err;
> +            }
> +
> +            /* Emulate blocking */
> +            GPollFD pfd;
> +
> +            pfd.fd = s->fd;
> +            pfd.events = G_IO_OUT | G_IO_ERR;
> +            pfd.revents = 0;
> +            g_poll(&pfd, 1 /* 1 fd */, -1 /* no timeout */);
> +        }
> +     }
> +
> +    return offset;
>  }
>  
>  static int socket_get_fd(void *opaque)
Dr. David Alan Gilbert March 10, 2015, 1:35 p.m. UTC | #2
* David Gibson (david@gibson.dropbear.id.au) wrote:
> On Wed, Feb 25, 2015 at 04:51:31PM +0000, Dr. David Alan Gilbert (git) wrote:
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > 
> > The return path uses a non-blocking fd so as not to block waiting
> > for the (possibly broken) destination to finish returning a message,
> > however we still want outbound data to behave in the same way and block.
> 
> It's not clear to me from this description exactly where the situation
> is that you need to write to the non-blocking socket.  Is it on the
> source or the destination?  If the source, why are you writing to the
> return path?  If the destination, why are you marking the outgoing
> return path as non-blocking?

My understanding is that the semantics of set_nonblock() are to
set non-blocking on all operations on the transport associated with
the fd - and that it's true even if you dup() the fd; and so if you
set non-blocking in one direction you get it in the other direction as well.

The (existing) destination side sets non-block (see process_incoming_migration
in migration.c), and so it gets non-blocking on the incoming data stream, but
that has the side effect that it's also going to be non-blocking on
the destinations writes to the reverse-path; thus we need to be able
to safely do writes from the destination reverse-path.

The text is out of date, back in v2 the source used to use non-blocking
for the return-path, but we managed to kill that off by using a thread
for the return path in the source.

How about changing the text to:
--------
The destination sets the fd to non-blocking on incoming migrations;
this also affects the return path from the destination, and thus we
need to make sure we can safely write to the return path.

Dave

> 
> > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > ---
> >  migration/qemu-file-unix.c | 41 ++++++++++++++++++++++++++++++++++++-----
> >  1 file changed, 36 insertions(+), 5 deletions(-)
> > 
> > diff --git a/migration/qemu-file-unix.c b/migration/qemu-file-unix.c
> > index 50291cf..218dbd0 100644
> > --- a/migration/qemu-file-unix.c
> > +++ b/migration/qemu-file-unix.c
> > @@ -39,12 +39,43 @@ static ssize_t socket_writev_buffer(void *opaque, struct iovec *iov, int iovcnt,
> >      QEMUFileSocket *s = opaque;
> >      ssize_t len;
> >      ssize_t size = iov_size(iov, iovcnt);
> > +    ssize_t offset = 0;
> > +    int     err;
> >  
> > -    len = iov_send(s->fd, iov, iovcnt, 0, size);
> > -    if (len < size) {
> > -        len = -socket_error();
> > -    }
> > -    return len;
> > +    while (size > 0) {
> > +        len = iov_send(s->fd, iov, iovcnt, offset, size);
> > +
> > +        if (len > 0) {
> > +            size -= len;
> > +            offset += len;
> > +        }
> > +
> > +        if (size > 0) {
> > +            err = socket_error();
> > +
> > +            if (err != EAGAIN) {
> > +                error_report("socket_writev_buffer: Got err=%d for (%zd/%zd)",
> > +                             err, size, len);
> > +                /*
> > +                 * If I've already sent some but only just got the error, I
> > +                 * could return the amount validly sent so far and wait for the
> > +                 * next call to report the error, but I'd rather flag the error
> > +                 * immediately.
> > +                 */
> > +                return -err;
> > +            }
> > +
> > +            /* Emulate blocking */
> > +            GPollFD pfd;
> > +
> > +            pfd.fd = s->fd;
> > +            pfd.events = G_IO_OUT | G_IO_ERR;
> > +            pfd.revents = 0;
> > +            g_poll(&pfd, 1 /* 1 fd */, -1 /* no timeout */);
> > +        }
> > +     }
> > +
> > +    return offset;
> >  }
> >  
> >  static int socket_get_fd(void *opaque)
> 
> -- 
> David Gibson			| I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
> 				| _way_ _around_!
> http://www.ozlabs.org/~dgibson


--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
David Gibson March 11, 2015, 1:51 a.m. UTC | #3
On Tue, Mar 10, 2015 at 01:35:58PM +0000, Dr. David Alan Gilbert wrote:
> * David Gibson (david@gibson.dropbear.id.au) wrote:
> > On Wed, Feb 25, 2015 at 04:51:31PM +0000, Dr. David Alan Gilbert (git) wrote:
> > > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > > 
> > > The return path uses a non-blocking fd so as not to block waiting
> > > for the (possibly broken) destination to finish returning a message,
> > > however we still want outbound data to behave in the same way and block.
> > 
> > It's not clear to me from this description exactly where the situation
> > is that you need to write to the non-blocking socket.  Is it on the
> > source or the destination?  If the source, why are you writing to the
> > return path?  If the destination, why are you marking the outgoing
> > return path as non-blocking?
> 
> My understanding is that the semantics of set_nonblock() are to
> set non-blocking on all operations on the transport associated with
> the fd - and that it's true even if you dup() the fd; and so if you
> set non-blocking in one direction you get it in the other direction as well.

Ah.. yes, I think you're right.

> The (existing) destination side sets non-block (see process_incoming_migration
> in migration.c), and so it gets non-blocking on the incoming data stream, but
> that has the side effect that it's also going to be non-blocking on
> the destinations writes to the reverse-path; thus we need to be able
> to safely do writes from the destination reverse-path.
> 
> The text is out of date, back in v2 the source used to use non-blocking
> for the return-path, but we managed to kill that off by using a thread
> for the return path in the source.
> 
> How about changing the text to:
> --------
> The destination sets the fd to non-blocking on incoming migrations;
> this also affects the return path from the destination, and thus we
> need to make sure we can safely write to the return path.

Yes, I think that makes it clearer.
Paolo Bonzini March 28, 2015, 3:30 p.m. UTC | #4
On 25/02/2015 17:51, Dr. David Alan Gilbert (git) wrote:
> +            if (err != EAGAIN) {

if (err != EAGAIN && err != EWOULDBLOCK)

Paolo
David Gibson March 29, 2015, 4:07 a.m. UTC | #5
On Sat, Mar 28, 2015 at 04:30:06PM +0100, Paolo Bonzini wrote:
> 
> 
> On 25/02/2015 17:51, Dr. David Alan Gilbert (git) wrote:
> > +            if (err != EAGAIN) {
> 
> if (err != EAGAIN && err != EWOULDBLOCK)

I assume that's for the benefit of non-Linux hosts?  On Linux EAGAIN
== EWOULDBLOCK.
Paolo Bonzini March 29, 2015, 9:03 a.m. UTC | #6
On 29/03/2015 06:07, David Gibson wrote:
> On Sat, Mar 28, 2015 at 04:30:06PM +0100, Paolo Bonzini wrote:
>> 
>> 
>> On 25/02/2015 17:51, Dr. David Alan Gilbert (git) wrote:
>>> +            if (err != EAGAIN) {
>> 
>> if (err != EAGAIN && err != EWOULDBLOCK)
> 
> I assume that's for the benefit of non-Linux hosts?  On Linux
> EAGAIN == EWOULDBLOCK.

Yes, that's just the standard idiom in QEMU.  This is generic code, so
assumption based on the host platform are not wise. :)

Paolo
Dr. David Alan Gilbert March 30, 2015, 4:50 p.m. UTC | #7
* Paolo Bonzini (pbonzini@redhat.com) wrote:
> 
> 
> On 29/03/2015 06:07, David Gibson wrote:
> > On Sat, Mar 28, 2015 at 04:30:06PM +0100, Paolo Bonzini wrote:
> >> 
> >> 
> >> On 25/02/2015 17:51, Dr. David Alan Gilbert (git) wrote:
> >>> +            if (err != EAGAIN) {
> >> 
> >> if (err != EAGAIN && err != EWOULDBLOCK)
> > 
> > I assume that's for the benefit of non-Linux hosts?  On Linux
> > EAGAIN == EWOULDBLOCK.
> 
> Yes, that's just the standard idiom in QEMU.  This is generic code, so
> assumption based on the host platform are not wise. :)

Done; I didn't know of EWOULDBLOCK - and indeed as far as I can tell
most places we only test for EAGAIN.

> 
> Paolo
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
Markus Armbruster March 30, 2015, 6:22 p.m. UTC | #8
"Dr. David Alan Gilbert" <dgilbert@redhat.com> writes:

> * Paolo Bonzini (pbonzini@redhat.com) wrote:
>> 
>> 
>> On 29/03/2015 06:07, David Gibson wrote:
>> > On Sat, Mar 28, 2015 at 04:30:06PM +0100, Paolo Bonzini wrote:
>> >> 
>> >> 
>> >> On 25/02/2015 17:51, Dr. David Alan Gilbert (git) wrote:
>> >>> +            if (err != EAGAIN) {
>> >> 
>> >> if (err != EAGAIN && err != EWOULDBLOCK)
>> > 
>> > I assume that's for the benefit of non-Linux hosts?  On Linux
>> > EAGAIN == EWOULDBLOCK.
>> 
>> Yes, that's just the standard idiom in QEMU.  This is generic code, so
>> assumption based on the host platform are not wise. :)
>
> Done; I didn't know of EWOULDBLOCK - and indeed as far as I can tell
> most places we only test for EAGAIN.

Bug unless the place in question is effectively #ifdef __GNU_LIBRARY__
or similar.

https://www.gnu.org/software/libc/manual/html_node/Error-Codes.html#Error-Codes

 -- Macro: int EAGAIN
     Resource temporarily unavailable; the call might work if you try
     again later.  The macro 'EWOULDBLOCK' is another name for 'EAGAIN';
     they are always the same in the GNU C Library.

     This error can happen in a few different situations:

        * An operation that would block was attempted on an object that
          has non-blocking mode selected.  Trying the same operation
          again will block until some external condition makes it
          possible to read, write, or connect (whatever the operation).
          You can use 'select' to find out when the operation will be
          possible; *note Waiting for I/O::.

          *Portability Note:* In many older Unix systems, this condition
          was indicated by 'EWOULDBLOCK', which was a distinct error
          code different from 'EAGAIN'.  To make your program portable,
          you should check for both codes and treat them the same.

        * A temporary resource shortage made an operation impossible.
          'fork' can return this error.  It indicates that the shortage
          is expected to pass, so your program can try the call again
          later and it may succeed.  It is probably a good idea to delay
          for a few seconds before trying it again, to allow time for
          other processes to release scarce resources.  Such shortages
          are usually fairly serious and affect the whole system, so
          usually an interactive program should report the error to the
          user and return to its command loop.

 -- Macro: int EWOULDBLOCK
     In the GNU C Library, this is another name for 'EAGAIN' (above).
     The values are always the same, on every operating system.

     C libraries in many older Unix systems have 'EWOULDBLOCK' as a
     separate error code.
diff mbox

Patch

diff --git a/migration/qemu-file-unix.c b/migration/qemu-file-unix.c
index 50291cf..218dbd0 100644
--- a/migration/qemu-file-unix.c
+++ b/migration/qemu-file-unix.c
@@ -39,12 +39,43 @@  static ssize_t socket_writev_buffer(void *opaque, struct iovec *iov, int iovcnt,
     QEMUFileSocket *s = opaque;
     ssize_t len;
     ssize_t size = iov_size(iov, iovcnt);
+    ssize_t offset = 0;
+    int     err;
 
-    len = iov_send(s->fd, iov, iovcnt, 0, size);
-    if (len < size) {
-        len = -socket_error();
-    }
-    return len;
+    while (size > 0) {
+        len = iov_send(s->fd, iov, iovcnt, offset, size);
+
+        if (len > 0) {
+            size -= len;
+            offset += len;
+        }
+
+        if (size > 0) {
+            err = socket_error();
+
+            if (err != EAGAIN) {
+                error_report("socket_writev_buffer: Got err=%d for (%zd/%zd)",
+                             err, size, len);
+                /*
+                 * If I've already sent some but only just got the error, I
+                 * could return the amount validly sent so far and wait for the
+                 * next call to report the error, but I'd rather flag the error
+                 * immediately.
+                 */
+                return -err;
+            }
+
+            /* Emulate blocking */
+            GPollFD pfd;
+
+            pfd.fd = s->fd;
+            pfd.events = G_IO_OUT | G_IO_ERR;
+            pfd.revents = 0;
+            g_poll(&pfd, 1 /* 1 fd */, -1 /* no timeout */);
+        }
+     }
+
+    return offset;
 }
 
 static int socket_get_fd(void *opaque)