diff mbox

Socket reconnection.

Message ID 4B1503EF.6090806@collabora.co.uk
State New
Headers show

Commit Message

Ian Molton Dec. 1, 2009, 11:54 a.m. UTC
Anthony Liguori wrote:
> Ian Molton wrote:
>> Hi folks,
>>
>> I need my source of data for virtio-rng to be reliable - IOW. if the
>> server dies and comes back up, I want qemu to reconnect and suck down
>> fresh entropy, rather than hand the rngd process on the guest.
>>
>> I'm using the chardev 'socket' type to make the connection to the host.
>>
>> Would a patch adding (optional) auto-reconnect (with a back-off delay)
>> to the socket chardev be acceptable ?
>>
>> If not, I'll have to cook up a thread to handle EGD requests, but that
>> seems perverse...
>>   
> 
> Hrm, I'm not sure.  What are the circumstances that this connection
> would die?  What happens while the connection is dead?

The most common would be the entropy gathering daemon being restarted,
perhaps due to an upgrade. Its hardly a good idea to require all the
guest VMs to reboot on this occuring. Next most common I guess would be
the daemon crashing, but again, not something you'd want taking out your
guest VMs...

Whilst the connection is down, the guests will potentially starve of
entropy - but that only means they'll block processes that try to use
/dev/random if they run out altogether.

Here are two patches that implement socket reconnection. This first
cleans up the APIs needed a little  and the second implements the guts.

If these patches are acceptable, I will repost my 4-patch series which
also includes the SIZE parameter patch and an updated virtio-rng patch
that uses the reconnection infrastructure to enhance its reliability.

-Ian

Comments

Krumme, Chris Dec. 1, 2009, 2:38 p.m. UTC | #1
Hello Ian,

Since you did not inline your source I will paste in a chunk:


@@ -2030,10 +2036,18 @@ static void tcp_chr_read(void *opaque)
         if (s->listen_fd >= 0) {
             qemu_set_fd_handler(s->listen_fd, tcp_chr_accept, NULL,
chr);
         }
-        qemu_set_fd_handler(s->fd, NULL, NULL, NULL);
+        if (!s->reconnect)
+            qemu_set_fd_handler(s->fd, NULL, NULL, NULL);
         closesocket(s->fd);
         s->fd = -1;
-        qemu_chr_event(chr, CHR_EVENT_CLOSED);
+        if (!s->reconnect) {
+            qemu_chr_event(chr, CHR_EVENT_CLOSED);
+        } else {
+            do {
+                sleep(s->reconnect);
+            } while(!qemu_chr_connect_socket(s));
+            qemu_chr_event(chr, CHR_EVENT_RECONNECTED);
+        }
     } else if (size > 0) {
         if (s->do_telnetopt)
             tcp_chr_process_IAC_bytes(chr, s, buf, &size);


Should you be introducing a while sleep loop into Qemu here?

I would think you should be returning 'no data', maybe after trying
once.

Hope this helps.

Chris

 

> -----Original Message-----
> From: 
> qemu-devel-bounces+chris.krumme=windriver.com@nongnu.org 
> [mailto:qemu-devel-bounces+chris.krumme=windriver.com@nongnu.o
> rg] On Behalf Of Ian Molton
> Sent: Tuesday, December 01, 2009 5:54 AM
> To: Anthony Liguori
> Cc: qemu-devel@nongnu.org
> Subject: Re: [Qemu-devel] Socket reconnection.
> 
> Anthony Liguori wrote:
> > Ian Molton wrote:
> >> Hi folks,
> >>
> >> I need my source of data for virtio-rng to be reliable - 
> IOW. if the
> >> server dies and comes back up, I want qemu to reconnect 
> and suck down
> >> fresh entropy, rather than hand the rngd process on the guest.
> >>
> >> I'm using the chardev 'socket' type to make the connection 
> to the host.
> >>
> >> Would a patch adding (optional) auto-reconnect (with a 
> back-off delay)
> >> to the socket chardev be acceptable ?
> >>
> >> If not, I'll have to cook up a thread to handle EGD 
> requests, but that
> >> seems perverse...
> >>   
> > 
> > Hrm, I'm not sure.  What are the circumstances that this connection
> > would die?  What happens while the connection is dead?
> 
> The most common would be the entropy gathering daemon being restarted,
> perhaps due to an upgrade. Its hardly a good idea to require all the
> guest VMs to reboot on this occuring. Next most common I 
> guess would be
> the daemon crashing, but again, not something you'd want 
> taking out your
> guest VMs...
> 
> Whilst the connection is down, the guests will potentially starve of
> entropy - but that only means they'll block processes that try to use
> /dev/random if they run out altogether.
> 
> Here are two patches that implement socket reconnection. This first
> cleans up the APIs needed a little  and the second implements 
> the guts.
> 
> If these patches are acceptable, I will repost my 4-patch series which
> also includes the SIZE parameter patch and an updated virtio-rng patch
> that uses the reconnection infrastructure to enhance its reliability.
> 
> -Ian
>
Ian Molton Dec. 1, 2009, 6:29 p.m. UTC | #2
Krumme, Chris wrote:
> Hello Ian,

Hi!

> Should you be introducing a while sleep loop into Qemu here?
> 
> I would think you should be returning 'no data', maybe after trying
> once.

Indeed.

I could have sworn I tried that. Too much crack, will re-do.
Anthony Liguori Dec. 1, 2009, 6:54 p.m. UTC | #3
Krumme, Chris wrote:
> Hello Ian,
>
> Since you did not inline your source I will paste in a chunk:
>
>
> @@ -2030,10 +2036,18 @@ static void tcp_chr_read(void *opaque)
>          if (s->listen_fd >= 0) {
>              qemu_set_fd_handler(s->listen_fd, tcp_chr_accept, NULL,
> chr);
>          }
> -        qemu_set_fd_handler(s->fd, NULL, NULL, NULL);
> +        if (!s->reconnect)
> +            qemu_set_fd_handler(s->fd, NULL, NULL, NULL);
>          closesocket(s->fd);
>          s->fd = -1;
> -        qemu_chr_event(chr, CHR_EVENT_CLOSED);
> +        if (!s->reconnect) {
> +            qemu_chr_event(chr, CHR_EVENT_CLOSED);
> +        } else {
> +            do {
> +                sleep(s->reconnect);
> +            } while(!qemu_chr_connect_socket(s));
> +            qemu_chr_event(chr, CHR_EVENT_RECONNECTED);
> +        }
>      } else if (size > 0) {
>          if (s->do_telnetopt)
>              tcp_chr_process_IAC_bytes(chr, s, buf, &size);
>
>
> Should you be introducing a while sleep loop into Qemu here?
>
> I would think you should be returning 'no data', maybe after trying
> once.
>
> Hope this helps.
>
> Chris
>   

sleep() in qemu is very, very wrong.  It will pause the guest's 
execution and all sorts of badness can ensue.

The right thing to do is set a timer and not generate data while 
disconnected.  I still am not confident this is really a great thing to do.

Regards,

Anthony Liguori
diff mbox

Patch

From 1641774343cec683a6e11c22ef2ab162ab2cb842 Mon Sep 17 00:00:00 2001
From: Ian Molton <ian.molton@collabora.co.uk>
Date: Tue, 1 Dec 2009 11:18:41 +0000
Subject: [PATCH 2/4] socket: Add a reconnect option.

	Add a reconnect option that allows sockets to reconnect (after a
specified delay) to the specified server. This makes the virtio-rng driver
useful in production environments where the EGD server may need to be restarted.

Signed-off-by: Ian Molton <ian.molton@collabora.co.uk>
---
 qemu-char.c   |  118 ++++++++++++++++++++++++++++++++++++++-------------------
 qemu-char.h   |    1 +
 qemu-config.c |    3 +
 3 files changed, 83 insertions(+), 39 deletions(-)

diff --git a/qemu-char.c b/qemu-char.c
index e202585..ec24c29 100644
--- a/qemu-char.c
+++ b/qemu-char.c
@@ -1870,8 +1870,12 @@  typedef struct {
     int max_size;
     int do_telnetopt;
     int do_nodelay;
+    int reconnect;
     int is_unix;
     int msgfd;
+    QemuOpts *opts;
+    CharDriverState *chr;
+    int (*setup)(QemuOpts *opts);
 } TCPCharDriver;
 
 static void tcp_chr_accept(void *opaque);
@@ -2011,6 +2015,8 @@  static ssize_t tcp_chr_recv(CharDriverState *chr, char *buf, size_t len)
 }
 #endif
 
+static int qemu_chr_connect_socket(TCPCharDriver *s);
+
 static void tcp_chr_read(void *opaque)
 {
     CharDriverState *chr = opaque;
@@ -2030,10 +2036,18 @@  static void tcp_chr_read(void *opaque)
         if (s->listen_fd >= 0) {
             qemu_set_fd_handler(s->listen_fd, tcp_chr_accept, NULL, chr);
         }
-        qemu_set_fd_handler(s->fd, NULL, NULL, NULL);
+        if (!s->reconnect)
+            qemu_set_fd_handler(s->fd, NULL, NULL, NULL);
         closesocket(s->fd);
         s->fd = -1;
-        qemu_chr_event(chr, CHR_EVENT_CLOSED);
+        if (!s->reconnect) {
+            qemu_chr_event(chr, CHR_EVENT_CLOSED);
+        } else {
+            do {
+                sleep(s->reconnect);
+            } while(!qemu_chr_connect_socket(s));
+            qemu_chr_event(chr, CHR_EVENT_RECONNECTED);
+        }
     } else if (size > 0) {
         if (s->do_telnetopt)
             tcp_chr_process_IAC_bytes(chr, s, buf, &size);
@@ -2137,7 +2151,6 @@  static CharDriverState *qemu_chr_open_socket(QemuOpts *opts)
 {
     CharDriverState *chr = NULL;
     TCPCharDriver *s = NULL;
-    int fd = -1;
     int is_listen;
     int is_waitconnect;
     int do_nodelay;
@@ -2145,34 +2158,40 @@  static CharDriverState *qemu_chr_open_socket(QemuOpts *opts)
     int is_telnet;
 
     is_listen      = qemu_opt_get_bool(opts, "server", 0);
+    is_unix        = qemu_opt_get(opts, "path") != NULL;
+
     is_waitconnect = qemu_opt_get_bool(opts, "wait", 1);
     is_telnet      = qemu_opt_get_bool(opts, "telnet", 0);
     do_nodelay     = !qemu_opt_get_bool(opts, "delay", 1);
-    is_unix        = qemu_opt_get(opts, "path") != NULL;
-    if (!is_listen)
+
+    if (!is_listen) {
         is_waitconnect = 0;
+    } else {
+        if (is_telnet)
+            s->do_telnetopt = 1;
+    }
+
 
-    chr = qemu_mallocz(sizeof(CharDriverState));
     s = qemu_mallocz(sizeof(TCPCharDriver));
+    chr = qemu_mallocz(sizeof(CharDriverState));
+    s->opts = opts;
+
+    if (!is_listen && !is_telnet)
+        s->reconnect = qemu_opt_get_number(opts, "reconnect", 0);
 
     if (is_unix) {
         if (is_listen) {
-            fd = unix_listen_opts(opts);
+            s->setup = unix_listen_opts;
         } else {
-            fd = unix_connect_opts(opts);
+            s->setup = unix_connect_opts;
         }
     } else {
         if (is_listen) {
-            fd = inet_listen_opts(opts, 0);
+            s->setup = inet_listen_opts;
         } else {
-            fd = inet_connect_opts(opts);
+            s->setup = inet_connect_opts;
         }
     }
-    if (fd < 0)
-        goto fail;
-
-    if (!is_waitconnect)
-        socket_set_nonblock(fd);
 
     s->connected = 0;
     s->fd = -1;
@@ -2186,19 +2205,6 @@  static CharDriverState *qemu_chr_open_socket(QemuOpts *opts)
     chr->chr_close = tcp_chr_close;
     chr->get_msgfd = tcp_get_msgfd;
 
-    if (is_listen) {
-        s->listen_fd = fd;
-        qemu_set_fd_handler(s->listen_fd, tcp_chr_accept, NULL, chr);
-        if (is_telnet)
-            s->do_telnetopt = 1;
-
-    } else {
-        s->connected = 1;
-        s->fd = fd;
-        socket_set_nodelay(fd);
-        tcp_chr_connect(chr);
-    }
-
     /* for "info chardev" monitor command */
     chr->filename = qemu_malloc(256);
     if (is_unix) {
@@ -2215,22 +2221,56 @@  static CharDriverState *qemu_chr_open_socket(QemuOpts *opts)
                  qemu_opt_get_bool(opts, "server", 0) ? ",server" : "");
     }
 
-    if (is_listen && is_waitconnect) {
-        printf("QEMU waiting for connection on: %s\n",
-               chr->filename);
-        tcp_chr_accept(chr);
-        socket_set_nonblock(s->listen_fd);
-    }
-    return chr;
+    s->chr = chr;
+
+    if(qemu_chr_connect_socket(s))
+        return chr;
 
- fail:
-    if (fd >= 0)
-        closesocket(fd);
-    qemu_free(s);
     qemu_free(chr);
+    qemu_free(s);
+
     return NULL;
 }
 
+
+static int qemu_chr_connect_socket(TCPCharDriver *s)
+{
+    QemuOpts *opts = s->opts;
+    int is_listen;
+    int fd;
+    int is_waitconnect;
+    int do_nodelay;
+
+    is_waitconnect = qemu_opt_get_bool(opts, "wait", 1);
+    is_listen      = qemu_opt_get_bool(opts, "server", 0);
+    do_nodelay     = !qemu_opt_get_bool(opts, "delay", 1);
+
+
+    fd = s->setup(s->opts);
+    if (fd < 0)
+        return 0;
+
+    if (!is_waitconnect)
+        socket_set_nonblock(fd);
+
+    if (is_listen) {
+        s->listen_fd = fd;
+        qemu_set_fd_handler(s->listen_fd, tcp_chr_accept, NULL, s->chr);
+        if (is_waitconnect) {
+            printf("QEMU waiting for connection on: %s\n",
+                   s->chr->filename);
+            tcp_chr_accept(s->chr);
+            socket_set_nonblock(s->listen_fd);
+        }
+    } else {
+        s->fd = fd;
+        socket_set_nodelay(fd);
+        tcp_chr_connect(s->chr);
+    }
+
+    return 1;
+}
+
 static QemuOpts *qemu_chr_parse_compat(const char *label, const char *filename)
 {
     char host[65], port[33], width[8], height[8];
diff --git a/qemu-char.h b/qemu-char.h
index 9957db1..266536d 100644
--- a/qemu-char.h
+++ b/qemu-char.h
@@ -14,6 +14,7 @@ 
 #define CHR_EVENT_MUX_IN  3 /* mux-focus was set to this terminal */
 #define CHR_EVENT_MUX_OUT 4 /* mux-focus will move on */
 #define CHR_EVENT_CLOSED  5 /* connection closed */
+#define CHR_EVENT_RECONNECTED  6 /* reconnect event */
 
 
 #define CHR_IOCTL_SERIAL_SET_PARAMS   1
diff --git a/qemu-config.c b/qemu-config.c
index 590fc05..ff8b06e 100644
--- a/qemu-config.c
+++ b/qemu-config.c
@@ -140,6 +140,9 @@  QemuOptsList qemu_chardev_opts = {
         },{
             .name = "signal",
             .type = QEMU_OPT_BOOL,
+        },{
+            .name = "reconnect",
+            .type = QEMU_OPT_NUMBER,
         },
         { /* end if list */ }
     },
-- 
1.6.5