Message ID: 20180110131832.16623-1-klim.kireev@virtuozzo.com
State: New
Series: chardev/char-socket: add POLLHUP handler
On 10/01/2018 14:18, Klim Kireev wrote:
> The following behavior was observed for QEMU configured by libvirt
> to use the guest agent as usual for guests without a virtio-serial
> driver (Windows, or the guest remaining in the BIOS stage).
>
> In QEMU, on the first connect to a listening character device socket
> the listen socket is removed from poll just after the accept().
> virtio_serial_guest_ready() returns 0 and the descriptor
> of the connected Unix socket is removed from poll; it will
> not be present in poll() until the guest initializes the driver
> and changes the state of the serial to "guest connected".
>
> In libvirt, connect() to the guest agent is performed on restart and
> runs under the VM state lock. Connect() is blocking and can
> wait forever. In this case libvirt cannot perform ANY operation on that VM.
>
> The bug can be easily reproduced this way:
>
> Terminal 1:
> qemu-system-x86_64 -m 512 -device pci-serial,chardev=serial1 -chardev socket,id=serial1,path=/tmp/console.sock,server,nowait
> (virtio-serial and isa-serial also fit)
>
> Terminal 2:
> minicom -D unix\#/tmp/console.sock
> (type something and press enter)
> C-a x (to exit)
>
> Do 3 times:
> minicom -D unix\#/tmp/console.sock
> C-a x
>
> It needs 4 connections, because the first one is accepted by QEMU,
> then two are queued by the kernel, and the 4th blocks.
>
> The problem is that QEMU doesn't add a read watcher after a successful
> read until the guest device wants to acquire the received data, so
> I propose to install a separate POLLHUP watcher regardless of
> whether the device waits for data or not. After the connection is closed,
> the guest still has a chance to read the remaining data within a timeout.

I don't understand the timeout part.

Apart from that, maybe the bug is in io_watch_poll_prepare, which needs
to _always_ set up a "G_IO_ERR | G_IO_HUP | G_IO_NVAL" watch. Only
G_IO_IN depends on iwp->fd_can_read(...) > 0.
So the change would start with something like this:

diff --git a/chardev/char-io.c b/chardev/char-io.c
index f81052481a..a5e65d4e7c 100644
--- a/chardev/char-io.c
+++ b/chardev/char-io.c
@@ -29,6 +29,7 @@ typedef struct IOWatchPoll {
 
     QIOChannel *ioc;
     GSource *src;
+    GIOCondition cond;
 
     IOCanReadHandler *fd_can_read;
     GSourceFunc fd_read;
@@ -41,25 +42,32 @@ static IOWatchPoll *io_watch_poll_from_source(GSource *source)
     return container_of(source, IOWatchPoll, parent);
 }
 
+static void io_watch_poll_destroy_source(IOWatchPoll *iwp)
+{
+    if (iwp->src) {
+        g_source_destroy(iwp->src);
+        g_source_unref(iwp->src);
+        iwp->src = NULL;
+        iwp->cond = 0;
+    }
+}
+
 static gboolean io_watch_poll_prepare(GSource *source,
                                       gint *timeout)
 {
     IOWatchPoll *iwp = io_watch_poll_from_source(source);
     bool now_active = iwp->fd_can_read(iwp->opaque) > 0;
-    bool was_active = iwp->src != NULL;
-    if (was_active == now_active) {
-        return FALSE;
+    GIOCondition cond = G_IO_ERR | G_IO_HUP | G_IO_NVAL;
+    if (now_active) {
+        cond |= G_IO_IN;
     }
 
-    if (now_active) {
-        iwp->src = qio_channel_create_watch(
-            iwp->ioc, G_IO_IN | G_IO_ERR | G_IO_HUP | G_IO_NVAL);
+    if (iwp->cond != cond) {
+        io_watch_poll_destroy_source(iwp);
+        iwp->cond = cond;
+        iwp->src = qio_channel_create_watch(iwp->ioc, cond);
         g_source_set_callback(iwp->src, iwp->fd_read, iwp->opaque, NULL);
         g_source_attach(iwp->src, iwp->context);
-    } else {
-        g_source_destroy(iwp->src);
-        g_source_unref(iwp->src);
-        iwp->src = NULL;
     }
     return FALSE;
 }
@@ -131,11 +139,7 @@ static void io_remove_watch_poll(GSource *source)
     IOWatchPoll *iwp;
 
     iwp = io_watch_poll_from_source(source);
-    if (iwp->src) {
-        g_source_destroy(iwp->src);
-        g_source_unref(iwp->src);
-        iwp->src = NULL;
-    }
+    io_watch_poll_destroy_source(iwp);
     g_source_destroy(&iwp->parent);
 }

Thanks,

Paolo
Hi

On Wed, Jan 10, 2018 at 2:18 PM, Klim Kireev <klim.kireev@virtuozzo.com> wrote:
> The following behavior was observed for QEMU configured by libvirt
> to use guest agent as usual for the guests without virtio-serial
> driver (Windows or the guest remaining in BIOS stage).
> [...]
> In libvirt connect() to guest agent is performed on restart and
> is run under VM state lock. Connect() is blocking and can
> wait forever.
> In this case libvirt can not perform ANY operation on that VM.

Adding Daniel in CC for comments about libvirt behaviour.

> [...]
> The problem is that QEMU doesn't add a read watcher after successful read
> until the guest device wants to acquire received data, so
> I propose to install a separate POLLHUP watcher regardless of
> whether the device waits for data or not. After closing the connection,
> the guest has a capability to read the data within timeout.
This patch could use some tests in test-char.c

> Signed-off-by: Klim Kireev <klim.kireev@virtuozzo.com>
> ---
>  chardev/char-socket.c | 55 ++++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 54 insertions(+), 1 deletion(-)
>
> diff --git a/chardev/char-socket.c b/chardev/char-socket.c
> index 630a7f2995..eb120b2aab 100644
> --- a/chardev/char-socket.c
> +++ b/chardev/char-socket.c
> @@ -37,6 +37,8 @@
>
>  #define TCP_MAX_FDS 16
>
> +#define DATA_TIMEOUT 1

Hmm... why is that data timeout necessary?

> +
>  typedef struct {
>      Chardev parent;
>      QIOChannel *ioc; /* Client I/O channel */
> @@ -57,6 +59,9 @@ typedef struct {
>      bool is_telnet;
>      bool is_tn3270;
>
> +    guint data_timer;
> +    guint destroy_tag;
> +
>      guint reconnect_timer;
>      int64_t reconnect_time;
>      bool connect_err_reported;
> @@ -341,6 +346,15 @@ static void tcp_chr_free_connection(Chardev *chr)
>          s->read_msgfds_num = 0;
>      }
>
> +    if (s->destroy_tag != 0) {
> +        g_source_remove(s->destroy_tag);
> +        s->destroy_tag = 0;
> +    }
> +    if (s->data_timer != 0) {
> +        g_source_remove(s->data_timer);
> +        s->data_timer = 0;
> +    }
> +
>      tcp_set_msgfds(chr, NULL, 0);
>      remove_fd_in_watch(chr);
>      object_unref(OBJECT(s->sioc));
> @@ -444,6 +458,37 @@ static gboolean tcp_chr_read(QIOChannel *chan, GIOCondition cond, void *opaque)
>      return TRUE;
>  }
>
> +static gboolean tcp_chr_data_timeout(gpointer opaque)
> +{
> +    Chardev *chr = CHARDEV(opaque);
> +
> +    if (tcp_chr_read_poll(chr) <= 0) {
> +        tcp_chr_disconnect(chr);
> +        return TRUE;
> +    } else {
> +        tcp_chr_read(NULL, 0, opaque);
> +        return TRUE;
> +    }

Easier to have a single return at the end of the function.
Please use G_SOURCE_CONTINUE (or rather REMOVE here?)

> +}
> +
> +static gboolean tcp_chr_destroy(QIOChannel *channel,
> +                                GIOCondition cond,
> +                                void *opaque)
> +{

It's a HUP handler, please call it that way to avoid confusion.

> +    Chardev *chr = CHARDEV(opaque);
> +    SocketChardev *s = SOCKET_CHARDEV(chr);
> +    tcp_chr_read(channel, cond, opaque);

I don't think we should call tcp_chr_read() unless the frontend is
ready to receive data

> +    if (s->connected != 0) {
> +        s->data_timer = g_timeout_add_seconds(DATA_TIMEOUT,
> +                                              tcp_chr_data_timeout, chr);
> +        if (s->destroy_tag != 0) {
> +            g_source_remove(s->destroy_tag);
> +            s->destroy_tag = 0;
> +        }
> +    }
> +    return TRUE;

Use G_SOURCE_CONTINUE (or REMOVE?)

> +}
> +
>  static int tcp_chr_sync_read(Chardev *chr, const uint8_t *buf, int len)
>  {
>      SocketChardev *s = SOCKET_CHARDEV(chr);
> @@ -517,6 +562,10 @@ static void tcp_chr_connect(void *opaque)
>                                             tcp_chr_read,
>                                             chr, chr->gcontext);
>      }
> +    if (s->destroy_tag == 0) {
> +        s->destroy_tag = qio_channel_add_watch(s->ioc, G_IO_HUP,
> +                                               tcp_chr_destroy, chr, NULL);
> +    }
>      qemu_chr_be_event(chr, CHR_EVENT_OPENED);
>  }
>
> @@ -535,7 +584,11 @@ static void tcp_chr_update_read_handler(Chardev *chr)
>                                             tcp_chr_read, chr,
>                                             chr->gcontext);
>      }
> -}
> +    if (s->destroy_tag == 0) {
> +        s->destroy_tag = qio_channel_add_watch(s->ioc, G_IO_HUP,
> +                                               tcp_chr_destroy, chr, NULL);
> +    }

Perhaps the watch can be created initially.

> +}
>
>  typedef struct {
>      Chardev *chr;
> --
> 2.14.3
On Tue, Jan 16, 2018 at 06:56:20PM +0100, Marc-André Lureau wrote:
> Hi
>
> On Wed, Jan 10, 2018 at 2:18 PM, Klim Kireev <klim.kireev@virtuozzo.com> wrote:
> > The following behavior was observed for QEMU configured by libvirt
> > to use guest agent as usual for the guests without virtio-serial
> > driver (Windows or the guest remaining in BIOS stage).
> > [...]
> > In libvirt connect() to guest agent is performed on restart and
> > is run under VM state lock. Connect() is blocking and can
> > wait forever.
> > In this case libvirt can not perform ANY operation on that VM.
>
> Adding Daniel in CC for comments about libvirt behaviour.

This is a libvirt bug - libvirt should put a finite timeout on
connecting to the guest agent socket to avoid this.

Regards,
Daniel
On 01/16/2018 08:25 PM, Paolo Bonzini wrote:
> On 10/01/2018 14:18, Klim Kireev wrote:
>> The following behavior was observed for QEMU configured by libvirt
>> to use guest agent as usual for the guests without virtio-serial
>> driver (Windows or the guest remaining in BIOS stage).
>> [...]
>> The problem is that QEMU doesn't add a read watcher after successful read
>> until the guest device wants to acquire received data, so
>> I propose to install a separate POLLHUP watcher regardless of
>> whether the device waits for data or not. After closing the connection,
>> the guest has a capability to read the data within timeout.
>
> I don't understand the timeout part.

The timeout is important because of the following situation: the client
sends a big bunch of data to the guest and closes its end of the socket.
If we just destroy the connection on QEMU's side, the guest cannot read
the data remaining in the channel (which can be much more than the
guest's buffer), although it would do so later. For this case a timeout
after POLLHUP is added. It is set after receiving POLLHUP and reset
after each read.

> Apart from that, maybe the bug is in io_watch_poll_prepare, which needs
> to _always_ set up a "G_IO_ERR | G_IO_HUP | G_IO_NVAL" watch. Only
> G_IO_IN depends on iwp->fd_can_read(...) > 0.
>
> So the change would start with something like this:
>
> [...]
>
> Thanks,
>
> Paolo
On 18/01/2018 10:41, klim wrote:
>>> I propose to install a separate POLLHUP watcher regardless of
>>> whether the device waits for data or not. After closing the connection,
>>> the guest has a capability to read the data within timeout.
>>
>> I don't understand the timeout part.
>
> The timeout is important, because of the following situation:
> the client sends a big bunch of data to the guest and closes its end
> of the socket. If we just destroy the connection on QEMU's side, the
> guest can't read the remaining data in the channel (which can be much
> more than the guest's buffer), although it would do so later.
> For this case a timeout after POLLHUP is added. It is set after
> receiving POLLHUP and reset after each read.

But why would one second be enough?

Paolo
On 01/18/2018 12:44 PM, Paolo Bonzini wrote:
> On 18/01/2018 10:41, klim wrote:
>> [...]
>> For this case a timeout after POLLHUP is added. It is set after
>> receiving POLLHUP and reset after each read.
>
> But why would one second be enough?

One second was merely a suggestion. Maybe 10 seconds would be better.

> Paolo
On Thu, Jan 18, 2018 at 12:41:08PM +0300, klim wrote:
> On 01/16/2018 08:25 PM, Paolo Bonzini wrote:
> > On 10/01/2018 14:18, Klim Kireev wrote:
> > > [...]
> > > I propose to install a separate POLLHUP watcher regardless of
> > > whether the device waits for data or not. After closing the connection,
> > > the guest has a capability to read the data within timeout.
> >
> > I don't understand the timeout part.
>
> The timeout is important, because of the following situation:
> the client sends a big bunch of data to the guest and closes its end
> of the socket. If we just destroy the connection on QEMU's side, the
> guest can't read the remaining data in the channel.

Why is that a problem that needs solving? IMHO if the client wants any
kind of assurance that the guest has read all the data, it should keep
the socket open and wait for a reply from the guest agent.

It is totally reasonable for the request to be dropped/partially sent
if the client closes the socket without waiting. The guest should be
robust to seeing such partial messages.

Regards,
Daniel
diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index 630a7f2995..eb120b2aab 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -37,6 +37,8 @@
 
 #define TCP_MAX_FDS 16
 
+#define DATA_TIMEOUT 1
+
 typedef struct {
     Chardev parent;
     QIOChannel *ioc; /* Client I/O channel */
@@ -57,6 +59,9 @@ typedef struct {
     bool is_telnet;
     bool is_tn3270;
 
+    guint data_timer;
+    guint destroy_tag;
+
     guint reconnect_timer;
     int64_t reconnect_time;
     bool connect_err_reported;
@@ -341,6 +346,15 @@ static void tcp_chr_free_connection(Chardev *chr)
         s->read_msgfds_num = 0;
     }
 
+    if (s->destroy_tag != 0) {
+        g_source_remove(s->destroy_tag);
+        s->destroy_tag = 0;
+    }
+    if (s->data_timer != 0) {
+        g_source_remove(s->data_timer);
+        s->data_timer = 0;
+    }
+
     tcp_set_msgfds(chr, NULL, 0);
     remove_fd_in_watch(chr);
     object_unref(OBJECT(s->sioc));
@@ -444,6 +458,37 @@ static gboolean tcp_chr_read(QIOChannel *chan, GIOCondition cond, void *opaque)
     return TRUE;
 }
 
+static gboolean tcp_chr_data_timeout(gpointer opaque)
+{
+    Chardev *chr = CHARDEV(opaque);
+
+    if (tcp_chr_read_poll(chr) <= 0) {
+        tcp_chr_disconnect(chr);
+        return TRUE;
+    } else {
+        tcp_chr_read(NULL, 0, opaque);
+        return TRUE;
+    }
+}
+
+static gboolean tcp_chr_destroy(QIOChannel *channel,
+                                GIOCondition cond,
+                                void *opaque)
+{
+    Chardev *chr = CHARDEV(opaque);
+    SocketChardev *s = SOCKET_CHARDEV(chr);
+    tcp_chr_read(channel, cond, opaque);
+    if (s->connected != 0) {
+        s->data_timer = g_timeout_add_seconds(DATA_TIMEOUT,
+                                              tcp_chr_data_timeout, chr);
+        if (s->destroy_tag != 0) {
+            g_source_remove(s->destroy_tag);
+            s->destroy_tag = 0;
+        }
+    }
+    return TRUE;
+}
+
 static int tcp_chr_sync_read(Chardev *chr, const uint8_t *buf, int len)
 {
     SocketChardev *s = SOCKET_CHARDEV(chr);
@@ -517,6 +562,10 @@ static void tcp_chr_connect(void *opaque)
                                            tcp_chr_read,
                                            chr, chr->gcontext);
     }
+    if (s->destroy_tag == 0) {
+        s->destroy_tag = qio_channel_add_watch(s->ioc, G_IO_HUP,
+                                               tcp_chr_destroy, chr, NULL);
+    }
     qemu_chr_be_event(chr, CHR_EVENT_OPENED);
 }
 
@@ -535,7 +584,11 @@ static void tcp_chr_update_read_handler(Chardev *chr)
                                            tcp_chr_read, chr,
                                            chr->gcontext);
     }
-}
+    if (s->destroy_tag == 0) {
+        s->destroy_tag = qio_channel_add_watch(s->ioc, G_IO_HUP,
+                                               tcp_chr_destroy, chr, NULL);
+    }
+}
 
 typedef struct {
     Chardev *chr;
The following behavior was observed for QEMU configured by libvirt to use
the guest agent as usual for guests without a virtio-serial driver
(Windows, or the guest remaining in the BIOS stage).

In QEMU, on the first connect to a listening character device socket the
listen socket is removed from poll just after the accept().
virtio_serial_guest_ready() returns 0 and the descriptor of the connected
Unix socket is removed from poll; it will not be present in poll() until
the guest initializes the driver and changes the state of the serial to
"guest connected".

In libvirt, connect() to the guest agent is performed on restart and runs
under the VM state lock. Connect() is blocking and can wait forever. In
this case libvirt cannot perform ANY operation on that VM.

The bug can be easily reproduced this way:

Terminal 1:
qemu-system-x86_64 -m 512 -device pci-serial,chardev=serial1 -chardev socket,id=serial1,path=/tmp/console.sock,server,nowait
(virtio-serial and isa-serial also fit)

Terminal 2:
minicom -D unix\#/tmp/console.sock
(type something and press enter)
C-a x (to exit)

Do 3 times:
minicom -D unix\#/tmp/console.sock
C-a x

It needs 4 connections, because the first one is accepted by QEMU, then
two are queued by the kernel, and the 4th blocks.

The problem is that QEMU doesn't add a read watcher after a successful
read until the guest device wants to acquire the received data, so I
propose to install a separate POLLHUP watcher regardless of whether the
device waits for data or not. After the connection is closed, the guest
still has a chance to read the remaining data within a timeout.

Signed-off-by: Klim Kireev <klim.kireev@virtuozzo.com>
---
 chardev/char-socket.c | 55 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 54 insertions(+), 1 deletion(-)