diff mbox series

[3/3] hw/char/serial: Don't retry on serial_xmit if errno == EPIPE

Message ID 20180531074601.10647-4-slp@redhat.com
State New
Headers show
Series Avoid retrying on serial_xmit with EPIPE | expand

Commit Message

Sergio Lopez May 31, 2018, 7:46 a.m. UTC
If writing to the frontend channel failed with EPIPE, don't set up a
retry. EPIPE is not a recoverable error, so trying again is a waste of CPU
cycles.

If the vCPU writing to the serial device and emulator thread are pinned
to the same pCPU, it can also compromise the stability of the Guest OS,
as both threads will be competing for pCPU's time, with the vCPU
actively polling the serial device and barely giving time to the
emulator thread to make actual progress.
---
 hw/char/serial.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Marc-André Lureau May 31, 2018, 9:52 a.m. UTC | #1
Hi

On Thu, May 31, 2018 at 9:46 AM, Sergio Lopez <slp@redhat.com> wrote:
> If writing to the frontend channel failed with EPIPE, don't set up a
> retry. EPIPE is not a recoverable error, so trying again is a waste of CPU
> cycles.
>
> If the vCPU writing to the serial device and emulator thread are pinned
> to the same pCPU, it can also compromise the stability of the Guest OS,
> as both threads will be competing for pCPU's time, with the vCPU
> actively polling the serial device and barely giving time to the
> emulator thread to make actual progress.
> ---
>  hw/char/serial.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/hw/char/serial.c b/hw/char/serial.c
> index 2c080c9..f26e86b 100644
> --- a/hw/char/serial.c
> +++ b/hw/char/serial.c
> @@ -262,6 +262,7 @@ static void serial_xmit(SerialState *s)
>              /* in loopback mode, say that we just received a char */
>              serial_receive1(s, &s->tsr, 1);
>          } else if (qemu_chr_fe_write(&s->chr, &s->tsr, 1) != 1 &&
> +                   errno != EPIPE &&
>                     s->tsr_retry < MAX_XMIT_RETRY) {

Instead of adding explicit handling of EPIPE, shouldn't the code be
rewritten to treat -1 return && errno != EAGAIN as fatal?

>              assert(s->watch_tag == 0);
>              s->watch_tag =
> --
> 1.8.3.1
>
>
Sergio Lopez May 31, 2018, 11:03 a.m. UTC | #2
On Thu, May 31, 2018 at 11:52:01AM +0200, Marc-André Lureau wrote:
> On Thu, May 31, 2018 at 9:46 AM, Sergio Lopez <slp@redhat.com> wrote:
> > If writing to the frontend channel failed with EPIPE, don't set up a
> > retry. EPIPE is not a recoverable error, so trying again is a waste of CPU
> > cycles.
> >
> > If the vCPU writing to the serial device and emulator thread are pinned
> > to the same pCPU, it can also compromise the stability of the Guest OS,
> > as both threads will be competing for pCPU's time, with the vCPU
> > actively polling the serial device and barely giving time to the
> > emulator thread to make actual progress.
> > ---
> >  hw/char/serial.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/hw/char/serial.c b/hw/char/serial.c
> > index 2c080c9..f26e86b 100644
> > --- a/hw/char/serial.c
> > +++ b/hw/char/serial.c
> > @@ -262,6 +262,7 @@ static void serial_xmit(SerialState *s)
> >              /* in loopback mode, say that we just received a char */
> >              serial_receive1(s, &s->tsr, 1);
> >          } else if (qemu_chr_fe_write(&s->chr, &s->tsr, 1) != 1 &&
> > +                   errno != EPIPE &&
> >                     s->tsr_retry < MAX_XMIT_RETRY) {
> 
> Instead of adding explicit handling of EPIPE, shouldn't the code be
> rewritten to treat -1 return && errno != EAGAIN as fatal?

What kind of action should it trigger if treating errno != EAGAIN as fatal?

If you meant something like calling 'abort()', I think that, while this
could be considered as the "correct" behavior, it's a bit unpractical
for the users. The most common situation for this to happen is
restarting virtlogd while having the console/serial redirected to it.
The VM will stop writing to the serial device, but otherwise the Guest
OS can probably keep working without any issues.

Sergio.
Marc-André Lureau May 31, 2018, 11:07 a.m. UTC | #3
Hi

On Thu, May 31, 2018 at 1:03 PM, Sergio Lopez <slp@redhat.com> wrote:
> On Thu, May 31, 2018 at 11:52:01AM +0200, Marc-André Lureau wrote:
>> On Thu, May 31, 2018 at 9:46 AM, Sergio Lopez <slp@redhat.com> wrote:
>> > If writing to the frontend channel failed with EPIPE, don't set up a
>> > retry. EPIPE is not a recoverable error, so trying again is a waste of CPU
>> > cycles.
>> >
>> > If the vCPU writing to the serial device and emulator thread are pinned
>> > to the same pCPU, it can also compromise the stability of the Guest OS,
>> > as both threads will be competing for pCPU's time, with the vCPU
>> > actively polling the serial device and barely giving time to the
>> > emulator thread to make actual progress.
>> > ---
>> >  hw/char/serial.c | 1 +
>> >  1 file changed, 1 insertion(+)
>> >
>> > diff --git a/hw/char/serial.c b/hw/char/serial.c
>> > index 2c080c9..f26e86b 100644
>> > --- a/hw/char/serial.c
>> > +++ b/hw/char/serial.c
>> > @@ -262,6 +262,7 @@ static void serial_xmit(SerialState *s)
>> >              /* in loopback mode, say that we just received a char */
>> >              serial_receive1(s, &s->tsr, 1);
>> >          } else if (qemu_chr_fe_write(&s->chr, &s->tsr, 1) != 1 &&
>> > +                   errno != EPIPE &&
>> >                     s->tsr_retry < MAX_XMIT_RETRY) {
>>
>> Instead of adding explicit handling of EPIPE, shouldn't the code be
>> rewritten to treat -1 return && errno != EAGAIN as fatal?
>
> What kind of action should it trigger if treating errno != EAGAIN as fatal?

"fatal" was perhaps the wrong word. I mean that we don't go into the
retry loop and treat the chardev as disconnected (without explicit
handling of EPIPE).

>
> If you meant something like calling 'abort()', I think that, while this
> could be considered as the "correct" behavior, it's a bit unpractical
> for the users. The most common situation for this to happen is
> restarting virtlogd while having the console/serial redirected to it.
> The VM will stop writing to the serial device, but otherwise the Guest
> OS can probably keep working without any issues.
>
> Sergio.
Daniel P. Berrangé June 1, 2018, 11:52 a.m. UTC | #4
On Thu, May 31, 2018 at 11:52:01AM +0200, Marc-André Lureau wrote:
> Hi
> 
> On Thu, May 31, 2018 at 9:46 AM, Sergio Lopez <slp@redhat.com> wrote:
> > If writing to the frontend channel failed with EPIPE, don't set up a
> > retry. EPIPE is not a recoverable error, so trying again is a waste of CPU
> > cycles.
> >
> > If the vCPU writing to the serial device and emulator thread are pinned
> > to the same pCPU, it can also compromise the stability of the Guest OS,
> > as both threads will be competing for pCPU's time, with the vCPU
> > actively polling the serial device and barely giving time to the
> > emulator thread to make actual progress.
> > ---
> >  hw/char/serial.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/hw/char/serial.c b/hw/char/serial.c
> > index 2c080c9..f26e86b 100644
> > --- a/hw/char/serial.c
> > +++ b/hw/char/serial.c
> > @@ -262,6 +262,7 @@ static void serial_xmit(SerialState *s)
> >              /* in loopback mode, say that we just received a char */
> >              serial_receive1(s, &s->tsr, 1);
> >          } else if (qemu_chr_fe_write(&s->chr, &s->tsr, 1) != 1 &&
> > +                   errno != EPIPE &&
> >                     s->tsr_retry < MAX_XMIT_RETRY) {
> 
> Instead of adding explicit handling of EPIPE, shouldn't the code be
> rewritten to treat -1 return && errno != EAGAIN as fatal?

Yes, exactly this code is already broken for every single errno
value, not simply EPIPE. It needs fixing to treat '-1' return code
correctly instead of retrying everything.

> 
> >              assert(s->watch_tag == 0);
> >              s->watch_tag =

Regards,
Daniel
Sergio Lopez June 1, 2018, 12:05 p.m. UTC | #5
On Fri, Jun 01, 2018 at 12:52:12PM +0100, Daniel P. Berrangé wrote:
> On Thu, May 31, 2018 at 11:52:01AM +0200, Marc-André Lureau wrote:
> > Hi
> > 
> > On Thu, May 31, 2018 at 9:46 AM, Sergio Lopez <slp@redhat.com> wrote:
> > > If writing to the frontend channel failed with EPIPE, don't set up a
> > > retry. EPIPE is not a recoverable error, so trying again is a waste of CPU
> > > cycles.
> > >
> > > If the vCPU writing to the serial device and emulator thread are pinned
> > > to the same pCPU, it can also compromise the stability of the Guest OS,
> > > as both threads will be competing for pCPU's time, with the vCPU
> > > actively polling the serial device and barely giving time to the
> > > emulator thread to make actual progress.
> > > ---
> > >  hw/char/serial.c | 1 +
> > >  1 file changed, 1 insertion(+)
> > >
> > > diff --git a/hw/char/serial.c b/hw/char/serial.c
> > > index 2c080c9..f26e86b 100644
> > > --- a/hw/char/serial.c
> > > +++ b/hw/char/serial.c
> > > @@ -262,6 +262,7 @@ static void serial_xmit(SerialState *s)
> > >              /* in loopback mode, say that we just received a char */
> > >              serial_receive1(s, &s->tsr, 1);
> > >          } else if (qemu_chr_fe_write(&s->chr, &s->tsr, 1) != 1 &&
> > > +                   errno != EPIPE &&
> > >                     s->tsr_retry < MAX_XMIT_RETRY) {
> > 
> > Instead of adding explicit handling of EPIPE, shouldn't the code be
> > rewritten to treat -1 return && errno != EAGAIN as fatal?
> 
> Yes, exactly this code is already broken for every single errno
> value, not simply EPIPE. It needs fixing to treat '-1' return code
> correctly instead of retrying everything.

Given that EAGAIN is already taken care of in
chardev/char.c:qemu_chr_write_buffer, in which cases should we retry? Or
should we just drop all the tsr_retry logic?

Sergio.
Paolo Bonzini June 1, 2018, 12:16 p.m. UTC | #6
On 01/06/2018 14:05, Sergio Lopez wrote:
>>> Instead of adding explicit handling of EPIPE, shouldn't the code be
>>> rewritten to treat -1 return && errno != EAGAIN as fatal?
>> Yes, exactly this code is already broken for every single errno
>> value, not simply EPIPE. It needs fixing to treat '-1' return code
>> correctly instead of retrying everything.
> Given that EAGAIN is already taken care of in
> chardev/char.c:qemu_chr_write_buffer, in which cases should we retry? Or
> should we just drop all the tsr_retry logic?

It's not handled there, the call from qemu_chr_fe_write has write_all ==
false.

Paolo
Sergio Lopez June 1, 2018, 12:28 p.m. UTC | #7
On Fri, Jun 01, 2018 at 02:16:59PM +0200, Paolo Bonzini wrote:
> On 01/06/2018 14:05, Sergio Lopez wrote:
> >>> Instead of adding explicit handling of EPIPE, shouldn't the code be
> >>> rewritten to treat -1 return && errno != EAGAIN as fatal?
> >> Yes, exactly this code is already broken for every single errno
> >> value, not simply EPIPE. It needs fixing to treat '-1' return code
> >> correctly instead of retrying everything.
> > Given that EAGAIN is already taken care of in
> > chardev/char.c:qemu_chr_write_buffer, in which cases should we retry? Or
> > should we just drop all the tsr_retry logic?
> 
> It's not handled there, the call from qemu_chr_fe_write has write_all ==
> false.

You're right. So should we just retry only for -1 && errno == EAGAIN and
just ignore the error otherwise?

Sergio.
Paolo Bonzini June 1, 2018, 12:31 p.m. UTC | #8
On 01/06/2018 14:28, Sergio Lopez wrote:
> On Fri, Jun 01, 2018 at 02:16:59PM +0200, Paolo Bonzini wrote:
>> On 01/06/2018 14:05, Sergio Lopez wrote:
>>>>> Instead of adding explicit handling of EPIPE, shouldn't the code be
>>>>> rewritten to treat -1 return && errno != EAGAIN as fatal?
>>>> Yes, exactly this code is already broken for every single errno
>>>> value, not simply EPIPE. It needs fixing to treat '-1' return code
>>>> correctly instead of retrying everything.
>>> Given that EAGAIN is already taken care of in
>>> chardev/char.c:qemu_chr_write_buffer, in which cases should we retry? Or
>>> should we just drop all the tsr_retry logic?
>>
>> It's not handled there, the call from qemu_chr_fe_write has write_all ==
>> false.
> 
> You're right. So should we just retry only for -1 && errno == EAGAIN and
> just ignore the error otherwise?

Probably, yes.

Paolo
Peter Maydell June 1, 2018, 12:43 p.m. UTC | #9
On 1 June 2018 at 13:28, Sergio Lopez <slp@redhat.com> wrote:
> On Fri, Jun 01, 2018 at 02:16:59PM +0200, Paolo Bonzini wrote:
>> On 01/06/2018 14:05, Sergio Lopez wrote:
>> >>> Instead of adding explicit handling of EPIPE, shouldn't the code be
>> >>> rewritten to treat -1 return && errno != EAGAIN as fatal?
>> >> Yes, exactly this code is already broken for every single errno
>> >> value, not simply EPIPE. It needs fixing to treat '-1' return code
>> >> correctly instead of retrying everything.
>> > Given that EAGAIN is already taken care of in
>> > chardev/char.c:qemu_chr_write_buffer, in which cases should we retry? Or
>> > should we just drop all the tsr_retry logic?
>>
>> It's not handled there, the call from qemu_chr_fe_write has write_all ==
>> false.
>
> You're right. So should we just retry only for -1 && errno == EAGAIN and
> just ignore the error otherwise?

I don't think we should make all of qemu_chr_fe_write()'s callers
have to handle EAGAIN. That function already has a way to tell
the caller that it didn't consume any data, which is to return 0.

(EAGAIN is a curse and we should strive to avoid it spreading its
poison into areas of the codebase that we can keep it out of.)

In general there are three cases I think qemu_chr_fe_write() callers
might care about:
 * data was consumed (return >0)
 * data wasn't consumed, try again later (return 0)
 * data wasn't consumed, and a later attempt won't work either
   (return -1)

NB that hw/char/serial.c is not the only serial device using
this pattern and probably needing adjustment (though I think it's
the only one with the complicated tsr_retry logic).

In the patch that started this thread the report is that we
use lots of CPU. Can you explain why that happens? I was expecting
that we would set up the qemu_chr_fe_add_watch() on the dead
chr FE and it would just never fire since the FE can never
accept further data...

thanks
-- PMM
Sergio Lopez June 1, 2018, 4:20 p.m. UTC | #10
On Fri, Jun 01, 2018 at 01:43:56PM +0100, Peter Maydell wrote:
> On 1 June 2018 at 13:28, Sergio Lopez <slp@redhat.com> wrote:
> > On Fri, Jun 01, 2018 at 02:16:59PM +0200, Paolo Bonzini wrote:
> >> On 01/06/2018 14:05, Sergio Lopez wrote:
> >> >>> Instead of adding explicit handling of EPIPE, shouldn't the code be
> >> >>> rewritten to treat -1 return && errno != EAGAIN as fatal?
> >> >> Yes, exactly this code is already broken for every single errno
> >> >> value, not simply EPIPE. It needs fixing to treat '-1' return code
> >> >> correctly instead of retrying everything.
> >> > Given that EAGAIN is already taken care of in
> >> > chardev/char.c:qemu_chr_write_buffer, in which cases should we retry? Or
> >> > should we just drop all the tsr_retry logic?
> >>
> >> It's not handled there, the call from qemu_chr_fe_write has write_all ==
> >> false.
> >
> > You're right. So should we just retry only for -1 && errno == EAGAIN and
> > just ignore the error otherwise?
> 
> I don't think we should make all of qemu_chr_fe_write()'s callers
> have to handle EAGAIN. That function already has a way to tell
> the caller that it didn't consume any data, which is to return 0.
> 
> (EAGAIN is a curse and we should strive to avoid it spreading its
> poison into areas of the codebase that we can keep it out of.)
> 
> In general there are three cases I think qemu_chr_fe_write() callers
> might care about:
>  * data was consumed (return >0)
>  * data wasn't consumed, try again later (return 0)
>  * data wasn't consumed, and a later attempt won't work either
>    (return -1)
> 
> NB that hw/char/serial.c is not the only serial device using
> this pattern and probably needing adjustment (though I think it's
> the only one with the complicated tsr_retry logic).
> 
> In the patch that started this thread the report is that we
> use lots of CPU. Can you explain why that happens? I was expecting
> that we would set up the qemu_chr_fe_add_watch() on the dead
> chr FE and it would just never fire since the FE can never
> accept further data...

The callback is indeed triggered (I haven't looked why yet), and both
the main thread and the vCPU that issued the write to the serial port
(which is now polling the device to see if the transmitter is empty) are
fighting for the qemu_global_mutex.

This issue is heavily amplified if both threads are sharing the same
pCPU (by chance or pinning).

Some debugging info from a simulation:


 - vCPU thread waiting to acquire qemu_global_mutex:

Thread 3 (Thread 0x7fb6c7fff700 (LWP 31641)):
#0  0x00007fb6e591c4cd in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007fb6e5917dcb in _L_lock_812 () at /lib64/libpthread.so.0
#2  0x00007fb6e5917c98 in __GI___pthread_mutex_lock (mutex=mutex@entry=0x556dfb7b6420 <qemu_global_mutex>) at ../nptl/pthread_mutex_lock.c:79
#3  0x0000556dfaebcfd7 in qemu_mutex_lock_impl (mutex=mutex@entry=0x556dfb7b6420 <qemu_global_mutex>, file=file@entry=0x556dfaf57477 "/root/Projects/qemu/cpus.c", line=line@entry=1765)
    at util/qemu-thread-posix.c:67
#4  0x0000556dfaacfe58 in qemu_mutex_lock_iothread () at /root/Projects/qemu/cpus.c:1765
#5  0x0000556dfaa90bcd in prepare_mmio_access (mr=0x556dfe71c0e0, mr=0x556dfe71c0e0)
    at /root/Projects/qemu/exec.c:3068
#6  0x0000556dfaa95f98 in flatview_read_continue (fv=fv@entry=0x7fb6c028aa80, addr=addr@entry=1021, attrs=..., attrs@entry=..., buf=buf@entry=0x7fb6e739f000 "", len=len@entry=1, addr1=5, l=1, mr=0x556dfe71c0e0)
    at /root/Projects/qemu/exec.c:3189
---Type <return> to continue, or q <return> to quit---
#7  0x0000556dfaa9617c in flatview_read (fv=0x7fb6c028aa80, addr=1021, attrs=..., buf=0x7fb6e739f000 "", len=1) at /root/Projects/qemu/exec.c:3255
#8  0x0000556dfaa9629f in address_space_read_full (as=<optimized out>, addr=addr@entry=1021, attrs=..., buf=<optimized out>, len=len@entry=1) at /root/Projects/qemu/exec.c:3268
#9  0x0000556dfaa963fa in address_space_rw (as=<optimized out>, addr=addr@entry=1021, attrs=...,
    attrs@entry=..., buf=<optimized out>, len=len@entry=1, is_write=is_write@entry=false)
    at /root/Projects/qemu/exec.c:3298
#10 0x0000556dfaaf6536 in kvm_cpu_exec (count=1, size=1, direction=<optimized out>, data=<optimized out>, attrs=..., port=1021) at /root/Projects/qemu/accel/kvm/kvm-all.c:1730
#11 0x0000556dfaaf6536 in kvm_cpu_exec (cpu=cpu@entry=0x556dfd3d8020)
    at /root/Projects/qemu/accel/kvm/kvm-all.c:1970
#12 0x0000556dfaad0006 in qemu_kvm_cpu_thread_fn (arg=0x556dfd3d8020) at /root/Projects/qemu/cpus.c:1215
#13 0x00007fb6e5915dd5 in start_thread (arg=0x7fb6c7fff700) at pthread_create.c:308
#14 0x00007fb6d9e97b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113


 - The owner is LWP 31634, the main thread

 (gdb) p qemu_global_mutex
$1 = {lock = {__data = {__lock = 2, __count = 0, __owner = 31634, __nusers = 1, __kind = 0, __spins = 0,
      __elision = 0, __list = {__prev = 0x0, __next = 0x0}},
    __size = "\002\000\000\000\000\000\000\000\222{\000\000\001", '\000' <repeats 26 times>,
    __align = 2}, initialized = true}


 - The main thead is in the callback

 Thread 1 (Thread 0x7fb6e7355cc0 (LWP 31634)):
---Type <return> to continue, or q <return> to quit---
#0  0x0000556dfac59420 in serial_watch_cb (chan=0x556dfd321400, cond=G_IO_OUT, opaque=0x556dfe71c000)
    at hw/char/serial.c:233
#1  0x00007fb6e68638f9 in g_main_context_dispatch (context=0x556dfd314210) at gmain.c:3146
#2  0x00007fb6e68638f9 in g_main_context_dispatch (context=context@entry=0x556dfd314210) at gmain.c:3811
#3  0x0000556dfaeba126 in main_loop_wait () at util/main-loop.c:215
#4  0x0000556dfaeba126 in main_loop_wait (timeout=<optimized out>) at util/main-loop.c:263
#5  0x0000556dfaeba126 in main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:522
#6  0x0000556dfaa89a2f in main () at vl.c:1943
#7  0x0000556dfaa89a2f in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>)
    at vl.c:4679


Given your question, I was wondering what would happen if the callback
never fired. The result is much, much worse, as QEMU doesn't set
UART_LSR_TEMT, and the Guest OS (RHEL7.5) keeps polling the serial
device for ~10 seconds. Of course this can be regarded as a issue if the
Guest's serial driver, but I'm pretty sure we don't want to break it ;-)

This is not the first time I'm fighting the retry logic in serial.c.
Some versions back, serial_xmit would happily add up watchers with no
limit, eventually exceeding FD_SETSIZE limit. I'm starting to thing that
tsr_retry logic does more harm than good.

In any case, I think the question on the table is when should
serial_xmit retry writing to the FE, and we seem to have two options:

 1. When ret != 1 and errno == EAGAIN.

 2. When ret == 0.

I'm good with either, so I leave the decision to you.

Sergio.
diff mbox series

Patch

diff --git a/hw/char/serial.c b/hw/char/serial.c
index 2c080c9..f26e86b 100644
--- a/hw/char/serial.c
+++ b/hw/char/serial.c
@@ -262,6 +262,7 @@  static void serial_xmit(SerialState *s)
             /* in loopback mode, say that we just received a char */
             serial_receive1(s, &s->tsr, 1);
         } else if (qemu_chr_fe_write(&s->chr, &s->tsr, 1) != 1 &&
+                   errno != EPIPE &&
                    s->tsr_retry < MAX_XMIT_RETRY) {
             assert(s->watch_tag == 0);
             s->watch_tag =