Patchwork [RESEND] Make char muxer more robust wrt small FIFOs

login
register
mail settings
Submitter Alexander Graf
Date April 20, 2010, 4:56 p.m.
Message ID <1271782566-16420-1-git-send-email-agraf@suse.de>
Download mbox | patch
Permalink /patch/50548/
State New
Headers show

Comments

Alexander Graf - April 20, 2010, 4:56 p.m.
Virtio-Console can only process one character at a time. Using it on S390
gave me strage "lags" where I got the character I pressed before when
pressing one. So I typed in "abc" and only received "a", then pressed "d"
but the guest received "b" and so on.

While the stdio driver calls a poll function that just processes on its
queue in case virtio-console can't take multiple characters at once, the
muxer does not have such callbacks, so it can't empty its queue.

To work around that limitation, I introduced a new timer that only gets
active when the guest can not receive any more characters. In that case
it polls again after a while to check if the guest is now receiving input.

This patch fixes input when using -nographic on s390 for me.

---

Please consider for stable.
---
 qemu-char.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)
Anthony Liguori - May 4, 2010, 1:44 p.m.
On 04/20/2010 11:56 AM, Alexander Graf wrote:
> Virtio-Console can only process one character at a time. Using it on S390
> gave me strage "lags" where I got the character I pressed before when
> pressing one. So I typed in "abc" and only received "a", then pressed "d"
> but the guest received "b" and so on.
>
> While the stdio driver calls a poll function that just processes on its
> queue in case virtio-console can't take multiple characters at once, the
> muxer does not have such callbacks, so it can't empty its queue.
>
> To work around that limitation, I introduced a new timer that only gets
> active when the guest can not receive any more characters. In that case
> it polls again after a while to check if the guest is now receiving input.
>
> This patch fixes input when using -nographic on s390 for me.
>    

I think this is really a kvm issue.  I assume it's because s390 idles in 
the kernel so you never drop to userspace to repoll the descriptor.

A timer is a hacky solution.  You really need to use an io thread to 
solve this and then you need to switch away from qemu_set_fd_handler2 to 
qemu_set_fd_handler() and make sure that the later breaks select 
whenever it's invoked.

Regards,

Anthony Liguori

> ---
>
> Please consider for stable.
> ---
>   qemu-char.c |   10 ++++++++++
>   1 files changed, 10 insertions(+), 0 deletions(-)
>
> diff --git a/qemu-char.c b/qemu-char.c
> index 05df971..ce9df3a 100644
> --- a/qemu-char.c
> +++ b/qemu-char.c
> @@ -235,6 +235,7 @@ typedef struct {
>       IOEventHandler *chr_event[MAX_MUX];
>       void *ext_opaque[MAX_MUX];
>       CharDriverState *drv;
> +    QEMUTimer *accept_timer;
>       int focus;
>       int mux_cnt;
>       int term_got_escape;
> @@ -396,6 +397,13 @@ static void mux_chr_accept_input(CharDriverState *chr)
>           d->chr_read[m](d->ext_opaque[m],
>                          &d->buffer[m][d->cons[m]++&  MUX_BUFFER_MASK], 1);
>       }
> +
> +    /* We're still not able to sync producer and consumer, so let's wait a bit
> +       and try again by then. */
> +    if (d->prod[m] != d->cons[m]) {
> +        qemu_mod_timer(d->accept_timer, qemu_get_clock(vm_clock)
> +                                        + (int64_t)100000);
> +    }
>   }
>
>   static int mux_chr_can_read(void *opaque)
> @@ -478,6 +486,8 @@ static CharDriverState *qemu_chr_open_mux(CharDriverState *drv)
>       chr->opaque = d;
>       d->drv = drv;
>       d->focus = -1;
> +    d->accept_timer = qemu_new_timer(vm_clock,
> +                                     (QEMUTimerCB*)mux_chr_accept_input, chr);
>       chr->chr_write = mux_chr_write;
>       chr->chr_update_read_handler = mux_chr_update_read_handler;
>       chr->chr_accept_input = mux_chr_accept_input;
>
Alexander Graf - May 4, 2010, 2:30 p.m.
Am 04.05.2010 um 15:44 schrieb Anthony Liguori <anthony@codemonkey.ws>:

> On 04/20/2010 11:56 AM, Alexander Graf wrote:
>> Virtio-Console can only process one character at a time. Using it  
>> on S390
>> gave me strage "lags" where I got the character I pressed before when
>> pressing one. So I typed in "abc" and only received "a", then  
>> pressed "d"
>> but the guest received "b" and so on.
>>
>> While the stdio driver calls a poll function that just processes on  
>> its
>> queue in case virtio-console can't take multiple characters at  
>> once, the
>> muxer does not have such callbacks, so it can't empty its queue.
>>
>> To work around that limitation, I introduced a new timer that only  
>> gets
>> active when the guest can not receive any more characters. In that  
>> case
>> it polls again after a while to check if the guest is now receiving  
>> input.
>>
>> This patch fixes input when using -nographic on s390 for me.
>>
>
> I think this is really a kvm issue.  I assume it's because s390  
> idles in the kernel so you never drop to userspace to repoll the  
> descriptor.

There is no polling for the muxer. That's why it never knows when  
virtio-console can receive again.

This patch basically adfs timer based polling for that exact case.


Alex

>
> A timer is a hacky solution.  You really need to use an io thread to  
> solve this and then you need to switch away from  
> qemu_set_fd_handler2 to qemu_set_fd_handler() and make sure that the  
> later breaks select whenever it's invoked.
>
> Regards,
>
> Anthony Liguori
>
>> ---
>>
>> Please consider for stable.
>> ---
>>  qemu-char.c |   10 ++++++++++
>>  1 files changed, 10 insertions(+), 0 deletions(-)
>>
>> diff --git a/qemu-char.c b/qemu-char.c
>> index 05df971..ce9df3a 100644
>> --- a/qemu-char.c
>> +++ b/qemu-char.c
>> @@ -235,6 +235,7 @@ typedef struct {
>>      IOEventHandler *chr_event[MAX_MUX];
>>      void *ext_opaque[MAX_MUX];
>>      CharDriverState *drv;
>> +    QEMUTimer *accept_timer;
>>      int focus;
>>      int mux_cnt;
>>      int term_got_escape;
>> @@ -396,6 +397,13 @@ static void mux_chr_accept_input 
>> (CharDriverState *chr)
>>          d->chr_read[m](d->ext_opaque[m],
>>                         &d->buffer[m][d->cons[m]++&   
>> MUX_BUFFER_MASK], 1);
>>      }
>> +
>> +    /* We're still not able to sync producer and consumer, so  
>> let's wait a bit
>> +       and try again by then. */
>> +    if (d->prod[m] != d->cons[m]) {
>> +        qemu_mod_timer(d->accept_timer, qemu_get_clock(vm_clock)
>> +                                        + (int64_t)100000);
>> +    }
>>  }
>>
>>  static int mux_chr_can_read(void *opaque)
>> @@ -478,6 +486,8 @@ static CharDriverState *qemu_chr_open_mux 
>> (CharDriverState *drv)
>>      chr->opaque = d;
>>      d->drv = drv;
>>      d->focus = -1;
>> +    d->accept_timer = qemu_new_timer(vm_clock,
>> +                                     (QEMUTimerCB*) 
>> mux_chr_accept_input, chr);
>>      chr->chr_write = mux_chr_write;
>>      chr->chr_update_read_handler = mux_chr_update_read_handler;
>>      chr->chr_accept_input = mux_chr_accept_input;
>>
>
Anthony Liguori - May 4, 2010, 2:34 p.m.
On 05/04/2010 09:30 AM, Alexander Graf wrote:
>
>
> Am 04.05.2010 um 15:44 schrieb Anthony Liguori <anthony@codemonkey.ws>:
>
>> On 04/20/2010 11:56 AM, Alexander Graf wrote:
>>> Virtio-Console can only process one character at a time. Using it on 
>>> S390
>>> gave me strage "lags" where I got the character I pressed before when
>>> pressing one. So I typed in "abc" and only received "a", then 
>>> pressed "d"
>>> but the guest received "b" and so on.
>>>
>>> While the stdio driver calls a poll function that just processes on its
>>> queue in case virtio-console can't take multiple characters at once, 
>>> the
>>> muxer does not have such callbacks, so it can't empty its queue.
>>>
>>> To work around that limitation, I introduced a new timer that only gets
>>> active when the guest can not receive any more characters. In that case
>>> it polls again after a while to check if the guest is now receiving 
>>> input.
>>>
>>> This patch fixes input when using -nographic on s390 for me.
>>>
>>
>> I think this is really a kvm issue.  I assume it's because s390 idles 
>> in the kernel so you never drop to userspace to repoll the descriptor.
>
> There is no polling for the muxer. That's why it never knows when 
> virtio-console can receive again.

Maybe I'm missing something simple, but it looks to me like the muxer is 
polling.  mux_chr_can_read() is going to eventually poll the muxed 
devices to figure this out.

If the root of the problem is that mux_chr_can_read() isn't being 
invoked for a prolonged period of time, the real issue is the problem I 
described.

Regards,

Anthony Liguori

> This patch basically adfs timer based polling for that exact case.
>
>
> Alex
>
>>
>> A timer is a hacky solution.  You really need to use an io thread to 
>> solve this and then you need to switch away from qemu_set_fd_handler2 
>> to qemu_set_fd_handler() and make sure that the later breaks select 
>> whenever it's invoked.
>>
>> Regards,
>>
>> Anthony Liguori
>>
>>> ---
>>>
>>> Please consider for stable.
>>> ---
>>>  qemu-char.c |   10 ++++++++++
>>>  1 files changed, 10 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/qemu-char.c b/qemu-char.c
>>> index 05df971..ce9df3a 100644
>>> --- a/qemu-char.c
>>> +++ b/qemu-char.c
>>> @@ -235,6 +235,7 @@ typedef struct {
>>>      IOEventHandler *chr_event[MAX_MUX];
>>>      void *ext_opaque[MAX_MUX];
>>>      CharDriverState *drv;
>>> +    QEMUTimer *accept_timer;
>>>      int focus;
>>>      int mux_cnt;
>>>      int term_got_escape;
>>> @@ -396,6 +397,13 @@ static void 
>>> mux_chr_accept_input(CharDriverState *chr)
>>>          d->chr_read[m](d->ext_opaque[m],
>>> &d->buffer[m][d->cons[m]++&  MUX_BUFFER_MASK], 1);
>>>      }
>>> +
>>> +    /* We're still not able to sync producer and consumer, so let's 
>>> wait a bit
>>> +       and try again by then. */
>>> +    if (d->prod[m] != d->cons[m]) {
>>> +        qemu_mod_timer(d->accept_timer, qemu_get_clock(vm_clock)
>>> +                                        + (int64_t)100000);
>>> +    }
>>>  }
>>>
>>>  static int mux_chr_can_read(void *opaque)
>>> @@ -478,6 +486,8 @@ static CharDriverState 
>>> *qemu_chr_open_mux(CharDriverState *drv)
>>>      chr->opaque = d;
>>>      d->drv = drv;
>>>      d->focus = -1;
>>> +    d->accept_timer = qemu_new_timer(vm_clock,
>>> +                                     
>>> (QEMUTimerCB*)mux_chr_accept_input, chr);
>>>      chr->chr_write = mux_chr_write;
>>>      chr->chr_update_read_handler = mux_chr_update_read_handler;
>>>      chr->chr_accept_input = mux_chr_accept_input;
>>>
>>
Alexander Graf - May 4, 2010, 4:01 p.m.
Am 04.05.2010 um 16:34 schrieb Anthony Liguori <anthony@codemonkey.ws>:

> On 05/04/2010 09:30 AM, Alexander Graf wrote:
>>
>>
>> Am 04.05.2010 um 15:44 schrieb Anthony Liguori  
>> <anthony@codemonkey.ws>:
>>
>>> On 04/20/2010 11:56 AM, Alexander Graf wrote:
>>>> Virtio-Console can only process one character at a time. Using it  
>>>> on S390
>>>> gave me strage "lags" where I got the character I pressed before  
>>>> when
>>>> pressing one. So I typed in "abc" and only received "a", then  
>>>> pressed "d"
>>>> but the guest received "b" and so on.
>>>>
>>>> While the stdio driver calls a poll function that just processes  
>>>> on its
>>>> queue in case virtio-console can't take multiple characters at  
>>>> once, the
>>>> muxer does not have such callbacks, so it can't empty its queue.
>>>>
>>>> To work around that limitation, I introduced a new timer that  
>>>> only gets
>>>> active when the guest can not receive any more characters. In  
>>>> that case
>>>> it polls again after a while to check if the guest is now  
>>>> receiving input.
>>>>
>>>> This patch fixes input when using -nographic on s390 for me.
>>>>
>>>
>>> I think this is really a kvm issue.  I assume it's because s390  
>>> idles in the kernel so you never drop to userspace to repoll the  
>>> descriptor.
>>
>> There is no polling for the muxer. That's why it never knows when  
>> virtio-console can receive again.
>
> Maybe I'm missing something simple, but it looks to me like the  
> muxer is polling.  mux_chr_can_read() is going to eventually poll  
> the muxed devices to figure this out.
>
> If the root of the problem is that mux_chr_can_read() isn't being  
> invoked for a prolonged period of time, the real issue is the  
> problem I described.

The problem is that the select list of fds includes the stdio fd, so  
that gets notified and is coupled with virtio-console, but there's  
nothing passing that on to mux and I don't think it'd be clever to  
expose internal data to the muxer to tell it about the backend fds.

Alex

>
> Regards,
>
> Anthony Liguori
>
>> This patch basically adfs timer based polling for that exact case.
>>
>>
>> Alex
>>
>>>
>>> A timer is a hacky solution.  You really need to use an io thread  
>>> to solve this and then you need to switch away from  
>>> qemu_set_fd_handler2 to qemu_set_fd_handler() and make sure that  
>>> the later breaks select whenever it's invoked.
>>>
>>> Regards,
>>>
>>> Anthony Liguori
>>>
>>>> ---
>>>>
>>>> Please consider for stable.
>>>> ---
>>>> qemu-char.c |   10 ++++++++++
>>>> 1 files changed, 10 insertions(+), 0 deletions(-)
>>>>
>>>> diff --git a/qemu-char.c b/qemu-char.c
>>>> index 05df971..ce9df3a 100644
>>>> --- a/qemu-char.c
>>>> +++ b/qemu-char.c
>>>> @@ -235,6 +235,7 @@ typedef struct {
>>>>     IOEventHandler *chr_event[MAX_MUX];
>>>>     void *ext_opaque[MAX_MUX];
>>>>     CharDriverState *drv;
>>>> +    QEMUTimer *accept_timer;
>>>>     int focus;
>>>>     int mux_cnt;
>>>>     int term_got_escape;
>>>> @@ -396,6 +397,13 @@ static void mux_chr_accept_input 
>>>> (CharDriverState *chr)
>>>>         d->chr_read[m](d->ext_opaque[m],
>>>> &d->buffer[m][d->cons[m]++&  MUX_BUFFER_MASK], 1);
>>>>     }
>>>> +
>>>> +    /* We're still not able to sync producer and consumer, so  
>>>> let's wait a bit
>>>> +       and try again by then. */
>>>> +    if (d->prod[m] != d->cons[m]) {
>>>> +        qemu_mod_timer(d->accept_timer, qemu_get_clock(vm_clock)
>>>> +                                        + (int64_t)100000);
>>>> +    }
>>>> }
>>>>
>>>> static int mux_chr_can_read(void *opaque)
>>>> @@ -478,6 +486,8 @@ static CharDriverState *qemu_chr_open_mux 
>>>> (CharDriverState *drv)
>>>>     chr->opaque = d;
>>>>     d->drv = drv;
>>>>     d->focus = -1;
>>>> +    d->accept_timer = qemu_new_timer(vm_clock,
>>>> +                                     (QEMUTimerCB*) 
>>>> mux_chr_accept_input, chr);
>>>>     chr->chr_write = mux_chr_write;
>>>>     chr->chr_update_read_handler = mux_chr_update_read_handler;
>>>>     chr->chr_accept_input = mux_chr_accept_input;
>>>>
>>>
>
Anthony Liguori - May 4, 2010, 4:25 p.m.
On 05/04/2010 11:01 AM, Alexander Graf wrote:
>
> Am 04.05.2010 um 16:34 schrieb Anthony Liguori <anthony@codemonkey.ws>:
>
>> On 05/04/2010 09:30 AM, Alexander Graf wrote:
>>>
>>>
>>> Am 04.05.2010 um 15:44 schrieb Anthony Liguori <anthony@codemonkey.ws>:
>>>
>>>> On 04/20/2010 11:56 AM, Alexander Graf wrote:
>>>>> Virtio-Console can only process one character at a time. Using it 
>>>>> on S390
>>>>> gave me strage "lags" where I got the character I pressed before when
>>>>> pressing one. So I typed in "abc" and only received "a", then 
>>>>> pressed "d"
>>>>> but the guest received "b" and so on.
>>>>>
>>>>> While the stdio driver calls a poll function that just processes 
>>>>> on its
>>>>> queue in case virtio-console can't take multiple characters at 
>>>>> once, the
>>>>> muxer does not have such callbacks, so it can't empty its queue.
>>>>>
>>>>> To work around that limitation, I introduced a new timer that only 
>>>>> gets
>>>>> active when the guest can not receive any more characters. In that 
>>>>> case
>>>>> it polls again after a while to check if the guest is now 
>>>>> receiving input.
>>>>>
>>>>> This patch fixes input when using -nographic on s390 for me.
>>>>>
>>>>
>>>> I think this is really a kvm issue.  I assume it's because s390 
>>>> idles in the kernel so you never drop to userspace to repoll the 
>>>> descriptor.
>>>
>>> There is no polling for the muxer. That's why it never knows when 
>>> virtio-console can receive again.
>>
>> Maybe I'm missing something simple, but it looks to me like the muxer 
>> is polling.  mux_chr_can_read() is going to eventually poll the muxed 
>> devices to figure this out.
>>
>> If the root of the problem is that mux_chr_can_read() isn't being 
>> invoked for a prolonged period of time, the real issue is the problem 
>> I described.
>
> The problem is that the select list of fds includes the stdio fd, so 
> that gets notified and is coupled with virtio-console, but there's 
> nothing passing that on to mux and I don't think it'd be clever to 
> expose internal data to the muxer to tell it about the backend fds.

When stdio is readable, it should invoke qemu_chr_read() with the read 
data which in turn ought to invoke mux_chr_read().

I'm not sure I understand what signalling is missing.  Jan, does the 
problem Alex describes ring a bell?  I seem to recall you saying that 
mux was still fundamentally broken but ought to work most of the time...

Regards,

Anthony Liguori

> Alex
Jan Kiszka - May 4, 2010, 4:49 p.m.
Anthony Liguori wrote:
> On 05/04/2010 11:01 AM, Alexander Graf wrote:
>> Am 04.05.2010 um 16:34 schrieb Anthony Liguori <anthony@codemonkey.ws>:
>>
>>> On 05/04/2010 09:30 AM, Alexander Graf wrote:
>>>>
>>>> Am 04.05.2010 um 15:44 schrieb Anthony Liguori <anthony@codemonkey.ws>:
>>>>
>>>>> On 04/20/2010 11:56 AM, Alexander Graf wrote:
>>>>>> Virtio-Console can only process one character at a time. Using it 
>>>>>> on S390
>>>>>> gave me strage "lags" where I got the character I pressed before when
>>>>>> pressing one. So I typed in "abc" and only received "a", then 
>>>>>> pressed "d"
>>>>>> but the guest received "b" and so on.
>>>>>>
>>>>>> While the stdio driver calls a poll function that just processes 
>>>>>> on its
>>>>>> queue in case virtio-console can't take multiple characters at 
>>>>>> once, the
>>>>>> muxer does not have such callbacks, so it can't empty its queue.
>>>>>>
>>>>>> To work around that limitation, I introduced a new timer that only 
>>>>>> gets
>>>>>> active when the guest can not receive any more characters. In that 
>>>>>> case
>>>>>> it polls again after a while to check if the guest is now 
>>>>>> receiving input.
>>>>>>
>>>>>> This patch fixes input when using -nographic on s390 for me.
>>>>>>
>>>>> I think this is really a kvm issue.  I assume it's because s390 
>>>>> idles in the kernel so you never drop to userspace to repoll the 
>>>>> descriptor.
>>>> There is no polling for the muxer. That's why it never knows when 
>>>> virtio-console can receive again.
>>> Maybe I'm missing something simple, but it looks to me like the muxer 
>>> is polling.  mux_chr_can_read() is going to eventually poll the muxed 
>>> devices to figure this out.
>>>
>>> If the root of the problem is that mux_chr_can_read() isn't being 
>>> invoked for a prolonged period of time, the real issue is the problem 
>>> I described.
>> The problem is that the select list of fds includes the stdio fd, so 
>> that gets notified and is coupled with virtio-console, but there's 
>> nothing passing that on to mux and I don't think it'd be clever to 
>> expose internal data to the muxer to tell it about the backend fds.
> 
> When stdio is readable, it should invoke qemu_chr_read() with the read 
> data which in turn ought to invoke mux_chr_read().
> 
> I'm not sure I understand what signalling is missing.  Jan, does the 
> problem Alex describes ring a bell?  I seem to recall you saying that 
> mux was still fundamentally broken but ought to work most of the time...

That problem was (and still is) that the muxer needs to accept
characters even if the active front-end device is not in order to filter
out control sequences. Once its queue is full, it will start dropping
those the active device would not if directly connected. Could only be
solved via some peek service on pending front-end data.

I think Alex' problem can be addressed by registering
qemu_set_fd_handler2(..., backend->read_poll, mux_chr_read, ...). That
means the backend has to tell us about its read poll handler (if any).

Jan

Patch

diff --git a/qemu-char.c b/qemu-char.c
index 05df971..ce9df3a 100644
--- a/qemu-char.c
+++ b/qemu-char.c
@@ -235,6 +235,7 @@  typedef struct {
     IOEventHandler *chr_event[MAX_MUX];
     void *ext_opaque[MAX_MUX];
     CharDriverState *drv;
+    QEMUTimer *accept_timer;
     int focus;
     int mux_cnt;
     int term_got_escape;
@@ -396,6 +397,13 @@  static void mux_chr_accept_input(CharDriverState *chr)
         d->chr_read[m](d->ext_opaque[m],
                        &d->buffer[m][d->cons[m]++ & MUX_BUFFER_MASK], 1);
     }
+
+    /* We're still not able to sync producer and consumer, so let's wait a bit
+       and try again by then. */
+    if (d->prod[m] != d->cons[m]) {
+        qemu_mod_timer(d->accept_timer, qemu_get_clock(vm_clock)
+                                        + (int64_t)100000);
+    }
 }
 
 static int mux_chr_can_read(void *opaque)
@@ -478,6 +486,8 @@  static CharDriverState *qemu_chr_open_mux(CharDriverState *drv)
     chr->opaque = d;
     d->drv = drv;
     d->focus = -1;
+    d->accept_timer = qemu_new_timer(vm_clock,
+                                     (QEMUTimerCB*)mux_chr_accept_input, chr);
     chr->chr_write = mux_chr_write;
     chr->chr_update_read_handler = mux_chr_update_read_handler;
     chr->chr_accept_input = mux_chr_accept_input;