[RFC,qemu,legoater/aspeed-3.1,4/4] timer: aspeed: Provide back-pressure information for short periods

Message ID 20190111035638.19725-5-andrew@aj.id.au
State Not Applicable, archived
Headers show
Series
  • Handle short timer periods
Related show

Commit Message

Andrew Jeffery Jan. 11, 2019, 3:56 a.m.
First up: This is not the way the hardware behaves.

However, it helps resolve real-world problems with short periods being
used under Linux. Commit 4451d3f59f2a ("clocksource/drivers/fttmr010:
Fix set_next_event handler") in Linux fixed the timer driver to
correctly schedule the next event for the Aspeed controller, and in
combination with 5daa8212c08e ("ARM: dts: aspeed: Describe random number
device") Linux will now set a timer with a period as low as 1us.

Configuring a qemu timer with such a short period results in spending
time handling the interrupt in the model rather than executing guest
code, leading to noticeable "sticky" behaviour in the guest.

The behaviour of Linux is correct with respect to the hardware, so we
need to improve our handling under emulation. The approach chosen is to
provide back-pressure information by calculating an acceptable minimum
number of ticks to be set on the model. Under Linux an additional read
is added in the timer configuration path to detect back-pressure, which
will never occur on hardware. However if back-pressure is observed, the
driver alerts the clock event subsystem, which then performs its own
next event dilation via a config option - d1748302f70b ("clockevents:
Make minimum delay adjustments configurable")

A minimum period of 5us was experimentally determined on a Lenovo
T480s, which I've increased to 20us for "safety".

Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
---
 hw/misc/aspeed_scu.c            | 6 ++++++
 hw/timer/aspeed_timer.c         | 6 +++++-
 include/hw/timer/aspeed_timer.h | 1 +
 3 files changed, 12 insertions(+), 1 deletion(-)

Comments

Cédric Le Goater Jan. 11, 2019, 10:10 a.m. | #1
On 1/11/19 4:56 AM, Andrew Jeffery wrote:
> First up: This is not the way the hardware behaves.

I think this is OK from the QEMU side. It would be better if Linux was 
not involved though.

> However, it helps resolve real-world problems with short periods being
> used under Linux. Commit 4451d3f59f2a ("clocksource/drivers/fttmr010:
> Fix set_next_event handler") in Linux fixed the timer driver to
> correctly schedule the next event for the Aspeed controller, and in
> combination with 5daa8212c08e ("ARM: dts: aspeed: Describe random number
> device") Linux will now set a timer with a period as low as 1us.
> 
> Configuring a qemu timer with such a short period results in spending
> time handling the interrupt in the model rather than executing guest
> code, leading to noticeable "sticky" behaviour in the guest.
> 
> The behaviour of Linux is correct with respect to the hardware, so we
> need to improve our handling under emulation. The approach chosen is to
> provide back-pressure information by calculating an acceptable minimum
> number of ticks to be set on the model. Under Linux an additional read
> is added in the timer configuration path to detect back-pressure, which
> will never occur on hardware. However if back-pressure is observed, the
> driver alerts the clock event subsystem, which then performs its own
> next event dilation via a config option - d1748302f70b ("clockevents:
> Make minimum delay adjustments configurable")

So, this is approach is not totally unacceptable, QEMU is an hypervisor
which has its own timing constraints.
 
> A minimum period of 5us was experimentally determined on a Lenovo
> T480s, which I've increased to 20us for "safety".
> 
> Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
> ---
>  hw/misc/aspeed_scu.c            | 6 ++++++
>  hw/timer/aspeed_timer.c         | 6 +++++-
>  include/hw/timer/aspeed_timer.h | 1 +
>  3 files changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/misc/aspeed_scu.c b/hw/misc/aspeed_scu.c
> index 257f9a6c6b8a..0410776b456a 100644
> --- a/hw/misc/aspeed_scu.c
> +++ b/hw/misc/aspeed_scu.c
> @@ -432,6 +432,12 @@ static void aspeed_scu_realize(DeviceState *dev, Error **errp)
>                            TYPE_ASPEED_SCU, SCU_IO_REGION_SIZE);
>  
>      sysbus_init_mmio(sbd, &s->iomem);
> +
> +    /*
> +     * Reset on realize to ensure the APB clock value is calculated in time for
> +     * use by the timer model, which is reset before the SCU.
> +     */
> +    aspeed_scu_reset(dev);

sigh. May be we should call aspeed_scu_set_apb_freq() from the Aspeed timer
model ?

>  }
>  
>  static const VMStateDescription vmstate_aspeed_scu = {
> diff --git a/hw/timer/aspeed_timer.c b/hw/timer/aspeed_timer.c
> index 35b40a7c4010..0f3501ac5a5c 100644
> --- a/hw/timer/aspeed_timer.c
> +++ b/hw/timer/aspeed_timer.c
> @@ -254,7 +254,7 @@ static void aspeed_timer_set_value(AspeedTimerCtrlState *s, int timer, int reg,
>      switch (reg) {
>      case TIMER_REG_RELOAD:
>          old_reload = t->reload;
> -        t->reload = value;
> +        t->reload = value < t->min_ticks ? t->min_ticks : value;
>  
>          /* If the reload value was not previously set, or zero, and
>           * the current value is valid, try to start the timer if it is
> @@ -306,7 +306,11 @@ static void aspeed_timer_ctrl_enable(AspeedTimer *t, bool enable)
>  
>  static void aspeed_timer_ctrl_external_clock(AspeedTimer *t, bool enable)
>  {
> +    AspeedTimerCtrlState *s = timer_to_ctrl(t);
> +    uint32_t rate = enable ? TIMER_CLOCK_EXT_HZ : s->scu->apb_freq;
> +
>      trace_aspeed_timer_ctrl_external_clock(t->id, enable);
> +    t->min_ticks = muldiv64(20 * SCALE_US, rate, NANOSECONDS_PER_SECOND);

That '20' deserves a define I think.

C.

>  }
>  
>  static void aspeed_timer_ctrl_overflow_interrupt(AspeedTimer *t, bool enable)
> diff --git a/include/hw/timer/aspeed_timer.h b/include/hw/timer/aspeed_timer.h
> index 1fb949e16710..10c851ebb6d7 100644
> --- a/include/hw/timer/aspeed_timer.h
> +++ b/include/hw/timer/aspeed_timer.h
> @@ -41,6 +41,7 @@ typedef struct AspeedTimer {
>       * interrupts, signalling with both the rising and falling edge.
>       */
>      int32_t level;
> +    uint32_t min_ticks;
>      uint32_t reload;
>      uint32_t match[2];
>      uint64_t start;
>
Andrew Jeffery Jan. 13, 2019, 11:07 p.m. | #2
On Fri, 11 Jan 2019, at 20:40, Cédric Le Goater wrote:
> On 1/11/19 4:56 AM, Andrew Jeffery wrote:
> > First up: This is not the way the hardware behaves.
> 
> I think this is OK from the QEMU side. It would be better if Linux was 
> not involved though.

It doesn't have to be, just it may see wonky timer behaviour because
the timer will not be doing exactly what was asked.

> 
> > However, it helps resolve real-world problems with short periods being
> > used under Linux. Commit 4451d3f59f2a ("clocksource/drivers/fttmr010:
> > Fix set_next_event handler") in Linux fixed the timer driver to
> > correctly schedule the next event for the Aspeed controller, and in
> > combination with 5daa8212c08e ("ARM: dts: aspeed: Describe random number
> > device") Linux will now set a timer with a period as low as 1us.
> > 
> > Configuring a qemu timer with such a short period results in spending
> > time handling the interrupt in the model rather than executing guest
> > code, leading to noticeable "sticky" behaviour in the guest.
> > 
> > The behaviour of Linux is correct with respect to the hardware, so we
> > need to improve our handling under emulation. The approach chosen is to
> > provide back-pressure information by calculating an acceptable minimum
> > number of ticks to be set on the model. Under Linux an additional read
> > is added in the timer configuration path to detect back-pressure, which
> > will never occur on hardware. However if back-pressure is observed, the
> > driver alerts the clock event subsystem, which then performs its own
> > next event dilation via a config option - d1748302f70b ("clockevents:
> > Make minimum delay adjustments configurable")
> 
> So, this is approach is not totally unacceptable, QEMU is an hypervisor
> which has its own timing constraints.
>  
> > A minimum period of 5us was experimentally determined on a Lenovo
> > T480s, which I've increased to 20us for "safety".
> > 
> > Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
> > ---
> >  hw/misc/aspeed_scu.c            | 6 ++++++
> >  hw/timer/aspeed_timer.c         | 6 +++++-
> >  include/hw/timer/aspeed_timer.h | 1 +
> >  3 files changed, 12 insertions(+), 1 deletion(-)
> > 
> > diff --git a/hw/misc/aspeed_scu.c b/hw/misc/aspeed_scu.c
> > index 257f9a6c6b8a..0410776b456a 100644
> > --- a/hw/misc/aspeed_scu.c
> > +++ b/hw/misc/aspeed_scu.c
> > @@ -432,6 +432,12 @@ static void aspeed_scu_realize(DeviceState *dev, Error **errp)
> >                            TYPE_ASPEED_SCU, SCU_IO_REGION_SIZE);
> >  
> >      sysbus_init_mmio(sbd, &s->iomem);
> > +
> > +    /*
> > +     * Reset on realize to ensure the APB clock value is calculated in time for
> > +     * use by the timer model, which is reset before the SCU.
> > +     */
> > +    aspeed_scu_reset(dev);
> 
> sigh. May be we should call aspeed_scu_set_apb_freq() from the Aspeed timer
> model ?

I dunno - will we run into this in other models as well? If we do, should they also call
aspeed_scu_set_apb_freq()? If they do it probably won't matter (same input, same
output), but it seems a bit messy to distribute the call through the code?

> 
> >  }
> >  
> >  static const VMStateDescription vmstate_aspeed_scu = {
> > diff --git a/hw/timer/aspeed_timer.c b/hw/timer/aspeed_timer.c
> > index 35b40a7c4010..0f3501ac5a5c 100644
> > --- a/hw/timer/aspeed_timer.c
> > +++ b/hw/timer/aspeed_timer.c
> > @@ -254,7 +254,7 @@ static void aspeed_timer_set_value(AspeedTimerCtrlState *s, int timer, int reg,
> >      switch (reg) {
> >      case TIMER_REG_RELOAD:
> >          old_reload = t->reload;
> > -        t->reload = value;
> > +        t->reload = value < t->min_ticks ? t->min_ticks : value;
> >  
> >          /* If the reload value was not previously set, or zero, and
> >           * the current value is valid, try to start the timer if it is
> > @@ -306,7 +306,11 @@ static void aspeed_timer_ctrl_enable(AspeedTimer *t, bool enable)
> >  
> >  static void aspeed_timer_ctrl_external_clock(AspeedTimer *t, bool enable)
> >  {
> > +    AspeedTimerCtrlState *s = timer_to_ctrl(t);
> > +    uint32_t rate = enable ? TIMER_CLOCK_EXT_HZ : s->scu->apb_freq;
> > +
> >      trace_aspeed_timer_ctrl_external_clock(t->id, enable);
> > +    t->min_ticks = muldiv64(20 * SCALE_US, rate, NANOSECONDS_PER_SECOND);
> 
> That '20' deserves a define I think.

Yes, good call :)

Andrew
Cédric Le Goater Jan. 14, 2019, 8:55 a.m. | #3
On 1/14/19 12:07 AM, Andrew Jeffery wrote:
> On Fri, 11 Jan 2019, at 20:40, Cédric Le Goater wrote:
>> On 1/11/19 4:56 AM, Andrew Jeffery wrote:
>>> First up: This is not the way the hardware behaves.
>>
>> I think this is OK from the QEMU side. It would be better if Linux was 
>> not involved though.
> 
> It doesn't have to be, just it may see wonky timer behaviour because
> the timer will not be doing exactly what was asked.
> 
>>
>>> However, it helps resolve real-world problems with short periods being
>>> used under Linux. Commit 4451d3f59f2a ("clocksource/drivers/fttmr010:
>>> Fix set_next_event handler") in Linux fixed the timer driver to
>>> correctly schedule the next event for the Aspeed controller, and in
>>> combination with 5daa8212c08e ("ARM: dts: aspeed: Describe random number
>>> device") Linux will now set a timer with a period as low as 1us.
>>>
>>> Configuring a qemu timer with such a short period results in spending
>>> time handling the interrupt in the model rather than executing guest
>>> code, leading to noticeable "sticky" behaviour in the guest.
>>>
>>> The behaviour of Linux is correct with respect to the hardware, so we
>>> need to improve our handling under emulation. The approach chosen is to
>>> provide back-pressure information by calculating an acceptable minimum
>>> number of ticks to be set on the model. Under Linux an additional read
>>> is added in the timer configuration path to detect back-pressure, which
>>> will never occur on hardware. However if back-pressure is observed, the
>>> driver alerts the clock event subsystem, which then performs its own
>>> next event dilation via a config option - d1748302f70b ("clockevents:
>>> Make minimum delay adjustments configurable")
>>
>> So, this is approach is not totally unacceptable, QEMU is an hypervisor
>> which has its own timing constraints.
>>  
>>> A minimum period of 5us was experimentally determined on a Lenovo
>>> T480s, which I've increased to 20us for "safety".
>>>
>>> Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
>>> ---
>>>  hw/misc/aspeed_scu.c            | 6 ++++++
>>>  hw/timer/aspeed_timer.c         | 6 +++++-
>>>  include/hw/timer/aspeed_timer.h | 1 +
>>>  3 files changed, 12 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/hw/misc/aspeed_scu.c b/hw/misc/aspeed_scu.c
>>> index 257f9a6c6b8a..0410776b456a 100644
>>> --- a/hw/misc/aspeed_scu.c
>>> +++ b/hw/misc/aspeed_scu.c
>>> @@ -432,6 +432,12 @@ static void aspeed_scu_realize(DeviceState *dev, Error **errp)
>>>                            TYPE_ASPEED_SCU, SCU_IO_REGION_SIZE);
>>>  
>>>      sysbus_init_mmio(sbd, &s->iomem);
>>> +
>>> +    /*
>>> +     * Reset on realize to ensure the APB clock value is calculated in time for
>>> +     * use by the timer model, which is reset before the SCU.
>>> +     */
>>> +    aspeed_scu_reset(dev);
>>
>> sigh. May be we should call aspeed_scu_set_apb_freq() from the Aspeed timer
>> model ?
> 
> I dunno - will we run into this in other models as well? If we do, should they also call
> aspeed_scu_set_apb_freq()? If they do it probably won't matter (same input, same
> output), but it seems a bit messy to distribute the call through the code?

This is really a reset order problem. SCU reset should be called before all 
others models. I don't know if we can order the calls but calling reset in 
a realize routine is a problem also. 

The APB clock depends on the HPLL_PARAM register and on the CLK_SEL register,
and these can be set a runtime by FW.

May be we can introduce a reset method at the machine level forcing SCU
first. That  would be one way to ask how it should be done.

C.



>>
>>>  }
>>>  
>>>  static const VMStateDescription vmstate_aspeed_scu = {
>>> diff --git a/hw/timer/aspeed_timer.c b/hw/timer/aspeed_timer.c
>>> index 35b40a7c4010..0f3501ac5a5c 100644
>>> --- a/hw/timer/aspeed_timer.c
>>> +++ b/hw/timer/aspeed_timer.c
>>> @@ -254,7 +254,7 @@ static void aspeed_timer_set_value(AspeedTimerCtrlState *s, int timer, int reg,
>>>      switch (reg) {
>>>      case TIMER_REG_RELOAD:
>>>          old_reload = t->reload;
>>> -        t->reload = value;
>>> +        t->reload = value < t->min_ticks ? t->min_ticks : value;
>>>  
>>>          /* If the reload value was not previously set, or zero, and
>>>           * the current value is valid, try to start the timer if it is
>>> @@ -306,7 +306,11 @@ static void aspeed_timer_ctrl_enable(AspeedTimer *t, bool enable)
>>>  
>>>  static void aspeed_timer_ctrl_external_clock(AspeedTimer *t, bool enable)
>>>  {
>>> +    AspeedTimerCtrlState *s = timer_to_ctrl(t);
>>> +    uint32_t rate = enable ? TIMER_CLOCK_EXT_HZ : s->scu->apb_freq;
>>> +
>>>      trace_aspeed_timer_ctrl_external_clock(t->id, enable);
>>> +    t->min_ticks = muldiv64(20 * SCALE_US, rate, NANOSECONDS_PER_SECOND);
>>
>> That '20' deserves a define I think.
> 
> Yes, good call :)
> 
> Andrew
>

Patch

diff --git a/hw/misc/aspeed_scu.c b/hw/misc/aspeed_scu.c
index 257f9a6c6b8a..0410776b456a 100644
--- a/hw/misc/aspeed_scu.c
+++ b/hw/misc/aspeed_scu.c
@@ -432,6 +432,12 @@  static void aspeed_scu_realize(DeviceState *dev, Error **errp)
                           TYPE_ASPEED_SCU, SCU_IO_REGION_SIZE);
 
     sysbus_init_mmio(sbd, &s->iomem);
+
+    /*
+     * Reset on realize to ensure the APB clock value is calculated in time for
+     * use by the timer model, which is reset before the SCU.
+     */
+    aspeed_scu_reset(dev);
 }
 
 static const VMStateDescription vmstate_aspeed_scu = {
diff --git a/hw/timer/aspeed_timer.c b/hw/timer/aspeed_timer.c
index 35b40a7c4010..0f3501ac5a5c 100644
--- a/hw/timer/aspeed_timer.c
+++ b/hw/timer/aspeed_timer.c
@@ -254,7 +254,7 @@  static void aspeed_timer_set_value(AspeedTimerCtrlState *s, int timer, int reg,
     switch (reg) {
     case TIMER_REG_RELOAD:
         old_reload = t->reload;
-        t->reload = value;
+        t->reload = value < t->min_ticks ? t->min_ticks : value;
 
         /* If the reload value was not previously set, or zero, and
          * the current value is valid, try to start the timer if it is
@@ -306,7 +306,11 @@  static void aspeed_timer_ctrl_enable(AspeedTimer *t, bool enable)
 
 static void aspeed_timer_ctrl_external_clock(AspeedTimer *t, bool enable)
 {
+    AspeedTimerCtrlState *s = timer_to_ctrl(t);
+    uint32_t rate = enable ? TIMER_CLOCK_EXT_HZ : s->scu->apb_freq;
+
     trace_aspeed_timer_ctrl_external_clock(t->id, enable);
+    t->min_ticks = muldiv64(20 * SCALE_US, rate, NANOSECONDS_PER_SECOND);
 }
 
 static void aspeed_timer_ctrl_overflow_interrupt(AspeedTimer *t, bool enable)
diff --git a/include/hw/timer/aspeed_timer.h b/include/hw/timer/aspeed_timer.h
index 1fb949e16710..10c851ebb6d7 100644
--- a/include/hw/timer/aspeed_timer.h
+++ b/include/hw/timer/aspeed_timer.h
@@ -41,6 +41,7 @@  typedef struct AspeedTimer {
      * interrupts, signalling with both the rising and falling edge.
      */
     int32_t level;
+    uint32_t min_ticks;
     uint32_t reload;
     uint32_t match[2];
     uint64_t start;