Patchwork [v2,1/6] idle: move the cpuidle entry point to the generic idle loop

Submitter Daniel Lezcano
Date Jan. 30, 2014, 5:28 p.m.
Message ID <52EA8BD4.6020803@linaro.org>
Permalink /patch/315415/
State New

Comments

Daniel Lezcano - Jan. 30, 2014, 5:28 p.m.
On 01/30/2014 05:07 PM, Nicolas Pitre wrote:
> On Thu, 30 Jan 2014, Daniel Lezcano wrote:
>
>> On 01/30/2014 06:28 AM, Nicolas Pitre wrote:
>>> On Thu, 30 Jan 2014, Preeti U Murthy wrote:
>>>
>>>> Hi Nicolas,
>>>>
>>>> On 01/30/2014 02:01 AM, Nicolas Pitre wrote:
>>>>> On Wed, 29 Jan 2014, Nicolas Pitre wrote:
>>>>>
>>>>>> In order to integrate cpuidle with the scheduler, we must have a
>>>>>> better
>>>>>> proximity in the core code with what cpuidle is doing and not delegate
>>>>>> such interaction to arch code.
>>>>>>
>>>>>> Architectures implementing arch_cpu_idle() should simply enter
>>>>>> a cheap idle mode in the absence of a proper cpuidle driver.
>>>>>>
>>>>>> Signed-off-by: Nicolas Pitre <nico@linaro.org>
>>>>>> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
>>>>>
>>>>> As mentioned in my reply to Olof's comment on patch #5/6, here's a new
>>>>> version of this patch adding the safety local_irq_enable() to the core
>>>>> code.
>>>>>
>>>>> ----- >8
>>>>>
>>>>> From: Nicolas Pitre <nicolas.pitre@linaro.org>
>>>>> Subject: idle: move the cpuidle entry point to the generic idle loop
>>>>>
>>>>> In order to integrate cpuidle with the scheduler, we must have a better
>>>>> proximity in the core code with what cpuidle is doing and not delegate
>>>>> such interaction to arch code.
>>>>>
>>>>> Architectures implementing arch_cpu_idle() should simply enter
>>>>> a cheap idle mode in the absence of a proper cpuidle driver.
>>>>>
>>>>> In both cases i.e. whether it is a cpuidle driver or the default
>>>>> arch_cpu_idle(), the calling convention expects IRQs to be disabled
>>>>> on entry and enabled on exit. There is a warning in place already but
>>>>> let's add a forced IRQ enable here as well.  This will allow for
>>>>> removing the forced IRQ enable some implementations do locally and
>>>>
>>>> Why would this patch allow for removing the forced IRQ enable that are
>>>> being done on some archs in arch_cpu_idle()? Isn't this patch expecting
>>>> the default arch_cpu_idle() to have re-enabled the interrupts after
>>>> exiting from the default idle state? Its supposed to only catch faulty
>>>> cpuidle drivers that haven't enabled IRQs on exit from idle state but
>>>> are expected to have done so, isn't it?
>>>
>>> Exactly.  However x86 currently does this:
>>>
>>>   if (cpuidle_idle_call())
>>>           x86_idle();
>>>   else
>>>           local_irq_enable();
>>>
>>> So whenever cpuidle_idle_call() is successful then IRQs are
>>> unconditionally enabled whether or not the underlying cpuidle driver has
>>> properly done it or not.  And the reason is that some of the x86 cpuidle
>>> drivers do fail to enable IRQs before returning.
>>>
>>> So the idea is to get rid of this unconditional IRQ enabling and let the
>>> core issue a warning instead (as well as enabling IRQs to allow the
>>> system to run).
>>
>> But what I don't get with your comment is the local_irq_enable is done from
>> the cpuidle common framework in 'cpuidle_enter_state' it is not done from the
>> arch specific backend cpuidle driver.
>
> Oh well... This certainly means we'll have to clean this mess as some
> drivers do it on their own while some others don't.  Some drivers also
> loop on !need_resched() while some others simply return on the first
> interrupt.

Ok, I think the mess comes from 'default_idle', which does not 
re-enable the local IRQs but is used from different places like 
amd_e400_idle and apm_cpu_idle.

void default_idle(void)
{
         trace_cpu_idle_rcuidle(1, smp_processor_id());
         safe_halt();
         trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, smp_processor_id());
}

Consider a system configured without cpuidle (cpuidle is left out 
because it *always* enables the local IRQs). We then have the 
following cases:

x86_idle = default_idle();
==> local_irq_enable is missing

x86_idle = amd_e400_idle();
==> it calls local_irq_disable(); but in the idle loop context where the 
local irqs are already disabled.
==> if amd_e400_c1e_detected is true, the local_irq are enabled
==> otherwise no
==> default_idle is called from there and does not enable local_irqs


>> So the code above could be:
>>
>> 	if (cpuidle_idle_call())
>> 		x86_idle();
>>
>> without the else section, this local_irq_enable is pointless. Or may be I
>> missed something ?
>
> A later patch removes it anyway.  But if it is really necessary to
> enable interrupts then the core will do it but with a warning now.

This WARN should disappear. It was there because it was up to the 
backend cpuidle driver to enable the irq. But in the meantime, that was 
consolidated into a single place in the cpuidle framework so no need to 
try to catch errors.

What about the patch below (based on this patchset)?
Peter Zijlstra - Jan. 30, 2014, 6:06 p.m.
On Thu, Jan 30, 2014 at 06:28:52PM +0100, Daniel Lezcano wrote:
> Ok, I think the mess is coming from 'default_idle' which does not re-enable
> the local_irq but used from different places like amd_e400_idle and
> apm_cpu_idle.
> 
> void default_idle(void)
> {
>         trace_cpu_idle_rcuidle(1, smp_processor_id());
>         safe_halt();
>         trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, smp_processor_id());
> }
> 
> Considering the system configured without cpuidle because this one *always*
> enable the local irq, we have the different cases:
> 
> x86_idle = default_idle();
> ==> local_irq_enable is missing
> 

safe_halt() is "sti; hlt" and so very much does the irq_enable.
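
To make that concrete, here is a minimal user-space sketch of the flag behaviour. It is only a model, not kernel code: the `irqs_enabled` variable and every `*_model` function are hypothetical stand-ins, the tracing calls of default_idle() are left out, and the actual wait performed by "hlt" is not simulated.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the CPU interrupt flag (not a kernel symbol). */
static bool irqs_enabled = true;

static void local_irq_disable_model(void)
{
	irqs_enabled = false;
}

/*
 * Models x86 safe_halt(), i.e. "sti; hlt": "sti" sets the interrupt
 * flag, then "hlt" waits. Only the flag change is modelled here.
 */
static void safe_halt_model(void)
{
	irqs_enabled = true;	/* sti */
	/* hlt: a real CPU would sleep here until an interrupt arrives */
}

/* Models default_idle() with the tracing calls left out. */
static void default_idle_model(void)
{
	safe_halt_model();
}

/* Enter idle with IRQs off, as the idle loop does, and report the exit state. */
static bool idle_exits_with_irqs_enabled(void)
{
	local_irq_disable_model();
	assert(!irqs_enabled);	/* calling convention: IRQs off on entry */
	default_idle_model();
	return irqs_enabled;
}
```

In this model, idle_exits_with_irqs_enabled() returns true: the enable comes from the "sti" half of safe_halt(), so no separate local_irq_enable() is needed after default_idle().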
Nicolas Pitre - Jan. 30, 2014, 7:24 p.m.
On Thu, 30 Jan 2014, Daniel Lezcano wrote:

> On 01/30/2014 05:07 PM, Nicolas Pitre wrote:
> > On Thu, 30 Jan 2014, Daniel Lezcano wrote:
> > > But what I don't get with your comment is the local_irq_enable is done
> > > from
> > > the cpuidle common framework in 'cpuidle_enter_state' it is not done from
> > > the
> > > arch specific backend cpuidle driver.
> >
> > Oh well... This certainly means we'll have to clean this mess as some
> > drivers do it on their own while some others don't.  Some drivers also
> > loop on !need_resched() while some others simply return on the first
> > interrupt.
> 
> Ok, I think the mess is coming from 'default_idle' which does not re-enable
> the local_irq but used from different places like amd_e400_idle and
> apm_cpu_idle.

Yet if you look at the code path before my patches you'll see that IRQs 
were enabled only after cpuidle_idle_call() had returned success.

> void default_idle(void)
> {
>         trace_cpu_idle_rcuidle(1, smp_processor_id());
>         safe_halt();
>         trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, smp_processor_id());
> }
> 
> Considering the system configured without cpuidle because this one *always*
> enable the local irq,

Yet this is debatable. Given that some hardware does have IRQs turned on 
upon exiting idle mode, I think we should generalize this so that the 
explicit enabling of IRQs, when needed, is done as close as possible to 
the operation that caused idle mode to be entered.

> we have the different cases:
> 
> x86_idle = default_idle();
> ==> local_irq_enable is missing

According to Peter it is not.

> x86_idle = amd_e400_idle();
> ==> it calls local_irq_disable(); but in the idle loop context where the 
> local irqs are already disabled.

Since it returned from default_idle() then IRQs are enabled.

> ==> if amd_e400_c1e_detected is true, the local_irq are enabled
> ==> otherwise no
> ==> default_idle is called from there and does not enable local_irqs

Again, it does.

> > > So the code above could be:
> > >
> > >  if (cpuidle_idle_call())
> > >   x86_idle();
> > >
> > > without the else section, this local_irq_enable is pointless. Or may be I
> > > missed something ?
> >
> > A later patch removes it anyway.  But if it is really necessary to
> > enable interrupts then the core will do it but with a warning now.
> 
> This WARN should disappear. It was there because it was up to the backend
> cpuidle driver to enable the irq. But in the meantime, that was consolidated
> into a single place in the cpuidle framework so no need to try to catch
> errors.

And that consolidation was a mistake IMHO.  We should assume that the 
exiting of idle mode has IRQs enabled already, and do so manually in the 
backend driver if it is not the case on particular hardware.  That's the 
only way to ensure uniformity at a higher level.

Yet, if a code path is buggy in that regard, whether this is through 
cpuidle when enabled, or the default idle function otherwise, then the 
warning is there in cpu_idle_loop() to catch them all.
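
As a rough illustration of that safety net, here is a user-space sketch, assuming a much simplified idle loop body. The `irqs_enabled` flag, the `warnings` counter, and the functions good_idle(), buggy_idle() and idle_once() are all hypothetical; the real check in cpu_idle_loop() may be written differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

static bool irqs_enabled = true;	/* hypothetical stand-in for the CPU IRQ flag */
static int warnings;			/* how often the core had to step in */

static void local_irq_disable_model(void) { irqs_enabled = false; }
static void local_irq_enable_model(void)  { irqs_enabled = true; }

/* A well-behaved idle routine: IRQs are enabled again before it returns. */
static void good_idle(void)
{
	local_irq_enable_model();
}

/* A buggy idle routine: returns with IRQs still disabled. */
static void buggy_idle(void)
{
}

/* One pass of a simplified idle loop body around a given idle routine. */
static void idle_once(void (*idle)(void))
{
	local_irq_disable_model();
	assert(!irqs_enabled);	/* calling convention: IRQs off on entry */

	idle();

	if (!irqs_enabled) {
		/* stands in for the warning issued by the core idle loop */
		warnings++;
		fprintf(stderr, "warning: idle routine returned with IRQs disabled\n");
		/* forced enable so that the system keeps running */
		local_irq_enable_model();
	}
}
```

Calling idle_once(good_idle) leaves the warning counter untouched, while idle_once(buggy_idle) bumps it once and still returns with the flag set. The buggy path is reported instead of being silently papered over by an unconditional enable.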

> What about (based on this patchset).
> 
> diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
> index 4505e2a..2d60cbb 100644
> --- a/arch/x86/kernel/process.c
> +++ b/arch/x86/kernel/process.c
> @@ -299,6 +299,7 @@ void arch_cpu_idle_dead(void)
>  void arch_cpu_idle(void)
>  {
>         x86_idle();
> +       local_irq_enable();
>  }

Again this is redundant.


Nicolas

Patch

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 4505e2a..2d60cbb 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -299,6 +299,7 @@ void arch_cpu_idle_dead(void)
 void arch_cpu_idle(void)
 {
        x86_idle();
+       local_irq_enable();
 }

 /*