Patchwork [RFC,v4] ARM hibernation/suspend-to-disk support

Submitter Frank Hofmann
Date June 7, 2011, 4:48 p.m.
Message ID <alpine.DEB.2.00.1106071722040.2236@localhost6.localdomain6>
Permalink /patch/99297/
State New

Comments

Frank Hofmann - June 7, 2011, 4:48 p.m.
Hi,

time for another round on this one...

This got quite a bit cleaned up now.

There's now no more need for a "swsusp context" at all. The code uses 
cpu_suspend/resume and keeps the snapshot state on the stack while 
writing it out.

There are a few dependencies this patch brings in:

* due to the use of cpu_suspend / cpu_resume, it'll only apply as-is
   to kernels no older than f6b0fa02e8b0708d17d631afce456524eadf87ff,
   where Russell King introduced the generic interface.
   Patching these into older kernels is a little work.

* it temporarily uses swapper_pg_dir and establishes 1:1 mappings there
   for an MMU-off transition, which is necessary before resume.
   In order to tear these down afterwards, identity_mapping_del() needs
   to be called; for some reason that's #ifdef CONFIG_SMP ...

* it needs to "catch" sleep_save_sp after cpu_suspend() so that resume
   can be provided with the proper starting point.
   This requires an ENTRY(sleep_save_sp) in arch/arm/kernel/sleep.S so
   that the symbol becomes public.

* it assumes cpu_reset will disable the MMU. cpu_v6_reset/cpu_v7_reset
   are currently not doing so (amongst some other minor chip types).

* there's kind of a circular dependency between CONFIG_HIBERNATION and
   CONFIG_PM_SLEEP, on ARM. The latter is necessary so that cpu_suspend
   and cpu_resume are compiled in, but it cannot be selected via
   ARCH_HIBERNATION_POSSIBLE because CONFIG_PM_SLEEP depends on
   CONFIG_HIBERNATION_INTERFACE - selected by CONFIG_HIBERNATION.

   Consequence is that right now, both CONFIG_PM_SLEEP and ...HIBERNATION
   must be set in your defconfig file to be able to compile.

   (my head swirls from writing this ...)

Otherwise, this is by far the cleanest in the series yet.


I've tested this on ARM1176; still need to do OMAP3 (Cortex-A8), will 
report on that.


Please let me know what you think,
FrankH.
Rafael J. Wysocki - June 7, 2011, 9:48 p.m.
On Tuesday, June 07, 2011, Frank Hofmann wrote:
> Hi,
> 
> time for another round on this one...
> 
> This got quite a bit cleaned up now.
> 
> There's now no more need for a "swsusp context" at all. The code uses 
> cpu_suspend/resume and keeps the snapshot state on the stack while 
> writing it out.
> 
> There are a few dependencies this patch brings in:
> 
> * due to the use of cpu_suspend / cpu_resume, it'll only apply as-is
>    to kernels no older than f6b0fa02e8b0708d17d631afce456524eadf87ff,
>    where Russell King introduced the generic interface.
>    Patching these into older kernels is a little work.
> 
> * it temporarily uses swapper_pg_dir and establishes 1:1 mappings there
>    for a MMU-off transition, which is necessary before resume.
>    In order to tear these down afterwards, identity_mapping_del() needs
>    to be called; for some reason that's #ifdef CONFIG_SMP ...
> 
> * it needs to "catch" sleep_save_sp after cpu_suspend() so that resume
>    can be provided with the proper starting point.
>    This requires an ENTRY(sleep_save_sp) in arch/arm/kernel/sleep.S so
>    that the symbol becomes public.
> 
> * it assumes cpu_reset will disable the MMU. cpu_v6_reset/cpu_v7_reset
>    are currently not doing so (amongst some other minor chip types).
> 
> * there's kind of a circular dependency between CONFIG_HIBERNATION and
>    CONFIG_PM_SLEEP, on ARM. The latter is necessary so that cpu_suspend
>    and cpu_resume are compiled in, but it cannot be selected via
>    ARCH_HIBERNATION_POSSIBLE because CONFIG_PM_SLEEP depends on
>    CONFIG_HIBERNATION_INTERFACE - selected by CONFIG_HIBERNATION.
> 
>    Consequence is that right now, both CONFIG_PM_SLEEP and ...HIBERNATION
>    must be set in your defconfig file to be able to compile.

In fact, CONFIG_PM_SLEEP = CONFIG_SUSPEND || CONFIG_HIBERNATE_CALLBACKS, so it
should be sufficient to set HIBERNATION.  ARCH_HIBERNATION_POSSIBLE only
causes HIBERNATION to become a valid option (that may or may not be set).

>    (my head swirls from writing this ...)

What problem exactly did you have with those settings?

Rafael
Frank Hofmann - June 9, 2011, 3:30 p.m.
On Tue, 7 Jun 2011, Rafael J. Wysocki wrote:

> On Tuesday, June 07, 2011, Frank Hofmann wrote:
[ ... ]
>> * there's kind of a circular dependency between CONFIG_HIBERNATION and
>>    CONFIG_PM_SLEEP, on ARM. The latter is necessary so that cpu_suspend
>>    and cpu_resume are compiled in, but it cannot be selected via
>>    ARCH_HIBERNATION_POSSIBLE because CONFIG_PM_SLEEP depends on
>>    CONFIG_HIBERNATION_INTERFACE - selected by CONFIG_HIBERNATION.
>>
>>    Consequence is that right now, both CONFIG_PM_SLEEP and ...HIBERNATION
>>    must be set in your defconfig file to be able to compile.
>
> In fact, CONFIG_PM_SLEEP = CONFIG_SUSPEND || CONFIG_HIBERNATE_CALLBACKS, so it
> should be sufficient to set HIBERNATION.  ARCH_HIBERNATION_POSSIBLE only
> causes HIBERNATION to become a valid option (that may or may not be set).
>
>>    (my head swirls from writing this ...)
>
> What problem exactly did you have with those settings?

Ah, I tried to do a "select PM_SLEEP" from ARM's ARCH_HIBERNATION_POSSIBLE 
... which is circular.

Sorry for the noise. It does look like the diff I sent can correctly be
enabled just by selecting CONFIG_HIBERNATION, as it's supposed to be, and
CONFIG_PM_SLEEP will then be enabled automatically.

Found a few more nits with the patch as last sent:

- MULTI_CPU configs don't compile, needs changes (Will Deacon is on that)
   for cpu_reset.

- the hardcoded v:p offset in swsusp.S needs to go, the value can now
   be changed at kernel init and hence a small func to query it is needed

- the patch assumes the codepath is single-cpu (which the framework does
   ensure, as disable_nonboot_cpus is called) but also assumes that the
   boot CPU has ID 0; only for that is the sleep_save_sp[] entry restored.

   At least a WARN_ON(smp_processor_id()) is warranted; having a different
   core suspend the system than resume it, I'm not sure ...

- the identity mappings should match what setup_mm_for_reboot does, i.e.
   let them cover the whole user range (not just _stext.._etext). That also
   makes sure whatever happens during restore, swapper_pg_dir is "virgin"
   again afterwards.
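
As an illustration of the identity-mapping point, here is a minimal user-space sketch in C (not kernel code) of mapping a whole range 1:1 at section granularity and clearing it again afterwards. The names model_identity_add and model_identity_del, the flag value, and the flat 4096-entry table of 1 MiB sections are assumptions made for this example; they only loosely mirror what identity_mapping_add()/identity_mapping_del() do on a real page directory.

```c
#include <assert.h>
#include <stdint.h>

#define SECTION_SHIFT 20u                    /* 1 MiB sections */
#define SECTION_SIZE  (1u << SECTION_SHIFT)
#define NUM_SECTIONS  4096u                  /* 4 GiB / 1 MiB */
#define SECTION_FLAGS 0x2u                   /* hypothetical "section" marker bits */

/* Map [start, end) 1:1: every covered entry points at its own address. */
void model_identity_add(uint32_t *pgd, uint32_t start, uint32_t end)
{
	/* Both bounds must be section aligned in this simplified model. */
	assert((start & (SECTION_SIZE - 1)) == 0);
	assert((end & (SECTION_SIZE - 1)) == 0);

	for (uint32_t i = start >> SECTION_SHIFT; i < (end >> SECTION_SHIFT); i++)
		pgd[i] = (i << SECTION_SHIFT) | SECTION_FLAGS;
}

/* Clear the same range, so the table is empty there again afterwards. */
void model_identity_del(uint32_t *pgd, uint32_t start, uint32_t end)
{
	assert((start & (SECTION_SIZE - 1)) == 0);
	assert((end & (SECTION_SIZE - 1)) == 0);

	for (uint32_t i = start >> SECTION_SHIFT; i < (end >> SECTION_SHIFT); i++)
		pgd[i] = 0;
}
```

Calling model_identity_add(pgd, 0, user_end) before the MMU-off transition and model_identity_del(pgd, 0, user_end) afterwards leaves every entry below user_end zero again, whatever was written there in between. The value of user_end (for example 0xBF000000) is likewise an assumption here.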


Btw, when testing this I found that generic cpu_suspend seems to be just 
fine for OMAP3; the OMAP platforms though do not at this time use the 
generic cpu_suspend/resume for sleep, is it planned to change that ?

FrankH.



Russell King - ARM Linux - June 9, 2011, 3:40 p.m.
On Thu, Jun 09, 2011 at 04:30:08PM +0100, Frank Hofmann wrote:
> Btw, when testing this I found that generic cpu_suspend seems to be just  
> fine for OMAP3; the OMAP platforms though do not at this time use the  
> generic cpu_suspend/resume for sleep, is it planned to change that ?

That's because OMAP was making changes to their sleep code while I was
consolidating the sleep code, and although I asked several times that
the OMAP folk participate in this effort, I was evidently
unsuccessful in achieving anything in that direction.

And of course since then it's been forgotten about, and I've given up
on that particular aspect.  I've also come to the conclusion that OMAP
is sufficiently weird (requiring so much to execute from SRAM) that
it's hopeless to pursue.
Frank Hofmann - June 9, 2011, 4:26 p.m.
On Thu, 9 Jun 2011, Russell King - ARM Linux wrote:

> On Thu, Jun 09, 2011 at 04:30:08PM +0100, Frank Hofmann wrote:
>> Btw, when testing this I found that generic cpu_suspend seems to be just
>> fine for OMAP3; the OMAP platforms though do not at this time use the
>> generic cpu_suspend/resume for sleep, is it planned to change that ?
>
> That's because OMAP was doing changes to their sleep code while I was
> consolidating the sleep code, and although I asked several times that
> the OMAP folk should participate in this effort, but evidentally I was
> unsuccessful in achieving anything in that direction.
>
> And of course since then it's been forgotten about, and I've given up
> on that particular aspect.  I've also come to the conclusion that OMAP
> is sufficiently weird (requiring soo much to execute from SRAM) that
> its hopeless to persue.
>

Thanks for the info.

You're right, the OMAP sleep code is long. The l1_logic_lost sequence of
p15 accesses in there could probably be shortened via cpu_suspend/resume 
but it's not fully obvious (to me) how to plug it in.

I'm largely asking because the hibernation patch, as written (and using 
the cpu_v7_do_suspend/resume backends), does "work" on OMAP3 as far as 
I've tested it, i.e. it didn't need the complex dance done by the OMAP 
sleep code itself.

That said, there's secure state, there's maybe other stuff the iROM deals 
with, and I don't know how to comprehensively test this gets restored to 
a usable state (or even needs to), so a claim that the hibernation patch 
is proven perfect goes a bit far. Hence quotes, "works".

Re OMAP, there are two questions regarding hibernation/suspend-to-disk:

a) What reasons, if any, are there why cpu_{v7_do_}suspend/resume are not
    ok to use (on OMAP) for snapshotting core state, for the purpose of
    hibernation ?
    If there are any such issues, then how could they be addressed ?

b) Is it necessary to provide machine hooks in the hibernation code for
    some auxiliary stuff that's not saved/restored by device suspend ?
    If so, what would this need to look like, from the OMAP view ?

OMAP is a complex piece as it seems, hence maybe someone can comment from 
that angle ?

FrankH.
Santosh Shilimkar - June 9, 2011, 4:27 p.m.
On 6/9/2011 9:10 PM, Russell King - ARM Linux wrote:
> On Thu, Jun 09, 2011 at 04:30:08PM +0100, Frank Hofmann wrote:
>> Btw, when testing this I found that generic cpu_suspend seems to be just
>> fine for OMAP3; the OMAP platforms though do not at this time use the
>> generic cpu_suspend/resume for sleep, is it planned to change that ?
>
> That's because OMAP was doing changes to their sleep code while I was
> consolidating the sleep code, and although I asked several times that
> the OMAP folk should participate in this effort, but evidentally I was
> unsuccessful in achieving anything in that direction.
>
Agreed, but at that point the code was not at
all in a convertible state. Looking at your comment below,
it still isn't :)

> And of course since then it's been forgotten about, and I've given up
> on that particular aspect.  I've also come to the conclusion that OMAP
> is sufficiently weird (requiring soo much to execute from SRAM) that
> its hopeless to persue.
>
We did discuss this, Russell, and requested your help here. I guess
you have already looked at the OMAP code from the generic suspend
hooks point of view, and the SRAM execution and errata seem to
make you feel it's not going to work.
Is that what you mean here ?

Regards
Santosh
Santosh Shilimkar - June 9, 2011, 4:35 p.m.
On 6/9/2011 9:56 PM, Frank Hofmann wrote:
>
>
> On Thu, 9 Jun 2011, Russell King - ARM Linux wrote:
>
>> On Thu, Jun 09, 2011 at 04:30:08PM +0100, Frank Hofmann wrote:
>>> Btw, when testing this I found that generic cpu_suspend seems to be just
>>> fine for OMAP3; the OMAP platforms though do not at this time use the
>>> generic cpu_suspend/resume for sleep, is it planned to change that ?
>>
>> That's because OMAP was doing changes to their sleep code while I was
>> consolidating the sleep code, and although I asked several times that
>> the OMAP folk should participate in this effort, but evidentally I was
>> unsuccessful in achieving anything in that direction.
>>
>> And of course since then it's been forgotten about, and I've given up
>> on that particular aspect. I've also come to the conclusion that OMAP
>> is sufficiently weird (requiring soo much to execute from SRAM) that
>> its hopeless to persue.
>>
>
> Thanks for the info.
>
> You're right the omap sleep code is long. The l1_logic_lost sequence of
> p15 accesses in there could probably be shortened via cpu_suspend/resume
> but it's not fully obvious (to me) how to plug it in.
>
> I'm largely asking because the hibernation patch, as written (and using
> the cpu_v7_do_suspend/resume backends), does "work" on OMAP3 as far as
> I've tested it, i.e. it didn't need the complex dance done by the OMAP
> sleep code itself.
>
It's not doing it for fun for sure.

> That said, there's secure state, there's maybe other stuff the iROM
> deals with, and I don't know how to comprehensively test this gets
> restored to a usable state (or even needs to), so a claim that the
> hibernation patch is proven perfect goes a bit far. Hence quotes, "works".
>
Thanks for appreciating something in that code
about secure state and all.

> Re OMAP, there's two questions regarding hibernation/suspend-to-disk:
>
> a) What reasons if any are there why cpu_{v7_do_}suspend/resume are not
> ok to use (on OMAP) for snapshotting core state, for the purpose of
> hibernation ?
> If there are any such issues, then how could they be addresssed ?
>
Part of the answer is what Russell described. We think it's doable,
but it needs some work. First and foremost, this code should
be executable from DDR. That's not the case today.

There are also some TrustZone registers that get in the way. For example,
AUXCTRL can't be written directly on OMAP: the write will either
abort or have no effect. L2 cache invalidations need
to use secure APIs, etc.
WFI can't be executed from DDR. It must run from SRAM,
which is mapped as non-cacheable memory.


> b) Is it necessary to provide machine hooks in the hibernation code for
> some auxilliary stuff that's not saved/restored by device suspend ?
> If so, how would this need to look like, from the OMAP view ?
>
I already covered this above.

> OMAP is a complex piece as it seems, hence maybe someone can comment
> from that angle ?
>
Agree it's complex but with some provisions in generic code,
it might work.

Regards
Santosh
Russell King - ARM Linux - June 9, 2011, 4:40 p.m.
On Thu, Jun 09, 2011 at 09:57:06PM +0530, Santosh Shilimkar wrote:
> On 6/9/2011 9:10 PM, Russell King - ARM Linux wrote:
>> On Thu, Jun 09, 2011 at 04:30:08PM +0100, Frank Hofmann wrote:
>>> Btw, when testing this I found that generic cpu_suspend seems to be just
>>> fine for OMAP3; the OMAP platforms though do not at this time use the
>>> generic cpu_suspend/resume for sleep, is it planned to change that ?
>>
>> That's because OMAP was doing changes to their sleep code while I was
>> consolidating the sleep code, and although I asked several times that
>> the OMAP folk should participate in this effort, but evidentally I was
>> unsuccessful in achieving anything in that direction.
>
> Agreed but the situation at that point was the code was not at
> all in convertible position. Looking at your below comment,
> it's still not :)

Well, I had a look before posting this reply, and ran away from it.
I've gone back to it several times since, and got a similar reaction.

I seem to remember that it looked _more_ convertible when I looked at
it when doing the generic suspend/resume support - I could see a nice
simple way to pull out the saving and just leave the PLL resume stuff
in SRAM.

I'm now convinced that if I try to convert it to use the generic support,
it will end up being a horrible broken mess.
Russell King - ARM Linux - June 9, 2011, 4:50 p.m.
On Tue, Jun 07, 2011 at 05:48:17PM +0100, Frank Hofmann wrote:
> There are a few dependencies this patch brings in:
>
> * due to the use of cpu_suspend / cpu_resume, it'll only apply as-is
>   to kernels no older than f6b0fa02e8b0708d17d631afce456524eadf87ff,
>   where Russell King introduced the generic interface.
>   Patching these into older kernels is a little work.
>
> * it temporarily uses swapper_pg_dir and establishes 1:1 mappings there
>   for a MMU-off transition, which is necessary before resume.
>   In order to tear these down afterwards, identity_mapping_del() needs
>   to be called; for some reason that's #ifdef CONFIG_SMP ...
>
> * it needs to "catch" sleep_save_sp after cpu_suspend() so that resume
>   can be provided with the proper starting point.
>   This requires an ENTRY(sleep_save_sp) in arch/arm/kernel/sleep.S so
>   that the symbol becomes public.
>
> * it assumes cpu_reset will disable the MMU. cpu_v6_reset/cpu_v7_reset
>   are currently not doing so (amongst some other minor chip types).
>
> * there's kind of a circular dependency between CONFIG_HIBERNATION and
>   CONFIG_PM_SLEEP, on ARM. The latter is necessary so that cpu_suspend
>   and cpu_resume are compiled in, but it cannot be selected via
>   ARCH_HIBERNATION_POSSIBLE because CONFIG_PM_SLEEP depends on
>   CONFIG_HIBERNATION_INTERFACE - selected by CONFIG_HIBERNATION.

Another issue is that it uses PHYS_OFFSET in assembly code which is not
permissible with P2V patching.
Santosh Shilimkar - June 9, 2011, 4:53 p.m.
On 6/9/2011 10:10 PM, Russell King - ARM Linux wrote:
> On Thu, Jun 09, 2011 at 09:57:06PM +0530, Santosh Shilimkar wrote:
>> On 6/9/2011 9:10 PM, Russell King - ARM Linux wrote:
>>> On Thu, Jun 09, 2011 at 04:30:08PM +0100, Frank Hofmann wrote:
>>>> Btw, when testing this I found that generic cpu_suspend seems to be just
>>>> fine for OMAP3; the OMAP platforms though do not at this time use the
>>>> generic cpu_suspend/resume for sleep, is it planned to change that ?
>>>
>>> That's because OMAP was doing changes to their sleep code while I was
>>> consolidating the sleep code, and although I asked several times that
>>> the OMAP folk should participate in this effort, but evidentally I was
>>> unsuccessful in achieving anything in that direction.
>>
>> Agreed but the situation at that point was the code was not at
>> all in convertible position. Looking at your below comment,
>> it's still not :)
>
> Well, I had a look before posting this reply, and ran away from it.
> I've gone back to it several times since, and got a similar reaction.
>
> I seem to remember that it looked _more_ convertable when I looked at
> it when doing the generic suspend/resume support - I could see a nice
> simple way to pull out the saving and just leave the PLL resume stuff
> in SRAM.
>
> I'm now convinced that if I try to convert it use the generic support,
> it will end up being a horrible broken mess.

I must admit that I had the same impression when I started looking at it.

The few provisions necessary for OMAP that I can think of are:
1. The WFI loop should be made a separate function so that it can be pushed
to SRAM, which is a must for OMAP3.

2. A callback before WFI to implement the Errata WA's

3. Avoid direct writes to AUXCTRL in the generic suspend code.

4. A callback before the MMU is enabled in resume, to restore
secure registers, set up AUXCTRL, etc.

With the above addressed, we should mostly be able to
get it working. But for sure it will mess up the
simple suspend hooks as they are today.

Btw, I looked at these suspend hooks for OMAP4 as well,
and most of the requirements above apply, except 4).

Additionally, the L2 cache handling isn't part of
these common suspend hooks.

Regards
Santosh
Frank Hofmann - June 9, 2011, 4:53 p.m.
On Thu, 9 Jun 2011, Russell King - ARM Linux wrote:

> On Tue, Jun 07, 2011 at 05:48:17PM +0100, Frank Hofmann wrote:
>> There are a few dependencies this patch brings in:
>>
>> * due to the use of cpu_suspend / cpu_resume, it'll only apply as-is
>>   to kernels no older than f6b0fa02e8b0708d17d631afce456524eadf87ff,
>>   where Russell King introduced the generic interface.
>>   Patching these into older kernels is a little work.
>>
>> * it temporarily uses swapper_pg_dir and establishes 1:1 mappings there
>>   for a MMU-off transition, which is necessary before resume.
>>   In order to tear these down afterwards, identity_mapping_del() needs
>>   to be called; for some reason that's #ifdef CONFIG_SMP ...
>>
>> * it needs to "catch" sleep_save_sp after cpu_suspend() so that resume
>>   can be provided with the proper starting point.
>>   This requires an ENTRY(sleep_save_sp) in arch/arm/kernel/sleep.S so
>>   that the symbol becomes public.
>>
>> * it assumes cpu_reset will disable the MMU. cpu_v6_reset/cpu_v7_reset
>>   are currently not doing so (amongst some other minor chip types).
>>
>> * there's kind of a circular dependency between CONFIG_HIBERNATION and
>>   CONFIG_PM_SLEEP, on ARM. The latter is necessary so that cpu_suspend
>>   and cpu_resume are compiled in, but it cannot be selected via
>>   ARCH_HIBERNATION_POSSIBLE because CONFIG_PM_SLEEP depends on
>>   CONFIG_HIBERNATION_INTERFACE - selected by CONFIG_HIBERNATION.
>
> Another issue is that it uses PHYS_OFFSET in assembly code which is not
> permissible with P2V patching.
>

Would calling something like:

unsigned long __swsusp_arch_get_vpoffset(void *addr)
{
         return (virt_to_phys(addr) - (unsigned long)addr);
}

from the assembly be ok ?

(That's what I've gone for at the moment to address this; a generic func 
somewhere to query for this would obviously be ok as well)
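
To make the arithmetic behind such a helper concrete, here is a small user-space model in C. It is only a sketch: model_phys_offset, MODEL_PAGE_OFFSET, model_virt_to_phys and model_get_vpoffset are hypothetical stand-ins (not kernel symbols), and the linear mapping is assumed to be a plain "subtract the virtual base, add the physical base" translation. The point it shows is that the v:p offset is a runtime quantity once the physical base is only known at boot, so it has to be queried rather than assembled in as a constant.

```c
#include <stdint.h>

#define MODEL_PAGE_OFFSET 0xC0000000u   /* assumed kernel virtual base */

/* Stand-in for a physical base that is only known at boot time. */
uint32_t model_phys_offset = 0x80000000u;

/* Linear-map translation: virtual address to physical address. */
uint32_t model_virt_to_phys(uint32_t vaddr)
{
	return vaddr - MODEL_PAGE_OFFSET + model_phys_offset;
}

/*
 * Same shape as the helper proposed above: physical minus virtual.
 * The result is taken modulo 2^32, so vaddr + offset wraps around
 * to the physical address.
 */
uint32_t model_get_vpoffset(uint32_t vaddr)
{
	return model_virt_to_phys(vaddr) - vaddr;
}
```

In this model the offset is the same for any address inside the linear map, and adding it to a virtual address yields the physical one. Changing model_phys_offset changes the result, which is what a hardcoded constant in assembly cannot follow.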

FrankH.
Santosh Shilimkar - June 9, 2011, 4:56 p.m.
On 6/9/2011 10:14 PM, Frank Hofmann wrote:
>
>
> On Thu, 9 Jun 2011, Santosh Shilimkar wrote:
>
>> On 6/9/2011 9:10 PM, Russell King - ARM Linux wrote:
>>> On Thu, Jun 09, 2011 at 04:30:08PM +0100, Frank Hofmann wrote:
>>>> Btw, when testing this I found that generic cpu_suspend seems to be
>>>> just
>>>> fine for OMAP3; the OMAP platforms though do not at this time use the
>>>> generic cpu_suspend/resume for sleep, is it planned to change that ?
>>>
>>> That's because OMAP was doing changes to their sleep code while I was
>>> consolidating the sleep code, and although I asked several times that
>>> the OMAP folk should participate in this effort, but evidentally I was
>>> unsuccessful in achieving anything in that direction.
>>>
>> Agreed but the situation at that point was the code was not at
>> all in convertible position. Looking at your below comment,
>> it's still not :)
>>
>>> And of course since then it's been forgotten about, and I've given up
>>> on that particular aspect. I've also come to the conclusion that OMAP
>>> is sufficiently weird (requiring soo much to execute from SRAM) that
>>> its hopeless to persue.
>>>
>> We did discuss this Russell and requested your help here. I guess
>> you have already looked at OMAP code from generic suspend
>> hooks point of view and the SRAM execution, Errata's seems to
>> make you feel it's not going to work.
>> Is that what you mean here ?
>>
>> Regards
>> Santosh
>>
>
> Sorry for interjecting ... you're right there's a lot special about
> OMAP. What I've been talking about is a rather small(ish) bit. Maybe the
> diff illustrates what I mean - use cpu_suspend/resume for the parts of
> off-mode save/restore that are non-OMAP-specific.
>
> Like this (not tested, just for illustration what I mean):
>
Mostly it won't work.
I just replied to your questions. I think you can find the
answer there on why this change won't work in its current form.

Regards
Santosh
Frank Hofmann - June 9, 2011, 5:07 p.m.
On Thu, 9 Jun 2011, Santosh Shilimkar wrote:

> On 6/9/2011 9:56 PM, Frank Hofmann wrote:
>> 
>> 
[ ... ]
>> a) What reasons if any are there why cpu_{v7_do_}suspend/resume are not
>> ok to use (on OMAP) for snapshotting core state, for the purpose of
>> hibernation ?
>> If there are any such issues, then how could they be addresssed ?
>> 
> Part of the answer is what Russell described. We think it's doable,
> but it needs some work. First and fore most is this code should
> be able to be executed from DDR. It's not the case today.

Ah, I gather that's the _real_ critical point, i.e.

 	_omap_sram_idle = omap_sram_push(omap34xx_cpu_suspend,
 					omap34xx_cpu_suspend_sz);

relies on this to be completely consecutive in mem, and relocatable, i.e. 
calling _outside_ that area isn't possible ?

I.e. unless a way can be found to _embed_ cpu_suspend/resume here, it's 
pretty hard to use ?

Would it be possible / acceptable to have it be relocatable code, and put 
it into a common .section ?


Thanks for the feedback,
FrankH.
Santosh Shilimkar - June 9, 2011, 5:10 p.m.
On 6/9/2011 10:37 PM, Frank Hofmann wrote:
>
>
> On Thu, 9 Jun 2011, Santosh Shilimkar wrote:
>
>> On 6/9/2011 9:56 PM, Frank Hofmann wrote:
>>>
>>>
> [ ... ]
>>> a) What reasons if any are there why cpu_{v7_do_}suspend/resume are not
>>> ok to use (on OMAP) for snapshotting core state, for the purpose of
>>> hibernation ?
>>> If there are any such issues, then how could they be addresssed ?
>>>
>> Part of the answer is what Russell described. We think it's doable,
>> but it needs some work. First and fore most is this code should
>> be able to be executed from DDR. It's not the case today.
>
> Ah, I gather that's the _real_ critical point, i.e.
>
> _omap_sram_idle = omap_sram_push(omap34xx_cpu_suspend,
> omap34xx_cpu_suspend_sz);
>
> relies on this to be completely consecutive in mem, and relocatable,
> i.e. calling _outside_ that area isn't possible ?
>
> I.e. unless a way can be found to _embed_ cpu_suspend/resume here, it's
> pretty hard to use ?
>
> Would it be possible / acceptable to have it be relocatable code, and
> put it into a common .section ?
>
Surely acceptable :)
But the other points about callbacks are also important; otherwise
you will use the MMU with wrong CP15 configurations after
one sleep transition.
Russell King - ARM Linux - June 9, 2011, 5:12 p.m.
On Thu, Jun 09, 2011 at 10:23:07PM +0530, Santosh Shilimkar wrote:
> On 6/9/2011 10:10 PM, Russell King - ARM Linux wrote:
>> On Thu, Jun 09, 2011 at 09:57:06PM +0530, Santosh Shilimkar wrote:
>>> On 6/9/2011 9:10 PM, Russell King - ARM Linux wrote:
>>>> On Thu, Jun 09, 2011 at 04:30:08PM +0100, Frank Hofmann wrote:
>>>>> Btw, when testing this I found that generic cpu_suspend seems to be just
>>>>> fine for OMAP3; the OMAP platforms though do not at this time use the
>>>>> generic cpu_suspend/resume for sleep, is it planned to change that ?
>>>>
>>>> That's because OMAP was doing changes to their sleep code while I was
>>>> consolidating the sleep code, and although I asked several times that
>>>> the OMAP folk should participate in this effort, but evidentally I was
>>>> unsuccessful in achieving anything in that direction.
>>>
>>> Agreed but the situation at that point was the code was not at
>>> all in convertible position. Looking at your below comment,
>>> it's still not :)
>>
>> Well, I had a look before posting this reply, and ran away from it.
>> I've gone back to it several times since, and got a similar reaction.
>>
>> I seem to remember that it looked _more_ convertable when I looked at
>> it when doing the generic suspend/resume support - I could see a nice
>> simple way to pull out the saving and just leave the PLL resume stuff
>> in SRAM.
>>
>> I'm now convinced that if I try to convert it use the generic support,
>> it will end up being a horrible broken mess.
>
> I must admit that I had same impression when I started looking at it.
>
> Few provisions are necessary for OMAP which I can think of are:
> 1. WFI loop should be made a seperate function so that it can pushed
> on SRAM which is must for OMAP3.

If you look at the generic cpu suspend, it sits in the suspend path.
Once cpu_suspend() has been called, it will return as normal and you
can then do whatever you require to place the system into suspend -
including calling out to a WFI function in SRAM.

> 2. A callback before WFI to implement the Errata WA's

"WA's" ?

> 3. Avoid direct write to AUXCTRL in generic suspend code.

This is the only problematical one that I can see.  We need to restore
this on systems running in secure mode.  What we could do is rather than
writing to the register, read it first and compare its value with what
was saved to see whether we need to write it.

Then, if platforms run in non-secure mode, they are responsible for
restoring that register back to its pre-suspend value before their
assembly calls cpu_resume().
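
The read-compare-write idea can be sketched as a user-space model in C. This is not the actual resume code: struct model_reg and the model_* functions are hypothetical, and a register that silently ignores writes stands in for the non-secure case (on real hardware the write might abort instead). The sketch only shows that no write is attempted when the live value already matches the saved one.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a control register that may be write-protected. */
struct model_reg {
	uint32_t value;
	bool     writable;        /* false models running in non-secure mode */
	unsigned write_attempts;  /* counts every attempted write */
};

uint32_t model_reg_read(const struct model_reg *r)
{
	return r->value;
}

void model_reg_write(struct model_reg *r, uint32_t v)
{
	r->write_attempts++;
	if (r->writable)
		r->value = v;
}

/*
 * Restore the saved value only if the live value differs.
 * Returns true if a write was issued, false if it was skipped.
 */
bool model_restore_if_changed(struct model_reg *r, uint32_t saved)
{
	if (model_reg_read(r) == saved)
		return false;
	model_reg_write(r, saved);
	return true;
}
```

If the platform has already put the register back to its pre-suspend value before the generic resume path runs, model_restore_if_changed() returns false and the protected register is never touched.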

> 4. Before MMU is enabled in resume a callback to restore
> secure register, setup auxctrl etc.

You can do this before your assembly calls cpu_resume().

> Additionally the L2 cache handling isn't part of
> these common suspend hooks.

L2 cache handling can't fit into the generic code - it doesn't really
belong there either.  It needs to be in the parent or hooked into the
syscore_ops stuff as I've said previously.

So:

ENTRY(my_soc_suspend)
	stmfd	sp!, {r4 - r12, lr}
	ldr	r3, =resume
	bl	cpu_suspend
	/*
	 * Insert whatever code is required here for suspend
	 * eg, save secure mode, then jump to sram to call WFI function
	 */
resume:
	ldmfd	sp!, {r4 - r12, pc}
ENDPROC(my_soc_suspend)

ENTRY(my_soc_resume)
	/*
	 * Insert whatever other code is required to be run before resume
	 * eg, WFI function returns to this symbol after DDR becomes
	 * accessible.  restore secure mode state
	 */
	b	cpu_resume
ENDPROC(my_soc_resume)

What makes it far more complicated in the OMAP case is all that "is l1 state
lost?  is l2 state lost?" stuff.
Russell King - ARM Linux - June 9, 2011, 5:14 p.m.
On Thu, Jun 09, 2011 at 06:07:59PM +0100, Frank Hofmann wrote:
> I.e. unless a way can be found to _embed_ cpu_suspend/resume here, it's  
> pretty hard to use ?

I don't think OMAP needs to push cpu_suspend/resume calls out into SRAM
at all... see the mail I just sent prior to this.

> Would it be possible / acceptable to have it be relocatable code, and put
> it into a common .section ?

No, because it needs to call CPU specific functions itself, which can't
be relocated.  I don't see why OMAP needs any of that complexity anyway,
and so it's pure overengineering.
Santosh Shilimkar - June 9, 2011, 5:21 p.m.
On 6/9/2011 10:42 PM, Russell King - ARM Linux wrote:
> On Thu, Jun 09, 2011 at 10:23:07PM +0530, Santosh Shilimkar wrote:
>> On 6/9/2011 10:10 PM, Russell King - ARM Linux wrote:
>>> On Thu, Jun 09, 2011 at 09:57:06PM +0530, Santosh Shilimkar wrote:
>>>> On 6/9/2011 9:10 PM, Russell King - ARM Linux wrote:
>>>>> On Thu, Jun 09, 2011 at 04:30:08PM +0100, Frank Hofmann wrote:
>>>>>> Btw, when testing this I found that generic cpu_suspend seems to be just
>>>>>> fine for OMAP3; the OMAP platforms though do not at this time use the
>>>>>> generic cpu_suspend/resume for sleep, is it planned to change that ?
>>>>>
>>>>> That's because OMAP was doing changes to their sleep code while I was
>>>>> consolidating the sleep code, and although I asked several times that
>>>>> the OMAP folk should participate in this effort, but evidentally I was
>>>>> unsuccessful in achieving anything in that direction.
>>>>
>>>> Agreed but the situation at that point was the code was not at
>>>> all in convertible position. Looking at your below comment,
>>>> it's still not :)
>>>
>>> Well, I had a look before posting this reply, and ran away from it.
>>> I've gone back to it several times since, and got a similar reaction.
>>>
>>> I seem to remember that it looked _more_ convertable when I looked at
>>> it when doing the generic suspend/resume support - I could see a nice
>>> simple way to pull out the saving and just leave the PLL resume stuff
>>> in SRAM.
>>>
>>> I'm now convinced that if I try to convert it use the generic support,
>>> it will end up being a horrible broken mess.
>>
>> I must admit that I had same impression when I started looking at it.
>>
>> Few provisions are necessary for OMAP which I can think of are:
>> 1. WFI loop should be made a seperate function so that it can pushed
>> on SRAM which is must for OMAP3.
>
> If you look at the generic cpu suspend, it sits in the suspend path.
> Once cpu_suspend() has been called, it will return as normal and you
> can then do whatever you require to place the system into suspend -
> including calling out to a function in WFI.
>
>> 2. A callback before WFI to implement the Errata WA's
>
> "WA's" ?
>
Software Work-Arounds for issues around WFI or special
cases.

>> 3. Avoid direct write to AUXCTRL in generic suspend code.
>
> This is the only problematical one that I can see.  We need to restore
> this on systems running in secure mode.  What we could do is rather than
> writing to the register, read it first and compare its value with what
> was saved to see whether we need to write it.
>
> Then, if platforms run in non-secure mode, they are responsible for
> restoring that register back to its pre-suspend value before their
> assembly calls cpu_resume().
>
>> 4. Before MMU is enabled in resume a callback to restore
>> secure register, setup auxctrl etc.
>
> You can do this before your assembly calls cpu_resume().
>
>> Additionally the L2 cache handling isn't part of
>> these common suspend hooks.
>
> L2 cache handling can't fit into the generic code - it doesn't really
> belong there either.  It needs to be in the parent or hooked into the
> syscore_ops stuff as I've said previously.
>
Ok. I missed these points in last discussion.

> So:
>
> ENTRY(my_soc_suspend)
>          stmfd   sp!, {r4 - r12, lr}
>          ldr     r3, =resume
>          bl      cpu_suspend
> 	/*
> 	 * Insert whatever code is required here for suspend
> 	 * eg, save secure mode, then jump to sram to call WFI function
> 	 */
> resume:
> 	ldmfd	sp!, {r4 - r12, pc}
> ENDPROC(my_soc_suspend)
>
> ENTRY(my_soc_resume)
> 	/*
> 	 * Insert whatever other code is required to be run before resume
> 	 * eg, WFI function returns to this symbol after DDR becomes
> 	 * accessible.  restore secure mode state
> 	 */
> 	b	cpu_resume
> ENDPROC(my_soc_resume)
>
> What makes it far more complicated in the OMAP case is all that "is l1 state
> lost?  is l2 state lost?" stuff.

Exactly. And that's where the most complexity comes in. Also on OMAP
we use same sleep code for CPUidle and suspend and in idle, there
are many low power states possible with variation of L1, l2, CPU logic,
L2 controller logic, interrupt controller logic etc.

Thanks for bringing up these points. Now I better understand
why I struggled to get anything running reliably
on OMAP4 in a few hours' attempt with the suspend hooks.

Regards
Frank Hofmann - June 10, 2011, 12:22 p.m.
On Thu, 9 Jun 2011, Russell King - ARM Linux wrote:

> On Thu, Jun 09, 2011 at 10:23:07PM +0530, Santosh Shilimkar wrote:
[ ... ]
>> 3. Avoid direct write to AUXCTRL in generic suspend code.
>
> This is the only problematical one that I can see.  We need to restore
> this on systems running in secure mode.  What we could do is rather than
> writing to the register, read it first and compare its value with what
> was saved to see whether we need to write it.
>
> Then, if platforms run in non-secure mode, they are responsible for
> restoring that register back to its pre-suspend value before their
> assembly calls cpu_resume().

While this is ok from the point of view of having cpu_suspend / resume 
being service functions for the platform-specific idle/off-mode code, it 
also illustrates the difficulty this creates for the hibernation code.

If it's not possible to call cpu_suspend / cpu_resume (or something like 
it - not tied to names ...) as a full-featured generic interface, then 
creating a true snapshot capability becomes problematic.

>
>> 4. Before MMU is enabled in resume a callback to restore
>> secure register, setup auxctrl etc.
>
> You can do this before your assembly calls cpu_resume().

Only that it's known at the point of call to cpu_suspend/resume.

Again, I admit to being biased regarding the usecase here ...

See below.

>
>> Additionally the L2 cache handling isn't part of
>> these common suspend hooks.
>
> L2 cache handling can't fit into the generic code - it doesn't really
> belong there either.  It needs to be in the parent or hooked into the
> syscore_ops stuff as I've said previously.

For the "take things down" side, code like machine_restart() already is 
able to flush/inval/disable all these things on the way down, without any 
SoC-specific knowledge.

OMAP suspend/resume has, just recently, gone half-way there (use the 
provided flush/inval instead of the home-grown table walker code),

Agree with that. Cache flushing / disabling / invalidation has interfaces 
(the outer_*, l2* and cache ops) already, and e.g. the OMAP code has 
recently started to use some of those (kernel_flush instead of the 
home-grown inval loop). Looks like that part is on its way.

On the resume side, there's actually a problem here - the generic way of 
enabling a cache is through initcalls - which don't happen on resume, so 
if the system comes out of a "low enough" state there's some work to do 
here - which generic cpu_resume() does not do.

>
> So:
>
> ENTRY(my_soc_suspend)
>        stmfd   sp!, {r4 - r12, lr}
>        ldr     r3, =resume
>        bl      cpu_suspend
> 	/*
> 	 * Insert whatever code is required here for suspend
> 	 * eg, save secure mode, then jump to sram to call WFI function
> 	 */
> resume:
> 	ldmfd	sp!, {r4 - r12, pc}
> ENDPROC(my_soc_suspend)
>
> ENTRY(my_soc_resume)
> 	/*
> 	 * Insert whatever other code is required to be run before resume
> 	 * eg, WFI function returns to this symbol after DDR becomes
> 	 * accessible.  restore secure mode state
> 	 */
> 	b	cpu_resume
> ENDPROC(my_soc_resume)

That alone doesn't accommodate the following situations:

a) there might be pre-suspend / post-resume activities necessary, i.e.
    the assumption that any SoC-specific "go down" activity can be done
    after cpu_suspend() and any SoC-specific "bring up" activity before
    cpu_resume() might not be sufficient.
    Case in point: Reenabling L2 caches after resume.

b) my bias - snapshotting state (for hibernation).
    Delegating this to a SoC-specific method risks creating code like
    the OMAP stuff - where state saving, power management, off mode
    and whatnot is all interwoven and interdependent.
    It also creates the problem that _generic_ (platform-independent)
    hibernation code becomes impossible to do ...

    A clean approach would be separate steps:

 	<mach preparation for suspend state snapshot>
 	<generic state snapshot>
 	<mach state snapshot>
 	...
 	<wfi>	/* or not ... */
 	...
 	<mach prep for resume / basic initialization>
 	<generic resume>
 	<mach resume>

So why not have those hooks _inside_ cpu_suspend / cpu_resume, i.e. like:

.data
ENTRY(cpu_suspend)
 	mov	r9, lr				@ keep caller's return address
#ifdef	CONFIG_ARCH_NEEDS_SPAGHETTI_SUSPEND
 	mov	lr, pc
 	ldr	pc, mach_pre_suspend_hook	@ call hook, returns here
#endif
 	...
#ifdef	CONFIG_ARCH_NEEDS_SPAGHETTI_SUSPEND
 	mov	lr, r9
 	ldr	r4, mach_post_suspend_hook
 	mov	pc, r4				@ tail-call hook, it returns to caller
#else
 	mov	pc, r9
#endif
ENDPROC(cpu_suspend)

#ifdef	CONFIG_ARCH_NEEDS_SPAGHETTI_SUSPEND
mach_pre_suspend_hook:
.long 0
mach_post_suspend_hook:
.long 0
#endif


and let the SoC initialization set them if it so desires ?
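
For illustration only, the same hook idea can be written in plain C as
optional function pointers that stay NULL unless a platform installs them.
This is a user-space sketch under assumptions, not kernel code: the hook
names follow the assembly above, while generic_cpu_suspend(), record(),
the trace buffer and the soc_*() functions are hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical trace buffer so the call order can be observed. */
static char trace[8];
static size_t trace_len;

static void record(char step)
{
	assert(trace_len < sizeof(trace) - 1);
	trace[trace_len++] = step;
	trace[trace_len] = '\0';
}

/* Optional hooks; NULL means the platform needs nothing here. */
static void (*mach_pre_suspend_hook)(void);
static void (*mach_post_suspend_hook)(void);

/* Stand-in for the CPU-only state save; NOT the real cpu_suspend(). */
static void generic_cpu_suspend(void)
{
	record('C');
}

/* Run the optional pre hook, the CPU part, then the optional post hook. */
void suspend_with_hooks(void)
{
	if (mach_pre_suspend_hook)
		mach_pre_suspend_hook();
	generic_cpu_suspend();
	if (mach_post_suspend_hook)
		mach_post_suspend_hook();
}

/* Example hooks, as a hypothetical SoC init routine might install them. */
static void soc_pre(void)  { record('<'); }
static void soc_post(void) { record('>'); }

void soc_init_hooks(void)
{
	mach_pre_suspend_hook = soc_pre;
	mach_post_suspend_hook = soc_post;
}
```

Calling suspend_with_hooks() before soc_init_hooks() records only the CPU
step; after soc_init_hooks() the pre and post hooks bracket it.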


This would allow using them for snapshotting the state as well.

Key point, again, from the hibernation bias, is really to have that 
stuff _separate_ from power-down / wfi / whatever-to-enter-lowpower-mode.


As many of these activities as possible should be dealt with by sysdev / 
syscore, agreed; but unfortunately there might be certain things, 
especially around secure state, that are too closely tied in to delegate 
it to those ?


>
> What makes it far more complicated in the OMAP case is all that "is l1 state
> lost?  is l2 state lost?" stuff.
>

It looks like it's structured this way:

omap_cpu_suspend()
{
 	switch (state) {
 	case 3:
 	case 1:
 		/* save context */
 	case 2:
 		/* clean caches */
 	case 0:
 		wfi();
 	}
}

omap_cpu_resume()	/* from OFF, case 2 / 0 never happens */
{
 	if (state == 3)
 		/* disable / inval L2 */

 	/* restore context */
 	/* reenable L2 */
}


But the code doesn't perfectly match the comments in it.
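
To make the fall-through in the suspend switch above explicit, here is a
small compilable C sketch that only reports which steps each state value
would run. It models nothing beyond the pseudocode: the STEP_* flags and
omap_suspend_steps() are hypothetical names, and what the state numbers
mean on real hardware is not taken from the OMAP sources.

```c
#include <assert.h>

enum {
	STEP_SAVE_CONTEXT = 1 << 0,
	STEP_CLEAN_CACHES = 1 << 1,
	STEP_WFI          = 1 << 2,
};

/* Return the set of steps the pseudocode would execute for 'state'. */
unsigned int omap_suspend_steps(int state)
{
	unsigned int steps = 0;

	assert(state >= 0 && state <= 3);

	switch (state) {
	case 3:
	case 1:
		steps |= STEP_SAVE_CONTEXT;
		/* fall through */
	case 2:
		steps |= STEP_CLEAN_CACHES;
		/* fall through */
	case 0:
		steps |= STEP_WFI;
		break;
	}
	return steps;
}
```

States 1 and 3 run all three steps, state 2 skips the context save, and
state 0 goes straight to the wait-for-interrupt step.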


FrankH.
Russell King - ARM Linux - June 10, 2011, 1:43 p.m.
On Fri, Jun 10, 2011 at 01:22:24PM +0100, Frank Hofmann wrote:
> On Thu, 9 Jun 2011, Russell King - ARM Linux wrote:
>
>> On Thu, Jun 09, 2011 at 10:23:07PM +0530, Santosh Shilimkar wrote:
> [ ... ]
>>> 3. Avoid direct write to AUXCTRL in generic suspend code.
>>
>> This is the only problematical one that I can see.  We need to restore
>> this on systems running in secure mode.  What we could do is rather than
>> writing to the register, read it first and compare its value with what
>> was saved to see whether we need to write it.
>>
>> Then, if platforms run in non-secure mode, they are responsible for
>> restoring that register back to its pre-suspend value before their
>> assembly calls cpu_resume().
>
> While this is ok from the point of view of having cpu_suspend / resume  
> being service functions for the platform-specific idle/off-mode code, it  
> also illustrates the difficulty this creates for the hibernation code.
>
> If it's not possible to call cpu_suspend / cpu_resume (or something like  
> it - not tied to names ...) as a full-featured generic interface, then  
> creating a true snapshot capability becomes problematic.

It is not intended to be a full-featured interface.  It is designed to
be an interface to handle the CPU specific part of the suspend/resume
only.

And I use the term CPU as strictly defined not as most people lazily do
to define the entire SoC.  I mean the core processor itself.

A generic interface can not handle the issues of secure mode when there
is no defined secure mode API.  It can't handle the L2 cache crap because
that's outside the scope of the CPU and is platform dependent.  It can't
handle devices because that's again outside the scope of the CPU and is
SoC dependent.

The only thing it can do - and should do - is deal with the CPU specific
part.  That's why it has a cpu_ prefix.

>>> 4. Before MMU is enabled in resume a callback to restore
>>> secure register, setup auxctrl etc.
>>
>> You can do this before your assembly calls cpu_resume().
>
> Only that it's known at the point of call to cpu_suspend/resume.
>
> Again, I admit to being biased regarding the usecase here ...

How can generic CPU code know about all the platform idiotic farces that
people pull?  No, what you're asking for is total madness.

>>> Additionally the L2 cache handling isn't part of
>>> these common suspend hooks.
>>
>> L2 cache handling can't fit into the generic code - it doesn't really
>> belong there either.  It needs to be in the parent or hooked into the
>> syscore_ops stuff as I've said previously.
>
> For the "take things down" side, code like machine_restart() already is  
> able to flush/inval/disable all these things on the way down, without any 
> SoC-specific knowledge.
>
> OMAP suspend/resume has, just recently, gone half-way there (use the  
> provided flush/inval instead of the home-grown table walker code),
>
> Agree with that. Cache flushing / disabling / invalidation has interfaces 
> (the outer_*, l2* and cache ops) already, and e.g. the OMAP code has  
> recently started to use some of those (kernel_flush instead of the  
> home-grown inval loop). Looks like that part is on its way.
>
> On the resume side, there's actually a problem here - the generic way of  
> enabling a cache is through initcalls - which don't happen on resume, so  
> if the system comes out of a "low enough" state there's some work to do  
> here - which generic cpu_resume() does not do.

And can't do.

>>
>> So:
>>
>> ENTRY(my_soc_suspend)
>>        stmfd   sp!, {r4 - r12, lr}
>>        ldr     r3, =resume
>>        bl      cpu_suspend
>> 	/*
>> 	 * Insert whatever code is required here for suspend
>> 	 * eg, save secure mode, then jump to sram to call WFI function
>> 	 */
>> resume:
>> 	ldmfd	sp!, {r4 - r12, pc}
>> ENDPROC(my_soc_suspend)
>>
>> ENTRY(my_soc_resume)
>> 	/*
>> 	 * Insert whatever other code is required to be run before resume
>> 	 * eg, WFI function returns to this symbol after DDR becomes
>> 	 * accessible.  restore secure mode state
>> 	 */
>> 	b	cpu_resume
>> ENDPROC(my_soc_resume)
>
> That alone doesn't accommodate the following situations:
>
> a) there might be pre-suspend / post-resume activities necessary, i.e.
>    the assumption that any SoC-specific "go down" activity can be done
>    after cpu_suspend() and any SoC-specific "bring up" activity before
>    cpu_resume() might not be sufficient.
>    Case in point: Reenabling L2 caches after resume.

Look, platform code calls cpu_suspend() as part of whatever is required
to do the suspend work.  It deals with the core CPU crap, nothing more.
Platforms have to take care to deal with whatever shite they have before
the CPU core crap is handled, and then do whatever shite they need to
do after the CPU core crap has been handled.

You can't get away from that.  You can't go stuffing L2 shite into the
middle of this.

> b) my bias - snapshotting state (for hibernation).
>    Delegating this to a SoC-specific method risks creating code like
>    the OMAP stuff - where state saving, power management, off mode
>    and whatnot is all interwoven and interdependent.
>    It also creates the problem that _generic_ (platform-independent)
>    hibernation code becomes impossible to do ...
>
>    A clean thing are separate steps:
>
> 	<mach preparation for suspend state snapshot>
> 	<generic state snapshot>
> 	<mach state snapshot>
> 	...
> 	<wfi>	/* or not ... */
> 	...
> 	<mach prep for resume / basic initialization>
> 	<generic resume>
> 	<mach resume>
>
> So why not have those hooks _inside_ cpu_suspend / cpu_resume, i.e. like:

What's the point when platform code has *ALREADY* to call these functions?

Is it really too sodding difficult for platforms to do:

my_suspend_hook()
{
	mach_pre_suspend_hook();
	cpu_suspend();
	mach_post_suspend_hook();
}

?

<not read the rest of the message, this is getting idiotic>
Frank Hofmann - June 10, 2011, 1:47 p.m.
On Fri, 10 Jun 2011, Russell King - ARM Linux wrote:

[ ... ]
> What's the point when platform code has *ALREADY* to call these functions?
>
> Is it really too sodding difficult for platforms to do:
>
> my_suspend_hook()
> {
> 	mach_pre_suspend_hook();
> 	cpu_suspend();
> 	mach_post_suspend_hook();
> }
>
> ?
>
> <not read the rest of the message, this is getting idiotic>
>

Sorry for being unclear.

Yes, they already have to do all this. And they should.


Except for one thing:

They all, IN ADDITION, do ALSO:

 	wfi();			/* or whatever else to power down */


For Hibernation, _THAT_ needs to be out of the codepath.


So that one can snapshot without powering down.

FrankH.
Russell King - ARM Linux - June 10, 2011, 2:02 p.m.
On Fri, Jun 10, 2011 at 02:47:30PM +0100, Frank Hofmann wrote:
> On Fri, 10 Jun 2011, Russell King - ARM Linux wrote:
>
> [ ... ]
>> What's the point when platform code has *ALREADY* to call these functions?
>>
>> Is it really too sodding difficult for platforms to do:
>>
>> my_suspend_hook()
>> {
>> 	mach_pre_suspend_hook();
>> 	cpu_suspend();
>> 	mach_post_suspend_hook();
>> }
>>
>> ?
>>
>> <not read the rest of the message, this is getting idiotic>
>>
>
> Sorry for being unclear.
>
> Yes, they already have to do all this. And they should.
>
>
> Except for one thing:
>
> They all, IN ADDITION, do ALSO:
>
> 	wfi();			/* or whatever else to power down */
>
>
> For Hibernation, _THAT_ needs to be out of the codepath.
>
>
> So that one can snapshot without powering down.

I think there's a fundamental problem here - what's required for S2RAM
is not what's required for hibernate.  After cpu_suspend() has done
its job, you are in a _very_ specific environment designed for the last
stages of S2RAM _only_ and not hibernate.

In order to use cpu_suspend() for hibernate, it requires a completely
different path entirely, and there's no getting away from that.

You can see that when you analyze the differences between S2RAM and
hibernate, when you realize that the final part of the S2RAM process
(which happens after cpu_suspend() returns) on many SoCs is dealing
with putting SDRAM into self-refresh mode before writing some kind of
power mode register to tell the power supply to kill power to most
of the platform.  That is all _very_ SoC specific.

Also realize that the code which executes after cpu_suspend() returns
is _not_ running in the same context as the code which called
cpu_suspend() - cpu_suspend() has modified the stack pointer to store
the CPU specific state and that is not the same stack pointer as was
the case before cpu_suspend() was called.

You don't want to run any of that code when you're dealing with hibernate,
so expecting to be able to reuse these S2RAM paths is not realistic.

What we could do is provide a cpu_hibernate() function which has saner
semantics for saving the CPU specific state for hibernate.
Frank Hofmann - June 10, 2011, 2:54 p.m.
On Fri, 10 Jun 2011, Russell King - ARM Linux wrote:

>> [ ... ]
> I think there's a fundamental problem here - what's required for S2RAM
> is not what's required for hibernate.  After cpu_suspend() has done
> its job, you are in a _very_ specific environment designed for the last
> stages of S2RAM _only_ and not hibernate.
>
> In order to use cpu_suspend() for hibernate, it requires a completely
> different path entirely, and there's no getting away from that.
>
> You can see that when you analyze the differences between S2RAM and
> hibernate, when you realize that the final part of the S2RAM process
> (which happens after cpu_suspend() returns) on many SoCs is dealing
> with putting SDRAM into self-refresh mode before writing some kind of
> power mode register to tell the power supply to kill power to most
> of the platform.  That is all _very_ SoC specific.

Yes, that's what I'm trying to say - the _final_ stage, for s2ram, sends 
the SoC to low-power.
Up until there, we do the same for hibernation, don't we ? Where exactly 
is it different ?

>
> Also realize that the code which executes after cpu_suspend() returns
> is _not_ running in the same context as the code which called
> cpu_suspend() - cpu_suspend() has modified the stack pointer to store
> the CPU specific state and that is not the same stack pointer as was
> the case before cpu_suspend() was called.

Yes, the function isn't "well behaved" from the ABI point of view because 
it doesn't preserve registers (including the stack), but that can be 
accommodated by the caller.
The current s2ram callers have to accommodate that as well. Which is 
ultimately easy for them - since poweroff doesn't care.


The only reason why hibernation / swsusp_arch_suspend() is different there 
is because the activity _after_ cpu_suspend() is extensive and _can fail_ 
(saving the image); on that failure, one would prefer to see an error 
message and continue instead of panicking the system. So the stack change 
you mention needs to be addressed, swsusp_arch_suspend() must be a 
well-behaved function from the ABI point of view.


Normally, if all goes _right_, swsusp_save() does not return either. It 
ends up powering the system off. If one were willing to die without a message 
on failure to save the snapshot to disk, and would be willing to block 
cpu_suspend during writing the snapshot (to guarantee sleep_save_sp isn't 
changing) one wouldn't need to care about the stack and could simply:

ENTRY(swsusp_arch_suspend)
 	mrs	r1, cpsr
 	mrs	r2, spsr
 	stmfd	sp!, {r1-r12,lr}
 	bl	__swsusp_arch_get_vpoffset
 	mov	r1, r0
 	adr	r3, .Lresume_post_mmu
 	bl	cpu_suspend
 	bl	swsusp_save
0:
 	b	0b		@ should never reach this
ENDPROC(swsusp_arch_suspend)


Resume is quite trivial either way:

ENTRY(swsusp_arch_resume)
         setmode PSR_I_BIT | PSR_F_BIT | SVC_MODE, r2
         ldr     sp, =(__swsusp_resume_stk + PAGE_SIZE / 2)
 	/*
 	 * replays image, and ends in cpu_reset(cpu_resume)
 	 */
         b       __swsusp_arch_restore_image
.Lresume_post_mmu:
         ldmfd   sp!, {r1-r12}
         msr     cpsr, r1
         msr     spsr, r2
         bl      cpu_init                        @ reinitialize other modes
         ldmfd   sp!, {lr}
         b       __swsusp_arch_resume_finish     @ cleanup
ENDPROC(swsusp_arch_resume)


>
> You don't want to run any of that code when you're dealing with hibernate,
> so expecting to be able to reuse these S2RAM paths is not realistic.

Hmm, well ... in the end, hibernation does:

 	<snapshot state>
 	<some long operation that writes the image out>
 	<poweroff>

while s2ram does:

 	<snapshot state>
 	<some quick operation setting low power modes>
 	<poweroff>

>
> What we could do is provide a cpu_hibernate() function which has saner
> semantics for saving the CPU specific state for hibernate.

Yes, that's exactly what I'm hoping for. From my point of view, this 
would, though, end up in:

cpu_soc_suspend:
 	cpu_hibernate_snapshot_state();
 	/* S2RAM codepath to send soc to low power */

cpu_soc_resume:
 	/* S2RAM codepath for waking up soc essentials */
 	cpu_hibernate_restore_state();

At least I can't come up with a really good reason why the state 
snapshotting operation would have to be different between s2ram and 
s2disk.


FrankH.
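
The proposal in the message above, one shared snapshot step with only the
tail differing between s2ram and s2disk, can be sketched in user-space C as
follows. This only illustrates the shape of the argument and says nothing
about whether it is achievable in the kernel. The name
cpu_hibernate_snapshot_state() is taken from the pseudocode above; struct
cpu_snapshot, enum tail, s2ram_enter() and hibernate_enter() are
hypothetical, and the write_ok flag merely simulates an image write that
can fail.

```c
#include <assert.h>
#include <stdbool.h>

struct cpu_snapshot {
	bool valid;
	unsigned long sp;	/* stand-in for the saved CPU state */
};

enum tail { TAIL_NONE, TAIL_LOW_POWER, TAIL_IMAGE_WRITTEN };

/* Hypothetical common step: capture state, do not power down. */
void cpu_hibernate_snapshot_state(struct cpu_snapshot *snap, unsigned long sp)
{
	assert(snap);
	snap->sp = sp;
	snap->valid = true;
}

/* S2RAM: snapshot, then a short SoC-specific low-power sequence. */
enum tail s2ram_enter(struct cpu_snapshot *snap, unsigned long sp)
{
	cpu_hibernate_snapshot_state(snap, sp);
	return TAIL_LOW_POWER;
}

/*
 * Hibernate: snapshot, then a long image write that may fail. On failure
 * an error is returned to the caller instead of halting.
 */
int hibernate_enter(struct cpu_snapshot *snap, unsigned long sp,
		    bool write_ok, enum tail *tail)
{
	cpu_hibernate_snapshot_state(snap, sp);
	if (!write_ok) {
		*tail = TAIL_NONE;
		return -1;
	}
	*tail = TAIL_IMAGE_WRITTEN;
	return 0;
}
```

Both entry points leave a valid snapshot behind; only the hibernate path
has a failure case that must be reported back to its caller.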
Russell King - ARM Linux - June 21, 2011, 10:11 a.m.
On Thu, Jun 09, 2011 at 06:53:13PM +0100, Russell King - ARM Linux wrote:
> On Thu, Jun 09, 2011 at 06:12:55PM +0100, Russell King - ARM Linux wrote:
> > > 3. Avoid direct write to AUXCTRL in generic suspend code.
> > 
> > This is the only problematical one that I can see.  We need to restore
> > this on systems running in secure mode.  What we could do is rather than
> > writing to the register, read it first and compare its value with what
> > was saved to see whether we need to write it.
> > 
> > Then, if platforms run in non-secure mode, they are responsible for
> > restoring that register back to its pre-suspend value before their
> > assembly calls cpu_resume().
> 
> And here's a patch which does that:

Ping.

> 8<-----------
> From: Russell King <rmk+kernel@arm.linux.org.uk>
> ARM: Avoid writing to auxctrl register unless it needs to be updated
> 
> As the auxiliary control register is not writable in non-secure mode
> such as on OMAP, we must avoid writing the register when resuming in
> non-secure mode.  Avoid this by moving the responsibility to the
> SoC code in this case to ensure that the auxiliary control register
> is restored before cpu_resume() is called.
> 
> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
> --
>  arch/arm/mm/proc-v7.S |    4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm/mm/proc-v7.S b/arch/arm/mm/proc-v7.S
> index 3c38678..fa1e6d5 100644
> --- a/arch/arm/mm/proc-v7.S
> +++ b/arch/arm/mm/proc-v7.S
> @@ -237,7 +237,9 @@ ENTRY(cpu_v7_do_resume)
>  	mcr	p15, 0, r7, c2, c0, 0	@ TTB 0
>  	mcr	p15, 0, r8, c2, c0, 1	@ TTB 1
>  	mcr	p15, 0, ip, c2, c0, 2	@ TTB control register
> -	mcr	p15, 0, r10, c1, c0, 1	@ Auxiliary control register
> +	mrc	p15, 0, r4, c1, c0, 1	@ Read auxiliary control register
> +	teq	r4, r10
> +	mcrne	p15, 0, r10, c1, c0, 1	@ Auxiliary control register
>  	mcr	p15, 0, r11, c1, c0, 2	@ Co-processor access control
>  	ldr	r4, =PRRR		@ PRRR
>  	ldr	r5, =NMRR		@ NMRR
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
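
The read/compare/conditional-write sequence in the quoted patch (mrc, teq,
mcrne) can be modelled in user-space C as below. This is a sketch under
assumptions: struct aux_reg, aux_write() and restore_auxctrl() are
hypothetical names, and the "writable" flag merely simulates a register
that cannot be written in non-secure mode. It does not describe what the
hardware actually does on such a write.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct aux_reg {
	uint32_t value;
	bool writable;			/* false models non-secure mode */
	unsigned int rejected_writes;	/* writes that could not take effect */
};

static void aux_write(struct aux_reg *reg, uint32_t v)
{
	if (!reg->writable) {
		reg->rejected_writes++;
		return;
	}
	reg->value = v;
}

/* Mirrors mrc / teq / mcrne: write only when the value differs. */
void restore_auxctrl(struct aux_reg *reg, uint32_t saved)
{
	assert(reg);
	if (reg->value != saved)
		aux_write(reg, saved);
}
```

If the SoC code has already restored the register before resume, no write
is attempted at all. In the simulated secure case a differing value is
simply written back.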

Patch

 arch/arm/include/asm/memory.h |    1 +
 arch/arm/kernel/Makefile      |    1 +
 arch/arm/mm/Kconfig           |    5 ++
 arch/arm/kernel/cpu.c         |   94 +++++++++++++++++++++++++++++++++++++++++
 arch/arm/kernel/swsusp.S      |   84 ++++++++++++++++++++++++++++++++++++
 5 files changed, 185 insertions(+), 0 deletions(-)

diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index 431077c..c7ef454 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -250,6 +250,7 @@ static inline void *phys_to_virt(phys_addr_t x)
  */
 #define __pa(x)			__virt_to_phys((unsigned long)(x))
 #define __va(x)			((void *)__phys_to_virt((unsigned long)(x)))
+#define __pa_symbol(x)		__pa(RELOC_HIDE((unsigned long)(x),0))
 #define pfn_to_kaddr(pfn)	__va((pfn) << PAGE_SHIFT)
 
 /*
diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile
index 8d95446..b76a403 100644
--- a/arch/arm/kernel/Makefile
+++ b/arch/arm/kernel/Makefile
@@ -30,6 +30,7 @@ obj-$(CONFIG_ARTHUR)		+= arthur.o
 obj-$(CONFIG_ISA_DMA)		+= dma-isa.o
 obj-$(CONFIG_PCI)		+= bios32.o isa.o
 obj-$(CONFIG_PM_SLEEP)		+= sleep.o
+obj-$(CONFIG_HIBERNATION)	+= cpu.o swsusp.o
 obj-$(CONFIG_HAVE_SCHED_CLOCK)	+= sched_clock.o
 obj-$(CONFIG_SMP)		+= smp.o smp_tlb.o
 obj-$(CONFIG_HAVE_ARM_SCU)	+= smp_scu.o
diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
index 0074b8d..c668f8f 100644
--- a/arch/arm/mm/Kconfig
+++ b/arch/arm/mm/Kconfig
@@ -627,6 +627,11 @@ config CPU_USE_DOMAINS
 config IO_36
 	bool
 
+config ARCH_HIBERNATION_POSSIBLE
+	bool
+	depends on MMU
+	default y if CPU_ARM920T || CPU_ARM926T || CPU_SA1100 || CPU_XSCALE || CPU_XSC3 || CPU_V6 || CPU_V6K || CPU_V7
+
 comment "Processor Features"
 
 config ARM_THUMB
diff --git a/arch/arm/kernel/cpu.c b/arch/arm/kernel/cpu.c
new file mode 100644
index 0000000..2cdfa85
--- /dev/null
+++ b/arch/arm/kernel/cpu.c
@@ -0,0 +1,94 @@
+/*
+ * Hibernation support specific for ARM
+ *
+ * Derived from work on ARM hibernation support by:
+ *
+ * Ubuntu project, hibernation support for mach-dove
+ * Copyright (C) 2010 Nokia Corporation (Hiroshi Doyu)
+ * Copyright (C) 2010 Texas Instruments, Inc. (Teerth Reddy et al.)
+ *	https://lkml.org/lkml/2010/6/18/4
+ *	https://lists.linux-foundation.org/pipermail/linux-pm/2010-June/027422.html
+ *	https://patchwork.kernel.org/patch/96442/
+ *
+ * Copyright (C) 2006 Rafael J. Wysocki <rjw@sisk.pl>
+ *
+ * License terms: GNU General Public License (GPL) version 2
+ */
+
+#include <linux/mm.h>
+#include <linux/sched.h>
+#include <linux/suspend.h>
+#include <asm/tlbflush.h>
+#include <asm/cacheflush.h>
+#include <asm/pgalloc.h>
+#include <asm/sections.h>
+
+extern const void __nosave_begin, __nosave_end;
+
+int pfn_is_nosave(unsigned long pfn)
+{
+	unsigned long nosave_begin_pfn = __pa_symbol(&__nosave_begin) >> PAGE_SHIFT;
+	unsigned long nosave_end_pfn = PAGE_ALIGN(__pa_symbol(&__nosave_end)) >> PAGE_SHIFT;
+
+	return (pfn >= nosave_begin_pfn) && (pfn < nosave_end_pfn);
+}
+
+void notrace save_processor_state(void)
+{
+	flush_thread();
+	local_fiq_disable();
+}
+
+void notrace restore_processor_state(void)
+{
+	flush_tlb_all();
+	flush_cache_all();
+	local_fiq_enable();
+}
+
+u8 __swsusp_resume_stk[PAGE_SIZE/2] __nosavedata;
+u32 __swsusp_save_sp;
+
+int __swsusp_arch_resume_finish(void)
+{
+	identity_mapping_del(swapper_pg_dir, __pa(_stext), __pa(_etext));
+	return 0;
+}
+
+/*
+ * The framework loads the hibernation image into a linked list anchored
+ * at restore_pblist, for swsusp_arch_resume() to copy back to the proper
+ * destinations.
+ *
+ * To make this work if resume is triggered from initramfs, the
+ * pagetables need to be switched to allow writes to kernel mem.
+ */
+void notrace __swsusp_arch_restore_image(void)
+{
+	extern struct pbe *restore_pblist;
+	extern void cpu_resume(void);
+	extern unsigned long sleep_save_sp;
+	struct pbe *pbe;
+	typeof(cpu_reset) *phys_reset = (typeof(cpu_reset) *)virt_to_phys(cpu_reset);
+
+	cpu_switch_mm(swapper_pg_dir, &init_mm);
+
+	for (pbe = restore_pblist; pbe; pbe = pbe->next)
+		copy_page(pbe->orig_address, pbe->address);
+
+	sleep_save_sp = __swsusp_save_sp;
+	flush_tlb_all();
+	flush_cache_all();
+
+	identity_mapping_add(swapper_pg_dir, __pa(_stext), __pa(_etext));
+
+	flush_tlb_all();
+	flush_cache_all();
+	cpu_proc_fin();
+
+	flush_tlb_all();
+	flush_cache_all();
+
+	phys_reset(virt_to_phys(cpu_resume));
+}
+
diff --git a/arch/arm/kernel/swsusp.S b/arch/arm/kernel/swsusp.S
new file mode 100644
index 0000000..c3a4b83
--- /dev/null
+++ b/arch/arm/kernel/swsusp.S
@@ -0,0 +1,84 @@
+/*
+ * Hibernation support specific for ARM
+ *
+ * Based on work by:
+ *
+ * Ubuntu project, hibernation support for mach-dove,
+ * Copyright (C) 2010 Nokia Corporation (Hiroshi Doyu)
+ * Copyright (C) 2010 Texas Instruments, Inc. (Teerth Reddy et al.)
+ *	https://lkml.org/lkml/2010/6/18/4
+ *	https://lists.linux-foundation.org/pipermail/linux-pm/2010-June/027422.html
+ *	https://patchwork.kernel.org/patch/96442/
+ *
+ * Copyright (C) 2006 Rafael J. Wysocki <rjw@sisk.pl>
+ *
+ * License terms: GNU General Public License (GPL) version 2
+ */
+
+#include <linux/linkage.h>
+#include <asm/memory.h>
+#include <asm/page.h>
+#include <asm/assembler.h>
+
+/*
+ * Save the current CPU state before suspend / poweroff.
+ * cpu_suspend() allocates space on the stack to save all necessary
+ * information. This has two consequences:
+ *	- swsusp_save() has to be called without changing anything on
+ *	  the stack. One cannot just return into it.
+ *	- should swsusp_save() fail for some reason, the previous value
+ *	  of sp has to be restored from a safe place.
+ */
+ENTRY(swsusp_arch_suspend)
+	mrs	r1, cpsr
+	mrs	r2, spsr
+	stmfd	sp!, {r1-r11,lr}		@ save registers
+	ldr	r0, =.Ltemp_sp
+	str	sp, [r0]			@ temp
+	ldr	r1, =(PHYS_OFFSET - PAGE_OFFSET)
+	adr	r3, .Lresume_post_mmu		@ resume here
+	bl	cpu_suspend			@ snapshot state (to stack)
+	ldr	r1, =sleep_save_sp
+	ldr	r0, =__swsusp_save_sp
+	ldr	r2, [r1]
+	str	r2, [r0]
+	bl	swsusp_save			@ write snapshot
+	ldr	r1, =.Ltemp_sp
+	ldr	sp, [r1]			@ restore stack
+	ldmfd	sp!, {r1-r11, pc}		@ return
+ENDPROC(swsusp_arch_suspend)
+
+/*
+ * Restore the memory image from the pagelists, and load the CPU registers
+ * from saved state.
+ */
+ENTRY(swsusp_arch_resume)
+	setmode PSR_I_BIT | PSR_F_BIT | SVC_MODE, r2
+	/*
+	 * Switch stack to a nosavedata region to make sure image restore
+	 * doesn't clobber it underneath itself.
+	 * Note that this effectively nukes "current"; from here on, the
+	 * executing code runs context-less and no functions can be called
+	 * that have side effects beyond accessing global variables.
+	 */
+	ldr	sp, =(__swsusp_resume_stk + PAGE_SIZE / 2)
+	b	__swsusp_arch_restore_image
+.ltorg
+.align 5
+	/*
+	 * Execution returns here via resuming the saved context.
+	 * MMU is active again and CPU core state has been restored, all
+	 * that remains to be done now is to restore the CPU registers.
+	 */
+.Lresume_post_mmu:
+	ldmfd	sp!, {r1-r11}
+	msr	cpsr, r1
+	msr	spsr, r2
+	bl	cpu_init			@ reinitialize other modes
+	ldmfd	sp!, {lr}
+	b	__swsusp_arch_resume_finish	@ cleanup
+ENDPROC(swsusp_arch_resume)
+
+.data
+.Ltemp_sp:
+	.long 0
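
As an aside on the page-frame arithmetic in pfn_is_nosave() from cpu.c in
this patch: the range check can be tried out in user space with the sketch
below. It assumes 4 KiB pages and takes the physical start and end
addresses as plain parameters. pfn_in_nosave() and the local PAGE_* macros
are hypothetical stand-ins for the kernel definitions, and the addresses
in the note that follows are made up.

```c
#include <assert.h>
#include <stdbool.h>

#define PAGE_SHIFT	12			/* assumption: 4 KiB pages */
#define PAGE_SIZE	(1UL << PAGE_SHIFT)
#define PAGE_ALIGN(a)	(((a) + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1))

/*
 * True if 'pfn' lies in the page range covering [begin_pa, end_pa).
 * The end address is rounded up so a partially used last page counts.
 */
bool pfn_in_nosave(unsigned long pfn, unsigned long begin_pa,
		   unsigned long end_pa)
{
	unsigned long begin_pfn = begin_pa >> PAGE_SHIFT;
	unsigned long end_pfn = PAGE_ALIGN(end_pa) >> PAGE_SHIFT;

	assert(begin_pa <= end_pa);
	return pfn >= begin_pfn && pfn < end_pfn;
}
```

With a region from 0x80008000 to 0x80009010, page frames 0x80008 and
0x80009 are inside the range, while 0x80007 and 0x8000a are outside it.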