SLW: Add idle state stop5 for DD2.0 and above

Message ID 1510550280-14055-1-git-send-email-akshay.adiga@linux.vnet.ibm.com
State Accepted

Commit Message

Akshay Adiga Nov. 13, 2017, 5:18 a.m.
Adding stop5 idle state with rough residency and latency numbers.

Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com>
---
 hw/slw.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

Comments

Stewart Smith Nov. 14, 2017, 7:52 a.m. | #1
Akshay Adiga <akshay.adiga@linux.vnet.ibm.com> writes:
> Adding stop5 idle state with rough residency and latency numbers.

How stable has stop5 proved? I gather this patch is an indication that
we're fairly confident we have stop5 working well enough for an OS to
use?

Is there some way we could write a test from userspace to force a core
into stop5 and out again a bunch of times? Maybe disable all other stop
states, do nothing on it and then check?
Nicholas Piggin Nov. 16, 2017, 1:06 a.m. | #2
On Tue, 14 Nov 2017 18:52:09 +1100
Stewart Smith <stewart@linux.vnet.ibm.com> wrote:

> Akshay Adiga <akshay.adiga@linux.vnet.ibm.com> writes:
> > Adding stop5 idle state with rough residency and latency numbers.  
> 
> How stable has stop5 proved? I gather this patch is an indication that
> we're fairly confident we have stop5 working well enough for an OS to
> use?
> 
> Is there some way we could write a test from userspace to force a core
> into stop5 and out again a bunch of times? Maybe disable all other stop
> states, do nothing on it and then check?
> 

Yes, disable all other stop states. You can do it at runtime with
/sys/devices/system/cpu/cpu*/cpuidle/state*/disable

You can then use context switch benchmark or some IO traffic etc
to really hammer it.
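
The sysfs dance Nick describes can be sketched as a small helper; `isolate_idle_state` is a hypothetical name, and the sysfs root is a parameter so the logic can be exercised against a fake tree (on a real host you would pass the default `/sys/devices/system/cpu`):

```python
# Disable every cpuidle state except the one under test by writing "1"
# to its per-CPU "disable" file, as suggested above. A sketch only: the
# helper name and return value are illustrative, not from this thread.
import glob
import os

def isolate_idle_state(target, root="/sys/devices/system/cpu"):
    """Disable all cpuidle states whose 'name' is not `target`."""
    touched = []
    for name_file in glob.glob(os.path.join(root, "cpu*", "cpuidle",
                                            "state*", "name")):
        with open(name_file) as f:
            name = f.read().strip()
        disable_file = os.path.join(os.path.dirname(name_file), "disable")
        with open(disable_file, "w") as f:
            # "0" keeps the target enabled, "1" disables everything else
            f.write("0" if name == target else "1")
        touched.append((name, disable_file))
    return touched
```

Writing the `disable` files needs root, and the change only lasts until the next write (or reboot), which makes it convenient for a throwaway test run.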

Thanks,
Nick
Stewart Smith Dec. 19, 2017, 12:30 a.m. | #3
Akshay Adiga <akshay.adiga@linux.vnet.ibm.com> writes:
> Adding stop5 idle state with rough residency and latency numbers.
>
> Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com>
> ---
>  hw/slw.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)

As in practice we've been going "only enable stop states that actually
work", I think I'm okay to take this now. Merged to master as of
1953b41e1dd5a1a91d6e6a5dbfba0672955defc9.
Stewart Smith Feb. 7, 2018, 5:55 a.m. | #4
Nicholas Piggin <npiggin@gmail.com> writes:
> On Tue, 14 Nov 2017 18:52:09 +1100
> Stewart Smith <stewart@linux.vnet.ibm.com> wrote:
>
>> Akshay Adiga <akshay.adiga@linux.vnet.ibm.com> writes:
>> > Adding stop5 idle state with rough residency and latency numbers.  
>> 
>> How stable has stop5 proved? I gather this patch is an indication that
>> we're fairly confident we have stop5 working well enough for an OS to
>> use?
>> 
>> Is there some way we could write a test from userspace to force a core
>> into stop5 and out again a bunch of times? Maybe disable all other stop
>> states, do nothing on it and then check?
>> 
>
> Yes, disable all other stop states. You can do it at runtime with
> /sys/devices/system/cpu/cpu*/cpuidle/state*/disable
>
> You can then use context switch benchmark or some IO traffic etc
> to really hammer it.

So, I've gotten a skeleton going for an op-test test that does this.

It also turns out you need your kernel bugfix
3ed09c94580de9d5b18cc35d1f97e9f24cd9233b "cpuidle: menu: allow state 0 to be disabled"
or else it's not a very useful test, as you end up just testing snooze
rather than any stop state.

I'm now trying to come up with the world's worst context switch/IO
benchmark that can be run in a busybox shell with whatever we build into
petitboot.

Can anyone think of something better than
'taskset -c 6 find / |head -n 200000 > /dev/null'
?
(6 = cpu nr being tested)
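
One alternative workload (not from the thread, purely a hypothetical sketch) is a pipe ping-pong between two processes: each round trip blocks both sides, so the tested CPU gets a steady stream of short idle windows that should drive it in and out of the stop state:

```python
# Bounce one byte between parent and child over a pair of pipes. Every
# round trip sleeps and wakes both processes, forcing frequent idle
# entries. The function name and round count are illustrative choices.
import os

def pingpong(rounds=10000):
    """Return the number of completed parent<->child round trips."""
    p2c_r, p2c_w = os.pipe()   # parent -> child
    c2p_r, c2p_w = os.pipe()   # child -> parent
    pid = os.fork()
    if pid == 0:               # child: echo every byte back until EOF
        os.close(p2c_w); os.close(c2p_r)
        while os.read(p2c_r, 1):
            os.write(c2p_w, b"x")
        os._exit(0)
    os.close(p2c_r); os.close(c2p_w)
    done = 0
    for _ in range(rounds):
        os.write(p2c_w, b"x")
        os.read(c2p_r, 1)
        done += 1
    os.close(p2c_w)            # child sees EOF and exits
    os.waitpid(pid, 0)
    return done
```

To keep the wakeups on the CPU under test, the whole thing could be pinned the same way as the `find` pipeline, e.g. `taskset -c 6 python3 pingpong.py` (or `os.sched_setaffinity` from inside the script).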

Weirdly though, CPU0 seems to get a bunch of accounting done for it over
others, with it 'entering' (according to /sys/..../stateN/usage) the
stop state a lot more than other threads or even other cores....

# CPU 0 entered idle state ['stop4'] 236 times
# CPU 1 entered idle state ['stop4'] 1 times
# CPU 2 entered idle state ['stop4'] 1 times
# CPU 9 entered idle state ['stop4'] 1 times
# CPU 0 entered idle state ['stop5'] 221 times
# CPU 1 entered idle state ['stop5'] 1 times
# CPU 2 entered idle state ['stop5'] 1 times
# CPU 9 entered idle state ['stop5'] 1 times

(This test disables all but the stop state being tested, runs the task
above on it, then measures the difference in the 'usage' count for the
CPU we're interested in.)


Is this behaviour expected?
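
The measurement step of such a test might look like the following sketch; the helper names are assumptions, and only the sysfs layout (`cpuN/cpuidle/stateN/{name,usage}`) comes from the thread:

```python
# Snapshot each CPU's per-state 'usage' counter before and after the
# workload, then diff the snapshots to see how often each CPU actually
# entered each state. Sketch only; paths are parameterised for testing.
import glob
import os

def usage_snapshot(root="/sys/devices/system/cpu"):
    """Map (cpu_dir, state_name) -> cumulative usage count."""
    snap = {}
    for state in glob.glob(os.path.join(root, "cpu*", "cpuidle", "state*")):
        with open(os.path.join(state, "name")) as f:
            name = f.read().strip()
        with open(os.path.join(state, "usage")) as f:
            count = int(f.read().strip())
        cpu = state.split(os.sep)[-3]  # e.g. "cpu0"
        snap[(cpu, name)] = count
    return snap

def usage_delta(before, after):
    """Idle entries gained per CPU/state between two snapshots."""
    return {k: after[k] - before[k] for k in after if k in before}
```

Diffing snapshots rather than reading a single value keeps the result meaningful even though `usage` is a cumulative counter since boot.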

Patch

diff --git a/hw/slw.c b/hw/slw.c
index c2c755d..8110b5a 100644
--- a/hw/slw.c
+++ b/hw/slw.c
@@ -613,6 +613,22 @@  static struct cpu_idle_states power9_cpu_idle_states[] = {
 				 | OPAL_PM_PSSCR_ESL \
 				 | OPAL_PM_PSSCR_EC,
 		.pm_ctrl_reg_mask = OPAL_PM_PSSCR_MASK },
+	{
+		.name = "stop5",
+		.latency_ns = 200000,
+		.residency_ns = 2000000,
+		.flags = 0*OPAL_PM_DEC_STOP \
+		       | 0*OPAL_PM_TIMEBASE_STOP  \
+		       | 1*OPAL_PM_LOSE_USER_CONTEXT \
+		       | 1*OPAL_PM_LOSE_HYP_CONTEXT \
+		       | 1*OPAL_PM_LOSE_FULL_CONTEXT \
+		       | 1*OPAL_PM_STOP_INST_DEEP,
+		.pm_ctrl_reg_val = OPAL_PM_PSSCR_RL(5) \
+				 | OPAL_PM_PSSCR_MTL(7) \
+				 | OPAL_PM_PSSCR_TR(3) \
+				 | OPAL_PM_PSSCR_ESL \
+				 | OPAL_PM_PSSCR_EC,
+		.pm_ctrl_reg_mask = OPAL_PM_PSSCR_MASK },
 
 	{
 		.name = "stop8",
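
The PSSCR value the new table entry encodes can be sanity-checked with a short sketch. The field positions below (RL in bits 0-3, MTL in bits 4-7, TR in bits 8-9, EC at bit 20, ESL at bit 21) are taken from Linux's arch/powerpc/include/asm/reg.h definitions, not stated in the patch itself:

```python
# Recompute the stop5 PSSCR value from the OPAL_PM_PSSCR_* terms used in
# the patch, assuming the Linux reg.h bit positions for each field.
PSSCR_RL  = lambda x: x & 0xf          # Requested Level
PSSCR_MTL = lambda x: (x & 0xf) << 4   # Maximum Transition Level
PSSCR_TR  = lambda x: (x & 0x3) << 8   # Transition Rate
PSSCR_EC  = 1 << 20                    # Exit Criterion
PSSCR_ESL = 1 << 21                    # Enable State Loss

stop5_psscr = (PSSCR_RL(5) | PSSCR_MTL(7) | PSSCR_TR(3)
               | PSSCR_ESL | PSSCR_EC)
print(hex(stop5_psscr))  # 0x300375
```

With these positions, RL(5) selects stop5 as the requested level while ESL|EC hands wakeup control to the platform, matching the deep-state flags (`OPAL_PM_LOSE_FULL_CONTEXT`, `OPAL_PM_STOP_INST_DEEP`) set in the entry.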