diff mbox

[1/6] powernv:idle: Correctly initialize core_idle_state_ptr

Message ID f6c49fcf42b78c0b8bdad17d6e659679c0288cd9.1494585671.git.ego@linux.vnet.ibm.com (mailing list archive)
State Accepted
Commit 5f221c3ca13dceaea8eefe21dbd85da91ed9b1e8
Headers show

Commit Message

Gautham R Shenoy May 16, 2017, 8:49 a.m. UTC
From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

The lower 8 bits of core_idle_state_ptr tracks the number of non-idle
threads in the core. This is supposed to be initialized to bit-map
corresponding to the threads_per_core. However, currently it is
initialized to PNV_CORE_IDLE_THREAD_BITS (0xFF). This is correct for
POWER8 which has 8 threads per core, but not for POWER9 which has 4
threads per core.

As a result, on POWER9, core_idle_state_ptr gets initialized to
0xFF. In case when all the threads of the core are idle, the bits
corresponding tracking the idle-threads are non-zero. As a result, the
idle entry/exit code fails to save/restore per-core hypervisor state
since it assumes that there are threads in the cores which are still
active.

Fix this by correctly initializing the lower bits of the
core_idle_state_ptr on the basis of threads_per_core.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/idle.c | 29 +++++++++++++++++++----------
 1 file changed, 19 insertions(+), 10 deletions(-)

Comments

Nicholas Piggin May 30, 2017, 5:56 a.m. UTC | #1
On Tue, 16 May 2017 14:19:43 +0530
"Gautham R. Shenoy" <ego@linux.vnet.ibm.com> wrote:

> From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
> 
> The lower 8 bits of core_idle_state_ptr tracks the number of non-idle
> threads in the core. This is supposed to be initialized to bit-map
> corresponding to the threads_per_core. However, currently it is
> initialized to PNV_CORE_IDLE_THREAD_BITS (0xFF). This is correct for
> POWER8 which has 8 threads per core, but not for POWER9 which has 4
> threads per core.
> 
> As a result, on POWER9, core_idle_state_ptr gets initialized to
> 0xFF. In case when all the threads of the core are idle, the bits
> corresponding tracking the idle-threads are non-zero. As a result, the
> idle entry/exit code fails to save/restore per-core hypervisor state
> since it assumes that there are threads in the cores which are still
> active.
> 
> Fix this by correctly initializing the lower bits of the
> core_idle_state_ptr on the basis of threads_per_core.
> 
> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>

This looks good to me.

Until this patch series, we can't enable HV state loss idle modes
on POWER9, is that correct? And after your series does it work?

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Michael Ellerman May 30, 2017, 9:11 a.m. UTC | #2
On Tue, 2017-05-16 at 08:49:43 UTC, "Gautham R. Shenoy" wrote:
> From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
> 
> The lower 8 bits of core_idle_state_ptr tracks the number of non-idle
> threads in the core. This is supposed to be initialized to bit-map
> corresponding to the threads_per_core. However, currently it is
> initialized to PNV_CORE_IDLE_THREAD_BITS (0xFF). This is correct for
> POWER8 which has 8 threads per core, but not for POWER9 which has 4
> threads per core.
> 
> As a result, on POWER9, core_idle_state_ptr gets initialized to
> 0xFF. In case when all the threads of the core are idle, the bits
> corresponding tracking the idle-threads are non-zero. As a result, the
> idle entry/exit code fails to save/restore per-core hypervisor state
> since it assumes that there are threads in the cores which are still
> active.
> 
> Fix this by correctly initializing the lower bits of the
> core_idle_state_ptr on the basis of threads_per_core.
> 
> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/5f221c3ca13dceaea8eefe21dbd85d

cheers
Gautham R Shenoy May 30, 2017, 10:23 a.m. UTC | #3
Hi Nicholas,

On Tue, May 30, 2017 at 03:56:12PM +1000, Nicholas Piggin wrote:
> On Tue, 16 May 2017 14:19:43 +0530
> "Gautham R. Shenoy" <ego@linux.vnet.ibm.com> wrote:
> 
> > From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
> > 
> > The lower 8 bits of core_idle_state_ptr tracks the number of non-idle
> > threads in the core. This is supposed to be initialized to bit-map
> > corresponding to the threads_per_core. However, currently it is
> > initialized to PNV_CORE_IDLE_THREAD_BITS (0xFF). This is correct for
> > POWER8 which has 8 threads per core, but not for POWER9 which has 4
> > threads per core.
> > 
> > As a result, on POWER9, core_idle_state_ptr gets initialized to
> > 0xFF. In case when all the threads of the core are idle, the bits
> > corresponding tracking the idle-threads are non-zero. As a result, the
> > idle entry/exit code fails to save/restore per-core hypervisor state
> > since it assumes that there are threads in the cores which are still
> > active.
> > 
> > Fix this by correctly initializing the lower bits of the
> > core_idle_state_ptr on the basis of threads_per_core.
> > 
> > Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> 
> This looks good to me.
> 
> Until this patch series, we can't enable HV state loss idle modes
> on POWER9, is that correct? And after your series does it work?

Yes, that is correct.

> 
> Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
>

Thanks for reviewing the patch!

--
Thanks and Regards
gautham.
diff mbox

Patch

diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
index 445f30a..84eb9bc 100644
--- a/arch/powerpc/platforms/powernv/idle.c
+++ b/arch/powerpc/platforms/powernv/idle.c
@@ -96,15 +96,24 @@  static void pnv_alloc_idle_core_states(void)
 	u32 *core_idle_state;
 
 	/*
-	 * core_idle_state - First 8 bits track the idle state of each thread
-	 * of the core. The 8th bit is the lock bit. Initially all thread bits
-	 * are set. They are cleared when the thread enters deep idle state
-	 * like sleep and winkle. Initially the lock bit is cleared.
-	 * The lock bit has 2 purposes
-	 * a. While the first thread is restoring core state, it prevents
-	 * other threads in the core from switching to process context.
-	 * b. While the last thread in the core is saving the core state, it
-	 * prevents a different thread from waking up.
+	 * core_idle_state - The lower 8 bits track the idle state of
+	 * each thread of the core.
+	 *
+	 * The most significant bit is the lock bit.
+	 *
+	 * Initially all the bits corresponding to threads_per_core
+	 * are set. They are cleared when the thread enters deep idle
+	 * state like sleep and winkle/stop.
+	 *
+	 * Initially the lock bit is cleared.  The lock bit has 2
+	 * purposes:
+	 * 	a. While the first thread in the core waking up from
+	 * 	   idle is restoring core state, it prevents other
+	 * 	   threads in the core from switching to process
+	 * 	   context.
+	 * 	b. While the last thread in the core is saving the
+	 *	   core state, it prevents a different thread from
+	 *	   waking up.
 	 */
 	for (i = 0; i < nr_cores; i++) {
 		int first_cpu = i * threads_per_core;
@@ -112,7 +121,7 @@  static void pnv_alloc_idle_core_states(void)
 		size_t paca_ptr_array_size;
 
 		core_idle_state = kmalloc_node(sizeof(u32), GFP_KERNEL, node);
-		*core_idle_state = PNV_CORE_IDLE_THREAD_BITS;
+		*core_idle_state = (1 << threads_per_core) - 1;
 		paca_ptr_array_size = (threads_per_core *
 				       sizeof(struct paca_struct *));