[3/4] powerpc: Add mm_cpumask warning when context switching

Message ID 20230524060821.148015-4-npiggin@gmail.com (mailing list archive)
State Accepted
Commit 177255afb40548fdf504384b361d18d6cbe35d1e
Series powerpc: mm_cpumask cleanups and lazy tlb mm

Commit Message

Nicholas Piggin May 24, 2023, 6:08 a.m. UTC
When context switching away from an mm, add a CONFIG_DEBUG_VM warning
check to ensure this CPU is still set in the mask. This could catch
bugs where the mask is improperly trimmed while the CPU is still using
the mm.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/mm/mmu_context.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)
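
For readers unfamiliar with the invariant being checked, here is a rough userspace sketch of it. It is not kernel code: struct mm_model, the mask_*() helpers, model_switch_mm() and warn_count are hypothetical stand-ins for mm_cpumask(), cpumask_test_cpu(), cpumask_set_cpu() and VM_WARN_ON_ONCE(), and the model assumes at most 64 CPUs. Unlike VM_WARN_ON_ONCE(), the model counts every violation rather than only the first, and it leaves out everything else switch_mm_irqs_off() does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for struct mm_struct: only the CPU mask is kept. */
struct mm_model {
	uint64_t cpumask;	/* bit N set => CPU N has used this mm */
};

static int warn_count;		/* how many times the modelled check fired */

static inline bool mask_test_cpu(int cpu, const struct mm_model *mm)
{
	assert(cpu >= 0 && cpu < 64);
	return (mm->cpumask >> cpu) & 1;
}

static inline void mask_set_cpu(int cpu, struct mm_model *mm)
{
	assert(cpu >= 0 && cpu < 64);
	mm->cpumask |= UINT64_C(1) << cpu;
}

static inline void mask_clear_cpu(int cpu, struct mm_model *mm)
{
	assert(cpu >= 0 && cpu < 64);
	mm->cpumask &= ~(UINT64_C(1) << cpu);
}

/* Models only the mask handling of switch_mm_irqs_off() with this patch. */
static inline void model_switch_mm(int cpu, struct mm_model *prev,
				   struct mm_model *next)
{
	/* Mark next as used on this CPU, as the existing code already does. */
	if (!mask_test_cpu(cpu, next))
		mask_set_cpu(cpu, next);

	/* The real function switches the MMU context at this point. */

	/* New check: prev must still list this CPU when switching away. */
	if (!mask_test_cpu(cpu, prev)) {
		warn_count++;
		fprintf(stderr, "model warning: cpu %d missing from prev mask\n",
			cpu);
	}
}
```

In this model, a switch away from an mm whose mask still contains the CPU leaves warn_count unchanged. Clearing the bit with mask_clear_cpu() while the CPU is still using that mm, and then switching away, increments it, which is the class of bug the warning is meant to catch.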

Comments

Michael Ellerman Aug. 18, 2023, 7:22 a.m. UTC | #1
Nicholas Piggin <npiggin@gmail.com> writes:
> When context switching away from an mm, add a CONFIG_DEBUG_VM warning
> check to ensure this CPU is still set in the mask. This could catch
> bugs where the mask is improperly trimmed while the CPU is still using
> the mm.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>  arch/powerpc/mm/mmu_context.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/mm/mmu_context.c b/arch/powerpc/mm/mmu_context.c
> index 894468975a44..b24c19078eb1 100644
> --- a/arch/powerpc/mm/mmu_context.c
> +++ b/arch/powerpc/mm/mmu_context.c
> @@ -101,6 +102,8 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
>  	 * sub architectures. Out of line for now
>  	 */
>  	switch_mmu_context(prev, next, tsk);
> +
> +	VM_WARN_ON_ONCE(!cpumask_test_cpu(cpu, mm_cpumask(prev)));

This is popping during CPU hotunplug. I guess some confusion about when
the mask is cleared.

cheers


[  145.150374][    T0] ------------[ cut here ]------------
[  145.150459][    T0] WARNING: CPU: 5 PID: 0 at arch/powerpc/mm/mmu_context.c:106 switch_mm_irqs_off+0x320/0x340
[  145.150519][    T0] Modules linked in: bonding pseries_rng rng_core binfmt_misc aes_gcm_p10_crypto zram vmx_crypto gf128mul crc32c_vpmsum papr_scm ip6_tables ip_tables x_tables fuse autofs4
[  145.150588][    T0] CPU: 5 PID: 0 Comm: swapper/5 Not tainted 6.5.0-rc3-00084-g01477eb5e323 #47
[  145.150592][    T0] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1030.00 (NH1030_019) hv:phyp pSeries
[  145.150595][    T0] NIP:  c0000000000cbc30 LR: c0000000000cbab0 CTR: c000000000181a40
[  145.150598][    T0] REGS: c00000000985fb10 TRAP: 0700   Not tainted  (6.5.0-rc3-00084-g01477eb5e323)
[  145.150602][    T0] MSR:  800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 24000208  XER: 0000011e
[  145.150625][    T0] CFAR: c0000000000cbae4 IRQMASK: 1 
[  145.150625][    T0] GPR00: c0000000000cbab0 c00000000985fdb0 c0000000027aaf00 c000000b02955f00 
[  145.150625][    T0] GPR04: c0000000043e0b00 c0000000067c7000 0000000000000000 0000000000030e2d 
[  145.150625][    T0] GPR08: c00000000451af00 0000000000000001 c00000000451af00 c00000000451af00 
[  145.150625][    T0] GPR12: c000000000181a40 c00000050fffa300 0000000000000000 000000001eed51a0 
[  145.150625][    T0] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
[  145.150625][    T0] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000001 
[  145.150625][    T0] GPR24: 0000000000000005 000000000000dedc c000000004468470 0000000000000001 
[  145.150625][    T0] GPR28: c0000000067c7000 0000000000000000 0000000000000005 c000000b02956780 
[  145.150711][    T0] NIP [c0000000000cbc30] switch_mm_irqs_off+0x320/0x340
[  145.150716][    T0] LR [c0000000000cbab0] switch_mm_irqs_off+0x1a0/0x340
[  145.150721][    T0] Call Trace:
[  145.150724][    T0] [c00000000985fdb0] [c000000000448688] __smp_call_single_queue+0x198/0x1f0 (unreliable)
[  145.150732][    T0] [c00000000985fdf0] [c0000000002af958] idle_task_exit+0xf8/0x230
[  145.150740][    T0] [c00000000985fe40] [c000000000181aac] pseries_cpu_offline_self+0x6c/0x230
[  145.150748][    T0] [c00000000985feb0] [c000000000092bb4] arch_cpu_idle_dead+0x64/0x90
[  145.150755][    T0] [c00000000985fee0] [c0000000002fc09c] do_idle+0x25c/0x740
[  145.150761][    T0] [c00000000985ff60] [c0000000002fcd14] cpu_startup_entry+0x84/0xa0
[  145.150765][    T0] [c00000000985ff90] [c000000000092500] start_secondary+0x4e0/0x510
[  145.150772][    T0] [c00000000985ffe0] [c00000000000e258] start_secondary_prolog+0x10/0x14
[  145.150788][    T0] Code: 0fe00000 3ce201d7 e947a198 394a0001 f947a198 4bfffd70 60000000 60420000 3d4201d7 e92aa210 39290001 f92aa210 <0fe00000> 3d4201d7 e8010050 e92aa218 
[  145.150828][    T0] irq event stamp: 49758
[  145.150831][    T0] hardirqs last  enabled at (49757): [<c0000000004330c8>] tick_nohz_idle_enter+0x118/0x2b0
[  145.150836][    T0] hardirqs last disabled at (49758): [<c0000000002fc038>] do_idle+0x1f8/0x740
[  145.150839][    T0] softirqs last  enabled at (49724): [<c000000001f66398>] __do_softirq+0x5c8/0x7f4
[  145.150847][    T0] softirqs last disabled at (49703): [<c00000000001bc90>] do_softirq_own_stack+0x50/0x80
[  145.150852][    T0] ---[ end trace 0000000000000000 ]---

Patch

diff --git a/arch/powerpc/mm/mmu_context.c b/arch/powerpc/mm/mmu_context.c
index 894468975a44..b24c19078eb1 100644
--- a/arch/powerpc/mm/mmu_context.c
+++ b/arch/powerpc/mm/mmu_context.c
@@ -43,12 +43,13 @@  static inline void switch_mm_pgdir(struct task_struct *tsk,
 void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 			struct task_struct *tsk)
 {
+	int cpu = smp_processor_id();
 	bool new_on_cpu = false;
 
 	/* Mark this context has been used on the new CPU */
-	if (!cpumask_test_cpu(smp_processor_id(), mm_cpumask(next))) {
+	if (!cpumask_test_cpu(cpu, mm_cpumask(next))) {
 		VM_WARN_ON_ONCE(next == &init_mm);
-		cpumask_set_cpu(smp_processor_id(), mm_cpumask(next));
+		cpumask_set_cpu(cpu, mm_cpumask(next));
 		inc_mm_active_cpus(next);
 
 		/*
@@ -101,6 +102,8 @@  void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 	 * sub architectures. Out of line for now
 	 */
 	switch_mmu_context(prev, next, tsk);
+
+	VM_WARN_ON_ONCE(!cpumask_test_cpu(cpu, mm_cpumask(prev)));
 }
 
 #ifndef CONFIG_PPC_BOOK3S_64