diff mbox series

[linux-next] powerpc: use raw_smp_processor_id in arch_touch_nmi_watchdog

Message ID 20220714013131.12648-1-zhouzhouyi@gmail.com (mailing list archive)
State Changes Requested
Series [linux-next] powerpc: use raw_smp_processor_id in arch_touch_nmi_watchdog

Checks

Context                                      Check    Description
snowpatch_ozlabs/github-powerpc_selftests    success  Successfully ran 10 jobs.
snowpatch_ozlabs/github-powerpc_ppctests     success  Successfully ran 10 jobs.
snowpatch_ozlabs/github-powerpc_sparse       success  Successfully ran 4 jobs.
snowpatch_ozlabs/github-powerpc_kernel_qemu  success  Successfully ran 23 jobs.
snowpatch_ozlabs/github-powerpc_clang        success  Successfully ran 7 jobs.

Commit Message

Zhouyi Zhou July 14, 2022, 1:31 a.m. UTC
use raw_smp_processor_id() in arch_touch_nmi_watchdog
because when called from watchdog, the cpu is preemptible.

Signed-off-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
---
Dear PPC developers,

I found this bug while running rcutorture tests in a ppc VM at the
Open Source Lab of Oregon State University.

qemu-system-ppc64 -nographic -smp cores=4,threads=1 -net none \
  -M pseries -nodefaults -device spapr-vscsi \
  -serial file:/tmp/console.log -m 2G \
  -kernel /home/ubuntu/linux-next/tools/testing/selftests/rcutorture/res/2022.07.08-22.36.11-torture/results-rcuscale-kvfree/TREE/vmlinux \
  -append "debug_boot_weak_hash panic=-1 console=ttyS0 rcuscale.kfree_rcu_test=1 rcuscale.kfree_nthreads=16 rcuscale.holdoff=20 rcuscale.kfree_loops=10000 torture.disable_onoff_at_boot rcuscale.shutdown=1 rcuscale.verbose=0"

tail /tmp/console.log
[ 1232.433552][   T41] BUG: using smp_processor_id() in preemptible [00000000] code: khungtaskd/41
[ 1232.439751][   T41] caller is arch_touch_nmi_watchdog+0x34/0xd0
[ 1232.440934][   T41] CPU: 3 PID: 41 Comm: khungtaskd Not tainted 5.19.0-rc5-next-20220708-dirty #106
[ 1232.442684][   T41] Call Trace:
[ 1232.443343][   T41] [c0000000029cbbb0] [c0000000006df360] dump_stack_lvl+0x74/0xa8 (unreliable)
[ 1232.445237][   T41] [c0000000029cbbf0] [c000000000d04f30] check_preemption_disabled+0x150/0x160
[ 1232.446926][   T41] [c0000000029cbc80] [c000000000035584] arch_touch_nmi_watchdog+0x34/0xd0
[ 1232.448532][   T41] [c0000000029cbcb0] [c0000000002068ac] watchdog+0x40c/0x5b0
[ 1232.451449][   T41] [c0000000029cbdc0] [c000000000139df4] kthread+0x144/0x170
[ 1232.452896][   T41] [c0000000029cbe10] [c00000000000cd54] ret_from_kernel_thread+0x5c/0x64

After this fix, the message "BUG: using smp_processor_id() in preemptible [00000000] code: khungtaskd/41"
no longer appears.

I also examined the other places in watchdog.c where smp_processor_id() is used; they all run with
preemption disabled.

Kind Regards
Zhouyi
--
 arch/powerpc/kernel/watchdog.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
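
The BUG report above comes from the debug check visible in the call trace (check_preemption_disabled()). A minimal user-space sketch of the idea follows, assuming the check is enabled via CONFIG_DEBUG_PREEMPT. Every name in it (struct task_model, model_smp_processor_id(), and so on) is a hypothetical stand-in rather than the kernel's implementation, and the real check has additional conditions not modelled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for kernel state; none of this is the real kernel API. */
struct task_model {
	int preempt_count;	/* > 0 means preemption is disabled */
	bool irqs_disabled;	/* true while local interrupts are off */
	int cpu;		/* CPU the task is currently running on */
};

/* Counts simulated "BUG: using smp_processor_id() in preemptible" reports. */
static int debug_warnings;

/* Simplified condition: the CPU id is only stable if the task cannot be preempted. */
static bool cpu_id_is_stable(const struct task_model *t)
{
	assert(t->preempt_count >= 0);
	return t->preempt_count > 0 || t->irqs_disabled;
}

/* Models raw_smp_processor_id(): returns the id and never checks anything. */
static int model_raw_smp_processor_id(const struct task_model *t)
{
	return t->cpu;
}

/* Models the checked variant: complains first, then returns the same id. */
static int model_smp_processor_id(const struct task_model *t)
{
	if (!cpu_id_is_stable(t))
		debug_warnings++;
	return model_raw_smp_processor_id(t);
}
```

In this model the raw variant only silences the report: both functions return the same id, and neither one keeps the task on that CPU afterwards.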

Comments

John Ogness July 14, 2022, 9:25 a.m. UTC | #1
On 2022-07-14, Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
> use raw_smp_processor_id() in arch_touch_nmi_watchdog
> because when called from watchdog, the cpu is preemptible.

I would expect the correct solution is to make it a non-migration
section. Something like the below (untested) patch.

John Ogness

diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
index bfc27496fe7e..9d34aa809241 100644
--- a/arch/powerpc/kernel/watchdog.c
+++ b/arch/powerpc/kernel/watchdog.c
@@ -450,17 +450,23 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
 void arch_touch_nmi_watchdog(void)
 {
 	unsigned long ticks = tb_ticks_per_usec * wd_timer_period_ms * 1000;
-	int cpu = smp_processor_id();
+	int cpu;
 	u64 tb;
 
-	if (!cpumask_test_cpu(cpu, &watchdog_cpumask))
+	cpu = get_cpu();
+
+	if (!cpumask_test_cpu(cpu, &watchdog_cpumask)) {
+		goto out;
 		return;
+	}
 
 	tb = get_tb();
 	if (tb - per_cpu(wd_timer_tb, cpu) >= ticks) {
 		per_cpu(wd_timer_tb, cpu) = tb;
 		wd_smp_clear_cpu_pending(cpu);
 	}
+out:
+	put_cpu();
 }
 EXPORT_SYMBOL(arch_touch_nmi_watchdog);
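
To illustrate what a non-migration section buys, here is a small user-space model. It assumes only that get_cpu() disables preemption before returning the CPU id and that put_cpu() re-enables it. All identifiers (struct sched_task_model, model_try_migrate(), model_touch(), MODEL_NR_CPUS) are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4

/* Toy task state; a stand-in, not a kernel structure. */
struct sched_task_model {
	int cpu;		/* CPU the task currently runs on */
	int preempt_count;	/* > 0 means preemption is disabled */
};

/* One timestamp slot per CPU, standing in for the per-cpu wd_timer_tb. */
static unsigned long long model_wd_timer_tb[MODEL_NR_CPUS];

/* Scheduler stand-in: migration succeeds only while preemption is enabled. */
static bool model_try_migrate(struct sched_task_model *t, int new_cpu)
{
	if (t->preempt_count > 0)
		return false;
	t->cpu = new_cpu;
	return true;
}

/* Models get_cpu(): disable preemption, then read the CPU id. */
static int model_get_cpu(struct sched_task_model *t)
{
	t->preempt_count++;
	return t->cpu;
}

/* Models put_cpu(): re-enable preemption. */
static void model_put_cpu(struct sched_task_model *t)
{
	assert(t->preempt_count > 0);
	t->preempt_count--;
}

/*
 * Reads the CPU id, lets the scheduler attempt a migration, then writes the
 * slot for the id it read. Returns true if that slot still belongs to the
 * CPU the task is running on at the time of the write.
 */
static bool model_touch(struct sched_task_model *t, bool pin, int migrate_to,
			unsigned long long tb)
{
	int cpu = pin ? model_get_cpu(t) : t->cpu;
	bool wrote_own_slot;

	model_try_migrate(t, migrate_to);	/* migration attempt lands here */
	model_wd_timer_tb[cpu] = tb;
	wrote_own_slot = (cpu == t->cpu);

	if (pin)
		model_put_cpu(t);
	return wrote_own_slot;
}
```

Without pinning, the model writes the slot of the CPU the task has already left; with pinning, the migration attempt is refused until model_put_cpu() runs.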

Zhouyi Zhou July 14, 2022, 10:01 a.m. UTC | #2
Thanks, John, for correcting me ;-)

On Thu, Jul 14, 2022 at 5:25 PM John Ogness <john.ogness@linutronix.de> wrote:
>
> On 2022-07-14, Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
> > use raw_smp_processor_id() in arch_touch_nmi_watchdog
> > because when called from watchdog, the cpu is preemptible.
>
> I would expect the correct solution is to make it a non-migration
> section. Something like the below (untested) patch.
I applied your patch (with one tiny modification: removing the
return statement after "goto out;") and it passed the test in the
ppc VM at the Open Source Lab of Oregon State University.

Tested-by: Zhouyi Zhou <zhouzhouyi@gmail.com>

Many thanks
Kind regards
Zhouyi
>
> John Ogness
>
> diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
> index bfc27496fe7e..9d34aa809241 100644
> --- a/arch/powerpc/kernel/watchdog.c
> +++ b/arch/powerpc/kernel/watchdog.c
> @@ -450,17 +450,23 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
>  void arch_touch_nmi_watchdog(void)
>  {
>         unsigned long ticks = tb_ticks_per_usec * wd_timer_period_ms * 1000;
> -       int cpu = smp_processor_id();
> +       int cpu;
>         u64 tb;
>
> -       if (!cpumask_test_cpu(cpu, &watchdog_cpumask))
> +       cpu = get_cpu();
> +
> +       if (!cpumask_test_cpu(cpu, &watchdog_cpumask)) {
> +               goto out;
>                 return;
I think we should remove this return statement; it is unreachable after "goto out;".
> +       }
>
>         tb = get_tb();
>         if (tb - per_cpu(wd_timer_tb, cpu) >= ticks) {
>                 per_cpu(wd_timer_tb, cpu) = tb;
>                 wd_smp_clear_cpu_pending(cpu);
>         }
> +out:
> +       put_cpu();
>  }
>  EXPORT_SYMBOL(arch_touch_nmi_watchdog);

John Ogness July 14, 2022, 11:46 a.m. UTC | #3
On 2022-07-14, Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
> Thanks, John, for correcting me ;-)

After looking more closely, I do not think disabling migration is the
correct fix either.

The per-cpu variable @wd_timer_tb is written from 2 functions:

- watchdog_timer_interrupt() <-- irq handler
- arch_touch_nmi_watchdog()  <-- called from preemptible

Since watchdog_timer_interrupt() is called from irq context, I expect
that interrupts need to be disabled for the update in
arch_touch_nmi_watchdog(). Perhaps using a per-cpu local_lock_t with
local_lock_irqsave() to protect write access to @wd_timer_tb?

John Ogness
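
One possible reading of the concern above is that an interrupt landing between the timebase read and the store could leave an older timestamp in place. The toy single-CPU model below sketches that interleaving in user space. It is not kernel code and does not use the real local_lock API; every name in it (model_touch(), model_irq_window(), model_stamp_is_stale(), and so on) is a hypothetical stand-in, and whether this race matters in the actual watchdog is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy single-CPU state; stand-ins only, not kernel APIs. */
static uint64_t model_timebase;		/* advances on every model_get_tb() */
static uint64_t model_wd_timer_tb;	/* this CPU's last-touch timestamp */
static bool model_irqs_enabled = true;
static bool model_irq_pending;

static uint64_t model_get_tb(void)
{
	return ++model_timebase;
}

/* Stand-in for the irq-side writer of the timestamp. */
static void model_timer_interrupt(void)
{
	/* The model never delivers an interrupt while interrupts are off. */
	assert(model_irqs_enabled);
	model_wd_timer_tb = model_get_tb();
}

/* A point where a pending interrupt is delivered if interrupts are enabled. */
static void model_irq_window(void)
{
	if (model_irqs_enabled && model_irq_pending) {
		model_irq_pending = false;
		model_timer_interrupt();
	}
}

/* Task-side update; 'irqs_off' mimics holding the lock with interrupts disabled. */
static void model_touch(bool irqs_off)
{
	bool saved = model_irqs_enabled;
	uint64_t tb;

	if (irqs_off)
		model_irqs_enabled = false;

	tb = model_get_tb();
	model_irq_window();		/* interrupt may land between read and store */
	model_wd_timer_tb = tb;

	model_irqs_enabled = saved;
	model_irq_window();		/* a deferred interrupt runs after the update */
}

/* Returns true if the stored timestamp ends up older than the latest one taken. */
static bool model_stamp_is_stale(bool irqs_off)
{
	model_timebase = 0;
	model_wd_timer_tb = 0;
	model_irqs_enabled = true;
	model_irq_pending = true;
	model_touch(irqs_off);
	return model_wd_timer_tb < model_timebase;
}
```

With interrupts left enabled, the task-side store overwrites the newer value written by the simulated interrupt; with interrupts off, the interrupt is deferred until after the update and its value is the one that remains.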

Patch

diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
index 7d28b9553654..ab6b84e00311 100644
--- a/arch/powerpc/kernel/watchdog.c
+++ b/arch/powerpc/kernel/watchdog.c
@@ -450,7 +450,7 @@  static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
 void arch_touch_nmi_watchdog(void)
 {
 	unsigned long ticks = tb_ticks_per_usec * wd_timer_period_ms * 1000;
-	int cpu = smp_processor_id();
+	int cpu = raw_smp_processor_id();
 	u64 tb;
 
 	if (!cpumask_test_cpu(cpu, &watchdog_cpumask))