[1/1] stop_machine, rcu: Mark functions as notrace

Message ID 20201028104825.107302-2-colin.king@canonical.com
State New
Series Fix ftrace oops/hang on RISC-V and ARM64

Commit Message

Colin Ian King Oct. 28, 2020, 10:48 a.m. UTC
From: Zong Li <zong.li@sifive.com>

BugLink: https://bugs.launchpad.net/bugs/1894613

Some architectures assume that the stopped CPUs don't make function calls
to traceable functions when they are in the stopped state. See also commit
cb9d7fd51d9f ("watchdog: Mark watchdog touch functions as notrace").

Violating this assumption causes kernel crashes when switching tracer on
RISC-V.

Mark rcu_momentary_dyntick_idle() and stop_machine_yield() notrace to
prevent this.

Fixes: 4ecf0a43e729 ("processor: get rid of cpu_relax_yield")
Fixes: 366237e7b083 ("stop_machine: Provide RCU quiescent state in multi_cpu_stop()")
Signed-off-by: Zong Li <zong.li@sifive.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Atish Patra <atish.patra@wdc.com>
Tested-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Paul E. McKenney <paulmck@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20201021073839.43935-1-zong.li@sifive.com
(cherry picked from commit 4230e2deaa484b385aa01d598b2aea8e7f2660a6 from
 https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git)

---
 kernel/rcu/tree.c     | 2 +-
 kernel/stop_machine.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

Comments

Stefan Bader Oct. 29, 2020, 8:03 a.m. UTC | #1
On 28.10.20 11:48, Colin King wrote:
> From: Zong Li <zong.li@sifive.com>
> 
> BugLink: https://bugs.launchpad.net/bugs/1894613
> 
> Some architectures assume that the stopped CPUs don't make function calls
> to traceable functions when they are in the stopped state. See also commit
> cb9d7fd51d9f ("watchdog: Mark watchdog touch functions as notrace").
> 
> Violating this assumption causes kernel crashes when switching tracer on
> RISC-V.
> 
> Mark rcu_momentary_dyntick_idle() and stop_machine_yield() notrace to
> prevent this.
> 
> Fixes: 4ecf0a43e729 ("processor: get rid of cpu_relax_yield")
> Fixes: 366237e7b083 ("stop_machine: Provide RCU quiescent state in multi_cpu_stop()")
> Signed-off-by: Zong Li <zong.li@sifive.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Tested-by: Atish Patra <atish.patra@wdc.com>
> Tested-by: Colin Ian King <colin.king@canonical.com>
> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> Acked-by: Paul E. McKenney <paulmck@kernel.org>
> Cc: stable@vger.kernel.org
> Link: https://lore.kernel.org/r/20201021073839.43935-1-zong.li@sifive.com
> (cherry picked from commit 4230e2deaa484b385aa01d598b2aea8e7f2660a6 from
>  https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git)
Signed-off-by: Colin King <colin.king@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
> 
> ---

Your sign-off is missing. Also, in theory, hirsute is now the primary task, so
the bug would have to reflect whether the fix is needed there. I added a groovy
task but left the state of the devel task unchanged. I think it could probably be
set to Fix Released, even though that is a bit odd right now: the kernel actually
used is the copy-forward of groovy, but unstable would be 5.10, so the question is
whether that pick came from 5.10 or later and maybe needs to go into unstable, too.

-Stefan

>  kernel/rcu/tree.c     | 2 +-
>  kernel/stop_machine.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 06895ef85d69..2a52f42f64b6 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -409,7 +409,7 @@ bool rcu_eqs_special_set(int cpu)
>   *
>   * The caller must have disabled interrupts and must not be idle.
>   */
> -void rcu_momentary_dyntick_idle(void)
> +notrace void rcu_momentary_dyntick_idle(void)
>  {
>  	int special;
>  
> diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
> index 865bb0228ab6..890b79cf0e7c 100644
> --- a/kernel/stop_machine.c
> +++ b/kernel/stop_machine.c
> @@ -178,7 +178,7 @@ static void ack_state(struct multi_stop_data *msdata)
>  		set_state(msdata, msdata->state + 1);
>  }
>  
> -void __weak stop_machine_yield(const struct cpumask *cpumask)
> +notrace void __weak stop_machine_yield(const struct cpumask *cpumask)
>  {
>  	cpu_relax();
>  }
>

Kleber Sacilotto de Souza Oct. 29, 2020, 8:49 a.m. UTC | #2
On 28.10.20 11:48, Colin King wrote:
> From: Zong Li <zong.li@sifive.com>
> 
> BugLink: https://bugs.launchpad.net/bugs/1894613
> 
> Some architectures assume that the stopped CPUs don't make function calls
> to traceable functions when they are in the stopped state. See also commit
> cb9d7fd51d9f ("watchdog: Mark watchdog touch functions as notrace").
> 
> Violating this assumption causes kernel crashes when switching tracer on
> RISC-V.
> 
> Mark rcu_momentary_dyntick_idle() and stop_machine_yield() notrace to
> prevent this.
> 
> Fixes: 4ecf0a43e729 ("processor: get rid of cpu_relax_yield")
> Fixes: 366237e7b083 ("stop_machine: Provide RCU quiescent state in multi_cpu_stop()")
> Signed-off-by: Zong Li <zong.li@sifive.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Tested-by: Atish Patra <atish.patra@wdc.com>
> Tested-by: Colin Ian King <colin.king@canonical.com>
> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> Acked-by: Paul E. McKenney <paulmck@kernel.org>
> Cc: stable@vger.kernel.org
> Link: https://lore.kernel.org/r/20201021073839.43935-1-zong.li@sifive.com
> (cherry picked from commit 4230e2deaa484b385aa01d598b2aea8e7f2660a6 from
>   https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git)
> 

LGTM.

With the missing sign-off added, as Stefan pointed out:

Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

> ---
>   kernel/rcu/tree.c     | 2 +-
>   kernel/stop_machine.c | 2 +-
>   2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 06895ef85d69..2a52f42f64b6 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -409,7 +409,7 @@ bool rcu_eqs_special_set(int cpu)
>    *
>    * The caller must have disabled interrupts and must not be idle.
>    */
> -void rcu_momentary_dyntick_idle(void)
> +notrace void rcu_momentary_dyntick_idle(void)
>   {
>   	int special;
>   
> diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
> index 865bb0228ab6..890b79cf0e7c 100644
> --- a/kernel/stop_machine.c
> +++ b/kernel/stop_machine.c
> @@ -178,7 +178,7 @@ static void ack_state(struct multi_stop_data *msdata)
>   		set_state(msdata, msdata->state + 1);
>   }
>   
> -void __weak stop_machine_yield(const struct cpumask *cpumask)
> +notrace void __weak stop_machine_yield(const struct cpumask *cpumask)
>   {
>   	cpu_relax();
>   }
>

Ian May Oct. 30, 2020, 9:52 p.m. UTC | #3
Applied to Groovy/master-next

Thanks,
Ian

On 2020-10-28 10:48:25 , Colin King wrote:
> From: Zong Li <zong.li@sifive.com>
> 
> BugLink: https://bugs.launchpad.net/bugs/1894613
> 
> Some architectures assume that the stopped CPUs don't make function calls
> to traceable functions when they are in the stopped state. See also commit
> cb9d7fd51d9f ("watchdog: Mark watchdog touch functions as notrace").
> 
> Violating this assumption causes kernel crashes when switching tracer on
> RISC-V.
> 
> Mark rcu_momentary_dyntick_idle() and stop_machine_yield() notrace to
> prevent this.
> 
> Fixes: 4ecf0a43e729 ("processor: get rid of cpu_relax_yield")
> Fixes: 366237e7b083 ("stop_machine: Provide RCU quiescent state in multi_cpu_stop()")
> Signed-off-by: Zong Li <zong.li@sifive.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Tested-by: Atish Patra <atish.patra@wdc.com>
> Tested-by: Colin Ian King <colin.king@canonical.com>
> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> Acked-by: Paul E. McKenney <paulmck@kernel.org>
> Cc: stable@vger.kernel.org
> Link: https://lore.kernel.org/r/20201021073839.43935-1-zong.li@sifive.com
> (cherry picked from commit 4230e2deaa484b385aa01d598b2aea8e7f2660a6 from
>  https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git)
> 
> ---
>  kernel/rcu/tree.c     | 2 +-
>  kernel/stop_machine.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 06895ef85d69..2a52f42f64b6 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -409,7 +409,7 @@ bool rcu_eqs_special_set(int cpu)
>   *
>   * The caller must have disabled interrupts and must not be idle.
>   */
> -void rcu_momentary_dyntick_idle(void)
> +notrace void rcu_momentary_dyntick_idle(void)
>  {
>  	int special;
>  
> diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
> index 865bb0228ab6..890b79cf0e7c 100644
> --- a/kernel/stop_machine.c
> +++ b/kernel/stop_machine.c
> @@ -178,7 +178,7 @@ static void ack_state(struct multi_stop_data *msdata)
>  		set_state(msdata, msdata->state + 1);
>  }
>  
> -void __weak stop_machine_yield(const struct cpumask *cpumask)
> +notrace void __weak stop_machine_yield(const struct cpumask *cpumask)
>  {
>  	cpu_relax();
>  }
> -- 
> 2.27.0
> 
> 
> -- 
> kernel-team mailing list
> kernel-team@lists.ubuntu.com
> https://lists.ubuntu.com/mailman/listinfo/kernel-team

Patch

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 06895ef85d69..2a52f42f64b6 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -409,7 +409,7 @@  bool rcu_eqs_special_set(int cpu)
  *
  * The caller must have disabled interrupts and must not be idle.
  */
-void rcu_momentary_dyntick_idle(void)
+notrace void rcu_momentary_dyntick_idle(void)
 {
 	int special;
 
diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 865bb0228ab6..890b79cf0e7c 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -178,7 +178,7 @@  static void ack_state(struct multi_stop_data *msdata)
 		set_state(msdata, msdata->state + 1);
 }
 
-void __weak stop_machine_yield(const struct cpumask *cpumask)
+notrace void __weak stop_machine_yield(const struct cpumask *cpumask)
 {
 	cpu_relax();
 }
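
For readers unfamiliar with the annotation, the following is a minimal user-space sketch of what marking a function notrace does. It is an analogue under stated assumptions, not the kernel mechanism: the notrace macro below mirrors the generic kernel definition (some architectures define it differently), the __cyg_profile_func_* hooks are the ones GCC and Clang call when a program is built with -finstrument-functions (kernel ftrace uses its own entry hooks instead), and yield_untraced(), yield_traced(), hits_during() and self_check() are hypothetical names introduced only for illustration.

```c
#include <assert.h>

/* Assumption: same spelling as the generic kernel definition of notrace. */
#define notrace __attribute__((no_instrument_function))

/* Number of times the entry hook has fired. */
static unsigned long trace_hits;

/*
 * Entry/exit hooks. The compiler only emits calls to these when the
 * program is built with -finstrument-functions. They are notrace
 * themselves, otherwise they would recurse into their own hook.
 */
notrace void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
	(void)this_fn;
	(void)call_site;
	trace_hits++;
}

notrace void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
	(void)this_fn;
	(void)call_site;
}

/* Hypothetical stand-in for a function a stopped CPU may call. */
notrace int yield_untraced(int x)
{
	return x + 1;
}

/* Same body without the annotation: eligible for instrumentation. */
int yield_traced(int x)
{
	return x + 1;
}

/* Count hook invocations caused by calling fn() `calls` times. */
notrace unsigned long hits_during(int (*fn)(int), int calls)
{
	unsigned long before = trace_hits;

	for (int i = 0; i < calls; i++)
		fn(i);
	return trace_hits - before;
}

/* The annotated function must never reach the hook. */
notrace void self_check(void)
{
	assert(hits_during(yield_untraced, 5) == 0);
}
```

Built with -finstrument-functions, calls to yield_traced() would be expected to pass through the entry hook, while yield_untraced() never does. That is the property the patch relies on for rcu_momentary_dyntick_idle() and stop_machine_yield(): a stopped CPU can keep calling them while the tracer is being switched without entering tracing code.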