
[v2,2/4] cpus: Make {start,end}_exclusive() recursive

Message ID 20230213125238.331881-3-iii@linux.ibm.com
State New
Series Fix deadlock when dying because of a signal

Commit Message

Ilya Leoshkevich Feb. 13, 2023, 12:52 p.m. UTC
Currently, dying from one of the core_dump_signal() signals deadlocks,
because dump_core_and_abort() calls start_exclusive() twice: first via
stop_all_tasks(), and then via preexit_cleanup() ->
qemu_plugin_user_exit().

There are a number of ways to solve this: resume after dumping core;
check cpu_in_exclusive_context() in qemu_plugin_user_exit(); or make
{start,end}_exclusive() recursive. Pick the last option, since it's
the most straightforward one.

Fixes: da91c1920242 ("linux-user: Clean up when exiting due to a signal")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 cpus-common.c         | 12 ++++++++++--
 include/hw/core/cpu.h |  4 ++--
 2 files changed, 12 insertions(+), 4 deletions(-)
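
For readers who want to see the recursion idea in isolation, here is a
minimal standalone sketch. It is not QEMU code: a plain pthread mutex stands
in for the real "stop all other CPUs" logic, a thread-local integer plays
the role of exclusive_context_count, and every sketch_* name and both
counters are hypothetical, introduced only for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Assumption: a plain mutex stands in for the real exclusive-section logic. */
static pthread_mutex_t exclusive_lock = PTHREAD_MUTEX_INITIALIZER;

/* Per-thread nesting depth, playing the role of exclusive_context_count. */
static _Thread_local int exclusive_depth;

/* Hypothetical counters: how often the outermost enter/leave really ran. */
static int real_acquires;
static int real_releases;

static void sketch_start_exclusive(void)
{
    if (exclusive_depth) {
        /* Already exclusive on this thread: only record the nesting. */
        exclusive_depth++;
        return;
    }
    pthread_mutex_lock(&exclusive_lock);
    real_acquires++;
    /* The depth is known to be zero here, so assigning 1 equals "++". */
    exclusive_depth = 1;
}

static void sketch_end_exclusive(void)
{
    assert(exclusive_depth > 0);
    if (--exclusive_depth) {
        /* Still nested: the outermost caller will do the real release. */
        return;
    }
    real_releases++;
    pthread_mutex_unlock(&exclusive_lock);
}

static bool sketch_in_exclusive_context(void)
{
    return exclusive_depth != 0;
}

/* Enters twice on one thread, like the two calls in the commit message. */
static int sketch_nested_demo(void)
{
    int depth;

    sketch_start_exclusive();   /* outer call, e.g. via stop_all_tasks() */
    sketch_start_exclusive();   /* nested call, would deadlock without the count */
    depth = exclusive_depth;
    sketch_end_exclusive();
    sketch_end_exclusive();
    return depth;
}
```

With a boolean flag and a non-recursive lock, the second
sketch_start_exclusive() call would block forever on the lock that the same
thread already holds. With the counter, only the outermost pair touches the
lock, and sketch_in_exclusive_context() stays true until the last
sketch_end_exclusive().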

Comments

Richard Henderson Feb. 13, 2023, 8:12 p.m. UTC | #1
On 2/13/23 02:52, Ilya Leoshkevich wrote:
> Currently dying to one of the core_dump_signal()s deadlocks, because
> dump_core_and_abort() calls start_exclusive() two times: first via
> stop_all_tasks(), and then via preexit_cleanup() ->
> qemu_plugin_user_exit().
> 
> There are a number of ways to solve this: resume after dumping core;
> check cpu_in_exclusive_context() in qemu_plugin_user_exit(); or make
> {start,end}_exclusive() recursive. Pick the last option, since it's
> the most straightforward one.
> 
> Fixes: da91c1920242 ("linux-user: Clean up when exiting due to a signal")
> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> ---
>   cpus-common.c         | 12 ++++++++++--
>   include/hw/core/cpu.h |  4 ++--
>   2 files changed, 12 insertions(+), 4 deletions(-)
> 
> diff --git a/cpus-common.c b/cpus-common.c
> index 793364dc0ed..a0c52cd187f 100644
> --- a/cpus-common.c
> +++ b/cpus-common.c
> @@ -192,6 +192,11 @@ void start_exclusive(void)
>       CPUState *other_cpu;
>       int running_cpus;
>   
> +    if (current_cpu->exclusive_context_count) {
> +        current_cpu->exclusive_context_count++;
> +        return;
> +    }
> +
>       qemu_mutex_lock(&qemu_cpu_list_lock);
>       exclusive_idle();
>   
> @@ -219,13 +224,16 @@ void start_exclusive(void)
>        */
>       qemu_mutex_unlock(&qemu_cpu_list_lock);
>   
> -    current_cpu->in_exclusive_context = true;
> +    current_cpu->exclusive_context_count++;

I think this line would be clearer as "= 1".

Otherwise,
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~
Alex Bennée Feb. 14, 2023, 9:26 a.m. UTC | #2
Ilya Leoshkevich <iii@linux.ibm.com> writes:

> Currently dying to one of the core_dump_signal()s deadlocks, because
> dump_core_and_abort() calls start_exclusive() two times: first via
> stop_all_tasks(), and then via preexit_cleanup() ->
> qemu_plugin_user_exit().
>
> There are a number of ways to solve this: resume after dumping core;
> check cpu_in_exclusive_context() in qemu_plugin_user_exit(); or make
> {start,end}_exclusive() recursive. Pick the last option, since it's
> the most straightforward one.
>
> Fixes: da91c1920242 ("linux-user: Clean up when exiting due to a signal")
> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

Patch

diff --git a/cpus-common.c b/cpus-common.c
index 793364dc0ed..a0c52cd187f 100644
--- a/cpus-common.c
+++ b/cpus-common.c
@@ -192,6 +192,11 @@  void start_exclusive(void)
     CPUState *other_cpu;
     int running_cpus;
 
+    if (current_cpu->exclusive_context_count) {
+        current_cpu->exclusive_context_count++;
+        return;
+    }
+
     qemu_mutex_lock(&qemu_cpu_list_lock);
     exclusive_idle();
 
@@ -219,13 +224,16 @@  void start_exclusive(void)
      */
     qemu_mutex_unlock(&qemu_cpu_list_lock);
 
-    current_cpu->in_exclusive_context = true;
+    current_cpu->exclusive_context_count++;
 }
 
 /* Finish an exclusive operation.  */
 void end_exclusive(void)
 {
-    current_cpu->in_exclusive_context = false;
+    current_cpu->exclusive_context_count--;
+    if (current_cpu->exclusive_context_count) {
+        return;
+    }
 
     qemu_mutex_lock(&qemu_cpu_list_lock);
     qatomic_set(&pending_cpus, 0);
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index 2417597236b..671f041bec6 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -349,7 +349,7 @@  struct CPUState {
     bool unplug;
     bool crash_occurred;
     bool exit_request;
-    bool in_exclusive_context;
+    int exclusive_context_count;
     uint32_t cflags_next_tb;
     /* updates protected by BQL */
     uint32_t interrupt_request;
@@ -758,7 +758,7 @@  void async_safe_run_on_cpu(CPUState *cpu, run_on_cpu_func func, run_on_cpu_data
  */
 static inline bool cpu_in_exclusive_context(const CPUState *cpu)
 {
-    return cpu->in_exclusive_context;
+    return cpu->exclusive_context_count;
 }
 
 /**