Message ID | 20181019010625.25294-47-cota@braap.org |
---|---|
State | New |
Series | per-CPU locks |
On 10/19/18 2:06 AM, Emilio G. Cota wrote:
> @@ -540,16 +540,16 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
>       */
>      atomic_mb_set(&cpu->icount_decr.u16.high, 0);
> 
> -    if (unlikely(atomic_read(&cpu->interrupt_request))) {
> +    if (unlikely(cpu_interrupt_request(cpu))) {
>          int interrupt_request;
>          qemu_mutex_lock_iothread();
> -        interrupt_request = cpu->interrupt_request;
> +        interrupt_request = cpu_interrupt_request(cpu);
>          if (unlikely(cpu->singlestep_enabled & SSTEP_NOIRQ)) {
>              /* Mask out external interrupts for this step. */
>              interrupt_request &= ~CPU_INTERRUPT_SSTEP_MASK;
>          }
>          if (interrupt_request & CPU_INTERRUPT_DEBUG) {
> -            cpu->interrupt_request &= ~CPU_INTERRUPT_DEBUG;
> +            cpu_reset_interrupt(cpu, CPU_INTERRUPT_DEBUG);
>              cpu->exception_index = EXCP_DEBUG;
>              qemu_mutex_unlock_iothread();
>              return true;

Multiple calls.

r~
On Sun, Oct 21, 2018 at 14:34:25 +0100, Richard Henderson wrote:
> On 10/19/18 2:06 AM, Emilio G. Cota wrote:
> > @@ -540,16 +540,16 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
> >       */
> >      atomic_mb_set(&cpu->icount_decr.u16.high, 0);
> > 
> > -    if (unlikely(atomic_read(&cpu->interrupt_request))) {
> > +    if (unlikely(cpu_interrupt_request(cpu))) {
> >          int interrupt_request;
> >          qemu_mutex_lock_iothread();
> > -        interrupt_request = cpu->interrupt_request;
> > +        interrupt_request = cpu_interrupt_request(cpu);
> >          if (unlikely(cpu->singlestep_enabled & SSTEP_NOIRQ)) {
> >              /* Mask out external interrupts for this step. */
> >              interrupt_request &= ~CPU_INTERRUPT_SSTEP_MASK;
> >          }
> >          if (interrupt_request & CPU_INTERRUPT_DEBUG) {
> > -            cpu->interrupt_request &= ~CPU_INTERRUPT_DEBUG;
> > +            cpu_reset_interrupt(cpu, CPU_INTERRUPT_DEBUG);
> >              cpu->exception_index = EXCP_DEBUG;
> >              qemu_mutex_unlock_iothread();
> >              return true;
> 
> Multiple calls.

I'd rather keep it as is.

The first read takes the lock, and that has to stay unless
we want to use atomic_set on interrupt_request everywhere.

The second read happens after the BQL has been acquired; note
that to avoid deadlock we never acquire the BQL after a CPU lock;
the second (locked) read thus has to stay.

Subsequent accesses are all via cpu_reset_interrupt. If we wanted
to avoid reacquiring the lock, we'd have to explicitly acquire the
lock before the 2nd read, and add unlocks everywhere (like the many
qemu_mutex_unlock_iothread calls), which would be ugly. But we'd
also have to be careful not to longjmp with the CPU mutex held,
so we'd have to unlock/lock around cc->cpu_exec_interrupt.

Given that the CPU lock is uncontended (so it's cheap to acquire)
and that the cases where we call cpu_reset_interrupt are not that
frequent (CPU_INTERRUPT_{DEBUG,HALT,EXITTB}), I'd rather just keep
the patch as is.

Thanks,

		Emilio
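The deadlock-avoidance rule described above, take the BQL before a CPU
lock and never after one, can be sketched as follows. This is an
illustrative fragment, not code from the series: qemu_mutex_lock_iothread()
and cpu_mutex_lock() are the lock entry points named in the thread, and
the function bodies here are assumptions.

    /* Lock-ordering sketch (assumed from the discussion above):
     * BQL -> cpu->lock is the only legal nesting order.
     */
    static void good_nesting(CPUState *cpu)
    {
        qemu_mutex_lock_iothread();   /* BQL first...               */
        cpu_mutex_lock(cpu);          /* ...then the CPU lock: fine */
        /* ... touch cpu->interrupt_request ... */
        cpu_mutex_unlock(cpu);
        qemu_mutex_unlock_iothread();
    }

    static void bad_nesting(CPUState *cpu)
    {
        cpu_mutex_lock(cpu);
        /* Can deadlock against a thread that holds the BQL and is
         * waiting for cpu->lock, hence "we never acquire the BQL
         * after a CPU lock".
         */
        qemu_mutex_lock_iothread();
    }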
On 10/23/18 12:50 AM, Emilio G. Cota wrote:
> On Sun, Oct 21, 2018 at 14:34:25 +0100, Richard Henderson wrote:
>> On 10/19/18 2:06 AM, Emilio G. Cota wrote:
>>> @@ -540,16 +540,16 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
>>>       */
>>>      atomic_mb_set(&cpu->icount_decr.u16.high, 0);
>>> 
>>> -    if (unlikely(atomic_read(&cpu->interrupt_request))) {
>>> +    if (unlikely(cpu_interrupt_request(cpu))) {
>>>          int interrupt_request;
>>>          qemu_mutex_lock_iothread();
>>> -        interrupt_request = cpu->interrupt_request;
>>> +        interrupt_request = cpu_interrupt_request(cpu);
>>>          if (unlikely(cpu->singlestep_enabled & SSTEP_NOIRQ)) {
>>>              /* Mask out external interrupts for this step. */
>>>              interrupt_request &= ~CPU_INTERRUPT_SSTEP_MASK;
>>>          }
>>>          if (interrupt_request & CPU_INTERRUPT_DEBUG) {
>>> -            cpu->interrupt_request &= ~CPU_INTERRUPT_DEBUG;
>>> +            cpu_reset_interrupt(cpu, CPU_INTERRUPT_DEBUG);
>>>              cpu->exception_index = EXCP_DEBUG;
>>>              qemu_mutex_unlock_iothread();
>>>              return true;
>>
>> Multiple calls.
> 
> I'd rather keep it as is.
> 
> The first read takes the lock, and that has to stay unless
> we want to use atomic_set on interrupt_request everywhere.

Why not?  That's even cheaper.

> Given that the CPU lock is uncontended (so it's cheap to
> acquire) ...

It still requires at minimum a "lock xchg" (or equivalent on non-x86),
which isn't free -- think 50-ish cycles minimum just for that one insn,
plus call overhead.

r~
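Richard's alternative, plain atomics instead of a locked first read,
would mean writers keep the per-CPU lock for mutual exclusion but
publish the new value with atomic_set(), so that a lockless
atomic_read() elsewhere is safe. A minimal sketch reusing the helper
name from the patch; the body is an assumption, not the series' code:

    /* Sketch: writers still serialize on cpu->lock, but store with
     * atomic_set() so readers outside the lock can use atomic_read()
     * instead of paying for a lock acquisition.
     */
    void cpu_interrupt_request_or(CPUState *cpu, uint32_t mask)
    {
        cpu_mutex_lock(cpu);
        atomic_set(&cpu->interrupt_request,
                   cpu->interrupt_request | mask);
        cpu_mutex_unlock(cpu);
    }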
On Tue, Oct 23, 2018 at 03:17:11 +0100, Richard Henderson wrote:
> On 10/23/18 12:50 AM, Emilio G. Cota wrote:
> > On Sun, Oct 21, 2018 at 14:34:25 +0100, Richard Henderson wrote:
> >> On 10/19/18 2:06 AM, Emilio G. Cota wrote:
> >>> @@ -540,16 +540,16 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
> >>>       */
> >>>      atomic_mb_set(&cpu->icount_decr.u16.high, 0);
> >>> 
> >>> -    if (unlikely(atomic_read(&cpu->interrupt_request))) {
> >>> +    if (unlikely(cpu_interrupt_request(cpu))) {
> >>>          int interrupt_request;
> >>>          qemu_mutex_lock_iothread();
> >>> -        interrupt_request = cpu->interrupt_request;
> >>> +        interrupt_request = cpu_interrupt_request(cpu);
> >>>          if (unlikely(cpu->singlestep_enabled & SSTEP_NOIRQ)) {
> >>>              /* Mask out external interrupts for this step. */
> >>>              interrupt_request &= ~CPU_INTERRUPT_SSTEP_MASK;
> >>>          }
> >>>          if (interrupt_request & CPU_INTERRUPT_DEBUG) {
> >>> -            cpu->interrupt_request &= ~CPU_INTERRUPT_DEBUG;
> >>> +            cpu_reset_interrupt(cpu, CPU_INTERRUPT_DEBUG);
> >>>              cpu->exception_index = EXCP_DEBUG;
> >>>              qemu_mutex_unlock_iothread();
> >>>              return true;
> >>
> >> Multiple calls.
> >
> > I'd rather keep it as is.
> >
> > The first read takes the lock, and that has to stay unless
> > we want to use atomic_set on interrupt_request everywhere.
> 
> Why not?  That's even cheaper.
> 
> > Given that the CPU lock is uncontended (so it's cheap to
> > acquire) ...
> 
> It still requires at minimum a "lock xchg" (or equivalent on non-x86),
> which isn't free -- think 50-ish cycles minimum just for that one insn,
> plus call overhead.

OK, I changed the first read to atomic_read (changing all the other
writers to atomic_set, but thanks to the helpers it's just very few
of them), and then I'm holding both the BQL + cpu->lock throughout.

Thanks,

		Emilio
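Under that revision, the top of cpu_handle_interrupt() would look
roughly as below: a lockless first read, then both the BQL and
cpu->lock held across the handling so the request bits can be read and
cleared with plain accesses. This is a sketch of the stated approach,
not the final code:

    if (unlikely(atomic_read(&cpu->interrupt_request))) {
        int interrupt_request;

        /* BQL first, then the CPU lock, per the ordering rule above;
         * both stay held while the request bits are examined and
         * cleared.
         */
        qemu_mutex_lock_iothread();
        cpu_mutex_lock(cpu);
        interrupt_request = cpu->interrupt_request;
        /* ... handle CPU_INTERRUPT_{DEBUG,HALT,EXITTB} ... */
        cpu_mutex_unlock(cpu);
        qemu_mutex_unlock_iothread();
    }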
diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index f37c9b1e94..25027a1fc6 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -428,7 +428,7 @@ static inline bool cpu_handle_halt_locked(CPUState *cpu)
 
     if (cpu_halted(cpu)) {
 #if defined(TARGET_I386) && !defined(CONFIG_USER_ONLY)
-        if ((cpu->interrupt_request & CPU_INTERRUPT_POLL)
+        if ((cpu_interrupt_request(cpu) & CPU_INTERRUPT_POLL)
             && replay_interrupt()) {
             X86CPU *x86_cpu = X86_CPU(cpu);
@@ -540,16 +540,16 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
      */
     atomic_mb_set(&cpu->icount_decr.u16.high, 0);
 
-    if (unlikely(atomic_read(&cpu->interrupt_request))) {
+    if (unlikely(cpu_interrupt_request(cpu))) {
         int interrupt_request;
         qemu_mutex_lock_iothread();
-        interrupt_request = cpu->interrupt_request;
+        interrupt_request = cpu_interrupt_request(cpu);
         if (unlikely(cpu->singlestep_enabled & SSTEP_NOIRQ)) {
             /* Mask out external interrupts for this step. */
             interrupt_request &= ~CPU_INTERRUPT_SSTEP_MASK;
         }
         if (interrupt_request & CPU_INTERRUPT_DEBUG) {
-            cpu->interrupt_request &= ~CPU_INTERRUPT_DEBUG;
+            cpu_reset_interrupt(cpu, CPU_INTERRUPT_DEBUG);
             cpu->exception_index = EXCP_DEBUG;
             qemu_mutex_unlock_iothread();
             return true;
@@ -558,7 +558,7 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
             /* Do nothing */
         } else if (interrupt_request & CPU_INTERRUPT_HALT) {
             replay_interrupt();
-            cpu->interrupt_request &= ~CPU_INTERRUPT_HALT;
+            cpu_reset_interrupt(cpu, CPU_INTERRUPT_HALT);
             cpu_halted_set(cpu, 1);
             cpu->exception_index = EXCP_HLT;
             qemu_mutex_unlock_iothread();
@@ -595,10 +595,10 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
             }
             /* The target hook may have updated the 'cpu->interrupt_request';
              * reload the 'interrupt_request' value */
-            interrupt_request = cpu->interrupt_request;
+            interrupt_request = cpu_interrupt_request(cpu);
         }
         if (interrupt_request & CPU_INTERRUPT_EXITTB) {
-            cpu->interrupt_request &= ~CPU_INTERRUPT_EXITTB;
+            cpu_reset_interrupt(cpu, CPU_INTERRUPT_EXITTB);
             /* ensure that no TB jump will be modified as the
                program flow was changed */
             *last_tb = NULL;
diff --git a/accel/tcg/tcg-all.c b/accel/tcg/tcg-all.c
index 3d25bdcc17..4e2fe70350 100644
--- a/accel/tcg/tcg-all.c
+++ b/accel/tcg/tcg-all.c
@@ -39,10 +39,16 @@ unsigned long tcg_tb_size;
 static void tcg_handle_interrupt(CPUState *cpu, int mask)
 {
     int old_mask;
-    g_assert(qemu_mutex_iothread_locked());
 
-    old_mask = cpu->interrupt_request;
-    cpu->interrupt_request |= mask;
+    if (!cpu_mutex_locked(cpu)) {
+        cpu_mutex_lock(cpu);
+        old_mask = cpu_interrupt_request(cpu);
+        cpu_interrupt_request_or(cpu, mask);
+        cpu_mutex_unlock(cpu);
+    } else {
+        old_mask = cpu_interrupt_request(cpu);
+        cpu_interrupt_request_or(cpu, mask);
+    }
 
     /*
      * If called from iothread context, wake the target cpu in
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 356dcd0948..038d82fdb5 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -2340,7 +2340,7 @@ void dump_opcount_info(FILE *f, fprintf_function cpu_fprintf)
 
 void cpu_interrupt(CPUState *cpu, int mask)
 {
     g_assert(qemu_mutex_iothread_locked());
-    cpu->interrupt_request |= mask;
+    cpu_interrupt_request_or(cpu, mask);
     atomic_set(&cpu->icount_decr.u16.high, -1);
 }
Cc: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Cc: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
---
 accel/tcg/cpu-exec.c      | 14 +++++++-------
 accel/tcg/tcg-all.c       | 12 +++++++++---
 accel/tcg/translate-all.c |  2 +-
 3 files changed, 17 insertions(+), 11 deletions(-)
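For reference, the diff above calls into accessors introduced earlier
in the series. Their plausible shape is sketched below; the names match
the calls in the patch, but these bodies are assumptions for
illustration, not the series' exact code:

    /* Assumed shape of the per-CPU-lock accessors (illustration only):
     * each one brackets the cpu->interrupt_request access with the
     * per-CPU lock.
     */
    uint32_t cpu_interrupt_request(CPUState *cpu)
    {
        uint32_t ret;

        cpu_mutex_lock(cpu);
        ret = cpu->interrupt_request;
        cpu_mutex_unlock(cpu);
        return ret;
    }

    void cpu_reset_interrupt(CPUState *cpu, int mask)
    {
        cpu_mutex_lock(cpu);
        cpu->interrupt_request &= ~mask;
        cpu_mutex_unlock(cpu);
    }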