[v2,1/3] accel: introduce accelerator blocker API

Message ID 20221110164807.1306076-2-eesposit@redhat.com
State New
Series KVM: allow listener to stop all vcpus before

Commit Message

Emanuele Giuseppe Esposito Nov. 10, 2022, 4:48 p.m. UTC
This API allows the accelerators to prevent vcpus from issuing
new ioctls while executing a critical section marked with the
accel_ioctl_inhibit_begin/end functions.

Note that all functions submitting ioctls must mark where the
ioctl is being called with accel_{cpu_}set_in_ioctl().

This API requires the caller to always hold the BQL.
API documentation is in sysemu/accel-blocker.h

Internally, it uses a QemuLockCnt together with a per-CPU QemuLockCnt
(to minimize cache line bouncing) to prevent new ioctls from
running once the critical section starts, and a QemuEvent to wait
until all running ioctls finish.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 accel/accel-blocker.c          | 139 +++++++++++++++++++++++++++++++++
 accel/meson.build              |   2 +-
 include/sysemu/accel-blocker.h |  45 +++++++++++
 3 files changed, 185 insertions(+), 1 deletion(-)
 create mode 100644 accel/accel-blocker.c
 create mode 100644 include/sysemu/accel-blocker.h
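
The scheme described in the commit message can be modelled outside QEMU as a rough sketch. The block below is not the patch's implementation: it assumes a single counter guarded by a pthread mutex and two condition variables instead of QemuLockCnt and QemuEvent, and it leaves out the per-CPU counters, the BQL check and the vcpu kick. IoctlGate and the gate_* names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/*
 * Hypothetical single-counter model of the blocker. The patch itself uses
 * a global QemuLockCnt, per-CPU QemuLockCnts and a QemuEvent instead.
 */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t drained;   /* signalled when in_flight reaches zero */
    pthread_cond_t resumed;   /* signalled when inhibition ends */
    int in_flight;            /* ioctls currently running */
    bool inhibited;           /* true inside the critical section */
} IoctlGate;

static void gate_init(IoctlGate *g)
{
    pthread_mutex_init(&g->lock, NULL);
    pthread_cond_init(&g->drained, NULL);
    pthread_cond_init(&g->resumed, NULL);
    g->in_flight = 0;
    g->inhibited = false;
}

/* Called right before an ioctl: blocks while a critical section is open. */
static void gate_ioctl_begin(IoctlGate *g)
{
    pthread_mutex_lock(&g->lock);
    while (g->inhibited) {
        pthread_cond_wait(&g->resumed, &g->lock);
    }
    g->in_flight++;
    pthread_mutex_unlock(&g->lock);
}

/* Called right after an ioctl: wakes the inhibitor once nothing is running. */
static void gate_ioctl_end(IoctlGate *g)
{
    pthread_mutex_lock(&g->lock);
    assert(g->in_flight > 0);
    if (--g->in_flight == 0) {
        pthread_cond_broadcast(&g->drained);
    }
    pthread_mutex_unlock(&g->lock);
}

/* Open the critical section: stop new ioctls, then wait for running ones. */
static void gate_inhibit_begin(IoctlGate *g)
{
    pthread_mutex_lock(&g->lock);
    g->inhibited = true;
    while (g->in_flight > 0) {
        pthread_cond_wait(&g->drained, &g->lock);
    }
    pthread_mutex_unlock(&g->lock);
}

/* Close the critical section and release any blocked ioctl callers. */
static void gate_inhibit_end(IoctlGate *g)
{
    pthread_mutex_lock(&g->lock);
    g->inhibited = false;
    pthread_cond_broadcast(&g->resumed);
    pthread_mutex_unlock(&g->lock);
}
```

In this model every ioctl path contends on one mutex, which is the cache line bouncing the per-CPU QemuLockCnt in the patch is meant to avoid.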

Comments

Paolo Bonzini Nov. 11, 2022, 10:48 a.m. UTC | #1
On 11/10/22 17:48, Emanuele Giuseppe Esposito wrote:
> +/*
> + * QEMU accel blocker class

"Lock to inhibit accelerator ioctls"

> + *
> + * Copyright (c) 2014 Red Hat Inc.

2022, you can also add an Author line.

> +static int accel_in_ioctls(void)

Return bool (and return early if ret becomes true).

> +void accel_ioctl_inhibit_begin(void)
> +{
> +    CPUState *cpu;
> +
> +    /*
> +     * We allow to inhibit only when holding the BQL, so we can identify
> +     * when an inhibitor wants to issue an ioctl easily.
> +     */
> +    g_assert(qemu_mutex_iothread_locked());
> +
> +    /* Block further invocations of the ioctls outside the BQL.  */
> +    CPU_FOREACH(cpu) {
> +        qemu_lockcnt_lock(&cpu->in_ioctl_lock);
> +    }
> +    qemu_lockcnt_lock(&accel_in_ioctl_lock);
> +
> +    /* Keep waiting until there are running ioctls */
> +    while (accel_in_ioctls()) {
> +        /* Reset event to FREE. */
> +        qemu_event_reset(&accel_in_ioctl_event);
> +
> +        if (accel_in_ioctls()) {
> +
> +            CPU_FOREACH(cpu) {
> +                /* exit the ioctl */
> +                qemu_cpu_kick(cpu);

Only kick if the lockcnt count is > 0? (this is not racy; if it is == 0, 
it cannot ever become > 0 again while the lock is taken)

> diff --git a/include/sysemu/accel-blocker.h b/include/sysemu/accel-blocker.h
> new file mode 100644
> index 0000000000..135ebea566
> --- /dev/null
> +++ b/include/sysemu/accel-blocker.h
> @@ -0,0 +1,45 @@
> +/*
> + * Accelerator blocking API, to prevent new ioctls from starting and wait the
> + * running ones finish.
> + * This mechanism differs from pause/resume_all_vcpus() in that it does not
> + * release the BQL.
> + *
> + *  Copyright (c) 2014 Red Hat Inc.

2022, you can also add an Author line here too.

> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +#ifndef ACCEL_BLOCKER_H
> +#define ACCEL_BLOCKER_H
> +
> +#include "qemu/osdep.h"
> +#include "qemu/accel.h"

qemu/accel.h not needed?

> +#include "sysemu/cpus.h"
> +
> +extern void accel_blocker_init(void);
> +
> +/*
> + * accel_set_in_ioctl/accel_cpu_set_in_ioctl:
> + * Mark when ioctl is about to run or just finished.
> + * If @in_ioctl is true, then mark it is beginning. Otherwise marks that it is
> + * ending.
> + *
> + * These functions will block after accel_ioctl_inhibit_begin() is called,
> + * preventing new ioctls to run. They will continue only after
> + * accel_ioctl_inhibit_end().
> + */
> +extern void accel_set_in_ioctl(bool in_ioctl);
> +extern void accel_cpu_set_in_ioctl(CPUState *cpu, bool in_ioctl);

Why not just

extern void accel_ioctl_begin(void);
extern void accel_ioctl_end(void);
extern void accel_cpu_ioctl_begin(CPUState *cpu);
extern void accel_cpu_ioctl_end(CPUState *cpu);

?

Otherwise it's very nice.

Paolo
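
For illustration, the begin/end naming suggested above could look roughly like the sketch below at a call site. Everything here is a stand-in: CPUState is reduced to a counter, accel_cpu_set_in_ioctl() is a non-blocking stub rather than the function from the patch, and vcpu_ioctl_guarded() and fake_ioctl() are hypothetical helpers.

```c
#include <assert.h>
#include <stdbool.h>

/* Stub standing in for QEMU's CPUState; only the counter matters here. */
typedef struct CPUState {
    int ioctls_in_flight;
} CPUState;

/* Stub for the flag-based marker (no blocking and no BQL check). */
static void accel_cpu_set_in_ioctl(CPUState *cpu, bool in_ioctl)
{
    cpu->ioctls_in_flight += in_ioctl ? 1 : -1;
}

/* Paired wrappers: the call site states its intent without a bool argument. */
static void accel_cpu_ioctl_begin(CPUState *cpu)
{
    accel_cpu_set_in_ioctl(cpu, true);
}

static void accel_cpu_ioctl_end(CPUState *cpu)
{
    accel_cpu_set_in_ioctl(cpu, false);
}

/* Hypothetical call site: bracket whatever actually issues the ioctl. */
static int vcpu_ioctl_guarded(CPUState *cpu, int (*issue)(CPUState *))
{
    int ret;

    accel_cpu_ioctl_begin(cpu);
    ret = issue(cpu);
    accel_cpu_ioctl_end(cpu);
    return ret;
}

/* Fake ioctl that reports how many ioctls are marked as running. */
static int fake_ioctl(CPUState *cpu)
{
    return cpu->ioctls_in_flight;
}
```

With separate begin and end functions a mismatched pair is easier to spot in review than a flipped true/false argument.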
Emanuele Giuseppe Esposito Nov. 11, 2022, 2:52 p.m. UTC | #2
Am 11/11/2022 um 11:48 schrieb Paolo Bonzini:
> On 11/10/22 17:48, Emanuele Giuseppe Esposito wrote:
>> +/*
>> + * QEMU accel blocker class
> 
> "Lock to inhibit accelerator ioctls"
> 
>> + *
>> + * Copyright (c) 2014 Red Hat Inc.
> 
> 2022, you can also add an Author line.
> 
>> +static int accel_in_ioctls(void)
> 
> Return bool (and return early if ret becomes true).
> 
>> +void accel_ioctl_inhibit_begin(void)
>> +{
>> +    CPUState *cpu;
>> +
>> +    /*
>> +     * We allow to inhibit only when holding the BQL, so we can identify
>> +     * when an inhibitor wants to issue an ioctl easily.
>> +     */
>> +    g_assert(qemu_mutex_iothread_locked());
>> +
>> +    /* Block further invocations of the ioctls outside the BQL.  */
>> +    CPU_FOREACH(cpu) {
>> +        qemu_lockcnt_lock(&cpu->in_ioctl_lock);
>> +    }
>> +    qemu_lockcnt_lock(&accel_in_ioctl_lock);
>> +
>> +    /* Keep waiting until there are running ioctls */
>> +    while (accel_in_ioctls()) {
>> +        /* Reset event to FREE. */
>> +        qemu_event_reset(&accel_in_ioctl_event);
>> +
>> +        if (accel_in_ioctls()) {
>> +
>> +            CPU_FOREACH(cpu) {
>> +                /* exit the ioctl */
>> +                qemu_cpu_kick(cpu);
> 
> Only kick if the lockcnt count is > 0? (this is not racy; if it is == 0,
> it cannot ever become > 0 again while the lock is taken)

Better:

static bool accel_has_to_wait(void)
{
    CPUState *cpu;
    bool needs_to_wait = false;

    CPU_FOREACH(cpu) {
        if (qemu_lockcnt_count(&cpu->in_ioctl_lock)) {
            qemu_cpu_kick(cpu);
            needs_to_wait = true;
        }
    }

    return needs_to_wait || qemu_lockcnt_count(&accel_in_ioctl_lock);
}

And then the loop becomes:

while (true) {
    qemu_event_reset(&accel_in_ioctl_event);

    if (accel_has_to_wait()) {
        qemu_event_wait(&accel_in_ioctl_event);
    } else {
        /* No ioctl is running */
        return;
    }
}
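
To make the reset/check/wait ordering concrete, here is a small standalone model of that loop. It is only a sketch under several assumptions: Event is a mutex-and-condvar stand-in for QemuEvent, the per-CPU counters are plain ints, NR_CPUS is arbitrary, and kick() completes the ioctl synchronously, whereas a real vcpu thread would finish it later on its own.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define NR_CPUS 4   /* hypothetical vcpu count for this model */

/* Minimal stand-in for QemuEvent: set() latches until the next reset(). */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    bool set;
} Event;

static void event_init(Event *ev)
{
    pthread_mutex_init(&ev->lock, NULL);
    pthread_cond_init(&ev->cond, NULL);
    ev->set = false;
}

static void event_set(Event *ev)
{
    pthread_mutex_lock(&ev->lock);
    ev->set = true;
    pthread_cond_broadcast(&ev->cond);
    pthread_mutex_unlock(&ev->lock);
}

static void event_reset(Event *ev)
{
    pthread_mutex_lock(&ev->lock);
    ev->set = false;
    pthread_mutex_unlock(&ev->lock);
}

static void event_wait(Event *ev)
{
    pthread_mutex_lock(&ev->lock);
    while (!ev->set) {
        pthread_cond_wait(&ev->cond, &ev->lock);
    }
    pthread_mutex_unlock(&ev->lock);
}

static int in_ioctl[NR_CPUS];   /* per-vcpu count of running ioctls */
static int kicks;               /* number of kicks issued so far */
static Event drained;

/* Simulated kick: the ioctl on this vcpu ends right away (an assumption). */
static void kick(int cpu)
{
    kicks++;
    in_ioctl[cpu]--;
    event_set(&drained);
}

/* Kick only vcpus that are inside an ioctl; report whether any was found. */
static bool has_to_wait(void)
{
    bool needs_to_wait = false;

    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        if (in_ioctl[cpu] > 0) {
            kick(cpu);
            needs_to_wait = true;
        }
    }
    return needs_to_wait;
}

/* Reset first, then check, then wait: a set() after the reset is not lost. */
static void drain(void)
{
    for (;;) {
        event_reset(&drained);
        if (!has_to_wait()) {
            return;
        }
        event_wait(&drained);
    }
}
```

Because the event is reset before the counters are inspected, an ioctl that finishes between the check and the wait leaves the event set, so the wait returns immediately and the loop re-checks.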

> 
>> diff --git a/include/sysemu/accel-blocker.h
>> b/include/sysemu/accel-blocker.h
>> new file mode 100644
>> index 0000000000..135ebea566
>> --- /dev/null
>> +++ b/include/sysemu/accel-blocker.h
>> @@ -0,0 +1,45 @@
>> +/*
>> + * Accelerator blocking API, to prevent new ioctls from starting and
>> wait the
>> + * running ones finish.
>> + * This mechanism differs from pause/resume_all_vcpus() in that it
>> does not
>> + * release the BQL.
>> + *
>> + *  Copyright (c) 2014 Red Hat Inc.
> 
> 2022, you can also add an Author line here too.
> 
>> + * This work is licensed under the terms of the GNU GPL, version 2 or
>> later.
>> + * See the COPYING file in the top-level directory.
>> + */
>> +#ifndef ACCEL_BLOCKER_H
>> +#define ACCEL_BLOCKER_H
>> +
>> +#include "qemu/osdep.h"
>> +#include "qemu/accel.h"
> 
> qemu/accel.h not needed?
> 
>> +#include "sysemu/cpus.h"
>> +
>> +extern void accel_blocker_init(void);
>> +
>> +/*
>> + * accel_set_in_ioctl/accel_cpu_set_in_ioctl:
>> + * Mark when ioctl is about to run or just finished.
>> + * If @in_ioctl is true, then mark it is beginning. Otherwise marks
>> that it is
>> + * ending.
>> + *
>> + * These functions will block after accel_ioctl_inhibit_begin() is
>> called,
>> + * preventing new ioctls to run. They will continue only after
>> + * accel_ioctl_inhibit_end().
>> + */
>> +extern void accel_set_in_ioctl(bool in_ioctl);
>> +extern void accel_cpu_set_in_ioctl(CPUState *cpu, bool in_ioctl);
> 
> Why not just
> 
> extern void accel_ioctl_begin(void);
> extern void accel_ioctl_end(void);
> extern void accel_cpu_ioctl_begin(CPUState *cpu);
> extern void accel_cpu_ioctl_end(CPUState *cpu);
> 
> ?

Ok, makes sense.

Thank you,
Emanuele

> 
> Otherwise it's very nice.
> 
> Paolo
>

Patch

diff --git a/accel/accel-blocker.c b/accel/accel-blocker.c
new file mode 100644
index 0000000000..2701a05945
--- /dev/null
+++ b/accel/accel-blocker.c
@@ -0,0 +1,139 @@ 
+/*
+ * QEMU accel blocker class
+ *
+ * Copyright (c) 2014 Red Hat Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/thread.h"
+#include "qemu/main-loop.h"
+#include "hw/core/cpu.h"
+#include "sysemu/accel-blocker.h"
+
+static QemuLockCnt accel_in_ioctl_lock;
+static QemuEvent accel_in_ioctl_event;
+
+void accel_blocker_init(void)
+{
+    qemu_lockcnt_init(&accel_in_ioctl_lock);
+    qemu_event_init(&accel_in_ioctl_event, false);
+}
+
+void accel_set_in_ioctl(bool in_ioctl)
+{
+    if (likely(qemu_mutex_iothread_locked())) {
+        return;
+    }
+    if (in_ioctl) {
+        /* block if lock is taken in accel_ioctl_inhibit_begin() */
+        qemu_lockcnt_inc(&accel_in_ioctl_lock);
+    } else {
+        qemu_lockcnt_dec(&accel_in_ioctl_lock);
+        /* change event to SET. If event was BUSY, wake up all waiters */
+        qemu_event_set(&accel_in_ioctl_event);
+    }
+}
+
+void accel_cpu_set_in_ioctl(CPUState *cpu, bool in_ioctl)
+{
+    if (unlikely(qemu_mutex_iothread_locked())) {
+        return;
+    }
+    if (in_ioctl) {
+        /* block if lock is taken in accel_ioctl_inhibit_begin() */
+        qemu_lockcnt_inc(&cpu->in_ioctl_lock);
+    } else {
+        qemu_lockcnt_dec(&cpu->in_ioctl_lock);
+        /* change event to SET. If event was BUSY, wake up all waiters */
+        qemu_event_set(&accel_in_ioctl_event);
+    }
+}
+
+static int accel_in_ioctls(void)
+{
+    CPUState *cpu;
+    int ret = qemu_lockcnt_count(&accel_in_ioctl_lock);
+
+    CPU_FOREACH(cpu) {
+        ret += qemu_lockcnt_count(&cpu->in_ioctl_lock);
+    }
+
+    return ret;
+}
+
+void accel_ioctl_inhibit_begin(void)
+{
+    CPUState *cpu;
+
+    /*
+     * We allow to inhibit only when holding the BQL, so we can identify
+     * when an inhibitor wants to issue an ioctl easily.
+     */
+    g_assert(qemu_mutex_iothread_locked());
+
+    /* Block further invocations of the ioctls outside the BQL.  */
+    CPU_FOREACH(cpu) {
+        qemu_lockcnt_lock(&cpu->in_ioctl_lock);
+    }
+    qemu_lockcnt_lock(&accel_in_ioctl_lock);
+
+    /* Keep waiting while there are running ioctls */
+    while (accel_in_ioctls()) {
+        /* Reset event to FREE. */
+        qemu_event_reset(&accel_in_ioctl_event);
+
+        if (accel_in_ioctls()) {
+
+            CPU_FOREACH(cpu) {
+                /* exit the ioctl */
+                qemu_cpu_kick(cpu);
+            }
+
+            /*
+             * If event is still FREE, and there are ioctls still in progress,
+             * wait.
+             *
+             * If an ioctl finishes before qemu_event_wait(), it will change
+             * the event state to SET. This will prevent qemu_event_wait() from
+             * blocking, but it's not a problem because if other ioctls are
+             * still running (accel_in_ioctls is true) the loop will iterate
+             * once more and reset the event status to FREE so that it can wait
+             * properly.
+             *
+             * If an ioctl finishes while qemu_event_wait() is blocking, then
+             * the waiter will be woken up, but here too the while loop makes sure
+             * to re-enter the wait if there are other running ioctls.
+             */
+            qemu_event_wait(&accel_in_ioctl_event);
+        }
+    }
+}
+
+void accel_ioctl_inhibit_end(void)
+{
+    CPUState *cpu;
+
+    qemu_lockcnt_unlock(&accel_in_ioctl_lock);
+    CPU_FOREACH(cpu) {
+        qemu_lockcnt_unlock(&cpu->in_ioctl_lock);
+    }
+}
+
diff --git a/accel/meson.build b/accel/meson.build
index b9a963cf80..a0d49c4f31 100644
--- a/accel/meson.build
+++ b/accel/meson.build
@@ -1,4 +1,4 @@ 
-specific_ss.add(files('accel-common.c'))
+specific_ss.add(files('accel-common.c', 'accel-blocker.c'))
 softmmu_ss.add(files('accel-softmmu.c'))
 user_ss.add(files('accel-user.c'))
 
diff --git a/include/sysemu/accel-blocker.h b/include/sysemu/accel-blocker.h
new file mode 100644
index 0000000000..135ebea566
--- /dev/null
+++ b/include/sysemu/accel-blocker.h
@@ -0,0 +1,45 @@ 
+/*
+ * Accelerator blocking API, to prevent new ioctls from starting and wait for
+ * the running ones to finish.
+ * This mechanism differs from pause/resume_all_vcpus() in that it does not
+ * release the BQL.
+ *
+ *  Copyright (c) 2014 Red Hat Inc.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+#ifndef ACCEL_BLOCKER_H
+#define ACCEL_BLOCKER_H
+
+#include "qemu/osdep.h"
+#include "qemu/accel.h"
+#include "sysemu/cpus.h"
+
+extern void accel_blocker_init(void);
+
+/*
+ * accel_set_in_ioctl/accel_cpu_set_in_ioctl:
+ * Mark when ioctl is about to run or just finished.
+ * If @in_ioctl is true, mark that the ioctl is beginning. Otherwise mark that
+ * it is ending.
+ *
+ * These functions will block after accel_ioctl_inhibit_begin() is called,
+ * preventing new ioctls from running. They will continue only after
+ * accel_ioctl_inhibit_end().
+ */
+extern void accel_set_in_ioctl(bool in_ioctl);
+extern void accel_cpu_set_in_ioctl(CPUState *cpu, bool in_ioctl);
+
+/*
+ * accel_ioctl_inhibit_begin/end: start/end critical section
+ * Between these two calls, no ioctl marked with accel_set_in_ioctl() and
+ * accel_cpu_set_in_ioctl() is allowed to run.
+ *
+ * This allows the caller to access shared data or perform operations without
+ * worrying about concurrent vcpu accesses.
+ */
+extern void accel_ioctl_inhibit_begin(void);
+extern void accel_ioctl_inhibit_end(void);
+
+#endif /* ACCEL_BLOCKER_H */