diff mbox series

[4/8] edu: add smp_mb__after_rmw()

Message ID 20230303171939.237819-5-pbonzini@redhat.com
State New
Headers show
Series Fix missing memory barriers on ARM | expand

Commit Message

Paolo Bonzini March 3, 2023, 5:19 p.m. UTC
Ensure ordering between clearing the COMPUTING flag and checking
IRQFACT, and between setting the IRQFACT flag and checking
COMPUTING.  This ensures that no wakeups are lost.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/misc/edu.c | 5 +++++
 1 file changed, 5 insertions(+)

Comments

Richard Henderson March 5, 2023, 7:14 p.m. UTC | #1
On 3/3/23 09:19, Paolo Bonzini wrote:
> Ensure ordering between clearing the COMPUTING flag and checking
> IRQFACT, and between setting the IRQFACT flag and checking
> COMPUTING.  This ensures that no wakeups are lost.
> 
> Signed-off-by: Paolo Bonzini<pbonzini@redhat.com>
> ---
>   hw/misc/edu.c | 5 +++++
>   1 file changed, 5 insertions(+)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~
David Hildenbrand March 6, 2023, 1:31 p.m. UTC | #2
On 03.03.23 18:19, Paolo Bonzini wrote:
> Ensure ordering between clearing the COMPUTING flag and checking
> IRQFACT, and between setting the IRQFACT flag and checking
> COMPUTING.  This ensures that no wakeups are lost.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---

Reviewed-by: David Hildenbrand <david@redhat.com>
Peter Maydell March 6, 2023, 1:38 p.m. UTC | #3
On Fri, 3 Mar 2023 at 17:21, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> Ensure ordering between clearing the COMPUTING flag and checking
> IRQFACT, and between setting the IRQFACT flag and checking
> COMPUTING.  This ensures that no wakeups are lost.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Why is this device even messing around with multiple
threads and atomics anyway ??

I'm quite tempted to suggest we should deprecate-and-drop
this; it's not real hardware, it doesn't do anything
useful, and it's not a good model to follow if you're
implementing some other device.

thanks
-- PMM
Paolo Bonzini March 6, 2023, 2:10 p.m. UTC | #4
On 3/6/23 14:38, Peter Maydell wrote:
> On Fri, 3 Mar 2023 at 17:21, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>
>> Ensure ordering between clearing the COMPUTING flag and checking
>> IRQFACT, and between setting the IRQFACT flag and checking
>> COMPUTING.  This ensures that no wakeups are lost.
>>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> 
> Why is this device even messing around with multiple
> threads and atomics anyway ??

Because it is an example of deferring device work to another thread, 
just like on real hardware it may be deferred to an on-device 
microcontroller or CPU.

In particular, in this case I think it is useful to show a pitfall that 
is present in emulated hardware (where all loads and stores ultimately 
go through a store buffer and CPU cache) and not in real hardware (where 
I/O registers are always uncached).

> I'm quite tempted to suggest we should deprecate-and-drop
> this; it's not real hardware, it doesn't do anything
> useful, and it's not a good model to follow if you're
> implementing some other device.

I'm okay with deprecating it but I think it has educational value.

Paolo
Peter Maydell March 6, 2023, 2:24 p.m. UTC | #5
On Mon, 6 Mar 2023 at 14:10, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 3/6/23 14:38, Peter Maydell wrote:
> > On Fri, 3 Mar 2023 at 17:21, Paolo Bonzini <pbonzini@redhat.com> wrote:
> >>
> >> Ensure ordering between clearing the COMPUTING flag and checking
> >> IRQFACT, and between setting the IRQFACT flag and checking
> >> COMPUTING.  This ensures that no wakeups are lost.
> >>
> >> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> >
> > Why is this device even messing around with multiple
> > threads and atomics anyway ??
>
> Because it is an example of deferring device work to another thread,
> just like on real hardware it may be deferred to an on-device
> microcontroller or CPU.

If we want to be able to do that, we should probably have
infrastructure and higher-level primitives for it that
don't require device authors to be super-familiar with
QEMU's memory model and barriers... The fact there are only
half a dozen other uses of qemu_thread_create() under hw/
suggests that in practice we don't really need to do this
very often, though.

thanks
-- PMM
Paolo Bonzini March 6, 2023, 3:06 p.m. UTC | #6
On 3/6/23 15:24, Peter Maydell wrote:
>>> Why is this device even messing around with multiple
>>> threads and atomics anyway ??
>> Because it is an example of deferring device work to another thread,
>> just like on real hardware it may be deferred to an on-device
>> microcontroller or CPU.
> If we want to be able to do that, we should probably have
> infrastructure and higher-level primitives for it that
> don't require device authors to be super-familiar with
> QEMU's memory model and barriers... The fact there are only
> half a dozen other uses of qemu_thread_create() under hw/
> suggests that in practice we don't really need to do this
> very often, though.

Yes, you're totally right about that.  I have never needed this kind of 
higher-level primitive so I haven't thought much about what it would 
look like.

The usage of barriers isn't great, but all this patch does is correctness...

Paolo
Peter Maydell March 6, 2023, 3:36 p.m. UTC | #7
On Mon, 6 Mar 2023 at 15:06, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 3/6/23 15:24, Peter Maydell wrote:
> >>> Why is this device even messing around with multiple
> >>> threads and atomics anyway ??
> >> Because it is an example of deferring device work to another thread,
> >> just like on real hardware it may be deferred to an on-device
> >> microcontroller or CPU.
> > If we want to be able to do that, we should probably have
> > infrastructure and higher-level primitives for it that
> > don't require device authors to be super-familiar with
> > QEMU's memory model and barriers... The fact there are only
> > half a dozen other uses of qemu_thread_create() under hw/
> > suggests that in practice we don't really need to do this
> > very often, though.
>
> Yes, you're totally right about that.  I have never needed this kind of
> higher-level primitive so I haven't thought much about what it would
> look like.
>
> The usage of barriers isn't great, but all this patch does is correctness...

Yeah, I'm not objecting to this patch, to be clear. It's just
that it brought the edu device to my negative attention :-)

-- PMM
diff mbox series

Patch

diff --git a/hw/misc/edu.c b/hw/misc/edu.c
index e935c418d400..a1f8bc77e770 100644
--- a/hw/misc/edu.c
+++ b/hw/misc/edu.c
@@ -267,6 +267,8 @@  static void edu_mmio_write(void *opaque, hwaddr addr, uint64_t val,
     case 0x20:
         if (val & EDU_STATUS_IRQFACT) {
             qatomic_or(&edu->status, EDU_STATUS_IRQFACT);
+            /* Order check of the COMPUTING flag after setting IRQFACT.  */
+            smp_mb__after_rmw();
         } else {
             qatomic_and(&edu->status, ~EDU_STATUS_IRQFACT);
         }
@@ -349,6 +351,9 @@  static void *edu_fact_thread(void *opaque)
         qemu_mutex_unlock(&edu->thr_mutex);
         qatomic_and(&edu->status, ~EDU_STATUS_COMPUTING);
 
+        /* Clear COMPUTING flag before checking IRQFACT.  */
+        smp_mb__after_rmw();
+
         if (qatomic_read(&edu->status) & EDU_STATUS_IRQFACT) {
             qemu_mutex_lock_iothread();
             edu_raise_irq(edu, FACT_IRQ);