diff mbox series

[v2,4/9] memory: Don't do topology update in memory finalize()

Message ID 20210723193444.133412-5-peterx@redhat.com
State New
Headers show
Series memory: Sanity checks memory transaction when releasing BQL | expand

Commit Message

Peter Xu July 23, 2021, 7:34 p.m. UTC
Topology update could be wrongly triggered in memory region finalize() if
there's bug somewhere else.  It'll be a very confusing stack when it
happens (e.g., sending KVM ioctl within the RCU thread, and we'll observe it
only until it fails!).

Instead of that, we use the push()/pop() helper to avoid memory transaction
commit, at the same time we use assertions to make sure there's no pending
updates or it's a nested transaction, so it could fail even earlier and in a
more explicit way.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 softmmu/memory.c | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

Comments

David Hildenbrand July 27, 2021, 1:21 p.m. UTC | #1
On 23.07.21 21:34, Peter Xu wrote:
> Topology update could be wrongly triggered in memory region finalize() if
> there's bug somewhere else.  It'll be a very confusing stack when it
> happens (e.g., sending KVM ioctl within the RCU thread, and we'll observe it
> only until it fails!).
> 
> Instead of that, we use the push()/pop() helper to avoid memory transaction
> commit, at the same time we use assertions to make sure there's no pending
> updates or it's a nested transaction, so it could fail even earlier and in a
> more explicit way.
> 
> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>   softmmu/memory.c | 23 +++++++++++++++++++++--
>   1 file changed, 21 insertions(+), 2 deletions(-)
> 
> diff --git a/softmmu/memory.c b/softmmu/memory.c
> index 1a3e9ff8ad..dfce4a2bda 100644
> --- a/softmmu/memory.c
> +++ b/softmmu/memory.c
> @@ -170,6 +170,12 @@ struct MemoryRegionIoeventfd {
>       EventNotifier *e;
>   };
>   
> +/* Returns whether there's any pending memory updates */
> +static bool memory_region_has_pending_update(void)
> +{
> +    return memory_region_update_pending || ioeventfd_update_pending;
> +}
> +
>   static bool memory_region_ioeventfd_before(MemoryRegionIoeventfd *a,
>                                              MemoryRegionIoeventfd *b)
>   {
> @@ -1756,12 +1762,25 @@ static void memory_region_finalize(Object *obj)
>        * and cause an infinite loop.
>        */
>       mr->enabled = false;
> -    memory_region_transaction_begin();
> +
> +    /*
> +     * Use push()/pop() instead of begin()/commit() to make sure below block
> +     * won't trigger any topology update (which should never happen, but it's
> +     * still a safety belt).
> +     */

Hmm, I wonder if we can just keep the begin/end semantics and just do an 
assertion before doing the commit? Does anything speak against that?
Peter Xu July 27, 2021, 4:02 p.m. UTC | #2
On Tue, Jul 27, 2021 at 03:21:31PM +0200, David Hildenbrand wrote:
> On 23.07.21 21:34, Peter Xu wrote:
> > Topology update could be wrongly triggered in memory region finalize() if
> > there's bug somewhere else.  It'll be a very confusing stack when it
> > happens (e.g., sending KVM ioctl within the RCU thread, and we'll observe it
> > only until it fails!).
> > 
> > Instead of that, we use the push()/pop() helper to avoid memory transaction
> > commit, at the same time we use assertions to make sure there's no pending
> > updates or it's a nested transaction, so it could fail even earlier and in a
> > more explicit way.
> > 
> > Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> >   softmmu/memory.c | 23 +++++++++++++++++++++--
> >   1 file changed, 21 insertions(+), 2 deletions(-)
> > 
> > diff --git a/softmmu/memory.c b/softmmu/memory.c
> > index 1a3e9ff8ad..dfce4a2bda 100644
> > --- a/softmmu/memory.c
> > +++ b/softmmu/memory.c
> > @@ -170,6 +170,12 @@ struct MemoryRegionIoeventfd {
> >       EventNotifier *e;
> >   };
> > +/* Returns whether there's any pending memory updates */
> > +static bool memory_region_has_pending_update(void)
> > +{
> > +    return memory_region_update_pending || ioeventfd_update_pending;
> > +}
> > +
> >   static bool memory_region_ioeventfd_before(MemoryRegionIoeventfd *a,
> >                                              MemoryRegionIoeventfd *b)
> >   {
> > @@ -1756,12 +1762,25 @@ static void memory_region_finalize(Object *obj)
> >        * and cause an infinite loop.
> >        */
> >       mr->enabled = false;
> > -    memory_region_transaction_begin();
> > +
> > +    /*
> > +     * Use push()/pop() instead of begin()/commit() to make sure below block
> > +     * won't trigger any topology update (which should never happen, but it's
> > +     * still a safety belt).
> > +     */
> 
> Hmm, I wonder if we can just keep the begin/end semantics and just do an
> assertion before doing the commit? Does anything speak against that?

That sounds working too for the case of run_on_cpu and similar, but I think
this patch should be able to cover more.  For example, it's possible depth==0
when enter memory_region_finalize(), but some removal of subregions could
further cause memory layout changes.  IMHO we should also bail out early for
those cases too.  Thanks,
David Hildenbrand July 28, 2021, 12:13 p.m. UTC | #3
On 27.07.21 18:02, Peter Xu wrote:
> On Tue, Jul 27, 2021 at 03:21:31PM +0200, David Hildenbrand wrote:
>> On 23.07.21 21:34, Peter Xu wrote:
>>> Topology update could be wrongly triggered in memory region finalize() if
>>> there's bug somewhere else.  It'll be a very confusing stack when it
>>> happens (e.g., sending KVM ioctl within the RCU thread, and we'll observe it
>>> only until it fails!).
>>>
>>> Instead of that, we use the push()/pop() helper to avoid memory transaction
>>> commit, at the same time we use assertions to make sure there's no pending
>>> updates or it's a nested transaction, so it could fail even earlier and in a
>>> more explicit way.
>>>
>>> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
>>> Signed-off-by: Peter Xu <peterx@redhat.com>
>>> ---
>>>    softmmu/memory.c | 23 +++++++++++++++++++++--
>>>    1 file changed, 21 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/softmmu/memory.c b/softmmu/memory.c
>>> index 1a3e9ff8ad..dfce4a2bda 100644
>>> --- a/softmmu/memory.c
>>> +++ b/softmmu/memory.c
>>> @@ -170,6 +170,12 @@ struct MemoryRegionIoeventfd {
>>>        EventNotifier *e;
>>>    };
>>> +/* Returns whether there's any pending memory updates */
>>> +static bool memory_region_has_pending_update(void)
>>> +{
>>> +    return memory_region_update_pending || ioeventfd_update_pending;
>>> +}
>>> +
>>>    static bool memory_region_ioeventfd_before(MemoryRegionIoeventfd *a,
>>>                                               MemoryRegionIoeventfd *b)
>>>    {
>>> @@ -1756,12 +1762,25 @@ static void memory_region_finalize(Object *obj)
>>>         * and cause an infinite loop.
>>>         */
>>>        mr->enabled = false;
>>> -    memory_region_transaction_begin();
>>> +
>>> +    /*
>>> +     * Use push()/pop() instead of begin()/commit() to make sure below block
>>> +     * won't trigger any topology update (which should never happen, but it's
>>> +     * still a safety belt).
>>> +     */
>>
>> Hmm, I wonder if we can just keep the begin/end semantics and just do an
>> assertion before doing the commit? Does anything speak against that?
> 
> That sounds working too for the case of run_on_cpu and similar, but I think
> this patch should be able to cover more.  For example, it's possible depth==0
> when enter memory_region_finalize(), but some removal of subregions could
> further cause memory layout changes.  IMHO we should also bail out early for
> those cases too.  Thanks,
> 

Do we really have to switch to push/pop to catch these cases early? I'd 
assume we'd just have to formulate the right assertions :)
Peter Xu July 28, 2021, 1:56 p.m. UTC | #4
On Wed, Jul 28, 2021 at 02:13:17PM +0200, David Hildenbrand wrote:
> On 27.07.21 18:02, Peter Xu wrote:
> > On Tue, Jul 27, 2021 at 03:21:31PM +0200, David Hildenbrand wrote:
> > > On 23.07.21 21:34, Peter Xu wrote:
> > > > Topology update could be wrongly triggered in memory region finalize() if
> > > > there's bug somewhere else.  It'll be a very confusing stack when it
> > > > happens (e.g., sending KVM ioctl within the RCU thread, and we'll observe it
> > > > only until it fails!).
> > > > 
> > > > Instead of that, we use the push()/pop() helper to avoid memory transaction
> > > > commit, at the same time we use assertions to make sure there's no pending
> > > > updates or it's a nested transaction, so it could fail even earlier and in a
> > > > more explicit way.
> > > > 
> > > > Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
> > > > Signed-off-by: Peter Xu <peterx@redhat.com>
> > > > ---
> > > >    softmmu/memory.c | 23 +++++++++++++++++++++--
> > > >    1 file changed, 21 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/softmmu/memory.c b/softmmu/memory.c
> > > > index 1a3e9ff8ad..dfce4a2bda 100644
> > > > --- a/softmmu/memory.c
> > > > +++ b/softmmu/memory.c
> > > > @@ -170,6 +170,12 @@ struct MemoryRegionIoeventfd {
> > > >        EventNotifier *e;
> > > >    };
> > > > +/* Returns whether there's any pending memory updates */
> > > > +static bool memory_region_has_pending_update(void)
> > > > +{
> > > > +    return memory_region_update_pending || ioeventfd_update_pending;
> > > > +}
> > > > +
> > > >    static bool memory_region_ioeventfd_before(MemoryRegionIoeventfd *a,
> > > >                                               MemoryRegionIoeventfd *b)
> > > >    {
> > > > @@ -1756,12 +1762,25 @@ static void memory_region_finalize(Object *obj)
> > > >         * and cause an infinite loop.
> > > >         */
> > > >        mr->enabled = false;
> > > > -    memory_region_transaction_begin();
> > > > +
> > > > +    /*
> > > > +     * Use push()/pop() instead of begin()/commit() to make sure below block
> > > > +     * won't trigger any topology update (which should never happen, but it's
> > > > +     * still a safety belt).
> > > > +     */
> > > 
> > > Hmm, I wonder if we can just keep the begin/end semantics and just do an
> > > assertion before doing the commit? Does anything speak against that?
> > 
> > That sounds working too for the case of run_on_cpu and similar, but I think
> > this patch should be able to cover more.  For example, it's possible depth==0
> > when enter memory_region_finalize(), but some removal of subregions could
> > further cause memory layout changes.  IMHO we should also bail out early for
> > those cases too.  Thanks,
> > 
> 
> Do we really have to switch to push/pop to catch these cases early? I'd
> assume we'd just have to formulate the right assertions :)

The subject and commit message was trying to tell why. :)

"memory: Don't do topology update in memory finalize()"

The root reason is errors within memory commit can be very hard to digest,
however the assertion failure is easier to understand because any memory layout
change is not expected to happen.

The push/pop (I renamed it after your other comment to depth_inc/dec) avoids
memory commit happening at all within finalize(), and make sure we always fail
at the assertion as long as there's any potential memory update (even if the
memory update won't crash qemu immediately).

Thanks,
David Hildenbrand July 28, 2021, 2:01 p.m. UTC | #5
On 28.07.21 15:56, Peter Xu wrote:
> On Wed, Jul 28, 2021 at 02:13:17PM +0200, David Hildenbrand wrote:
>> On 27.07.21 18:02, Peter Xu wrote:
>>> On Tue, Jul 27, 2021 at 03:21:31PM +0200, David Hildenbrand wrote:
>>>> On 23.07.21 21:34, Peter Xu wrote:
>>>>> Topology update could be wrongly triggered in memory region finalize() if
>>>>> there's bug somewhere else.  It'll be a very confusing stack when it
>>>>> happens (e.g., sending KVM ioctl within the RCU thread, and we'll observe it
>>>>> only until it fails!).
>>>>>
>>>>> Instead of that, we use the push()/pop() helper to avoid memory transaction
>>>>> commit, at the same time we use assertions to make sure there's no pending
>>>>> updates or it's a nested transaction, so it could fail even earlier and in a
>>>>> more explicit way.
>>>>>
>>>>> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
>>>>> Signed-off-by: Peter Xu <peterx@redhat.com>
>>>>> ---
>>>>>     softmmu/memory.c | 23 +++++++++++++++++++++--
>>>>>     1 file changed, 21 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/softmmu/memory.c b/softmmu/memory.c
>>>>> index 1a3e9ff8ad..dfce4a2bda 100644
>>>>> --- a/softmmu/memory.c
>>>>> +++ b/softmmu/memory.c
>>>>> @@ -170,6 +170,12 @@ struct MemoryRegionIoeventfd {
>>>>>         EventNotifier *e;
>>>>>     };
>>>>> +/* Returns whether there's any pending memory updates */
>>>>> +static bool memory_region_has_pending_update(void)
>>>>> +{
>>>>> +    return memory_region_update_pending || ioeventfd_update_pending;
>>>>> +}
>>>>> +
>>>>>     static bool memory_region_ioeventfd_before(MemoryRegionIoeventfd *a,
>>>>>                                                MemoryRegionIoeventfd *b)
>>>>>     {
>>>>> @@ -1756,12 +1762,25 @@ static void memory_region_finalize(Object *obj)
>>>>>          * and cause an infinite loop.
>>>>>          */
>>>>>         mr->enabled = false;
>>>>> -    memory_region_transaction_begin();
>>>>> +
>>>>> +    /*
>>>>> +     * Use push()/pop() instead of begin()/commit() to make sure below block
>>>>> +     * won't trigger any topology update (which should never happen, but it's
>>>>> +     * still a safety belt).
>>>>> +     */
>>>>
>>>> Hmm, I wonder if we can just keep the begin/end semantics and just do an
>>>> assertion before doing the commit? Does anything speak against that?
>>>
>>> That sounds working too for the case of run_on_cpu and similar, but I think
>>> this patch should be able to cover more.  For example, it's possible depth==0
>>> when enter memory_region_finalize(), but some removal of subregions could
>>> further cause memory layout changes.  IMHO we should also bail out early for
>>> those cases too.  Thanks,
>>>
>>
>> Do we really have to switch to push/pop to catch these cases early? I'd
>> assume we'd just have to formulate the right assertions :)
> 
> The subject and commit message was trying to tell why. :)
> 
> "memory: Don't do topology update in memory finalize()"
> 
> The root reason is errors within memory commit can be very hard to digest,
> however the assertion failure is easier to understand because any memory layout
> change is not expected to happen.
> 
> The push/pop (I renamed it after your other comment to depth_inc/dec) avoids
> memory commit happening at all within finalize(), and make sure we always fail
> at the assertion as long as there's any potential memory update (even if the
> memory update won't crash qemu immediately).

Okay, makes sense to me, thanks

Acked-by: David Hildenbrand <david@redhat.com>
diff mbox series

Patch

diff --git a/softmmu/memory.c b/softmmu/memory.c
index 1a3e9ff8ad..dfce4a2bda 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -170,6 +170,12 @@  struct MemoryRegionIoeventfd {
     EventNotifier *e;
 };
 
+/* Returns whether there's any pending memory updates */
+static bool memory_region_has_pending_update(void)
+{
+    return memory_region_update_pending || ioeventfd_update_pending;
+}
+
 static bool memory_region_ioeventfd_before(MemoryRegionIoeventfd *a,
                                            MemoryRegionIoeventfd *b)
 {
@@ -1756,12 +1762,25 @@  static void memory_region_finalize(Object *obj)
      * and cause an infinite loop.
      */
     mr->enabled = false;
-    memory_region_transaction_begin();
+
+    /*
+     * Use push()/pop() instead of begin()/commit() to make sure below block
+     * won't trigger any topology update (which should never happen, but it's
+     * still a safety belt).
+     */
+    memory_region_transaction_push();
     while (!QTAILQ_EMPTY(&mr->subregions)) {
         MemoryRegion *subregion = QTAILQ_FIRST(&mr->subregions);
         memory_region_del_subregion(mr, subregion);
     }
-    memory_region_transaction_commit();
+    memory_region_transaction_pop();
+
+    /*
+     * Make sure we're either in a nested transaction or there must have no
+     * pending updates due to memory_region_del_subregion() above.
+     */
+    assert(memory_region_transaction_depth ||
+           !memory_region_has_pending_update());
 
     mr->destructor(mr);
     memory_region_clear_coalescing(mr);