diff mbox series

hw/rdma: Lock before destroy

Message ID 20200324105356.7998-1-yuval.shaia.ml@gmail.com
State New
Headers show
Series hw/rdma: Lock before destroy | expand

Commit Message

Yuval Shaia March 24, 2020, 10:53 a.m. UTC
To protect from the case that users of the protected_qlist are still
using the qlist let's lock before detsroying it.

Reported-by: Coverity (CID 1421951)
Signed-off-by: Yuval Shaia <yuval.shaia.ml@gmail.com>
---
 hw/rdma/rdma_utils.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Peter Maydell March 24, 2020, 11:05 a.m. UTC | #1
On Tue, 24 Mar 2020 at 10:54, Yuval Shaia <yuval.shaia.ml@gmail.com> wrote:
>
> To protect from the case that users of the protected_qlist are still
> using the qlist let's lock before detsroying it.
>
> Reported-by: Coverity (CID 1421951)
> Signed-off-by: Yuval Shaia <yuval.shaia.ml@gmail.com>
> ---
>  hw/rdma/rdma_utils.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/hw/rdma/rdma_utils.c b/hw/rdma/rdma_utils.c
> index 73f279104c..55779cbef6 100644
> --- a/hw/rdma/rdma_utils.c
> +++ b/hw/rdma/rdma_utils.c
> @@ -63,6 +63,7 @@ void rdma_protected_qlist_init(RdmaProtectedQList *list)
>  void rdma_protected_qlist_destroy(RdmaProtectedQList *list)
>  {
>      if (list->list) {
> +        qemu_mutex_lock(&list->lock);
>          qlist_destroy_obj(QOBJECT(list->list));
>          qemu_mutex_destroy(&list->lock);
>          list->list = NULL;

Looking at the code a bit more closely, I don't think that taking
the lock here helps. If there really could be another thread
that might be calling eg rdma_protected_qlist_append_int64()
at the same time that we're calling rdma_protected_qlist_destroy()
then it's just as likely that the code calling destroy could
finish just before the append starts: in that case the append
will crash because the list has been freed and the mutex destroyed.

So I think we require that the user of a protected-qlist
ensures that there are no more users of it before it is
destroyed (which is fairly normal semantics), and the code
as it stands is correct (ie coverity false-positive).

thanks
-- PMM
Marcel Apfelbaum March 24, 2020, 11:20 a.m. UTC | #2
Hi Peter,Yuval

On 3/24/20 1:05 PM, Peter Maydell wrote:
> On Tue, 24 Mar 2020 at 10:54, Yuval Shaia <yuval.shaia.ml@gmail.com> wrote:
>> To protect from the case that users of the protected_qlist are still
>> using the qlist let's lock before detsroying it.
>>
>> Reported-by: Coverity (CID 1421951)
>> Signed-off-by: Yuval Shaia <yuval.shaia.ml@gmail.com>
>> ---
>>   hw/rdma/rdma_utils.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/hw/rdma/rdma_utils.c b/hw/rdma/rdma_utils.c
>> index 73f279104c..55779cbef6 100644
>> --- a/hw/rdma/rdma_utils.c
>> +++ b/hw/rdma/rdma_utils.c
>> @@ -63,6 +63,7 @@ void rdma_protected_qlist_init(RdmaProtectedQList *list)
>>   void rdma_protected_qlist_destroy(RdmaProtectedQList *list)
>>   {
>>       if (list->list) {
>> +        qemu_mutex_lock(&list->lock);
>>           qlist_destroy_obj(QOBJECT(list->list));
>>           qemu_mutex_destroy(&list->lock);
>>           list->list = NULL;
> Looking at the code a bit more closely, I don't think that taking
> the lock here helps. If there really could be another thread
> that might be calling eg rdma_protected_qlist_append_int64()
> at the same time that we're calling rdma_protected_qlist_destroy()
> then it's just as likely that the code calling destroy could
> finish just before the append starts: in that case the append
> will crash because the list has been freed and the mutex destroyed.
>
> So I think we require that the user of a protected-qlist
> ensures that there are no more users of it before it is
> destroyed (which is fairly normal semantics), and the code
> as it stands is correct (ie coverity false-positive).

I agree is a false positive for our case.
"rdma_protected_qlist_destroy" is called by "mad_fini" which in turn is 
called by "rdma_backend_fini"
*after* the VM shutdown, at this point there is no active lock user.

Thanks,
Marcel

> thanks
> -- PMM
Peter Maydell March 24, 2020, 11:25 a.m. UTC | #3
On Tue, 24 Mar 2020 at 11:18, Marcel Apfelbaum
<marcel.apfelbaum@gmail.com> wrote:
>
> Hi Peter,Yuval
>
> On 3/24/20 1:05 PM, Peter Maydell wrote:
> > So I think we require that the user of a protected-qlist
> > ensures that there are no more users of it before it is
> > destroyed (which is fairly normal semantics), and the code
> > as it stands is correct (ie coverity false-positive).
>
> I agree is a false positive for our case.
> "rdma_protected_qlist_destroy" is called by "mad_fini" which in turn is
> called by "rdma_backend_fini"
> *after* the VM shutdown, at this point there is no active lock user.

Also, the function coverity queried was rdma_protected_gslist_destroy(),
not rdma_protected_qlist_destroy().

I notice that the gslist_destroy function does not destroy
the mutex -- is this an intentional difference ?

thanks
-- PMM
Yuval Shaia March 24, 2020, 11:25 a.m. UTC | #4
On Tue, 24 Mar 2020 at 13:18, Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
wrote:

> Hi Peter,Yuval
>
> On 3/24/20 1:05 PM, Peter Maydell wrote:
> > On Tue, 24 Mar 2020 at 10:54, Yuval Shaia <yuval.shaia.ml@gmail.com>
> wrote:
> >> To protect from the case that users of the protected_qlist are still
> >> using the qlist let's lock before detsroying it.
> >>
> >> Reported-by: Coverity (CID 1421951)
> >> Signed-off-by: Yuval Shaia <yuval.shaia.ml@gmail.com>
> >> ---
> >>   hw/rdma/rdma_utils.c | 1 +
> >>   1 file changed, 1 insertion(+)
> >>
> >> diff --git a/hw/rdma/rdma_utils.c b/hw/rdma/rdma_utils.c
> >> index 73f279104c..55779cbef6 100644
> >> --- a/hw/rdma/rdma_utils.c
> >> +++ b/hw/rdma/rdma_utils.c
> >> @@ -63,6 +63,7 @@ void rdma_protected_qlist_init(RdmaProtectedQList
> *list)
> >>   void rdma_protected_qlist_destroy(RdmaProtectedQList *list)
> >>   {
> >>       if (list->list) {
> >> +        qemu_mutex_lock(&list->lock);
> >>           qlist_destroy_obj(QOBJECT(list->list));
> >>           qemu_mutex_destroy(&list->lock);
> >>           list->list = NULL;
> > Looking at the code a bit more closely, I don't think that taking
> > the lock here helps. If there really could be another thread
> > that might be calling eg rdma_protected_qlist_append_int64()
> > at the same time that we're calling rdma_protected_qlist_destroy()
> > then it's just as likely that the code calling destroy could
> > finish just before the append starts: in that case the append
> > will crash because the list has been freed and the mutex destroyed.
> >
> > So I think we require that the user of a protected-qlist
> > ensures that there are no more users of it before it is
> > destroyed (which is fairly normal semantics), and the code
> > as it stands is correct (ie coverity false-positive).
>
> I agree is a false positive for our case.
> "rdma_protected_qlist_destroy" is called by "mad_fini" which in turn is
> called by "rdma_backend_fini"
> *after* the VM shutdown, at this point there is no active lock user.
>

As i already said, current code makes sure it will not happen however it
better that API will ensure this and will not trust callers.


>
> Thanks,
> Marcel
>
> > thanks
> > -- PMM
>
>
Peter Maydell March 24, 2020, 11:55 a.m. UTC | #5
On Tue, 24 Mar 2020 at 11:25, Yuval Shaia <yuval.shaia.ml@gmail.com> wrote:
> As i already said, current code makes sure it will not happen
> however it better that API will ensure this and will not trust callers.

I agree with the principle, but I think that here there is no
way to do it -- if you are literally destroying an object
then it is invalid to use it after destruction and there
is no way to have a lock that protects against that kind
of bug, unless the lock is at a higher level (ie the
thing that owns the destroyed-object has a lock and
doesn't allow access to it outside the lock or without
a check for has-been-destroyed). Just throwing an extra
mutex-lock into the destroy function doesn't add any
protection.

thanks
-- PMM
Yuval Shaia March 24, 2020, 12:48 p.m. UTC | #6
On Tue, 24 Mar 2020 at 13:25, Peter Maydell <peter.maydell@linaro.org>
wrote:

> On Tue, 24 Mar 2020 at 11:18, Marcel Apfelbaum
> <marcel.apfelbaum@gmail.com> wrote:
> >
> > Hi Peter,Yuval
> >
> > On 3/24/20 1:05 PM, Peter Maydell wrote:
> > > So I think we require that the user of a protected-qlist
> > > ensures that there are no more users of it before it is
> > > destroyed (which is fairly normal semantics), and the code
> > > as it stands is correct (ie coverity false-positive).
> >
> > I agree is a false positive for our case.
> > "rdma_protected_qlist_destroy" is called by "mad_fini" which in turn is
> > called by "rdma_backend_fini"
> > *after* the VM shutdown, at this point there is no active lock user.
>
> Also, the function coverity queried was rdma_protected_gslist_destroy(),
> not rdma_protected_qlist_destroy().
>

Yeah but pattern is the same, if we agree to threat it as false-positive
then it applies to the two cases.


>
> I notice that the gslist_destroy function does not destroy
> the mutex -- is this an intentional difference ?
>

No - it is a bug!


>
> thanks
> -- PMM
>
Yuval Shaia March 24, 2020, 12:57 p.m. UTC | #7
On Tue, 24 Mar 2020 at 13:55, Peter Maydell <peter.maydell@linaro.org>
wrote:

> On Tue, 24 Mar 2020 at 11:25, Yuval Shaia <yuval.shaia.ml@gmail.com>
> wrote:
> > As i already said, current code makes sure it will not happen
> > however it better that API will ensure this and will not trust callers.
>
> I agree with the principle, but I think that here there is no
> way to do it -- if you are literally destroying an object
> then it is invalid to use it after destruction and there
> is no way to have a lock that protects against that kind
> of bug, unless the lock is at a higher level (ie the
> thing that owns the destroyed-object has a lock and
> doesn't allow access to it outside the lock or without
> a check for has-been-destroyed). Just throwing an extra
> mutex-lock into the destroy function doesn't add any
> protection.
>

Make sense.
So what i can do is to check list->list at every API since destroy
functions sets it to NULL.

Does it make sense?


>
> thanks
> -- PMM
>
Peter Maydell March 24, 2020, 1:20 p.m. UTC | #8
On Tue, 24 Mar 2020 at 12:57, Yuval Shaia <yuval.shaia.ml@gmail.com> wrote:
> So what i can do is to check list->list at every API since destroy
> functions sets it to NULL.

No, that won't help. You can't check list->list for NULL without
taking the lock (because in the append etc functions you really
could be multithreaded), and you can't take the lock because
it might have been destroyed.

thanks
-- PMM
diff mbox series

Patch

diff --git a/hw/rdma/rdma_utils.c b/hw/rdma/rdma_utils.c
index 73f279104c..55779cbef6 100644
--- a/hw/rdma/rdma_utils.c
+++ b/hw/rdma/rdma_utils.c
@@ -63,6 +63,7 @@  void rdma_protected_qlist_init(RdmaProtectedQList *list)
 void rdma_protected_qlist_destroy(RdmaProtectedQList *list)
 {
     if (list->list) {
+        qemu_mutex_lock(&list->lock);
         qlist_destroy_obj(QOBJECT(list->list));
         qemu_mutex_destroy(&list->lock);
         list->list = NULL;