vhost-user: fix watcher need be removed when vhost-user hotplug

Message ID 1500614191-13392-1-git-send-email-wangyunjian@huawei.com
State New

Commit Message

wangyunjian July 21, 2017, 5:16 a.m. UTC
From: Yunjian Wang <wangyunjian@huawei.com>

"nc" is freed after vhost-user hotplug, but the watcher is not removed.
QEMU then crashes when the watcher accesses the freed "nc" on socket disconnect.

    Program received signal SIGSEGV, Segmentation fault.
    #0  object_get_class (obj=obj@entry=0x2) at qom/object.c:750
    #1  0x00007f9bb4180da1 in qemu_chr_fe_disconnect (be=<optimized out>) at chardev/char-fe.c:372
    #2  0x00007f9bb40d1100 in net_vhost_user_watch (chan=<optimized out>, cond=<optimized out>, opaque=<optimized out>) at net/vhost-user.c:188
    #3  0x00007f9baf97f99a in g_main_context_dispatch () from /usr/lib64/libglib-2.0.so.0
    #4  0x00007f9bb41d7ebc in glib_pollfds_poll () at util/main-loop.c:213
    #5  os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:261
    #6  main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:515
    #7  0x00007f9bb3e266a7 in main_loop () at vl.c:1917
    #8  main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4786

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
---
 net/vhost-user.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Marc-André Lureau July 21, 2017, 11:19 a.m. UTC | #1
Hi

On Fri, Jul 21, 2017 at 7:18 AM w00273186 <wangyunjian@huawei.com> wrote:

> From: Yunjian Wang <wangyunjian@huawei.com>
>
> "nc" is freed after hotplug vhost-user, but the watcher don't be removed.
> The QEMU crash when the watcher access the "nc" on socket disconnect.
>
>
This is actually your 3rd iteration of the patch.

Could you describe your changes since:
"[PATCH v2] vhost-user: fix watcher need be removed when vhost-user hotplug"

Thanks


>     Program received signal SIGSEGV, Segmentation fault.
>     #0  object_get_class (obj=obj@entry=0x2) at qom/object.c:750
>     #1  0x00007f9bb4180da1 in qemu_chr_fe_disconnect (be=<optimized out>)
> at chardev/char-fe.c:372
>     #2  0x00007f9bb40d1100 in net_vhost_user_watch (chan=<optimized out>,
> cond=<optimized out>, opaque=<optimized out>) at net/vhost-user.c:188
>     #3  0x00007f9baf97f99a in g_main_context_dispatch () from
> /usr/lib64/libglib-2.0.so.0
>     #4  0x00007f9bb41d7ebc in glib_pollfds_poll () at util/main-loop.c:213
>     #5  os_host_main_loop_wait (timeout=<optimized out>) at
> util/main-loop.c:261
>     #6  main_loop_wait (nonblocking=nonblocking@entry=0) at
> util/main-loop.c:515
>     #7  0x00007f9bb3e266a7 in main_loop () at vl.c:1917
>     #8  main (argc=<optimized out>, argv=<optimized out>, envp=<optimized
> out>) at vl.c:4786
>
> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
> ---
>  net/vhost-user.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/net/vhost-user.c b/net/vhost-user.c
> index 36f32a2..c23927c 100644
> --- a/net/vhost-user.c
> +++ b/net/vhost-user.c
> @@ -151,6 +151,10 @@ static void vhost_user_cleanup(NetClientState *nc)
>          s->vhost_net = NULL;
>      }
>      if (nc->queue_index == 0) {
> +        if (s->watch) {
> +            g_source_remove(s->watch);
> +            s->watch = 0;
> +        }
>          qemu_chr_fe_deinit(&s->chr, true);
>      }
>
> --
> 1.8.3.1
>
>
>
> --
Marc-André Lureau
Michael S. Tsirkin July 22, 2017, 12:34 a.m. UTC | #2
On Fri, Jul 21, 2017 at 11:19:04AM +0000, Marc-André Lureau wrote:
> Hi
> 
> On Fri, Jul 21, 2017 at 7:18 AM w00273186 <wangyunjian@huawei.com> wrote:
> 
>     From: Yunjian Wang <wangyunjian@huawei.com>
> 
>     "nc" is freed after hotplug vhost-user, but the watcher don't be removed.
>     The QEMU crash when the watcher access the "nc" on socket disconnect.
> 
> 
> 
> This is actually your 3rd iteration on the patch
> 
> Could your describe your changes since:
> "[PATCH v2] vhost-user: fix watcher need be removed when vhost-user hotplug"
> 
> Thanks

Yes but it's a 3-liner. That's way below the limit where you need
detailed change history. Does the patch make sense to you?

> 
>         Program received signal SIGSEGV, Segmentation fault.
>         #0  object_get_class (obj=obj@entry=0x2) at qom/object.c:750
>         #1  0x00007f9bb4180da1 in qemu_chr_fe_disconnect (be=<optimized out>)
>     at chardev/char-fe.c:372
>         #2  0x00007f9bb40d1100 in net_vhost_user_watch (chan=<optimized out>,
>     cond=<optimized out>, opaque=<optimized out>) at net/vhost-user.c:188
>         #3  0x00007f9baf97f99a in g_main_context_dispatch () from /usr/lib64/
>     libglib-2.0.so.0
>         #4  0x00007f9bb41d7ebc in glib_pollfds_poll () at util/main-loop.c:213
>         #5  os_host_main_loop_wait (timeout=<optimized out>) at util/
>     main-loop.c:261
>         #6  main_loop_wait (nonblocking=nonblocking@entry=0) at util/
>     main-loop.c:515
>         #7  0x00007f9bb3e266a7 in main_loop () at vl.c:1917
>         #8  main (argc=<optimized out>, argv=<optimized out>, envp=<optimized
>     out>) at vl.c:4786
> 
>     Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
>     ---
>      net/vhost-user.c | 4 ++++
>      1 file changed, 4 insertions(+)
> 
>     diff --git a/net/vhost-user.c b/net/vhost-user.c
>     index 36f32a2..c23927c 100644
>     --- a/net/vhost-user.c
>     +++ b/net/vhost-user.c
>     @@ -151,6 +151,10 @@ static void vhost_user_cleanup(NetClientState *nc)
>              s->vhost_net = NULL;
>          }
>          if (nc->queue_index == 0) {
>     +        if (s->watch) {
>     +            g_source_remove(s->watch);
>     +            s->watch = 0;
>     +        }
>              qemu_chr_fe_deinit(&s->chr, true);
>          }
> 
>     --
>     1.8.3.1
> 
> 
> 
> 
> --
> Marc-André Lureau
Marc-André Lureau July 22, 2017, 9:24 a.m. UTC | #3
On Sat, Jul 22, 2017 at 2:35 AM Michael S. Tsirkin <mst@redhat.com> wrote:

> On Fri, Jul 21, 2017 at 11:19:04AM +0000, Marc-André Lureau wrote:
> > Hi
> >
> > On Fri, Jul 21, 2017 at 7:18 AM w00273186 <wangyunjian@huawei.com>
> wrote:
> >
> >     From: Yunjian Wang <wangyunjian@huawei.com>
> >
> >     "nc" is freed after hotplug vhost-user, but the watcher don't be
> removed.
> >     The QEMU crash when the watcher access the "nc" on socket disconnect.
> >
> >
> >
> > This is actually your 3rd iteration on the patch
> >
> > Could your describe your changes since:
> > "[PATCH v2] vhost-user: fix watcher need be removed when vhost-user
> hotplug"
> >
> > Thanks
>
> Yes but it's a 3-liner. That's way below the limit where you need
> detailed change history. Does the patch make sense to you?
>
>
That's not all: the fact that he didn't come up with the same solution in
the first place, and that I didn't notice a problem with the previous
approach either, is enough to ask for some clarification on which approach
is best, and I bet there is something to say.

Furthermore, we would really benefit from having repeatable cases for this
kind of fix.



> >
> >         Program received signal SIGSEGV, Segmentation fault.
> >         #0  object_get_class (obj=obj@entry=0x2) at qom/object.c:750
> >         #1  0x00007f9bb4180da1 in qemu_chr_fe_disconnect (be=<optimized
> out>)
> >     at chardev/char-fe.c:372
> >         #2  0x00007f9bb40d1100 in net_vhost_user_watch (chan=<optimized
> out>,
> >     cond=<optimized out>, opaque=<optimized out>) at net/vhost-user.c:188
> >         #3  0x00007f9baf97f99a in g_main_context_dispatch () from
> /usr/lib64/
> >     libglib-2.0.so.0
> >         #4  0x00007f9bb41d7ebc in glib_pollfds_poll () at
> util/main-loop.c:213
> >         #5  os_host_main_loop_wait (timeout=<optimized out>) at util/
> >     main-loop.c:261
> >         #6  main_loop_wait (nonblocking=nonblocking@entry=0) at util/
> >     main-loop.c:515
> >         #7  0x00007f9bb3e266a7 in main_loop () at vl.c:1917
> >         #8  main (argc=<optimized out>, argv=<optimized out>,
> envp=<optimized
> >     out>) at vl.c:4786
> >
> >     Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
> >     ---
> >      net/vhost-user.c | 4 ++++
> >      1 file changed, 4 insertions(+)
> >
> >     diff --git a/net/vhost-user.c b/net/vhost-user.c
> >     index 36f32a2..c23927c 100644
> >     --- a/net/vhost-user.c
> >     +++ b/net/vhost-user.c
> >     @@ -151,6 +151,10 @@ static void vhost_user_cleanup(NetClientState
> *nc)
> >              s->vhost_net = NULL;
> >          }
> >          if (nc->queue_index == 0) {
> >     +        if (s->watch) {
> >     +            g_source_remove(s->watch);
> >     +            s->watch = 0;
> >     +        }
> >              qemu_chr_fe_deinit(&s->chr, true);
> >          }
> >
> >     --
> >     1.8.3.1
> >
> >
> >
> >
> > --
> > Marc-André Lureau
>
Michael S. Tsirkin July 23, 2017, 2:12 a.m. UTC | #4
On Sat, Jul 22, 2017 at 09:24:27AM +0000, Marc-André Lureau wrote:
> 
> 
> On Sat, Jul 22, 2017 at 2:35 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> 
>     On Fri, Jul 21, 2017 at 11:19:04AM +0000, Marc-André Lureau wrote:
>     > Hi
>     >
>     > On Fri, Jul 21, 2017 at 7:18 AM w00273186 <wangyunjian@huawei.com> wrote:
>     >
>     >     From: Yunjian Wang <wangyunjian@huawei.com>
>     >
>     >     "nc" is freed after hotplug vhost-user, but the watcher don't be
>     removed.
>     >     The QEMU crash when the watcher access the "nc" on socket disconnect.
>     >
>     >
>     >
>     > This is actually your 3rd iteration on the patch
>     >
>     > Could your describe your changes since:
>     > "[PATCH v2] vhost-user: fix watcher need be removed when vhost-user
>     hotplug"
>     >
>     > Thanks
> 
>     Yes but it's a 3-liner. That's way below the limit where you need
>     detailed change history. Does the patch make sense to you?
> 
> 
> 
> That's not all, the fact that he didn't come up with the same solution in the
> first place, and I didn't notice a problem either with the previous approach is
> enough to ask from some clarification on which approach is best, and I bet
> there is something to say.

I'm rather confused.  Looks like you were the one who asked for the change.
Really we want to attract new contributors and a small bugfix like this
seems like a very good way to start contributing. Changelog is already
3 times the size of the patch here. So I think we should just get the patch
reviewed and applied if correct. Do you plan to review it?

> Furthermore, we would really benefit from having repeatable cases for this kind
> of fixes.

I agree the disconnect path is not tested adequately, but I don't think we
are at a point where we should be asking for testcases for every
use-after-free bug that gets fixed.

>  
> 
>     >
>     >         Program received signal SIGSEGV, Segmentation fault.
>     >         #0  object_get_class (obj=obj@entry=0x2) at qom/object.c:750
>     >         #1  0x00007f9bb4180da1 in qemu_chr_fe_disconnect (be=<optimized
>     out>)
>     >     at chardev/char-fe.c:372
>     >         #2  0x00007f9bb40d1100 in net_vhost_user_watch (chan=<optimized
>     out>,
>     >     cond=<optimized out>, opaque=<optimized out>) at net/vhost-user.c:188
>     >         #3  0x00007f9baf97f99a in g_main_context_dispatch () from /usr/
>     lib64/
>     >     libglib-2.0.so.0
>     >         #4  0x00007f9bb41d7ebc in glib_pollfds_poll () at util/
>     main-loop.c:213
>     >         #5  os_host_main_loop_wait (timeout=<optimized out>) at util/
>     >     main-loop.c:261
>     >         #6  main_loop_wait (nonblocking=nonblocking@entry=0) at util/
>     >     main-loop.c:515
>     >         #7  0x00007f9bb3e266a7 in main_loop () at vl.c:1917
>     >         #8  main (argc=<optimized out>, argv=<optimized out>, envp=
>     <optimized
>     >     out>) at vl.c:4786
>     >
>     >     Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
>     >     ---
>     >      net/vhost-user.c | 4 ++++
>     >      1 file changed, 4 insertions(+)
>     >
>     >     diff --git a/net/vhost-user.c b/net/vhost-user.c
>     >     index 36f32a2..c23927c 100644
>     >     --- a/net/vhost-user.c
>     >     +++ b/net/vhost-user.c
>     >     @@ -151,6 +151,10 @@ static void vhost_user_cleanup(NetClientState
>     *nc)
>     >              s->vhost_net = NULL;
>     >          }
>     >          if (nc->queue_index == 0) {
>     >     +        if (s->watch) {
>     >     +            g_source_remove(s->watch);
>     >     +            s->watch = 0;
>     >     +        }
>     >              qemu_chr_fe_deinit(&s->chr, true);
>     >          }
>     >
>     >     --
>     >     1.8.3.1
>     >
>     >
>     >
>     >
>     > --
>     > Marc-André Lureau
> 
> --
> Marc-André Lureau


Why do you even bother including the patch if you use a client that
corrupts both the patch and the commit log formatting? It's not a good
example to give to new contributors, and it doesn't align well
with nit-picking about the same commit log, in my eyes.
Marc-André Lureau July 23, 2017, 10:06 a.m. UTC | #5
Hi

On Sun, Jul 23, 2017 at 4:12 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Sat, Jul 22, 2017 at 09:24:27AM +0000, Marc-André Lureau wrote:
>>
>>
>> On Sat, Jul 22, 2017 at 2:35 AM Michael S. Tsirkin <mst@redhat.com> wrote:
>>
>>     On Fri, Jul 21, 2017 at 11:19:04AM +0000, Marc-André Lureau wrote:
>>     > Hi
>>     >
>>     > On Fri, Jul 21, 2017 at 7:18 AM w00273186 <wangyunjian@huawei.com> wrote:
>>     >
>>     >     From: Yunjian Wang <wangyunjian@huawei.com>
>>     >
>>     >     "nc" is freed after hotplug vhost-user, but the watcher don't be
>>     removed.
>>     >     The QEMU crash when the watcher access the "nc" on socket disconnect.
>>     >
>>     >
>>     >
>>     > This is actually your 3rd iteration on the patch
>>     >
>>     > Could your describe your changes since:
>>     > "[PATCH v2] vhost-user: fix watcher need be removed when vhost-user
>>     hotplug"
>>     >
>>     > Thanks
>>
>>     Yes but it's a 3-liner. That's way below the limit where you need
>>     detailed change history. Does the patch make sense to you?
>>
>>
>>
>> That's not all, the fact that he didn't come up with the same solution in the
>> first place, and I didn't notice a problem either with the previous approach is
>> enough to ask from some clarification on which approach is best, and I bet
>> there is something to say.
>
> I'm rather confused.  Looks like you were the one who asked for the change.
> Really we want to attract new contributors and a small bugfix like this
> seems like a very good way to start contributing. Changelog is already
> 3 times the size of the patch here. So I think we should just get the patch
> reviewed and applied if correct. Do you plan to review it?

Indeed, but I totally forgot.

This situation wouldn't have happened if:
- the patch had been marked as v3
- the patch/mail had been annotated after --- to quickly
describe the change
- I had a better memory...


>
>> Furthermore, we would really benefit from having repeatable cases for this kind
>> of fixes.
>
> I agree disconnect path is but tested adequately but I don't think we
> are at a point where we should be asking for testcases for every use
> after free bug that gets fixed.

Not to write a test case, but at least to document what triggered this
path. Since Yunjian gave it in the previous reply, and I forgot that
too, it would be best to have it in the commit message, agree?

Patch

diff --git a/net/vhost-user.c b/net/vhost-user.c
index 36f32a2..c23927c 100644
--- a/net/vhost-user.c
+++ b/net/vhost-user.c
@@ -151,6 +151,10 @@ static void vhost_user_cleanup(NetClientState *nc)
         s->vhost_net = NULL;
     }
     if (nc->queue_index == 0) {
+        if (s->watch) {
+            g_source_remove(s->watch);
+            s->watch = 0;
+        }
         qemu_chr_fe_deinit(&s->chr, true);
     }
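
For readers following the fix, here is a minimal, self-contained sketch of
the pattern the patch applies: remove the watch before the state it points
to is freed, so a later dispatch cannot call back into freed memory. This is
a toy model and not QEMU code. watch_add(), watch_remove(), dispatch() and
struct toy_state are hypothetical stand-ins for the main-loop source that
s->watch identifies, for g_source_remove(), and for the dispatch step seen
in the backtrace; toy_cleanup() only mirrors the shape of the hunk above.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a main-loop source: one callback slot with an id. */
typedef void (*watch_fn)(void *opaque);

static struct {
    unsigned id;      /* 0 means "no watch registered" */
    watch_fn fn;
    void *opaque;     /* pointer handed back to the callback */
} slot;

static unsigned next_id = 1;

/* Register a callback; returns a non-zero id (hypothetical helper). */
static unsigned watch_add(watch_fn fn, void *opaque)
{
    slot.id = next_id++;
    slot.fn = fn;
    slot.opaque = opaque;
    return slot.id;
}

/* Unregister by id; plays the role g_source_remove() has in the patch. */
static int watch_remove(unsigned id)
{
    if (id == 0 || slot.id != id) {
        return 0;
    }
    slot.id = 0;
    slot.fn = NULL;
    slot.opaque = NULL;
    return 1;
}

/* Run the registered callback, if any; returns how many callbacks ran. */
static int dispatch(void)
{
    if (slot.id == 0) {
        return 0;
    }
    slot.fn(slot.opaque);
    return 1;
}

/* Toy counterpart of the per-client state that owns the watch. */
struct toy_state {
    unsigned watch;
    int disconnects;
};

static void on_hangup(void *opaque)
{
    struct toy_state *s = opaque;
    s->disconnects++;   /* would touch freed memory if the watch outlived s */
}

/* Same shape as the patch: drop the watch first, then release the state. */
static void toy_cleanup(struct toy_state *s)
{
    if (s->watch) {
        watch_remove(s->watch);
        s->watch = 0;
    }
    free(s);
}

/* Returns the number of callbacks that still fire after cleanup. */
static int demo_callbacks_after_cleanup(void)
{
    struct toy_state *s = calloc(1, sizeof(*s));
    assert(s != NULL);
    s->watch = watch_add(on_hangup, s);
    toy_cleanup(s);
    return dispatch();
}
```

In this model, demo_callbacks_after_cleanup() returns 0 because the slot is
cleared before free(). Without the watch_remove() call in toy_cleanup(),
dispatch() would invoke on_hangup() on a dangling pointer, which is the same
class of bug as the crash reported in the commit message.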