Message ID | 20180316140508.863778-1-eblake@redhat.com |
---|---|
State | New |
On 16 March 2018 at 14:04, Eric Blake <eblake@redhat.com> wrote:
> The following changes since commit 3788c7b6e56fa34ee2a73e41706eb2a2447ba75a:
>
> Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging (2018-03-16 11:05:03 +0000)
>
> are available in the Git repository at:
>
> git://repo.or.cz/qemu/ericb.git tags/pull-qapi-2018-03-12-v2
>
> for you to fetch changes up to 75eb57e3ed3682f011a6694863044e8b143a9821:
>
> qapi: Pass '-u' when doing non-silent diff (2018-03-16 09:00:07 -0500)
>
> v2: rebase on Paolo's queue (should fix tests that failed on v1),
> fix rebase conflicts, add two more related patches
> Sending only the changed patches from v1
>
> ----------------------------------------------------------------
> qapi patches for 2018-03-12, 2.12 softfreeze
>
> - Marc-André Lureau: 0/4 qapi: generate a literal qobject for introspection
> - Max Reitz: 0/7 block: Handle null backing link
> - Daniel P. Berrange: chardev: tcp: postpone TLS work until machine done
> - Peter Xu: 00/23 QMP: out-of-band (OOB) execution support
> - Vladimir Sementsov-Ogievskiy: 0/2 block latency histogram
> - Eric Blake: qapi: Pass '-u' when doing non-silent diff
>
> ----------------------------------------------------------------

Hi. I get a bunch of test assertion failures with this:

ppc64 host:

QTEST_QEMU_BINARY=nios2-softmmu/qemu-system-nios2 QTEST_QEMU_IMG=qemu-img
MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}
gtester -k --verbose -m=quick tests/qmp-test tests/device-introspect-test
tests/qom-test tests/test-hmp
TEST: tests/qmp-test... (pid=49431)
/nios2/qmp/protocol: OK
/nios2/qmp/oob: OK
/nios2/qmp/query-status: OK
/nios2/qmp/query-block: OK
/nios2/qmp/query-blockstats: OK
/nios2/qmp/query-block-jobs: OK
/nios2/qmp/query-named-block-nodes:
qemu-system-nios2: /home/pm215/qemu/chardev/char-io.c:91: io_watch_poll_finalize: Assertion `iwp->src == ((void *)0)' failed.
Broken pipe
FAIL

FreeBSD host:

TEST: tests/qmp-test... (pid=68428)
/aarch64/qmp/protocol: OK
/aarch64/qmp/oob: OK
[...]
/aarch64/qmp/query-iothreads:
Assertion failed: (iwp->src == NULL), function io_watch_poll_finalize, file /root/qemu/chardev/char-io.c, line 91.
Broken pipe
FAIL
GTester: last random seed: R02S60296bacb6aea7a3d748811fc486c71e
(pid=68462)

OpenBSD host:

/cris/qmp/qom-list-types:
assertion "iwp->src == NULL" failed: file "/home/qemu/chardev/char-io.c", line 91, function "io_watch_poll_finalize"
Broken pipe
FAIL

NetBSD host:

TEST: tests/tpm-crb-test... (pid=21337)
/i386/tpm-crb/test: OK
Unexpected error in qio_channel_socket_readv() at /root/qemu/io/channel-socket.c:494:
FAIL: tests/tpm-crb-test
TEST: tests/tpm-tis-test... (pid=14763)
/i386/tpm-tis/test_check_localities: OK
/i386/tpm-tis/test_check_access_reg: OK
/i386/tpm-tis/test_check_access_reg_seize: OK
/i386/tpm-tis/test_check_access_reg_release: OK
/i386/tpm-tis/test_check_transmit: OK
Unexpected error in qio_channel_socket_readv() at /root/qemu/io/channel-socket.c:494:
FAIL: tests/tpm-tis-test

SPARC host:

/cris/qmp/query-memory-size-summary:
qemu-system-cris: /srv/pm215/qemu/chardev/char-io.c:91: io_watch_poll_finalize: Assertion `iwp->src == NULL' failed.
Broken pipe
FAIL

x86/Linux host:

/hppa/qmp/query-memdev:
qemu-system-hppa: /home/petmay01/linaro/qemu-for-merges/chardev/char-io.c:91: io_watch_poll_finalize: Assertion `iwp->src == NULL' failed.
Broken pipe
FAIL

aarch64 host:

qemu-system-alpha: /home/pm215/qemu/chardev/char-io.c:91: io_watch_poll_finalize: Assertion `iwp->src == ((void *)0)' failed.

One or two of the build hosts did pass, so that plus the varying
tests which failed suggests that the iwp->src assert is an
intermittent or timing based one. The tpm error on NetBSD
is probably a separate issue.

thanks
-- PMM
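When going through captured logs from several build hosts, the two failure signatures reported above can be told apart mechanically. The following is only a sketch under stated assumptions: `classify_failure` and its category labels are hypothetical names (not part of QEMU or gtester), and each test run's output is assumed to have been saved to a file. It matches on the fixed string `iwp->src == `, which appears in every variant of the assertion message above regardless of host libc formatting.

```shell
# Hypothetical helper, not part of QEMU: classify a saved test log by the
# failure signature it contains. The category names are made up for this sketch.
classify_failure() {
    # $1: path to a captured test log
    if grep -qF -- 'iwp->src == ' "$1"; then
        # The io_watch_poll_finalize assertion seen on most hosts
        echo "iowatch-assert"
    elif grep -qF -- 'qio_channel_socket_readv' "$1"; then
        # The tpm test error seen on the NetBSD host
        echo "qio-readv"
    else
        echo "other"
    fi
}

# Usage (assumes the log was saved beforehand):
#   classify_failure /path/to/saved-test.log
```

Checking for the assertion first means a log that contains both signatures is counted under the assertion, which is the failure this thread is mainly about.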
On Sat, Mar 17, 2018 at 12:10:35PM +0000, Peter Maydell wrote:
> On 16 March 2018 at 14:04, Eric Blake <eblake@redhat.com> wrote:
> > The following changes since commit 3788c7b6e56fa34ee2a73e41706eb2a2447ba75a:
> >
> > Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging (2018-03-16 11:05:03 +0000)
> >
> > are available in the Git repository at:
> >
> > git://repo.or.cz/qemu/ericb.git tags/pull-qapi-2018-03-12-v2
> >
> > for you to fetch changes up to 75eb57e3ed3682f011a6694863044e8b143a9821:
> >
> > qapi: Pass '-u' when doing non-silent diff (2018-03-16 09:00:07 -0500)
> >
> > v2: rebase on Paolo's queue (should fix tests that failed on v1),
> > fix rebase conflicts, add two more related patches
> > Sending only the changed patches from v1
> >
> > ----------------------------------------------------------------
> > qapi patches for 2018-03-12, 2.12 softfreeze
> >
> > - Marc-André Lureau: 0/4 qapi: generate a literal qobject for introspection
> > - Max Reitz: 0/7 block: Handle null backing link
> > - Daniel P. Berrange: chardev: tcp: postpone TLS work until machine done
> > - Peter Xu: 00/23 QMP: out-of-band (OOB) execution support
> > - Vladimir Sementsov-Ogievskiy: 0/2 block latency histogram
> > - Eric Blake: qapi: Pass '-u' when doing non-silent diff
> >
> > ----------------------------------------------------------------
>
> Hi. I get a bunch of test assertion failures with this:
>
> ppc64 host:
>
> QTEST_QEMU_BINARY=nios2-softmmu/qemu-system-nios2 QTEST_QEMU_IMG=qemu-img
> MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}
> gtester -k --verbose -m=quick tests/qmp-test tests/device-introspect-test
> tests/qom-test tests/test-hmp
> TEST: tests/qmp-test... (pid=49431)
> /nios2/qmp/protocol: OK
> /nios2/qmp/oob: OK
> /nios2/qmp/query-status: OK
> /nios2/qmp/query-block: OK
> /nios2/qmp/query-blockstats: OK
> /nios2/qmp/query-block-jobs: OK
> /nios2/qmp/query-named-block-nodes:
> qemu-system-nios2: /home/pm215/qemu/chardev/char-io.c:91: io_watch_poll_finalize: Assertion `iwp->src == ((void *)0)' failed.
> Broken pipe
> FAIL
>
> FreeBSD host:
>
> TEST: tests/qmp-test... (pid=68428)
> /aarch64/qmp/protocol: OK
> /aarch64/qmp/oob: OK
> [...]
> /aarch64/qmp/query-iothreads:
> Assertion failed: (iwp->src == NULL), function io_watch_poll_finalize, file /root/qemu/chardev/char-io.c, line 91.
> Broken pipe
> FAIL
> GTester: last random seed: R02S60296bacb6aea7a3d748811fc486c71e
> (pid=68462)
>
> OpenBSD host:
>
> /cris/qmp/qom-list-types:
> assertion "iwp->src == NULL" failed: file "/home/qemu/chardev/char-io.c", line 91, function "io_watch_poll_finalize"
> Broken pipe
> FAIL
>
> NetBSD host:
>
> TEST: tests/tpm-crb-test... (pid=21337)
> /i386/tpm-crb/test: OK
> Unexpected error in qio_channel_socket_readv() at /root/qemu/io/channel-socket.c:494:
> FAIL: tests/tpm-crb-test
> TEST: tests/tpm-tis-test... (pid=14763)
> /i386/tpm-tis/test_check_localities: OK
> /i386/tpm-tis/test_check_access_reg: OK
> /i386/tpm-tis/test_check_access_reg_seize: OK
> /i386/tpm-tis/test_check_access_reg_release: OK
> /i386/tpm-tis/test_check_transmit: OK
> Unexpected error in qio_channel_socket_readv() at /root/qemu/io/channel-socket.c:494:
> FAIL: tests/tpm-tis-test
>
> SPARC host:
>
> /cris/qmp/query-memory-size-summary:
> qemu-system-cris: /srv/pm215/qemu/chardev/char-io.c:91: io_watch_poll_finalize: Assertion `iwp->src == NULL' failed.
> Broken pipe
> FAIL
>
> x86/Linux host:
>
> /hppa/qmp/query-memdev:
> qemu-system-hppa: /home/petmay01/linaro/qemu-for-merges/chardev/char-io.c:91: io_watch_poll_finalize: Assertion `iwp->src == NULL' failed.
> Broken pipe
> FAIL
>
> aarch64 host:
>
> qemu-system-alpha: /home/pm215/qemu/chardev/char-io.c:91: io_watch_poll_finalize: Assertion `iwp->src == ((void *)0)' failed.
>
> One or two of the build hosts did pass, so that plus the varying
> tests which failed suggests that the iwp->src assert is an
> intermittent or timing based one. The tpm error on NetBSD
> is probably a separate issue.

I think I still need this to be squashed into "monitor: allow using IO
thread for parsing", which I dropped during respin from v7 to v8:

diff --git a/monitor.c b/monitor.c
index f9ef3e5266..121194111f 100644
--- a/monitor.c
+++ b/monitor.c
@@ -4556,6 +4556,11 @@ void monitor_init(Chardev *chr, int flags)
     qemu_chr_fe_set_echo(&mon->chr, true);
     json_message_parser_init(&mon->qmp.parser, handle_qmp_command);
     if (mon->use_io_thr) {
+        /*
+         * Make sure the old iowatch be gone. It's possible when
+         * e.g. the chardev is in client mode, with wait=on.
+         */
+        remove_fd_in_watch(chr);
         /*
          * We can't call qemu_chr_fe_set_handlers() directly here
          * since during the procedure the chardev will be active

I thought there would be no pending task on the main thread after the
QIO and CHARDEV fixes, but I missed the most general io watch, so we
possibly still need the line.

The above should fix the assertion problem, but I'm not sure whether
it can fix the QIO issue, since I haven't seen that before (and I
can't reproduce it in my environment either).

I hope the fix works for us. But in any case, please feel free to
drop the series if needed. Sorry for the trouble.
On 03/19/2018 04:26 AM, Peter Xu wrote:

>>> for you to fetch changes up to 75eb57e3ed3682f011a6694863044e8b143a9821:
>>>
>>> qapi: Pass '-u' when doing non-silent diff (2018-03-16 09:00:07 -0500)
>>>

>> Hi. I get a bunch of test assertion failures with this:
>>
>> ppc64 host:
>>
>> QTEST_QEMU_BINARY=nios2-softmmu/qemu-system-nios2 QTEST_QEMU_IMG=qemu-img
>> MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}
>> gtester -k --verbose -m=quick tests/qmp-test tests/device-introspect-test
>> tests/qom-test tests/test-hmp
>> TEST: tests/qmp-test... (pid=49431)
>> /nios2/qmp/protocol: OK
>> /nios2/qmp/oob: OK
>> /nios2/qmp/query-status: OK
>> /nios2/qmp/query-block: OK
>> /nios2/qmp/query-blockstats: OK
>> /nios2/qmp/query-block-jobs: OK
>> /nios2/qmp/query-named-block-nodes:
>> qemu-system-nios2: /home/pm215/qemu/chardev/char-io.c:91: io_watch_poll_finalize: Assertion `iwp->src == ((void *)0)' failed.
>> Broken pipe
>> FAIL

I haven't been able to reproduce the testsuite failures on my Linux box,
but if it's a race, then that doesn't make me any more confident about
what it takes to reproduce and/or fix the race.

>> One or two of the build hosts did pass, so that plus the varying
>> tests which failed suggests that the iwp->src assert is an
>> intermittent or timing based one. The tpm error on NetBSD
>> is probably a separate issue.
>
> I think I still need this to be squashed into "monitor: allow using IO
> thread for parsing", which I dropped during respin from v7 to v8:
>
> diff --git a/monitor.c b/monitor.c
> index f9ef3e5266..121194111f 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -4556,6 +4556,11 @@ void monitor_init(Chardev *chr, int flags)
>      qemu_chr_fe_set_echo(&mon->chr, true);
>      json_message_parser_init(&mon->qmp.parser, handle_qmp_command);
>      if (mon->use_io_thr) {
> +        /*
> +         * Make sure the old iowatch be gone. It's possible when
> +         * e.g. the chardev is in client mode, with wait=on.
> +         */
> +        remove_fd_in_watch(chr);
>          /*
>           * We can't call qemu_chr_fe_set_handlers() directly here
>           * since during the procedure the chardev will be active
>
> I thought there would be no pending task on the main thread after the
> QIO and CHARDEV fixes, but I missed the most general io watch, so we
> possibly still need the line.

So, should I squash in the fix and keep OOB as part of my v3 attempt, or
are we getting close enough to rc0 that my qapi v3 pull request should
just drop OOB, and save that as a feature for 2.13 instead?

>
> The above should fix the assertion problem, but I'm not sure whether
> it can fix the QIO issue, since I haven't seen that before (and I
> can't reproduce it in my environment either).
>
> I hope the fix works for us. But in any case, please feel free to
> drop the series if needed. Sorry for the trouble.
>
On 03/19/2018 09:57 AM, Eric Blake wrote:
> On 03/19/2018 04:26 AM, Peter Xu wrote:
>
>>>> for you to fetch changes up to
>>>> 75eb57e3ed3682f011a6694863044e8b143a9821:
>>>>
>>>> qapi: Pass '-u' when doing non-silent diff (2018-03-16 09:00:07
>>>> -0500)
>>>>
>
>>> Hi. I get a bunch of test assertion failures with this:
>>>
>
> I haven't been able to reproduce the testsuite failures on my Linux box,
> but if it's a race, then that doesn't make me any more confident about
> what it takes to reproduce and/or fix the race.

Okay, my simple builds on just x86_64-softmmu weren't hitting it, but my
'build all binaries' tree seems to be hitting the same thing:

GTESTER check-qtest-ppcemb
qemu-system-ppcemb: chardev/char-io.c:91: io_watch_poll_finalize: Assertion `iwp->src == NULL' failed.
Broken pipe
GTester: last random seed: R02S74d45e64b38428eddd131a5c1b4c878c
make: *** [/home/eblake/qemu-tmp/tests/Makefile.include:878: check-qtest-ppcemb] Error 1

so I'm now testing if your squash makes a difference, now that it looks
like I'm reproducing the problem.
On Mon, Mar 19, 2018 at 10:27:41AM -0500, Eric Blake wrote:
> On 03/19/2018 09:57 AM, Eric Blake wrote:
> > On 03/19/2018 04:26 AM, Peter Xu wrote:
> >
> > > > > for you to fetch changes up to
> > > > > 75eb57e3ed3682f011a6694863044e8b143a9821:
> > > > >
> > > > > qapi: Pass '-u' when doing non-silent diff (2018-03-16
> > > > > 09:00:07 -0500)
> > > > >
> >
> > > > Hi. I get a bunch of test assertion failures with this:
> > > >
> >
> > I haven't been able to reproduce the testsuite failures on my Linux box,
> > but if it's a race, then that doesn't make me any more confident about
> > what it takes to reproduce and/or fix the race.
>
> Okay, my simple builds on just x86_64-softmmu weren't hitting it, but my
> 'build all binaries' tree seems to be hitting the same thing:
>
> GTESTER check-qtest-ppcemb
> qemu-system-ppcemb: chardev/char-io.c:91: io_watch_poll_finalize: Assertion
> `iwp->src == NULL' failed.
> Broken pipe
> GTester: last random seed: R02S74d45e64b38428eddd131a5c1b4c878c
> make: *** [/home/eblake/qemu-tmp/tests/Makefile.include:878:
> check-qtest-ppcemb] Error 1
>
> so I'm now testing if your squash makes a difference, now that it looks like
> I'm reproducing the problem.

Exactly what I encountered. My old tests on v8 were not strong enough
(fewer binaries, less concurrency). I reproduced it easily when I
didn't specify --target-list (so all binaries) and then ran the tests
with more concurrency (-j8).

However, I still never reproduced the QIO problem even with that. I
suspect that's a harder-to-trigger race, and it might not even be
related to OOB (but I'm not sure).
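Since the failure is intermittent, one way to apply the reproduction recipe above (all targets built, tests run at -j8) is to rerun the test command until it fails. This is a minimal sketch, not something from the thread: `run_until_fail` is a hypothetical helper name, and the `make -j8 check` invocation in the usage comment assumes a build tree that was configured without --target-list.

```shell
# Hypothetical helper, not part of QEMU: rerun a command up to N times,
# stopping at the first failure and reporting which attempt failed.
run_until_fail() {
    max=$1          # maximum number of attempts
    shift           # the remaining arguments are the command to run
    i=1
    while [ "$i" -le "$max" ]; do
        if ! "$@"; then
            echo "failed on attempt $i"
            return 1
        fi
        i=$((i + 1))
    done
    echo "passed $max attempts"
    return 0
}

# Usage (assumption: run from a QEMU build tree with all targets built):
#   run_until_fail 20 make -j8 check
```

A run that passes every attempt does not prove the race is gone; it only lowers the odds that the assertion is still reachable under that build configuration and load.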