Message ID | 20170227162501.29280-1-ehabkost@redhat.com |
---|---|
State | New |
On 27 February 2017 at 16:24, Eduardo Habkost <ehabkost@redhat.com> wrote:
> The following changes since commit 3b1d8169844fafee184366b0e0d7080534758b4d:
>
>   tests-aio-multithread: use atomic_read properly (2017-02-27 12:54:08 +0000)
>
> are available in the git repository at:
>
>   git://github.com/ehabkost/qemu.git tags/x86-pull-request
>
> for you to fetch changes up to b8097deb359bbbd92592b9670adfe9e245b2d0bd:
>
>   i386: Improve query-cpu-model-expansion full mode (2017-02-27 13:23:35 -0300)
>
> ----------------------------------------------------------------
> x86 queue, 2017-02-27
>
> "-cpu max" and query-cpu-model-expansion support for x86. This
> should be the last x86 pull request before 2.9 soft freeze.
>
> ----------------------------------------------------------------
>
> Eduardo Habkost (11):
>   i386: Unset cannot_destroy_with_object_finalize_yet on "host" model
>   i386: Add ordering field to CPUClass
>   i386: Rename X86CPU::host_features to X86CPU::max_features
>   i386: Reorganize and document CPUID initialization steps
>   qapi-schema: Comment about full expansion of non-migration-safe models
>   i386: Create "max" CPU model
>   i386: Make "max" model not use any host CPUID info on TCG
>   i386: Don't set CPUClass::cpu_def on "max" model
>   i386: Define static "base" CPU model
>   i386: Implement query-cpu-model-expansion QMP command
>   i386: Improve query-cpu-model-expansion full mode

This seemed to hang in 'make check' on x86. I didn't have time to
investigate, so I'll have another go with it later.

thanks
-- PMM

On Mon, Feb 27, 2017 at 07:16:03PM +0000, Peter Maydell wrote:
[...]
> This seemed to hang in 'make check' on x86. I didn't have time to
> investigate, so I'll have another go with it later.

I have been seeing lots of travis-ci failures recently because of
'make check' taking too long. The jobs seem to timeout while
running tests/test-aio-multithread.

I see similar failures on recent qemu.git master jobs, too:
https://travis-ci.org/qemu/qemu/builds

Are they similar to the hang you have seen?

On Mon, 27 Feb 2017 16:41:34 -0300
Eduardo Habkost <ehabkost@redhat.com> wrote:

> On Mon, Feb 27, 2017 at 07:16:03PM +0000, Peter Maydell wrote:
> [...]
> > This seemed to hang in 'make check' on x86. I didn't have time to
> > investigate, so I'll have another go with it later.
>
> I have been seeing lots of travis-ci failures recently because of
> 'make check' taking too long. The jobs seem to timeout while
> running tests/test-aio-multithread.
>
> I see similar failures on recent qemu.git master jobs, too:
> https://travis-ci.org/qemu/qemu/builds
>
> Are they similar to the hang you have seen?
>

The problem with tests/test-aio-multithread got fixed today with the
following commit:

http://git.qemu-project.org/?p=qemu.git;a=commit;h=3b1d8169844fafee184366b0e0d7080534758b4d

On Mon, Feb 27, 2017 at 08:57:53PM +0100, Greg Kurz wrote:
> On Mon, 27 Feb 2017 16:41:34 -0300
> Eduardo Habkost <ehabkost@redhat.com> wrote:
>
> > On Mon, Feb 27, 2017 at 07:16:03PM +0000, Peter Maydell wrote:
> > [...]
> > > This seemed to hang in 'make check' on x86. I didn't have time to
> > > investigate, so I'll have another go with it later.
> >
> > I have been seeing lots of travis-ci failures recently because of
> > 'make check' taking too long. The jobs seem to timeout while
> > running tests/test-aio-multithread.
> >
> > I see similar failures on recent qemu.git master jobs, too:
> > https://travis-ci.org/qemu/qemu/builds
> >
> > Are they similar to the hang you have seen?
> >
>
> The problem with tests/test-aio-multithread got fixed today with the
> following commit:
>
> http://git.qemu-project.org/?p=qemu.git;a=commit;h=3b1d8169844fafee184366b0e0d7080534758b4d

I see. The failures really stopped after I rebased to that
commit.

I saw a failure on x86-pull-request that seemed to be because of
vhost-user-test[1]. However, after restarting the job, it
passed[2].

(I can't confirm the actual cause of the first failure, because
the log files seem to be temporarily unavailable on Travis.)

Now, the problem with vhost-user-test is that it is not
guaranteed to work without KVM[3], so we need to add a mechanism
to skip vhost-user-test (or at least make errors non-fatal) if
KVM is unavailable.

[1] https://travis-ci.org/ehabkost/qemu/builds/205849528
[2] https://travis-ci.org/ehabkost/qemu/builds/205850173
[3] See commit cdafe929615ec5eca71bcd5a3d12bab5678e5886

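
The skip mechanism proposed above could take roughly the following shape. This is a minimal sketch, not QEMU code: the helper names `kvm_usable` and `plan_test` are hypothetical, and it assumes that KVM availability can be approximated by checking that `/dev/kvm` is readable and writable by the current user.

```python
import os


def kvm_usable(dev_path="/dev/kvm"):
    """Return True if dev_path exists and the current user can read and write it.

    Assumption: an accessible /dev/kvm is treated as "KVM is available".
    """
    return os.access(dev_path, os.R_OK | os.W_OK)


def plan_test(needs_kvm, dev_path="/dev/kvm"):
    """Decide whether a test should be run or skipped (hypothetical helper).

    A test that is only reliable under KVM is skipped when KVM is not
    usable, instead of being run and reported as a hard failure.
    """
    if needs_kvm and not kvm_usable(dev_path):
        return "skip"
    return "run"


if __name__ == "__main__":
    # vhost-user-test is the test the thread describes as not guaranteed
    # to work without KVM.
    print("vhost-user-test:", plan_test(needs_kvm=True))
```

A wrapper like this trades coverage for stability: on hosts without KVM the test simply does not run, which is the first of the two options mentioned above (the other being to run it but treat errors as non-fatal).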
On 28 February 2017 at 19:12, Eduardo Habkost <ehabkost@redhat.com> wrote:
> I saw a failure on x86-pull-request that seemed to be because of
> vhost-user-test[1]. However, after restarting the job, it
> passed[2].

I'm currently processing a patch which (hopefully) fixes
vhost-user-test's intermittent failures:
http://patchwork.ozlabs.org/patch/732747/

thanks
-- PMM

On 27 February 2017 at 19:16, Peter Maydell <peter.maydell@linaro.org> wrote:
> On 27 February 2017 at 16:24, Eduardo Habkost <ehabkost@redhat.com> wrote:
>> The following changes since commit 3b1d8169844fafee184366b0e0d7080534758b4d:
>>
>>   tests-aio-multithread: use atomic_read properly (2017-02-27 12:54:08 +0000)
>>
>> are available in the git repository at:
>>
>>   git://github.com/ehabkost/qemu.git tags/x86-pull-request
>>
>> for you to fetch changes up to b8097deb359bbbd92592b9670adfe9e245b2d0bd:
>>
>>   i386: Improve query-cpu-model-expansion full mode (2017-02-27 13:23:35 -0300)
>>
>> ----------------------------------------------------------------
>> x86 queue, 2017-02-27
>>
>> "-cpu max" and query-cpu-model-expansion support for x86. This
>> should be the last x86 pull request before 2.9 soft freeze.
>>
>> ----------------------------------------------------------------
>>
>> Eduardo Habkost (11):
>>   i386: Unset cannot_destroy_with_object_finalize_yet on "host" model
>>   i386: Add ordering field to CPUClass
>>   i386: Rename X86CPU::host_features to X86CPU::max_features
>>   i386: Reorganize and document CPUID initialization steps
>>   qapi-schema: Comment about full expansion of non-migration-safe models
>>   i386: Create "max" CPU model
>>   i386: Make "max" model not use any host CPUID info on TCG
>>   i386: Don't set CPUClass::cpu_def on "max" model
>>   i386: Define static "base" CPU model
>>   i386: Implement query-cpu-model-expansion QMP command
>>   i386: Improve query-cpu-model-expansion full mode
>
> This seemed to hang in 'make check' on x86. I didn't have time to
> investigate, so I'll have another go with it later.

Retrying worked ok, so I've applied it, on the assumption
the hang was not something in this patchset.

thanks
-- PMM

On Tue, Feb 28, 2017 at 07:17:39PM +0000, Peter Maydell wrote:
> On 28 February 2017 at 19:12, Eduardo Habkost <ehabkost@redhat.com> wrote:
> > I saw a failure on x86-pull-request that seemed to be because of
> > vhost-user-test[1]. However, after restarting the job, it
> > passed[2].
>
> I'm currently processing a patch which (hopefully) fixes
> vhost-user-test's intermittent failures:
> http://patchwork.ozlabs.org/patch/732747/

I'm not sure it will solve the issues on hosts without KVM. As
far as I can see, if vhost-user-test is working without KVM, it
is working by accident.

See the thread at:
http://www.mail-archive.com/qemu-devel@nongnu.org/msg394258.html

On 02/03/2017 16:39, Eduardo Habkost wrote:
> On Tue, Feb 28, 2017 at 07:17:39PM +0000, Peter Maydell wrote:
>> On 28 February 2017 at 19:12, Eduardo Habkost <ehabkost@redhat.com> wrote:
>>> I saw a failure on x86-pull-request that seemed to be because of
>>> vhost-user-test[1]. However, after restarting the job, it
>>> passed[2].
>>
>> I'm currently processing a patch which (hopefully) fixes
>> vhost-user-test's intermittent failures:
>> http://patchwork.ozlabs.org/patch/732747/
>
> I'm not sure it will solve the issues on hosts without KVM. As
> far as I can see, if vhost-user-test is working without KVM, it
> is working by accident.

Well, it has worked for a while before the patch. As long as you don't
overwrite code with vhost-user data and then try to run that data,
things will be fine. Just not something you can use in practice, but it
works in tests.

Paolo

On Thu, Mar 02, 2017 at 04:54:26PM +0100, Paolo Bonzini wrote:
> On 02/03/2017 16:39, Eduardo Habkost wrote:
> > On Tue, Feb 28, 2017 at 07:17:39PM +0000, Peter Maydell wrote:
> >> On 28 February 2017 at 19:12, Eduardo Habkost <ehabkost@redhat.com> wrote:
> >>> I saw a failure on x86-pull-request that seemed to be because of
> >>> vhost-user-test[1]. However, after restarting the job, it
> >>> passed[2].
> >>
> >> I'm currently processing a patch which (hopefully) fixes
> >> vhost-user-test's intermittent failures:
> >> http://patchwork.ozlabs.org/patch/732747/
> >
> > I'm not sure it will solve the issues on hosts without KVM. As
> > far as I can see, if vhost-user-test is working without KVM, it
> > is working by accident.
>
> Well, it has worked for a while before the patch.

Before which patch?

> As long as you don't
> overwrite code with vhost-user data and then try to run that data,
> things will be fine. Just not something you can use in practice, but it
> works in tests.

Earlier this week I saw the wait_for_fds assertion (mentioned at
the thread above) on a travis-ci job again, and I was suspecting
it was the same vhost_set_mem_table() + TCG error seen at the
thread above.

Unfortunately travis-ci overwrote the previous logs when I
restarted the job, and now I can't confirm if it was really the
same vhost_set_mem_table() error. I guess we'll have to simply
wait and see if it fails again.

On 2 March 2017 at 16:07, Eduardo Habkost <ehabkost@redhat.com> wrote:
> Earlier this week I saw the wait_for_fds assertion (mentioned at
> the thread above) on a travis-ci job again, and I was suspecting
> it was the same vhost_set_mem_table() + TCG error seen at the
> thread above.
>
> Unfortunately travis-ci overwrote the previous logs when I
> restarted the job, and now I can't confirm if it was really the
> same vhost_set_mem_table() error. I guess we'll have to simply
> wait and see if it fails again.

Here's a fresh new travis job failing like that:
https://travis-ci.org/qemu/qemu/builds/207005044
(specifically https://travis-ci.org/qemu/qemu/jobs/207005050)

thanks
-- PMM

On 02/03/2017 17:07, Eduardo Habkost wrote:
> On Thu, Mar 02, 2017 at 04:54:26PM +0100, Paolo Bonzini wrote:
>> On 02/03/2017 16:39, Eduardo Habkost wrote:
>>> On Tue, Feb 28, 2017 at 07:17:39PM +0000, Peter Maydell wrote:
>>>> On 28 February 2017 at 19:12, Eduardo Habkost <ehabkost@redhat.com> wrote:
>>>>> I saw a failure on x86-pull-request that seemed to be because of
>>>>> vhost-user-test[1]. However, after restarting the job, it
>>>>> passed[2].
>>>>
>>>> I'm currently processing a patch which (hopefully) fixes
>>>> vhost-user-test's intermittent failures:
>>>> http://patchwork.ozlabs.org/patch/732747/
>>>
>>> I'm not sure it will solve the issues on hosts without KVM. As
>>> far as I can see, if vhost-user-test is working without KVM, it
>>> is working by accident.
>>
>> Well, it has worked for a while before the patch.
>
> Before which patch?

The one mentioned in the commit message by Marc-André:
b0a335e351103bf92f3f9d0bd5759311be8156ac.

Paolo

>
>> As long as you don't
>> overwrite code with vhost-user data and then try to run that data,
>> things will be fine. Just not something you can use in practice, but it
>> works in tests.
>
> Earlier this week I saw the wait_for_fds assertion (mentioned at
> the thread above) on a travis-ci job again, and I was suspecting
> it was the same vhost_set_mem_table() + TCG error seen at the
> thread above.
>
> Unfortunately travis-ci overwrote the previous logs when I
> restarted the job, and now I can't confirm if it was really the
> same vhost_set_mem_table() error. I guess we'll have to simply
> wait and see if it fails again.
>

On Thu, Mar 02, 2017 at 04:16:53PM +0000, Peter Maydell wrote:
> On 2 March 2017 at 16:07, Eduardo Habkost <ehabkost@redhat.com> wrote:
> > Earlier this week I saw the wait_for_fds assertion (mentioned at
> > the thread above) on a travis-ci job again, and I was suspecting
> > it was the same vhost_set_mem_table() + TCG error seen at the
> > thread above.
> >
> > Unfortunately travis-ci overwrote the previous logs when I
> > restarted the job, and now I can't confirm if it was really the
> > same vhost_set_mem_table() error. I guess we'll have to simply
> > wait and see if it fails again.
>
> Here's a fresh new travis job failing like that:
> https://travis-ci.org/qemu/qemu/builds/207005044
> (specifically https://travis-ci.org/qemu/qemu/jobs/207005050)

Indeed, it is not exactly the same TCG-related error we've seen
in August. It looks like vhost-user-test + TCG is not as broken
as I thought.