[PULL,00/11] x86 queue, 2017-02-27

Message ID 20170227162501.29280-1-ehabkost@redhat.com
State New

Pull-request

git://github.com/ehabkost/qemu.git tags/x86-pull-request

Message

Eduardo Habkost Feb. 27, 2017, 4:24 p.m. UTC
The following changes since commit 3b1d8169844fafee184366b0e0d7080534758b4d:

  tests-aio-multithread: use atomic_read properly (2017-02-27 12:54:08 +0000)

are available in the git repository at:

  git://github.com/ehabkost/qemu.git tags/x86-pull-request

for you to fetch changes up to b8097deb359bbbd92592b9670adfe9e245b2d0bd:

  i386: Improve query-cpu-model-expansion full mode (2017-02-27 13:23:35 -0300)

----------------------------------------------------------------
x86 queue, 2017-02-27

"-cpu max" and query-cpu-model-expansion support for x86. This
should be the last x86 pull request before 2.9 soft freeze.

----------------------------------------------------------------

Eduardo Habkost (11):
  i386: Unset cannot_destroy_with_object_finalize_yet on "host" model
  i386: Add ordering field to CPUClass
  i386: Rename X86CPU::host_features to X86CPU::max_features
  i386: Reorganize and document CPUID initialization steps
  qapi-schema: Comment about full expansion of non-migration-safe models
  i386: Create "max" CPU model
  i386: Make "max" model not use any host CPUID info on TCG
  i386: Don't set CPUClass::cpu_def on "max" model
  i386: Define static "base" CPU model
  i386: Implement query-cpu-model-expansion QMP command
  i386: Improve query-cpu-model-expansion full mode

 qapi-schema.json      |   9 +
 target/i386/cpu-qom.h |   8 +-
 target/i386/cpu.h     |   2 +-
 monitor.c             |   4 +-
 target/i386/cpu.c     | 455 +++++++++++++++++++++++++++++++++++++++++---------
 5 files changed, 397 insertions(+), 81 deletions(-)
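
For readers unfamiliar with the QMP command this series implements, here is a minimal sketch of how a client might build the request message. The command name and the "max" and "base" model names come from the patch titles above. The helper name build_expansion_request is hypothetical, and the argument layout (a "type" of "static" or "full", plus a "model" object carrying a "name") reflects the QAPI schema as I understand it, so check qapi-schema.json before relying on it. Sending the message over a QMP socket is deliberately left out.

```python
import json


def build_expansion_request(model_name, expansion_type="full", props=None):
    """Build a QMP request dict for query-cpu-model-expansion.

    Hypothetical helper for illustration; it only constructs the
    message and does not talk to a running QEMU.
    """
    # Assumption: the schema accepts exactly these two expansion types.
    if expansion_type not in ("static", "full"):
        raise ValueError("unknown expansion type: %r" % (expansion_type,))

    model = {"name": model_name}
    if props:
        # Optional property overrides for the model being expanded.
        model["props"] = dict(props)

    return {
        "execute": "query-cpu-model-expansion",
        "arguments": {"type": expansion_type, "model": model},
    }


if __name__ == "__main__":
    # Full expansion of the new "max" model.
    print(json.dumps(build_expansion_request("max")))
    # Static expansion of the new "base" model.
    print(json.dumps(build_expansion_request("base", "static")))
```

The same dict can be serialized with json.dumps and written to a QMP monitor socket after the usual capabilities negotiation.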

Comments

Peter Maydell Feb. 27, 2017, 7:16 p.m. UTC | #1
On 27 February 2017 at 16:24, Eduardo Habkost <ehabkost@redhat.com> wrote:
> The following changes since commit 3b1d8169844fafee184366b0e0d7080534758b4d:
>
>   tests-aio-multithread: use atomic_read properly (2017-02-27 12:54:08 +0000)
>
> are available in the git repository at:
>
>   git://github.com/ehabkost/qemu.git tags/x86-pull-request
>
> for you to fetch changes up to b8097deb359bbbd92592b9670adfe9e245b2d0bd:
>
>   i386: Improve query-cpu-model-expansion full mode (2017-02-27 13:23:35 -0300)
>
> ----------------------------------------------------------------
> x86 queue, 2017-02-27
>
> "-cpu max" and query-cpu-model-expansion support for x86. This
> should be the last x86 pull request before 2.9 soft freeze.
>
> ----------------------------------------------------------------
>
> Eduardo Habkost (11):
>   i386: Unset cannot_destroy_with_object_finalize_yet on "host" model
>   i386: Add ordering field to CPUClass
>   i386: Rename X86CPU::host_features to X86CPU::max_features
>   i386: Reorganize and document CPUID initialization steps
>   qapi-schema: Comment about full expansion of non-migration-safe models
>   i386: Create "max" CPU model
>   i386: Make "max" model not use any host CPUID info on TCG
>   i386: Don't set CPUClass::cpu_def on "max" model
>   i386: Define static "base" CPU model
>   i386: Implement query-cpu-model-expansion QMP command
>   i386: Improve query-cpu-model-expansion full mode

This seemed to hang in 'make check' on x86. I didn't have time to
investigate, so I'll have another go with it later.

thanks
-- PMM
Eduardo Habkost Feb. 27, 2017, 7:41 p.m. UTC | #2
On Mon, Feb 27, 2017 at 07:16:03PM +0000, Peter Maydell wrote:
[...]
> This seemed to hang in 'make check' on x86. I didn't have time to
> investigate, so I'll have another go with it later.

I have been seeing lots of travis-ci failures recently because of
'make check' taking too long. The jobs seem to time out while
running tests/test-aio-multithread.

I see similar failures on recent qemu.git master jobs, too:
https://travis-ci.org/qemu/qemu/builds

Are they similar to the hang you have seen?
Greg Kurz Feb. 27, 2017, 7:57 p.m. UTC | #3
On Mon, 27 Feb 2017 16:41:34 -0300
Eduardo Habkost <ehabkost@redhat.com> wrote:

> On Mon, Feb 27, 2017 at 07:16:03PM +0000, Peter Maydell wrote:
> [...]
> > This seemed to hang in 'make check' on x86. I didn't have time to
> > investigate, so I'll have another go with it later.  
> 
> I have been seeing lots of travis-ci failures recently because of
> 'make check' taking too long. The jobs seem to time out while
> running tests/test-aio-multithread.
> 
> I see similar failures on recent qemu.git master jobs, too:
> https://travis-ci.org/qemu/qemu/builds
> 
> Are they similar to the hang you have seen?
> 

The problem with tests/test-aio-multithread got fixed today with the
following commit:

http://git.qemu-project.org/?p=qemu.git;a=commit;h=3b1d8169844fafee184366b0e0d7080534758b4d
Eduardo Habkost Feb. 28, 2017, 7:12 p.m. UTC | #4
On Mon, Feb 27, 2017 at 08:57:53PM +0100, Greg Kurz wrote:
> On Mon, 27 Feb 2017 16:41:34 -0300
> Eduardo Habkost <ehabkost@redhat.com> wrote:
> 
> > On Mon, Feb 27, 2017 at 07:16:03PM +0000, Peter Maydell wrote:
> > [...]
> > > This seemed to hang in 'make check' on x86. I didn't have time to
> > > investigate, so I'll have another go with it later.  
> > 
> > I have been seeing lots of travis-ci failures recently because of
> > 'make check' taking too long. The jobs seem to time out while
> > running tests/test-aio-multithread.
> > 
> > I see similar failures on recent qemu.git master jobs, too:
> > https://travis-ci.org/qemu/qemu/builds
> > 
> > Are they similar to the hang you have seen?
> > 
> 
> The problem with tests/test-aio-multithread got fixed today with the
> following commit:
> 
> http://git.qemu-project.org/?p=qemu.git;a=commit;h=3b1d8169844fafee184366b0e0d7080534758b4d

I see. The failures really stopped after I rebased to that
commit.

I saw a failure on x86-pull-request that seemed to be because of
vhost-user-test[1]. However, after restarting the job, it
passed[2]. (I can't confirm the actual cause of the first
failure, because the log files seem to be temporarily unavailable
on Travis.)

Now, the problem with vhost-user-test is that it is not
guaranteed to work without KVM[3], so we need to add a mechanism
to skip vhost-user-test (or at least make errors non-fatal) if
KVM is unavailable.

[1] https://travis-ci.org/ehabkost/qemu/builds/205849528
[2] https://travis-ci.org/ehabkost/qemu/builds/205850173
[3] See commit cdafe929615ec5eca71bcd5a3d12bab5678e5886
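
One possible shape for such a skip mechanism, as a sketch only: a test runner could probe for KVM and drop the KVM-dependent tests from the run list when the probe fails. The /dev/kvm check is an assumption (being able to open the device node does not prove that KVM actually works), and kvm_available and select_tests are hypothetical names, not part of QEMU's test harness.

```python
import os


def kvm_available(dev_path="/dev/kvm"):
    """Return True if the KVM device node is readable and writable.

    Assumption: access to the device node is a good-enough proxy
    for "KVM is usable" in a CI environment.
    """
    return os.access(dev_path, os.R_OK | os.W_OK)


def select_tests(tests, kvm_only, have_kvm):
    """Split tests into (to_run, skipped), preserving order.

    Tests listed in kvm_only are skipped when have_kvm is False.
    """
    to_run = []
    skipped = []
    for name in tests:
        if name in kvm_only and not have_kvm:
            skipped.append(name)
        else:
            to_run.append(name)
    return to_run, skipped


if __name__ == "__main__":
    all_tests = ["test-aio-multithread", "vhost-user-test"]
    to_run, skipped = select_tests(
        all_tests, {"vhost-user-test"}, kvm_available())
    print("run:", to_run)
    print("skipped (no KVM):", skipped)
```

Making the errors non-fatal instead of skipping would follow the same pattern, with the runner recording failures of the kvm_only tests as warnings when have_kvm is False.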
Peter Maydell Feb. 28, 2017, 7:17 p.m. UTC | #5
On 28 February 2017 at 19:12, Eduardo Habkost <ehabkost@redhat.com> wrote:
> I saw a failure on x86-pull-request that seemed to be because of
> vhost-user-test[1]. However, after restarting the job, it
> passed[2].

I'm currently processing a patch which (hopefully) fixes
vhost-user-test's intermittent failures:
http://patchwork.ozlabs.org/patch/732747/

thanks
-- PMM
Peter Maydell March 2, 2017, 12:29 p.m. UTC | #6
On 27 February 2017 at 19:16, Peter Maydell <peter.maydell@linaro.org> wrote:
> On 27 February 2017 at 16:24, Eduardo Habkost <ehabkost@redhat.com> wrote:
>> The following changes since commit 3b1d8169844fafee184366b0e0d7080534758b4d:
>>
>>   tests-aio-multithread: use atomic_read properly (2017-02-27 12:54:08 +0000)
>>
>> are available in the git repository at:
>>
>>   git://github.com/ehabkost/qemu.git tags/x86-pull-request
>>
>> for you to fetch changes up to b8097deb359bbbd92592b9670adfe9e245b2d0bd:
>>
>>   i386: Improve query-cpu-model-expansion full mode (2017-02-27 13:23:35 -0300)
>>
>> ----------------------------------------------------------------
>> x86 queue, 2017-02-27
>>
>> "-cpu max" and query-cpu-model-expansion support for x86. This
>> should be the last x86 pull request before 2.9 soft freeze.
>>
>> ----------------------------------------------------------------
>>
>> Eduardo Habkost (11):
>>   i386: Unset cannot_destroy_with_object_finalize_yet on "host" model
>>   i386: Add ordering field to CPUClass
>>   i386: Rename X86CPU::host_features to X86CPU::max_features
>>   i386: Reorganize and document CPUID initialization steps
>>   qapi-schema: Comment about full expansion of non-migration-safe models
>>   i386: Create "max" CPU model
>>   i386: Make "max" model not use any host CPUID info on TCG
>>   i386: Don't set CPUClass::cpu_def on "max" model
>>   i386: Define static "base" CPU model
>>   i386: Implement query-cpu-model-expansion QMP command
>>   i386: Improve query-cpu-model-expansion full mode
>
> This seemed to hang in 'make check' on x86. I didn't have time to
> investigate, so I'll have another go with it later.

Retrying worked ok, so I've applied it, on the assumption the
hang was not something in this patchset.

thanks
-- PMM
Eduardo Habkost March 2, 2017, 3:39 p.m. UTC | #7
On Tue, Feb 28, 2017 at 07:17:39PM +0000, Peter Maydell wrote:
> On 28 February 2017 at 19:12, Eduardo Habkost <ehabkost@redhat.com> wrote:
> > I saw a failure on x86-pull-request that seemed to be because of
> > vhost-user-test[1]. However, after restarting the job, it
> > passed[2].
> 
> I'm currently processing a patch which (hopefully) fixes
> vhost-user-test's intermittent failures:
> http://patchwork.ozlabs.org/patch/732747/

I'm not sure it will solve the issues on hosts without KVM. As
far as I can see, if vhost-user-test is working without KVM, it
is working by accident.

See the thread at:
http://www.mail-archive.com/qemu-devel@nongnu.org/msg394258.html
Paolo Bonzini March 2, 2017, 3:54 p.m. UTC | #8
On 02/03/2017 16:39, Eduardo Habkost wrote:
> On Tue, Feb 28, 2017 at 07:17:39PM +0000, Peter Maydell wrote:
>> On 28 February 2017 at 19:12, Eduardo Habkost <ehabkost@redhat.com> wrote:
>>> I saw a failure on x86-pull-request that seemed to be because of
>>> vhost-user-test[1]. However, after restarting the job, it
>>> passed[2].
>>
>> I'm currently processing a patch which (hopefully) fixes
>> vhost-user-test's intermittent failures:
>> http://patchwork.ozlabs.org/patch/732747/
> 
> I'm not sure it will solve the issues on hosts without KVM. As
> far as I can see, if vhost-user-test is working without KVM, it
> is working by accident.

Well, it had worked for a while before the patch.  As long as you don't
overwrite code with vhost-user data and then try to run that data,
things will be fine.  Just not something you can use in practice, but it
works in tests.

Paolo
Eduardo Habkost March 2, 2017, 4:07 p.m. UTC | #9
On Thu, Mar 02, 2017 at 04:54:26PM +0100, Paolo Bonzini wrote:
> On 02/03/2017 16:39, Eduardo Habkost wrote:
> > On Tue, Feb 28, 2017 at 07:17:39PM +0000, Peter Maydell wrote:
> >> On 28 February 2017 at 19:12, Eduardo Habkost <ehabkost@redhat.com> wrote:
> >>> I saw a failure on x86-pull-request that seemed to be because of
> >>> vhost-user-test[1]. However, after restarting the job, it
> >>> passed[2].
> >>
> >> I'm currently processing a patch which (hopefully) fixes
> >> vhost-user-test's intermittent failures:
> >> http://patchwork.ozlabs.org/patch/732747/
> > 
> > I'm not sure it will solve the issues on hosts without KVM. As
> > far as I can see, if vhost-user-test is working without KVM, it
> > is working by accident.
> 
> Well, it has worked for a while before the patch.

Before which patch?

>                                                    As long as you don't
> overwrite code with vhost-user data and then try to run that data,
> things will be fine.  Just not something you can use in practice, but it
> works in tests.

Earlier this week I saw the wait_for_fds assertion (mentioned in
the thread above) on a travis-ci job again, and I suspected
it was the same vhost_set_mem_table() + TCG error seen in that
thread.

Unfortunately travis-ci overwrote the previous logs when I
restarted the job, and now I can't confirm if it was really the
same vhost_set_mem_table() error. I guess we'll have to simply
wait and see if it fails again.
Peter Maydell March 2, 2017, 4:16 p.m. UTC | #10
On 2 March 2017 at 16:07, Eduardo Habkost <ehabkost@redhat.com> wrote:
> Earlier this week I saw the wait_for_fds assertion (mentioned at
> the thread above) on a travis-ci job again, and I was suspecting
> it was the same vhost_set_mem_table() + TCG error seen at the
> thread above.
>
> Unfortunately travis-ci overwrote the previous logs when I
> restarted the job, and now I can't confirm if it was really the
> same vhost_set_mem_table() error. I guess we'll have to simply
> wait and see if it fails again.

Here's a fresh new travis job failing like that:
https://travis-ci.org/qemu/qemu/builds/207005044
(specifically https://travis-ci.org/qemu/qemu/jobs/207005050)

thanks
-- PMM
Paolo Bonzini March 2, 2017, 4:22 p.m. UTC | #11
On 02/03/2017 17:07, Eduardo Habkost wrote:
> On Thu, Mar 02, 2017 at 04:54:26PM +0100, Paolo Bonzini wrote:
>> On 02/03/2017 16:39, Eduardo Habkost wrote:
>>> On Tue, Feb 28, 2017 at 07:17:39PM +0000, Peter Maydell wrote:
>>>> On 28 February 2017 at 19:12, Eduardo Habkost <ehabkost@redhat.com> wrote:
>>>>> I saw a failure on x86-pull-request that seemed to be because of
>>>>> vhost-user-test[1]. However, after restarting the job, it
>>>>> passed[2].
>>>>
>>>> I'm currently processing a patch which (hopefully) fixes
>>>> vhost-user-test's intermittent failures:
>>>> http://patchwork.ozlabs.org/patch/732747/
>>>
>>> I'm not sure it will solve the issues on hosts without KVM. As
>>> far as I can see, if vhost-user-test is working without KVM, it
>>> is working by accident.
>>
>> Well, it has worked for a while before the patch.
> 
> Before which patch?

The one mentioned in the commit message by Marc-André:
b0a335e351103bf92f3f9d0bd5759311be8156ac.

Paolo

> 
>>                                                    As long as you don't
>> overwrite code with vhost-user data and then try to run that data,
>> things will be fine.  Just not something you can use in practice, but it
>> works in tests.
> 
> Earlier this week I saw the wait_for_fds assertion (mentioned at
> the thread above) on a travis-ci job again, and I was suspecting
> it was the same vhost_set_mem_table() + TCG error seen at the
> thread above.
> 
> Unfortunately travis-ci overwrote the previous logs when I
> restarted the job, and now I can't confirm if it was really the
> same vhost_set_mem_table() error. I guess we'll have to simply
> wait and see if it fails again.
>
Eduardo Habkost March 2, 2017, 6 p.m. UTC | #12
On Thu, Mar 02, 2017 at 04:16:53PM +0000, Peter Maydell wrote:
> On 2 March 2017 at 16:07, Eduardo Habkost <ehabkost@redhat.com> wrote:
> > Earlier this week I saw the wait_for_fds assertion (mentioned at
> > the thread above) on a travis-ci job again, and I was suspecting
> > it was the same vhost_set_mem_table() + TCG error seen at the
> > thread above.
> >
> > Unfortunately travis-ci overwrote the previous logs when I
> > restarted the job, and now I can't confirm if it was really the
> > same vhost_set_mem_table() error. I guess we'll have to simply
> > wait and see if it fails again.
> 
> Here's a fresh new travis job failing like that:
> https://travis-ci.org/qemu/qemu/builds/207005044
> (specifically https://travis-ci.org/qemu/qemu/jobs/207005050)

Indeed, it is not exactly the same TCG-related error we saw
in August. It looks like vhost-user-test + TCG is not as broken
as I thought.