[for-2.10] boot-serial-test: prefer tcg accelerator

Message ID	20170816082650.21880-1-cohuck@redhat.com
State	New
Headers	show Return-Path: <qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org> DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 3AA524B70D From: Cornelia Huck <cohuck@redhat.com> To: qemu-devel@nongnu.org Date: Wed, 16 Aug 2017 10:26:50 +0200 Message-Id: <20170816082650.21880-1-cohuck@redhat.com> Subject: [Qemu-devel] [PATCH for-2.10] boot-serial-test: prefer tcg accelerator Precedence: list Cc: lvivier@redhat.com, thuth@redhat.com, mst@redhat.com, Cornelia Huck <cohuck@redhat.com>, richard.henderson@linaro.org, pbonzini@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" <qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org>

Cornelia Huck Aug. 16, 2017, 8:26 a.m. UTC

Prefer to use the tcg accelarator if it is available: This is our only
real smoke test for tcg, and fast enough to use it for that.

Fixes: 480bc11e6 ("boot-serial-test: fallback to kvm accelerator")
Reported-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
---
 tests/boot-serial-test.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

Laurent Vivier Aug. 16, 2017, 8:28 a.m. UTC | #1

On 16/08/2017 10:26, Cornelia Huck wrote:
> Prefer to use the tcg accelarator if it is available: This is our only
> real smoke test for tcg, and fast enough to use it for that.
> 
> Fixes: 480bc11e6 ("boot-serial-test: fallback to kvm accelerator")
> Reported-by: Richard Henderson <richard.henderson@linaro.org>
> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
> ---
>  tests/boot-serial-test.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/boot-serial-test.c b/tests/boot-serial-test.c
> index a8ca877168..b95c5e74ea 100644
> --- a/tests/boot-serial-test.c
> +++ b/tests/boot-serial-test.c
> @@ -78,7 +78,11 @@ static void test_machine(const void *data)
>      fd = mkstemp(tmpname);
>      g_assert(fd != -1);
>  
> -    args = g_strdup_printf("-M %s,accel=kvm:tcg "
> +    /*
> +     * Make sure that this test uses tcg if available: It is used as a
> +     * fast-enough smoketest for that.
> +     */
> +    args = g_strdup_printf("-M %s,accel=tcg:kvm "
>                             "-chardev file,id=serial0,path=%s "
>                             "-no-shutdown -serial chardev:serial0 %s",
>                             test->machine, tmpname, test->extra);
> 

Reviewed-by: Laurent Vivier <lvivier@redhat.com>

Thomas Huth Aug. 16, 2017, 10:25 a.m. UTC | #2

On 16.08.2017 10:26, Cornelia Huck wrote:
> Prefer to use the tcg accelarator if it is available: This is our only
> real smoke test for tcg, and fast enough to use it for that.
> 
> Fixes: 480bc11e6 ("boot-serial-test: fallback to kvm accelerator")
> Reported-by: Richard Henderson <richard.henderson@linaro.org>
> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
> ---
>  tests/boot-serial-test.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/boot-serial-test.c b/tests/boot-serial-test.c
> index a8ca877168..b95c5e74ea 100644
> --- a/tests/boot-serial-test.c
> +++ b/tests/boot-serial-test.c
> @@ -78,7 +78,11 @@ static void test_machine(const void *data)
>      fd = mkstemp(tmpname);
>      g_assert(fd != -1);
>  
> -    args = g_strdup_printf("-M %s,accel=kvm:tcg "
> +    /*
> +     * Make sure that this test uses tcg if available: It is used as a
> +     * fast-enough smoketest for that.
> +     */
> +    args = g_strdup_printf("-M %s,accel=tcg:kvm "
>                             "-chardev file,id=serial0,path=%s "
>                             "-no-shutdown -serial chardev:serial0 %s",
>                             test->machine, tmpname, test->extra);
> 

Reviewed-by: Thomas Huth <thuth@redhat.com>

Paolo Bonzini Aug. 16, 2017, 10:51 a.m. UTC | #3

On 16/08/2017 10:26, Cornelia Huck wrote:
> Prefer to use the tcg accelarator if it is available: This is our only
> real smoke test for tcg, and fast enough to use it for that.

I'm not sure this is required for 2.10.  Yes, it means the coverage from
"make check" is worse, but that's it.

Paolo

> Fixes: 480bc11e6 ("boot-serial-test: fallback to kvm accelerator")
> Reported-by: Richard Henderson <richard.henderson@linaro.org>
> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
> ---
>  tests/boot-serial-test.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/boot-serial-test.c b/tests/boot-serial-test.c
> index a8ca877168..b95c5e74ea 100644
> --- a/tests/boot-serial-test.c
> +++ b/tests/boot-serial-test.c
> @@ -78,7 +78,11 @@ static void test_machine(const void *data)
>      fd = mkstemp(tmpname);
>      g_assert(fd != -1);
>  
> -    args = g_strdup_printf("-M %s,accel=kvm:tcg "
> +    /*
> +     * Make sure that this test uses tcg if available: It is used as a
> +     * fast-enough smoketest for that.
> +     */
> +    args = g_strdup_printf("-M %s,accel=tcg:kvm "
>                             "-chardev file,id=serial0,path=%s "
>                             "-no-shutdown -serial chardev:serial0 %s",
>                             test->machine, tmpname, test->extra);
>

Peter Maydell Aug. 16, 2017, 11:18 a.m. UTC | #4

On 16 August 2017 at 11:51, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 16/08/2017 10:26, Cornelia Huck wrote:
>> Prefer to use the tcg accelarator if it is available: This is our only
>> real smoke test for tcg, and fast enough to use it for that.
>
> I'm not sure this is required for 2.10.  Yes, it means the coverage from
> "make check" is worse, but that's it.

Yes, I'd put it under "if we need to roll an rc4 anyway for
some more significant bug we might as well put this in too,
but it doesn't merit cutting rc4 by itself."

thanks
-- PMM

Cornelia Huck Aug. 16, 2017, 11:46 a.m. UTC | #5

On Wed, 16 Aug 2017 12:18:07 +0100
Peter Maydell <peter.maydell@linaro.org> wrote:

> On 16 August 2017 at 11:51, Paolo Bonzini <pbonzini@redhat.com> wrote:
> > On 16/08/2017 10:26, Cornelia Huck wrote:  
> >> Prefer to use the tcg accelarator if it is available: This is our only
> >> real smoke test for tcg, and fast enough to use it for that.  
> >
> > I'm not sure this is required for 2.10.  Yes, it means the coverage from
> > "make check" is worse, but that's it.  
> 
> Yes, I'd put it under "if we need to roll an rc4 anyway for
> some more significant bug we might as well put this in too,
> but it doesn't merit cutting rc4 by itself."

Yup, that sounds good to me.

Richard Henderson Aug. 16, 2017, 2:03 p.m. UTC | #6

On 08/16/2017 01:26 AM, Cornelia Huck wrote:
> Prefer to use the tcg accelarator if it is available: This is our only
> real smoke test for tcg, and fast enough to use it for that.
> 
> Fixes: 480bc11e6 ("boot-serial-test: fallback to kvm accelerator")
> Reported-by: Richard Henderson <richard.henderson@linaro.org>
> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
> ---
>  tests/boot-serial-test.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

Thanks, guys.  I don't mind if this gets deferred for 2.11.


r~

David Gibson Aug. 22, 2017, 1:09 a.m. UTC | #7

On Wed, Aug 16, 2017 at 12:18:07PM +0100, Peter Maydell wrote:
> On 16 August 2017 at 11:51, Paolo Bonzini <pbonzini@redhat.com> wrote:
> > On 16/08/2017 10:26, Cornelia Huck wrote:
> >> Prefer to use the tcg accelarator if it is available: This is our only
> >> real smoke test for tcg, and fast enough to use it for that.
> >
> > I'm not sure this is required for 2.10.  Yes, it means the coverage from
> > "make check" is worse, but that's it.
> 
> Yes, I'd put it under "if we need to roll an rc4 anyway for
> some more significant bug we might as well put this in too,
> but it doesn't merit cutting rc4 by itself."

It does entirely break "make check" on a ppc host.  And that in turn
has held up my testing cycle for a couple of ppc regressions from 2.9
that I was hoping to squeeze in.  Does that change your calculations?

Thomas Huth Aug. 22, 2017, 5:40 a.m. UTC | #8

On 22.08.2017 03:09, David Gibson wrote:
> On Wed, Aug 16, 2017 at 12:18:07PM +0100, Peter Maydell wrote:
>> On 16 August 2017 at 11:51, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>> On 16/08/2017 10:26, Cornelia Huck wrote:
>>>> Prefer to use the tcg accelarator if it is available: This is our only
>>>> real smoke test for tcg, and fast enough to use it for that.
>>>
>>> I'm not sure this is required for 2.10.  Yes, it means the coverage from
>>> "make check" is worse, but that's it.
>>
>> Yes, I'd put it under "if we need to roll an rc4 anyway for
>> some more significant bug we might as well put this in too,
>> but it doesn't merit cutting rc4 by itself."
> 
> It does entirely break "make check" on a ppc host.  And that in turn
> has held up my testing cycle for a couple of ppc regressions from 2.9
> that I was hoping to squeeze in.  Does that change your calculations?

Uuh, right. The problem is that we can only run the "pseries" machine
with kvm-hv, but not the other ppc machines (and the boot-serial test
tries to start a couple of these). So the "accel=tcg:kvm" fix should
currently be the best way to ease the situation - since ppc does not
support "--disable-tcg" yet, it will always be tested with tcg.

But when we support "--disable-tcg" one day on ppc, too, we might have
to come up with a more clever solution here instead...

 Thomas

Peter Maydell Aug. 22, 2017, 8:47 a.m. UTC | #9

On 22 August 2017 at 02:09, David Gibson <david@gibson.dropbear.id.au> wrote:
> On Wed, Aug 16, 2017 at 12:18:07PM +0100, Peter Maydell wrote:
>> On 16 August 2017 at 11:51, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> > On 16/08/2017 10:26, Cornelia Huck wrote:
>> >> Prefer to use the tcg accelarator if it is available: This is our only
>> >> real smoke test for tcg, and fast enough to use it for that.
>> >
>> > I'm not sure this is required for 2.10.  Yes, it means the coverage from
>> > "make check" is worse, but that's it.
>>
>> Yes, I'd put it under "if we need to roll an rc4 anyway for
>> some more significant bug we might as well put this in too,
>> but it doesn't merit cutting rc4 by itself."
>
> It does entirely break "make check" on a ppc host.  And that in turn
> has held up my testing cycle for a couple of ppc regressions from 2.9
> that I was hoping to squeeze in.  Does that change your calculations?

I have a PPC64 box in my standard set of build tests, and it
runs 'make check' without problems...

thanks
-- PMM

Laurent Vivier Aug. 22, 2017, 8:49 a.m. UTC | #10

On 22/08/2017 10:47, Peter Maydell wrote:
> On 22 August 2017 at 02:09, David Gibson <david@gibson.dropbear.id.au> wrote:
>> On Wed, Aug 16, 2017 at 12:18:07PM +0100, Peter Maydell wrote:
>>> On 16 August 2017 at 11:51, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>>> On 16/08/2017 10:26, Cornelia Huck wrote:
>>>>> Prefer to use the tcg accelarator if it is available: This is our only
>>>>> real smoke test for tcg, and fast enough to use it for that.
>>>>
>>>> I'm not sure this is required for 2.10.  Yes, it means the coverage from
>>>> "make check" is worse, but that's it.
>>>
>>> Yes, I'd put it under "if we need to roll an rc4 anyway for
>>> some more significant bug we might as well put this in too,
>>> but it doesn't merit cutting rc4 by itself."
>>
>> It does entirely break "make check" on a ppc host.  And that in turn
>> has held up my testing cycle for a couple of ppc regressions from 2.9
>> that I was hoping to squeeze in.  Does that change your calculations?
> 
> I have a PPC64 box in my standard set of build tests, and it
> runs 'make check' without problems...

You need to use KVM HV to have the problem, not KVM PR.
Is that the case?

Laurent

Peter Maydell Aug. 22, 2017, 9:35 a.m. UTC | #11

On 22 August 2017 at 09:49, Laurent Vivier <lvivier@redhat.com> wrote:
> On 22/08/2017 10:47, Peter Maydell wrote:
>> On 22 August 2017 at 02:09, David Gibson <david@gibson.dropbear.id.au> wrote:
>>> On Wed, Aug 16, 2017 at 12:18:07PM +0100, Peter Maydell wrote:
>>>> On 16 August 2017 at 11:51, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>>>> On 16/08/2017 10:26, Cornelia Huck wrote:
>>>>>> Prefer to use the tcg accelarator if it is available: This is our only
>>>>>> real smoke test for tcg, and fast enough to use it for that.
>>>>>
>>>>> I'm not sure this is required for 2.10.  Yes, it means the coverage from
>>>>> "make check" is worse, but that's it.
>>>>
>>>> Yes, I'd put it under "if we need to roll an rc4 anyway for
>>>> some more significant bug we might as well put this in too,
>>>> but it doesn't merit cutting rc4 by itself."
>>>
>>> It does entirely break "make check" on a ppc host.  And that in turn
>>> has held up my testing cycle for a couple of ppc regressions from 2.9
>>> that I was hoping to squeeze in.  Does that change your calculations?
>>
>> I have a PPC64 box in my standard set of build tests, and it
>> runs 'make check' without problems...
>
> You need to use KVM HV to have the problem, not KVM PR.
> Is that the case?

I don't have access to KVM at all on that box, so it will be
using TCG only.

thanks
-- PMM

David Gibson Aug. 22, 2017, 11:20 a.m. UTC | #12

On Tue, Aug 22, 2017 at 10:35:17AM +0100, Peter Maydell wrote:
> On 22 August 2017 at 09:49, Laurent Vivier <lvivier@redhat.com> wrote:
> > On 22/08/2017 10:47, Peter Maydell wrote:
> >> On 22 August 2017 at 02:09, David Gibson <david@gibson.dropbear.id.au> wrote:
> >>> On Wed, Aug 16, 2017 at 12:18:07PM +0100, Peter Maydell wrote:
> >>>> On 16 August 2017 at 11:51, Paolo Bonzini <pbonzini@redhat.com> wrote:
> >>>>> On 16/08/2017 10:26, Cornelia Huck wrote:
> >>>>>> Prefer to use the tcg accelarator if it is available: This is our only
> >>>>>> real smoke test for tcg, and fast enough to use it for that.
> >>>>>
> >>>>> I'm not sure this is required for 2.10.  Yes, it means the coverage from
> >>>>> "make check" is worse, but that's it.
> >>>>
> >>>> Yes, I'd put it under "if we need to roll an rc4 anyway for
> >>>> some more significant bug we might as well put this in too,
> >>>> but it doesn't merit cutting rc4 by itself."
> >>>
> >>> It does entirely break "make check" on a ppc host.  And that in turn
> >>> has held up my testing cycle for a couple of ppc regressions from 2.9
> >>> that I was hoping to squeeze in.  Does that change your calculations?
> >>
> >> I have a PPC64 box in my standard set of build tests, and it
> >> runs 'make check' without problems...
> >
> > You need to use KVM HV to have the problem, not KVM PR.
> > Is that the case?
> 
> I don't have access to KVM at all on that box, so it will be
> using TCG only.

That would explain it, it's the attempt to use KVM that triggers the
problem.

Obviously it's not a thing to fix right now, but I've really been
thinking that none of the tests should use this "TCG or KVM" stuff.
They should instead be run with *both* options - or at least the ones
that are available on the host.

That would have caught the bug in the pull request I sent you, and
would at least have given you a chance at seeing the problem with
boot-serial-test.

Cornelia Huck Aug. 22, 2017, 11:48 a.m. UTC | #13

On Tue, 22 Aug 2017 21:20:46 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> Obviously it's not a thing to fix right now, but I've really been
> thinking that none of the tests should use this "TCG or KVM" stuff.
> They should instead be run with *both* options - or at least the ones
> that are available on the host.

Having one test as a 'smoke test' that is run for everything available
sounds like a good idea, and the boot-serial test may be a good
candidate for that.

I would not want to run every test with every accelerator, however, as
this makes 'make check' even slower than it is now. (Although it may be
useful to be able to trigger 'run everything' tests on some dedicated
test machines.)

> 
> That would have caught the bug in the pull request I sent you, and
> would at least have given you a chance at seeing the problem with
> boot-serial-test.

I don't think that would have been caught unless you have the right
host available... and given that most developers will only have a
x86_64 machine and whatever other architecture they are working on (if
any) available, they will run most of 'make check' with tcg only.

David Gibson Aug. 23, 2017, 12:29 a.m. UTC | #14

On Tue, Aug 22, 2017 at 01:48:15PM +0200, Cornelia Huck wrote:
> On Tue, 22 Aug 2017 21:20:46 +1000
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > Obviously it's not a thing to fix right now, but I've really been
> > thinking that none of the tests should use this "TCG or KVM" stuff.
> > They should instead be run with *both* options - or at least the ones
> > that are available on the host.
> 
> Having one test as a 'smoke test' that is run for everything available
> sounds like a good idea, and the boot-serial test may be a good
> candidate for that.
> 
> I would not want to run every test with every accelerator, however, as
> this makes 'make check' even slower than it is now. (Although it may be
> useful to be able to trigger 'run everything' tests on some dedicated
> test machines.)

I'd be fine with only running the full matrix on a "make check-harder"
or whatever, target.  But I'd like the option to be there.  Sometimes
(like when preparing a pull request) a slower check is an acceptable
cost for better coverage.

> > That would have caught the bug in the pull request I sent you, and
> > would at least have given you a chance at seeing the problem with
> > boot-serial-test.
> 
> I don't think that would have been caught unless you have the right
> host available... and given that most developers will only have a
> x86_64 machine and whatever other architecture they are working on (if
> any) available, they will run most of 'make check' with tcg only.

Sure, but running make check (amongst other things) on a ppc host is
part of my standard pre-pull-request tests, so it certainly would have
caught it in this case.  Peter Maydell has said he has a ppc host
amongst his test systems.  At the moment it doesn't do KVM so wouldn't
have caught the other problem, but it'd be one step closer to doing
so.

Cornelia Huck Aug. 23, 2017, 7:16 a.m. UTC | #15

On Wed, 23 Aug 2017 10:29:07 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Tue, Aug 22, 2017 at 01:48:15PM +0200, Cornelia Huck wrote:
> > On Tue, 22 Aug 2017 21:20:46 +1000
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >   
> > > Obviously it's not a thing to fix right now, but I've really been
> > > thinking that none of the tests should use this "TCG or KVM" stuff.
> > > They should instead be run with *both* options - or at least the ones
> > > that are available on the host.  
> > 
> > Having one test as a 'smoke test' that is run for everything available
> > sounds like a good idea, and the boot-serial test may be a good
> > candidate for that.
> > 
> > I would not want to run every test with every accelerator, however, as
> > this makes 'make check' even slower than it is now. (Although it may be
> > useful to be able to trigger 'run everything' tests on some dedicated
> > test machines.)  
> 
> I'd be fine with only running the full matrix on a "make check-harder"
> or whatever, target.  But I'd like the option to be there.  Sometimes
> (like when preparing a pull request) a slower check is an acceptable
> cost for better coverage.

make check with one smoke test + make check-harder (like the name :)
sounds like a good combination.

Thomas Huth Aug. 23, 2017, 7:49 a.m. UTC | #16

On 23.08.2017 09:16, Cornelia Huck wrote:
> On Wed, 23 Aug 2017 10:29:07 +1000
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
>> On Tue, Aug 22, 2017 at 01:48:15PM +0200, Cornelia Huck wrote:
>>> On Tue, 22 Aug 2017 21:20:46 +1000
>>> David Gibson <david@gibson.dropbear.id.au> wrote:
>>>   
>>>> Obviously it's not a thing to fix right now, but I've really been
>>>> thinking that none of the tests should use this "TCG or KVM" stuff.
>>>> They should instead be run with *both* options - or at least the ones
>>>> that are available on the host.  
>>>
>>> Having one test as a 'smoke test' that is run for everything available
>>> sounds like a good idea, and the boot-serial test may be a good
>>> candidate for that.
>>>
>>> I would not want to run every test with every accelerator, however, as
>>> this makes 'make check' even slower than it is now. (Although it may be
>>> useful to be able to trigger 'run everything' tests on some dedicated
>>> test machines.)  
>>
>> I'd be fine with only running the full matrix on a "make check-harder"
>> or whatever, target.  But I'd like the option to be there.  Sometimes
>> (like when preparing a pull request) a slower check is an acceptable
>> cost for better coverage.
> 
> make check with one smoke test + make check-harder (like the name :)
> sounds like a good combination.

While we're at it: I'd like to have a "make check-fast", too. Sometimes
the normal "make check" is already too slow, e.g. while developing new
patches, I sometimes just want to do a very quick sanity test to see
whether I broke some basic things or not, and only do the "make check"
before I submit my patches.
So we would get three stages:

- make check-fast => For very, very quick sanity tests only

- make check => E.g. has to be run before submitting patches

- make check-harder => might run a very long time, so best suited for
                       nightly regression tests etc.?

Does that sound reasonable? And the crucial question: Who is going to
implement the basic framework for this?

 Thomas

Paolo Bonzini Aug. 23, 2017, 8:01 a.m. UTC | #17

On 23/08/2017 09:49, Thomas Huth wrote:
> While we're at it: I'd like to have a "make check-fast", too. Sometimes
> the normal "make check" is already too slow, e.g. while developing new
> patches, I sometimes just want to do a very quick sanity test to see
> whether I broke some basic things or not, and only do the "make check"
> before I submit my patches.
> So we would get three stages:
> 
> - make check-fast => For very, very quick sanity tests only
> 
> - make check => E.g. has to be run before submitting patches
> 
> - make check-harder => might run a very long time, so best suited for
>                        nightly regression tests etc.?
> 
> Does that sound reasonable? And the crucial question: Who is going to
> implement the basic framework for this?

There's already make check-unit or make check-qtest-x86_64 depending on
what you're working on.

If you have a many-core machine, of course, there's no simpler solution
than throwing more CPUs at it. :)

Paolo

Thomas Huth Aug. 23, 2017, 8:35 a.m. UTC | #18

On 23.08.2017 10:01, Paolo Bonzini wrote:
> On 23/08/2017 09:49, Thomas Huth wrote:
>> While we're at it: I'd like to have a "make check-fast", too. Sometimes
>> the normal "make check" is already too slow, e.g. while developing new
>> patches, I sometimes just want to do a very quick sanity test to see
>> whether I broke some basic things or not, and only do the "make check"
>> before I submit my patches.
>> So we would get three stages:
>>
>> - make check-fast => For very, very quick sanity tests only
>>
>> - make check => E.g. has to be run before submitting patches
>>
>> - make check-harder => might run a very long time, so best suited for
>>                        nightly regression tests etc.?
>>
>> Does that sound reasonable? And the crucial question: Who is going to
>> implement the basic framework for this?
> 
> There's already make check-unit or make check-qtest-x86_64 depending on
> what you're working on.

True. And I just learned that you can also already set the SPEED
variable to either "quick" or "slow" and that we're already using
g_test_quick() and g_test_slow() in a couple of places to check this. So
the framework for running quick vs. thorough tests is already there ...
we just might want to add this to some more tests, I guess...

Question for the maintainers and the test automation folks: Is anybody
already running "make check SPEED=slow" or is this just rather an
unheard-of way of running the tests?

> If you have a many-core machine, of course, there's no simpler solution
> than throwing more CPUs at it. :)

Is it safe nowadays to run "make check -j4" for example? Last time I
tried (maybe 1 or 2 years ago), there were still issues since some tests
were using hard-coded temporary file names, so the parallel tests were
disturbing each other, for example...

 Thomas

Fam Zheng Aug. 23, 2017, 8:52 a.m. UTC | #19

On Wed, 08/23 10:35, Thomas Huth wrote:
> Is it safe nowadays to run "make check -j4" for example? Last time I
> tried (maybe 1 or 2 years ago), there were still issues since some tests
> were using hard-coded temporary file names, so the parallel tests were
> disturbing each other, for example...

We really should fix them.

Last year I wanted to get some speed up in patchew's "make check" jobs so I
tried this:

https://lists.gnu.org/archive/html/qemu-devel/2016-09/msg07713.html

On my laptop it can save 50% time as shown in the commit message, unfortunately
the patch didn't get enough traction, and I've since then rarely used -j in
"make check" command lines (what?!).

Fam

Fam Zheng Aug. 23, 2017, 9:09 a.m. UTC | #20

On Wed, 08/23 16:52, Fam Zheng wrote:
> On Wed, 08/23 10:35, Thomas Huth wrote:
> > Is it safe nowadays to run "make check -j4" for example? Last time I
> > tried (maybe 1 or 2 years ago), there were still issues since some tests
> > were using hard-coded temporary file names, so the parallel tests were
> > disturbing each other, for example...
> 
> We really should fix them.
> 
> Last year I wanted to get some speed up in patchew's "make check" jobs so I
> tried this:
> 
> https://lists.gnu.org/archive/html/qemu-devel/2016-09/msg07713.html
> 
> On my laptop it can save 50% time as shown in the commit message, unfortunately
> the patch didn't get enough traction, and I've since then rarely used -j in
> "make check" command lines (what?!).

Spoke too soon, I forgot my git-publish hook has been doing "make check -j8" on
all patches I post, and it seems to work well.

Fam

Paolo Bonzini Aug. 23, 2017, 9:33 a.m. UTC | #21

On 23/08/2017 10:35, Thomas Huth wrote:
>> If you have a many-core machine, of course, there's no simpler solution
>> than throwing more CPUs at it. :)
> Is it safe nowadays to run "make check -j4" for example? Last time I
> tried (maybe 1 or 2 years ago), there were still issues since some tests
> were using hard-coded temporary file names, so the parallel tests were
> disturbing each other, for example...

I run -j18 all the time. :)

Paolo

Lukáš Doktor Aug. 23, 2017, 11:51 a.m. UTC | #22

Dne 23.8.2017 v 10:35 Thomas Huth napsal(a):
> On 23.08.2017 10:01, Paolo Bonzini wrote:
>> On 23/08/2017 09:49, Thomas Huth wrote:
>>> While we're at it: I'd like to have a "make check-fast", too. Sometimes
>>> the normal "make check" is already too slow, e.g. while developing new
>>> patches, I sometimes just want to do a very quick sanity test to see
>>> whether I broke some basic things or not, and only do the "make check"
>>> before I submit my patches.
>>> So we would get three stages:
>>>
>>> - make check-fast => For very, very quick sanity tests only
>>>
>>> - make check => E.g. has to be run before submitting patches
>>>
>>> - make check-harder => might run a very long time, so best suited for
>>>                        nightly regression tests etc.?
>>>
>>> Does that sound reasonable? And the crucial question: Who is going to
>>> implement the basic framework for this?
>>
>> There's already make check-unit or make check-qtest-x86_64 depending on
>> what you're working on.
> 
> True. And I just learned that you can also already set the SPEED
> variable to either "quick" or "slow" and that we're already using
> g_test_quick() and g_test_slow() in a couple of places to check this. So
> the framework for running quick vs. thorough tests is already there ...
> we just might want to add this to some more tests, I guess...
> 
> Question for the maintainers and the test automation folks: Is anybody
> already running "make check SPEED=slow" or is this just rather an
> unheard-of way of running the tests?
> 
>> If you have a many-core machine, of course, there's no simpler solution
>> than throwing more CPUs at it. :)
> 
> Is it safe nowadays to run "make check -j4" for example? Last time I
> tried (maybe 1 or 2 years ago), there were still issues since some tests
> were using hard-coded temporary file names, so the parallel tests were
> disturbing each other, for example...
> 

Actually the `.travis.yml` defines `MAKEFLAGS="-j3"`, which results in `make check` being executed with 3 threads...

I was actually looking at the increasing number of failed travis builds and it seems to be related to the fluctuating performance. Running `nice -n 20 make check -j 12` with `nice -n 5 stress -c 20` in background results in the same kind of failures:

    File '/tmp/qemu/include/qapi/qmp/qobject.h'
    Lines executed:0.00% of 9
    **
    ERROR:tests/prom-env-test.c:42:check_guest_memory: assertion failed (signature == MAGIC): (0x7c7f1b78 == 0xcafec0de)
    GTester: last random seed: R02S117a2b838f1fd17c65a134629577b996
    make: *** [check-qtest-ppc] Chyba 1
    /tmp/qemu/tests/Makefile.include:836: návod pro cíl „check-qtest-ppc“ selhal
    make: *** [check-qtest-sparc] Chyba 1
    /tmp/qemu/tests/Makefile.include:836: návod pro cíl „check-qtest-sparc“ selhal
    **
    ERROR:tests/rtc-test.c:173:check_time: assertion failed (ABS(t - s) <= wiggle): (3 <= 2)
    GTester: last random seed: R02S2ca82eb2e7f63ec62c8433b715d9fe12
    Vhost user backend fails to broadcast fake RARP
    **
    ERROR:tests/vhost-user-test.c:779:test_reconnect: child process (/i386/vhost-user/reconnect/subprocess [5264]) failed unexpectedly
    GTester: last random seed: R02S6e728ccb5384a4d856cacc4166be8052
    Gcov report for hw/misc/tmp105.c:
    File 'hw/misc/tmp105.c'
    Lines executed:30.40% of 125
    Gcov report for arm-softmmu/hw/block/virtio-blk.c:
    File '/tmp/qemu/hw/block/virtio-blk.c'

and all ERROR lines:

    ERROR:tests/prom-env-test.c:42:check_guest_memory: assertion failed (signature == MAGIC): (0x00000000 == 0xcafec0de)
    ERROR:tests/rtc-test.c:173:check_time: assertion failed (ABS(t - s) <= wiggle): (3 <= 2)
    ERROR:tests/vhost-user-test.c:779:test_reconnect: child process (/i386/vhost-user/reconnect/subprocess [5264]) failed unexpectedly
    ERROR:tests/vhost-user-test.c:807:test_connect_fail: child process (/x86_64/vhost-user/connect-fail/subprocess [7751]) failed unexpectedly
    ERROR:tests/vhost-user-test.c:835:test_flags_mismatch: child process (/x86_64/vhost-user/flags-mismatch/subprocess [7892]) failed unexpectedly
    ERROR:tests/boot-sector.c:127:boot_sector_test: assertion failed (signature == SIGNATURE): (0x00000000 == 0x0000dead)

on my machine. Using `make -j 1` or at least `make -j 2` could improve (but I haven't checked that)

Lukáš

>  Thomas
>

Thomas Huth Aug. 23, 2017, 12:01 p.m. UTC | #23

On 23.08.2017 13:51, Lukáš Doktor wrote:
> Dne 23.8.2017 v 10:35 Thomas Huth napsal(a):
>> On 23.08.2017 10:01, Paolo Bonzini wrote:
>>> On 23/08/2017 09:49, Thomas Huth wrote:
>>>> While we're at it: I'd like to have a "make check-fast", too. Sometimes
>>>> the normal "make check" is already too slow, e.g. while developing new
>>>> patches, I sometimes just want to do a very quick sanity test to see
>>>> whether I broke some basic things or not, and only do the "make check"
>>>> before I submit my patches.
>>>> So we would get three stages:
>>>>
>>>> - make check-fast => For very, very quick sanity tests only
>>>>
>>>> - make check => E.g. has to be run before submitting patches
>>>>
>>>> - make check-harder => might run a very long time, so best suited for
>>>>                        nightly regression tests etc.?
>>>>
>>>> Does that sound reasonable? And the crucial question: Who is going to
>>>> implement the basic framework for this?
>>>
>>> There's already make check-unit or make check-qtest-x86_64 depending on
>>> what you're working on.
>>
>> True. And I just learned that you can also already set the SPEED
>> variable to either "quick" or "slow" and that we're already using
>> g_test_quick() and g_test_slow() in a couple of places to check this. So
>> the framework for running quick vs. thorough tests is already there ...
>> we just might want to add this to some more tests, I guess...
>>
>> Question for the maintainers and the test automation folks: Is anybody
>> already running "make check SPEED=slow" or is this just rather an
>> unheard-of way of running the tests?
>>
>>> If you have a many-core machine, of course, there's no simpler solution
>>> than throwing more CPUs at it. :)
>>
>> Is it safe nowadays to run "make check -j4" for example? Last time I
>> tried (maybe 1 or 2 years ago), there were still issues since some tests
>> were using hard-coded temporary file names, so the parallel tests were
>> disturbing each other, for example...
>>
> 
> Actually the `.travis.yml` defines `MAKEFLAGS="-j3"`, which results in `make check` being executed with 3 threads...
> 
> I was actually looking at the increasing number of failed travis builds and it seems to be related to the fluctuating performance. Running `nice -n 20 make check -j 12` with `nice -n 5 stress -c 20` in background results in the same kind of failures:
> 
>     File '/tmp/qemu/include/qapi/qmp/qobject.h'
>     Lines executed:0.00% of 9
>     **
>     ERROR:tests/prom-env-test.c:42:check_guest_memory: assertion failed (signature == MAGIC): (0x7c7f1b78 == 0xcafec0de)

I think you're simply running into timeout issues here since you're
likely overloading the host, I guess. Or does your host have that many CPUs?
If the timeouts are really an issue here, we simply might have to
increase the timeout values a bit again...

 Thomas

Lukáš Doktor Aug. 23, 2017, 12:13 p.m. UTC | #24

Dne 23.8.2017 v 14:01 Thomas Huth napsal(a):
> On 23.08.2017 13:51, Lukáš Doktor wrote:
>> Dne 23.8.2017 v 10:35 Thomas Huth napsal(a):
>>> On 23.08.2017 10:01, Paolo Bonzini wrote:
>>>> On 23/08/2017 09:49, Thomas Huth wrote:
>>>>> While we're at it: I'd like to have a "make check-fast", too. Sometimes
>>>>> the normal "make check" is already too slow, e.g. while developing new
>>>>> patches, I sometimes just want to do a very quick sanity test to see
>>>>> whether I broke some basic things or not, and only do the "make check"
>>>>> before I submit my patches.
>>>>> So we would get three stages:
>>>>>
>>>>> - make check-fast => For very, very quick sanity tests only
>>>>>
>>>>> - make check => E.g. has to be run before submitting patches
>>>>>
>>>>> - make check-harder => might run a very long time, so best suited for
>>>>>                        nightly regression tests etc.?
>>>>>
>>>>> Does that sound reasonable? And the crucial question: Who is going to
>>>>> implement the basic framework for this?
>>>>
>>>> There's already make check-unit or make check-qtest-x86_64 depending on
>>>> what you're working on.
>>>
>>> True. And I just learned that you can also already set the SPEED
>>> variable to either "quick" or "slow" and that we're already using
>>> g_test_quick() and g_test_slow() in a couple of places to check this. So
>>> the framework for running quick vs. thorough tests is already there ...
>>> we just might want to add this to some more tests, I guess...
>>>
>>> Question for the maintainers and the test automation folks: Is anybody
>>> already running "make check SPEED=slow" or is this just rather an
>>> unheard-of way of running the tests?
>>>
>>>> If you have a many-core machine, of course, there's no simpler solution
>>>> than throwing more CPUs at it. :)
>>>
>>> Is it safe nowadays to run "make check -j4" for example? Last time I
>>> tried (maybe 1 or 2 years ago), there were still issues since some tests
>>> were using hard-coded temporary file names, so the parallel tests were
>>> disturbing each other, for example...
>>>
>>
>> Actually the `.travis.yml` defines `MAKEFLAGS="-j3"`, which results in `make check` being executed with 3 threads...
>>
>> I was actually looking at the increasing number of failed travis builds and it seems to be related to the fluctuating performance. Running `nice -n 20 make check -j 12` with `nice -n 5 stress -c 20` in background results in the same kind of failures:
>>
>>     File '/tmp/qemu/include/qapi/qmp/qobject.h'
>>     Lines executed:0.00% of 9
>>     **
>>     ERROR:tests/prom-env-test.c:42:check_guest_memory: assertion failed (signature == MAGIC): (0x7c7f1b78 == 0xcafec0de)
> 
> I think you're simply running into timeout issues here since you're
> likely overloading the host, I guess. Or does your host have that many CPUs?
> If the timeouts are really an issue here, we simply might have to
> increase the timeout values a bit again...
> 

The reason I did this is that the `make check -j 12` runs for ~ 4 minutes on my machine (8 cores) while in travis it sometimes runs even more than 40 minutes. I wanted to get closer to see why is it failing and this might be the reason and yes, it's most certainly timeout issue. The problem with Travis is it gives quite decent power, but sometimes it's slowed for couple of seconds, or even minutes. This affects many of our (Avocado) selftests so we usually have timeouts between 10-60s for operations that usually take less than a second.

Lukáš

PS: Usually the `make check -j 12` works well on my machine...

>  Thomas
>

Cornelia Huck Aug. 23, 2017, 12:20 p.m. UTC | #25

On Wed, 23 Aug 2017 10:35:43 +0200
Thomas Huth <thuth@redhat.com> wrote:

> True. And I just learned that you can also already set the SPEED
> variable to either "quick" or "slow" and that we're already using
> g_test_quick() and g_test_slow() in a couple of places to check this. So
> the framework for running quick vs. thorough tests is already there ...
> we just might want to add this to some more tests, I guess...
> 
> Question for the maintainers and the test automation folks: Is anybody
> already running "make check SPEED=slow" or is this just rather an
> unheard-of way of running the tests?

So I tried this on master just for fun, and 'make V=1 SPEED=slow
check-qtest-x86_64' promptly failed for some ivshmem test.

On x86_86:
TEST: tests/ivshmem-test... (pid=3672)
  /x86_64/ivshmem/single:                                              OK
  /x86_64/ivshmem/hotplug:                                             OK
  /x86_64/ivshmem/memdev:                                              OK
  /x86_64/ivshmem/pair:                                                OK
  /x86_64/ivshmem/server-msi:                                          **
ERROR:/home/cohuck/git/qemu/tests/ivshmem-test.c:367:test_ivshmem_server: assertion failed (ret == 0): (1 == 0)
FAIL
GTester: last random seed: R02Scde8fd6835fdf17450c73e2f74f25007
(pid=3697)
 /x86_64/ivshmem/server-irq:                                          OK
FAIL: tests/ivshmem-test

On s390x:
TEST: tests/ivshmem-test... (pid=63617)
  /x86_64/ivshmem/single:                                              OK
  /x86_64/ivshmem/hotplug:                                             OK
  /x86_64/ivshmem/memdev:                                              OK
  /x86_64/ivshmem/pair:                                                OK
  /x86_64/ivshmem/server-msi:                                          qemu-system-x86_64: -device ivshmem-doorbell,chardev=chr0,vectors=2: server sent invalid ID message
Broken pipe
FAIL
GTester: last random seed: R02Sda000f7be5ce27b3dfbb03d12f297b69
(pid=63640)
  /x86_64/ivshmem/server-irq:                                          qemu-system-x86_64: -device ivshmem,size=1M,msi=off,chardev=chr0,vectors=2: server sent invalid ID message
Broken pipe
FAIL
GTester: last random seed: R02S5a236dbcac35545cc34c0131fbc06162
(pid=63648)
FAIL: tests/ivshmem-test

Both machines are on Fedora 26.

(On the bright side, the rest of the test seemed fine.)

Thomas Huth Aug. 23, 2017, 12:35 p.m. UTC | #26

On 23.08.2017 13:51, Lukáš Doktor wrote:
[...]
> and all ERROR lines:
> 
>     ERROR:tests/prom-env-test.c:42:check_guest_memory: assertion failed (signature == MAGIC): (0x00000000 == 0xcafec0de)

This test uses a timeout of 120 s, see check_guest_memory() in
tests/prom-env-test.c ... that's a lot already, but if you hit this on a
real system, feel free to send a patch to increase the timeout.

>     ERROR:tests/rtc-test.c:173:check_time: assertion failed (ABS(t - s) <= wiggle): (3 <= 2)

You might need to increase the "wiggle" variable here.

>     ERROR:tests/vhost-user-test.c:779:test_reconnect: child process (/i386/vhost-user/reconnect/subprocess [5264]) failed unexpectedly
>     ERROR:tests/vhost-user-test.c:807:test_connect_fail: child process (/x86_64/vhost-user/connect-fail/subprocess [7751]) failed unexpectedly
>     ERROR:tests/vhost-user-test.c:835:test_flags_mismatch: child process (/x86_64/vhost-user/flags-mismatch/subprocess [7892]) failed unexpectedly

I've got no clue about what could be wrong here ... maybe Marc-André can
advise here (who wrote these tests if I got "git blame" right).

>     ERROR:tests/boot-sector.c:127:boot_sector_test: assertion failed (signature == SIGNATURE): (0x00000000 == 0x0000dead)

Uses a timeout of 90 s ... feel free to send a patch to increase it if
it's necessary (the timeout can be found in boot_sector_test() in
tests/boot-sector.c).

 Thomas

Lukáš Doktor Aug. 23, 2017, 12:52 p.m. UTC | #27

Dne 23.8.2017 v 14:35 Thomas Huth napsal(a):
> On 23.08.2017 13:51, Lukáš Doktor wrote:
> [...]
>> and all ERROR lines:
>>
>>     ERROR:tests/prom-env-test.c:42:check_guest_memory: assertion failed (signature == MAGIC): (0x00000000 == 0xcafec0de)
> 
> This test uses a timeout of 120 s, see check_guest_memory() in
> tests/prom-env-test.c ... that's a lot already, but if you hit this on a
> real system, feel free to send a patch to increase the timeout.
> 
>>     ERROR:tests/rtc-test.c:173:check_time: assertion failed (ABS(t - s) <= wiggle): (3 <= 2)
> 
> You might need to increase the "wiggle" variable here.
> 
>>     ERROR:tests/vhost-user-test.c:779:test_reconnect: child process (/i386/vhost-user/reconnect/subprocess [5264]) failed unexpectedly
>>     ERROR:tests/vhost-user-test.c:807:test_connect_fail: child process (/x86_64/vhost-user/connect-fail/subprocess [7751]) failed unexpectedly
>>     ERROR:tests/vhost-user-test.c:835:test_flags_mismatch: child process (/x86_64/vhost-user/flags-mismatch/subprocess [7892]) failed unexpectedly
> 
> I've got no clue about what could be wrong here ... maybe Marc-André can
> advise here (who wrote these tests if I got "git blame" right).
> 
>>     ERROR:tests/boot-sector.c:127:boot_sector_test: assertion failed (signature == SIGNATURE): (0x00000000 == 0x0000dead)
> 
> Uses a timeout of 90 s ... feel free to send a patch to increase it if
> it's necessary (the timeout can be found in boot_sector_test() in
> tests/boot-sector.c).
> 
>  Thomas
> 

Well this one was produced by the `stress` command, so it does not reflect the reality. So this time I took the time and looked at the actual failures in David's travis results (his builds were already migrated to trusty and are failing massively) https://travis-ci.org/dgibson/qemu/builds I see only the `vhost-user-test` failing:

ERROR:tests/vhost-user-test.c:779:test_reconnect: child process (/x86_64/vhost-user/reconnect/subprocess [58907]) failed unexpectedly
ERROR:tests/vhost-user-test.c:807:test_connect_fail: child process (/x86_64/vhost-user/connect-fail/subprocess [58924]) failed unexpectedly
ERROR:tests/vhost-user-test.c:835:test_flags_mismatch: child process (/i386/vhost-user/flags-mismatch/subprocess [56261]) failed unexpectedly

I actually looked there but I have no clue of how to fix this issue. But most probably it'd make Travis happier...

Lukáš

David Gibson Aug. 24, 2017, 3:01 a.m. UTC | #28

On Wed, Aug 23, 2017 at 11:33:55AM +0200, Paolo Bonzini wrote:
> On 23/08/2017 10:35, Thomas Huth wrote:
> >> If you have a many-core machine, of course, there's no simpler solution
> >> than throwing more CPUs at it. :)
> > Is it safe nowadays to run "make check -j4" for example? Last time I
> > tried (maybe 1 or 2 years ago), there were still issues since some tests
> > were using hard-coded temporary file names, so the parallel tests were
> > disturbing each other, for example...
> 
> I run -j18 all the time. :)

Nice for those that have 18-way machines at their regular disposal...

Thomas Huth Aug. 24, 2017, 5:27 a.m. UTC | #29

On 23.08.2017 14:20, Cornelia Huck wrote:
> On Wed, 23 Aug 2017 10:35:43 +0200
> Thomas Huth <thuth@redhat.com> wrote:
> 
>> True. And I just learned that you can also already set the SPEED
>> variable to either "quick" or "slow" and that we're already using
>> g_test_quick() and g_test_slow() in a couple of places to check this. So
>> the framework for running quick vs. thorough tests is already there ...
>> we just might want to add this to some more tests, I guess...
>>
>> Question for the maintainers and the test automation folks: Is anybody
>> already running "make check SPEED=slow" or is this just rather an
>> unheard-of way of running the tests?
> 
> So I tried this on master just for fun, and 'make V=1 SPEED=slow
> check-qtest-x86_64' promptly failed for some ivshmem test.
> 
> On x86_86:
> TEST: tests/ivshmem-test... (pid=3672)
>   /x86_64/ivshmem/single:                                              OK
>   /x86_64/ivshmem/hotplug:                                             OK
>   /x86_64/ivshmem/memdev:                                              OK
>   /x86_64/ivshmem/pair:                                                OK
>   /x86_64/ivshmem/server-msi:                                          **
> ERROR:/home/cohuck/git/qemu/tests/ivshmem-test.c:367:test_ivshmem_server: assertion failed (ret == 0): (1 == 0)
> FAIL
> GTester: last random seed: R02Scde8fd6835fdf17450c73e2f74f25007
> (pid=3697)
>  /x86_64/ivshmem/server-irq:                                          OK
> FAIL: tests/ivshmem-test
> 
> On s390x:
> TEST: tests/ivshmem-test... (pid=63617)
>   /x86_64/ivshmem/single:                                              OK
>   /x86_64/ivshmem/hotplug:                                             OK
>   /x86_64/ivshmem/memdev:                                              OK
>   /x86_64/ivshmem/pair:                                                OK
>   /x86_64/ivshmem/server-msi:                                          qemu-system-x86_64: -device ivshmem-doorbell,chardev=chr0,vectors=2: server sent invalid ID message
> Broken pipe
> FAIL
> GTester: last random seed: R02Sda000f7be5ce27b3dfbb03d12f297b69
> (pid=63640)
>   /x86_64/ivshmem/server-irq:                                          qemu-system-x86_64: -device ivshmem,size=1M,msi=off,chardev=chr0,vectors=2: server sent invalid ID message
> Broken pipe
> FAIL
> GTester: last random seed: R02S5a236dbcac35545cc34c0131fbc06162
> (pid=63648)
> FAIL: tests/ivshmem-test
> 
> Both machines are on Fedora 26.

The ivshmem test fails for me with SPEED=slow, too (on a x86 RHEL7
machine). Looks like it is definitely broken. Could anybody with some
ivshmem knowledge please have a look at this?

 Thanks,
  Thomas

Cleber Rosa Aug. 24, 2017, 1:52 p.m. UTC | #30

On 08/23/2017 08:13 AM, Lukáš Doktor wrote:
> Dne 23.8.2017 v 14:01 Thomas Huth napsal(a):
>> On 23.08.2017 13:51, Lukáš Doktor wrote:
>>> Dne 23.8.2017 v 10:35 Thomas Huth napsal(a):
>>>> On 23.08.2017 10:01, Paolo Bonzini wrote:
>>>>> On 23/08/2017 09:49, Thomas Huth wrote:
>>>>>> While we're at it: I'd like to have a "make check-fast", too. Sometimes
>>>>>> the normal "make check" is already too slow, e.g. while developing new
>>>>>> patches, I sometimes just want to do a very quick sanity test to see
>>>>>> whether I broke some basic things or not, and only do the "make check"
>>>>>> before I submit my patches.
>>>>>> So we would get three stages:
>>>>>>
>>>>>> - make check-fast => For very, very quick sanity tests only
>>>>>>
>>>>>> - make check => E.g. has to be run before submitting patches
>>>>>>
>>>>>> - make check-harder => might run a very long time, so best suited for
>>>>>>                        nightly regression tests etc.?
>>>>>>
>>>>>> Does that sound reasonable? And the crucial question: Who is going to
>>>>>> implement the basic framework for this?
>>>>>
>>>>> There's already make check-unit or make check-qtest-x86_64 depending on
>>>>> what you're working on.
>>>>
>>>> True. And I just learned that you can also already set the SPEED
>>>> variable to either "quick" or "slow" and that we're already using
>>>> g_test_quick() and g_test_slow() in a couple of places to check this. So
>>>> the framework for running quick vs. thorough tests is already there ...
>>>> we just might want to add this to some more tests, I guess...
>>>>
>>>> Question for the maintainers and the test automation folks: Is anybody
>>>> already running "make check SPEED=slow" or is this just rather an
>>>> unheard-of way of running the tests?
>>>>
>>>>> If you have a many-core machine, of course, there's no simpler solution
>>>>> than throwing more CPUs at it. :)
>>>>
>>>> Is it safe nowadays to run "make check -j4" for example? Last time I
>>>> tried (maybe 1 or 2 years ago), there were still issues since some tests
>>>> were using hard-coded temporary file names, so the parallel tests were
>>>> disturbing each other, for example...
>>>>
>>>
>>> Actually the `.travis.yml` defines `MAKEFLAGS="-j3"`, which results in `make check` being executed with 3 threads...
>>>
>>> I was actually looking at the increasing number of failed travis builds and it seems to be related to the fluctuating performance. Running `nice -n 20 make check -j 12` with `nice -n 5 stress -c 20` in background results in the same kind of failures:
>>>
>>>     File '/tmp/qemu/include/qapi/qmp/qobject.h'
>>>     Lines executed:0.00% of 9
>>>     **
>>>     ERROR:tests/prom-env-test.c:42:check_guest_memory: assertion failed (signature == MAGIC): (0x7c7f1b78 == 0xcafec0de)
>>
>> I think you're simply running into timeout issues here since you're
>> likely overloading the host, I guess. Or does your host have that many CPUs?
>> If the timeouts are really an issue here, we simply might have to
>> increase the timeout values a bit again...
>>
> 
> The reason I did this is that the `make check -j 12` runs for ~ 4 minutes on my machine (8 cores) while in travis it sometimes runs even more than 40 minutes. I wanted to get closer to see why is it failing and this might be the reason and yes, it's most certainly timeout issue. The problem with Travis is it gives quite decent power, but sometimes it's slowed for couple of seconds, or even minutes. This affects many of our (Avocado) selftests so we usually have timeouts between 10-60s for operations that usually take less than a second.

And we learned that even huge timeouts are not a valid solution, in fact
they can become a counter solution.  On many environments it's not wise
to even run tests are time sensitive.  Travis is one, Fedora package
build hosts are another example.

IMO, this signals that test metadata (and categorization on top of it)
can be of great help in many situations.  Instead of writing a multitude
of "make check-(you-characteristics-here)" targets, something like "make
EXCLUDE=time_senstitive check" is far more flexible.

Using Avocado as an example, it'd mean running something like:

 $ avocado run -t '-time_sensitive,-superuser' tests/*

- Cleber.

> 
> Lukáš
> 
> PS: Usually the `make check -j 12` works well on my machine...
> 
>>  Thomas
>>
> 
>

Thomas Huth Aug. 29, 2017, 4:34 p.m. UTC | #31

On 23.08.2017 14:20, Cornelia Huck wrote:
> On Wed, 23 Aug 2017 10:35:43 +0200
> Thomas Huth <thuth@redhat.com> wrote:
> 
>> True. And I just learned that you can also already set the SPEED
>> variable to either "quick" or "slow" and that we're already using
>> g_test_quick() and g_test_slow() in a couple of places to check this. So
>> the framework for running quick vs. thorough tests is already there ...
>> we just might want to add this to some more tests, I guess...
>>
>> Question for the maintainers and the test automation folks: Is anybody
>> already running "make check SPEED=slow" or is this just rather an
>> unheard-of way of running the tests?
> 
> So I tried this on master just for fun, and 'make V=1 SPEED=slow
> check-qtest-x86_64' promptly failed for some ivshmem test.
> 
> On x86_86:
> TEST: tests/ivshmem-test... (pid=3672)
>   /x86_64/ivshmem/single:                                              OK
>   /x86_64/ivshmem/hotplug:                                             OK
>   /x86_64/ivshmem/memdev:                                              OK
>   /x86_64/ivshmem/pair:                                                OK
>   /x86_64/ivshmem/server-msi:                                          **
> ERROR:/home/cohuck/git/qemu/tests/ivshmem-test.c:367:test_ivshmem_server: assertion failed (ret == 0): (1 == 0)
> FAIL
> GTester: last random seed: R02Scde8fd6835fdf17450c73e2f74f25007
> (pid=3697)
>  /x86_64/ivshmem/server-irq:                                          OK
> FAIL: tests/ivshmem-test

Bisecting this problem automatically ("git bisect run" rules!) revealed
that this test broke with this commit:

	commit b4ba67d9a702507793c2724e56f98e9b0f7be02b
	Author: David Gibson <david@gibson.dropbear.id.au>
	Title: libqos: Change PCI accessors to take opaque BAR handle

David, any ideas what's going wrong here?

 Thomas

Thomas Huth Aug. 29, 2017, 5:13 p.m. UTC | #32

On 29.08.2017 18:34, Thomas Huth wrote:
> On 23.08.2017 14:20, Cornelia Huck wrote:
>> On Wed, 23 Aug 2017 10:35:43 +0200
>> Thomas Huth <thuth@redhat.com> wrote:
>>
>>> True. And I just learned that you can also already set the SPEED
>>> variable to either "quick" or "slow" and that we're already using
>>> g_test_quick() and g_test_slow() in a couple of places to check this. So
>>> the framework for running quick vs. thorough tests is already there ...
>>> we just might want to add this to some more tests, I guess...
>>>
>>> Question for the maintainers and the test automation folks: Is anybody
>>> already running "make check SPEED=slow" or is this just rather an
>>> unheard-of way of running the tests?
>>
>> So I tried this on master just for fun, and 'make V=1 SPEED=slow
>> check-qtest-x86_64' promptly failed for some ivshmem test.
>>
>> On x86_86:
>> TEST: tests/ivshmem-test... (pid=3672)
>>   /x86_64/ivshmem/single:                                              OK
>>   /x86_64/ivshmem/hotplug:                                             OK
>>   /x86_64/ivshmem/memdev:                                              OK
>>   /x86_64/ivshmem/pair:                                                OK
>>   /x86_64/ivshmem/server-msi:                                          **
>> ERROR:/home/cohuck/git/qemu/tests/ivshmem-test.c:367:test_ivshmem_server: assertion failed (ret == 0): (1 == 0)
>> FAIL
>> GTester: last random seed: R02Scde8fd6835fdf17450c73e2f74f25007
>> (pid=3697)
>>  /x86_64/ivshmem/server-irq:                                          OK
>> FAIL: tests/ivshmem-test
> 
> Bisecting this problem automatically ("git bisect run" rules!) revealed
> that this test broke with this commit:
> 
> 	commit b4ba67d9a702507793c2724e56f98e9b0f7be02b
> 	Author: David Gibson <david@gibson.dropbear.id.au>
> 	Title: libqos: Change PCI accessors to take opaque BAR handle
> 
> David, any ideas what's going wrong here?

Never mind, I've found the problem: dev->msix_pba_bar is not properly
initialized anymore if bir_pba == bir_table. I'm working on a patch...

 Thomas

Cornelia Huck Aug. 30, 2017, 8:59 a.m. UTC | #33

On Thu, 24 Aug 2017 07:27:18 +0200
Thomas Huth <thuth@redhat.com> wrote:

> On 23.08.2017 14:20, Cornelia Huck wrote:

> > So I tried this on master just for fun, and 'make V=1 SPEED=slow
> > check-qtest-x86_64' promptly failed for some ivshmem test.
> > 
> > On x86_86:
> > TEST: tests/ivshmem-test... (pid=3672)
> >   /x86_64/ivshmem/single:                                              OK
> >   /x86_64/ivshmem/hotplug:                                             OK
> >   /x86_64/ivshmem/memdev:                                              OK
> >   /x86_64/ivshmem/pair:                                                OK
> >   /x86_64/ivshmem/server-msi:                                          **
> > ERROR:/home/cohuck/git/qemu/tests/ivshmem-test.c:367:test_ivshmem_server: assertion failed (ret == 0): (1 == 0)
> > FAIL
> > GTester: last random seed: R02Scde8fd6835fdf17450c73e2f74f25007
> > (pid=3697)
> >  /x86_64/ivshmem/server-irq:                                          OK
> > FAIL: tests/ivshmem-test

I think this one is understood now...

> > 
> > On s390x:
> > TEST: tests/ivshmem-test... (pid=63617)
> >   /x86_64/ivshmem/single:                                              OK
> >   /x86_64/ivshmem/hotplug:                                             OK
> >   /x86_64/ivshmem/memdev:                                              OK
> >   /x86_64/ivshmem/pair:                                                OK
> >   /x86_64/ivshmem/server-msi:                                          qemu-system-x86_64: -device ivshmem-doorbell,chardev=chr0,vectors=2: server sent invalid ID message
> > Broken pipe
> > FAIL
> > GTester: last random seed: R02Sda000f7be5ce27b3dfbb03d12f297b69
> > (pid=63640)
> >   /x86_64/ivshmem/server-irq:                                          qemu-system-x86_64: -device ivshmem,size=1M,msi=off,chardev=chr0,vectors=2: server sent invalid ID message
> > Broken pipe
> > FAIL
> > GTester: last random seed: R02S5a236dbcac35545cc34c0131fbc06162
> > (pid=63648)
> > FAIL: tests/ivshmem-test

...but this is a different problem (i.e., it can't be tracked down to
commit b4ba67d9a7). I'm not sure whether that one ever worked. Might be
an endianness problem (a quick test on another BE platform could
confirm.)

> > 
> > Both machines are on Fedora 26.  
> 
> The ivshmem test fails for me with SPEED=slow, too (on a x86 RHEL7
> machine). Looks like it is definitely broken. Could anybody with some
> ivshmem knowledge please have a look at this?
> 
>  Thanks,
>   Thomas

Thomas Huth Aug. 30, 2017, 9:13 a.m. UTC | #34

On 30.08.2017 10:59, Cornelia Huck wrote:
[...]
>>> On s390x:
>>> TEST: tests/ivshmem-test... (pid=63617)
>>>   /x86_64/ivshmem/single:                                              OK
>>>   /x86_64/ivshmem/hotplug:                                             OK
>>>   /x86_64/ivshmem/memdev:                                              OK
>>>   /x86_64/ivshmem/pair:                                                OK
>>>   /x86_64/ivshmem/server-msi:                                          qemu-system-x86_64: -device ivshmem-doorbell,chardev=chr0,vectors=2: server sent invalid ID message
>>> Broken pipe
>>> FAIL
>>> GTester: last random seed: R02Sda000f7be5ce27b3dfbb03d12f297b69
>>> (pid=63640)
>>>   /x86_64/ivshmem/server-irq:                                          qemu-system-x86_64: -device ivshmem,size=1M,msi=off,chardev=chr0,vectors=2: server sent invalid ID message
>>> Broken pipe
>>> FAIL
>>> GTester: last random seed: R02S5a236dbcac35545cc34c0131fbc06162
>>> (pid=63648)
>>> FAIL: tests/ivshmem-test
> 
> ...but this is a different problem (i.e., it can't be tracked down to
> commit b4ba67d9a7). I'm not sure whether that one ever worked. Might be
> an endianness problem (a quick test on another BE platform could
> confirm.)

This also fails on a big endian ppc64 host machine:

$ uname -m
ppc64
$ V=1 QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64 tests/ivshmem-test -m slow
/x86_64/ivshmem/single: OK
/x86_64/ivshmem/hotplug: OK
/x86_64/ivshmem/memdev: OK
/x86_64/ivshmem/pair: OK
/x86_64/ivshmem/server-msi: qemu-system-x86_64: -device ivshmem-doorbell,chardev=chr0,vectors=2: server sent invalid ID message
Broken pipe

 Thomas

Markus Armbruster Aug. 30, 2017, 11:52 a.m. UTC | #35

Thomas Huth <thuth@redhat.com> writes:

> On 30.08.2017 10:59, Cornelia Huck wrote:
> [...]
>>>> On s390x:
>>>> TEST: tests/ivshmem-test... (pid=63617)
>>>>   /x86_64/ivshmem/single:                                              OK
>>>>   /x86_64/ivshmem/hotplug:                                             OK
>>>>   /x86_64/ivshmem/memdev:                                              OK
>>>>   /x86_64/ivshmem/pair:                                                OK
>>>>   /x86_64/ivshmem/server-msi:                                          qemu-system-x86_64: -device ivshmem-doorbell,chardev=chr0,vectors=2: server sent invalid ID message
>>>> Broken pipe
>>>> FAIL
>>>> GTester: last random seed: R02Sda000f7be5ce27b3dfbb03d12f297b69
>>>> (pid=63640)
>>>>   /x86_64/ivshmem/server-irq:                                          qemu-system-x86_64: -device ivshmem,size=1M,msi=off,chardev=chr0,vectors=2: server sent invalid ID message
>>>> Broken pipe
>>>> FAIL
>>>> GTester: last random seed: R02S5a236dbcac35545cc34c0131fbc06162
>>>> (pid=63648)
>>>> FAIL: tests/ivshmem-test
>> 
>> ...but this is a different problem (i.e., it can't be tracked down to
>> commit b4ba67d9a7). I'm not sure whether that one ever worked. Might be
>> an endianness problem (a quick test on another BE platform could
>> confirm.)
>
> This also fails on a big endian ppc64 host machine:
>
> $ uname -m
> ppc64
> $ V=1 QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64 tests/ivshmem-test -m slow
> /x86_64/ivshmem/single: OK
> /x86_64/ivshmem/hotplug: OK
> /x86_64/ivshmem/memdev: OK
> /x86_64/ivshmem/pair: OK
> /x86_64/ivshmem/server-msi: qemu-system-x86_64: -device ivshmem-doorbell,chardev=chr0,vectors=2: server sent invalid ID message
> Broken pipe

Botched endian conversion?

Thomas Huth Aug. 30, 2017, 1:56 p.m. UTC | #36

On 30.08.2017 13:52, Markus Armbruster wrote:
> Thomas Huth <thuth@redhat.com> writes:
[...]
>> This also fails on a big endian ppc64 host machine:
>>
>> $ uname -m
>> ppc64
>> $ V=1 QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64 tests/ivshmem-test -m slow
>> /x86_64/ivshmem/single: OK
>> /x86_64/ivshmem/hotplug: OK
>> /x86_64/ivshmem/memdev: OK
>> /x86_64/ivshmem/pair: OK
>> /x86_64/ivshmem/server-msi: qemu-system-x86_64: -device ivshmem-doorbell,chardev=chr0,vectors=2: server sent invalid ID message
>> Broken pipe
> 
> Botched endian conversion?

Yes, it's an endianess problem. I just sent a patch, title
"hw/misc/ivshmem: Fix ivshmem_recv_msg() to also work on big endian
systems" (forgot to put you on CC:, sorry!).

 Thomas

Markus Armbruster Aug. 30, 2017, 2:19 p.m. UTC | #37

Thomas Huth <thuth@redhat.com> writes:

> On 30.08.2017 13:52, Markus Armbruster wrote:
>> Thomas Huth <thuth@redhat.com> writes:
> [...]
>>> This also fails on a big endian ppc64 host machine:
>>>
>>> $ uname -m
>>> ppc64
>>> $ V=1 QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64 tests/ivshmem-test -m slow
>>> /x86_64/ivshmem/single: OK
>>> /x86_64/ivshmem/hotplug: OK
>>> /x86_64/ivshmem/memdev: OK
>>> /x86_64/ivshmem/pair: OK
>>> /x86_64/ivshmem/server-msi: qemu-system-x86_64: -device ivshmem-doorbell,chardev=chr0,vectors=2: server sent invalid ID message
>>> Broken pipe
>> 
>> Botched endian conversion?
>
> Yes, it's an endianess problem. I just sent a patch, title
> "hw/misc/ivshmem: Fix ivshmem_recv_msg() to also work on big endian
> systems" (forgot to put you on CC:, sorry!).

No need to be sorry!  Because I always feel sorry when I have to review
another ivshmem patch ;-}

[for-2.10] boot-serial-test: prefer tcg accelerator

Commit Message

Comments

Patch