mbox

[PULL,00/16] migration queue

Message ID 20201026161952.149188-1-dgilbert@redhat.com
State New
Headers show

Pull-request

git://github.com/dagrh/qemu.git tags/pull-migration-20201026a

Message

Dr. David Alan Gilbert Oct. 26, 2020, 4:19 p.m. UTC
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

The following changes since commit a46e72710566eea0f90f9c673a0f02da0064acce:

  Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20201026' into staging (2020-10-26 14:50:03 +0000)

are available in the Git repository at:

  git://github.com/dagrh/qemu.git tags/pull-migration-20201026a

for you to fetch changes up to a47295014de56e108f359ec859d5499b851f62b8:

  migration-test: Only hide error if !QTEST_LOG (2020-10-26 16:15:04 +0000)

----------------------------------------------------------------
migration pull: 2020-10-26

Another go at Peter's postcopy fixes

Cleanups from Bihong Yu and Peter Maydell.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

----------------------------------------------------------------
Bihong Yu (9):
      migration: Do not use C99 // comments
      migration: Don't use '#' flag of printf format
      migration: Add spaces around operator
      migration: Open brace '{' following struct go on the same line
      migration: Add braces {} for if statement
      migration: Do not initialise statics and globals to 0 or NULL
      migration: Open brace '{' following function declarations go on the next line
      migration: Delete redundant spaces
      migration: using trace_ to replace DPRINTF

Peter Maydell (1):
      migration: Drop unused VMSTATE_FLOAT64 support

Peter Xu (6):
      migration: Pass incoming state into qemu_ufd_copy_ioctl()
      migration: Introduce migrate_send_rp_message_req_pages()
      migration: Maintain postcopy faulted addresses
      migration: Sync requested pages after postcopy recovery
      migration/postcopy: Release fd before going into 'postcopy-pause'
      migration-test: Only hide error if !QTEST_LOG

 include/migration/vmstate.h  | 13 ----------
 migration/block.c            | 40 ++++++++++++++---------------
 migration/migration.c        | 59 +++++++++++++++++++++++++++++++++++++-----
 migration/migration.h        | 24 ++++++++++++++---
 migration/page_cache.c       | 13 +++-------
 migration/postcopy-ram.c     | 27 +++++++++++++++-----
 migration/ram.c              | 14 +++++-----
 migration/rdma.c             |  7 ++---
 migration/savevm.c           | 61 ++++++++++++++++++++++++++++++++++++++++++--
 migration/trace-events       | 16 ++++++++++++
 migration/vmstate-types.c    | 26 -------------------
 migration/vmstate.c          | 10 ++++----
 tests/qtest/migration-test.c |  6 ++++-
 13 files changed, 213 insertions(+), 103 deletions(-)

Comments

no-reply@patchew.org Oct. 26, 2020, 4:39 p.m. UTC | #1
Patchew URL: https://patchew.org/QEMU/20201026161952.149188-1-dgilbert@redhat.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20201026161952.149188-1-dgilbert@redhat.com
Subject: [PULL 00/16] migration queue

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 - [tag update]      patchew/1603704987-20977-1-git-send-email-kwankhede@nvidia.com -> patchew/1603704987-20977-1-git-send-email-kwankhede@nvidia.com
 * [new tag]         patchew/20201026161952.149188-1-dgilbert@redhat.com -> patchew/20201026161952.149188-1-dgilbert@redhat.com
Switched to a new branch 'test'
eec7173 migration-test: Only hide error if !QTEST_LOG
3cf5619 migration/postcopy: Release fd before going into 'postcopy-pause'
05826d6 migration: Sync requested pages after postcopy recovery
e3ab9bd migration: Maintain postcopy faulted addresses
94c47fd migration: Introduce migrate_send_rp_message_req_pages()
37f5d13 migration: Pass incoming state into qemu_ufd_copy_ioctl()
d212778 migration: using trace_ to replace DPRINTF
57fda59 migration: Delete redundant spaces
5b093f1 migration: Open brace '{' following function declarations go on the next line
0b60dca migration: Do not initialise statics and globals to 0 or NULL
2f219b6 migration: Add braces {} for if statement
3fdcabc migration: Open brace '{' following struct go on the same line
63bb26e migration: Add spaces around operator
fadaa39 migration: Don't use '#' flag of printf format
e568828 migration: Do not use C99 // comments
e477d01 migration: Drop unused VMSTATE_FLOAT64 support

=== OUTPUT BEGIN ===
1/16 Checking commit e477d010bf93 (migration: Drop unused VMSTATE_FLOAT64 support)
2/16 Checking commit e568828c0e63 (migration: Do not use C99 // comments)
3/16 Checking commit fadaa39cfb42 (migration: Don't use '#' flag of printf format)
4/16 Checking commit 63bb26e930c4 (migration: Add spaces around operator)
ERROR: spaces required around that '*' (ctx:WxV)
#65: FILE: migration/savevm.c:523:
+    .subsections = (const VMStateDescription *[]) {
                                              ^

total: 1 errors, 0 warnings, 59 lines checked

Patch 4/16 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

5/16 Checking commit 3fdcabce2015 (migration: Open brace '{' following struct go on the same line)
6/16 Checking commit 2f219b67cf64 (migration: Add braces {} for if statement)
7/16 Checking commit 0b60dca56f19 (migration: Do not initialise statics and globals to 0 or NULL)
8/16 Checking commit 5b093f15f482 (migration: Open brace '{' following function declarations go on the next line)
9/16 Checking commit 57fda592d44b (migration: Delete redundant spaces)
10/16 Checking commit d21277810b4a (migration: using trace_ to replace DPRINTF)
11/16 Checking commit 37f5d13fa1bb (migration: Pass incoming state into qemu_ufd_copy_ioctl())
12/16 Checking commit 94c47fdc32f4 (migration: Introduce migrate_send_rp_message_req_pages())
13/16 Checking commit e3ab9bded382 (migration: Maintain postcopy faulted addresses)
14/16 Checking commit 05826d6ec62a (migration: Sync requested pages after postcopy recovery)
15/16 Checking commit 3cf5619e375b (migration/postcopy: Release fd before going into 'postcopy-pause')
16/16 Checking commit eec7173d989d (migration-test: Only hide error if !QTEST_LOG)
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20201026161952.149188-1-dgilbert@redhat.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
Dr. David Alan Gilbert Oct. 26, 2020, 4:52 p.m. UTC | #2
* no-reply@patchew.org (no-reply@patchew.org) wrote:
> Patchew URL: https://patchew.org/QEMU/20201026161952.149188-1-dgilbert@redhat.com/
> 
> 
> 
> Hi,
> 
> This series seems to have some coding style problems. See output below for
> more information:
> 
> Type: series
> Message-id: 20201026161952.149188-1-dgilbert@redhat.com
> Subject: [PULL 00/16] migration queue
> 
> === TEST SCRIPT BEGIN ===
> #!/bin/bash
> git rev-parse base > /dev/null || exit 0
> git config --local diff.renamelimit 0
> git config --local diff.renames True
> git config --local diff.algorithm histogram
> ./scripts/checkpatch.pl --mailback base..
> === TEST SCRIPT END ===
> 
> Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
> From https://github.com/patchew-project/qemu
>  - [tag update]      patchew/1603704987-20977-1-git-send-email-kwankhede@nvidia.com -> patchew/1603704987-20977-1-git-send-email-kwankhede@nvidia.com
>  * [new tag]         patchew/20201026161952.149188-1-dgilbert@redhat.com -> patchew/20201026161952.149188-1-dgilbert@redhat.com
> Switched to a new branch 'test'
> eec7173 migration-test: Only hide error if !QTEST_LOG
> 3cf5619 migration/postcopy: Release fd before going into 'postcopy-pause'
> 05826d6 migration: Sync requested pages after postcopy recovery
> e3ab9bd migration: Maintain postcopy faulted addresses
> 94c47fd migration: Introduce migrate_send_rp_message_req_pages()
> 37f5d13 migration: Pass incoming state into qemu_ufd_copy_ioctl()
> d212778 migration: using trace_ to replace DPRINTF
> 57fda59 migration: Delete redundant spaces
> 5b093f1 migration: Open brace '{' following function declarations go on the next line
> 0b60dca migration: Do not initialise statics and globals to 0 or NULL
> 2f219b6 migration: Add braces {} for if statement
> 3fdcabc migration: Open brace '{' following struct go on the same line
> 63bb26e migration: Add spaces around operator
> fadaa39 migration: Don't use '#' flag of printf format
> e568828 migration: Do not use C99 // comments
> e477d01 migration: Drop unused VMSTATE_FLOAT64 support
> 
> === OUTPUT BEGIN ===
> 1/16 Checking commit e477d010bf93 (migration: Drop unused VMSTATE_FLOAT64 support)
> 2/16 Checking commit e568828c0e63 (migration: Do not use C99 // comments)
> 3/16 Checking commit fadaa39cfb42 (migration: Don't use '#' flag of printf format)
> 4/16 Checking commit 63bb26e930c4 (migration: Add spaces around operator)
> ERROR: spaces required around that '*' (ctx:WxV)
> #65: FILE: migration/savevm.c:523:
> +    .subsections = (const VMStateDescription *[]) {
>                                               ^
> 
> total: 1 errors, 0 warnings, 59 lines checked

We decided that was preferable

> 
> Patch 4/16 has style problems, please review.  If any of these errors
> are false positives report them to the maintainer, see
> CHECKPATCH in MAINTAINERS.
> 
> 5/16 Checking commit 3fdcabce2015 (migration: Open brace '{' following struct go on the same line)
> 6/16 Checking commit 2f219b67cf64 (migration: Add braces {} for if statement)
> 7/16 Checking commit 0b60dca56f19 (migration: Do not initialise statics and globals to 0 or NULL)
> 8/16 Checking commit 5b093f15f482 (migration: Open brace '{' following function declarations go on the next line)
> 9/16 Checking commit 57fda592d44b (migration: Delete redundant spaces)
> 10/16 Checking commit d21277810b4a (migration: using trace_ to replace DPRINTF)
> 11/16 Checking commit 37f5d13fa1bb (migration: Pass incoming state into qemu_ufd_copy_ioctl())
> 12/16 Checking commit 94c47fdc32f4 (migration: Introduce migrate_send_rp_message_req_pages())
> 13/16 Checking commit e3ab9bded382 (migration: Maintain postcopy faulted addresses)
> 14/16 Checking commit 05826d6ec62a (migration: Sync requested pages after postcopy recovery)
> 15/16 Checking commit 3cf5619e375b (migration/postcopy: Release fd before going into 'postcopy-pause')
> 16/16 Checking commit eec7173d989d (migration-test: Only hide error if !QTEST_LOG)
> === OUTPUT END ===
> 
> Test command exited with code: 1
> 
> 
> The full log is available at
> http://patchew.org/logs/20201026161952.149188-1-dgilbert@redhat.com/testing.checkpatch/?type=message.
> ---
> Email generated automatically by Patchew [https://patchew.org/].
> Please send your feedback to patchew-devel@redhat.com
Peter Maydell Oct. 27, 2020, 11:28 a.m. UTC | #3
On Mon, 26 Oct 2020 at 16:20, Dr. David Alan Gilbert (git)
<dgilbert@redhat.com> wrote:
>
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>
> The following changes since commit a46e72710566eea0f90f9c673a0f02da0064acce:
>
>   Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20201026' into staging (2020-10-26 14:50:03 +0000)
>
> are available in the Git repository at:
>
>   git://github.com/dagrh/qemu.git tags/pull-migration-20201026a
>
> for you to fetch changes up to a47295014de56e108f359ec859d5499b851f62b8:
>
>   migration-test: Only hide error if !QTEST_LOG (2020-10-26 16:15:04 +0000)
>
> ----------------------------------------------------------------
> migration pull: 2020-10-26
>
> Another go at Peter's postcopy fixes
>
> Cleanups from Bihong Yu and Peter Maydell.
>
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/5.2
for any user-visible changes.

-- PMM
Christian Schoenebeck Oct. 31, 2020, 4:12 p.m. UTC | #4
On Montag, 26. Oktober 2020 17:19:36 CET Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> 
> The following changes since commit a46e72710566eea0f90f9c673a0f02da0064acce:
> 
>   Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20201026' into
> staging (2020-10-26 14:50:03 +0000)
> 
> are available in the Git repository at:
> 
>   git://github.com/dagrh/qemu.git tags/pull-migration-20201026a
> 
> for you to fetch changes up to a47295014de56e108f359ec859d5499b851f62b8:
> 
>   migration-test: Only hide error if !QTEST_LOG (2020-10-26 16:15:04 +0000)
> 
> ----------------------------------------------------------------
> migration pull: 2020-10-26
> 
> Another go at Peter's postcopy fixes
> 
> Cleanups from Bihong Yu and Peter Maydell.
> 
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> 

May it be possible that this PR introduced a lockup of the qtests that I am 
encountering in this week's upstream revisions?

PASS 25 qtest-x86_64/bios-tables-test /x86_64/acpi/microvm/pcie

Looking for expected file 'tests/data/acpi/microvm/FACP.rtc'
Looking for expected file 'tests/data/acpi/microvm/FACP'
Using expected file 'tests/data/acpi/microvm/FACP'
Looking for expected file 'tests/data/acpi/microvm/APIC.rtc'
Looking for expected file 'tests/data/acpi/microvm/APIC'
Using expected file 'tests/data/acpi/microvm/APIC'
Looking for expected file 'tests/data/acpi/microvm/DSDT.rtc'
Using expected file 'tests/data/acpi/microvm/DSDT.rtc'
PASS 24 qtest-i386/bios-tables-test /i386/acpi/microvm/rtc
PASS 15 qtest-i386/migration-test /i386/migration/multifd/tcp/cancel
PASS 16 qtest-i386/migration-test /i386/migration/multifd/tcp/zlib
^Cmake: *** [Makefile.mtest:1169: run-test-144] Interrupt

I tried to bisect the causing commit and came up to this PR merge. But I'm not 
absolutely sure it really is, because it sometimes takes 2 hours or more to 
trigger the lockup, so the relevant commit may as well have slipped during 
bisection.

Last week's revisions run here for >24h without issues, right now I get 
lockups of test runs after ~3min .. ~2h when running the qtests in an endless 
loop:

#/bin/sh
i=0
while true; do
  i=$[$i+1]
  echo '**************************************************'
  echo "* RUN $i                                         *"
  echo '**************************************************'
  make check-qtest -j32 V=1
  if [[ "$?" -ne 0 ]]; then
    echo "Aborted after $i runs due to failure"
    break
  fi
done

> ----------------------------------------------------------------
> Bihong Yu (9):
>       migration: Do not use C99 // comments
>       migration: Don't use '#' flag of printf format
>       migration: Add spaces around operator
>       migration: Open brace '{' following struct go on the same line
>       migration: Add braces {} for if statement
>       migration: Do not initialise statics and globals to 0 or NULL
>       migration: Open brace '{' following function declarations go on the
> next line migration: Delete redundant spaces
>       migration: using trace_ to replace DPRINTF
> 
> Peter Maydell (1):
>       migration: Drop unused VMSTATE_FLOAT64 support
> 
> Peter Xu (6):
>       migration: Pass incoming state into qemu_ufd_copy_ioctl()
>       migration: Introduce migrate_send_rp_message_req_pages()
>       migration: Maintain postcopy faulted addresses
>       migration: Sync requested pages after postcopy recovery
>       migration/postcopy: Release fd before going into 'postcopy-pause'
>       migration-test: Only hide error if !QTEST_LOG
> 
>  include/migration/vmstate.h  | 13 ----------
>  migration/block.c            | 40 ++++++++++++++---------------
>  migration/migration.c        | 59
> +++++++++++++++++++++++++++++++++++++----- migration/migration.h        |
> 24 ++++++++++++++---
>  migration/page_cache.c       | 13 +++-------
>  migration/postcopy-ram.c     | 27 +++++++++++++++-----
>  migration/ram.c              | 14 +++++-----
>  migration/rdma.c             |  7 ++---
>  migration/savevm.c           | 61
> ++++++++++++++++++++++++++++++++++++++++++-- migration/trace-events       |
> 16 ++++++++++++
>  migration/vmstate-types.c    | 26 -------------------
>  migration/vmstate.c          | 10 ++++----
>  tests/qtest/migration-test.c |  6 ++++-
>  13 files changed, 213 insertions(+), 103 deletions(-)
Peter Maydell Oct. 31, 2020, 5:26 p.m. UTC | #5
On Sat, 31 Oct 2020 at 16:12, Christian Schoenebeck
<qemu_oss@crudebyte.com> wrote:
>
> On Montag, 26. Oktober 2020 17:19:36 CET Dr. David Alan Gilbert (git) wrote:
> > ----------------------------------------------------------------
> > migration pull: 2020-10-26
> >
> > Another go at Peter's postcopy fixes
> >
> > Cleanups from Bihong Yu and Peter Maydell.
> >
> > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> >
>
> May it be possible that this PR introduced a lockup of the qtests that I am
> encountering in this week's upstream revisions?

If you try the patches Peter Xu attached to this thread
does the lockup go away ?

https://lore.kernel.org/qemu-devel/20201030135350.GA588069@xz-x1/

(I'm also seeing intermittent hangs, for some reason almost always
on s390x host.)

thanks
-- PMM
Peter Xu Oct. 31, 2020, 5:46 p.m. UTC | #6
On Sat, Oct 31, 2020 at 05:26:28PM +0000, Peter Maydell wrote:
> On Sat, 31 Oct 2020 at 16:12, Christian Schoenebeck
> <qemu_oss@crudebyte.com> wrote:
> >
> > On Montag, 26. Oktober 2020 17:19:36 CET Dr. David Alan Gilbert (git) wrote:
> > > ----------------------------------------------------------------
> > > migration pull: 2020-10-26
> > >
> > > Another go at Peter's postcopy fixes
> > >
> > > Cleanups from Bihong Yu and Peter Maydell.
> > >
> > > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > >
> >
> > May it be possible that this PR introduced a lockup of the qtests that I am
> > encountering in this week's upstream revisions?
> 
> If you try the patches Peter Xu attached to this thread
> does the lockup go away ?
> 
> https://lore.kernel.org/qemu-devel/20201030135350.GA588069@xz-x1/
> 
> (I'm also seeing intermittent hangs, for some reason almost always
> on s390x host.)

It would be good to know exactly which test hanged.  If it's migration-test
then it's very possible.

The race above patch(es) tried to fix should logically be reproducable on all
archs, not s390x only.

Thanks,
Christian Schoenebeck Oct. 31, 2020, 7:10 p.m. UTC | #7
On Samstag, 31. Oktober 2020 18:46:11 CET Peter Xu wrote:
> On Sat, Oct 31, 2020 at 05:26:28PM +0000, Peter Maydell wrote:
> > On Sat, 31 Oct 2020 at 16:12, Christian Schoenebeck
> > 
> > <qemu_oss@crudebyte.com> wrote:
> > > On Montag, 26. Oktober 2020 17:19:36 CET Dr. David Alan Gilbert (git) 
wrote:
> > > > ----------------------------------------------------------------
> > > > migration pull: 2020-10-26
> > > > 
> > > > Another go at Peter's postcopy fixes
> > > > 
> > > > Cleanups from Bihong Yu and Peter Maydell.
> > > > 
> > > > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > > 
> > > May it be possible that this PR introduced a lockup of the qtests that I
> > > am
> > > encountering in this week's upstream revisions?
> > 
> > If you try the patches Peter Xu attached to this thread
> > does the lockup go away ?
> > 
> > https://lore.kernel.org/qemu-devel/20201030135350.GA588069@xz-x1/
> > 
> > (I'm also seeing intermittent hangs, for some reason almost always
> > on s390x host.)
> 
> It would be good to know exactly which test hanged.  If it's migration-test
> then it's very possible.

It's run-test-144 that does not return; according to Makefile.mtest that's 
migration-test, so chances are high that it's indeed introduced by this PR.

> The race above patch(es) tried to fix should logically be reproducable on
> all archs, not s390x only.
> 
> Thanks,

Yes, it's i386 here that locks up.

I'm running the loop with your patches now, so far so good, let's see if it's 
still alive tomorrow.

Best regards,
Christian Schoenebeck
Christian Schoenebeck Nov. 1, 2020, 10:17 a.m. UTC | #8
On Samstag, 31. Oktober 2020 20:10:49 CET Christian Schoenebeck wrote:
> On Samstag, 31. Oktober 2020 18:46:11 CET Peter Xu wrote:
> > On Sat, Oct 31, 2020 at 05:26:28PM +0000, Peter Maydell wrote:
> > > On Sat, 31 Oct 2020 at 16:12, Christian Schoenebeck
> > > 
> > > <qemu_oss@crudebyte.com> wrote:
> > > > On Montag, 26. Oktober 2020 17:19:36 CET Dr. David Alan Gilbert (git)
> 
> wrote:
> > > > > ----------------------------------------------------------------
> > > > > migration pull: 2020-10-26
> > > > > 
> > > > > Another go at Peter's postcopy fixes
> > > > > 
> > > > > Cleanups from Bihong Yu and Peter Maydell.
> > > > > 
> > > > > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > > > 
> > > > May it be possible that this PR introduced a lockup of the qtests that
> > > > I
> > > > am
> > > > encountering in this week's upstream revisions?
> > > 
> > > If you try the patches Peter Xu attached to this thread
> > > does the lockup go away ?
> > > 
> > > https://lore.kernel.org/qemu-devel/20201030135350.GA588069@xz-x1/
> > > 
> > > (I'm also seeing intermittent hangs, for some reason almost always
> > > on s390x host.)
> > 
> > It would be good to know exactly which test hanged.  If it's
> > migration-test
> > then it's very possible.
> 
> It's run-test-144 that does not return; according to Makefile.mtest that's
> migration-test, so chances are high that it's indeed introduced by this PR.
> 
> > The race above patch(es) tried to fix should logically be reproducable on
> > all archs, not s390x only.
> > 
> > Thanks,
> 
> Yes, it's i386 here that locks up.
> 
> I'm running the loop with your patches now, so far so good, let's see if
> it's still alive tomorrow.
> 
> Best regards,
> Christian Schoenebeck

Looks good! 16h later and the loop is still running here; it also made the 
lockup to disappear on Travis-CI. So Peter Xu's two patches fix the lockup 
problem for me.

Best regards,
Christian Schoenebeck