[PULL,00/40] ivshmem: Fixes, cleanups, device model split

Message ID 1458320487-19603-1-git-send-email-armbru@redhat.com
State New
Pull-request

git://repo.or.cz/qemu/armbru.git tags/pull-ivshmem-2016-03-18

Message

Markus Armbruster March 18, 2016, 5 p.m. UTC
Major issues addressed by this series:

* The specification document is incomplete and vague.  Rewritten.

* When a peer goes away, and its ID gets reused for another one,
  interrupts don't work.

* When configured for interrupts, we receive shared memory from the
  server some time after realize().  This creates a (usually
  short-lived) "no shared memory, yet" state.  If the guest wins the
  race, it is exposed to this state (known issue, if you count burying
  in docs/specs/ as "known").  If migration wins the race, it fails or
  corrupts memory.

* Interrupts are unreliable in a (usually small) time window after the
  destination peer connects.  I believe fixing this will require
  changing the client/server protocol, so just document it for now.

* The device cannot tell guest software whether it is
  configured for interrupts.  Fix that in a new, backwards-compatible
  revision of the guest ABI, and bump the PCI revision.  Deprecate the
  old revision.

* The device properties are a confusing mess and badly checked.
  Clean that up.

* Migration with interrupts relies on server behavior not guaranteed
  by the specification.  Tighten the specification.

The following changes since commit 6741d38ad0f2405a6e999ebc9550801b01aca479:

  Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging (2016-03-17 15:59:42 +0000)

are available in the git repository at:

  git://repo.or.cz/qemu/armbru.git tags/pull-ivshmem-2016-03-18

for you to fetch changes up to 9c4b53495c86f7c518e6daae6f98a349a9852009:

  contrib/ivshmem-server: Print "not for production" warning (2016-03-18 17:35:26 +0100)

----------------------------------------------------------------
ivshmem: Fixes, cleanups, device model split

----------------------------------------------------------------
Markus Armbruster (40):
      target-ppc: Document TOCTTOU in hugepage support
      ivshmem-server: Fix and clean up command line help
      ivshmem-server: Don't overload POSIX shmem and file name
      qemu-doc: Fix ivshmem huge page example
      event_notifier: Make event_notifier_init_fd() #ifdef CONFIG_EVENTFD
      tests/libqos/pci-pc: Fix qpci_pc_iomap() to map BARs aligned
      ivshmem-test: Improve test case /ivshmem/single
      ivshmem-test: Clean up wait for devices to become operational
      ivshmem-test: Improve test cases /ivshmem/server-*
      ivshmem: Rewrite specification document
      ivshmem: Add missing newlines to debug printfs
      ivshmem: Compile debug prints unconditionally to prevent bit-rot
      ivshmem: Clean up after commit 9940c32
      ivshmem: Drop ivshmem_event() stub
      ivshmem: Don't destroy the chardev on version mismatch
      ivshmem: Fix harmless misuse of Error
      ivshmem: Failed realize() can leave migration blocker behind
      ivshmem: Clean up register callbacks
      ivshmem: Clean up MSI-X conditions
      ivshmem: Leave INTx alone when using MSI-X
      ivshmem: Assert interrupts are set up once
      ivshmem: Simplify rejection of invalid peer ID from server
      ivshmem: Disentangle ivshmem_read()
      ivshmem: Plug leaks on unplug, fix peer disconnect
      ivshmem: Receive shared memory synchronously in realize()
      ivshmem: Propagate errors through ivshmem_recv_setup()
      ivshmem: Rely on server sending the ID right after the version
      ivshmem: Drop the hackish test for UNIX domain chardev
      ivshmem: Simplify how we cope with short reads from server
      ivshmem: Tighten check of property "size"
      ivshmem: Implement shm=... with a memory backend
      ivshmem: Simplify memory regions for BAR 2 (shared memory)
      ivshmem: Inline check_shm_size() into its only caller
      qdev: New DEFINE_PROP_ON_OFF_AUTO
      ivshmem: Replace int role_val by OnOffAuto master
      ivshmem: Split ivshmem-plain, ivshmem-doorbell off ivshmem
      ivshmem: Clean up after the previous commit
      ivshmem: Drop ivshmem property x-memdev
      ivshmem: Require master to have ID zero
      contrib/ivshmem-server: Print "not for production" warning

 contrib/ivshmem-server/ivshmem-server.c |   56 +-
 contrib/ivshmem-server/ivshmem-server.h |    4 +-
 contrib/ivshmem-server/main.c           |   98 +--
 default-configs/pci.mak                 |    2 +-
 docs/specs/ivshmem-spec.txt             |  254 +++++++
 docs/specs/ivshmem_device_spec.txt      |  161 -----
 hw/core/qdev-properties.c               |   10 +
 hw/misc/ivshmem.c                       | 1091 +++++++++++++++++--------------
 include/hw/qdev-properties.h            |    3 +
 qemu-doc.texi                           |   47 +-
 target-ppc/kvm.c                        |    6 +
 tests/ivshmem-test.c                    |   99 +--
 tests/libqos/pci-pc.c                   |    8 +-
 util/event_notifier-posix.c             |    6 +
 14 files changed, 1017 insertions(+), 828 deletions(-)
 create mode 100644 docs/specs/ivshmem-spec.txt
 delete mode 100644 docs/specs/ivshmem_device_spec.txt
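
The "no shared memory, yet" race in the cover letter is addressed by receiving the server's setup messages synchronously in realize(), which in turn means coping with short reads. The following is a minimal sketch of that idea, not the code in hw/misc/ivshmem.c. The helper names read_full() and recv_msg() are hypothetical, and the sketch assumes each server message is a 64-bit little-endian integer; it also leaves out the file descriptor passing that the real protocol relies on.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Hypothetical helper: read exactly len bytes from fd, retrying on
 * short reads and EINTR.  Returns 0 on success, -1 on error or on
 * end of file before len bytes arrived.
 */
static int read_full(int fd, void *buf, size_t len)
{
    unsigned char *p = buf;
    size_t done = 0;

    assert(buf != NULL);
    while (done < len) {
        ssize_t n = read(fd, p + done, len - done);

        if (n < 0) {
            if (errno == EINTR) {
                continue;       /* interrupted, try again */
            }
            return -1;          /* real I/O error */
        }
        if (n == 0) {
            return -1;          /* peer closed the connection early */
        }
        done += (size_t)n;
    }
    return 0;
}

/*
 * Hypothetical helper: receive one message, assumed here to be a
 * 64-bit little-endian integer, and convert it to host byte order.
 */
static int recv_msg(int fd, int64_t *msg)
{
    unsigned char raw[8];
    uint64_t v = 0;
    int i;

    if (read_full(fd, raw, sizeof(raw)) < 0) {
        return -1;
    }
    for (i = 7; i >= 0; i--) {
        v = (v << 8) | raw[i];  /* most significant byte is raw[7] */
    }
    *msg = (int64_t)v;
    return 0;
}
```

Blocking until a complete message has arrived keeps the caller from ever seeing a half-initialized state, at the cost of stalling when the server is slow or misbehaves.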

Comments

Peter Maydell March 21, 2016, 9:45 a.m. UTC | #1
On 18 March 2016 at 17:00, Markus Armbruster <armbru@redhat.com> wrote:
> Major issues addressed by this series:
>
> * The specification document is incomplete and vague.  Rewritten.
>
> * When a peer goes away, and its ID gets reused for another one,
>   interrupts don't work.
>
> * When configured for interrupts, we receive shared memory from the
>   server some time after realize().  This creates a (usually
>   short-lived) "no shared memory, yet" state.  If the guest wins the
>   race, it is exposed to this state (known issue, if you count burying
>   in docs/specs/ as "known").  If migration wins the race, it fails or
>   corrupts memory.
>
> * Interrupts are unreliable in a (usually small) time window after the
>   destination peer connects.  I believe fixing this will require
>   changing the client/server protocol, so just document it for now.
>
> * The device cannot tell guest software whether it is
>   configured for interrupts.  Fix that in a new, backwards-compatible
>   revision of the guest ABI, and bump the PCI revision.  Deprecate the
>   old revision.
>
> * The device properties are a confusing mess and badly checked.
>   Clean that up.
>
> * Migration with interrupts relies on server behavior not guaranteed
>   by the specification.  Tighten the specification.
>
> The following changes since commit 6741d38ad0f2405a6e999ebc9550801b01aca479:
>
>   Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging (2016-03-17 15:59:42 +0000)
>
> are available in the git repository at:
>
>   git://repo.or.cz/qemu/armbru.git tags/pull-ivshmem-2016-03-18
>
> for you to fetch changes up to 9c4b53495c86f7c518e6daae6f98a349a9852009:
>
>   contrib/ivshmem-server: Print "not for production" warning (2016-03-18 17:35:26 +0100)
>
> ----------------------------------------------------------------
> ivshmem: Fixes, cleanups, device model split
>

Hi; I'm afraid this fails 'make check' on OSX:

GTESTER check-qtest-i386
qemu-system-i386: invalid object type: memory-backend-file

Also some new clang ubsan warnings on x86 Linux:
GTESTER check-qtest-i386
[deleted existing warnings about slirp code]
/home/petmay01/linaro/qemu-for-merges/hw/pci/pci.c:166:23: runtime
error: shift exponent -1 is negative
/home/petmay01/linaro/qemu-for-merges/hw/pci/pci.c:171:24: runtime
error: shift exponent -1 is negative
/home/petmay01/linaro/qemu-for-merges/hw/pci/pci.c:172:24: runtime
error: shift exponent -1 is negative

thanks
-- PMM

Markus Armbruster March 21, 2016, 10:05 a.m. UTC | #2
Peter Maydell <peter.maydell@linaro.org> writes:

> On 18 March 2016 at 17:00, Markus Armbruster <armbru@redhat.com> wrote:
>> Major issues addressed by this series:
>>
>> * The specification document is incomplete and vague.  Rewritten.
>>
>> * When a peer goes away, and its ID gets reused for another one,
>>   interrupts don't work.
>>
>> * When configured for interrupts, we receive shared memory from the
>>   server some time after realize().  This creates a (usually
>>   short-lived) "no shared memory, yet" state.  If the guest wins the
>>   race, it is exposed to this state (known issue, if you count burying
>>   in docs/specs/ as "known").  If migration wins the race, it fails or
>>   corrupts memory.
>>
>> * Interrupts are unreliable in a (usually small) time window after the
>>   destination peer connects.  I believe fixing this will require
>>   changing the client/server protocol, so just document it for now.
>>
>> * The device cannot tell guest software whether it is
>>   configured for interrupts.  Fix that in a new, backwards-compatible
>>   revision of the guest ABI, and bump the PCI revision.  Deprecate the
>>   old revision.
>>
>> * The device properties are a confusing mess and badly checked.
>>   Clean that up.
>>
>> * Migration with interrupts relies on server behavior not guaranteed
>>   by the specification.  Tighten the specification.
>>
>> The following changes since commit 6741d38ad0f2405a6e999ebc9550801b01aca479:
>>
>>   Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging (2016-03-17 15:59:42 +0000)
>>
>> are available in the git repository at:
>>
>>   git://repo.or.cz/qemu/armbru.git tags/pull-ivshmem-2016-03-18
>>
>> for you to fetch changes up to 9c4b53495c86f7c518e6daae6f98a349a9852009:
>>
>>   contrib/ivshmem-server: Print "not for production" warning (2016-03-18 17:35:26 +0100)
>>
>> ----------------------------------------------------------------
>> ivshmem: Fixes, cleanups, device model split
>>
>
> Hi; I'm afraid this fails 'make check' on OSX:
>
> GTESTER check-qtest-i386
> qemu-system-i386: invalid object type: memory-backend-file

I forgot to update tests for "[PATCH] event_notifier: Make
event_notifier_init_fd() #ifdef CONFIG_EVENTFD".  Will fix.

> Also some new clang ubsan warnings on x86 Linux:
> GTESTER check-qtest-i386
> [deleted existing warnings about slirp code]
> /home/petmay01/linaro/qemu-for-merges/hw/pci/pci.c:166:23: runtime
> error: shift exponent -1 is negative
> /home/petmay01/linaro/qemu-for-merges/hw/pci/pci.c:171:24: runtime
> error: shift exponent -1 is negative
> /home/petmay01/linaro/qemu-for-merges/hw/pci/pci.c:172:24: runtime
> error: shift exponent -1 is negative

Stack backtrace?  If it's not too much trouble...

Peter Maydell March 21, 2016, 10:18 a.m. UTC | #3
On 21 March 2016 at 10:05, Markus Armbruster <armbru@redhat.com> wrote:
> Peter Maydell <peter.maydell@linaro.org> writes:
>> Also some new clang ubsan warnings on x86 Linux:
>> GTESTER check-qtest-i386
>> [deleted existing warnings about slirp code]
>> /home/petmay01/linaro/qemu-for-merges/hw/pci/pci.c:166:23: runtime
>> error: shift exponent -1 is negative
>> /home/petmay01/linaro/qemu-for-merges/hw/pci/pci.c:171:24: runtime
>> error: shift exponent -1 is negative
>> /home/petmay01/linaro/qemu-for-merges/hw/pci/pci.c:172:24: runtime
>> error: shift exponent -1 is negative
>
> Stack backtrace?  If it's not too much trouble...

Sorry, too painful -- this version of clang doesn't support
the UBSAN_OPTIONS environment variable to request a backtrace
at runtime and I can't remember the rune to connect gdb to
the qemu under a qtest test, which is what I'd need to do if
I rebuilt everything with the trap-on-error flag.

The issues are all provoked by the i386/ivshmem/single test.

thanks
-- PMM

Markus Armbruster March 21, 2016, 11:52 a.m. UTC | #4
Peter Maydell <peter.maydell@linaro.org> writes:

> On 21 March 2016 at 10:05, Markus Armbruster <armbru@redhat.com> wrote:
>> Peter Maydell <peter.maydell@linaro.org> writes:
>>> Also some new clang ubsan warnings on x86 Linux:
>>> GTESTER check-qtest-i386
>>> [deleted existing warnings about slirp code]
>>> /home/petmay01/linaro/qemu-for-merges/hw/pci/pci.c:166:23: runtime
>>> error: shift exponent -1 is negative
>>> /home/petmay01/linaro/qemu-for-merges/hw/pci/pci.c:171:24: runtime
>>> error: shift exponent -1 is negative
>>> /home/petmay01/linaro/qemu-for-merges/hw/pci/pci.c:172:24: runtime
>>> error: shift exponent -1 is negative
>>
>> Stack backtrace?  If it's not too much trouble...
>
> Sorry, too painful -- this version of clang doesn't support
> the UBSAN_OPTIONS environment variable to request a backtrace
> at runtime and I can't remember the rune to connect gdb to
> the qemu under a qtest test, which is what I'd need to do if
> I rebuilt everything with the trap-on-error flag.
>
> The issues are all provoked by the i386/ivshmem/single test.

No worries, I reproduced it locally.