Patchwork [v4,1/3] Device specification for shared memory PCI device

Submitter Cam Macdonell
Date April 7, 2010, 10:51 p.m.
Message ID <1270680720-8457-2-git-send-email-cam@cs.ualberta.ca>
Permalink /patch/49674/
State New

Comments

Cam Macdonell - April 7, 2010, 10:51 p.m.
---
 docs/specs/ivshmem_device_spec.txt |   85 ++++++++++++++++++++++++++++++++++++
 1 files changed, 85 insertions(+), 0 deletions(-)
 create mode 100644 docs/specs/ivshmem_device_spec.txt
Avi Kivity - April 12, 2010, 8:34 p.m.
On 04/08/2010 01:51 AM, Cam Macdonell wrote:

(sorry about the late review)

> +
> +Regular Interrupts
> +------------------
> +
> +If regular interrupts are used (due to either a guest not supporting MSI or the
> +user specifying not to use them on startup) then the value written to the lower
> +16-bits of the Doorbell register results is arbitrary and will trigger an
> +interrupt in the destination guest.
>    

Does the value written show up in the status register?  If yes, it can 
get overwritten by other interrupts.  If not, the lower 16 bits should 
be reserved to the value 1 for future expansion.  Basically it means 
that the PCI interrupt is equivalent to vector 1.

> +
> +An interrupt is also generated when a new guest accesses the shared memory
> +region.  A status of (2^32 - 1) indicates that a new guest has joined.
>    

Suggest making this a bitfield, define bit 0 as 'at least some other 
machine has signalled you' and bit 1 as 'at least one other machine has 
joined'.
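
To make the bitfield suggestion concrete, here is a minimal sketch of how a guest might decode such a status word. The identifier names are hypothetical and not part of the patch; only the meaning of bit 0 and bit 1 comes from the suggestion above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names: only the bit positions come from the review comment. */
enum ivshmem_status_bits {
    IVSHMEM_STATUS_SIGNALLED = 1u << 0, /* at least one other machine signalled us */
    IVSHMEM_STATUS_JOINED    = 1u << 1  /* at least one other machine has joined */
};

/* True if bit 0 (signalled) is set in a status value read from the device. */
static inline bool ivshmem_was_signalled(uint32_t status)
{
    return (status & IVSHMEM_STATUS_SIGNALLED) != 0;
}

/* True if bit 1 (joined) is set in a status value read from the device. */
static inline bool ivshmem_peer_joined(uint32_t status)
{
    return (status & IVSHMEM_STATUS_JOINED) != 0;
}
```

With a bitfield both events can be pending at once, which a single magic value such as (2^32 - 1) cannot express.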

> +
> +Message Signalled Interrupts
> +----------------------------
> +
> +A ivshmem device may support multiple MSI vectors.  If so, the lower 16-bits
> +written to the Doorbell register must be between 1 and the maximum number of
> +vectors the guest supports.  The lower 16 bits written to the doorbell is the
> +MSI vector that will be raised in the destination guest.  The number of MSI
> +vectors can vary but it is set when the VM is started, however vector 0 is
> +used to notify that a new guest has joined.  Guests should not use vector 0 for
> +any other purpose.
>    

Come to think of it, the "guest has joined" notification is actually pointless.  
Since it hasn't initialized yet you can't talk to it.  So it's best to 
leave it completely to the application, which can initialize shared 
memory and start sending interrupts.  An application defined protocol 
can handle joining.

How is initialization performed?  I guess we can define memory to start 
zeroed and let participants compete to acquire a lock.

Need to document the mask register.

Do we want an interrupt on a guest leaving?  Let's not complicate things.
Cam Macdonell - April 12, 2010, 9:11 p.m.
On Mon, Apr 12, 2010 at 3:34 PM, Avi Kivity <avi@redhat.com> wrote:
> On 04/08/2010 01:51 AM, Cam Macdonell wrote:
>
> (sorry about the late review)
>
>> +
>> +Regular Interrupts
>> +------------------
>> +
>> +If regular interrupts are used (due to either a guest not supporting MSI
>> or the
>> +user specifying not to use them on startup) then the value written to the
>> lower
>> +16-bits of the Doorbell register results is arbitrary and will trigger an
>> +interrupt in the destination guest.
>>
>
> Does the value written show up in the status register?  If yes, it can get
> overwritten by other interrupts.  If not, the lower 16 bits should be
> reserved to the value 1 for future expansion.  Basically it means that the
> PCI interrupt is equivalent to vector 1.

The status register is only 1 or 0.  I've made it so 1 is the only
value written to trigger an interrupt.

>
>> +
>> +An interrupt is also generated when a new guest accesses the shared
>> memory
>> +region.  A status of (2^32 - 1) indicates that a new guest has joined.
>>
>
> Suggest making this a bitfield, define bit 0 as 'at least some other machine
> has signalled you' and bit 1 as 'at least one other machine has joined'.
>
>> +
>> +Message Signalled Interrupts
>> +----------------------------
>> +
>> +A ivshmem device may support multiple MSI vectors.  If so, the lower
>> 16-bits
>> +written to the Doorbell register must be between 1 and the maximum number
>> of
>> +vectors the guest supports.  The lower 16 bits written to the doorbell is
>> the
>> +MSI vector that will be raised in the destination guest.  The number of
>> MSI
>> +vectors can vary but it is set when the VM is started, however vector 0
>> is
>> +used to notify that a new guest has joined.  Guests should not use vector
>> 0 for
>> +any other purpose.
>>
>
> Come to think of it, the "guest has joined" notification is actually pointless.
> Since it hasn't initialized yet you can't talk to it.  So it's best to leave it
> completely to the application, which can initialize shared memory and start
> sending interrupts.  An application defined protocol can handle joining.

Good point.

> How is initialization performed?  I guess we can define memory to start
> zeroed and let participants compete to acquire a lock.

No initialization of the memory occurs presently.

With interrupts the shared memory server could zero the memory.
Without the server (non-interrupt case) the guests can try to open
the shared memory with O_EXCL first and zero the memory if that
succeeds.  If the O_EXCL open fails, the guest would open without
O_EXCL and not initialize.
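
A rough sketch of that O_EXCL scheme, for illustration only. The helper name open_shared_object() and its parameters are made up here. It uses plain open() on a path so it stands alone; shm_open() accepts the same O_CREAT/O_EXCL flags for a POSIX memory object.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: open the backing object at `path`, creating it if
 * needed.  Returns the fd, or -1 on error.  *created is set to 1 if this
 * caller won the O_EXCL race (and so owns initialization), else 0.
 */
static int open_shared_object(const char *path, off_t size, int *created)
{
    /* First attempt: exclusive create.  Only one process can succeed. */
    int fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd >= 0) {
        *created = 1;
        /* Growing a freshly created object with ftruncate() zero-fills it. */
        if (ftruncate(fd, size) != 0) {
            close(fd);
            unlink(path);
            return -1;
        }
        return fd;
    }
    if (errno != EEXIST)
        return -1;

    /* Someone else created it: open without O_EXCL and do not initialize. */
    *created = 0;
    return open(path, O_RDWR);
}
```

Note that a late opener can still observe the object before the creator has finished sizing it, so an application-level handshake in the shared memory is still needed.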

>
> Need to document the mask register.

It currently applies only with regular interrupts.  Since the status
register is only 0 or 1, only the first bit has any effect.  I'll
add this to the spec.

>
> Do we want an interrupt on a guest leaving?  Let's not complicate things.

Probably not if we don't have one on join.

Cam

Patch

diff --git a/docs/specs/ivshmem_device_spec.txt b/docs/specs/ivshmem_device_spec.txt
new file mode 100644
index 0000000..9895782
--- /dev/null
+++ b/docs/specs/ivshmem_device_spec.txt
@@ -0,0 +1,85 @@ 
+
+Device Specification for Inter-VM shared memory device
+------------------------------------------------------
+
+The Inter-VM shared memory device is designed to share a region of memory with
+userspace in multiple virtual guests.  The memory region does not belong to any
+guest, but is a POSIX memory object on the host.  Optionally, the device may
+support sending interrupts to other guests sharing the same memory region.
+
+The Inter-VM PCI device
+-----------------------
+
+BARs
+
+The device supports three BARs.  BAR0 is a 1 Kbyte MMIO region to support
+registers.  BAR1 is used for MSI-X when it is enabled in the device.  BAR2 is
+used to map the shared memory object from the host.  The size of BAR2 is
+specified when the guest is started and must be a power of 2.
+
+Registers
+
+The device currently supports four 32-bit registers.  Registers
+are used for synchronization between guests sharing the same memory object when
+interrupts are supported (this requires using the shared memory server).
+
+The server assigns each VM an ID number and sends this ID number to the Qemu
+process when the guest starts.
+
+enum ivshmem_registers {
+    IntrMask = 0,
+    IntrStatus = 4,
+    IVPosition = 8,
+    Doorbell = 12
+};
+
+The first two registers are the interrupt mask and status registers.  Mask and
+status are only used with pin-based interrupts.  They are unused with MSI
+interrupts.  The IVPosition register is read-only and reports the guest's ID
+number.  To interrupt another guest, a guest must write to the Doorbell
+register.  The doorbell register is 32 bits wide, logically divided into two
+16-bit fields.  The high 16 bits are the guest ID to interrupt and the low 16
+bits are the interrupt vector to trigger.
+
+The semantics of the value written to the doorbell depend on whether the
+device is using MSI or a regular pin-based interrupt.  In short, MSI uses
+vectors and regular interrupts set the status register.
+
+Regular Interrupts
+------------------
+
+If regular interrupts are used (due to either a guest not supporting MSI or the
+user specifying not to use MSI on startup) then the value written to the lower
+16 bits of the Doorbell register is arbitrary, and any write will trigger an
+interrupt in the destination guest.
+
+An interrupt is also generated when a new guest accesses the shared memory
+region.  A status of (2^32 - 1) indicates that a new guest has joined.
+
+Message Signalled Interrupts
+----------------------------
+
+An ivshmem device may support multiple MSI vectors.  If so, the lower 16 bits
+written to the Doorbell register must be between 1 and the maximum number of
+vectors the guest supports.  The lower 16 bits written to the doorbell select
+the MSI vector that will be raised in the destination guest.  The number of MSI
+vectors can vary, but it is fixed when the VM is started.  Vector 0 is reserved
+to notify that a new guest has joined; guests should not use vector 0 for
+any other purpose.
+
+The important thing to remember with MSI is that it is only a signal; no status
+is set (since MSI interrupts are not shared).  All information other than the
+interrupt itself should be communicated via the shared memory region.  Devices
+supporting multiple MSI vectors can use different vectors to indicate different
+events have occurred.  The semantics of interrupt vectors are left to the
+user's discretion.
+
+Usage in the Guest
+------------------
+
+The shared memory device is intended to be used with the provided UIO driver.
+Very little configuration is needed.  The guest should map BAR0 to access the
+registers (an array of 32-bit ints allows simple writing) and map BAR2 to
+access the shared memory region itself.  The size of the shared memory region
+is specified when the guest (or shared memory server) is started.  A guest may
+map the whole shared memory region or only part of it.
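
The register layout and doorbell encoding described in the spec can be sketched as follows. This is a minimal illustration, not the provided UIO driver interface: the helper names are hypothetical, and `regs` is assumed to point at a BAR0 mapping the guest has already obtained and treats as an array of 32-bit ints.

```c
#include <stdint.h>

/* Register byte offsets, copied from the enum in the spec. */
enum ivshmem_registers {
    IntrMask = 0,
    IntrStatus = 4,
    IVPosition = 8,
    Doorbell = 12
};

/* High 16 bits: destination guest ID.  Low 16 bits: interrupt vector. */
static inline uint32_t ivshmem_doorbell_value(uint16_t dest_id, uint16_t vector)
{
    return ((uint32_t)dest_id << 16) | vector;
}

/* Read this guest's ID from the read-only IVPosition register. */
static inline uint32_t ivshmem_my_id(volatile uint32_t *regs)
{
    return regs[IVPosition / sizeof(uint32_t)];
}

/* Interrupt guest `dest_id` by writing to the Doorbell register. */
static inline void ivshmem_ring(volatile uint32_t *regs,
                                uint16_t dest_id, uint16_t vector)
{
    regs[Doorbell / sizeof(uint32_t)] = ivshmem_doorbell_value(dest_id, vector);
}
```

For example, ivshmem_ring(regs, 3, 1) writes 0x00030001, asking the device to raise vector 1 in the guest whose ID is 3.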