diff mbox

[v2,2/5] fw_cfg DMA interface documentation

Message ID 1441012217-8213-3-git-send-email-markmb@redhat.com
State New
Headers show

Commit Message

Marc Marí Aug. 31, 2015, 9:10 a.m. UTC
Add fw_cfg DMA interface specification in the documentation.

Based on Gerd Hoffman's initial implementation.

Signed-off-by: Marc Marí <markmb@redhat.com>
---
 docs/specs/fw_cfg.txt | 68 ++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 64 insertions(+), 4 deletions(-)

Comments

Kevin O'Connor Aug. 31, 2015, 3:36 p.m. UTC | #1
On Mon, Aug 31, 2015 at 11:10:14AM +0200, Marc Marí wrote:
> Add fw_cfg DMA interface specification in the documentation.
> 
> Based on Gerd Hoffman's initial implementation.

Thanks for working on this.

[...]
> += Guest-side DMA Interface =
> +
> +If bit 1 of the feature bitmap is set, the DMA interface is present. This does
> +not replace the existing fw_cfg interface, it is an add-on. This interface
> +can be used through the 64-bit wide address register.
> +
> +The address register, as the selector register, is in little-endian format
> +when using IOports, and in big-endian format when using MMIO. The value for
> +the register is 0 at startup and after an operation. A write to the lower
> +half triggers an operation. This means, that operations with 32-bit addresses
> +can be triggered with just one write, whereas operations with 64-bit addresses
> +can be triggered with one 64-bit write or two 32-bit writes, starting with the
> +higher part.
> +
> +In this register, a physical RAM address to a FWCfgDmaAccess structure should
> +be written. This is the format of the FWCfgDmaAccess structure:
> +
> +typedef struct FWCfgDmaAccess {
> +    uint32_t control;
> +    uint32_t length;
> +    uint64_t address;
> +} FWCfgDmaAccess;
> +
> +The fields of the structure are in big endian mode, and the field at the lowest
> +address is the "control" field.
> +
> +The "control" field has the following bits:
> + - Bit 0: Error
> + - Bit 1: Read
> + - Bit 2: Skip
> +
> +When an operation is triggered, if the "control" field has bit 1 set, a read
> +operation will be performed. "length" bytes for the current selector and
> +offset will be copied into the address specified by the "address" field.
> +
> +If the control field has only bit 2 set, a skip operation will be perfomed.
> +The offset for the current selector will be advanced "length" bytes.
> +
> +To check result, read the "control" field:
> +   error bit set        ->  something went wrong.
> +   all bits cleared     ->  transfer finished successfully.
> +   otherwise            ->  transfer still in progress (doesn't happen
> +                            today due to implementation not being async,
> +                            but may in the future).
> +
> +Target address goes up and transfer length goes down as the transfer happens,
> +so after a successful transfer the length field is zero and the address field
> +points right after the memory block written.
> +
> +If a partial transfer happened before an error occured the address and
> +length registers indicate how much data has been transfered successfully.

If the interface will support asynchronous transfers, then it is
necessary to document which field QEMU will update last.  That way,
the guest will know which particular field to wait on.  It should
probably also be documented that that particular field must have
native alignment (eg, alignment of 4 for a 32bit field).

The reason this is important, is because of a subtle race condition.
For example, if QEMU updates "length" and then "control", but the
guest spins on "length" then it is possible for the guest to exit its
spin loop and use the memory containing "control" for something else
before QEMU finishes updating "control".  This would result in
corruption of that memory when QEMU later overwrites it.

My suggestion would be for QEMU to only ever update "control" and not
modify any other parts of the FWCfgDmaAccess struct.  This makes the
interface and documentation simpler and it sidesteps this issue.

-Kevin
Peter Maydell Sept. 1, 2015, 5:47 p.m. UTC | #2
On 31 August 2015 at 10:10, Marc Marí <markmb@redhat.com> wrote:
> Add fw_cfg DMA interface specification in the documentation.
>
> Based on Gerd Hoffman's initial implementation.
>
> Signed-off-by: Marc Marí <markmb@redhat.com>
> ---
>  docs/specs/fw_cfg.txt | 68 ++++++++++++++++++++++++++++++++++++++++++++++++---
>  1 file changed, 64 insertions(+), 4 deletions(-)
>
> diff --git a/docs/specs/fw_cfg.txt b/docs/specs/fw_cfg.txt
> index 5bc7b96..06302f6 100644
> --- a/docs/specs/fw_cfg.txt
> +++ b/docs/specs/fw_cfg.txt
> @@ -76,6 +76,13 @@ increasing address order, similar to memcpy().
>
>  Selector Register IOport: 0x510
>  Data Register IOport:     0x511
> +DMA Address IOport:       0x514
> +
> +=== ARM Register Locations ===
> +
> +Selector Register address: 0x09020000
> +Data Register address:     0x09020008
> +DMA Address address:       0x0902000c

These addresses shouldn't be documented -- the correct API is that the guest
needs to find the base address of the fw_cfg device via device tree
or ACPI table. You can document the layout of the registers within the
device, obviously (ie +0, +4, +8).

>  == Firmware Configuration Items ==
>
> @@ -86,11 +93,12 @@ by selecting the "signature" item using key 0x0000 (FW_CFG_SIGNATURE),
>  and reading four bytes from the data register. If the fw_cfg device is
>  present, the four bytes read will contain the characters "QEMU".
>
> -=== Revision (Key 0x0001, FW_CFG_ID) ===
> +=== Revision / feature bitmap (Key 0x0001, FW_CFG_ID) ===
>
> -A 32-bit little-endian unsigned int, this item is used as an interface
> -revision number, and is currently set to 1 by QEMU when fw_cfg is
> -initialized.
> +A 32-bit little-endian unsigned int, this item is used to check for enabled
> +features.
> + - Bit 0: traditional interface. Always set.
> + - Bit 1: DMA interface.
>
>  === File Directory (Key 0x0019, FW_CFG_FILE_DIR) ===
>
> @@ -132,6 +140,58 @@ Selector Reg.    Range Usage
>  In practice, the number of allowed firmware configuration items is given
>  by the value of FW_CFG_MAX_ENTRY (see fw_cfg.h).
>
> += Guest-side DMA Interface =
> +
> +If bit 1 of the feature bitmap is set, the DMA interface is present. This does
> +not replace the existing fw_cfg interface, it is an add-on. This interface
> +can be used through the 64-bit wide address register.
> +
> +The address register, as the selector register, is in little-endian format
> +when using IOports, and in big-endian format when using MMIO. The value for
> +the register is 0 at startup and after an operation. A write to the lower
> +half triggers an operation. This means, that operations with 32-bit addresses

Delete this comma.

> +can be triggered with just one write, whereas operations with 64-bit addresses
> +can be triggered with one 64-bit write or two 32-bit writes, starting with the
> +higher part.
> +
> +In this register, a physical RAM address to a FWCfgDmaAccess structure should

"the physical address of a FWCfgDmaAccess structure in RAM"

> +be written. This is the format of the FWCfgDmaAccess structure:
> +
> +typedef struct FWCfgDmaAccess {
> +    uint32_t control;
> +    uint32_t length;
> +    uint64_t address;
> +} FWCfgDmaAccess;
> +
> +The fields of the structure are in big endian mode, and the field at the lowest
> +address is the "control" field.
> +
> +The "control" field has the following bits:
> + - Bit 0: Error
> + - Bit 1: Read
> + - Bit 2: Skip
> +
> +When an operation is triggered, if the "control" field has bit 1 set, a read
> +operation will be performed. "length" bytes for the current selector and
> +offset will be copied into the address specified by the "address" field.
> +
> +If the control field has only bit 2 set, a skip operation will be perfomed.
> +The offset for the current selector will be advanced "length" bytes.

The implication here is that the operation completes before the
guest write to the address register returns,

> +To check result, read the "control" field:
> +   error bit set        ->  something went wrong.
> +   all bits cleared     ->  transfer finished successfully.
> +   otherwise            ->  transfer still in progress (doesn't happen
> +                            today due to implementation not being async,
> +                            but may in the future).

Is there much point in having an async transfer interface which
requires the guest to busy-wait polling the control field?

> +
> +Target address goes up and transfer length goes down as the transfer happens,
> +so after a successful transfer the length field is zero and the address field
> +points right after the memory block written.
> +
> +If a partial transfer happened before an error occured the address and

"occurred".

> +length registers indicate how much data has been transfered successfully.

"transferred".

> +
>  = Host-side API =
>
>  The following functions are available to the QEMU programmer for adding

thanks
-- PMM
Peter Maydell Sept. 1, 2015, 5:56 p.m. UTC | #3
On 1 September 2015 at 18:47, Peter Maydell <peter.maydell@linaro.org> wrote:
> On 31 August 2015 at 10:10, Marc Marí <markmb@redhat.com> wrote:
>> +=== ARM Register Locations ===
>> +
>> +Selector Register address: 0x09020000
>> +Data Register address:     0x09020008
>> +DMA Address address:       0x0902000c
>
> These addresses shouldn't be documented -- the correct API is that the guest
> needs to find the base address of the fw_cfg device via device tree
> or ACPI table. You can document the layout of the registers within the
> device, obviously (ie +0, +4, +8).

...and this is not what your patchset implements anyway. This should be:

 offset 0: data register (8 bytes wide)
 offset 8: selector register (4 bytes wide)
 offset 12: reserved
 offset 16: DMA address register (8 bytes wide)

thanks
-- PMM
diff mbox

Patch

diff --git a/docs/specs/fw_cfg.txt b/docs/specs/fw_cfg.txt
index 5bc7b96..06302f6 100644
--- a/docs/specs/fw_cfg.txt
+++ b/docs/specs/fw_cfg.txt
@@ -76,6 +76,13 @@  increasing address order, similar to memcpy().
 
 Selector Register IOport: 0x510
 Data Register IOport:     0x511
+DMA Address IOport:       0x514
+
+=== ARM Register Locations ===
+
+Selector Register address: 0x09020000
+Data Register address:     0x09020008
+DMA Address address:       0x0902000c
 
 == Firmware Configuration Items ==
 
@@ -86,11 +93,12 @@  by selecting the "signature" item using key 0x0000 (FW_CFG_SIGNATURE),
 and reading four bytes from the data register. If the fw_cfg device is
 present, the four bytes read will contain the characters "QEMU".
 
-=== Revision (Key 0x0001, FW_CFG_ID) ===
+=== Revision / feature bitmap (Key 0x0001, FW_CFG_ID) ===
 
-A 32-bit little-endian unsigned int, this item is used as an interface
-revision number, and is currently set to 1 by QEMU when fw_cfg is
-initialized.
+A 32-bit little-endian unsigned int, this item is used to check for enabled
+features.
+ - Bit 0: traditional interface. Always set.
+ - Bit 1: DMA interface.
 
 === File Directory (Key 0x0019, FW_CFG_FILE_DIR) ===
 
@@ -132,6 +140,58 @@  Selector Reg.    Range Usage
 In practice, the number of allowed firmware configuration items is given
 by the value of FW_CFG_MAX_ENTRY (see fw_cfg.h).
 
+= Guest-side DMA Interface =
+
+If bit 1 of the feature bitmap is set, the DMA interface is present. This does
+not replace the existing fw_cfg interface, it is an add-on. This interface
+can be used through the 64-bit wide address register.
+
+The address register, as the selector register, is in little-endian format
+when using IOports, and in big-endian format when using MMIO. The value for
+the register is 0 at startup and after an operation. A write to the lower
+half triggers an operation. This means, that operations with 32-bit addresses
+can be triggered with just one write, whereas operations with 64-bit addresses
+can be triggered with one 64-bit write or two 32-bit writes, starting with the
+higher part.
+
+In this register, a physical RAM address to a FWCfgDmaAccess structure should
+be written. This is the format of the FWCfgDmaAccess structure:
+
+typedef struct FWCfgDmaAccess {
+    uint32_t control;
+    uint32_t length;
+    uint64_t address;
+} FWCfgDmaAccess;
+
+The fields of the structure are in big endian mode, and the field at the lowest
+address is the "control" field.
+
+The "control" field has the following bits:
+ - Bit 0: Error
+ - Bit 1: Read
+ - Bit 2: Skip
+
+When an operation is triggered, if the "control" field has bit 1 set, a read
+operation will be performed. "length" bytes for the current selector and
+offset will be copied into the address specified by the "address" field.
+
+If the control field has only bit 2 set, a skip operation will be perfomed.
+The offset for the current selector will be advanced "length" bytes.
+
+To check result, read the "control" field:
+   error bit set        ->  something went wrong.
+   all bits cleared     ->  transfer finished successfully.
+   otherwise            ->  transfer still in progress (doesn't happen
+                            today due to implementation not being async,
+                            but may in the future).
+
+Target address goes up and transfer length goes down as the transfer happens,
+so after a successful transfer the length field is zero and the address field
+points right after the memory block written.
+
+If a partial transfer happened before an error occured the address and
+length registers indicate how much data has been transfered successfully.
+
 = Host-side API =
 
 The following functions are available to the QEMU programmer for adding