[v6,1/5] hw/dma: xlnx_csu_dma: Implement a Xilinx CSU DMA model

Message ID 20210301132011.8494-2-bmeng.cn@gmail.com
State Superseded
Series hw/arm: zynqmp: Implement a CSU DMA model and connect it with GQSPI

Commit Message

Bin Meng March 1, 2021, 1:20 p.m. UTC
From: Xuzhou Cheng <xuzhou.cheng@windriver.com>

ZynqMP QSPI supports SPI transfer using DMA mode, but currently this
is unimplemented. When QSPI is programmed to use DMA mode, QEMU will
crash. This is observed when testing VxWorks 7.

This adds a Xilinx CSU DMA model and the implementation is based on
https://github.com/Xilinx/qemu/blob/master/hw/dma/csu_stream_dma.c.
The DST part of the model is verified along with ZynqMP GQSPI model.

Signed-off-by: Xuzhou Cheng <xuzhou.cheng@windriver.com>
Signed-off-by: Bin Meng <bin.meng@windriver.com>

---

Changes in v6:
- int_enable and int_disable do not have their own state
- return 0 in int_enable_pre_write() and int_disable_pre_write()
- remove XLNX_CSU_DMA_INT_R_MASK in int_status_pre_write()

Changes in v5:
- int_enable and int_disable do not affect each other
- do not modify int_status in int_disable_pre_write
- set MEM_DONE in xlnx_csu_dma_done if it's SRC

Changes in v4:
- Add complete CSU DMA model based on Edgar's branch
- Differences with Edgar's branch:
  1. Match the registers' FIELD to UG1087.
  2. Remove "byte-align" property. Per UG1087, SIZE and ADDR registers
     must be word aligned.
  3. Make the values of int_enable and int_disable mutually exclusive,
     otherwise the IRQ cannot be delivered.
  4. Clear int_status after int_disable is set.
  5. Clean up coding convention issues.

Changes in v3:
- Implement DMA as a separate CSU DMA model

Changes in v2:
- Remove unconnected TYPE_STREAM_SINK link property
- Add a TYPE_MEMORY_REGION link property, so that board code can tell
  the device which view of memory it performs DMA into
- Replace cpu_physical_memory_write() with address_space_write()

 include/hw/dma/xlnx_csu_dma.h |  52 +++
 hw/dma/xlnx_csu_dma.c         | 743 ++++++++++++++++++++++++++++++++++
 hw/dma/Kconfig                |   4 +
 hw/dma/meson.build            |   1 +
 4 files changed, 800 insertions(+)
 create mode 100644 include/hw/dma/xlnx_csu_dma.h
 create mode 100644 hw/dma/xlnx_csu_dma.c
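
The v5/v6 changelog entries above say that int_enable and int_disable hold
no state of their own and only modify int_mask. A minimal standalone sketch
of that scheme follows. The DemoIrqRegs type and the demo_* helpers are
hypothetical names used for illustration only; they are not part of the
patch or of any QEMU API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_INT_ALL  0xffu        /* eight interrupt bits, as in the patch */
#define DEMO_INT_DONE (1u << 1)

/* Hypothetical register file: only STATUS and MASK keep state. */
typedef struct {
    uint32_t int_status;
    uint32_t int_mask;             /* reset value: all sources masked */
} DemoIrqRegs;

/* Device side: latch one or more interrupt sources. */
static void demo_raise(DemoIrqRegs *r, uint32_t bits)
{
    assert((bits & ~DEMO_INT_ALL) == 0);
    r->int_status |= bits;
}

/* Guest write to INT_ENABLE: 1 clears the mask bit, 0 has no effect. */
static void demo_int_enable(DemoIrqRegs *r, uint32_t val)
{
    r->int_mask &= ~(val & DEMO_INT_ALL);
}

/* Guest write to INT_DISABLE: 1 sets the mask bit, 0 has no effect. */
static void demo_int_disable(DemoIrqRegs *r, uint32_t val)
{
    r->int_mask |= val & DEMO_INT_ALL;
}

/* Guest write to INT_STATUS: write 1 to clear. */
static void demo_int_status_write(DemoIrqRegs *r, uint32_t val)
{
    r->int_status &= ~val;
}

/* The IRQ line is high while any unmasked source is pending. */
static bool demo_irq_level(const DemoIrqRegs *r)
{
    return (r->int_status & ~r->int_mask) != 0;
}
```

With this layout a pending DONE stays latched in int_status while it is
masked, and the line rises as soon as the guest enables that source.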

Comments

Edgar E. Iglesias March 2, 2021, 3:03 p.m. UTC | #1
On Mon, Mar 01, 2021 at 09:20:07PM +0800, Bin Meng wrote:
> From: Xuzhou Cheng <xuzhou.cheng@windriver.com>
> 
> ZynqMP QSPI supports SPI transfer using DMA mode, but currently this
> is unimplemented. When QSPI is programmed to use DMA mode, QEMU will
> crash. This is observed when testing VxWorks 7.
> 
> This adds a Xilinx CSU DMA model and the implementation is based on
> https://github.com/Xilinx/qemu/blob/master/hw/dma/csu_stream_dma.c.
> The DST part of the model is verified along with ZynqMP GQSPI model.


This looked good to me, so I moved it into our tree and ran our
testsuite, which discovered a couple more issues.

I'll comment inline and show the fixes I made to make the tests pass.


> 
> Signed-off-by: Xuzhou Cheng <xuzhou.cheng@windriver.com>
> Signed-off-by: Bin Meng <bin.meng@windriver.com>
> 
> ---
> 
> Changes in v6:
> - int_enable and int_disable do not have its own state
> - return 0 in int_enable_pre_write() and int_disable_pre_write()
> - remove XLNX_CSU_DMA_INT_R_MASK in int_status_pre_write()
> 
> Changes in v5:
> - int_enable and int_disable do not affect each other
> - do not modify int_status int_disable_pre_write
> - set MEM_DONE in xlnx_csu_dma_done if it's SRC
> 
> Changes in v4:
> - Add complete CSU DMA model based on Edgar's branch
> - Differences with Edgar's branch:
>   1. Match the registers' FIELD to UG1807.
>   2. Remove "byte-align" property. Per UG1807, SIZE and ADDR registers
>      must be word aligned.
>   3. Make the values of int_enable and int_disable mutually exclusive
>      otherwise IRQ cannot be delivered.
>   4. Clear int_status after int_disable is set.
>   5. Coding convention issues clean-up
> 
> Changes in v3:
> - Implement DMA as a separate CSU DMA model
> 
> Changes in v2:
> - Remove unconnected TYPE_STREAM_SINK link property
> - Add a TYPE_MEMORY_REGION link property, to allow board codes to tell
>   the device what its view of the world that it is doing DMA to is
> - Replace cpu_physical_memory_write() with address_space_write()
> 
>  include/hw/dma/xlnx_csu_dma.h |  52 +++
>  hw/dma/xlnx_csu_dma.c         | 743 ++++++++++++++++++++++++++++++++++
>  hw/dma/Kconfig                |   4 +
>  hw/dma/meson.build            |   1 +
>  4 files changed, 800 insertions(+)
>  create mode 100644 include/hw/dma/xlnx_csu_dma.h
>  create mode 100644 hw/dma/xlnx_csu_dma.c
> 
> diff --git a/include/hw/dma/xlnx_csu_dma.h b/include/hw/dma/xlnx_csu_dma.h
> new file mode 100644
> index 0000000000..204d94c673
> --- /dev/null
> +++ b/include/hw/dma/xlnx_csu_dma.h
> @@ -0,0 +1,52 @@
> +/*
> + * Xilinx Platform CSU Stream DMA emulation
> + *
> + * This implementation is based on
> + * https://github.com/Xilinx/qemu/blob/master/hw/dma/csu_stream_dma.c
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License as
> + * published by the Free Software Foundation; either version 2 or
> + * (at your option) version 3 of the License.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License along
> + * with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#ifndef XLNX_CSU_DMA_H
> +#define XLNX_CSU_DMA_H
> +
> +#define TYPE_XLNX_CSU_DMA "xlnx.csu_dma"
> +
> +#define XLNX_CSU_DMA_R_MAX (0x2c / 4)
> +
> +typedef struct XlnxCSUDMA {
> +    SysBusDevice busdev;
> +    MemoryRegion iomem;
> +    MemTxAttrs attr;
> +    MemoryRegion *dma_mr;
> +    AddressSpace *dma_as;
> +    qemu_irq irq;
> +    StreamSink *tx_dev; /* Used as generic StreamSink */
> +    ptimer_state *src_timer;
> +
> +    uint16_t width;
> +    bool is_dst;
> +    bool r_size_last_word;
> +
> +    StreamCanPushNotifyFn notify;
> +    void *notify_opaque;
> +
> +    uint32_t regs[XLNX_CSU_DMA_R_MAX];
> +    RegisterInfo regs_info[XLNX_CSU_DMA_R_MAX];
> +} XlnxCSUDMA;
> +
> +#define XLNX_CSU_DMA(obj) \
> +    OBJECT_CHECK(XlnxCSUDMA, (obj), TYPE_XLNX_CSU_DMA)
> +
> +#endif
> diff --git a/hw/dma/xlnx_csu_dma.c b/hw/dma/xlnx_csu_dma.c
> new file mode 100644
> index 0000000000..5efb60dd9e
> --- /dev/null
> +++ b/hw/dma/xlnx_csu_dma.c
> @@ -0,0 +1,743 @@
> +/*
> + * Xilinx Platform CSU Stream DMA emulation
> + *
> + * This implementation is based on
> + * https://github.com/Xilinx/qemu/blob/master/hw/dma/csu_stream_dma.c
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License as
> + * published by the Free Software Foundation; either version 2 or
> + * (at your option) version 3 of the License.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License along
> + * with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qemu/log.h"
> +#include "qapi/error.h"
> +#include "hw/hw.h"
> +#include "hw/irq.h"
> +#include "hw/qdev-properties.h"
> +#include "hw/sysbus.h"
> +#include "migration/vmstate.h"
> +#include "sysemu/dma.h"
> +#include "hw/ptimer.h"
> +#include "hw/stream.h"
> +#include "hw/register.h"
> +#include "hw/dma/xlnx_csu_dma.h"
> +
> +/*
> + * Ref: UG1087 (v1.7) February 8, 2019
> + * https://www.xilinx.com/html_docs/registers/ug1087/ug1087-zynq-ultrascale-registers.html
> + * CSUDMA Module section
> + */
> +REG32(ADDR, 0x0)
> +    FIELD(ADDR, ADDR, 2, 30) /* wo */
> +REG32(SIZE, 0x4)
> +    FIELD(SIZE, SIZE, 2, 27) /* wo */
> +    FIELD(SIZE, LAST_WORD, 0, 1) /* rw, only exists in SRC */
> +REG32(STATUS, 0x8)
> +    FIELD(STATUS, DONE_CNT, 13, 3) /* wtc */
> +    FIELD(STATUS, FIFO_LEVEL, 5, 8) /* ro */
> +    FIELD(STATUS, OUTSTANDING, 1, 4) /* ro */
> +    FIELD(STATUS, BUSY, 0, 1) /* ro */
> +REG32(CTRL, 0xc)
> +    FIELD(CTRL, FIFOTHRESH, 25, 7) /* rw, only exists in DST, reset 0x40 */
> +    FIELD(CTRL, APB_ERR_RESP, 24, 1) /* rw */
> +    FIELD(CTRL, ENDIANNESS, 23, 1) /* rw */
> +    FIELD(CTRL, AXI_BRST_TYPE, 22, 1) /* rw */
> +    FIELD(CTRL, TIMEOUT_VAL, 10, 12) /* rw, reset: 0xFFE */
> +    FIELD(CTRL, FIFO_THRESH, 2, 8) /* rw, reset: 0x80 */
> +    FIELD(CTRL, PAUSE_STRM, 1, 1) /* rw */
> +    FIELD(CTRL, PAUSE_MEM, 0, 1) /* rw */
> +REG32(CRC, 0x10)
> +REG32(INT_STATUS, 0x14)
> +    FIELD(INT_STATUS, FIFO_OVERFLOW, 7, 1) /* wtc */
> +    FIELD(INT_STATUS, INVALID_APB, 6, 1) /* wtc */
> +    FIELD(INT_STATUS, THRESH_HIT, 5, 1) /* wtc */
> +    FIELD(INT_STATUS, TIMEOUT_MEM, 4, 1) /* wtc */
> +    FIELD(INT_STATUS, TIMEOUT_STRM, 3, 1) /* wtc */
> +    FIELD(INT_STATUS, AXI_BRESP_ERR, 2, 1) /* wtc, SRC: AXI_RDERR */
> +    FIELD(INT_STATUS, DONE, 1, 1) /* wtc */
> +    FIELD(INT_STATUS, MEM_DONE, 0, 1) /* wtc */
> +REG32(INT_ENABLE, 0x18)
> +    FIELD(INT_ENABLE, FIFO_OVERFLOW, 7, 1) /* wtc */
> +    FIELD(INT_ENABLE, INVALID_APB, 6, 1) /* wtc */
> +    FIELD(INT_ENABLE, THRESH_HIT, 5, 1) /* wtc */
> +    FIELD(INT_ENABLE, TIMEOUT_MEM, 4, 1) /* wtc */
> +    FIELD(INT_ENABLE, TIMEOUT_STRM, 3, 1) /* wtc */
> +    FIELD(INT_ENABLE, AXI_BRESP_ERR, 2, 1) /* wtc, SRC: AXI_RDERR */
> +    FIELD(INT_ENABLE, DONE, 1, 1) /* wtc */
> +    FIELD(INT_ENABLE, MEM_DONE, 0, 1) /* wtc */
> +REG32(INT_DISABLE, 0x1c)
> +    FIELD(INT_DISABLE, FIFO_OVERFLOW, 7, 1) /* wtc */
> +    FIELD(INT_DISABLE, INVALID_APB, 6, 1) /* wtc */
> +    FIELD(INT_DISABLE, THRESH_HIT, 5, 1) /* wtc */
> +    FIELD(INT_DISABLE, TIMEOUT_MEM, 4, 1) /* wtc */
> +    FIELD(INT_DISABLE, TIMEOUT_STRM, 3, 1) /* wtc */
> +    FIELD(INT_DISABLE, AXI_BRESP_ERR, 2, 1) /* wtc, SRC: AXI_RDERR */
> +    FIELD(INT_DISABLE, DONE, 1, 1) /* wtc */
> +    FIELD(INT_DISABLE, MEM_DONE, 0, 1) /* wtc */
> +REG32(INT_MASK, 0x20)
> +    FIELD(INT_MASK, FIFO_OVERFLOW, 7, 1) /* ro, reset: 0x1 */
> +    FIELD(INT_MASK, INVALID_APB, 6, 1) /* ro, reset: 0x1 */
> +    FIELD(INT_MASK, THRESH_HIT, 5, 1) /* ro, reset: 0x1 */
> +    FIELD(INT_MASK, TIMEOUT_MEM, 4, 1) /* ro, reset: 0x1 */
> +    FIELD(INT_MASK, TIMEOUT_STRM, 3, 1) /* ro, reset: 0x1 */
> +    FIELD(INT_MASK, AXI_BRESP_ERR, 2, 1) /* ro, reset: 0x1, SRC: AXI_RDERR */
> +    FIELD(INT_MASK, DONE, 1, 1) /* ro, reset: 0x1 */
> +    FIELD(INT_MASK, MEM_DONE, 0, 1) /* ro, reset: 0x1 */
> +REG32(CTRL2, 0x24)
> +    FIELD(CTRL2, ARCACHE, 24, 3) /* rw */
> +    FIELD(CTRL2, ROUTE_BIT, 23, 1) /* rw */
> +    FIELD(CTRL2, TIMEOUT_EN, 22, 1) /* rw */
> +    FIELD(CTRL2, TIMEOUT_PRE, 4, 12) /* rw, reset: 0xFFF */
> +    FIELD(CTRL2, MAX_OUTS_CMDS, 0, 4) /* rw, reset: 0x8 */
> +REG32(ADDR_MSB, 0x28)
> +    FIELD(ADDR_MSB, ADDR_MSB, 0, 17) /* wo */
> +
> +#define R_CTRL_TIMEOUT_VAL_RESET    (0xFFE)
> +#define R_CTRL_FIFO_THRESH_RESET    (0x80)
> +#define R_CTRL_FIFOTHRESH_RESET     (0x40)
> +
> +#define R_CTRL2_TIMEOUT_PRE_RESET   (0xFFF)
> +#define R_CTRL2_MAX_OUTS_CMDS_RESET (0x8)
> +
> +#define XLNX_CSU_DMA_ERR_DEBUG      (0)
> +#define XLNX_CSU_DMA_INT_R_MASK     (0xff)
> +
> +/* UG1807: Set the prescaler value for the timeout in clk (~2.5ns) cycles */
> +#define XLNX_CSU_DMA_TIMER_FREQ     (400 * 1000 * 1000)
> +
> +static bool xlnx_csu_dma_is_paused(XlnxCSUDMA *s)
> +{
> +    bool paused;
> +
> +    paused = !!(s->regs[R_CTRL] & R_CTRL_PAUSE_STRM_MASK);
> +    paused |= !!(s->regs[R_CTRL] & R_CTRL_PAUSE_MEM_MASK);
> +
> +    return paused;
> +}
> +
> +static bool xlnx_csu_dma_get_eop(XlnxCSUDMA *s)
> +{
> +    return s->r_size_last_word;
> +}
> +
> +static bool xlnx_csu_dma_burst_is_fixed(XlnxCSUDMA *s)
> +{
> +    return !!(s->regs[R_CTRL] & R_CTRL_AXI_BRST_TYPE_MASK);
> +}
> +
> +static bool xlnx_csu_dma_timeout_enabled(XlnxCSUDMA *s)
> +{
> +    return !!(s->regs[R_CTRL2] & R_CTRL2_TIMEOUT_EN_MASK);
> +}
> +
> +static void xlnx_csu_dma_update_done_cnt(XlnxCSUDMA *s, int a)
> +{
> +    int cnt;
> +
> +    /* Increase DONE_CNT */
> +    cnt = ARRAY_FIELD_EX32(s->regs, STATUS, DONE_CNT) + a;
> +    ARRAY_FIELD_DP32(s->regs, STATUS, DONE_CNT, cnt);
> +}
> +
> +static void xlnx_csu_dma_data_process(XlnxCSUDMA *s, uint8_t *buf, uint32_t len)
> +{
> +    uint32_t bswap;
> +    uint32_t i;
> +
> +    bswap = s->regs[R_CTRL] & R_CTRL_ENDIANNESS_MASK;
> +    if (!s->is_dst && !bswap) {

This test is inverted; it should read:

if (s->is_dst && !bswap) {
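
To illustrate why the early return can only apply to DST, here is a
standalone sketch of the same loop. In the quoted code the SRC side adds
every word to the CRC register, so it has to walk the buffer even when byte
swapping is off. DemoDma and demo_data_process are hypothetical stand-ins,
not the QEMU types, and the CRC is modelled as the plain 32-bit sum the
patch uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the device state. */
typedef struct {
    bool is_dst;     /* true for the DST channel, false for SRC */
    bool bswap;      /* models CTRL.ENDIANNESS */
    uint32_t crc;    /* models the CRC register */
} DemoDma;

/* Walk the buffer one 32-bit word at a time; len must be a multiple of 4. */
static void demo_data_process(DemoDma *s, uint8_t *buf, size_t len)
{
    assert(len % 4 == 0);

    /* Only DST with swapping disabled has nothing to do. */
    if (s->is_dst && !s->bswap) {
        return;
    }

    for (size_t i = 0; i < len; i += 4) {
        uint8_t *b = &buf[i];
        uint32_t word;

        memcpy(&word, b, sizeof(word));  /* buf may be unaligned */
        if (!s->is_dst) {
            s->crc += word;              /* SRC always accumulates */
        }
        if (s->bswap) {
            uint8_t t0 = b[0], t1 = b[1];

            b[0] = b[3];
            b[1] = b[2];
            b[2] = t1;
            b[3] = t0;
        }
    }
}
```

With the original `!s->is_dst` test, an SRC transfer with ENDIANNESS
cleared would return before the loop and leave the CRC untouched.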



> +        /* Fast when ENDIANNESS cleared or it's SRC */
> +        return;
> +    }
> +
> +    for (i = 0; i < len; i += 4) {
> +        uint8_t *b = &buf[i];
> +        union {
> +            uint8_t u8[4];
> +            uint32_t u32;
> +        } v = {
> +            .u8 = { b[0], b[1], b[2], b[3] }
> +        };
> +
> +        if (!s->is_dst) {
> +            s->regs[R_CRC] += v.u32;
> +        }
> +        if (bswap) {
> +            /*
> +             * No point using bswap, we need to writeback
> +             * into a potentially unaligned pointer.
> +             */
> +            b[0] = v.u8[3];
> +            b[1] = v.u8[2];
> +            b[2] = v.u8[1];
> +            b[3] = v.u8[0];
> +        }
> +    }
> +}
> +
> +/* len is in bytes */
> +static uint32_t xlnx_csu_dma_write(XlnxCSUDMA *s, uint8_t *buf, uint32_t len)
> +{
> +    hwaddr addr = (hwaddr)s->regs[R_ADDR_MSB] << 32 | s->regs[R_ADDR];
> +    MemTxResult result = MEMTX_OK;
> +
> +    xlnx_csu_dma_data_process(s, buf, len);
> +    if (xlnx_csu_dma_burst_is_fixed(s)) {
> +        uint32_t i;
> +
> +        for (i = 0; i < len && (result == MEMTX_OK); i += s->width) {
> +            uint32_t mlen = MIN(len - i, s->width);
> +
> +            result = address_space_rw(s->dma_as, addr, s->attr,
> +                                      buf, mlen, true);
> +            buf += mlen;
> +        }
> +    } else {
> +        result = address_space_rw(s->dma_as, addr, s->attr, buf, len, true);
> +    }
> +
> +    if (result == MEMTX_OK) {
> +        return len;
> +    } else {
> +        qemu_log_mask(LOG_GUEST_ERROR, "%s: Bad address 0x%lx for mem write",
> +                      __func__, addr);
> +        return 0;
> +    }

This causes a QEMU hang when accessing areas that are not reachable.

We should always return len here, even when the access fails. As a
bonus, we can set the AXI_BRESP_ERR flag in INT_STATUS (which we had
missed in our model as well).

E.g.:
    if (result != MEMTX_OK) {
        qemu_log_mask(LOG_GUEST_ERROR, "%s: Bad address 0x%lx for mem write",
                      __func__, addr);
        s->regs[R_INT_STATUS] |= R_INT_STATUS_AXI_BRESP_ERR_MASK;
        xlnx_csu_dma_update_irq(s);
    }
    return len;
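
One plausible reading of the hang, sketched standalone: if the write path
reports 0 bytes consumed on a failed access, a caller that retries until
SIZE reaches 0 never makes progress. DemoDst, demo_write_old,
demo_write_new and demo_run are hypothetical names, and the retry loop is
an assumption about the caller, not code taken from QEMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_DONE          (1u << 1)
#define DEMO_AXI_BRESP_ERR (1u << 2)

/* Hypothetical model of a DST channel. */
typedef struct {
    uint32_t size;        /* bytes still to transfer, like SIZE */
    uint32_t int_status;  /* like INT_STATUS */
    bool mem_ok;          /* whether the target address is reachable */
} DemoDst;

/* Old behaviour: report 0 bytes consumed when the access fails. */
static uint32_t demo_write_old(DemoDst *s, uint32_t len)
{
    return s->mem_ok ? len : 0;
}

/* Suggested behaviour: flag the error but still consume the data. */
static uint32_t demo_write_new(DemoDst *s, uint32_t len)
{
    if (!s->mem_ok) {
        s->int_status |= DEMO_AXI_BRESP_ERR;
    }
    return len;
}

/*
 * Push 16-byte chunks until SIZE reaches 0. Returns the number of
 * iterations used; a result equal to max_iter means no completion.
 */
static unsigned demo_run(DemoDst *s,
                         uint32_t (*wr)(DemoDst *, uint32_t),
                         unsigned max_iter)
{
    unsigned n = 0;

    while (s->size && n < max_iter) {
        uint32_t chunk = s->size < 16 ? s->size : 16;
        uint32_t done = wr(s, chunk);

        assert(done <= s->size);
        s->size -= done;
        if (s->size == 0) {
            s->int_status |= DEMO_DONE;
        }
        n++;
    }
    return n;
}
```

The max_iter cap exists only so the sketch terminates; without it the old
variant would spin forever on an unreachable address.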


> +}
> +
> +/* len is in bytes */
> +static uint32_t xlnx_csu_dma_read(XlnxCSUDMA *s, uint8_t *buf, uint32_t len)
> +{
> +    hwaddr addr = (hwaddr)s->regs[R_ADDR_MSB] << 32 | s->regs[R_ADDR];
> +    MemTxResult result = MEMTX_OK;
> +
> +    if (xlnx_csu_dma_burst_is_fixed(s)) {
> +        uint32_t i;
> +
> +        for (i = 0; i < len && (result == MEMTX_OK); i += s->width) {
> +            uint32_t mlen = MIN(len - i, s->width);
> +
> +            result = address_space_rw(s->dma_as, addr, s->attr,
> +                                      buf + i, mlen, false);
> +        }
> +    } else {
> +        result = address_space_rw(s->dma_as, addr, s->attr, buf, len, false);
> +    }
> +
> +    if (result == MEMTX_OK) {
> +        xlnx_csu_dma_data_process(s, buf, len);
> +        return len;
> +    } else {
> +        qemu_log_mask(LOG_GUEST_ERROR, "%s: Bad address 0x%lx for mem read",
> +                      __func__, addr);
> +        return 0;
> +    }

As with writes, we need to return len and set the AXI_BRESP_ERR flag.


> +}
> +
> +static void xlnx_csu_dma_update_irq(XlnxCSUDMA *s)
> +{
> +    qemu_set_irq(s->irq, !!(s->regs[R_INT_STATUS] & ~s->regs[R_INT_MASK]));
> +}
> +
> +static void xlnx_csu_dma_done(XlnxCSUDMA *s)
> +{
> +    s->regs[R_STATUS] &= ~R_STATUS_BUSY_MASK;
> +    s->regs[R_INT_STATUS] |= R_INT_STATUS_DONE_MASK;
> +
> +    if (!s->is_dst) {
> +        s->regs[R_INT_STATUS] |= R_INT_STATUS_MEM_DONE_MASK;
> +    }
> +
> +    xlnx_csu_dma_update_done_cnt(s, 1);
> +}
> +
> +static uint32_t xlnx_csu_dma_advance(XlnxCSUDMA *s, uint32_t len)
> +{
> +    uint32_t size = s->regs[R_SIZE];
> +    hwaddr dst = (hwaddr)s->regs[R_ADDR_MSB] << 32 | s->regs[R_ADDR];
> +
> +    assert(len <= size);
> +
> +    size -= len;
> +    s->regs[R_SIZE] = size;
> +
> +    if (!xlnx_csu_dma_burst_is_fixed(s)) {
> +        dst += len;
> +        s->regs[R_ADDR] = (uint32_t) dst;
> +        s->regs[R_ADDR_MSB] = dst >> 32;
> +    }
> +
> +    if (size == 0) {
> +        xlnx_csu_dma_done(s);
> +    }
> +
> +    return size;
> +}
> +
> +static void xlnx_csu_dma_src_notify(void *opaque)
> +{
> +    XlnxCSUDMA *s = XLNX_CSU_DMA(opaque);
> +    unsigned char buf[4 * 1024];
> +    size_t rlen = 0;
> +
> +    ptimer_transaction_begin(s->src_timer);
> +    /* Stop the backpreassure timer */
> +    ptimer_stop(s->src_timer);
> +
> +    while (s->regs[R_SIZE] && !xlnx_csu_dma_is_paused(s) &&
> +           stream_can_push(s->tx_dev, xlnx_csu_dma_src_notify, s)) {
> +        uint32_t plen = MIN(s->regs[R_SIZE], sizeof buf);
> +        bool eop = false;
> +
> +        /* Did we fit it all? */
> +        if (s->regs[R_SIZE] == plen && xlnx_csu_dma_get_eop(s)) {
> +            eop = true;
> +        }
> +
> +        /* DMA transfer */
> +        xlnx_csu_dma_read(s, buf, plen);
> +        rlen = stream_push(s->tx_dev, buf, plen, eop);
> +        xlnx_csu_dma_advance(s, rlen);
> +    }
> +
> +    if (xlnx_csu_dma_timeout_enabled(s) && s->regs[R_SIZE] &&
> +        !stream_can_push(s->tx_dev, xlnx_csu_dma_src_notify, s)) {
> +        uint32_t timeout = ARRAY_FIELD_EX32(s->regs, CTRL, TIMEOUT_VAL);
> +        uint32_t div = ARRAY_FIELD_EX32(s->regs, CTRL2, TIMEOUT_PRE) + 1;
> +        uint32_t freq = XLNX_CSU_DMA_TIMER_FREQ;
> +
> +        freq /= div;
> +        ptimer_set_freq(s->src_timer, freq);
> +        ptimer_set_count(s->src_timer, timeout);
> +        ptimer_run(s->src_timer, 1);
> +    }
> +
> +    ptimer_transaction_commit(s->src_timer);
> +    xlnx_csu_dma_update_irq(s);
> +}
> +
> +static uint64_t addr_pre_write(RegisterInfo *reg, uint64_t val)
> +{
> +    /* Address is word aligned */
> +    return val & R_ADDR_ADDR_MASK;
> +}
> +
> +static uint64_t size_pre_write(RegisterInfo *reg, uint64_t val)
> +{
> +    XlnxCSUDMA *s = XLNX_CSU_DMA(reg->opaque);
> +
> +    if (s->regs[R_SIZE] != 0) {
> +        qemu_log_mask(LOG_GUEST_ERROR,
> +                      "%s: Starting DMA while already running.\n", __func__);
> +    }
> +
> +    if (!s->is_dst) {
> +        s->r_size_last_word = !!(val & R_SIZE_LAST_WORD_MASK);
> +    }
> +
> +    /* Size is word aligned */
> +    return val & R_SIZE_SIZE_MASK;
> +}
> +
> +static uint64_t size_post_read(RegisterInfo *reg, uint64_t val)
> +{
> +    XlnxCSUDMA *s = XLNX_CSU_DMA(reg->opaque);
> +
> +    return val | s->r_size_last_word;
> +}
> +
> +static void size_post_write(RegisterInfo *reg, uint64_t val)
> +{
> +    XlnxCSUDMA *s = XLNX_CSU_DMA(reg->opaque);
> +
> +    s->regs[R_STATUS] |= R_STATUS_BUSY_MASK;
> +
> +    /*
> +     * Note that if SIZE is programmed to 0, and the DMA is started,
> +     * the interrupts DONE and MEM_DONE will be asserted.
> +     */
> +    if (s->regs[R_SIZE] == 0) {
> +        xlnx_csu_dma_done(s);
> +        xlnx_csu_dma_update_irq(s);
> +        return;
> +    }
> +
> +    /* Set SIZE is considered the last step in transfer configuration */
> +    if (!s->is_dst) {
> +        xlnx_csu_dma_src_notify(s);
> +    } else {
> +        if (s->notify) {
> +            s->notify(s->notify_opaque);
> +        }
> +    }
> +}
> +
> +static uint64_t status_pre_write(RegisterInfo *reg, uint64_t val)
> +{
> +    return val & (R_STATUS_DONE_CNT_MASK | R_STATUS_BUSY_MASK);
> +}
> +
> +static uint64_t addr_msb_pre_write(RegisterInfo *reg, uint64_t val)
> +{
> +    return val & R_ADDR_MSB_ADDR_MSB_MASK;
> +}
> +
> +static void ctrl_post_write(RegisterInfo *reg, uint64_t val)
> +{
> +    XlnxCSUDMA *s = XLNX_CSU_DMA(reg->opaque);
> +
> +    if (!s->is_dst) {
> +        if (!xlnx_csu_dma_is_paused(s)) {
> +            xlnx_csu_dma_src_notify(s);
> +        }
> +    } else {
> +        if (!xlnx_csu_dma_is_paused(s) && s->notify) {
> +            s->notify(s->notify_opaque);
> +        }
> +    }
> +}
> +
> +static uint64_t int_status_pre_write(RegisterInfo *reg, uint64_t val)
> +{
> +    XlnxCSUDMA *s = XLNX_CSU_DMA(reg->opaque);
> +
> +    /* DMA counter decrements when flag 'DONE' is cleared */
> +    if ((val & s->regs[R_INT_STATUS] & R_INT_STATUS_DONE_MASK)) {
> +        xlnx_csu_dma_update_done_cnt(s, -1);
> +    }
> +
> +    return s->regs[R_INT_STATUS] & ~val;
> +}
> +
> +static void int_status_post_write(RegisterInfo *reg, uint64_t val)
> +{
> +    XlnxCSUDMA *s = XLNX_CSU_DMA(reg->opaque);
> +
> +    xlnx_csu_dma_update_irq(s);
> +}
> +
> +static uint64_t int_enable_pre_write(RegisterInfo *reg, uint64_t val)
> +{
> +    XlnxCSUDMA *s = XLNX_CSU_DMA(reg->opaque);
> +    uint32_t v32 = val;
> +
> +    /*
> +     * R_INT_ENABLE doesn't have its own state.
> +     * It is used to indirectly modify R_INT_MASK.
> +     *
> +     * 1: Enable this interrupt field (the mask bit will be cleared to 0)
> +     * 0: No effect
> +     */
> +    s->regs[R_INT_MASK] &= ~v32;
> +    return 0;
> +}
> +
> +static void int_enable_post_write(RegisterInfo *reg, uint64_t val)
> +{
> +    XlnxCSUDMA *s = XLNX_CSU_DMA(reg->opaque);
> +
> +    xlnx_csu_dma_update_irq(s);
> +}
> +
> +static uint64_t int_disable_pre_write(RegisterInfo *reg, uint64_t val)
> +{
> +    XlnxCSUDMA *s = XLNX_CSU_DMA(reg->opaque);
> +    uint32_t v32 = val;
> +
> +    /*
> +     * R_INT_DISABLE doesn't have its own state.
> +     * It is used to indirectly modify R_INT_MASK.
> +     *
> +     * 1: Disable this interrupt field (the mask bit will be set to 1)
> +     * 0: No effect
> +     */
> +    s->regs[R_INT_MASK] |= v32;
> +    return 0;
> +}
> +
> +static void int_disable_post_write(RegisterInfo *reg, uint64_t val)
> +{
> +    XlnxCSUDMA *s = XLNX_CSU_DMA(reg->opaque);
> +
> +    xlnx_csu_dma_update_irq(s);
> +}
> +
> +static const RegisterAccessInfo *xlnx_csu_dma_regs_info[] = {
> +#define DMACH_REGINFO(NAME, snd)                                              \
> +    (const RegisterAccessInfo []) {                                           \
> +        {                                                                     \
> +            .name = #NAME "_ADDR",                                            \
> +            .addr = A_ADDR,                                                   \
> +            .pre_write = addr_pre_write                                       \
> +        }, {                                                                  \
> +            .name = #NAME "_SIZE",                                            \
> +            .addr = A_SIZE,                                                   \
> +            .pre_write = size_pre_write,                                      \
> +            .post_write = size_post_write,                                    \
> +            .post_read = size_post_read                                       \
> +        }, {                                                                  \
> +            .name = #NAME "_STATUS",                                          \
> +            .addr = A_STATUS,                                                 \
> +            .pre_write = status_pre_write,                                    \

The 3 DONE_CNT bits in R_STATUS need to be w1c so that SW can clear the counter.

            .w1c = R_STATUS_DONE_CNT_MASK,                                    \
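
For readers unfamiliar with write-1-to-clear fields, here is a standalone
sketch of the semantics being asked for. The masks mirror the STATUS field
layout quoted above, but demo_reg_write is a hypothetical helper that only
approximates generic ro/w1c handling; it is not QEMU's register API.

```c
#include <assert.h>
#include <stdint.h>

/* Field masks following the STATUS layout in the patch. */
#define DEMO_BUSY_MASK        (0x1u << 0)
#define DEMO_OUTSTANDING_MASK (0xfu << 1)
#define DEMO_FIFO_LEVEL_MASK  (0xffu << 5)
#define DEMO_DONE_CNT_MASK    (0x7u << 13)

#define DEMO_STATUS_RO \
    (DEMO_BUSY_MASK | DEMO_OUTSTANDING_MASK | DEMO_FIFO_LEVEL_MASK)

/*
 * Compute the new register value for a guest write of val:
 * ro bits keep their old value, w1c bits are cleared where val has
 * a 1 and kept otherwise, all remaining bits take the written value.
 */
static uint32_t demo_reg_write(uint32_t old, uint32_t val,
                               uint32_t ro, uint32_t w1c)
{
    uint32_t keep = ro | w1c;   /* bits a plain write cannot set */
    uint32_t new_val;

    assert((ro & w1c) == 0);    /* a bit cannot be both ro and w1c */

    new_val = (val & ~keep) | (old & keep);
    new_val &= ~(val & w1c);
    return new_val;
}
```

Writing all ones to the three DONE_CNT bits resets the counter to 0, while
BUSY and the other read-only fields are unaffected by the write.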


For reference, I'm attaching the complete diff I used to pass the testsuite:

diff --git a/hw/dma/xlnx_csu_dma.c b/hw/dma/xlnx_csu_dma.c
index ec06fefa88..cc9eeec74b 100644
--- a/hw/dma/xlnx_csu_dma.c
+++ b/hw/dma/xlnx_csu_dma.c
@@ -155,7 +155,7 @@ static void xlnx_csu_dma_data_process(XlnxCSUDMA *s, uint8_t *buf, uint32_t len)
     uint32_t i;
 
     bswap = s->regs[R_CTRL] & R_CTRL_ENDIANNESS_MASK;
-    if (!s->is_dst && !bswap) {
+    if (s->is_dst && !bswap) {
         /* Fast when ENDIANNESS cleared or it's SRC */
         return;
     }
@@ -185,6 +185,11 @@ static void xlnx_csu_dma_data_process(XlnxCSUDMA *s, uint8_t *buf, uint32_t len)
     }
 }
 
+static void xlnx_csu_dma_update_irq(XlnxCSUDMA *s)
+{
+    qemu_set_irq(s->irq, !!(s->regs[R_INT_STATUS] & ~s->regs[R_INT_MASK]));
+}
+
 /* len is in bytes */
 static uint32_t xlnx_csu_dma_write(XlnxCSUDMA *s, uint8_t *buf, uint32_t len)
 {
@@ -206,13 +211,13 @@ static uint32_t xlnx_csu_dma_write(XlnxCSUDMA *s, uint8_t *buf, uint32_t len)
         result = address_space_rw(s->dma_as, addr, s->attr, buf, len, true);
     }
 
-    if (result == MEMTX_OK) {
-        return len;
-    } else {
+    if (result != MEMTX_OK) {
         qemu_log_mask(LOG_GUEST_ERROR, "%s: Bad address 0x%lx for mem write",
                       __func__, addr);
-        return 0;
+        s->regs[R_INT_STATUS] |= R_INT_STATUS_AXI_BRESP_ERR_MASK;
+        xlnx_csu_dma_update_irq(s);
     }
+    return len;
 }
 
 /* len is in bytes */
@@ -236,17 +241,13 @@ static uint32_t xlnx_csu_dma_read(XlnxCSUDMA *s, uint8_t *buf, uint32_t len)
 
     if (result == MEMTX_OK) {
         xlnx_csu_dma_data_process(s, buf, len);
-        return len;
     } else {
         qemu_log_mask(LOG_GUEST_ERROR, "%s: Bad address 0x%lx for mem read",
                       __func__, addr);
-        return 0;
+        s->regs[R_INT_STATUS] |= R_INT_STATUS_AXI_BRESP_ERR_MASK;
+        xlnx_csu_dma_update_irq(s);
     }
-}
-
-static void xlnx_csu_dma_update_irq(XlnxCSUDMA *s)
-{
-    qemu_set_irq(s->irq, !!(s->regs[R_INT_STATUS] & ~s->regs[R_INT_MASK]));
+    return len;
 }
 
 static void xlnx_csu_dma_done(XlnxCSUDMA *s)
@@ -489,6 +490,7 @@ static const RegisterAccessInfo *xlnx_csu_dma_regs_info[] = {
             .name = #NAME "_STATUS",                                          \
             .addr = A_STATUS,                                                 \
             .pre_write = status_pre_write,                                    \
+            .w1c = R_STATUS_DONE_CNT_MASK,                                    \
             .ro = (R_STATUS_BUSY_MASK                                         \
                    | R_STATUS_FIFO_LEVEL_MASK                                 \
                    | R_STATUS_OUTSTANDING_MASK)                               \
Bin Meng March 2, 2021, 3:16 p.m. UTC | #2
Hi Edgar,

On Tue, Mar 2, 2021 at 11:03 PM Edgar E. Iglesias
<edgar.iglesias@gmail.com> wrote:
>
> On Mon, Mar 01, 2021 at 09:20:07PM +0800, Bin Meng wrote:
> > From: Xuzhou Cheng <xuzhou.cheng@windriver.com>
> >
> > ZynqMP QSPI supports SPI transfer using DMA mode, but currently this
> > is unimplemented. When QSPI is programmed to use DMA mode, QEMU will
> > crash. This is observed when testing VxWorks 7.
> >
> > This adds a Xilinx CSU DMA model and the implementation is based on
> > https://github.com/Xilinx/qemu/blob/master/hw/dma/csu_stream_dma.c.
> > The DST part of the model is verified along with ZynqMP GQSPI model.
>
>
> This looked good to me so I moved it into our tree and ran our
> testsuite which discovered a couple of more issues.
>
> I'll comment inline and show the fixes I made to make the tests pass.
>
>
> >
> > Signed-off-by: Xuzhou Cheng <xuzhou.cheng@windriver.com>
> > Signed-off-by: Bin Meng <bin.meng@windriver.com>
> >
> > ---
> >
> > Changes in v6:
> > - int_enable and int_disable do not have its own state
> > - return 0 in int_enable_pre_write() and int_disable_pre_write()
> > - remove XLNX_CSU_DMA_INT_R_MASK in int_status_pre_write()
> >
> > Changes in v5:
> > - int_enable and int_disable do not affect each other
> > - do not modify int_status int_disable_pre_write
> > - set MEM_DONE in xlnx_csu_dma_done if it's SRC
> >
> > Changes in v4:
> > - Add complete CSU DMA model based on Edgar's branch
> > - Differences with Edgar's branch:
> >   1. Match the registers' FIELD to UG1807.
> >   2. Remove "byte-align" property. Per UG1807, SIZE and ADDR registers
> >      must be word aligned.
> >   3. Make the values of int_enable and int_disable mutually exclusive
> >      otherwise IRQ cannot be delivered.
> >   4. Clear int_status after int_disable is set.
> >   5. Coding convention issues clean-up
> >
> > Changes in v3:
> > - Implement DMA as a separate CSU DMA model
> >
> > Changes in v2:
> > - Remove unconnected TYPE_STREAM_SINK link property
> > - Add a TYPE_MEMORY_REGION link property, to allow board codes to tell
> >   the device what its view of the world that it is doing DMA to is
> > - Replace cpu_physical_memory_write() with address_space_write()
> >
> >  include/hw/dma/xlnx_csu_dma.h |  52 +++
> >  hw/dma/xlnx_csu_dma.c         | 743 ++++++++++++++++++++++++++++++++++
> >  hw/dma/Kconfig                |   4 +
> >  hw/dma/meson.build            |   1 +
> >  4 files changed, 800 insertions(+)
> >  create mode 100644 include/hw/dma/xlnx_csu_dma.h
> >  create mode 100644 hw/dma/xlnx_csu_dma.c
> >
> > diff --git a/include/hw/dma/xlnx_csu_dma.h b/include/hw/dma/xlnx_csu_dma.h
> > new file mode 100644
> > index 0000000000..204d94c673
> > --- /dev/null
> > +++ b/include/hw/dma/xlnx_csu_dma.h
> > @@ -0,0 +1,52 @@
> > +/*
> > + * Xilinx Platform CSU Stream DMA emulation
> > + *
> > + * This implementation is based on
> > + * https://github.com/Xilinx/qemu/blob/master/hw/dma/csu_stream_dma.c
> > + *
> > + * This program is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU General Public License as
> > + * published by the Free Software Foundation; either version 2 or
> > + * (at your option) version 3 of the License.
> > + *
> > + * This program is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> > + * GNU General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU General Public License along
> > + * with this program; if not, see <http://www.gnu.org/licenses/>.
> > + */
> > +
> > +#ifndef XLNX_CSU_DMA_H
> > +#define XLNX_CSU_DMA_H
> > +
> > +#define TYPE_XLNX_CSU_DMA "xlnx.csu_dma"
> > +
> > +#define XLNX_CSU_DMA_R_MAX (0x2c / 4)
> > +
> > +typedef struct XlnxCSUDMA {
> > +    SysBusDevice busdev;
> > +    MemoryRegion iomem;
> > +    MemTxAttrs attr;
> > +    MemoryRegion *dma_mr;
> > +    AddressSpace *dma_as;
> > +    qemu_irq irq;
> > +    StreamSink *tx_dev; /* Used as generic StreamSink */
> > +    ptimer_state *src_timer;
> > +
> > +    uint16_t width;
> > +    bool is_dst;
> > +    bool r_size_last_word;
> > +
> > +    StreamCanPushNotifyFn notify;
> > +    void *notify_opaque;
> > +
> > +    uint32_t regs[XLNX_CSU_DMA_R_MAX];
> > +    RegisterInfo regs_info[XLNX_CSU_DMA_R_MAX];
> > +} XlnxCSUDMA;
> > +
> > +#define XLNX_CSU_DMA(obj) \
> > +    OBJECT_CHECK(XlnxCSUDMA, (obj), TYPE_XLNX_CSU_DMA)
> > +
> > +#endif
> > diff --git a/hw/dma/xlnx_csu_dma.c b/hw/dma/xlnx_csu_dma.c
> > new file mode 100644
> > index 0000000000..5efb60dd9e
> > --- /dev/null
> > +++ b/hw/dma/xlnx_csu_dma.c
> > @@ -0,0 +1,743 @@
> > +/*
> > + * Xilinx Platform CSU Stream DMA emulation
> > + *
> > + * This implementation is based on
> > + * https://github.com/Xilinx/qemu/blob/master/hw/dma/csu_stream_dma.c
> > + *
> > + * This program is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU General Public License as
> > + * published by the Free Software Foundation; either version 2 or
> > + * (at your option) version 3 of the License.
> > + *
> > + * This program is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> > + * GNU General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU General Public License along
> > + * with this program; if not, see <http://www.gnu.org/licenses/>.
> > + */
> > +
> > +#include "qemu/osdep.h"
> > +#include "qemu/log.h"
> > +#include "qapi/error.h"
> > +#include "hw/hw.h"
> > +#include "hw/irq.h"
> > +#include "hw/qdev-properties.h"
> > +#include "hw/sysbus.h"
> > +#include "migration/vmstate.h"
> > +#include "sysemu/dma.h"
> > +#include "hw/ptimer.h"
> > +#include "hw/stream.h"
> > +#include "hw/register.h"
> > +#include "hw/dma/xlnx_csu_dma.h"
> > +
> > +/*
> > + * Ref: UG1087 (v1.7) February 8, 2019
> > + * https://www.xilinx.com/html_docs/registers/ug1087/ug1087-zynq-ultrascale-registers.html
> > + * CSUDMA Module section
> > + */
> > +REG32(ADDR, 0x0)
> > +    FIELD(ADDR, ADDR, 2, 30) /* wo */
> > +REG32(SIZE, 0x4)
> > +    FIELD(SIZE, SIZE, 2, 27) /* wo */
> > +    FIELD(SIZE, LAST_WORD, 0, 1) /* rw, only exists in SRC */
> > +REG32(STATUS, 0x8)
> > +    FIELD(STATUS, DONE_CNT, 13, 3) /* wtc */
> > +    FIELD(STATUS, FIFO_LEVEL, 5, 8) /* ro */
> > +    FIELD(STATUS, OUTSTANDING, 1, 4) /* ro */
> > +    FIELD(STATUS, BUSY, 0, 1) /* ro */
> > +REG32(CTRL, 0xc)
> > +    FIELD(CTRL, FIFOTHRESH, 25, 7) /* rw, only exists in DST, reset 0x40 */
> > +    FIELD(CTRL, APB_ERR_RESP, 24, 1) /* rw */
> > +    FIELD(CTRL, ENDIANNESS, 23, 1) /* rw */
> > +    FIELD(CTRL, AXI_BRST_TYPE, 22, 1) /* rw */
> > +    FIELD(CTRL, TIMEOUT_VAL, 10, 12) /* rw, reset: 0xFFE */
> > +    FIELD(CTRL, FIFO_THRESH, 2, 8) /* rw, reset: 0x80 */
> > +    FIELD(CTRL, PAUSE_STRM, 1, 1) /* rw */
> > +    FIELD(CTRL, PAUSE_MEM, 0, 1) /* rw */
> > +REG32(CRC, 0x10)
> > +REG32(INT_STATUS, 0x14)
> > +    FIELD(INT_STATUS, FIFO_OVERFLOW, 7, 1) /* wtc */
> > +    FIELD(INT_STATUS, INVALID_APB, 6, 1) /* wtc */
> > +    FIELD(INT_STATUS, THRESH_HIT, 5, 1) /* wtc */
> > +    FIELD(INT_STATUS, TIMEOUT_MEM, 4, 1) /* wtc */
> > +    FIELD(INT_STATUS, TIMEOUT_STRM, 3, 1) /* wtc */
> > +    FIELD(INT_STATUS, AXI_BRESP_ERR, 2, 1) /* wtc, SRC: AXI_RDERR */
> > +    FIELD(INT_STATUS, DONE, 1, 1) /* wtc */
> > +    FIELD(INT_STATUS, MEM_DONE, 0, 1) /* wtc */
> > +REG32(INT_ENABLE, 0x18)
> > +    FIELD(INT_ENABLE, FIFO_OVERFLOW, 7, 1) /* wtc */
> > +    FIELD(INT_ENABLE, INVALID_APB, 6, 1) /* wtc */
> > +    FIELD(INT_ENABLE, THRESH_HIT, 5, 1) /* wtc */
> > +    FIELD(INT_ENABLE, TIMEOUT_MEM, 4, 1) /* wtc */
> > +    FIELD(INT_ENABLE, TIMEOUT_STRM, 3, 1) /* wtc */
> > +    FIELD(INT_ENABLE, AXI_BRESP_ERR, 2, 1) /* wtc, SRC: AXI_RDERR */
> > +    FIELD(INT_ENABLE, DONE, 1, 1) /* wtc */
> > +    FIELD(INT_ENABLE, MEM_DONE, 0, 1) /* wtc */
> > +REG32(INT_DISABLE, 0x1c)
> > +    FIELD(INT_DISABLE, FIFO_OVERFLOW, 7, 1) /* wtc */
> > +    FIELD(INT_DISABLE, INVALID_APB, 6, 1) /* wtc */
> > +    FIELD(INT_DISABLE, THRESH_HIT, 5, 1) /* wtc */
> > +    FIELD(INT_DISABLE, TIMEOUT_MEM, 4, 1) /* wtc */
> > +    FIELD(INT_DISABLE, TIMEOUT_STRM, 3, 1) /* wtc */
> > +    FIELD(INT_DISABLE, AXI_BRESP_ERR, 2, 1) /* wtc, SRC: AXI_RDERR */
> > +    FIELD(INT_DISABLE, DONE, 1, 1) /* wtc */
> > +    FIELD(INT_DISABLE, MEM_DONE, 0, 1) /* wtc */
> > +REG32(INT_MASK, 0x20)
> > +    FIELD(INT_MASK, FIFO_OVERFLOW, 7, 1) /* ro, reset: 0x1 */
> > +    FIELD(INT_MASK, INVALID_APB, 6, 1) /* ro, reset: 0x1 */
> > +    FIELD(INT_MASK, THRESH_HIT, 5, 1) /* ro, reset: 0x1 */
> > +    FIELD(INT_MASK, TIMEOUT_MEM, 4, 1) /* ro, reset: 0x1 */
> > +    FIELD(INT_MASK, TIMEOUT_STRM, 3, 1) /* ro, reset: 0x1 */
> > +    FIELD(INT_MASK, AXI_BRESP_ERR, 2, 1) /* ro, reset: 0x1, SRC: AXI_RDERR */
> > +    FIELD(INT_MASK, DONE, 1, 1) /* ro, reset: 0x1 */
> > +    FIELD(INT_MASK, MEM_DONE, 0, 1) /* ro, reset: 0x1 */
> > +REG32(CTRL2, 0x24)
> > +    FIELD(CTRL2, ARCACHE, 24, 3) /* rw */
> > +    FIELD(CTRL2, ROUTE_BIT, 23, 1) /* rw */
> > +    FIELD(CTRL2, TIMEOUT_EN, 22, 1) /* rw */
> > +    FIELD(CTRL2, TIMEOUT_PRE, 4, 12) /* rw, reset: 0xFFF */
> > +    FIELD(CTRL2, MAX_OUTS_CMDS, 0, 4) /* rw, reset: 0x8 */
> > +REG32(ADDR_MSB, 0x28)
> > +    FIELD(ADDR_MSB, ADDR_MSB, 0, 17) /* wo */
> > +
> > +#define R_CTRL_TIMEOUT_VAL_RESET    (0xFFE)
> > +#define R_CTRL_FIFO_THRESH_RESET    (0x80)
> > +#define R_CTRL_FIFOTHRESH_RESET     (0x40)
> > +
> > +#define R_CTRL2_TIMEOUT_PRE_RESET   (0xFFF)
> > +#define R_CTRL2_MAX_OUTS_CMDS_RESET (0x8)
> > +
> > +#define XLNX_CSU_DMA_ERR_DEBUG      (0)
> > +#define XLNX_CSU_DMA_INT_R_MASK     (0xff)
> > +
> > +/* UG1087: Set the prescaler value for the timeout in clk (~2.5ns) cycles */
> > +#define XLNX_CSU_DMA_TIMER_FREQ     (400 * 1000 * 1000)
> > +
> > +static bool xlnx_csu_dma_is_paused(XlnxCSUDMA *s)
> > +{
> > +    bool paused;
> > +
> > +    paused = !!(s->regs[R_CTRL] & R_CTRL_PAUSE_STRM_MASK);
> > +    paused |= !!(s->regs[R_CTRL] & R_CTRL_PAUSE_MEM_MASK);
> > +
> > +    return paused;
> > +}
> > +
> > +static bool xlnx_csu_dma_get_eop(XlnxCSUDMA *s)
> > +{
> > +    return s->r_size_last_word;
> > +}
> > +
> > +static bool xlnx_csu_dma_burst_is_fixed(XlnxCSUDMA *s)
> > +{
> > +    return !!(s->regs[R_CTRL] & R_CTRL_AXI_BRST_TYPE_MASK);
> > +}
> > +
> > +static bool xlnx_csu_dma_timeout_enabled(XlnxCSUDMA *s)
> > +{
> > +    return !!(s->regs[R_CTRL2] & R_CTRL2_TIMEOUT_EN_MASK);
> > +}
> > +
> > +static void xlnx_csu_dma_update_done_cnt(XlnxCSUDMA *s, int a)
> > +{
> > +    int cnt;
> > +
> > +    /* Adjust DONE_CNT by a (a negative value decrements it) */
> > +    cnt = ARRAY_FIELD_EX32(s->regs, STATUS, DONE_CNT) + a;
> > +    ARRAY_FIELD_DP32(s->regs, STATUS, DONE_CNT, cnt);
> > +}
> > +
> > +static void xlnx_csu_dma_data_process(XlnxCSUDMA *s, uint8_t *buf, uint32_t len)
> > +{
> > +    uint32_t bswap;
> > +    uint32_t i;
> > +
> > +    bswap = s->regs[R_CTRL] & R_CTRL_ENDIANNESS_MASK;
> > +    if (!s->is_dst && !bswap) {
>
> This test is the wrong way around; it should say:
>
> if (s->is_dst && !bswap) {
>
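To illustrate the suggested condition, here is a minimal standalone sketch. It is not QEMU code: DemoDma and demo_data_process() are hypothetical stand-ins for the few XlnxCSUDMA fields and the helper discussed above. The assumption it encodes is that only the DST channel with ENDIANNESS cleared has nothing to do, because SRC still needs to accumulate the CRC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the XlnxCSUDMA fields used on this path. */
typedef struct {
    bool is_dst;    /* true for the DST channel, false for SRC */
    bool bswap;     /* models CTRL.ENDIANNESS */
    uint32_t crc;   /* models the CRC register (a running word sum) */
} DemoDma;

static void demo_data_process(DemoDma *s, uint8_t *buf, uint32_t len)
{
    uint32_t i;

    assert(len % 4 == 0);   /* the sketch only handles whole words */

    /* Fast path: only DST without byte swapping has nothing to do. */
    if (s->is_dst && !s->bswap) {
        return;
    }

    for (i = 0; i < len; i += 4) {
        uint8_t *b = &buf[i];
        uint8_t t[4] = { b[0], b[1], b[2], b[3] };
        uint32_t word;

        memcpy(&word, t, sizeof(word));   /* host-endian word, as the union does */
        if (!s->is_dst) {
            s->crc += word;               /* SRC accumulates with or without swap */
        }
        if (s->bswap) {
            b[0] = t[3];
            b[1] = t[2];
            b[2] = t[1];
            b[3] = t[0];
        }
    }
}
```

With the test as originally posted, an SRC channel with ENDIANNESS cleared would return early and leave the CRC untouched; the sketch above keeps summing in that case.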
>
>
> > +        /* Fast when ENDIANNESS cleared or it's SRC */
> > +        return;
> > +    }
> > +
> > +    for (i = 0; i < len; i += 4) {
> > +        uint8_t *b = &buf[i];
> > +        union {
> > +            uint8_t u8[4];
> > +            uint32_t u32;
> > +        } v = {
> > +            .u8 = { b[0], b[1], b[2], b[3] }
> > +        };
> > +
> > +        if (!s->is_dst) {
> > +            s->regs[R_CRC] += v.u32;
> > +        }
> > +        if (bswap) {
> > +            /*
> > +             * No point using bswap, we need to writeback
> > +             * into a potentially unaligned pointer.
> > +             */
> > +            b[0] = v.u8[3];
> > +            b[1] = v.u8[2];
> > +            b[2] = v.u8[1];
> > +            b[3] = v.u8[0];
> > +        }
> > +    }
> > +}
> > +
> > +/* len is in bytes */
> > +static uint32_t xlnx_csu_dma_write(XlnxCSUDMA *s, uint8_t *buf, uint32_t len)
> > +{
> > +    hwaddr addr = (hwaddr)s->regs[R_ADDR_MSB] << 32 | s->regs[R_ADDR];
> > +    MemTxResult result = MEMTX_OK;
> > +
> > +    xlnx_csu_dma_data_process(s, buf, len);
> > +    if (xlnx_csu_dma_burst_is_fixed(s)) {
> > +        uint32_t i;
> > +
> > +        for (i = 0; i < len && (result == MEMTX_OK); i += s->width) {
> > +            uint32_t mlen = MIN(len - i, s->width);
> > +
> > +            result = address_space_rw(s->dma_as, addr, s->attr,
> > +                                      buf, mlen, true);
> > +            buf += mlen;
> > +        }
> > +    } else {
> > +        result = address_space_rw(s->dma_as, addr, s->attr, buf, len, true);
> > +    }
> > +
> > +    if (result == MEMTX_OK) {
> > +        return len;
> > +    } else {
> > +        qemu_log_mask(LOG_GUEST_ERROR, "%s: Bad address 0x%lx for mem write",
> > +                      __func__, addr);
> > +        return 0;
> > +    }
>
> This causes a QEMU hang when accessing areas that are not reachable.
>
> We should always return len here, even when the access fails, and as a
> bonus set the AXI_BRESP_ERR flag in INT_STATUS (which we had missed in
> our model as well).
>
> e.g:
>     if (result != MEMTX_OK) {
>         qemu_log_mask(LOG_GUEST_ERROR, "%s: Bad address 0x%lx for mem write",
>                       __func__, addr);
>         s->regs[R_INT_STATUS] |= R_INT_STATUS_AXI_BRESP_ERR_MASK;
>         xlnx_csu_dma_update_irq(s);
>     }
>     return len;
>
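The error handling suggested above can be sketched in isolation as follows. This is a simplified model, not the QEMU implementation: DemoDma, demo_update_irq() and demo_finish_access() are hypothetical names, and the mem_ok parameter stands in for comparing the address_space_rw() result against MEMTX_OK. The bit position of AXI_BRESP_ERR is taken from the FIELD() table in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_INT_AXI_BRESP_ERR (1u << 2)   /* bit 2, as in the FIELD() table */

/* Hypothetical stand-in for the interrupt state of the device. */
typedef struct {
    uint32_t int_status;   /* latched interrupt causes */
    uint32_t int_mask;     /* 1 = interrupt masked */
    bool irq_level;        /* models the level passed to qemu_set_irq() */
} DemoDma;

static void demo_update_irq(DemoDma *s)
{
    s->irq_level = (s->int_status & ~s->int_mask) != 0;
}

/*
 * Tail of a read or write helper: report the failure through the
 * interrupt status register, but always claim the full length.
 */
static uint32_t demo_finish_access(DemoDma *s, bool mem_ok, uint32_t len)
{
    assert(len > 0);

    if (!mem_ok) {
        s->int_status |= DEMO_INT_AXI_BRESP_ERR;
        demo_update_irq(s);
    }
    return len;
}
```

Returning len on failure presumably lets the caller keep consuming the transfer instead of retrying the same bytes indefinitely, which would match the reported hang. Note that in the patch as posted xlnx_csu_dma_update_irq() is defined after xlnx_csu_dma_write() and xlnx_csu_dma_read(), so it would need to move up or be forward-declared before either helper can call it.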
>
> > +}
> > +
> > +/* len is in bytes */
> > +static uint32_t xlnx_csu_dma_read(XlnxCSUDMA *s, uint8_t *buf, uint32_t len)
> > +{
> > +    hwaddr addr = (hwaddr)s->regs[R_ADDR_MSB] << 32 | s->regs[R_ADDR];
> > +    MemTxResult result = MEMTX_OK;
> > +
> > +    if (xlnx_csu_dma_burst_is_fixed(s)) {
> > +        uint32_t i;
> > +
> > +        for (i = 0; i < len && (result == MEMTX_OK); i += s->width) {
> > +            uint32_t mlen = MIN(len - i, s->width);
> > +
> > +            result = address_space_rw(s->dma_as, addr, s->attr,
> > +                                      buf + i, mlen, false);
> > +        }
> > +    } else {
> > +        result = address_space_rw(s->dma_as, addr, s->attr, buf, len, false);
> > +    }
> > +
> > +    if (result == MEMTX_OK) {
> > +        xlnx_csu_dma_data_process(s, buf, len);
> > +        return len;
> > +    } else {
> > +        qemu_log_mask(LOG_GUEST_ERROR, "%s: Bad address 0x%lx for mem read",
> > +                      __func__, addr);
> > +        return 0;
> > +    }
>
> As with writes, we need to return len and set the RESP ERR irq.
>
>
> > +}
> > +
> > +static void xlnx_csu_dma_update_irq(XlnxCSUDMA *s)
> > +{
> > +    qemu_set_irq(s->irq, !!(s->regs[R_INT_STATUS] & ~s->regs[R_INT_MASK]));
> > +}
> > +
> > +static void xlnx_csu_dma_done(XlnxCSUDMA *s)
> > +{
> > +    s->regs[R_STATUS] &= ~R_STATUS_BUSY_MASK;
> > +    s->regs[R_INT_STATUS] |= R_INT_STATUS_DONE_MASK;
> > +
> > +    if (!s->is_dst) {
> > +        s->regs[R_INT_STATUS] |= R_INT_STATUS_MEM_DONE_MASK;
> > +    }
> > +
> > +    xlnx_csu_dma_update_done_cnt(s, 1);
> > +}
> > +
> > +static uint32_t xlnx_csu_dma_advance(XlnxCSUDMA *s, uint32_t len)
> > +{
> > +    uint32_t size = s->regs[R_SIZE];
> > +    hwaddr dst = (hwaddr)s->regs[R_ADDR_MSB] << 32 | s->regs[R_ADDR];
> > +
> > +    assert(len <= size);
> > +
> > +    size -= len;
> > +    s->regs[R_SIZE] = size;
> > +
> > +    if (!xlnx_csu_dma_burst_is_fixed(s)) {
> > +        dst += len;
> > +        s->regs[R_ADDR] = (uint32_t) dst;
> > +        s->regs[R_ADDR_MSB] = dst >> 32;
> > +    }
> > +
> > +    if (size == 0) {
> > +        xlnx_csu_dma_done(s);
> > +    }
> > +
> > +    return size;
> > +}
> > +
> > +static void xlnx_csu_dma_src_notify(void *opaque)
> > +{
> > +    XlnxCSUDMA *s = XLNX_CSU_DMA(opaque);
> > +    unsigned char buf[4 * 1024];
> > +    size_t rlen = 0;
> > +
> > +    ptimer_transaction_begin(s->src_timer);
> > +    /* Stop the backpressure timer */
> > +    ptimer_stop(s->src_timer);
> > +
> > +    while (s->regs[R_SIZE] && !xlnx_csu_dma_is_paused(s) &&
> > +           stream_can_push(s->tx_dev, xlnx_csu_dma_src_notify, s)) {
> > +        uint32_t plen = MIN(s->regs[R_SIZE], sizeof buf);
> > +        bool eop = false;
> > +
> > +        /* Did we fit it all? */
> > +        if (s->regs[R_SIZE] == plen && xlnx_csu_dma_get_eop(s)) {
> > +            eop = true;
> > +        }
> > +
> > +        /* DMA transfer */
> > +        xlnx_csu_dma_read(s, buf, plen);
> > +        rlen = stream_push(s->tx_dev, buf, plen, eop);
> > +        xlnx_csu_dma_advance(s, rlen);
> > +    }
> > +
> > +    if (xlnx_csu_dma_timeout_enabled(s) && s->regs[R_SIZE] &&
> > +        !stream_can_push(s->tx_dev, xlnx_csu_dma_src_notify, s)) {
> > +        uint32_t timeout = ARRAY_FIELD_EX32(s->regs, CTRL, TIMEOUT_VAL);
> > +        uint32_t div = ARRAY_FIELD_EX32(s->regs, CTRL2, TIMEOUT_PRE) + 1;
> > +        uint32_t freq = XLNX_CSU_DMA_TIMER_FREQ;
> > +
> > +        freq /= div;
> > +        ptimer_set_freq(s->src_timer, freq);
> > +        ptimer_set_count(s->src_timer, timeout);
> > +        ptimer_run(s->src_timer, 1);
> > +    }
> > +
> > +    ptimer_transaction_commit(s->src_timer);
> > +    xlnx_csu_dma_update_irq(s);
> > +}
> > +
> > +static uint64_t addr_pre_write(RegisterInfo *reg, uint64_t val)
> > +{
> > +    /* Address is word aligned */
> > +    return val & R_ADDR_ADDR_MASK;
> > +}
> > +
> > +static uint64_t size_pre_write(RegisterInfo *reg, uint64_t val)
> > +{
> > +    XlnxCSUDMA *s = XLNX_CSU_DMA(reg->opaque);
> > +
> > +    if (s->regs[R_SIZE] != 0) {
> > +        qemu_log_mask(LOG_GUEST_ERROR,
> > +                      "%s: Starting DMA while already running.\n", __func__);
> > +    }
> > +
> > +    if (!s->is_dst) {
> > +        s->r_size_last_word = !!(val & R_SIZE_LAST_WORD_MASK);
> > +    }
> > +
> > +    /* Size is word aligned */
> > +    return val & R_SIZE_SIZE_MASK;
> > +}
> > +
> > +static uint64_t size_post_read(RegisterInfo *reg, uint64_t val)
> > +{
> > +    XlnxCSUDMA *s = XLNX_CSU_DMA(reg->opaque);
> > +
> > +    return val | s->r_size_last_word;
> > +}
> > +
> > +static void size_post_write(RegisterInfo *reg, uint64_t val)
> > +{
> > +    XlnxCSUDMA *s = XLNX_CSU_DMA(reg->opaque);
> > +
> > +    s->regs[R_STATUS] |= R_STATUS_BUSY_MASK;
> > +
> > +    /*
> > +     * Note that if SIZE is programmed to 0, and the DMA is started,
> > +     * the interrupts DONE and MEM_DONE will be asserted.
> > +     */
> > +    if (s->regs[R_SIZE] == 0) {
> > +        xlnx_csu_dma_done(s);
> > +        xlnx_csu_dma_update_irq(s);
> > +        return;
> > +    }
> > +
> > +    /* Setting SIZE is considered the last step in transfer configuration */
> > +    if (!s->is_dst) {
> > +        xlnx_csu_dma_src_notify(s);
> > +    } else {
> > +        if (s->notify) {
> > +            s->notify(s->notify_opaque);
> > +        }
> > +    }
> > +}
> > +
> > +static uint64_t status_pre_write(RegisterInfo *reg, uint64_t val)
> > +{
> > +    return val & (R_STATUS_DONE_CNT_MASK | R_STATUS_BUSY_MASK);
> > +}
> > +
> > +static uint64_t addr_msb_pre_write(RegisterInfo *reg, uint64_t val)
> > +{
> > +    return val & R_ADDR_MSB_ADDR_MSB_MASK;
> > +}
> > +
> > +static void ctrl_post_write(RegisterInfo *reg, uint64_t val)
> > +{
> > +    XlnxCSUDMA *s = XLNX_CSU_DMA(reg->opaque);
> > +
> > +    if (!s->is_dst) {
> > +        if (!xlnx_csu_dma_is_paused(s)) {
> > +            xlnx_csu_dma_src_notify(s);
> > +        }
> > +    } else {
> > +        if (!xlnx_csu_dma_is_paused(s) && s->notify) {
> > +            s->notify(s->notify_opaque);
> > +        }
> > +    }
> > +}
> > +
> > +static uint64_t int_status_pre_write(RegisterInfo *reg, uint64_t val)
> > +{
> > +    XlnxCSUDMA *s = XLNX_CSU_DMA(reg->opaque);
> > +
> > +    /* DMA counter decrements when flag 'DONE' is cleared */
> > +    if ((val & s->regs[R_INT_STATUS] & R_INT_STATUS_DONE_MASK)) {
> > +        xlnx_csu_dma_update_done_cnt(s, -1);
> > +    }
> > +
> > +    return s->regs[R_INT_STATUS] & ~val;
> > +}
> > +
> > +static void int_status_post_write(RegisterInfo *reg, uint64_t val)
> > +{
> > +    XlnxCSUDMA *s = XLNX_CSU_DMA(reg->opaque);
> > +
> > +    xlnx_csu_dma_update_irq(s);
> > +}
> > +
> > +static uint64_t int_enable_pre_write(RegisterInfo *reg, uint64_t val)
> > +{
> > +    XlnxCSUDMA *s = XLNX_CSU_DMA(reg->opaque);
> > +    uint32_t v32 = val;
> > +
> > +    /*
> > +     * R_INT_ENABLE doesn't have its own state.
> > +     * It is used to indirectly modify R_INT_MASK.
> > +     *
> > +     * 1: Enable this interrupt field (the mask bit will be cleared to 0)
> > +     * 0: No effect
> > +     */
> > +    s->regs[R_INT_MASK] &= ~v32;
> > +    return 0;
> > +}
> > +
> > +static void int_enable_post_write(RegisterInfo *reg, uint64_t val)
> > +{
> > +    XlnxCSUDMA *s = XLNX_CSU_DMA(reg->opaque);
> > +
> > +    xlnx_csu_dma_update_irq(s);
> > +}
> > +
> > +static uint64_t int_disable_pre_write(RegisterInfo *reg, uint64_t val)
> > +{
> > +    XlnxCSUDMA *s = XLNX_CSU_DMA(reg->opaque);
> > +    uint32_t v32 = val;
> > +
> > +    /*
> > +     * R_INT_DISABLE doesn't have its own state.
> > +     * It is used to indirectly modify R_INT_MASK.
> > +     *
> > +     * 1: Disable this interrupt field (the mask bit will be set to 1)
> > +     * 0: No effect
> > +     */
> > +    s->regs[R_INT_MASK] |= v32;
> > +    return 0;
> > +}
> > +
> > +static void int_disable_post_write(RegisterInfo *reg, uint64_t val)
> > +{
> > +    XlnxCSUDMA *s = XLNX_CSU_DMA(reg->opaque);
> > +
> > +    xlnx_csu_dma_update_irq(s);
> > +}
> > +
> > +static const RegisterAccessInfo *xlnx_csu_dma_regs_info[] = {
> > +#define DMACH_REGINFO(NAME, snd)                                              \
> > +    (const RegisterAccessInfo []) {                                           \
> > +        {                                                                     \
> > +            .name = #NAME "_ADDR",                                            \
> > +            .addr = A_ADDR,                                                   \
> > +            .pre_write = addr_pre_write                                       \
> > +        }, {                                                                  \
> > +            .name = #NAME "_SIZE",                                            \
> > +            .addr = A_SIZE,                                                   \
> > +            .pre_write = size_pre_write,                                      \
> > +            .post_write = size_post_write,                                    \
> > +            .post_read = size_post_read                                       \
> > +        }, {                                                                  \
> > +            .name = #NAME "_STATUS",                                          \
> > +            .addr = A_STATUS,                                                 \
> > +            .pre_write = status_pre_write,                                    \
>
> The 3 DONE_CNT bits in R_STATUS need to be w1c so that SW can clear the counter.
>
>             .w1c = R_STATUS_DONE_CNT_MASK,                                    \
>
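For readers unfamiliar with write-1-to-clear semantics, a minimal sketch of the behaviour requested for DONE_CNT follows. It is a simplified model and not the register API implementation: demo_w1c_write() is a hypothetical helper, and every bit outside the w1c mask is treated as read-only for brevity. The mask value follows the FIELD(STATUS, DONE_CNT, 13, 3) definition in the patch.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_STATUS_DONE_CNT_MASK (0x7u << 13)   /* DONE_CNT: bits 15:13 */

/*
 * Write-1-to-clear: each bit inside w1c_mask is cleared when the guest
 * writes a 1 to it and is left alone when the guest writes a 0.
 * Bits outside w1c_mask keep their old value in this simplified model.
 */
static uint32_t demo_w1c_write(uint32_t old, uint32_t written,
                               uint32_t w1c_mask)
{
    assert((w1c_mask & ~0xffffffffu) == 0);

    return old & ~(written & w1c_mask);
}
```

In this model, software zeroes the counter by writing 1s to all three DONE_CNT bits; writing 0 leaves the count as it was.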
>
> For reference, I'm attaching the complete diff I used to pass the testsuite:
>

Thanks for the review and testing. We will incorporate your diff, test
and send v7.

Should we include your SoB tag for the diff?

Regards,
Bin
Edgar E. Iglesias March 2, 2021, 3:19 p.m. UTC | #3
On Tue, Mar 02, 2021 at 11:16:10PM +0800, Bin Meng wrote:
> Hi Edgar,

Hi Bin,


> 
> On Tue, Mar 2, 2021 at 11:03 PM Edgar E. Iglesias
> <edgar.iglesias@gmail.com> wrote:
> >
> > On Mon, Mar 01, 2021 at 09:20:07PM +0800, Bin Meng wrote:
> > > From: Xuzhou Cheng <xuzhou.cheng@windriver.com>
> > >
> > > ZynqMP QSPI supports SPI transfer using DMA mode, but currently this

....

> > The 3 DONE_CNT bits in R_STATUS need to be w1c so that SW can clear the counter.
> >
> >             .w1c = R_STATUS_DONE_CNT_MASK,                                    \
> >
> >
> > For reference, I'm attaching the complete diff I used to pass the testsuite:
> >
> 
> Thanks for the review and testing. We will incorporate your diff, test
> and send v7.
> 
> Should we include your SoB tag for the diff?

If you take the changes as in my example, you can include:

Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>

Thanks & Best regards,
Edgar
Bin Meng March 2, 2021, 3:21 p.m. UTC | #4
Hi Edgar,

On Tue, Mar 2, 2021 at 11:19 PM Edgar E. Iglesias
<edgar.iglesias@gmail.com> wrote:
>
> On Tue, Mar 02, 2021 at 11:16:10PM +0800, Bin Meng wrote:
> > Hi Edgar,
>
> Hi Bin,
>
>
> >
> > On Tue, Mar 2, 2021 at 11:03 PM Edgar E. Iglesias
> > <edgar.iglesias@gmail.com> wrote:
> > >
> > > On Mon, Mar 01, 2021 at 09:20:07PM +0800, Bin Meng wrote:
> > > > From: Xuzhou Cheng <xuzhou.cheng@windriver.com>
> > > >
> > > > ZynqMP QSPI supports SPI transfer using DMA mode, but currently this
>
> ....
>
> > > The 3 DONE_CNT bits in R_STATUS need to be w1c so that SW can clear the counter.
> > >
> > >             .w1c = R_STATUS_DONE_CNT_MASK,                                    \
> > >
> > >
> > > For reference, I'm attaching the complete diff I used to pass the testsuite:
> > >
> >
> > Thanks for the review and testing. We will incorporate your diff, test
> > and send v7.
> >
> > Should we include your SoB tag for the diff?
>
> If you take the changes as in my example, you can include:
>
> Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>

Great, thank you!

Regards,
Bin
Patch

diff --git a/include/hw/dma/xlnx_csu_dma.h b/include/hw/dma/xlnx_csu_dma.h
new file mode 100644
index 0000000000..204d94c673
--- /dev/null
+++ b/include/hw/dma/xlnx_csu_dma.h
@@ -0,0 +1,52 @@ 
+/*
+ * Xilinx Platform CSU Stream DMA emulation
+ *
+ * This implementation is based on
+ * https://github.com/Xilinx/qemu/blob/master/hw/dma/csu_stream_dma.c
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 or
+ * (at your option) version 3 of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef XLNX_CSU_DMA_H
+#define XLNX_CSU_DMA_H
+
+#define TYPE_XLNX_CSU_DMA "xlnx.csu_dma"
+
+#define XLNX_CSU_DMA_R_MAX (0x2c / 4)
+
+typedef struct XlnxCSUDMA {
+    SysBusDevice busdev;
+    MemoryRegion iomem;
+    MemTxAttrs attr;
+    MemoryRegion *dma_mr;
+    AddressSpace *dma_as;
+    qemu_irq irq;
+    StreamSink *tx_dev; /* Used as generic StreamSink */
+    ptimer_state *src_timer;
+
+    uint16_t width;
+    bool is_dst;
+    bool r_size_last_word;
+
+    StreamCanPushNotifyFn notify;
+    void *notify_opaque;
+
+    uint32_t regs[XLNX_CSU_DMA_R_MAX];
+    RegisterInfo regs_info[XLNX_CSU_DMA_R_MAX];
+} XlnxCSUDMA;
+
+#define XLNX_CSU_DMA(obj) \
+    OBJECT_CHECK(XlnxCSUDMA, (obj), TYPE_XLNX_CSU_DMA)
+
+#endif
diff --git a/hw/dma/xlnx_csu_dma.c b/hw/dma/xlnx_csu_dma.c
new file mode 100644
index 0000000000..5efb60dd9e
--- /dev/null
+++ b/hw/dma/xlnx_csu_dma.c
@@ -0,0 +1,743 @@ 
+/*
+ * Xilinx Platform CSU Stream DMA emulation
+ *
+ * This implementation is based on
+ * https://github.com/Xilinx/qemu/blob/master/hw/dma/csu_stream_dma.c
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 or
+ * (at your option) version 3 of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "qapi/error.h"
+#include "hw/hw.h"
+#include "hw/irq.h"
+#include "hw/qdev-properties.h"
+#include "hw/sysbus.h"
+#include "migration/vmstate.h"
+#include "sysemu/dma.h"
+#include "hw/ptimer.h"
+#include "hw/stream.h"
+#include "hw/register.h"
+#include "hw/dma/xlnx_csu_dma.h"
+
+/*
+ * Ref: UG1087 (v1.7) February 8, 2019
+ * https://www.xilinx.com/html_docs/registers/ug1087/ug1087-zynq-ultrascale-registers.html
+ * CSUDMA Module section
+ */
+REG32(ADDR, 0x0)
+    FIELD(ADDR, ADDR, 2, 30) /* wo */
+REG32(SIZE, 0x4)
+    FIELD(SIZE, SIZE, 2, 27) /* wo */
+    FIELD(SIZE, LAST_WORD, 0, 1) /* rw, only exists in SRC */
+REG32(STATUS, 0x8)
+    FIELD(STATUS, DONE_CNT, 13, 3) /* wtc */
+    FIELD(STATUS, FIFO_LEVEL, 5, 8) /* ro */
+    FIELD(STATUS, OUTSTANDING, 1, 4) /* ro */
+    FIELD(STATUS, BUSY, 0, 1) /* ro */
+REG32(CTRL, 0xc)
+    FIELD(CTRL, FIFOTHRESH, 25, 7) /* rw, only exists in DST, reset 0x40 */
+    FIELD(CTRL, APB_ERR_RESP, 24, 1) /* rw */
+    FIELD(CTRL, ENDIANNESS, 23, 1) /* rw */
+    FIELD(CTRL, AXI_BRST_TYPE, 22, 1) /* rw */
+    FIELD(CTRL, TIMEOUT_VAL, 10, 12) /* rw, reset: 0xFFE */
+    FIELD(CTRL, FIFO_THRESH, 2, 8) /* rw, reset: 0x80 */
+    FIELD(CTRL, PAUSE_STRM, 1, 1) /* rw */
+    FIELD(CTRL, PAUSE_MEM, 0, 1) /* rw */
+REG32(CRC, 0x10)
+REG32(INT_STATUS, 0x14)
+    FIELD(INT_STATUS, FIFO_OVERFLOW, 7, 1) /* wtc */
+    FIELD(INT_STATUS, INVALID_APB, 6, 1) /* wtc */
+    FIELD(INT_STATUS, THRESH_HIT, 5, 1) /* wtc */
+    FIELD(INT_STATUS, TIMEOUT_MEM, 4, 1) /* wtc */
+    FIELD(INT_STATUS, TIMEOUT_STRM, 3, 1) /* wtc */
+    FIELD(INT_STATUS, AXI_BRESP_ERR, 2, 1) /* wtc, SRC: AXI_RDERR */
+    FIELD(INT_STATUS, DONE, 1, 1) /* wtc */
+    FIELD(INT_STATUS, MEM_DONE, 0, 1) /* wtc */
+REG32(INT_ENABLE, 0x18)
+    FIELD(INT_ENABLE, FIFO_OVERFLOW, 7, 1) /* wtc */
+    FIELD(INT_ENABLE, INVALID_APB, 6, 1) /* wtc */
+    FIELD(INT_ENABLE, THRESH_HIT, 5, 1) /* wtc */
+    FIELD(INT_ENABLE, TIMEOUT_MEM, 4, 1) /* wtc */
+    FIELD(INT_ENABLE, TIMEOUT_STRM, 3, 1) /* wtc */
+    FIELD(INT_ENABLE, AXI_BRESP_ERR, 2, 1) /* wtc, SRC: AXI_RDERR */
+    FIELD(INT_ENABLE, DONE, 1, 1) /* wtc */
+    FIELD(INT_ENABLE, MEM_DONE, 0, 1) /* wtc */
+REG32(INT_DISABLE, 0x1c)
+    FIELD(INT_DISABLE, FIFO_OVERFLOW, 7, 1) /* wtc */
+    FIELD(INT_DISABLE, INVALID_APB, 6, 1) /* wtc */
+    FIELD(INT_DISABLE, THRESH_HIT, 5, 1) /* wtc */
+    FIELD(INT_DISABLE, TIMEOUT_MEM, 4, 1) /* wtc */
+    FIELD(INT_DISABLE, TIMEOUT_STRM, 3, 1) /* wtc */
+    FIELD(INT_DISABLE, AXI_BRESP_ERR, 2, 1) /* wtc, SRC: AXI_RDERR */
+    FIELD(INT_DISABLE, DONE, 1, 1) /* wtc */
+    FIELD(INT_DISABLE, MEM_DONE, 0, 1) /* wtc */
+REG32(INT_MASK, 0x20)
+    FIELD(INT_MASK, FIFO_OVERFLOW, 7, 1) /* ro, reset: 0x1 */
+    FIELD(INT_MASK, INVALID_APB, 6, 1) /* ro, reset: 0x1 */
+    FIELD(INT_MASK, THRESH_HIT, 5, 1) /* ro, reset: 0x1 */
+    FIELD(INT_MASK, TIMEOUT_MEM, 4, 1) /* ro, reset: 0x1 */
+    FIELD(INT_MASK, TIMEOUT_STRM, 3, 1) /* ro, reset: 0x1 */
+    FIELD(INT_MASK, AXI_BRESP_ERR, 2, 1) /* ro, reset: 0x1, SRC: AXI_RDERR */
+    FIELD(INT_MASK, DONE, 1, 1) /* ro, reset: 0x1 */
+    FIELD(INT_MASK, MEM_DONE, 0, 1) /* ro, reset: 0x1 */
+REG32(CTRL2, 0x24)
+    FIELD(CTRL2, ARCACHE, 24, 3) /* rw */
+    FIELD(CTRL2, ROUTE_BIT, 23, 1) /* rw */
+    FIELD(CTRL2, TIMEOUT_EN, 22, 1) /* rw */
+    FIELD(CTRL2, TIMEOUT_PRE, 4, 12) /* rw, reset: 0xFFF */
+    FIELD(CTRL2, MAX_OUTS_CMDS, 0, 4) /* rw, reset: 0x8 */
+REG32(ADDR_MSB, 0x28)
+    FIELD(ADDR_MSB, ADDR_MSB, 0, 17) /* wo */
+
+#define R_CTRL_TIMEOUT_VAL_RESET    (0xFFE)
+#define R_CTRL_FIFO_THRESH_RESET    (0x80)
+#define R_CTRL_FIFOTHRESH_RESET     (0x40)
+
+#define R_CTRL2_TIMEOUT_PRE_RESET   (0xFFF)
+#define R_CTRL2_MAX_OUTS_CMDS_RESET (0x8)
+
+#define XLNX_CSU_DMA_ERR_DEBUG      (0)
+#define XLNX_CSU_DMA_INT_R_MASK     (0xff)
+
+/* UG1087: Set the prescaler value for the timeout in clk (~2.5ns) cycles */
+#define XLNX_CSU_DMA_TIMER_FREQ     (400 * 1000 * 1000)
+
+static bool xlnx_csu_dma_is_paused(XlnxCSUDMA *s)
+{
+    bool paused;
+
+    paused = !!(s->regs[R_CTRL] & R_CTRL_PAUSE_STRM_MASK);
+    paused |= !!(s->regs[R_CTRL] & R_CTRL_PAUSE_MEM_MASK);
+
+    return paused;
+}
+
+static bool xlnx_csu_dma_get_eop(XlnxCSUDMA *s)
+{
+    return s->r_size_last_word;
+}
+
+static bool xlnx_csu_dma_burst_is_fixed(XlnxCSUDMA *s)
+{
+    return !!(s->regs[R_CTRL] & R_CTRL_AXI_BRST_TYPE_MASK);
+}
+
+static bool xlnx_csu_dma_timeout_enabled(XlnxCSUDMA *s)
+{
+    return !!(s->regs[R_CTRL2] & R_CTRL2_TIMEOUT_EN_MASK);
+}
+
+static void xlnx_csu_dma_update_done_cnt(XlnxCSUDMA *s, int a)
+{
+    int cnt;
+
+    /* Adjust DONE_CNT by a (a negative value decrements it) */
+    cnt = ARRAY_FIELD_EX32(s->regs, STATUS, DONE_CNT) + a;
+    ARRAY_FIELD_DP32(s->regs, STATUS, DONE_CNT, cnt);
+}
+
+static void xlnx_csu_dma_data_process(XlnxCSUDMA *s, uint8_t *buf, uint32_t len)
+{
+    uint32_t bswap;
+    uint32_t i;
+
+    bswap = s->regs[R_CTRL] & R_CTRL_ENDIANNESS_MASK;
+    if (!s->is_dst && !bswap) {
+        /* Fast when ENDIANNESS cleared or it's SRC */
+        return;
+    }
+
+    for (i = 0; i < len; i += 4) {
+        uint8_t *b = &buf[i];
+        union {
+            uint8_t u8[4];
+            uint32_t u32;
+        } v = {
+            .u8 = { b[0], b[1], b[2], b[3] }
+        };
+
+        if (!s->is_dst) {
+            s->regs[R_CRC] += v.u32;
+        }
+        if (bswap) {
+            /*
+             * No point using bswap, we need to writeback
+             * into a potentially unaligned pointer.
+             */
+            b[0] = v.u8[3];
+            b[1] = v.u8[2];
+            b[2] = v.u8[1];
+            b[3] = v.u8[0];
+        }
+    }
+}
+
+/* len is in bytes */
+static uint32_t xlnx_csu_dma_write(XlnxCSUDMA *s, uint8_t *buf, uint32_t len)
+{
+    hwaddr addr = (hwaddr)s->regs[R_ADDR_MSB] << 32 | s->regs[R_ADDR];
+    MemTxResult result = MEMTX_OK;
+
+    xlnx_csu_dma_data_process(s, buf, len);
+    if (xlnx_csu_dma_burst_is_fixed(s)) {
+        uint32_t i;
+
+        for (i = 0; i < len && (result == MEMTX_OK); i += s->width) {
+            uint32_t mlen = MIN(len - i, s->width);
+
+            result = address_space_rw(s->dma_as, addr, s->attr,
+                                      buf, mlen, true);
+            buf += mlen;
+        }
+    } else {
+        result = address_space_rw(s->dma_as, addr, s->attr, buf, len, true);
+    }
+
+    if (result == MEMTX_OK) {
+        return len;
+    } else {
+        qemu_log_mask(LOG_GUEST_ERROR, "%s: Bad address 0x%lx for mem write",
+                      __func__, addr);
+        return 0;
+    }
+}
+
+/* len is in bytes */
+static uint32_t xlnx_csu_dma_read(XlnxCSUDMA *s, uint8_t *buf, uint32_t len)
+{
+    hwaddr addr = (hwaddr)s->regs[R_ADDR_MSB] << 32 | s->regs[R_ADDR];
+    MemTxResult result = MEMTX_OK;
+
+    if (xlnx_csu_dma_burst_is_fixed(s)) {
+        uint32_t i;
+
+        for (i = 0; i < len && (result == MEMTX_OK); i += s->width) {
+            uint32_t mlen = MIN(len - i, s->width);
+
+            result = address_space_rw(s->dma_as, addr, s->attr,
+                                      buf + i, mlen, false);
+        }
+    } else {
+        result = address_space_rw(s->dma_as, addr, s->attr, buf, len, false);
+    }
+
+    if (result == MEMTX_OK) {
+        xlnx_csu_dma_data_process(s, buf, len);
+        return len;
+    } else {
+        qemu_log_mask(LOG_GUEST_ERROR, "%s: Bad address 0x%lx for mem read",
+                      __func__, addr);
+        return 0;
+    }
+}
+
+static void xlnx_csu_dma_update_irq(XlnxCSUDMA *s)
+{
+    qemu_set_irq(s->irq, !!(s->regs[R_INT_STATUS] & ~s->regs[R_INT_MASK]));
+}
+
+static void xlnx_csu_dma_done(XlnxCSUDMA *s)
+{
+    s->regs[R_STATUS] &= ~R_STATUS_BUSY_MASK;
+    s->regs[R_INT_STATUS] |= R_INT_STATUS_DONE_MASK;
+
+    if (!s->is_dst) {
+        s->regs[R_INT_STATUS] |= R_INT_STATUS_MEM_DONE_MASK;
+    }
+
+    xlnx_csu_dma_update_done_cnt(s, 1);
+}
+
+static uint32_t xlnx_csu_dma_advance(XlnxCSUDMA *s, uint32_t len)
+{
+    uint32_t size = s->regs[R_SIZE];
+    hwaddr dst = (hwaddr)s->regs[R_ADDR_MSB] << 32 | s->regs[R_ADDR];
+
+    assert(len <= size);
+
+    size -= len;
+    s->regs[R_SIZE] = size;
+
+    if (!xlnx_csu_dma_burst_is_fixed(s)) {
+        dst += len;
+        s->regs[R_ADDR] = (uint32_t) dst;
+        s->regs[R_ADDR_MSB] = dst >> 32;
+    }
+
+    if (size == 0) {
+        xlnx_csu_dma_done(s);
+    }
+
+    return size;
+}
+
+static void xlnx_csu_dma_src_notify(void *opaque)
+{
+    XlnxCSUDMA *s = XLNX_CSU_DMA(opaque);
+    unsigned char buf[4 * 1024];
+    size_t rlen = 0;
+
+    ptimer_transaction_begin(s->src_timer);
+    /* Stop the backpressure timer */
+    ptimer_stop(s->src_timer);
+
+    while (s->regs[R_SIZE] && !xlnx_csu_dma_is_paused(s) &&
+           stream_can_push(s->tx_dev, xlnx_csu_dma_src_notify, s)) {
+        uint32_t plen = MIN(s->regs[R_SIZE], sizeof buf);
+        bool eop = false;
+
+        /* Did we fit it all? */
+        if (s->regs[R_SIZE] == plen && xlnx_csu_dma_get_eop(s)) {
+            eop = true;
+        }
+
+        /* DMA transfer */
+        xlnx_csu_dma_read(s, buf, plen);
+        rlen = stream_push(s->tx_dev, buf, plen, eop);
+        xlnx_csu_dma_advance(s, rlen);
+    }
+
+    if (xlnx_csu_dma_timeout_enabled(s) && s->regs[R_SIZE] &&
+        !stream_can_push(s->tx_dev, xlnx_csu_dma_src_notify, s)) {
+        uint32_t timeout = ARRAY_FIELD_EX32(s->regs, CTRL, TIMEOUT_VAL);
+        uint32_t div = ARRAY_FIELD_EX32(s->regs, CTRL2, TIMEOUT_PRE) + 1;
+        uint32_t freq = XLNX_CSU_DMA_TIMER_FREQ;
+
+        freq /= div;
+        ptimer_set_freq(s->src_timer, freq);
+        ptimer_set_count(s->src_timer, timeout);
+        ptimer_run(s->src_timer, 1);
+    }
+
+    ptimer_transaction_commit(s->src_timer);
+    xlnx_csu_dma_update_irq(s);
+}
+
+static uint64_t addr_pre_write(RegisterInfo *reg, uint64_t val)
+{
+    /* Address is word aligned */
+    return val & R_ADDR_ADDR_MASK;
+}
+
+static uint64_t size_pre_write(RegisterInfo *reg, uint64_t val)
+{
+    XlnxCSUDMA *s = XLNX_CSU_DMA(reg->opaque);
+
+    if (s->regs[R_SIZE] != 0) {
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "%s: Starting DMA while already running.\n", __func__);
+    }
+
+    if (!s->is_dst) {
+        s->r_size_last_word = !!(val & R_SIZE_LAST_WORD_MASK);
+    }
+
+    /* Size is word aligned */
+    return val & R_SIZE_SIZE_MASK;
+}
+
+static uint64_t size_post_read(RegisterInfo *reg, uint64_t val)
+{
+    XlnxCSUDMA *s = XLNX_CSU_DMA(reg->opaque);
+
+    return val | s->r_size_last_word;
+}
+
+static void size_post_write(RegisterInfo *reg, uint64_t val)
+{
+    XlnxCSUDMA *s = XLNX_CSU_DMA(reg->opaque);
+
+    s->regs[R_STATUS] |= R_STATUS_BUSY_MASK;
+
+    /*
+     * Note that if SIZE is programmed to 0, and the DMA is started,
+     * the interrupts DONE and MEM_DONE will be asserted.
+     */
+    if (s->regs[R_SIZE] == 0) {
+        xlnx_csu_dma_done(s);
+        xlnx_csu_dma_update_irq(s);
+        return;
+    }
+
+    /* Setting SIZE is considered the last step in transfer configuration */
+    if (!s->is_dst) {
+        xlnx_csu_dma_src_notify(s);
+    } else {
+        if (s->notify) {
+            s->notify(s->notify_opaque);
+        }
+    }
+}
+
+static uint64_t status_pre_write(RegisterInfo *reg, uint64_t val)
+{
+    return val & (R_STATUS_DONE_CNT_MASK | R_STATUS_BUSY_MASK);
+}
+
+static uint64_t addr_msb_pre_write(RegisterInfo *reg, uint64_t val)
+{
+    return val & R_ADDR_MSB_ADDR_MSB_MASK;
+}
+
+static void ctrl_post_write(RegisterInfo *reg, uint64_t val)
+{
+    XlnxCSUDMA *s = XLNX_CSU_DMA(reg->opaque);
+
+    if (!s->is_dst) {
+        if (!xlnx_csu_dma_is_paused(s)) {
+            xlnx_csu_dma_src_notify(s);
+        }
+    } else {
+        if (!xlnx_csu_dma_is_paused(s) && s->notify) {
+            s->notify(s->notify_opaque);
+        }
+    }
+}
+
+static uint64_t int_status_pre_write(RegisterInfo *reg, uint64_t val)
+{
+    XlnxCSUDMA *s = XLNX_CSU_DMA(reg->opaque);
+
+    /* DMA counter decrements when flag 'DONE' is cleared */
+    if ((val & s->regs[R_INT_STATUS] & R_INT_STATUS_DONE_MASK)) {
+        xlnx_csu_dma_update_done_cnt(s, -1);
+    }
+
+    return s->regs[R_INT_STATUS] & ~val;
+}
+
+static void int_status_post_write(RegisterInfo *reg, uint64_t val)
+{
+    XlnxCSUDMA *s = XLNX_CSU_DMA(reg->opaque);
+
+    xlnx_csu_dma_update_irq(s);
+}
+
+static uint64_t int_enable_pre_write(RegisterInfo *reg, uint64_t val)
+{
+    XlnxCSUDMA *s = XLNX_CSU_DMA(reg->opaque);
+    uint32_t v32 = val;
+
+    /*
+     * R_INT_ENABLE doesn't have its own state.
+     * It is used to indirectly modify R_INT_MASK.
+     *
+     * 1: Enable this interrupt field (the mask bit will be cleared to 0)
+     * 0: No effect
+     */
+    s->regs[R_INT_MASK] &= ~v32;
+    return 0;
+}
+
+static void int_enable_post_write(RegisterInfo *reg, uint64_t val)
+{
+    XlnxCSUDMA *s = XLNX_CSU_DMA(reg->opaque);
+
+    xlnx_csu_dma_update_irq(s);
+}
+
+static uint64_t int_disable_pre_write(RegisterInfo *reg, uint64_t val)
+{
+    XlnxCSUDMA *s = XLNX_CSU_DMA(reg->opaque);
+    uint32_t v32 = val;
+
+    /*
+     * R_INT_DISABLE doesn't have its own state.
+     * It is used to indirectly modify R_INT_MASK.
+     *
+     * 1: Disable this interrupt field (the mask bit will be set to 1)
+     * 0: No effect
+     */
+    s->regs[R_INT_MASK] |= v32;
+    return 0;
+}
+
+static void int_disable_post_write(RegisterInfo *reg, uint64_t val)
+{
+    XlnxCSUDMA *s = XLNX_CSU_DMA(reg->opaque);
+
+    xlnx_csu_dma_update_irq(s);
+}
+
+static const RegisterAccessInfo *xlnx_csu_dma_regs_info[] = {
+#define DMACH_REGINFO(NAME, snd)                                              \
+    (const RegisterAccessInfo []) {                                           \
+        {                                                                     \
+            .name = #NAME "_ADDR",                                            \
+            .addr = A_ADDR,                                                   \
+            .pre_write = addr_pre_write                                       \
+        }, {                                                                  \
+            .name = #NAME "_SIZE",                                            \
+            .addr = A_SIZE,                                                   \
+            .pre_write = size_pre_write,                                      \
+            .post_write = size_post_write,                                    \
+            .post_read = size_post_read                                       \
+        }, {                                                                  \
+            .name = #NAME "_STATUS",                                          \
+            .addr = A_STATUS,                                                 \
+            .pre_write = status_pre_write,                                    \
+            .ro = (R_STATUS_BUSY_MASK                                         \
+                   | R_STATUS_FIFO_LEVEL_MASK                                 \
+                   | R_STATUS_OUTSTANDING_MASK)                               \
+        }, {                                                                  \
+            .name = #NAME "_CTRL",                                            \
+            .addr = A_CTRL,                                                   \
+            .post_write = ctrl_post_write,                                    \
+            .reset = ((R_CTRL_TIMEOUT_VAL_RESET << R_CTRL_TIMEOUT_VAL_SHIFT)  \
+                      | (R_CTRL_FIFO_THRESH_RESET << R_CTRL_FIFO_THRESH_SHIFT)\
+                      | (snd ? 0 : R_CTRL_FIFOTHRESH_RESET                    \
+                         << R_CTRL_FIFOTHRESH_SHIFT))                         \
+        }, {                                                                  \
+            .name = #NAME "_CRC",                                             \
+            .addr = A_CRC,                                                    \
+        }, {                                                                  \
+            .name = #NAME "_INT_STATUS",                                      \
+            .addr = A_INT_STATUS,                                             \
+            .pre_write = int_status_pre_write,                                \
+            .post_write = int_status_post_write                               \
+        }, {                                                                  \
+            .name = #NAME "_INT_ENABLE",                                      \
+            .addr = A_INT_ENABLE,                                             \
+            .pre_write = int_enable_pre_write,                                \
+            .post_write = int_enable_post_write                               \
+        }, {                                                                  \
+            .name = #NAME "_INT_DISABLE",                                     \
+            .addr = A_INT_DISABLE,                                            \
+            .pre_write = int_disable_pre_write,                               \
+            .post_write = int_disable_post_write                              \
+        }, {                                                                  \
+            .name = #NAME "_INT_MASK",                                        \
+            .addr = A_INT_MASK,                                               \
+            .ro = ~0,                                                         \
+            .reset = XLNX_CSU_DMA_INT_R_MASK                                  \
+        }, {                                                                  \
+            .name = #NAME "_CTRL2",                                           \
+            .addr = A_CTRL2,                                                  \
+            .reset = ((R_CTRL2_TIMEOUT_PRE_RESET                              \
+                       << R_CTRL2_TIMEOUT_PRE_SHIFT)                          \
+                      | (R_CTRL2_MAX_OUTS_CMDS_RESET                          \
+                         << R_CTRL2_MAX_OUTS_CMDS_SHIFT))                     \
+        }, {                                                                  \
+            .name = #NAME "_ADDR_MSB",                                        \
+            .addr = A_ADDR_MSB,                                               \
+            .pre_write = addr_msb_pre_write                                   \
+        }                                                                     \
+    }
+
+    DMACH_REGINFO(DMA_SRC, true),
+    DMACH_REGINFO(DMA_DST, false)
+};
+
+static size_t xlnx_csu_dma_stream_push(StreamSink *obj, uint8_t *buf,
+                                       size_t len, bool eop)
+{
+    XlnxCSUDMA *s = XLNX_CSU_DMA(obj);
+    uint32_t size = s->regs[R_SIZE];
+    uint32_t mlen = MIN(size, len) & (~3); /* Size is word aligned */
+
+    /* Only called when this is the DST channel */
+    assert(s->is_dst);
+
+    if (size == 0 || len == 0) {
+        return 0;
+    }
+
+    if (xlnx_csu_dma_is_paused(s) || mlen == 0) {
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "csu-dma: DST channel dropping %zu bytes.\n", len);
+        s->regs[R_INT_STATUS] |= R_INT_STATUS_FIFO_OVERFLOW_MASK;
+        return len;
+    }
+
+    if (xlnx_csu_dma_write(s, buf, mlen) != mlen) {
+        return 0;
+    }
+
+    xlnx_csu_dma_advance(s, mlen);
+    xlnx_csu_dma_update_irq(s);
+
+    return mlen;
+}
+
+static bool xlnx_csu_dma_stream_can_push(StreamSink *obj,
+                                         StreamCanPushNotifyFn notify,
+                                         void *notify_opaque)
+{
+    XlnxCSUDMA *s = XLNX_CSU_DMA(obj);
+
+    if (s->regs[R_SIZE] != 0) {
+        return true;
+    } else {
+        s->notify = notify;
+        s->notify_opaque = notify_opaque;
+        return false;
+    }
+}
+
+static const MemoryRegionOps xlnx_csu_dma_ops = {
+    .read = register_read_memory,
+    .write = register_write_memory,
+    .endianness = DEVICE_LITTLE_ENDIAN,
+    .valid = {
+        .min_access_size = 4,
+        .max_access_size = 4,
+    }
+};
+
+static void xlnx_csu_dma_reset(DeviceState *dev)
+{
+    XlnxCSUDMA *s = XLNX_CSU_DMA(dev);
+    unsigned int i;
+
+    for (i = 0; i < ARRAY_SIZE(s->regs_info); ++i) {
+        register_reset(&s->regs_info[i]);
+    }
+}
+
+static void xlnx_csu_dma_src_timeout_hit(void *opaque)
+{
+    XlnxCSUDMA *s = XLNX_CSU_DMA(opaque);
+
+    /* Ignore if the timeout is masked */
+    if (!xlnx_csu_dma_timeout_enabled(s)) {
+        return;
+    }
+
+    s->regs[R_INT_STATUS] |= R_INT_STATUS_TIMEOUT_STRM_MASK;
+    xlnx_csu_dma_update_irq(s);
+}
+
+static void xlnx_csu_dma_realize(DeviceState *dev, Error **errp)
+{
+    XlnxCSUDMA *s = XLNX_CSU_DMA(dev);
+    RegisterInfoArray *reg_array;
+
+    reg_array =
+        register_init_block32(dev, xlnx_csu_dma_regs_info[!!s->is_dst],
+                              XLNX_CSU_DMA_R_MAX,
+                              s->regs_info, s->regs,
+                              &xlnx_csu_dma_ops,
+                              XLNX_CSU_DMA_ERR_DEBUG,
+                              XLNX_CSU_DMA_R_MAX * 4);
+    memory_region_add_subregion(&s->iomem,
+                                0x0,
+                                &reg_array->mem);
+
+    sysbus_init_mmio(SYS_BUS_DEVICE(dev), &s->iomem);
+    sysbus_init_irq(SYS_BUS_DEVICE(dev), &s->irq);
+
+    if (!s->is_dst && !s->tx_dev) {
+        error_setg(errp, "zynqmp.csu-dma: Stream not connected");
+        return;
+    }
+
+    s->src_timer = ptimer_init(xlnx_csu_dma_src_timeout_hit,
+                               s, PTIMER_POLICY_DEFAULT);
+
+    if (s->dma_mr) {
+        s->dma_as = g_malloc0(sizeof(AddressSpace));
+        address_space_init(s->dma_as, s->dma_mr, NULL);
+    } else {
+        s->dma_as = &address_space_memory;
+    }
+
+    s->attr = MEMTXATTRS_UNSPECIFIED;
+
+    s->r_size_last_word = 0;
+}
+
+static void xlnx_csu_dma_init(Object *obj)
+{
+    XlnxCSUDMA *s = XLNX_CSU_DMA(obj);
+
+    memory_region_init(&s->iomem, obj, TYPE_XLNX_CSU_DMA,
+                       XLNX_CSU_DMA_R_MAX * 4);
+
+    object_property_add_link(obj, "stream-connected-dma", TYPE_STREAM_SINK,
+                             (Object **)&s->tx_dev,
+                             qdev_prop_allow_set_link_before_realize,
+                             OBJ_PROP_LINK_STRONG);
+    object_property_add_link(obj, "dma", TYPE_MEMORY_REGION,
+                             (Object **)&s->dma_mr,
+                             qdev_prop_allow_set_link_before_realize,
+                             OBJ_PROP_LINK_STRONG);
+}
+
+static const VMStateDescription vmstate_xlnx_csu_dma = {
+    .name = TYPE_XLNX_CSU_DMA,
+    .version_id = 0,
+    .minimum_version_id = 0,
+    .minimum_version_id_old = 0,
+    .fields = (VMStateField[]) {
+        VMSTATE_PTIMER(src_timer, XlnxCSUDMA),
+        VMSTATE_UINT16(width, XlnxCSUDMA),
+        VMSTATE_BOOL(is_dst, XlnxCSUDMA),
+        VMSTATE_BOOL(r_size_last_word, XlnxCSUDMA),
+        VMSTATE_UINT32_ARRAY(regs, XlnxCSUDMA, XLNX_CSU_DMA_R_MAX),
+        VMSTATE_END_OF_LIST(),
+    }
+};
+
+static Property xlnx_csu_dma_properties[] = {
+    /*
+     * The CSU DMA is a two-channel, simple DMA, allowing separate control of
+     * the SRC (read) channel and DST (write) channel. "is-dst" is used to mark
+     * which channel the device is connected to.
+     */
+    DEFINE_PROP_BOOL("is-dst", XlnxCSUDMA, is_dst, true),
+    /*
+     * Ref PG021, Stream Data Width:
+     * Data width in bits of the AXI S2MM AXI4-Stream Data bus.
+     * This value must be equal to or less than the Memory Map Data Width.
+     * Valid values are 8, 16, 32, 64, 128, 256, 512 and 1024.
+     * "dma-width" is the byte value of the "Stream Data Width".
+     */
+    DEFINE_PROP_UINT16("dma-width", XlnxCSUDMA, width, 4),
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+static void xlnx_csu_dma_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+    StreamSinkClass *ssc = STREAM_SINK_CLASS(klass);
+
+    dc->reset = xlnx_csu_dma_reset;
+    dc->realize = xlnx_csu_dma_realize;
+    dc->vmsd = &vmstate_xlnx_csu_dma;
+    device_class_set_props(dc, xlnx_csu_dma_properties);
+
+    ssc->push = xlnx_csu_dma_stream_push;
+    ssc->can_push = xlnx_csu_dma_stream_can_push;
+}
+
+static const TypeInfo xlnx_csu_dma_info = {
+    .name          = TYPE_XLNX_CSU_DMA,
+    .parent        = TYPE_SYS_BUS_DEVICE,
+    .instance_size = sizeof(XlnxCSUDMA),
+    .class_init    = xlnx_csu_dma_class_init,
+    .instance_init = xlnx_csu_dma_init,
+    .interfaces = (InterfaceInfo[]) {
+        { TYPE_STREAM_SINK },
+        { }
+    }
+};
+
+static void xlnx_csu_dma_register_types(void)
+{
+    type_register_static(&xlnx_csu_dma_info);
+}
+
+type_init(xlnx_csu_dma_register_types)
diff --git a/hw/dma/Kconfig b/hw/dma/Kconfig
index 5d6be1a7a7..98fbb1bb04 100644
--- a/hw/dma/Kconfig
+++ b/hw/dma/Kconfig
@@ -26,3 +26,7 @@  config STP2000
 
 config SIFIVE_PDMA
     bool
+
+config XLNX_CSU_DMA
+    bool
+    select REGISTER
diff --git a/hw/dma/meson.build b/hw/dma/meson.build
index 47b4a7cb47..5c78a4e05f 100644
--- a/hw/dma/meson.build
+++ b/hw/dma/meson.build
@@ -14,3 +14,4 @@  softmmu_ss.add(when: 'CONFIG_OMAP', if_true: files('omap_dma.c', 'soc_dma.c'))
 softmmu_ss.add(when: 'CONFIG_PXA2XX', if_true: files('pxa2xx_dma.c'))
 softmmu_ss.add(when: 'CONFIG_RASPI', if_true: files('bcm2835_dma.c'))
 softmmu_ss.add(when: 'CONFIG_SIFIVE_PDMA', if_true: files('sifive_pdma.c'))
+softmmu_ss.add(when: 'CONFIG_XLNX_CSU_DMA', if_true: files('xlnx_csu_dma.c'))