
[v4,00/15] dmaengine/soc: Add Texas Instruments UDMA support

Message ID 20191101084135.14811-1-peter.ujfalusi@ti.com

Peter Ujfalusi Nov. 1, 2019, 8:41 a.m. UTC
Hi,

Changes since v3
(https://patchwork.kernel.org/project/linux-dmaengine/list/?series=180679&state=*):
- Based on 5.4-rc5
- Fixed typos pointed out by Tero
- Added reviewed-by tags from Tero

- ring accelerator driver
 - TODO_GS is removed from the header
 - pm_runtime removed as NAVSS and its components are always on
 - Check validity of Message mode setup (element size > 8 bytes must use proxy)

- cppi5 header
 - add commit message

- UDMAP DT bindings
 - Drop the use of the psil-config node on the remote PSI-L side and use only
   one cell, which is the remote thread ID:

     dmas = <&main_udmap 0xc400>, <&main_udmap 0x4400>;
     dma-names = "tx", "rx";

 - The PSI-L thread configuration description is moved to the kernel as a new
   module: k3-psil/k3-psil-am654/k3-psil-j721e
 - ti,psil-base has been removed and moved to the kernel
 - removed the no longer needed dt-bindings/dma/k3-udma.h
 - Convert the document to schema (yaml)

- NEW PSI-L endpoint configuration database
 - a simple database holding the remote end's configuration needed for UDMAP
   configuration. All previous parameters from DT have been moved here and
   merged with the Linux-only TR mode channel flag.
 - Client drivers can update the remote endpoint configuration as it can be
   different based on system configuration and the endpoint itself is under the
   control of the peripheral driver.
 - database for am654 and j721e

- UDMAP DMAengine driver
 - pm_runtime removed as NAVSS and its components are always on
 - rchan_oes_offset added to MSI domain allocation
 - Use the new PSI-L endpoint database for UDMAP configuration
 - Support for waiting for PDMA teardown completion on j721e instead of
   returning right away. Depends on:
   https://lkml.org/lkml/2019/10/25/189
   Not included in this series, but it is in the branch I have prepared.
 - psil-base is moved from DT to be part of udma_match_data
 - tr_thread map is removed; the PSI-L endpoint configuration is used instead

- UDMAP glue layer
 - pm_runtime removed as NAVSS and its components are always on
 - Use the new PSI-L endpoint database for UDMAP configuration

Changes since v2
(https://patchwork.kernel.org/project/linux-dmaengine/list/?series=152609&state=*)
- Based on 5.4-rc1
- Support for Flow only data transfer for the glue layer

- cppi5 header
 - comments converted to kernel-doc style
 - Remove the excessive WARN_ONs and rely on the user for sanity
 - new macro for checking TearDown Completion Message

- ring accelerator driver
 - fixed up the commit message (SoB, TI-SCI)
 - fixed ring reset
 - CONFIG_TI_K3_RINGACC_DEBUG is removed along with the dbg_write/read functions
   and use dev_dbg()
 - k3_ringacc_ring_dump() is moved to static
 - step numbering removed from k3_ringacc_ring_reset_dma()
 - Add clarification comment for shared ring usage in k3_ringacc_ring_cfg()
 - Magic shift values in k3_ringacc_ring_cfg_proxy() got defined
 - K3_RINGACC_RING_MODE_QM is removed as it is not supported

- UDMAP DT bindings
 - Fix property prefixing: s/pdma,/ti,pdma-
 - Add ti,notdpkt property to suppress teardown completion message on tchan
 - example updated accordingly

- UDMAP DMAengine driver
 - Change __raw_readl/writel to readl/writel
 - Split up the udma_tisci_channel_config() into m2m, tx and rx tisci
   configuration functions for clarity
 - DT bindings change: s/pdma,/ti,pdma-
 - Cleanup of udma_tx_status():
  - residue calculation fix for m2m
  - no need to read packet counter as it is not used
  - peer byte counter only available in PDMAs
  - Proper locking to avoid race with interrupt handler (polled m2m fix)
 - Support for ti,notdpkt
 - RFLOW management rework to support data movement without channel:
  - the channel is not controlled by Linux but by another core; we only have
    rflows and rings to do the DMA transfers.
    This mode is only supported by the Glue layer for now.

- UDMAP glue layer
 - Debug print improvements
 - Support for rflow/ring only data movement

Changes since v1
(https://patchwork.kernel.org/project/linux-dmaengine/list/?series=114105&state=*)
- Added support for j721e
- Based on 5.3-rc2
- dropped ti_sci API patch for RM management as it is already upstream
- dropped dmadev_get_slave_channel() patch, using __dma_request_channel()
- Added Rob's Reviewed-by to ringacc DT binding document patch
- DT bindings changes:
 - linux,udma-mode is gone, I have a simple lookup table in the driver to flag
   TR channels.
 - Support for j721e
- Fix bug in of_node_put() handling in xlate function

Changes since RFC (https://patchwork.kernel.org/cover/10612465/):
- Based on linux-next (20190506) which now has the ti_sci interrupt support
- With the series applied, the UDMA is functional via the DMAengine API
- Included in the series: ti_sci Resource management API, cppi5 header and
  driver for the ring accelerator.
- The DMAengine core patches have been updated as per the review comments on
  the earlier submission.
- The DMAengine driver patch is artificially split up into 6 smaller patches

The k3-udma driver implements the Data Movement Architecture described in
AM65x TRM (http://www.ti.com/lit/pdf/spruid7) and
j721e TRM (http://www.ti.com/lit/pdf/spruil1)

This DMA architecture is a big departure from the 'traditional' architecture
where we had either EDMA or sDMA as the system DMA.

Packet DMAs were used as dedicated DMAs to service only networking (Keystone2)
or USB (am335x) while other peripherals were serviced by EDMA.

In AM65x/j721e the UDMA (Unified DMA) is used for all data movement within the
SoC, tasked to service all peripherals (UART, McSPI, McASP, networking, etc).

The NAVSS/UDMA is built around CPPI5 (Communications Port Programming Interface)
and it supports Packet mode (similar to CPPI4.1 in Keystone2 for networking) and
TR mode (similar to EDMA descriptor).
The data movement is done within a PSI-L fabric; peripherals (including the
UDMA-P) are not addressed by their I/O registers as with traditional DMAs, but
by their PSI-L thread ID.

In AM65x/j721e we have two main types of peripherals:
Legacy: McASP, McSPI, UART, etc.
 to provide connectivity they are serviced by PDMA (Peripheral DMA)
 PDMA threads are locked to service a given peripheral, for example PSI-L thread
 0x4400/0xc400 is to service McASP0 rx/tx.
 The PDMA configuration can be done via the UDMA Real Time Peer registers.
Native: Networking, security accelerator
 these peripherals have native support for PSI-L.

To be able to use the DMA the following generic steps need to be taken:
- configure a DMA channel (tchan for TX, rchan for RX)
 - channel mode: Packet or TR mode
 - for memcpy a tchan and rchan pair is used.
 - for packet mode RX we also need to configure a receive flow to configure the
   packet reception
- the source and destination threads must be paired
- at minimum one pair of rings needs to be configured:
 - tx: transfer ring and transfer completion ring
 - rx: free descriptor ring and receive ring
- two interrupts: UDMA-P channel interrupt and ring interrupt for tc_ring/r_ring
 - If the channel is in packet mode or configured for memcpy then we only need
   one interrupt from the ring; events from UDMAP are not used.

When the channel setup is completed we only interact with the rings:
- TX: push a descriptor to t_ring and wait for it to be pushed to the tc_ring by
  the UDMA-P
- RX: push a descriptor to the fd_ring and wait for UDMA-P to push it back to
  the r_ring.

Since we have FIFOs in the DMA fabric (UDMA-P, PSI-L and PDMA), which was not
the case with previous DMAs, we need to report the amount of data held in
these FIFOs to clients (delay calculation for ALSA, UART FIFO flush support).

Metadata support:
A DMAengine user driver was posted upstream, based on and tested against v1 of
the UDMA series: https://lkml.org/lkml/2019/6/28/20
SA2UL is using the metadata DMAengine API.

Note on the last patch:
In Keystone2 the networking had a dedicated DMA (packet DMA), which is no
longer the case, and the DMAengine API is currently missing support for the
features we would need to support networking, things like:
- support for receive descriptor 'classification'
 - we need to support several receive queues for a channel.
 - the queues are used for packet priority handling for example, but they can
   also be used to have pools of descriptors of different sizes.
- out of order completion of descriptors on a channel
 - when we have several queues to handle different priority packets the
   descriptors will be completed 'out-of-order'
- NAPI type of operation (polling instead of interrupt driven transfer)
 - without this we cannot sustain gigabit speeds and we need to support NAPI
 - this is not limited to networking; other high performance operations can
   benefit as well

It is my intention to work on these so that the 'glue' layer can be removed
and we can switch to the DMAengine API - or have an API alongside DMAengine as
a generic way to support networking. But given how controversial and
non-trivial these changes are, we need something to support networking now.

The series (+DT patches to enable DMA on AM65x and j721e) on top of 5.4-rc5 is
available:
https://github.com/omap-audio/linux-audio.git peter/udma/series_v4-5.4-rc5

Regards,
Peter
---
Grygorii Strashko (3):
  bindings: soc: ti: add documentation for k3 ringacc
  soc: ti: k3: add navss ringacc driver
  dmaengine: ti: k3-udma: Add glue layer for non DMAengine users

Peter Ujfalusi (12):
  dmaengine: doc: Add sections for per descriptor metadata support
  dmaengine: Add metadata_ops for dma_async_tx_descriptor
  dmaengine: Add support for reporting DMA cached data amount
  dmaengine: ti: Add cppi5 header for K3 NAVSS/UDMA
  dmaengine: ti: k3 PSI-L remote endpoint configuration
  dt-bindings: dma: ti: Add document for K3 UDMA
  dmaengine: ti: New driver for K3 UDMA - split#1: defines, structs, io
    func
  dmaengine: ti: New driver for K3 UDMA - split#2: probe/remove, xlate
    and filter_fn
  dmaengine: ti: New driver for K3 UDMA - split#3: alloc/free
    chan_resources
  dmaengine: ti: New driver for K3 UDMA - split#4: dma_device callbacks
    1
  dmaengine: ti: New driver for K3 UDMA - split#5: dma_device callbacks
    2
  dmaengine: ti: New driver for K3 UDMA - split#6: Kconfig and Makefile

 .../devicetree/bindings/dma/ti/k3-udma.yaml   |  190 +
 .../devicetree/bindings/soc/ti/k3-ringacc.txt |   59 +
 Documentation/driver-api/dmaengine/client.rst |   75 +
 .../driver-api/dmaengine/provider.rst         |   46 +
 drivers/dma/dmaengine.c                       |   73 +
 drivers/dma/dmaengine.h                       |    8 +
 drivers/dma/ti/Kconfig                        |   26 +
 drivers/dma/ti/Makefile                       |    3 +
 drivers/dma/ti/k3-psil-am654.c                |  172 +
 drivers/dma/ti/k3-psil-j721e.c                |  219 ++
 drivers/dma/ti/k3-psil-priv.h                 |   39 +
 drivers/dma/ti/k3-psil.c                      |   97 +
 drivers/dma/ti/k3-udma-glue.c                 | 1202 ++++++
 drivers/dma/ti/k3-udma-private.c              |  133 +
 drivers/dma/ti/k3-udma.c                      | 3425 +++++++++++++++++
 drivers/dma/ti/k3-udma.h                      |  151 +
 drivers/soc/ti/Kconfig                        |   12 +
 drivers/soc/ti/Makefile                       |    1 +
 drivers/soc/ti/k3-ringacc.c                   | 1158 ++++++
 include/linux/dma/k3-psil.h                   |   47 +
 include/linux/dma/k3-udma-glue.h              |  134 +
 include/linux/dma/ti-cppi5.h                  | 1049 +++++
 include/linux/dmaengine.h                     |  110 +
 include/linux/soc/ti/k3-ringacc.h             |  244 ++
 24 files changed, 8673 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/dma/ti/k3-udma.yaml
 create mode 100644 Documentation/devicetree/bindings/soc/ti/k3-ringacc.txt
 create mode 100644 drivers/dma/ti/k3-psil-am654.c
 create mode 100644 drivers/dma/ti/k3-psil-j721e.c
 create mode 100644 drivers/dma/ti/k3-psil-priv.h
 create mode 100644 drivers/dma/ti/k3-psil.c
 create mode 100644 drivers/dma/ti/k3-udma-glue.c
 create mode 100644 drivers/dma/ti/k3-udma-private.c
 create mode 100644 drivers/dma/ti/k3-udma.c
 create mode 100644 drivers/dma/ti/k3-udma.h
 create mode 100644 drivers/soc/ti/k3-ringacc.c
 create mode 100644 include/linux/dma/k3-psil.h
 create mode 100644 include/linux/dma/k3-udma-glue.h
 create mode 100644 include/linux/dma/ti-cppi5.h
 create mode 100644 include/linux/soc/ti/k3-ringacc.h

Comments

Tero Kristo Nov. 5, 2019, 7:40 a.m. UTC | #1
On 01/11/2019 10:41, Peter Ujfalusi wrote:
> The K3 DMA architecture uses CPPI5 (Communications Port Programming
> Interface) specified descriptors over PSI-L bus within NAVSS.
> 
> The header provides helpers, macros to work with these descriptors in a
> consistent way.
> 
> Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>

Looks fine to me:

Reviewed-by: Tero Kristo <t-kristo@ti.com>

> ---
>   include/linux/dma/ti-cppi5.h | 1049 ++++++++++++++++++++++++++++++++++
>   1 file changed, 1049 insertions(+)
>   create mode 100644 include/linux/dma/ti-cppi5.h
> 
> diff --git a/include/linux/dma/ti-cppi5.h b/include/linux/dma/ti-cppi5.h
> new file mode 100644
> index 000000000000..f795f8cb7cc5
> --- /dev/null
> +++ b/include/linux/dma/ti-cppi5.h
> @@ -0,0 +1,1049 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * CPPI5 descriptors interface
> + *
> + * Copyright (C) 2019 Texas Instruments Incorporated - http://www.ti.com
> + */
> +
> +#ifndef __TI_CPPI5_H__
> +#define __TI_CPPI5_H__
> +
> +#include <linux/bitops.h>
> +#include <linux/printk.h>
> +#include <linux/bug.h>
> +
> +/**
> + * struct cppi5_desc_hdr_t - Descriptor header, present in all types of
> + *			     descriptors
> + * @pkt_info0:		Packet info word 0 (n/a in Buffer desc)
> + * @pkt_info1:		Packet info word 1 (n/a in Buffer desc)
> + * @pkt_info2:		Packet info word 2 (n/a in Buffer desc)
> + * @src_dst_tag:	Packet info word 3 (n/a in Buffer desc)
> + */
> +struct cppi5_desc_hdr_t {
> +	u32 pkt_info0;
> +	u32 pkt_info1;
> +	u32 pkt_info2;
> +	u32 src_dst_tag;
> +} __packed;
> +
> +/**
> + * struct cppi5_host_desc_t - Host-mode packet and buffer descriptor definition
> + * @hdr:		Descriptor header
> + * @next_desc:		word 4/5: Linking word
> + * @buf_ptr:		word 6/7: Buffer pointer
> + * @buf_info1:		word 8: Buffer valid data length
> + * @org_buf_len:	word 9: Original buffer length
> + * @org_buf_ptr:	word 10/11: Original buffer pointer
> + * @epib[0]:		Extended Packet Info Data (optional, 4 words), and/or
> + *			Protocol Specific Data (optional, 0-128 bytes in
> + *			multiples of 4), and/or
> + *			Other Software Data (0-N bytes, optional)
> + */
> +struct cppi5_host_desc_t {
> +	struct cppi5_desc_hdr_t hdr;
> +	u64 next_desc;
> +	u64 buf_ptr;
> +	u32 buf_info1;
> +	u32 org_buf_len;
> +	u64 org_buf_ptr;
> +	u32 epib[0];
> +} __packed;
> +
> +#define CPPI5_DESC_MIN_ALIGN			(16U)
> +
> +#define CPPI5_INFO0_HDESC_EPIB_SIZE		(16U)
> +#define CPPI5_INFO0_HDESC_PSDATA_MAX_SIZE	(128U)
> +
> +#define CPPI5_INFO0_HDESC_TYPE_SHIFT		(30U)
> +#define CPPI5_INFO0_HDESC_TYPE_MASK		GENMASK(31, 30)
> +#define   CPPI5_INFO0_DESC_TYPE_VAL_HOST	(1U)
> +#define   CPPI5_INFO0_DESC_TYPE_VAL_MONO	(2U)
> +#define   CPPI5_INFO0_DESC_TYPE_VAL_TR		(3U)
> +#define CPPI5_INFO0_HDESC_EPIB_PRESENT		BIT(29)
> +/*
> + * Protocol Specific Words location:
> + * 0 = located in the descriptor,
> + * 1 = located in the SOP Buffer immediately prior to the data.
> + */
> +#define CPPI5_INFO0_HDESC_PSINFO_LOCATION	BIT(28)
> +#define CPPI5_INFO0_HDESC_PSINFO_SIZE_SHIFT	(22U)
> +#define CPPI5_INFO0_HDESC_PSINFO_SIZE_MASK	GENMASK(27, 22)
> +#define CPPI5_INFO0_HDESC_PKTLEN_SHIFT		(0)
> +#define CPPI5_INFO0_HDESC_PKTLEN_MASK		GENMASK(21, 0)
> +
> +#define CPPI5_INFO1_DESC_PKTERROR_SHIFT		(28U)
> +#define CPPI5_INFO1_DESC_PKTERROR_MASK		GENMASK(31, 28)
> +#define CPPI5_INFO1_HDESC_PSFLGS_SHIFT		(24U)
> +#define CPPI5_INFO1_HDESC_PSFLGS_MASK		GENMASK(27, 24)
> +#define CPPI5_INFO1_DESC_PKTID_SHIFT		(14U)
> +#define CPPI5_INFO1_DESC_PKTID_MASK		GENMASK(23, 14)
> +#define CPPI5_INFO1_DESC_FLOWID_SHIFT		(0)
> +#define CPPI5_INFO1_DESC_FLOWID_MASK		GENMASK(13, 0)
> +
> +#define CPPI5_INFO2_HDESC_PKTTYPE_SHIFT		(27U)
> +#define CPPI5_INFO2_HDESC_PKTTYPE_MASK		GENMASK(31, 27)
> +/* Return Policy: 0 = Entire packet, 1 = Each buffer */
> +#define CPPI5_INFO2_HDESC_RETPOLICY		BIT(18)
> +/*
> + * Early Return:
> + * 0 = desc pointers should be returned after all reads have been completed
> + * 1 = desc pointers should be returned immediately upon fetching
> + * the descriptor and beginning to transfer data.
> + */
> +#define CPPI5_INFO2_HDESC_EARLYRET		BIT(17)
> +/*
> + * Return Push Policy:
> + * 0 = Descriptor must be returned to tail of queue
> + * 1 = Descriptor must be returned to head of queue
> + */
> +#define CPPI5_INFO2_DESC_RETPUSHPOLICY		BIT(16)
> +#define CPPI5_INFO2_DESC_RETQ_SHIFT		(0)
> +#define CPPI5_INFO2_DESC_RETQ_MASK		GENMASK(15, 0)
> +
> +#define CPPI5_INFO3_DESC_SRCTAG_SHIFT		(16U)
> +#define CPPI5_INFO3_DESC_SRCTAG_MASK		GENMASK(31, 16)
> +#define CPPI5_INFO3_DESC_DSTTAG_SHIFT		(0)
> +#define CPPI5_INFO3_DESC_DSTTAG_MASK		GENMASK(15, 0)
> +
> +#define CPPI5_BUFINFO1_HDESC_DATA_LEN_SHIFT	(0)
> +#define CPPI5_BUFINFO1_HDESC_DATA_LEN_MASK	GENMASK(27, 0)
> +
> +#define CPPI5_OBUFINFO0_HDESC_BUF_LEN_SHIFT	(0)
> +#define CPPI5_OBUFINFO0_HDESC_BUF_LEN_MASK	GENMASK(27, 0)
> +
> +/**
> + * struct cppi5_desc_epib_t - Host Packet Descriptor Extended Packet Info Block
> + * @timestamp:		word 0: application specific timestamp
> + * @sw_info0:		word 1: Software Info 0
> + * @sw_info1:		word 2: Software Info 1
> + * @sw_info2:		word 3: Software Info 2
> + */
> +struct cppi5_desc_epib_t {
> +	u32 timestamp;	/* w0: application specific timestamp */
> +	u32 sw_info0;	/* w1: Software Info 0 */
> +	u32 sw_info1;	/* w2: Software Info 1 */
> +	u32 sw_info2;	/* w3: Software Info 2 */
> +};
> +
> +/**
> + * struct cppi5_monolithic_desc_t - Monolithic-mode packet descriptor
> + * @hdr:		Descriptor header
> + * @epib[0]:		Extended Packet Info Data (optional, 4 words), and/or
> + *			Protocol Specific Data (optional, 0-128 bytes in
> + *			multiples of 4), and/or
> + *			Other Software Data (0-N bytes, optional)
> + */
> +struct cppi5_monolithic_desc_t {
> +	struct cppi5_desc_hdr_t hdr;
> +	u32 epib[0];
> +};
> +
> +#define CPPI5_INFO2_MDESC_DATA_OFFSET_SHIFT	(18U)
> +#define CPPI5_INFO2_MDESC_DATA_OFFSET_MASK	GENMASK(26, 18)
> +
> +/*
> + * Reload Count:
> + * 0 = Finish the packet and place the descriptor back on the return queue
> + * 1-0x1ff = Vector to the Reload Index and resume processing
> + * 0x1ff indicates perpetual loop, infinite reload until the channel is stopped
> + */
> +#define CPPI5_INFO0_TRDESC_RLDCNT_SHIFT		(20U)
> +#define CPPI5_INFO0_TRDESC_RLDCNT_MASK		GENMASK(28, 20)
> +#define CPPI5_INFO0_TRDESC_RLDCNT_MAX		(0x1ff)
> +#define CPPI5_INFO0_TRDESC_RLDCNT_INFINITE	CPPI5_INFO0_TRDESC_RLDCNT_MAX
> +#define CPPI5_INFO0_TRDESC_RLDIDX_SHIFT		(14U)
> +#define CPPI5_INFO0_TRDESC_RLDIDX_MASK		GENMASK(19, 14)
> +#define CPPI5_INFO0_TRDESC_RLDIDX_MAX		(0x3f)
> +#define CPPI5_INFO0_TRDESC_LASTIDX_SHIFT	(0)
> +#define CPPI5_INFO0_TRDESC_LASTIDX_MASK		GENMASK(13, 0)
> +
> +#define CPPI5_INFO1_TRDESC_RECSIZE_SHIFT	(24U)
> +#define CPPI5_INFO1_TRDESC_RECSIZE_MASK		GENMASK(26, 24)
> +#define   CPPI5_INFO1_TRDESC_RECSIZE_VAL_16B	(0)
> +#define   CPPI5_INFO1_TRDESC_RECSIZE_VAL_32B	(1U)
> +#define   CPPI5_INFO1_TRDESC_RECSIZE_VAL_64B	(2U)
> +#define   CPPI5_INFO1_TRDESC_RECSIZE_VAL_128B	(3U)
> +
> +static inline void cppi5_desc_dump(void *desc, u32 size)
> +{
> +	print_hex_dump(KERN_ERR, "dump udmap_desc: ", DUMP_PREFIX_NONE,
> +		       32, 4, desc, size, false);
> +}
> +
> +#define CPPI5_TDCM_MARKER			(0x1)
> +/**
> + * cppi5_desc_is_tdcm - check if the paddr indicates Teardown Complete Message
> + * @paddr: Physical address of the packet popped from the ring
> + *
> + * Returns true if the address indicates TDCM
> + */
> +static inline bool cppi5_desc_is_tdcm(dma_addr_t paddr)
> +{
> +	return (paddr & CPPI5_TDCM_MARKER) ? true : false;
> +}
> +
> +/**
> + * cppi5_desc_get_type - get descriptor type
> + * @desc_hdr: packet descriptor/TR header
> + *
> + * Returns descriptor type:
> + * CPPI5_INFO0_DESC_TYPE_VAL_HOST
> + * CPPI5_INFO0_DESC_TYPE_VAL_MONO
> + * CPPI5_INFO0_DESC_TYPE_VAL_TR
> + */
> +static inline u32 cppi5_desc_get_type(struct cppi5_desc_hdr_t *desc_hdr)
> +{
> +	return (desc_hdr->pkt_info0 & CPPI5_INFO0_HDESC_TYPE_MASK) >>
> +		CPPI5_INFO0_HDESC_TYPE_SHIFT;
> +}
> +
> +/**
> + * cppi5_desc_get_errflags - get Error Flags from Desc
> + * @desc_hdr: packet/TR descriptor header
> + *
> + * Returns Error Flags from Packet/TR Descriptor
> + */
> +static inline u32 cppi5_desc_get_errflags(struct cppi5_desc_hdr_t *desc_hdr)
> +{
> +	return (desc_hdr->pkt_info1 & CPPI5_INFO1_DESC_PKTERROR_MASK) >>
> +		CPPI5_INFO1_DESC_PKTERROR_SHIFT;
> +}
> +
> +/**
> + * cppi5_desc_get_pktids - get Packet and Flow ids from Desc
> + * @desc_hdr: packet/TR descriptor header
> + * @pkt_id: Packet ID
> + * @flow_id: Flow ID
> + *
> + * Returns Packet and Flow ids from packet/TR descriptor
> + */
> +static inline void cppi5_desc_get_pktids(struct cppi5_desc_hdr_t *desc_hdr,
> +					 u32 *pkt_id, u32 *flow_id)
> +{
> +	*pkt_id = (desc_hdr->pkt_info1 & CPPI5_INFO1_DESC_PKTID_MASK) >>
> +		   CPPI5_INFO1_DESC_PKTID_SHIFT;
> +	*flow_id = (desc_hdr->pkt_info1 & CPPI5_INFO1_DESC_FLOWID_MASK) >>
> +		    CPPI5_INFO1_DESC_FLOWID_SHIFT;
> +}
> +
> +/**
> + * cppi5_desc_set_pktids - set Packet and Flow ids in Desc
> + * @desc_hdr: packet/TR descriptor header
> + * @pkt_id: Packet ID
> + * @flow_id: Flow ID
> + */
> +static inline void cppi5_desc_set_pktids(struct cppi5_desc_hdr_t *desc_hdr,
> +					 u32 pkt_id, u32 flow_id)
> +{
> +	desc_hdr->pkt_info1 |= (pkt_id << CPPI5_INFO1_DESC_PKTID_SHIFT) &
> +				CPPI5_INFO1_DESC_PKTID_MASK;
> +	desc_hdr->pkt_info1 |= (flow_id << CPPI5_INFO1_DESC_FLOWID_SHIFT) &
> +				CPPI5_INFO1_DESC_FLOWID_MASK;
> +}
> +
> +/**
> + * cppi5_desc_set_retpolicy - set Packet Return Policy in Desc
> + * @desc_hdr: packet/TR descriptor header
> + * @flags: flags, supported values
> + *  CPPI5_INFO2_HDESC_RETPOLICY
> + *  CPPI5_INFO2_HDESC_EARLYRET
> + *  CPPI5_INFO2_DESC_RETPUSHPOLICY
> + * @return_ring_id: Packet Return Queue/Ring id, value 0xFFFF reserved
> + */
> +static inline void cppi5_desc_set_retpolicy(struct cppi5_desc_hdr_t *desc_hdr,
> +					    u32 flags, u32 return_ring_id)
> +{
> +	desc_hdr->pkt_info2 |= flags;
> +	desc_hdr->pkt_info2 |= return_ring_id & CPPI5_INFO2_DESC_RETQ_MASK;
> +}
> +
> +/**
> + * cppi5_desc_get_tags_ids - get Packet Src/Dst Tags from Desc
> + * @desc_hdr: packet/TR descriptor header
> + * @src_tag_id: Source Tag
> + * @dst_tag_id: Dest Tag
> + *
> + * Returns Packet Src/Dst Tags from packet/TR descriptor
> + */
> +static inline void cppi5_desc_get_tags_ids(struct cppi5_desc_hdr_t *desc_hdr,
> +					   u32 *src_tag_id, u32 *dst_tag_id)
> +{
> +	if (src_tag_id)
> +		*src_tag_id = (desc_hdr->src_dst_tag &
> +			      CPPI5_INFO3_DESC_SRCTAG_MASK) >>
> +			      CPPI5_INFO3_DESC_SRCTAG_SHIFT;
> +	if (dst_tag_id)
> +		*dst_tag_id = desc_hdr->src_dst_tag &
> +			      CPPI5_INFO3_DESC_DSTTAG_MASK;
> +}
> +
> +/**
> + * cppi5_desc_set_tags_ids - set Packet Src/Dst Tags in HDesc
> + * @desc_hdr: packet/TR descriptor header
> + * @src_tag_id: Source Tag
> + * @dst_tag_id: Dest Tag
> + *
> + * Sets Packet Src/Dst Tags in packet/TR descriptor
> + */
> +static inline void cppi5_desc_set_tags_ids(struct cppi5_desc_hdr_t *desc_hdr,
> +					   u32 src_tag_id, u32 dst_tag_id)
> +{
> +	desc_hdr->src_dst_tag = (src_tag_id << CPPI5_INFO3_DESC_SRCTAG_SHIFT) &
> +				CPPI5_INFO3_DESC_SRCTAG_MASK;
> +	desc_hdr->src_dst_tag |= dst_tag_id & CPPI5_INFO3_DESC_DSTTAG_MASK;
> +}
> +
> +/**
> + * cppi5_hdesc_calc_size - Calculate Host Packet Descriptor size
> + * @epib: is EPIB present
> + * @psdata_size: PSDATA size
> + * @sw_data_size: SWDATA size
> + *
> + * Returns required Host Packet Descriptor size
> + * 0 - if PSDATA > CPPI5_INFO0_HDESC_PSDATA_MAX_SIZE
> + */
> +static inline u32 cppi5_hdesc_calc_size(bool epib, u32 psdata_size,
> +					u32 sw_data_size)
> +{
> +	u32 desc_size;
> +
> +	if (psdata_size > CPPI5_INFO0_HDESC_PSDATA_MAX_SIZE)
> +		return 0;
> +
> +	desc_size = sizeof(struct cppi5_host_desc_t) + psdata_size +
> +		    sw_data_size;
> +
> +	if (epib)
> +		desc_size += CPPI5_INFO0_HDESC_EPIB_SIZE;
> +
> +	return ALIGN(desc_size, CPPI5_DESC_MIN_ALIGN);
> +}
> +
> +/**
> + * cppi5_hdesc_init - Init Host Packet Descriptor
> + * @desc: Host packet descriptor
> + * @flags: supported values
> + *	CPPI5_INFO0_HDESC_EPIB_PRESENT
> + *	CPPI5_INFO0_HDESC_PSINFO_LOCATION
> + * @psdata_size: PSDATA size
> + */
> +static inline void cppi5_hdesc_init(struct cppi5_host_desc_t *desc, u32 flags,
> +				    u32 psdata_size)
> +{
> +	desc->hdr.pkt_info0 = (CPPI5_INFO0_DESC_TYPE_VAL_HOST <<
> +			       CPPI5_INFO0_HDESC_TYPE_SHIFT) | (flags);
> +	desc->hdr.pkt_info0 |= ((psdata_size >> 2) <<
> +				CPPI5_INFO0_HDESC_PSINFO_SIZE_SHIFT) &
> +				CPPI5_INFO0_HDESC_PSINFO_SIZE_MASK;
> +	desc->next_desc = 0;
> +}
> +
> +/**
> + * cppi5_hdesc_update_flags - Replace descriptor flags
> + * @desc: Host packet descriptor
> + * @flags: supported values
> + *	CPPI5_INFO0_HDESC_EPIB_PRESENT
> + *	CPPI5_INFO0_HDESC_PSINFO_LOCATION
> + */
> +static inline void cppi5_hdesc_update_flags(struct cppi5_host_desc_t *desc,
> +					    u32 flags)
> +{
> +	desc->hdr.pkt_info0 &= ~(CPPI5_INFO0_HDESC_EPIB_PRESENT |
> +				 CPPI5_INFO0_HDESC_PSINFO_LOCATION);
> +	desc->hdr.pkt_info0 |= flags;
> +}
> +
> +/**
> + * cppi5_hdesc_update_psdata_size - Replace PSdata size
> + * @desc: Host packet descriptor
> + * @psdata_size: PSDATA size
> + */
> +static inline void cppi5_hdesc_update_psdata_size(
> +		struct cppi5_host_desc_t *desc, u32 psdata_size)
> +{
> +	desc->hdr.pkt_info0 &= ~CPPI5_INFO0_HDESC_PSINFO_SIZE_MASK;
> +	desc->hdr.pkt_info0 |= ((psdata_size >> 2) <<
> +				CPPI5_INFO0_HDESC_PSINFO_SIZE_SHIFT) &
> +				CPPI5_INFO0_HDESC_PSINFO_SIZE_MASK;
> +}
> +
> +/**
> + * cppi5_hdesc_get_psdata_size - get PSdata size in bytes
> + * @desc: Host packet descriptor
> + */
> +static inline u32 cppi5_hdesc_get_psdata_size(struct cppi5_host_desc_t *desc)
> +{
> +	u32 psdata_size = 0;
> +
> +	if (!(desc->hdr.pkt_info0 & CPPI5_INFO0_HDESC_PSINFO_LOCATION))
> +		psdata_size = (desc->hdr.pkt_info0 &
> +			       CPPI5_INFO0_HDESC_PSINFO_SIZE_MASK) >>
> +			       CPPI5_INFO0_HDESC_PSINFO_SIZE_SHIFT;
> +
> +	return (psdata_size << 2);
> +}
> +
> +/**
> + * cppi5_hdesc_get_pktlen - get Packet Length from HDesc
> + * @desc: Host packet descriptor
> + *
> + * Returns Packet Length from Host Packet Descriptor
> + */
> +static inline u32 cppi5_hdesc_get_pktlen(struct cppi5_host_desc_t *desc)
> +{
> +	return (desc->hdr.pkt_info0 & CPPI5_INFO0_HDESC_PKTLEN_MASK);
> +}
> +
> +/**
> + * cppi5_hdesc_set_pktlen - set Packet Length in HDesc
> + * @desc: Host packet descriptor
> + * @pkt_len: Packet Length
> + */
> +static inline void cppi5_hdesc_set_pktlen(struct cppi5_host_desc_t *desc,
> +					  u32 pkt_len)
> +{
> +	desc->hdr.pkt_info0 |= (pkt_len & CPPI5_INFO0_HDESC_PKTLEN_MASK);
> +}
> +
> +/**
> + * cppi5_hdesc_get_psflags - get Protocol Specific Flags from HDesc
> + * @desc: Host packet descriptor
> + *
> + * Returns Protocol Specific Flags from Host Packet Descriptor
> + */
> +static inline u32 cppi5_hdesc_get_psflags(struct cppi5_host_desc_t *desc)
> +{
> +	return (desc->hdr.pkt_info1 & CPPI5_INFO1_HDESC_PSFLGS_MASK) >>
> +		CPPI5_INFO1_HDESC_PSFLGS_SHIFT;
> +}
> +
> +/**
> + * cppi5_hdesc_set_psflags - set Protocol Specific Flags in HDesc
> + * @desc: Host packet descriptor
> + * @ps_flags: Protocol Specific Flags
> + */
> +static inline void cppi5_hdesc_set_psflags(struct cppi5_host_desc_t *desc,
> +					   u32 ps_flags)
> +{
> +	desc->hdr.pkt_info1 |= (ps_flags <<
> +				CPPI5_INFO1_HDESC_PSFLGS_SHIFT) &
> +				CPPI5_INFO1_HDESC_PSFLGS_MASK;
> +}
> +
> +/**
> + * cppi5_hdesc_get_pkttype - get Packet Type from HDesc
> + * @desc: Host packet descriptor
> + */
> +static inline u32 cppi5_hdesc_get_pkttype(struct cppi5_host_desc_t *desc)
> +{
> +	return (desc->hdr.pkt_info2 & CPPI5_INFO2_HDESC_PKTTYPE_MASK) >>
> +		CPPI5_INFO2_HDESC_PKTTYPE_SHIFT;
> +}
> +
> +/**
> + * cppi5_hdesc_set_pkttype - set Packet Type in HDesc
> + * @desc: Host packet descriptor
> + * @pkt_type: Packet Type
> + */
> +static inline void cppi5_hdesc_set_pkttype(struct cppi5_host_desc_t *desc,
> +					   u32 pkt_type)
> +{
> +	desc->hdr.pkt_info2 |=
> +			(pkt_type << CPPI5_INFO2_HDESC_PKTTYPE_SHIFT) &
> +			 CPPI5_INFO2_HDESC_PKTTYPE_MASK;
> +}
> +
> +/**
> + * cppi5_hdesc_attach_buf - attach buffer to HDesc
> + * @desc: Host packet descriptor
> + * @buf: Buffer physical address
> + * @buf_data_len: Buffer length
> + * @obuf: Original Buffer physical address
> + * @obuf_len: Original Buffer length
> + *
> + * Attaches buffer to Host Packet Descriptor
> + */
> +static inline void cppi5_hdesc_attach_buf(struct cppi5_host_desc_t *desc,
> +					  dma_addr_t buf, u32 buf_data_len,
> +					  dma_addr_t obuf, u32 obuf_len)
> +{
> +	desc->buf_ptr = buf;
> +	desc->buf_info1 = buf_data_len & CPPI5_BUFINFO1_HDESC_DATA_LEN_MASK;
> +	desc->org_buf_ptr = obuf;
> +	desc->org_buf_len = obuf_len & CPPI5_OBUFINFO0_HDESC_BUF_LEN_MASK;
> +}
> +
> +static inline void cppi5_hdesc_get_obuf(struct cppi5_host_desc_t *desc,
> +					dma_addr_t *obuf, u32 *obuf_len)
> +{
> +	*obuf = desc->org_buf_ptr;
> +	*obuf_len = desc->org_buf_len & CPPI5_OBUFINFO0_HDESC_BUF_LEN_MASK;
> +}
> +
> +static inline void cppi5_hdesc_reset_to_original(struct cppi5_host_desc_t *desc)
> +{
> +	desc->buf_ptr = desc->org_buf_ptr;
> +	desc->buf_info1 = desc->org_buf_len;
> +}
> +
> +/**
> + * cppi5_hdesc_link_hbdesc - link Host Buffer Descriptor to HDesc
> + * @desc: Host Packet Descriptor
> + * @hbuf_desc: Host Buffer Descriptor physical address
> + *
> + * add and link Host Buffer Descriptor to HDesc
> + */
> +static inline void cppi5_hdesc_link_hbdesc(struct cppi5_host_desc_t *desc,
> +					   dma_addr_t hbuf_desc)
> +{
> +	desc->next_desc = hbuf_desc;
> +}
> +
> +static inline dma_addr_t cppi5_hdesc_get_next_hbdesc(
> +		struct cppi5_host_desc_t *desc)
> +{
> +	return (dma_addr_t)desc->next_desc;
> +}
> +
> +static inline void cppi5_hdesc_reset_hbdesc(struct cppi5_host_desc_t *desc)
> +{
> +	desc->hdr = (struct cppi5_desc_hdr_t) { 0 };
> +	desc->next_desc = 0;
> +}
> +
> +/**
> + * cppi5_hdesc_epib_present -  check if EPIB present
> + * @desc_hdr: packet descriptor/TR header
> + *
> + * Returns true if EPIB present in the packet
> + */
> +static inline bool cppi5_hdesc_epib_present(struct cppi5_desc_hdr_t *desc_hdr)
> +{
> +	return !!(desc_hdr->pkt_info0 & CPPI5_INFO0_HDESC_EPIB_PRESENT);
> +}
> +
> +/**
> + * cppi5_hdesc_get_psdata -  Get pointer on PSDATA
> + * @desc: Host packet descriptor
> + *
> + * Returns pointer to PSDATA in HDesc.
> + * NULL - if ps_data is placed at the start of the data buffer.
> + */
> +static inline void *cppi5_hdesc_get_psdata(struct cppi5_host_desc_t *desc)
> +{
> +	u32 psdata_size;
> +	void *psdata;
> +
> +	if (desc->hdr.pkt_info0 & CPPI5_INFO0_HDESC_PSINFO_LOCATION)
> +		return NULL;
> +
> +	psdata_size = (desc->hdr.pkt_info0 &
> +		       CPPI5_INFO0_HDESC_PSINFO_SIZE_MASK) >>
> +		       CPPI5_INFO0_HDESC_PSINFO_SIZE_SHIFT;
> +
> +	if (!psdata_size)
> +		return NULL;
> +
> +	psdata = &desc->epib;
> +
> +	if (cppi5_hdesc_epib_present(&desc->hdr))
> +		psdata += CPPI5_INFO0_HDESC_EPIB_SIZE;
> +
> +	return psdata;
> +}
> +
> +static inline u32 *cppi5_hdesc_get_psdata32(struct cppi5_host_desc_t *desc)
> +{
> +	return (u32 *)cppi5_hdesc_get_psdata(desc);
> +}
> +
> +/**
> + * cppi5_hdesc_get_swdata - Get pointer to SWDATA
> + * @desc: Host packet descriptor
> + *
> + * Returns a pointer to the SWDATA in the HDesc.
> + * NOTE: it is the caller's responsibility to ensure the HDesc actually has SWDATA.
> + */
> +static inline void *cppi5_hdesc_get_swdata(struct cppi5_host_desc_t *desc)
> +{
> +	u32 psdata_size = 0;
> +	void *swdata;
> +
> +	if (!(desc->hdr.pkt_info0 & CPPI5_INFO0_HDESC_PSINFO_LOCATION))
> +		psdata_size = (desc->hdr.pkt_info0 &
> +			       CPPI5_INFO0_HDESC_PSINFO_SIZE_MASK) >>
> +			       CPPI5_INFO0_HDESC_PSINFO_SIZE_SHIFT;
> +
> +	swdata = &desc->epib;
> +
> +	if (cppi5_hdesc_epib_present(&desc->hdr))
> +		swdata += CPPI5_INFO0_HDESC_EPIB_SIZE;
> +
> +	swdata += (psdata_size << 2);
> +
> +	return swdata;
> +}
> +
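Not part of the patch, but as a cross-check of the pointer arithmetic in cppi5_hdesc_get_swdata() above: SWDATA sits after the optional EPIB and the PSDATA area, whose size field counts 32-bit words (hence the `<< 2`). A standalone sketch with plain stdint types (the helper name and the 16-byte EPIB size are assumptions taken from this header, not kernel API):

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative mirror of the SWDATA offset math: optional 16-byte EPIB
 * plus psdata_words 32-bit words of PSDATA, converted to bytes. */
static uint32_t swdata_offset(int epib_present, uint32_t psdata_words)
{
	uint32_t off = 0;

	if (epib_present)
		off += 16;		/* assumed EPIB size in bytes */
	off += psdata_words << 2;	/* PSDATA words -> bytes */

	return off;
}
```

So a descriptor with EPIB and a 4-word PSDATA area has its SWDATA starting 32 bytes past the epib field.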
> +/* ================================== TR ================================== */
> +
> +#define CPPI5_TR_TYPE_SHIFT			(0U)
> +#define CPPI5_TR_TYPE_MASK			GENMASK(3, 0)
> +#define CPPI5_TR_STATIC				BIT(4)
> +#define CPPI5_TR_WAIT				BIT(5)
> +#define CPPI5_TR_EVENT_SIZE_SHIFT		(6U)
> +#define CPPI5_TR_EVENT_SIZE_MASK		GENMASK(7, 6)
> +#define CPPI5_TR_TRIGGER0_SHIFT			(8U)
> +#define CPPI5_TR_TRIGGER0_MASK			GENMASK(9, 8)
> +#define CPPI5_TR_TRIGGER0_TYPE_SHIFT		(10U)
> +#define CPPI5_TR_TRIGGER0_TYPE_MASK		GENMASK(11, 10)
> +#define CPPI5_TR_TRIGGER1_SHIFT			(12U)
> +#define CPPI5_TR_TRIGGER1_MASK			GENMASK(13, 12)
> +#define CPPI5_TR_TRIGGER1_TYPE_SHIFT		(14U)
> +#define CPPI5_TR_TRIGGER1_TYPE_MASK		GENMASK(15, 14)
> +#define CPPI5_TR_CMD_ID_SHIFT			(16U)
> +#define CPPI5_TR_CMD_ID_MASK			GENMASK(23, 16)
> +#define CPPI5_TR_CSF_FLAGS_SHIFT		(24U)
> +#define CPPI5_TR_CSF_FLAGS_MASK			GENMASK(31, 24)
> +#define   CPPI5_TR_CSF_SA_INDIRECT		BIT(0)
> +#define   CPPI5_TR_CSF_DA_INDIRECT		BIT(1)
> +#define   CPPI5_TR_CSF_SUPR_EVT			BIT(2)
> +#define   CPPI5_TR_CSF_EOL_ADV_SHIFT		(4U)
> +#define   CPPI5_TR_CSF_EOL_ADV_MASK		GENMASK(6, 4)
> +#define   CPPI5_TR_CSF_EOP			BIT(7)
> +
> +/**
> + * enum cppi5_tr_types - TR types
> + * @CPPI5_TR_TYPE0:	One dimensional data move
> + * @CPPI5_TR_TYPE1:	Two dimensional data move
> + * @CPPI5_TR_TYPE2:	Three dimensional data move
> + * @CPPI5_TR_TYPE3:	Four dimensional data move
> + * @CPPI5_TR_TYPE4:	Four dimensional data move with data formatting
> + * @CPPI5_TR_TYPE5:	Four dimensional Cache Warm
> + * @CPPI5_TR_TYPE8:	Four Dimensional Block Move
> + * @CPPI5_TR_TYPE9:	Four Dimensional Block Move with Repacking
> + * @CPPI5_TR_TYPE10:	Two Dimensional Block Move
> + * @CPPI5_TR_TYPE11:	Two Dimensional Block Move with Repacking
> + * @CPPI5_TR_TYPE15:	Four Dimensional Block Move with Repacking and
> + *			Indirection
> + */
> +enum cppi5_tr_types {
> +	CPPI5_TR_TYPE0 = 0,
> +	CPPI5_TR_TYPE1,
> +	CPPI5_TR_TYPE2,
> +	CPPI5_TR_TYPE3,
> +	CPPI5_TR_TYPE4,
> +	CPPI5_TR_TYPE5,
> +	/* type6-7: Reserved */
> +	CPPI5_TR_TYPE8 = 8,
> +	CPPI5_TR_TYPE9,
> +	CPPI5_TR_TYPE10,
> +	CPPI5_TR_TYPE11,
> +	/* type12-14: Reserved */
> +	CPPI5_TR_TYPE15 = 15,
> +	CPPI5_TR_TYPE_MAX
> +};
> +
> +/**
> + * enum cppi5_tr_event_size - TR Flags EVENT_SIZE field specifies when an event
> + *			      is generated for each TR.
> + * @CPPI5_TR_EVENT_SIZE_COMPLETION:	When TR is complete and all status for
> + * 					the TR has been received
> + * @CPPI5_TR_EVENT_SIZE_ICNT1_DEC:	Type 0: when the last data transaction
> + *					is sent for the TR
> + *					Type 1-11: when ICNT1 is decremented
> + * @CPPI5_TR_EVENT_SIZE_ICNT2_DEC:	Type 0-1,10-11: when the last data
> + *					transaction is sent for the TR
> + *					All other types: when ICNT2 is
> + *					decremented
> + * @CPPI5_TR_EVENT_SIZE_ICNT3_DEC:	Type 0-2,10-11: when the last data
> + *					transaction is sent for the TR
> + *					All other types: when ICNT3 is
> + *					decremented
> + */
> +enum cppi5_tr_event_size {
> +	CPPI5_TR_EVENT_SIZE_COMPLETION,
> +	CPPI5_TR_EVENT_SIZE_ICNT1_DEC,
> +	CPPI5_TR_EVENT_SIZE_ICNT2_DEC,
> +	CPPI5_TR_EVENT_SIZE_ICNT3_DEC,
> +	CPPI5_TR_EVENT_SIZE_MAX
> +};
> +
> +/**
> + * enum cppi5_tr_trigger - TR Flags TRIGGERx field specifies the type of trigger
> + *			   used to enable the TR to transfer data as specified
> + *			   by TRIGGERx_TYPE field.
> + * @CPPI5_TR_TRIGGER_NONE:		No trigger
> + * @CPPI5_TR_TRIGGER_GLOBAL0:		Global trigger 0
> + * @CPPI5_TR_TRIGGER_GLOBAL1:		Global trigger 1
> + * @CPPI5_TR_TRIGGER_LOCAL_EVENT:	Local Event
> + */
> +enum cppi5_tr_trigger {
> +	CPPI5_TR_TRIGGER_NONE,
> +	CPPI5_TR_TRIGGER_GLOBAL0,
> +	CPPI5_TR_TRIGGER_GLOBAL1,
> +	CPPI5_TR_TRIGGER_LOCAL_EVENT,
> +	CPPI5_TR_TRIGGER_MAX
> +};
> +
> +/**
> + * enum cppi5_tr_trigger_type - TR Flags TRIGGERx_TYPE field specifies the type
> + *				of data transfer that will be enabled by
> + *				receiving a trigger as specified by TRIGGERx.
> + * @CPPI5_TR_TRIGGER_TYPE_ICNT1_DEC:	The second inner most loop (ICNT1) will
> + *					be decremented by 1
> + * @CPPI5_TR_TRIGGER_TYPE_ICNT2_DEC:	The third inner most loop (ICNT2) will
> + *					be decremented by 1
> + * @CPPI5_TR_TRIGGER_TYPE_ICNT3_DEC:	The outer most loop (ICNT3) will be
> + *					decremented by 1
> + * @CPPI5_TR_TRIGGER_TYPE_ALL:		The entire TR will be allowed to
> + *					complete
> + */
> +enum cppi5_tr_trigger_type {
> +	CPPI5_TR_TRIGGER_TYPE_ICNT1_DEC,
> +	CPPI5_TR_TRIGGER_TYPE_ICNT2_DEC,
> +	CPPI5_TR_TRIGGER_TYPE_ICNT3_DEC,
> +	CPPI5_TR_TRIGGER_TYPE_ALL,
> +	CPPI5_TR_TRIGGER_TYPE_MAX
> +};
> +
> +typedef u32 cppi5_tr_flags_t;
> +
> +/**
> + * struct cppi5_tr_type0_t - Type 0 (One dimensional data move) TR (16 byte)
> + * @flags:		TR flags (type, triggers, event, configuration)
> + * @icnt0:		Total loop iteration count for level 0 (innermost)
> + * @_reserved:		Not used
> + * @addr:		Starting address for the source data or destination data
> + */
> +struct cppi5_tr_type0_t {
> +	cppi5_tr_flags_t flags;
> +	u16 icnt0;
> +	u16 _reserved;
> +	u64 addr;
> +} __aligned(16) __packed;
> +
> +/**
> + * struct cppi5_tr_type1_t - Type 1 (Two dimensional data move) TR (32 byte)
> + * @flags:		TR flags (type, triggers, event, configuration)
> + * @icnt0:		Total loop iteration count for level 0 (innermost)
> + * @icnt1:		Total loop iteration count for level 1
> + * @addr:		Starting address for the source data or destination data
> + * @dim1:		Signed dimension for loop level 1
> + */
> +struct cppi5_tr_type1_t {
> +	cppi5_tr_flags_t flags;
> +	u16 icnt0;
> +	u16 icnt1;
> +	u64 addr;
> +	s32 dim1;
> +} __aligned(32) __packed;
> +
> +/**
> + * struct cppi5_tr_type2_t - Type 2 (Three dimensional data move) TR (32 byte)
> + * @flags:		TR flags (type, triggers, event, configuration)
> + * @icnt0:		Total loop iteration count for level 0 (innermost)
> + * @icnt1:		Total loop iteration count for level 1
> + * @addr:		Starting address for the source data or destination data
> + * @dim1:		Signed dimension for loop level 1
> + * @icnt2:		Total loop iteration count for level 2
> + * @_reserved:		Not used
> + * @dim2:		Signed dimension for loop level 2
> + */
> +struct cppi5_tr_type2_t {
> +	cppi5_tr_flags_t flags;
> +	u16 icnt0;
> +	u16 icnt1;
> +	u64 addr;
> +	s32 dim1;
> +	u16 icnt2;
> +	u16 _reserved;
> +	s32 dim2;
> +} __aligned(32) __packed;
> +
> +/**
> + * struct cppi5_tr_type3_t - Type 3 (Four dimensional data move) TR (32 byte)
> + * @flags:		TR flags (type, triggers, event, configuration)
> + * @icnt0:		Total loop iteration count for level 0 (innermost)
> + * @icnt1:		Total loop iteration count for level 1
> + * @addr:		Starting address for the source data or destination data
> + * @dim1:		Signed dimension for loop level 1
> + * @icnt2:		Total loop iteration count for level 2
> + * @icnt3:		Total loop iteration count for level 3 (outermost)
> + * @dim2:		Signed dimension for loop level 2
> + * @dim3:		Signed dimension for loop level 3
> + */
> +struct cppi5_tr_type3_t {
> +	cppi5_tr_flags_t flags;
> +	u16 icnt0;
> +	u16 icnt1;
> +	u64 addr;
> +	s32 dim1;
> +	u16 icnt2;
> +	u16 icnt3;
> +	s32 dim2;
> +	s32 dim3;
> +} __aligned(32) __packed;
> +
> +/**
> + * struct cppi5_tr_type15_t - Type 15 (Four Dimensional Block Copy with
> + *			      Repacking and Indirection Support) TR (64 byte)
> + * @flags:		TR flags (type, triggers, event, configuration)
> + * @icnt0:		Total loop iteration count for level 0 (innermost) for
> + *			source
> + * @icnt1:		Total loop iteration count for level 1 for source
> + * @addr:		Starting address for the source data
> + * @dim1:		Signed dimension for loop level 1 for source
> + * @icnt2:		Total loop iteration count for level 2 for source
> + * @icnt3:		Total loop iteration count for level 3 (outermost) for
> + *			source
> + * @dim2:		Signed dimension for loop level 2 for source
> + * @dim3:		Signed dimension for loop level 3 for source
> + * @_reserved:		Not used
> + * @ddim1:		Signed dimension for loop level 1 for destination
> + * @daddr:		Starting address for the destination data
> + * @ddim2:		Signed dimension for loop level 2 for destination
> + * @ddim3:		Signed dimension for loop level 3 for destination
> + * @dicnt0:		Total loop iteration count for level 0 (innermost) for
> + *			destination
> + * @dicnt1:		Total loop iteration count for level 1 for destination
> + * @dicnt2:		Total loop iteration count for level 2 for destination
> + * @dicnt3:		Total loop iteration count for level 3 (outermost) for
> + *			destination
> + */
> +struct cppi5_tr_type15_t {
> +	cppi5_tr_flags_t flags;
> +	u16 icnt0;
> +	u16 icnt1;
> +	u64 addr;
> +	s32 dim1;
> +	u16 icnt2;
> +	u16 icnt3;
> +	s32 dim2;
> +	s32 dim3;
> +	u32 _reserved;
> +	s32 ddim1;
> +	u64 daddr;
> +	s32 ddim2;
> +	s32 ddim3;
> +	u16 dicnt0;
> +	u16 dicnt1;
> +	u16 dicnt2;
> +	u16 dicnt3;
> +} __aligned(64) __packed;
> +
> +/**
> + * struct cppi5_tr_resp_t - TR response record
> + * @status:		Status type and info
> + * @_reserved:		Not used
> + * @cmd_id:		Command ID for the TR for TR identification
> + * @flags:		Configuration Specific Flags
> + */
> +struct cppi5_tr_resp_t {
> +	u8 status;
> +	u8 _reserved;
> +	u8 cmd_id;
> +	u8 flags;
> +} __packed;
> +
> +#define CPPI5_TR_RESPONSE_STATUS_TYPE_SHIFT	(0U)
> +#define CPPI5_TR_RESPONSE_STATUS_TYPE_MASK	GENMASK(3, 0)
> +#define CPPI5_TR_RESPONSE_STATUS_INFO_SHIFT	(4U)
> +#define CPPI5_TR_RESPONSE_STATUS_INFO_MASK	GENMASK(7, 4)
> +#define CPPI5_TR_RESPONSE_CMDID_SHIFT		(16U)
> +#define CPPI5_TR_RESPONSE_CMDID_MASK		GENMASK(23, 16)
> +#define CPPI5_TR_RESPONSE_CFG_SPECIFIC_SHIFT	(24U)
> +#define CPPI5_TR_RESPONSE_CFG_SPECIFIC_MASK	GENMASK(31, 24)
> +
> +/**
> + * enum cppi5_tr_resp_status_type - TR Response Status Type field is used to
> + *				    determine what type of status is being
> + *				    returned.
> + * @CPPI5_TR_RESPONSE_STATUS_NONE:		No error, completion: completed
> + * @CPPI5_TR_RESPONSE_STATUS_TRANSFER_ERR:	Transfer Error, completion: none
> + *						or partially completed
> + * @CPPI5_TR_RESPONSE_STATUS_ABORTED_ERR:	Aborted Error, completion: none
> + *						or partially completed
> + * @CPPI5_TR_RESPONSE_STATUS_SUBMISSION_ERR:	Submission Error, completion:
> + *						none
> + * @CPPI5_TR_RESPONSE_STATUS_UNSUPPORTED_ERR:	Unsupported Error, completion:
> + *						none
> + * @CPPI5_TR_RESPONSE_STATUS_TRANSFER_EXCEPTION: Transfer Exception, completion:
> + *						partially completed
> + * @CPPI5_TR_RESPONSE_STATUS__TEARDOWN_FLUSH:	Teardown Flush, completion: none
> + */
> +enum cppi5_tr_resp_status_type {
> +	CPPI5_TR_RESPONSE_STATUS_NONE,
> +	CPPI5_TR_RESPONSE_STATUS_TRANSFER_ERR,
> +	CPPI5_TR_RESPONSE_STATUS_ABORTED_ERR,
> +	CPPI5_TR_RESPONSE_STATUS_SUBMISSION_ERR,
> +	CPPI5_TR_RESPONSE_STATUS_UNSUPPORTED_ERR,
> +	CPPI5_TR_RESPONSE_STATUS_TRANSFER_EXCEPTION,
> +	CPPI5_TR_RESPONSE_STATUS__TEARDOWN_FLUSH,
> +	CPPI5_TR_RESPONSE_STATUS_MAX
> +};
> +
> +/**
> + * enum cppi5_tr_resp_status_submission - TR Response Status field values which
> + *					  correspond to a Submission Error
> + * @CPPI5_TR_RESPONSE_STATUS_SUBMISSION_ICNT0:	ICNT0 was 0
> + * @CPPI5_TR_RESPONSE_STATUS_SUBMISSION_FIFO_FULL: Channel FIFO was full when TR
> + *						received
> + * @CPPI5_TR_RESPONSE_STATUS_SUBMISSION_OWN:	Channel is not owned by the
> + *						submitter
> + */
> +enum cppi5_tr_resp_status_submission {
> +	CPPI5_TR_RESPONSE_STATUS_SUBMISSION_ICNT0,
> +	CPPI5_TR_RESPONSE_STATUS_SUBMISSION_FIFO_FULL,
> +	CPPI5_TR_RESPONSE_STATUS_SUBMISSION_OWN,
> +	CPPI5_TR_RESPONSE_STATUS_SUBMISSION_MAX
> +};
> +
> +/**
> + * enum cppi5_tr_resp_status_unsupported - TR Response Status field values which
> + *					   correspond to an Unsupported Error
> + * @CPPI5_TR_RESPONSE_STATUS_UNSUPPORTED_TR_TYPE:	TR Type not supported
> + * @CPPI5_TR_RESPONSE_STATUS_UNSUPPORTED_STATIC:	STATIC not supported
> + * @CPPI5_TR_RESPONSE_STATUS_UNSUPPORTED_EOL:		EOL not supported
> + * @CPPI5_TR_RESPONSE_STATUS_UNSUPPORTED_CFG_SPECIFIC:	CONFIGURATION SPECIFIC
> + *							not supported
> + * @CPPI5_TR_RESPONSE_STATUS_UNSUPPORTED_AMODE:		AMODE not supported
> + * @CPPI5_TR_RESPONSE_STATUS_UNSUPPORTED_ELTYPE:	ELTYPE not supported
> + * @CPPI5_TR_RESPONSE_STATUS_UNSUPPORTED_DFMT:		DFMT not supported
> + * @CPPI5_TR_RESPONSE_STATUS_UNSUPPORTED_SECTR:		SECTR not supported
> + * @CPPI5_TR_RESPONSE_STATUS_UNSUPPORTED_AMODE_SPECIFIC: AMODE SPECIFIC field
> + *							not supported
> + */
> +enum cppi5_tr_resp_status_unsupported {
> +	CPPI5_TR_RESPONSE_STATUS_UNSUPPORTED_TR_TYPE,
> +	CPPI5_TR_RESPONSE_STATUS_UNSUPPORTED_STATIC,
> +	CPPI5_TR_RESPONSE_STATUS_UNSUPPORTED_EOL,
> +	CPPI5_TR_RESPONSE_STATUS_UNSUPPORTED_CFG_SPECIFIC,
> +	CPPI5_TR_RESPONSE_STATUS_UNSUPPORTED_AMODE,
> +	CPPI5_TR_RESPONSE_STATUS_UNSUPPORTED_ELTYPE,
> +	CPPI5_TR_RESPONSE_STATUS_UNSUPPORTED_DFMT,
> +	CPPI5_TR_RESPONSE_STATUS_UNSUPPORTED_SECTR,
> +	CPPI5_TR_RESPONSE_STATUS_UNSUPPORTED_AMODE_SPECIFIC,
> +	CPPI5_TR_RESPONSE_STATUS_UNSUPPORTED_MAX
> +};
> +
> +/**
> + * cppi5_trdesc_calc_size - Calculate TR Descriptor size
> + * @tr_count: number of TR records
> + * @tr_size: Nominal size of TR record (max) [16, 32, 64, 128]
> + *
> + * Returns required TR Descriptor size
> + */
> +static inline size_t cppi5_trdesc_calc_size(u32 tr_count, u32 tr_size)
> +{
> +	/*
> +	 * The Size of a TR descriptor is:
> +	 * 1 x tr_size : the first 16 bytes are used by the packet info block +
> +	 * tr_count x tr_size : Transfer Request Records +
> +	 * tr_count x sizeof(struct cppi5_tr_resp_t) : Transfer Response Records
> +	 */
> +	return tr_size * (tr_count + 1) +
> +		sizeof(struct cppi5_tr_resp_t) * tr_count;
> +}
> +
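To make the size formula above concrete, here is a standalone mirror of the calculation (illustrative only; the helper name is made up and a plain `4` stands in for sizeof(struct cppi5_tr_resp_t)):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors cppi5_trdesc_calc_size(): one tr_size-sized record for the
 * packet info block, tr_count TR records, and tr_count 4-byte
 * response records. */
static size_t trdesc_calc_size(uint32_t tr_count, uint32_t tr_size)
{
	return (size_t)tr_size * (tr_count + 1) + 4u * tr_count;
}
```

For example, two 16-byte Type 0 TRs need 16 * 3 + 4 * 2 = 56 bytes, and a single 64-byte Type 15 TR needs 64 * 2 + 4 = 132 bytes.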
> +/**
> + * cppi5_trdesc_init - Init TR Descriptor
> + * @desc_hdr: TR Descriptor header
> + * @tr_count: number of TR records
> + * @tr_size: Nominal size of TR record (max) [16, 32, 64, 128]
> + * @reload_idx: Absolute index to jump to on the 2nd and following passes
> + *		through the TR packet.
> + * @reload_count: Number of times to jump from last entry to reload_idx. 0x1ff
> + *		  indicates infinite looping.
> + *
> + * Init TR Descriptor
> + */
> +static inline void cppi5_trdesc_init(struct cppi5_desc_hdr_t *desc_hdr,
> +				     u32 tr_count, u32 tr_size, u32 reload_idx,
> +				     u32 reload_count)
> +{
> +	desc_hdr->pkt_info0 = CPPI5_INFO0_DESC_TYPE_VAL_TR <<
> +			      CPPI5_INFO0_HDESC_TYPE_SHIFT;
> +	desc_hdr->pkt_info0 |= (reload_count << CPPI5_INFO0_TRDESC_RLDCNT_SHIFT) &
> +			       CPPI5_INFO0_TRDESC_RLDCNT_MASK;
> +	desc_hdr->pkt_info0 |= (reload_idx << CPPI5_INFO0_TRDESC_RLDIDX_SHIFT) &
> +			       CPPI5_INFO0_TRDESC_RLDIDX_MASK;
> +	desc_hdr->pkt_info0 |= (tr_count - 1) & CPPI5_INFO0_TRDESC_LASTIDX_MASK;
> +
> +	desc_hdr->pkt_info1 |= ((ffs(tr_size >> 4) - 1) <<
> +				CPPI5_INFO1_TRDESC_RECSIZE_SHIFT) &
> +				CPPI5_INFO1_TRDESC_RECSIZE_MASK;
> +}
> +
> +/**
> + * cppi5_tr_init - Init TR record
> + * @flags: Pointer to the TR's flags
> + * @type: TR type
> + * @static_tr: TR is static
> + * @wait: Wait for TR completion before allowing the next TR to start
> + * @event_size: output event generation cfg
> + * @cmd_id: TR identifier (application specifics)
> + *
> + * Init TR record
> + */
> +static inline void cppi5_tr_init(cppi5_tr_flags_t *flags,
> +				 enum cppi5_tr_types type, bool static_tr,
> +				 bool wait, enum cppi5_tr_event_size event_size,
> +				 u32 cmd_id)
> +{
> +	*flags = type;
> +	*flags |= (event_size << CPPI5_TR_EVENT_SIZE_SHIFT) &
> +		  CPPI5_TR_EVENT_SIZE_MASK;
> +
> +	*flags |= (cmd_id << CPPI5_TR_CMD_ID_SHIFT) &
> +		  CPPI5_TR_CMD_ID_MASK;
> +
> +	if (static_tr && (type == CPPI5_TR_TYPE8 || type == CPPI5_TR_TYPE9))
> +		*flags |= CPPI5_TR_STATIC;
> +
> +	if (wait)
> +		*flags |= CPPI5_TR_WAIT;
> +}
> +
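The flag packing done by cppi5_tr_init() can be sanity-checked in isolation. A standalone sketch using the bit positions defined above (type in bits 3:0, event size in 7:6, command ID in 23:16); the function name and hard-coded masks are for illustration only:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative mirror of the basic field packing in cppi5_tr_init():
 * TR type, event size and command ID placed per the CPPI5_TR_* shifts. */
static uint32_t tr_flags_pack(uint32_t type, uint32_t event_size,
			      uint32_t cmd_id)
{
	uint32_t flags = type & 0xfu;			/* TYPE, bits 3:0 */

	flags |= (event_size << 6) & 0xc0u;		/* EVENT_SIZE, 7:6 */
	flags |= (cmd_id << 16) & 0xff0000u;		/* CMD_ID, 23:16 */

	return flags;
}
```

So a Type 15 TR with completion events and command ID 0x25 packs to 0x25000f.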
> +/**
> + * cppi5_tr_set_trigger - Configure trigger0/1 and trigger0/1_type
> + * @flags: Pointer to the TR's flags
> + * @trigger0: trigger0 selection
> + * @trigger0_type: type of data transfer that will be enabled by trigger0
> + * @trigger1: trigger1 selection
> + * @trigger1_type: type of data transfer that will be enabled by trigger1
> + *
> + * Configure the triggers for the TR
> + */
> +static inline void cppi5_tr_set_trigger(cppi5_tr_flags_t *flags,
> +		enum cppi5_tr_trigger trigger0,
> +		enum cppi5_tr_trigger_type trigger0_type,
> +		enum cppi5_tr_trigger trigger1,
> +		enum cppi5_tr_trigger_type trigger1_type)
> +{
> +	*flags |= (trigger0 << CPPI5_TR_TRIGGER0_SHIFT) &
> +		  CPPI5_TR_TRIGGER0_MASK;
> +	*flags |= (trigger0_type << CPPI5_TR_TRIGGER0_TYPE_SHIFT) &
> +		  CPPI5_TR_TRIGGER0_TYPE_MASK;
> +
> +	*flags |= (trigger1 << CPPI5_TR_TRIGGER1_SHIFT) &
> +		  CPPI5_TR_TRIGGER1_MASK;
> +	*flags |= (trigger1_type << CPPI5_TR_TRIGGER1_TYPE_SHIFT) &
> +		  CPPI5_TR_TRIGGER1_TYPE_MASK;
> +}
> +
> +/**
> + * cppi5_tr_csf_set - Update the Configuration specific flags
> + * @flags: Pointer to the TR's flags
> + * @csf: Configuration specific flags
> + *
> + * Set a bit in Configuration Specific Flags section of the TR flags.
> + */
> +static inline void cppi5_tr_csf_set(cppi5_tr_flags_t *flags, u32 csf)
> +{
> +	*flags |= (csf << CPPI5_TR_CSF_FLAGS_SHIFT) &
> +		  CPPI5_TR_CSF_FLAGS_MASK;
> +}
> +
> +#endif /* __TI_CPPI5_H__ */
> 

--
Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki. Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki
Tero Kristo Nov. 5, 2019, 7:49 a.m. UTC | #2
On 01/11/2019 10:41, Peter Ujfalusi wrote:
> In K3 architecture the DMA operates within threads. One end of the thread
> is UDMAP, the other is on the peripheral side.
> 
> The UDMAP channel configuration depends on the needs of the remote
> endpoint and it can differ from peripheral to peripheral.
> 
> This patch adds a database for am654 and j721e and a small API to fetch
> the PSI-L endpoint configuration from the database, which should only be
> used by the DMA driver(s).
> 
> Another API is added for native peripherals to allow passing a new
> configuration for the threads they are using. This is needed to handle
> changes caused, for example, by different firmware loaded for the
> peripheral.
> 
> Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
> ---
>   drivers/dma/ti/Kconfig         |   3 +
>   drivers/dma/ti/Makefile        |   1 +
>   drivers/dma/ti/k3-psil-am654.c | 172 ++++++++++++++++++++++++++
>   drivers/dma/ti/k3-psil-j721e.c | 219 +++++++++++++++++++++++++++++++++
>   drivers/dma/ti/k3-psil-priv.h  |  39 ++++++
>   drivers/dma/ti/k3-psil.c       |  97 +++++++++++++++
>   include/linux/dma/k3-psil.h    |  47 +++++++
>   7 files changed, 578 insertions(+)
>   create mode 100644 drivers/dma/ti/k3-psil-am654.c
>   create mode 100644 drivers/dma/ti/k3-psil-j721e.c
>   create mode 100644 drivers/dma/ti/k3-psil-priv.h
>   create mode 100644 drivers/dma/ti/k3-psil.c
>   create mode 100644 include/linux/dma/k3-psil.h
> 
> diff --git a/drivers/dma/ti/Kconfig b/drivers/dma/ti/Kconfig
> index d507c24fbf31..72f3d2728178 100644
> --- a/drivers/dma/ti/Kconfig
> +++ b/drivers/dma/ti/Kconfig
> @@ -34,5 +34,8 @@ config DMA_OMAP
>   	  Enable support for the TI sDMA (System DMA or DMA4) controller. This
>   	  DMA engine is found on OMAP and DRA7xx parts.
>   
> +config TI_K3_PSIL
> +	bool
> +
>   config TI_DMA_CROSSBAR
>   	bool
> diff --git a/drivers/dma/ti/Makefile b/drivers/dma/ti/Makefile
> index 113e59ec9c32..f8d912ad7eaf 100644
> --- a/drivers/dma/ti/Makefile
> +++ b/drivers/dma/ti/Makefile
> @@ -2,4 +2,5 @@
>   obj-$(CONFIG_TI_CPPI41) += cppi41.o
>   obj-$(CONFIG_TI_EDMA) += edma.o
>   obj-$(CONFIG_DMA_OMAP) += omap-dma.o
> +obj-$(CONFIG_TI_K3_PSIL) += k3-psil.o k3-psil-am654.o k3-psil-j721e.o
>   obj-$(CONFIG_TI_DMA_CROSSBAR) += dma-crossbar.o
> diff --git a/drivers/dma/ti/k3-psil-am654.c b/drivers/dma/ti/k3-psil-am654.c
> new file mode 100644
> index 000000000000..edd7fff36f44
> --- /dev/null
> +++ b/drivers/dma/ti/k3-psil-am654.c
> @@ -0,0 +1,172 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + *  Copyright (C) 2019 Texas Instruments Incorporated - http://www.ti.com
> + *  Author: Peter Ujfalusi <peter.ujfalusi@ti.com>
> + */
> +
> +#include <linux/kernel.h>
> +
> +#include "k3-psil-priv.h"
> +
> +#define PSIL_PDMA_XY_TR(x)				\
> +	{						\
> +		.thread_id = x,				\
> +		.ep_config = {				\
> +			.ep_type = PSIL_EP_PDMA_XY,	\
> +		},					\
> +	}
> +
> +#define PSIL_PDMA_XY_PKT(x)				\
> +	{						\
> +		.thread_id = x,				\
> +		.ep_config = {				\
> +			.ep_type = PSIL_EP_PDMA_XY,	\
> +			.pkt_mode = 1,			\
> +		},					\
> +	}
> +
> +#define PSIL_ETHERNET(x)				\
> +	{						\
> +		.thread_id = x,				\
> +		.ep_config = {				\
> +			.ep_type = PSIL_EP_NATIVE,	\
> +			.pkt_mode = 1,			\
> +			.needs_epib = 1,		\
> +			.psd_size = 16,			\
> +		},					\
> +	}
> +
> +#define PSIL_SA2UL(x, tx)				\
> +	{						\
> +		.thread_id = x,				\
> +		.ep_config = {				\
> +			.ep_type = PSIL_EP_NATIVE,	\
> +			.pkt_mode = 1,			\
> +			.needs_epib = 1,		\
> +			.psd_size = 64,			\
> +			.notdpkt = tx,			\
> +		},					\
> +	}
> +
> +/* PSI-L source thread IDs, used for RX (DMA_DEV_TO_MEM) */
> +struct psil_ep am654_src_ep_map[] = {
> +	/* SA2UL */
> +	PSIL_SA2UL(0x4000, 0),
> +	PSIL_SA2UL(0x4001, 0),
> +	/* PRU_ICSSG0 */
> +	PSIL_ETHERNET(0x4100),
> +	PSIL_ETHERNET(0x4101),
> +	PSIL_ETHERNET(0x4102),
> +	PSIL_ETHERNET(0x4103),
> +	/* PRU_ICSSG1 */
> +	PSIL_ETHERNET(0x4200),
> +	PSIL_ETHERNET(0x4201),
> +	PSIL_ETHERNET(0x4202),
> +	PSIL_ETHERNET(0x4203),
> +	/* PRU_ICSSG2 */
> +	PSIL_ETHERNET(0x4300),
> +	PSIL_ETHERNET(0x4301),
> +	PSIL_ETHERNET(0x4302),
> +	PSIL_ETHERNET(0x4303),
> +	/* PDMA0 - McASPs */
> +	PSIL_PDMA_XY_TR(0x4400),
> +	PSIL_PDMA_XY_TR(0x4401),
> +	PSIL_PDMA_XY_TR(0x4402),
> +	/* PDMA1 - SPI0-4 */
> +	PSIL_PDMA_XY_PKT(0x4500),
> +	PSIL_PDMA_XY_PKT(0x4501),
> +	PSIL_PDMA_XY_PKT(0x4502),
> +	PSIL_PDMA_XY_PKT(0x4503),
> +	PSIL_PDMA_XY_PKT(0x4504),
> +	PSIL_PDMA_XY_PKT(0x4505),
> +	PSIL_PDMA_XY_PKT(0x4506),
> +	PSIL_PDMA_XY_PKT(0x4507),
> +	PSIL_PDMA_XY_PKT(0x4508),
> +	PSIL_PDMA_XY_PKT(0x4509),
> +	PSIL_PDMA_XY_PKT(0x450a),
> +	PSIL_PDMA_XY_PKT(0x450b),
> +	PSIL_PDMA_XY_PKT(0x450c),
> +	PSIL_PDMA_XY_PKT(0x450d),
> +	PSIL_PDMA_XY_PKT(0x450e),
> +	PSIL_PDMA_XY_PKT(0x450f),
> +	PSIL_PDMA_XY_PKT(0x4510),
> +	PSIL_PDMA_XY_PKT(0x4511),
> +	PSIL_PDMA_XY_PKT(0x4512),
> +	PSIL_PDMA_XY_PKT(0x4513),
> +	/* PDMA1 - USART0-2 */
> +	PSIL_PDMA_XY_PKT(0x4514),
> +	PSIL_PDMA_XY_PKT(0x4515),
> +	PSIL_PDMA_XY_PKT(0x4516),
> +	/* CPSW0 */
> +	PSIL_ETHERNET(0x7000),
> +	/* MCU_PDMA0 - ADCs */
> +	PSIL_PDMA_XY_TR(0x7100),
> +	PSIL_PDMA_XY_TR(0x7101),
> +	PSIL_PDMA_XY_TR(0x7102),
> +	PSIL_PDMA_XY_TR(0x7103),
> +	/* MCU_PDMA1 - MCU_SPI0-2 */
> +	PSIL_PDMA_XY_PKT(0x7200),
> +	PSIL_PDMA_XY_PKT(0x7201),
> +	PSIL_PDMA_XY_PKT(0x7202),
> +	PSIL_PDMA_XY_PKT(0x7203),
> +	PSIL_PDMA_XY_PKT(0x7204),
> +	PSIL_PDMA_XY_PKT(0x7205),
> +	PSIL_PDMA_XY_PKT(0x7206),
> +	PSIL_PDMA_XY_PKT(0x7207),
> +	PSIL_PDMA_XY_PKT(0x7208),
> +	PSIL_PDMA_XY_PKT(0x7209),
> +	PSIL_PDMA_XY_PKT(0x720a),
> +	PSIL_PDMA_XY_PKT(0x720b),
> +	/* MCU_PDMA1 - MCU_USART0 */
> +	PSIL_PDMA_XY_PKT(0x7212),
> +};
> +
> +/* PSI-L destination thread IDs, used for TX (DMA_MEM_TO_DEV) */
> +struct psil_ep am654_dst_ep_map[] = {
> +	/* SA2UL */
> +	PSIL_SA2UL(0xc000, 1),
> +	/* PRU_ICSSG0 */
> +	PSIL_ETHERNET(0xc100),
> +	PSIL_ETHERNET(0xc101),
> +	PSIL_ETHERNET(0xc102),
> +	PSIL_ETHERNET(0xc103),
> +	PSIL_ETHERNET(0xc104),
> +	PSIL_ETHERNET(0xc105),
> +	PSIL_ETHERNET(0xc106),
> +	PSIL_ETHERNET(0xc107),
> +	/* PRU_ICSSG1 */
> +	PSIL_ETHERNET(0xc200),
> +	PSIL_ETHERNET(0xc201),
> +	PSIL_ETHERNET(0xc202),
> +	PSIL_ETHERNET(0xc203),
> +	PSIL_ETHERNET(0xc204),
> +	PSIL_ETHERNET(0xc205),
> +	PSIL_ETHERNET(0xc206),
> +	PSIL_ETHERNET(0xc207),
> +	/* PRU_ICSSG2 */
> +	PSIL_ETHERNET(0xc300),
> +	PSIL_ETHERNET(0xc301),
> +	PSIL_ETHERNET(0xc302),
> +	PSIL_ETHERNET(0xc303),
> +	PSIL_ETHERNET(0xc304),
> +	PSIL_ETHERNET(0xc305),
> +	PSIL_ETHERNET(0xc306),
> +	PSIL_ETHERNET(0xc307),
> +	/* CPSW0 */
> +	PSIL_ETHERNET(0xf000),
> +	PSIL_ETHERNET(0xf001),
> +	PSIL_ETHERNET(0xf002),
> +	PSIL_ETHERNET(0xf003),
> +	PSIL_ETHERNET(0xf004),
> +	PSIL_ETHERNET(0xf005),
> +	PSIL_ETHERNET(0xf006),
> +	PSIL_ETHERNET(0xf007),
> +};
> +
> +struct psil_ep_map am654_ep_map = {
> +	.name = "am654",
> +	.src = am654_src_ep_map,
> +	.src_count = ARRAY_SIZE(am654_src_ep_map),
> +	.dst = am654_dst_ep_map,
> +	.dst_count = ARRAY_SIZE(am654_dst_ep_map),
> +};
> diff --git a/drivers/dma/ti/k3-psil-j721e.c b/drivers/dma/ti/k3-psil-j721e.c
> new file mode 100644
> index 000000000000..86e1ff57e197
> --- /dev/null
> +++ b/drivers/dma/ti/k3-psil-j721e.c
> @@ -0,0 +1,219 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + *  Copyright (C) 2019 Texas Instruments Incorporated - http://www.ti.com
> + *  Author: Peter Ujfalusi <peter.ujfalusi@ti.com>
> + */
> +
> +#include <linux/kernel.h>
> +
> +#include "k3-psil-priv.h"
> +
> +#define PSIL_PDMA_XY_TR(x)				\
> +	{						\
> +		.thread_id = x,				\
> +		.ep_config = {				\
> +			.ep_type = PSIL_EP_PDMA_XY,	\
> +		},					\
> +	}
> +
> +#define PSIL_PDMA_XY_PKT(x)				\
> +	{						\
> +		.thread_id = x,				\
> +		.ep_config = {				\
> +			.ep_type = PSIL_EP_PDMA_XY,	\
> +			.pkt_mode = 1,			\
> +		},					\
> +	}
> +
> +#define PSIL_PDMA_MCASP(x)				\
> +	{						\
> +		.thread_id = x,				\
> +		.ep_config = {				\
> +			.ep_type = PSIL_EP_PDMA_XY,	\
> +			.pdma_acc32 = 1,		\
> +			.pdma_burst = 1,		\
> +		},					\
> +	}
> +
> +#define PSIL_ETHERNET(x)				\
> +	{						\
> +		.thread_id = x,				\
> +		.ep_config = {				\
> +			.ep_type = PSIL_EP_NATIVE,	\
> +			.pkt_mode = 1,			\
> +			.needs_epib = 1,		\
> +			.psd_size = 16,			\
> +		},					\
> +	}
> +
> +#define PSIL_SA2UL(x, tx)				\
> +	{						\
> +		.thread_id = x,				\
> +		.ep_config = {				\
> +			.ep_type = PSIL_EP_NATIVE,	\
> +			.pkt_mode = 1,			\
> +			.needs_epib = 1,		\
> +			.psd_size = 64,			\
> +			.notdpkt = tx,			\
> +		},					\
> +	}
> +
> +/* PSI-L source thread IDs, used for RX (DMA_DEV_TO_MEM) */
> +struct psil_ep j721e_src_ep_map[] = {
> +	/* SA2UL */
> +	PSIL_SA2UL(0x4000, 0),
> +	PSIL_SA2UL(0x4001, 0),
> +	/* PRU_ICSSG0 */
> +	PSIL_ETHERNET(0x4100),
> +	PSIL_ETHERNET(0x4101),
> +	PSIL_ETHERNET(0x4102),
> +	PSIL_ETHERNET(0x4103),
> +	/* PRU_ICSSG1 */
> +	PSIL_ETHERNET(0x4200),
> +	PSIL_ETHERNET(0x4201),
> +	PSIL_ETHERNET(0x4202),
> +	PSIL_ETHERNET(0x4203),
> +	/* PDMA6 (PSIL_PDMA_MCASP_G0) - McASP0-2 */
> +	PSIL_PDMA_MCASP(0x4400),
> +	PSIL_PDMA_MCASP(0x4401),
> +	PSIL_PDMA_MCASP(0x4402),
> +	/* PDMA7 (PSIL_PDMA_MCASP_G1) - McASP3-11 */
> +	PSIL_PDMA_MCASP(0x4500),
> +	PSIL_PDMA_MCASP(0x4501),
> +	PSIL_PDMA_MCASP(0x4502),
> +	PSIL_PDMA_MCASP(0x4503),
> +	PSIL_PDMA_MCASP(0x4504),
> +	PSIL_PDMA_MCASP(0x4505),
> +	PSIL_PDMA_MCASP(0x4506),
> +	PSIL_PDMA_MCASP(0x4507),
> +	PSIL_PDMA_MCASP(0x4508),
> +	/* PDMA8 (PDMA_MISC_G0) - SPI0-1 */
> +	PSIL_PDMA_XY_PKT(0x4600),
> +	PSIL_PDMA_XY_PKT(0x4601),
> +	PSIL_PDMA_XY_PKT(0x4602),
> +	PSIL_PDMA_XY_PKT(0x4603),
> +	PSIL_PDMA_XY_PKT(0x4604),
> +	PSIL_PDMA_XY_PKT(0x4605),
> +	PSIL_PDMA_XY_PKT(0x4606),
> +	PSIL_PDMA_XY_PKT(0x4607),
> +	/* PDMA9 (PDMA_MISC_G1) - SPI2-3 */
> +	PSIL_PDMA_XY_PKT(0x460c),
> +	PSIL_PDMA_XY_PKT(0x460d),
> +	PSIL_PDMA_XY_PKT(0x460e),
> +	PSIL_PDMA_XY_PKT(0x460f),
> +	PSIL_PDMA_XY_PKT(0x4610),
> +	PSIL_PDMA_XY_PKT(0x4611),
> +	PSIL_PDMA_XY_PKT(0x4612),
> +	PSIL_PDMA_XY_PKT(0x4613),
> +	/* PDMA10 (PDMA_MISC_G2) - SPI4-5 */
> +	PSIL_PDMA_XY_PKT(0x4618),
> +	PSIL_PDMA_XY_PKT(0x4619),
> +	PSIL_PDMA_XY_PKT(0x461a),
> +	PSIL_PDMA_XY_PKT(0x461b),
> +	PSIL_PDMA_XY_PKT(0x461c),
> +	PSIL_PDMA_XY_PKT(0x461d),
> +	PSIL_PDMA_XY_PKT(0x461e),
> +	PSIL_PDMA_XY_PKT(0x461f),
> +	/* PDMA11 (PDMA_MISC_G3) */
> +	PSIL_PDMA_XY_PKT(0x4624),
> +	PSIL_PDMA_XY_PKT(0x4625),
> +	PSIL_PDMA_XY_PKT(0x4626),
> +	PSIL_PDMA_XY_PKT(0x4627),
> +	PSIL_PDMA_XY_PKT(0x4628),
> +	PSIL_PDMA_XY_PKT(0x4629),
> +	PSIL_PDMA_XY_PKT(0x4630),
> +	PSIL_PDMA_XY_PKT(0x463a),
> +	/* PDMA13 (PDMA_USART_G0) - UART0-1 */
> +	PSIL_PDMA_XY_PKT(0x4700),
> +	PSIL_PDMA_XY_PKT(0x4701),
> +	/* PDMA14 (PDMA_USART_G1) - UART2-3 */
> +	PSIL_PDMA_XY_PKT(0x4702),
> +	PSIL_PDMA_XY_PKT(0x4703),
> +	/* PDMA15 (PDMA_USART_G2) - UART4-9 */
> +	PSIL_PDMA_XY_PKT(0x4704),
> +	PSIL_PDMA_XY_PKT(0x4705),
> +	PSIL_PDMA_XY_PKT(0x4706),
> +	PSIL_PDMA_XY_PKT(0x4707),
> +	PSIL_PDMA_XY_PKT(0x4708),
> +	PSIL_PDMA_XY_PKT(0x4709),
> +	/* CPSW9 */
> +	PSIL_ETHERNET(0x4a00),
> +	/* CPSW0 */
> +	PSIL_ETHERNET(0x7000),
> +	/* MCU_PDMA0 (MCU_PDMA_MISC_G0) - SPI0 */
> +	PSIL_PDMA_XY_PKT(0x7100),
> +	PSIL_PDMA_XY_PKT(0x7101),
> +	PSIL_PDMA_XY_PKT(0x7102),
> +	PSIL_PDMA_XY_PKT(0x7103),
> +	/* MCU_PDMA1 (MCU_PDMA_MISC_G1) - SPI1-2 */
> +	PSIL_PDMA_XY_PKT(0x7200),
> +	PSIL_PDMA_XY_PKT(0x7201),
> +	PSIL_PDMA_XY_PKT(0x7202),
> +	PSIL_PDMA_XY_PKT(0x7203),
> +	PSIL_PDMA_XY_PKT(0x7204),
> +	PSIL_PDMA_XY_PKT(0x7205),
> +	PSIL_PDMA_XY_PKT(0x7206),
> +	PSIL_PDMA_XY_PKT(0x7207),
> +	/* MCU_PDMA2 (MCU_PDMA_MISC_G2) - UART0 */
> +	PSIL_PDMA_XY_PKT(0x7300),
> +	/* MCU_PDMA_ADC - ADC0-1 */
> +	PSIL_PDMA_XY_TR(0x7400),
> +	PSIL_PDMA_XY_TR(0x7401),
> +	PSIL_PDMA_XY_TR(0x7402),
> +	PSIL_PDMA_XY_TR(0x7403),
> +	/* SA2UL */
> +	PSIL_SA2UL(0x7500, 0),
> +	PSIL_SA2UL(0x7501, 0),
> +};
> +
> +/* PSI-L destination thread IDs, used for TX (DMA_MEM_TO_DEV) */
> +struct psil_ep j721e_dst_ep_map[] = {
> +	/* SA2UL */
> +	PSIL_SA2UL(0xc000, 1),
> +	/* PRU_ICSSG0 */
> +	PSIL_ETHERNET(0xc100),
> +	PSIL_ETHERNET(0xc101),
> +	PSIL_ETHERNET(0xc102),
> +	PSIL_ETHERNET(0xc103),
> +	PSIL_ETHERNET(0xc104),
> +	PSIL_ETHERNET(0xc105),
> +	PSIL_ETHERNET(0xc106),
> +	PSIL_ETHERNET(0xc107),
> +	/* PRU_ICSSG1 */
> +	PSIL_ETHERNET(0xc200),
> +	PSIL_ETHERNET(0xc201),
> +	PSIL_ETHERNET(0xc202),
> +	PSIL_ETHERNET(0xc203),
> +	PSIL_ETHERNET(0xc204),
> +	PSIL_ETHERNET(0xc205),
> +	PSIL_ETHERNET(0xc206),
> +	PSIL_ETHERNET(0xc207),
> +	/* CPSW9 */
> +	PSIL_ETHERNET(0xca00),
> +	PSIL_ETHERNET(0xca01),
> +	PSIL_ETHERNET(0xca02),
> +	PSIL_ETHERNET(0xca03),
> +	PSIL_ETHERNET(0xca04),
> +	PSIL_ETHERNET(0xca05),
> +	PSIL_ETHERNET(0xca06),
> +	PSIL_ETHERNET(0xca07),
> +	/* CPSW0 */
> +	PSIL_ETHERNET(0xf000),
> +	PSIL_ETHERNET(0xf001),
> +	PSIL_ETHERNET(0xf002),
> +	PSIL_ETHERNET(0xf003),
> +	PSIL_ETHERNET(0xf004),
> +	PSIL_ETHERNET(0xf005),
> +	PSIL_ETHERNET(0xf006),
> +	PSIL_ETHERNET(0xf007),
> +	/* SA2UL */
> +	PSIL_SA2UL(0xf500, 1),
> +};
> +
> +struct psil_ep_map j721e_ep_map = {
> +	.name = "j721e",
> +	.src = j721e_src_ep_map,
> +	.src_count = ARRAY_SIZE(j721e_src_ep_map),
> +	.dst = j721e_dst_ep_map,
> +	.dst_count = ARRAY_SIZE(j721e_dst_ep_map),
> +};
> diff --git a/drivers/dma/ti/k3-psil-priv.h b/drivers/dma/ti/k3-psil-priv.h
> new file mode 100644
> index 000000000000..f74420653d8a
> --- /dev/null
> +++ b/drivers/dma/ti/k3-psil-priv.h
> @@ -0,0 +1,39 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + *  Copyright (C) 2019 Texas Instruments Incorporated - http://www.ti.com
> + */
> +
> +#ifndef K3_PSIL_PRIV_H_
> +#define K3_PSIL_PRIV_H_
> +
> +#include <linux/dma/k3-psil.h>
> +
> +struct psil_ep {
> +	u32 thread_id;
> +	struct psil_endpoint_config ep_config;
> +};
> +
> +/**
> + * struct psil_ep_map - PSI-L thread ID configuration maps
> + * @name:	Name of the map, set it to the name of the SoC
> + * @src:	Array of source PSI-L thread configurations
> + * @src_count:	Number of entries in the src array
> + * @dst:	Array of destination PSI-L thread configurations
> + * @dst_count:	Number of entries in the dst array
> + *
> + * In case of symmetric configuration for a matching src/dst thread (for example
> + * 0x4400 and 0xc400) only the src configuration needs to be present. If no
> + * dst configuration is found, the code will look for (dst_thread_id & ~0x8000)
> + * to find the symmetric match.
> + */
> +struct psil_ep_map {
> +	char *name;
> +	struct psil_ep	*src;
> +	int src_count;
> +	struct psil_ep	*dst;
> +	int dst_count;
> +};
> +
> +struct psil_endpoint_config *psil_get_ep_config(u32 thread_id);
> +
> +#endif /* K3_PSIL_PRIV_H_ */
> diff --git a/drivers/dma/ti/k3-psil.c b/drivers/dma/ti/k3-psil.c
> new file mode 100644
> index 000000000000..e610022f09f4
> --- /dev/null
> +++ b/drivers/dma/ti/k3-psil.c
> @@ -0,0 +1,97 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + *  Copyright (C) 2019 Texas Instruments Incorporated - http://www.ti.com
> + *  Author: Peter Ujfalusi <peter.ujfalusi@ti.com>
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/device.h>
> +#include <linux/module.h>
> +#include <linux/mutex.h>
> +#include <linux/of.h>
> +
> +#include "k3-psil-priv.h"
> +
> +extern struct psil_ep_map am654_ep_map;
> +extern struct psil_ep_map j721e_ep_map;
> +
> +static DEFINE_MUTEX(ep_map_mutex);
> +static struct psil_ep_map *soc_ep_map;

So, you are only protecting the high-level soc_ep_map pointer. Don't you 
need to protect the database itself via some use counting or something, 
or are you doing that within the DMA driver?

-Tero

> +
> +struct psil_endpoint_config *psil_get_ep_config(u32 thread_id)
> +{
> +	int i;
> +
> +	mutex_lock(&ep_map_mutex);
> +	if (!soc_ep_map) {
> +		if (of_machine_is_compatible("ti,am654")) {
> +			soc_ep_map = &am654_ep_map;
> +		} else if (of_machine_is_compatible("ti,j721e")) {
> +			soc_ep_map = &j721e_ep_map;
> +		} else {
> +			pr_err("PSIL: No compatible machine found for map\n");
> +			mutex_unlock(&ep_map_mutex);
> +			return ERR_PTR(-ENOTSUPP);
> +		}
> +		pr_debug("%s: Using map for %s\n", __func__, soc_ep_map->name);
> +	}
> +	mutex_unlock(&ep_map_mutex);
> +
> +	if (thread_id & K3_PSIL_DST_THREAD_ID_OFFSET && soc_ep_map->dst) {
> +		/* check in destination thread map */
> +		for (i = 0; i < soc_ep_map->dst_count; i++) {
> +			if (soc_ep_map->dst[i].thread_id == thread_id)
> +				return &soc_ep_map->dst[i].ep_config;
> +		}
> +	}
> +
> +	thread_id &= ~K3_PSIL_DST_THREAD_ID_OFFSET;
> +	if (soc_ep_map->src) {
> +		for (i = 0; i < soc_ep_map->src_count; i++) {
> +			if (soc_ep_map->src[i].thread_id == thread_id)
> +				return &soc_ep_map->src[i].ep_config;
> +		}
> +	}
> +
> +	return ERR_PTR(-ENOENT);
> +}
> +EXPORT_SYMBOL(psil_get_ep_config);
> +
> +int psil_set_new_ep_config(struct device *dev, const char *name,
> +			   struct psil_endpoint_config *ep_config)
> +{
> +	struct psil_endpoint_config *dst_ep_config;
> +	struct of_phandle_args dma_spec;
> +	u32 thread_id;
> +	int index;
> +
> +	if (!dev || !dev->of_node)
> +		return -EINVAL;
> +
> +	index = of_property_match_string(dev->of_node, "dma-names", name);
> +	if (index < 0)
> +		return index;
> +
> +	if (of_parse_phandle_with_args(dev->of_node, "dmas", "#dma-cells",
> +				       index, &dma_spec))
> +		return -ENOENT;
> +
> +	thread_id = dma_spec.args[0];
> +
> +	dst_ep_config = psil_get_ep_config(thread_id);
> +	if (IS_ERR(dst_ep_config)) {
> +		pr_err("PSIL: thread ID 0x%04x not defined in map\n",
> +		       thread_id);
> +		of_node_put(dma_spec.np);
> +		return PTR_ERR(dst_ep_config);
> +	}
> +
> +	memcpy(dst_ep_config, ep_config, sizeof(*dst_ep_config));
> +
> +	of_node_put(dma_spec.np);
> +	return 0;
> +}
> +EXPORT_SYMBOL(psil_set_new_ep_config);
> +
> +MODULE_DESCRIPTION("TI K3 PSI-L endpoint database");
> +MODULE_AUTHOR("Peter Ujfalusi <peter.ujfalusi@ti.com>");
> +MODULE_LICENSE("GPL v2");
> diff --git a/include/linux/dma/k3-psil.h b/include/linux/dma/k3-psil.h
> new file mode 100644
> index 000000000000..16e9c8c6f839
> --- /dev/null
> +++ b/include/linux/dma/k3-psil.h
> @@ -0,0 +1,47 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + *  Copyright (C) 2019 Texas Instruments Incorporated - http://www.ti.com
> + */
> +
> +#ifndef K3_PSIL_H_
> +#define K3_PSIL_H_
> +
> +#include <linux/types.h>
> +
> +#define K3_PSIL_DST_THREAD_ID_OFFSET 0x8000
> +
> +struct device;
> +
> +/* Channel Throughput Levels */
> +enum udma_tp_level {
> +	UDMA_TP_NORMAL = 0,
> +	UDMA_TP_HIGH = 1,
> +	UDMA_TP_ULTRAHIGH = 2,
> +	UDMA_TP_LAST,
> +};
> +
> +enum psil_endpoint_type {
> +	PSIL_EP_NATIVE = 0,
> +	PSIL_EP_PDMA_XY,
> +	PSIL_EP_PDMA_MCAN,
> +	PSIL_EP_PDMA_AASRC,
> +};
> +
> +struct psil_endpoint_config {
> +	enum psil_endpoint_type ep_type;
> +
> +	unsigned pkt_mode:1;
> +	unsigned notdpkt:1;
> +	unsigned needs_epib:1;
> +	u32 psd_size;
> +	enum udma_tp_level channel_tpl;
> +
> +	/* PDMA properties, valid for PSIL_EP_PDMA_* */
> +	unsigned pdma_acc32:1;
> +	unsigned pdma_burst:1;
> +};
> +
> +int psil_set_new_ep_config(struct device *dev, const char *name,
> +			   struct psil_endpoint_config *ep_config);
> +
> +#endif /* K3_PSIL_H_ */
> 

--
Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki. Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki
Peter Ujfalusi Nov. 5, 2019, 8:13 a.m. UTC | #3
On 05/11/2019 9.49, Tero Kristo wrote:
> On 01/11/2019 10:41, Peter Ujfalusi wrote:
>> In the K3 architecture the DMA operates within threads. One end of the
>> thread is UDMAP, the other is on the peripheral side.
>>
>> The UDMAP channel configuration depends on the needs of the remote
>> endpoint and it can differ from peripheral to peripheral.
>>
>> This patch adds a database for am654 and j721e and a small API to fetch
>> the PSI-L endpoint configuration from the database, which should only be
>> used by the DMA driver(s).
>>
>> Another API is added for native peripherals to make it possible to pass
>> a new configuration for the threads they are using, which is needed to
>> handle changes caused, for example, by different firmware loaded for the
>> peripheral.
>>
>> Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
>> ---
>>   drivers/dma/ti/Kconfig         |   3 +
>>   drivers/dma/ti/Makefile        |   1 +
>>   drivers/dma/ti/k3-psil-am654.c | 172 ++++++++++++++++++++++++++
>>   drivers/dma/ti/k3-psil-j721e.c | 219 +++++++++++++++++++++++++++++++++
>>   drivers/dma/ti/k3-psil-priv.h  |  39 ++++++
>>   drivers/dma/ti/k3-psil.c       |  97 +++++++++++++++
>>   include/linux/dma/k3-psil.h    |  47 +++++++
>>   7 files changed, 578 insertions(+)
>>   create mode 100644 drivers/dma/ti/k3-psil-am654.c
>>   create mode 100644 drivers/dma/ti/k3-psil-j721e.c
>>   create mode 100644 drivers/dma/ti/k3-psil-priv.h
>>   create mode 100644 drivers/dma/ti/k3-psil.c
>>   create mode 100644 include/linux/dma/k3-psil.h

...

>> diff --git a/drivers/dma/ti/k3-psil.c b/drivers/dma/ti/k3-psil.c
>> new file mode 100644
>> index 000000000000..e610022f09f4
>> --- /dev/null
>> +++ b/drivers/dma/ti/k3-psil.c
>> @@ -0,0 +1,97 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + *  Copyright (C) 2019 Texas Instruments Incorporated -
>> http://www.ti.com
>> + *  Author: Peter Ujfalusi <peter.ujfalusi@ti.com>
>> + */
>> +
>> +#include <linux/kernel.h>
>> +#include <linux/device.h>
>> +#include <linux/module.h>
>> +#include <linux/mutex.h>
>> +#include <linux/of.h>
>> +
>> +#include "k3-psil-priv.h"
>> +
>> +extern struct psil_ep_map am654_ep_map;
>> +extern struct psil_ep_map j721e_ep_map;
>> +
>> +static DEFINE_MUTEX(ep_map_mutex);
>> +static struct psil_ep_map *soc_ep_map;
> 
> So, you are only protecting the high-level soc_ep_map pointer. Don't you
> need to protect the database itself via some use counting or something,
> or are you doing that within the DMA driver?

That's correct, I protect only the soc_ep_map.
The DMA drivers can look up threads concurrently; I just need to make
sure that the soc_ep_map is configured when the first
psil_get_ep_config() call comes.
After this the DMA drivers are free to look up things.

The ep_config update will be coming from the DMA client driver(s) and
not from the DMA driver. The client driver knows how its PSI-L
endpoint is configured, so it can update the default configuration
_before_ requesting a DMA channel.
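
The one-time map selection plus two-stage lookup can be modeled in plain C.
The following is an illustrative userspace sketch only: the struct layout and
map contents below are made up for the example, and the real tables live in
k3-psil-am654.c / k3-psil-j721e.c:

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Userspace stand-in for the PSI-L lookup; not kernel code. */
#define K3_PSIL_DST_THREAD_ID_OFFSET 0x8000

struct model_ep {
	uint32_t thread_id;
	int config;		/* stand-in for struct psil_endpoint_config */
};

static const struct model_ep src_map[] = {
	{ 0x4400, 1 },		/* hypothetical PDMA source thread */
};

static const struct model_ep dst_map[] = {
	{ 0xc000, 2 },		/* hypothetical SA2UL destination thread */
};

/* Mirrors the shape of psil_get_ep_config(): try the dst map for
 * threads with the 0x8000 bit set, then fall back to the symmetric
 * src entry with the bit cleared. */
static int model_get_ep_config(uint32_t thread_id)
{
	size_t i;

	if (thread_id & K3_PSIL_DST_THREAD_ID_OFFSET) {
		for (i = 0; i < sizeof(dst_map) / sizeof(dst_map[0]); i++)
			if (dst_map[i].thread_id == thread_id)
				return dst_map[i].config;
	}

	thread_id &= ~K3_PSIL_DST_THREAD_ID_OFFSET;
	for (i = 0; i < sizeof(src_map) / sizeof(src_map[0]); i++)
		if (src_map[i].thread_id == thread_id)
			return src_map[i].config;

	return -1;		/* stand-in for ERR_PTR(-ENOENT) */
}
```

So a destination thread such as 0xc400 with no explicit dst entry resolves to
the symmetric 0x4400 src configuration, as the k3-psil-priv.h comment describes.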

> 
> -Tero
> 
>> +
>> +struct psil_endpoint_config *psil_get_ep_config(u32 thread_id)
>> +{
>> +    int i;
>> +
>> +    mutex_lock(&ep_map_mutex);
>> +    if (!soc_ep_map) {
>> +        if (of_machine_is_compatible("ti,am654")) {
>> +            soc_ep_map = &am654_ep_map;
>> +        } else if (of_machine_is_compatible("ti,j721e")) {
>> +            soc_ep_map = &j721e_ep_map;
>> +        } else {
>> +            pr_err("PSIL: No compatible machine found for map\n");
>> +            mutex_unlock(&ep_map_mutex);
>> +            return ERR_PTR(-ENOTSUPP);
>> +        }
>> +        pr_debug("%s: Using map for %s\n", __func__, soc_ep_map->name);
>> +    }
>> +    mutex_unlock(&ep_map_mutex);
>> +
>> +    if (thread_id & K3_PSIL_DST_THREAD_ID_OFFSET && soc_ep_map->dst) {
>> +        /* check in destination thread map */
>> +        for (i = 0; i < soc_ep_map->dst_count; i++) {
>> +            if (soc_ep_map->dst[i].thread_id == thread_id)
>> +                return &soc_ep_map->dst[i].ep_config;
>> +        }
>> +    }
>> +
>> +    thread_id &= ~K3_PSIL_DST_THREAD_ID_OFFSET;
>> +    if (soc_ep_map->src) {
>> +        for (i = 0; i < soc_ep_map->src_count; i++) {
>> +            if (soc_ep_map->src[i].thread_id == thread_id)
>> +                return &soc_ep_map->src[i].ep_config;
>> +        }
>> +    }
>> +
>> +    return ERR_PTR(-ENOENT);
>> +}
>> +EXPORT_SYMBOL(psil_get_ep_config);
>> +
>> +int psil_set_new_ep_config(struct device *dev, const char *name,
>> +               struct psil_endpoint_config *ep_config)
>> +{
>> +    struct psil_endpoint_config *dst_ep_config;
>> +    struct of_phandle_args dma_spec;
>> +    u32 thread_id;
>> +    int index;
>> +
>> +    if (!dev || !dev->of_node)
>> +        return -EINVAL;
>> +
>> +    index = of_property_match_string(dev->of_node, "dma-names", name);
>> +    if (index < 0)
>> +        return index;
>> +
>> +    if (of_parse_phandle_with_args(dev->of_node, "dmas", "#dma-cells",
>> +                       index, &dma_spec))
>> +        return -ENOENT;
>> +
>> +    thread_id = dma_spec.args[0];
>> +
>> +    dst_ep_config = psil_get_ep_config(thread_id);
>> +    if (IS_ERR(dst_ep_config)) {
>> +        pr_err("PSIL: thread ID 0x%04x not defined in map\n",
>> +               thread_id);
>> +        of_node_put(dma_spec.np);
>> +        return PTR_ERR(dst_ep_config);
>> +    }
>> +
>> +    memcpy(dst_ep_config, ep_config, sizeof(*dst_ep_config));
>> +
>> +    of_node_put(dma_spec.np);
>> +    return 0;
>> +}
>> +EXPORT_SYMBOL(psil_set_new_ep_config);
>> +
>> +MODULE_DESCRIPTION("TI K3 PSI-L endpoint database");
>> +MODULE_AUTHOR("Peter Ujfalusi <peter.ujfalusi@ti.com>");
>> +MODULE_LICENSE("GPL v2");
>> diff --git a/include/linux/dma/k3-psil.h b/include/linux/dma/k3-psil.h
>> new file mode 100644
>> index 000000000000..16e9c8c6f839
>> --- /dev/null
>> +++ b/include/linux/dma/k3-psil.h
>> @@ -0,0 +1,47 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/*
>> + *  Copyright (C) 2019 Texas Instruments Incorporated -
>> http://www.ti.com
>> + */
>> +
>> +#ifndef K3_PSIL_H_
>> +#define K3_PSIL_H_
>> +
>> +#include <linux/types.h>
>> +
>> +#define K3_PSIL_DST_THREAD_ID_OFFSET 0x8000
>> +
>> +struct device;
>> +
>> +/* Channel Throughput Levels */
>> +enum udma_tp_level {
>> +    UDMA_TP_NORMAL = 0,
>> +    UDMA_TP_HIGH = 1,
>> +    UDMA_TP_ULTRAHIGH = 2,
>> +    UDMA_TP_LAST,
>> +};
>> +
>> +enum psil_endpoint_type {
>> +    PSIL_EP_NATIVE = 0,
>> +    PSIL_EP_PDMA_XY,
>> +    PSIL_EP_PDMA_MCAN,
>> +    PSIL_EP_PDMA_AASRC,
>> +};
>> +
>> +struct psil_endpoint_config {
>> +    enum psil_endpoint_type ep_type;
>> +
>> +    unsigned pkt_mode:1;
>> +    unsigned notdpkt:1;
>> +    unsigned needs_epib:1;
>> +    u32 psd_size;
>> +    enum udma_tp_level channel_tpl;
>> +
>> +    /* PDMA properties, valid for PSIL_EP_PDMA_* */
>> +    unsigned pdma_acc32:1;
>> +    unsigned pdma_burst:1;
>> +};
>> +
>> +int psil_set_new_ep_config(struct device *dev, const char *name,
>> +               struct psil_endpoint_config *ep_config);
>> +
>> +#endif /* K3_PSIL_H_ */
>>
> 

- Péter

Grygorii Strashko Nov. 5, 2019, 10 a.m. UTC | #4
Hi Peter,

On 01/11/2019 10:41, Peter Ujfalusi wrote:
> In the K3 architecture the DMA operates within threads. One end of the
> thread is UDMAP, the other is on the peripheral side.
>
> The UDMAP channel configuration depends on the needs of the remote
> endpoint and it can differ from peripheral to peripheral.
>
> This patch adds a database for am654 and j721e and a small API to fetch
> the PSI-L endpoint configuration from the database, which should only be
> used by the DMA driver(s).
>
> Another API is added for native peripherals to make it possible to pass
> a new configuration for the threads they are using, which is needed to
> handle changes caused, for example, by different firmware loaded for the
> peripheral.

I have no objection to this approach, but ...

> 
> Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
> ---
>   drivers/dma/ti/Kconfig         |   3 +
>   drivers/dma/ti/Makefile        |   1 +
>   drivers/dma/ti/k3-psil-am654.c | 172 ++++++++++++++++++++++++++
>   drivers/dma/ti/k3-psil-j721e.c | 219 +++++++++++++++++++++++++++++++++
>   drivers/dma/ti/k3-psil-priv.h  |  39 ++++++
>   drivers/dma/ti/k3-psil.c       |  97 +++++++++++++++
>   include/linux/dma/k3-psil.h    |  47 +++++++
>   7 files changed, 578 insertions(+)
>   create mode 100644 drivers/dma/ti/k3-psil-am654.c
>   create mode 100644 drivers/dma/ti/k3-psil-j721e.c
>   create mode 100644 drivers/dma/ti/k3-psil-priv.h
>   create mode 100644 drivers/dma/ti/k3-psil.c
>   create mode 100644 include/linux/dma/k3-psil.h
> 

[...]

> diff --git a/include/linux/dma/k3-psil.h b/include/linux/dma/k3-psil.h
> new file mode 100644
> index 000000000000..16e9c8c6f839
> --- /dev/null
> +++ b/include/linux/dma/k3-psil.h
> @@ -0,0 +1,47 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + *  Copyright (C) 2019 Texas Instruments Incorporated - http://www.ti.com
> + */
> +
> +#ifndef K3_PSIL_H_
> +#define K3_PSIL_H_
> +
> +#include <linux/types.h>
> +
> +#define K3_PSIL_DST_THREAD_ID_OFFSET 0x8000
> +
> +struct device;
> +
> +/* Channel Throughput Levels */
> +enum udma_tp_level {
> +	UDMA_TP_NORMAL = 0,
> +	UDMA_TP_HIGH = 1,
> +	UDMA_TP_ULTRAHIGH = 2,
> +	UDMA_TP_LAST,
> +};
> +
> +enum psil_endpoint_type {
> +	PSIL_EP_NATIVE = 0,
> +	PSIL_EP_PDMA_XY,
> +	PSIL_EP_PDMA_MCAN,
> +	PSIL_EP_PDMA_AASRC,
> +};
> +
> +struct psil_endpoint_config {
> +	enum psil_endpoint_type ep_type;
> +
> +	unsigned pkt_mode:1;
> +	unsigned notdpkt:1;
> +	unsigned needs_epib:1;
> +	u32 psd_size;
> +	enum udma_tp_level channel_tpl;
> +
> +	/* PDMA properties, valid for PSIL_EP_PDMA_* */
> +	unsigned pdma_acc32:1;
> +	unsigned pdma_burst:1;
> +};
> +
> +int psil_set_new_ep_config(struct device *dev, const char *name,
> +			   struct psil_endpoint_config *ep_config);
> +
> +#endif /* K3_PSIL_H_ */
> 

I see no user now of this public interface, so I think it better to drop it until
there will be real user of it.
Peter Ujfalusi Nov. 5, 2019, 10:27 a.m. UTC | #5
On 05/11/2019 12.00, Grygorii Strashko wrote:
> Hi Peter,
> 
> On 01/11/2019 10:41, Peter Ujfalusi wrote:
>> In the K3 architecture the DMA operates within threads. One end of the
>> thread is UDMAP, the other is on the peripheral side.
>>
>> The UDMAP channel configuration depends on the needs of the remote
>> endpoint and it can differ from peripheral to peripheral.
>>
>> This patch adds a database for am654 and j721e and a small API to fetch
>> the PSI-L endpoint configuration from the database, which should only be
>> used by the DMA driver(s).
>>
>> Another API is added for native peripherals to make it possible to pass
>> a new configuration for the threads they are using, which is needed to
>> handle changes caused, for example, by different firmware loaded for the
>> peripheral.
> 
> I have no objection to this approach, but ...
> 
>>
>> Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
>> ---
>>   drivers/dma/ti/Kconfig         |   3 +
>>   drivers/dma/ti/Makefile        |   1 +
>>   drivers/dma/ti/k3-psil-am654.c | 172 ++++++++++++++++++++++++++
>>   drivers/dma/ti/k3-psil-j721e.c | 219 +++++++++++++++++++++++++++++++++
>>   drivers/dma/ti/k3-psil-priv.h  |  39 ++++++
>>   drivers/dma/ti/k3-psil.c       |  97 +++++++++++++++
>>   include/linux/dma/k3-psil.h    |  47 +++++++
>>   7 files changed, 578 insertions(+)
>>   create mode 100644 drivers/dma/ti/k3-psil-am654.c
>>   create mode 100644 drivers/dma/ti/k3-psil-j721e.c
>>   create mode 100644 drivers/dma/ti/k3-psil-priv.h
>>   create mode 100644 drivers/dma/ti/k3-psil.c
>>   create mode 100644 include/linux/dma/k3-psil.h
>>
> 
> [...]
> 
>> diff --git a/include/linux/dma/k3-psil.h b/include/linux/dma/k3-psil.h
>> new file mode 100644
>> index 000000000000..16e9c8c6f839
>> --- /dev/null
>> +++ b/include/linux/dma/k3-psil.h
>> @@ -0,0 +1,47 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/*
>> + *  Copyright (C) 2019 Texas Instruments Incorporated -
>> http://www.ti.com
>> + */
>> +
>> +#ifndef K3_PSIL_H_
>> +#define K3_PSIL_H_
>> +
>> +#include <linux/types.h>
>> +
>> +#define K3_PSIL_DST_THREAD_ID_OFFSET 0x8000
>> +
>> +struct device;
>> +
>> +/* Channel Throughput Levels */
>> +enum udma_tp_level {
>> +    UDMA_TP_NORMAL = 0,
>> +    UDMA_TP_HIGH = 1,
>> +    UDMA_TP_ULTRAHIGH = 2,
>> +    UDMA_TP_LAST,
>> +};
>> +
>> +enum psil_endpoint_type {
>> +    PSIL_EP_NATIVE = 0,
>> +    PSIL_EP_PDMA_XY,
>> +    PSIL_EP_PDMA_MCAN,
>> +    PSIL_EP_PDMA_AASRC,
>> +};
>> +
>> +struct psil_endpoint_config {
>> +    enum psil_endpoint_type ep_type;
>> +
>> +    unsigned pkt_mode:1;
>> +    unsigned notdpkt:1;
>> +    unsigned needs_epib:1;
>> +    u32 psd_size;
>> +    enum udma_tp_level channel_tpl;
>> +
>> +    /* PDMA properties, valid for PSIL_EP_PDMA_* */
>> +    unsigned pdma_acc32:1;
>> +    unsigned pdma_burst:1;
>> +};
>> +
>> +int psil_set_new_ep_config(struct device *dev, const char *name,
>> +               struct psil_endpoint_config *ep_config);
>> +
>> +#endif /* K3_PSIL_H_ */
>>
> 
> I see no user now of this public interface, so I think it better to drop
> it until
> there will be real user of it.

The same argument is valid for the glue layer ;)

This is only going to be used by native PSI-L devices, and the
psil_endpoint_config is going to be extended to facilitate their need
to give information to the DMA driver on how to set things up.

I would rather add the support from the start than cause churn later on.

The point is that the PSI-L endpoint configuration is part of the PSI-L
peripheral and, depending on various factors, these configurations might
differ from the default one. For example, we may want to merge the two
physical rx channels for sa2ul (so they use the same rflow), or there may
be other things we (I) cannot foresee yet.
Or different firmware may be loaded for them, affecting their PSI-L
configuration.

- Péter

Grygorii Strashko Nov. 5, 2019, 11:25 a.m. UTC | #6
On 05/11/2019 12:27, Peter Ujfalusi wrote:
> 
> 
> On 05/11/2019 12.00, Grygorii Strashko wrote:
>> Hi Peter,
>>
>> On 01/11/2019 10:41, Peter Ujfalusi wrote:
>>> In the K3 architecture the DMA operates within threads. One end of the
>>> thread is UDMAP, the other is on the peripheral side.
>>>
>>> The UDMAP channel configuration depends on the needs of the remote
>>> endpoint and it can differ from peripheral to peripheral.
>>>
>>> This patch adds a database for am654 and j721e and a small API to fetch
>>> the PSI-L endpoint configuration from the database, which should only be
>>> used by the DMA driver(s).
>>>
>>> Another API is added for native peripherals to make it possible to pass
>>> a new configuration for the threads they are using, which is needed to
>>> handle changes caused, for example, by different firmware loaded for the
>>> peripheral.
>>
>> I have no objection to this approach, but ...
>>
>>>
>>> Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
>>> ---
>>>    drivers/dma/ti/Kconfig         |   3 +
>>>    drivers/dma/ti/Makefile        |   1 +
>>>    drivers/dma/ti/k3-psil-am654.c | 172 ++++++++++++++++++++++++++
>>>    drivers/dma/ti/k3-psil-j721e.c | 219 +++++++++++++++++++++++++++++++++
>>>    drivers/dma/ti/k3-psil-priv.h  |  39 ++++++
>>>    drivers/dma/ti/k3-psil.c       |  97 +++++++++++++++
>>>    include/linux/dma/k3-psil.h    |  47 +++++++
>>>    7 files changed, 578 insertions(+)
>>>    create mode 100644 drivers/dma/ti/k3-psil-am654.c
>>>    create mode 100644 drivers/dma/ti/k3-psil-j721e.c
>>>    create mode 100644 drivers/dma/ti/k3-psil-priv.h
>>>    create mode 100644 drivers/dma/ti/k3-psil.c
>>>    create mode 100644 include/linux/dma/k3-psil.h
>>>
>>
>> [...]
>>
>>> diff --git a/include/linux/dma/k3-psil.h b/include/linux/dma/k3-psil.h
>>> new file mode 100644
>>> index 000000000000..16e9c8c6f839
>>> --- /dev/null
>>> +++ b/include/linux/dma/k3-psil.h
>>> @@ -0,0 +1,47 @@
>>> +/* SPDX-License-Identifier: GPL-2.0 */
>>> +/*
>>> + *  Copyright (C) 2019 Texas Instruments Incorporated -
>>> http://www.ti.com
>>> + */
>>> +
>>> +#ifndef K3_PSIL_H_
>>> +#define K3_PSIL_H_
>>> +
>>> +#include <linux/types.h>
>>> +
>>> +#define K3_PSIL_DST_THREAD_ID_OFFSET 0x8000
>>> +
>>> +struct device;
>>> +
>>> +/* Channel Throughput Levels */
>>> +enum udma_tp_level {
>>> +    UDMA_TP_NORMAL = 0,
>>> +    UDMA_TP_HIGH = 1,
>>> +    UDMA_TP_ULTRAHIGH = 2,
>>> +    UDMA_TP_LAST,
>>> +};
>>> +
>>> +enum psil_endpoint_type {
>>> +    PSIL_EP_NATIVE = 0,
>>> +    PSIL_EP_PDMA_XY,
>>> +    PSIL_EP_PDMA_MCAN,
>>> +    PSIL_EP_PDMA_AASRC,
>>> +};
>>> +
>>> +struct psil_endpoint_config {
>>> +    enum psil_endpoint_type ep_type;
>>> +
>>> +    unsigned pkt_mode:1;
>>> +    unsigned notdpkt:1;
>>> +    unsigned needs_epib:1;
>>> +    u32 psd_size;
>>> +    enum udma_tp_level channel_tpl;
>>> +
>>> +    /* PDMA properties, valid for PSIL_EP_PDMA_* */
>>> +    unsigned pdma_acc32:1;
>>> +    unsigned pdma_burst:1;
>>> +};
>>> +
>>> +int psil_set_new_ep_config(struct device *dev, const char *name,
>>> +               struct psil_endpoint_config *ep_config);
>>> +
>>> +#endif /* K3_PSIL_H_ */
>>>
>>
>> I see no user now of this public interface, so I think it better to drop
>> it until
>> there will be real user of it.
> 
> The same argument is valid for the glue layer ;)
> 
> This is only going to be used by native PSI-L devices and the
> psil_endpoint_config is going to be extended to facilitate their needs
> to give information to the DMA driver on how to set things up.
> 
> I would rather avoid churn later on than adding the support from the start.
> 
> The point is that the PSI-L endpoint configuration is part of the PSI-L
> peripheral and based on factors these configurations might differ from
> the default one. For example if we want to merge the two physical rx
> channel for sa2ul (so they use the same rflow) or other things we (I)
> can not foresee yet.
> Or if different firmware is loaded for them and it affects their PSI-L
> configuration.

Ok. It's up to you.

otherwise:
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Vinod Koul Nov. 11, 2019, 4:21 a.m. UTC | #7
On 01-11-19, 10:41, Peter Ujfalusi wrote:
> From: Grygorii Strashko <grygorii.strashko@ti.com>

> +config TI_K3_RINGACC
> +	tristate "K3 Ring accelerator Sub System"
> +	depends on ARCH_K3 || COMPILE_TEST
> +	depends on TI_SCI_INTA_IRQCHIP
> +	default y

You want to get an earful from Linus? We don't do default y on new stuff,
never :)

> +struct k3_ring_rt_regs {
> +	u32	resv_16[4];
> +	u32	db;		/* RT Ring N Doorbell Register */
> +	u32	resv_4[1];
> +	u32	occ;		/* RT Ring N Occupancy Register */
> +	u32	indx;		/* RT Ring N Current Index Register */
> +	u32	hwocc;		/* RT Ring N Hardware Occupancy Register */
> +	u32	hwindx;		/* RT Ring N Current Index Register */

nice comments, how about moving them up into kernel-doc style? (here and
other places as well)


> +struct k3_ring *k3_ringacc_request_ring(struct k3_ringacc *ringacc,
> +					int id, u32 flags)
> +{
> +	int proxy_id = K3_RINGACC_PROXY_NOT_USED;
> +
> +	mutex_lock(&ringacc->req_lock);
> +
> +	if (id == K3_RINGACC_RING_ID_ANY) {
> +		/* Request for any general purpose ring */
> +		struct ti_sci_resource_desc *gp_rings =
> +						&ringacc->rm_gp_range->desc[0];
> +		unsigned long size;
> +
> +		size = gp_rings->start + gp_rings->num;
> +		id = find_next_zero_bit(ringacc->rings_inuse, size,
> +					gp_rings->start);
> +		if (id == size)
> +			goto error;
> +	} else if (id < 0) {
> +		goto error;
> +	}
> +
> +	if (test_bit(id, ringacc->rings_inuse) &&
> +	    !(ringacc->rings[id].flags & K3_RING_FLAG_SHARED))
> +		goto error;
> +	else if (ringacc->rings[id].flags & K3_RING_FLAG_SHARED)
> +		goto out;
> +
> +	if (flags & K3_RINGACC_RING_USE_PROXY) {
> +		proxy_id = find_next_zero_bit(ringacc->proxy_inuse,
> +					      ringacc->num_proxies, 0);
> +		if (proxy_id == ringacc->num_proxies)
> +			goto error;
> +	}
> +
> +	if (!try_module_get(ringacc->dev->driver->owner))
> +		goto error;

should this not be one of the first things to do?

> +
> +	if (proxy_id != K3_RINGACC_PROXY_NOT_USED) {
> +		set_bit(proxy_id, ringacc->proxy_inuse);
> +		ringacc->rings[id].proxy_id = proxy_id;
> +		dev_dbg(ringacc->dev, "Giving ring#%d proxy#%d\n", id,
> +			proxy_id);
> +	} else {
> +		dev_dbg(ringacc->dev, "Giving ring#%d\n", id);
> +	}

how about removing the else and doing a common print?
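
As background for the K3_RINGACC_RING_ID_ANY branch above: it is a plain
first-fit search over an in-use bitmap. A minimal userspace stand-in for
find_next_zero_bit() (byte-per-bit for clarity, not the kernel's packed
bitmap layout):

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Returns the index of the first zero entry in [start, size), or size
 * if every entry is set -- the same "not found" convention the kernel
 * helper uses, which is why the caller compares the result to size. */
static size_t find_next_zero(const uint8_t *inuse, size_t size, size_t start)
{
	size_t i;

	for (i = start; i < size; i++)
		if (!inuse[i])
			return i;
	return size;
}
```

In the driver the search window is gp_rings->start .. gp_rings->start +
gp_rings->num, so only general-purpose rings are handed out on the "any" path.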
Vinod Koul Nov. 11, 2019, 4:39 a.m. UTC | #8
On 01-11-19, 10:41, Peter Ujfalusi wrote:
> A DMA hardware can have big cache or FIFO and the amount of data sitting in
> the DMA fabric can be an interest for the clients.
> 
> For example in audio we want to know the delay in the data flow and in case
> the DMA have significantly large FIFO/cache, it can affect the latenc/delay
> 
> Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
> Reviewed-by: Tero Kristo <t-kristo@ti.com>
> ---
>  drivers/dma/dmaengine.h   | 8 ++++++++
>  include/linux/dmaengine.h | 2 ++
>  2 files changed, 10 insertions(+)
> 
> diff --git a/drivers/dma/dmaengine.h b/drivers/dma/dmaengine.h
> index 501c0b063f85..b0b97475707a 100644
> --- a/drivers/dma/dmaengine.h
> +++ b/drivers/dma/dmaengine.h
> @@ -77,6 +77,7 @@ static inline enum dma_status dma_cookie_status(struct dma_chan *chan,
>  		state->last = complete;
>  		state->used = used;
>  		state->residue = 0;
> +		state->in_flight_bytes = 0;
>  	}
>  	return dma_async_is_complete(cookie, complete, used);
>  }
> @@ -87,6 +88,13 @@ static inline void dma_set_residue(struct dma_tx_state *state, u32 residue)
>  		state->residue = residue;
>  }
>  
> +static inline void dma_set_in_flight_bytes(struct dma_tx_state *state,
> +					   u32 in_flight_bytes)
> +{
> +	if (state)
> +		state->in_flight_bytes = in_flight_bytes;
> +}
> +
>  struct dmaengine_desc_callback {
>  	dma_async_tx_callback callback;
>  	dma_async_tx_callback_result callback_result;
> diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
> index 0e8b426bbde9..c4c5219030a6 100644
> --- a/include/linux/dmaengine.h
> +++ b/include/linux/dmaengine.h
> @@ -682,11 +682,13 @@ static inline struct dma_async_tx_descriptor *txd_next(struct dma_async_tx_descr
>   * @residue: the remaining number of bytes left to transmit
>   *	on the selected transfer for states DMA_IN_PROGRESS and
>   *	DMA_PAUSED if this is implemented in the driver, else 0
> + * @in_flight_bytes: amount of data in bytes cached by the DMA.
>   */
>  struct dma_tx_state {
>  	dma_cookie_t last;
>  	dma_cookie_t used;
>  	u32 residue;
> +	u32 in_flight_bytes;

Should we add this here or use the dmaengine_result()?
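
For context, the motivating audio use case adds residue and in_flight_bytes
to get the total amount of data not yet consumed by the peripheral. A hedged
userspace sketch of that arithmetic (the struct name here is a stand-in; the
real fields live in struct dma_tx_state):

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the relevant struct dma_tx_state fields. */
struct tx_state_model {
	uint32_t residue;		/* bytes still queued in memory */
	uint32_t in_flight_bytes;	/* bytes cached in the DMA FIFO */
};

/* Bytes not yet consumed by the peripheral. */
static uint32_t pending_bytes(const struct tx_state_model *st)
{
	return st->residue + st->in_flight_bytes;
}

/* The same quantity expressed as an audio delay in frames. */
static uint32_t delay_frames(const struct tx_state_model *st,
			     uint32_t bytes_per_frame)
{
	return pending_bytes(st) / bytes_per_frame;
}
```

A client that ignores in_flight_bytes underestimates its delay by exactly
the amount the DMA fabric is holding, which is the problem the patch targets.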
Vinod Koul Nov. 11, 2019, 4:47 a.m. UTC | #9
On 01-11-19, 10:41, Peter Ujfalusi wrote:

> --- /dev/null
> +++ b/drivers/dma/ti/k3-psil.c
> @@ -0,0 +1,97 @@
> +// SPDX-License-Identifier: GPL-2.0

...

> +extern struct psil_ep_map am654_ep_map;
> +extern struct psil_ep_map j721e_ep_map;
> +
> +static DEFINE_MUTEX(ep_map_mutex);
> +static struct psil_ep_map *soc_ep_map;
> +
> +struct psil_endpoint_config *psil_get_ep_config(u32 thread_id)
> +{
> +	int i;
> +
> +	mutex_lock(&ep_map_mutex);
> +	if (!soc_ep_map) {
> +		if (of_machine_is_compatible("ti,am654")) {
> +			soc_ep_map = &am654_ep_map;
> +		} else if (of_machine_is_compatible("ti,j721e")) {
> +			soc_ep_map = &j721e_ep_map;
> +		} else {
> +			pr_err("PSIL: No compatible machine found for map\n");
> +			mutex_unlock(&ep_map_mutex);
> +			return ERR_PTR(-ENOTSUPP);
> +		}
> +		pr_debug("%s: Using map for %s\n", __func__, soc_ep_map->name);
> +	}
> +	mutex_unlock(&ep_map_mutex);
> +
> +	if (thread_id & K3_PSIL_DST_THREAD_ID_OFFSET && soc_ep_map->dst) {
> +		/* check in destination thread map */
> +		for (i = 0; i < soc_ep_map->dst_count; i++) {
> +			if (soc_ep_map->dst[i].thread_id == thread_id)
> +				return &soc_ep_map->dst[i].ep_config;
> +		}
> +	}
> +
> +	thread_id &= ~K3_PSIL_DST_THREAD_ID_OFFSET;
> +	if (soc_ep_map->src) {
> +		for (i = 0; i < soc_ep_map->src_count; i++) {
> +			if (soc_ep_map->src[i].thread_id == thread_id)
> +				return &soc_ep_map->src[i].ep_config;
> +		}
> +	}
> +
> +	return ERR_PTR(-ENOENT);
> +}
> +EXPORT_SYMBOL(psil_get_ep_config);

This doesn't match the license of this module, we need it to be
EXPORT_SYMBOL_GPL
Vinod Koul Nov. 11, 2019, 5:28 a.m. UTC | #10
On 01-11-19, 10:41, Peter Ujfalusi wrote:

> +struct udma_chan {
> +	struct virt_dma_chan vc;
> +	struct dma_slave_config	cfg;
> +	struct udma_dev *ud;
> +	struct udma_desc *desc;
> +	struct udma_desc *terminated_desc;

descriptor and not a list?

> +	struct udma_static_tr static_tr;
> +	char *name;
> +
> +	struct udma_tchan *tchan;
> +	struct udma_rchan *rchan;
> +	struct udma_rflow *rflow;
> +
> +	bool psil_paired;
> +
> +	int irq_num_ring;
> +	int irq_num_udma;
> +
> +	bool cyclic;
> +	bool paused;
> +
> +	enum udma_chan_state state;
> +	struct completion teardown_completed;
> +
> +	u32 bcnt; /* number of bytes completed since the start of the channel */
> +	u32 in_ring_cnt; /* number of descriptors in flight */
> +
> +	bool pkt_mode; /* TR or packet */
> +	bool needs_epib; /* EPIB is needed for the communication or not */
> +	u32 psd_size; /* size of Protocol Specific Data */
> +	u32 metadata_size; /* (needs_epib ? 16:0) + psd_size */
> +	u32 hdesc_size; /* Size of a packet descriptor in packet mode */
> +	bool notdpkt; /* Suppress sending TDC packet */
> +	int remote_thread_id;
> +	u32 src_thread;
> +	u32 dst_thread;
> +	enum psil_endpoint_type ep_type;
> +	bool enable_acc32;
> +	bool enable_burst;
> +	enum udma_tp_level channel_tpl; /* Channel Throughput Level */
> +
> +	/* dmapool for packet mode descriptors */
> +	bool use_dma_pool;
> +	struct dma_pool *hdesc_pool;
> +
> +	u32 id;
> +	enum dma_transfer_direction dir;

why does the channel have this? It already exists in the descriptor

> +static irqreturn_t udma_udma_irq_handler(int irq, void *data)
> +{
> +	struct udma_chan *uc = data;
> +
> +	udma_tr_event_callback(uc);

any reason why we want to call a fn and not put the code here?
Vinod Koul Nov. 11, 2019, 5:33 a.m. UTC | #11
On 01-11-19, 10:41, Peter Ujfalusi wrote:

> +static bool udma_dma_filter_fn(struct dma_chan *chan, void *param)
> +{
> +	struct psil_endpoint_config *ep_config;
> +	struct udma_chan *uc;
> +	struct udma_dev *ud;
> +	u32 *args;
> +
> +	if (chan->device->dev->driver != &udma_driver.driver)
> +		return false;
> +
> +	uc = to_udma_chan(chan);
> +	ud = uc->ud;
> +	args = param;
> +	uc->remote_thread_id = args[0];
> +
> +	if (uc->remote_thread_id & K3_PSIL_DST_THREAD_ID_OFFSET)
> +		uc->dir = DMA_MEM_TO_DEV;
> +	else
> +		uc->dir = DMA_DEV_TO_MEM;

Can you explain this a bit?

> +static int udma_remove(struct platform_device *pdev)
> +{
> +	struct udma_dev *ud = platform_get_drvdata(pdev);
> +
> +	of_dma_controller_free(pdev->dev.of_node);
> +	dma_async_device_unregister(&ud->ddev);
> +
> +	/* Make sure that we did proper cleanup */
> +	cancel_work_sync(&ud->purge_work);
> +	udma_purge_desc_work(&ud->purge_work);

kill the vchan tasklets at it too please
Vinod Koul Nov. 11, 2019, 6:06 a.m. UTC | #12
On 01-11-19, 10:41, Peter Ujfalusi wrote:
> Split patch for review containing: channel rsource allocation and free

s/rsource/resource


> +static int udma_tisci_tx_channel_config(struct udma_chan *uc)
> +{
> +	struct udma_dev *ud = uc->ud;
> +	struct udma_tisci_rm *tisci_rm = &ud->tisci_rm;
> +	const struct ti_sci_rm_udmap_ops *tisci_ops = tisci_rm->tisci_udmap_ops;
> +	struct udma_tchan *tchan = uc->tchan;
> +	int tc_ring = k3_ringacc_get_ring_id(tchan->tc_ring);
> +	struct ti_sci_msg_rm_udmap_tx_ch_cfg req_tx = { 0 };
> +	u32 mode, fetch_size;
> +	int ret = 0;
> +
> +	if (uc->pkt_mode) {
> +		mode = TI_SCI_RM_UDMAP_CHAN_TYPE_PKT_PBRR;
> +		fetch_size = cppi5_hdesc_calc_size(uc->needs_epib, uc->psd_size,
> +						   0);
> +	} else {
> +		mode = TI_SCI_RM_UDMAP_CHAN_TYPE_3RDP_PBRR;
> +		fetch_size = sizeof(struct cppi5_desc_hdr_t);
> +	}
> +
> +	req_tx.valid_params =
> +		TI_SCI_MSG_VALUE_RM_UDMAP_CH_PAUSE_ON_ERR_VALID |
> +		TI_SCI_MSG_VALUE_RM_UDMAP_CH_TX_FILT_EINFO_VALID |
> +		TI_SCI_MSG_VALUE_RM_UDMAP_CH_TX_FILT_PSWORDS_VALID |
> +		TI_SCI_MSG_VALUE_RM_UDMAP_CH_CHAN_TYPE_VALID |
> +		TI_SCI_MSG_VALUE_RM_UDMAP_CH_TX_SUPR_TDPKT_VALID |
> +		TI_SCI_MSG_VALUE_RM_UDMAP_CH_FETCH_SIZE_VALID |
> +		TI_SCI_MSG_VALUE_RM_UDMAP_CH_CQ_QNUM_VALID;

bunch of these are repeat, you can define a COMMON_VALID_PARAMS and use
that + specific ones..

> +
> +	req_tx.nav_id = tisci_rm->tisci_dev_id;
> +	req_tx.index = tchan->id;
> +	req_tx.tx_pause_on_err = 0;
> +	req_tx.tx_filt_einfo = 0;
> +	req_tx.tx_filt_pswords = 0;

i think initialization to 0 is superfluous

> +	req_tx.tx_chan_type = mode;
> +	req_tx.tx_supr_tdpkt = uc->notdpkt;
> +	req_tx.tx_fetch_size = fetch_size >> 2;
> +	req_tx.txcq_qnum = tc_ring;
> +	if (uc->ep_type == PSIL_EP_PDMA_XY) {
> +		/* wait for peer to complete the teardown for PDMAs */
> +		req_tx.valid_params |=
> +				TI_SCI_MSG_VALUE_RM_UDMAP_CH_TX_TDTYPE_VALID;
> +		req_tx.tx_tdtype = 1;
> +	}
> +
> +	ret = tisci_ops->tx_ch_cfg(tisci_rm->tisci, &req_tx);
> +	if (ret)
> +		dev_err(ud->dev, "tchan%d cfg failed %d\n", tchan->id, ret);
> +
> +	return ret;
> +}
> +
> +static int udma_tisci_rx_channel_config(struct udma_chan *uc)
> +{
> +	struct udma_dev *ud = uc->ud;
> +	struct udma_tisci_rm *tisci_rm = &ud->tisci_rm;
> +	const struct ti_sci_rm_udmap_ops *tisci_ops = tisci_rm->tisci_udmap_ops;
> +	struct udma_rchan *rchan = uc->rchan;
> +	int fd_ring = k3_ringacc_get_ring_id(rchan->fd_ring);
> +	int rx_ring = k3_ringacc_get_ring_id(rchan->r_ring);
> +	struct ti_sci_msg_rm_udmap_rx_ch_cfg req_rx = { 0 };
> +	struct ti_sci_msg_rm_udmap_flow_cfg flow_req = { 0 };
> +	u32 mode, fetch_size;
> +	int ret = 0;
> +
> +	if (uc->pkt_mode) {
> +		mode = TI_SCI_RM_UDMAP_CHAN_TYPE_PKT_PBRR;
> +		fetch_size = cppi5_hdesc_calc_size(uc->needs_epib,
> +							uc->psd_size, 0);
> +	} else {
> +		mode = TI_SCI_RM_UDMAP_CHAN_TYPE_3RDP_PBRR;
> +		fetch_size = sizeof(struct cppi5_desc_hdr_t);
> +	}
> +
> +	req_rx.valid_params =
> +		TI_SCI_MSG_VALUE_RM_UDMAP_CH_PAUSE_ON_ERR_VALID |
> +		TI_SCI_MSG_VALUE_RM_UDMAP_CH_FETCH_SIZE_VALID |
> +		TI_SCI_MSG_VALUE_RM_UDMAP_CH_CQ_QNUM_VALID |
> +		TI_SCI_MSG_VALUE_RM_UDMAP_CH_CHAN_TYPE_VALID |
> +		TI_SCI_MSG_VALUE_RM_UDMAP_CH_RX_IGNORE_SHORT_VALID |
> +		TI_SCI_MSG_VALUE_RM_UDMAP_CH_RX_IGNORE_LONG_VALID |
> +		TI_SCI_MSG_VALUE_RM_UDMAP_CH_RX_FLOWID_START_VALID |
> +		TI_SCI_MSG_VALUE_RM_UDMAP_CH_RX_FLOWID_CNT_VALID;
> +
> +	req_rx.nav_id = tisci_rm->tisci_dev_id;
> +	req_rx.index = rchan->id;
> +	req_rx.rx_fetch_size =  fetch_size >> 2;
> +	req_rx.rxcq_qnum = rx_ring;
> +	req_rx.rx_pause_on_err = 0;
> +	req_rx.rx_chan_type = mode;
> +	req_rx.rx_ignore_short = 0;
> +	req_rx.rx_ignore_long = 0;
> +	req_rx.flowid_start = 0;
> +	req_rx.flowid_cnt = 0;
> +
> +	ret = tisci_ops->rx_ch_cfg(tisci_rm->tisci, &req_rx);
> +	if (ret) {
> +		dev_err(ud->dev, "rchan%d cfg failed %d\n", rchan->id, ret);
> +		return ret;
> +	}
> +
> +	flow_req.valid_params =
> +		TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_EINFO_PRESENT_VALID |
> +		TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_PSINFO_PRESENT_VALID |
> +		TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_ERROR_HANDLING_VALID |
> +		TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_DESC_TYPE_VALID |
> +		TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_DEST_QNUM_VALID |
> +		TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_SRC_TAG_HI_SEL_VALID |
> +		TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_SRC_TAG_LO_SEL_VALID |
> +		TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_DEST_TAG_HI_SEL_VALID |
> +		TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_DEST_TAG_LO_SEL_VALID |
> +		TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_FDQ0_SZ0_QNUM_VALID |
> +		TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_FDQ1_QNUM_VALID |
> +		TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_FDQ2_QNUM_VALID |
> +		TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_FDQ3_QNUM_VALID;
> +
> +	flow_req.nav_id = tisci_rm->tisci_dev_id;
> +	flow_req.flow_index = rchan->id;
> +
> +	if (uc->needs_epib)
> +		flow_req.rx_einfo_present = 1;
> +	else
> +		flow_req.rx_einfo_present = 0;
> +	if (uc->psd_size)
> +		flow_req.rx_psinfo_present = 1;
> +	else
> +		flow_req.rx_psinfo_present = 0;
> +	flow_req.rx_error_handling = 1;
> +	flow_req.rx_desc_type = 0;
> +	flow_req.rx_dest_qnum = rx_ring;
> +	flow_req.rx_src_tag_hi_sel = 2;
> +	flow_req.rx_src_tag_lo_sel = 4;
> +	flow_req.rx_dest_tag_hi_sel = 5;
> +	flow_req.rx_dest_tag_lo_sel = 4;

can we get rid of magic numbers here and elsewhere, or at least comment
on what these mean..

> +static int udma_alloc_chan_resources(struct dma_chan *chan)
> +{
> +	struct udma_chan *uc = to_udma_chan(chan);
> +	struct udma_dev *ud = to_udma_dev(chan->device);
> +	const struct udma_match_data *match_data = ud->match_data;
> +	struct k3_ring *irq_ring;
> +	u32 irq_udma_idx;
> +	int ret;
> +
> +	if (uc->pkt_mode || uc->dir == DMA_MEM_TO_MEM) {
> +		uc->use_dma_pool = true;
> +		/* in case of MEM_TO_MEM we have maximum of two TRs */
> +		if (uc->dir == DMA_MEM_TO_MEM) {
> +			uc->hdesc_size = cppi5_trdesc_calc_size(
> +					sizeof(struct cppi5_tr_type15_t), 2);
> +			uc->pkt_mode = false;
> +		}
> +	}
> +
> +	if (uc->use_dma_pool) {
> +		uc->hdesc_pool = dma_pool_create(uc->name, ud->ddev.dev,
> +						 uc->hdesc_size, ud->desc_align,
> +						 0);
> +		if (!uc->hdesc_pool) {
> +			dev_err(ud->ddev.dev,
> +				"Descriptor pool allocation failed\n");
> +			uc->use_dma_pool = false;
> +			return -ENOMEM;
> +		}
> +	}
> +
> +	/*
> +	 * Make sure that the completion is in a known state:
> +	 * No teardown, the channel is idle
> +	 */
> +	reinit_completion(&uc->teardown_completed);
> +	complete_all(&uc->teardown_completed);

should we not complete first and then do reinit to bring a clean state?

> +	uc->state = UDMA_CHAN_IS_IDLE;
> +
> +	switch (uc->dir) {
> +	case DMA_MEM_TO_MEM:

can you explain why an allocation should be channel dependent, shouldn't
these things be done in prep_ calls?

I looked ahead and checked the prep_ calls and we can use any direction
so this somehow doesn't make sense!
Vinod Koul Nov. 11, 2019, 6:09 a.m. UTC | #13
On 01-11-19, 10:41, Peter Ujfalusi wrote:

> +/* Not much yet */

?

> +static enum dma_status udma_tx_status(struct dma_chan *chan,
> +				      dma_cookie_t cookie,
> +				      struct dma_tx_state *txstate)
> +{
> +	struct udma_chan *uc = to_udma_chan(chan);
> +	enum dma_status ret;
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&uc->vc.lock, flags);
> +
> +	ret = dma_cookie_status(chan, cookie, txstate);
> +
> +	if (!udma_is_chan_running(uc))
> +		ret = DMA_COMPLETE;

so a paused channel will result in dma complete status?
Vinod Koul Nov. 11, 2019, 6:11 a.m. UTC | #14
> +config TI_K3_UDMA
> +	tristate "Texas Instruments UDMA support"
> +	depends on ARCH_K3 || COMPILE_TEST
> +	depends on TI_SCI_PROTOCOL
> +	depends on TI_SCI_INTA_IRQCHIP
> +	select DMA_ENGINE
> +	select DMA_VIRTUAL_CHANNELS
> +	select TI_K3_RINGACC
> +	select TI_K3_PSIL
> +	default y

Again no default y!
Vinod Koul Nov. 11, 2019, 6:12 a.m. UTC | #15
On 01-11-19, 10:41, Peter Ujfalusi wrote:
> From: Grygorii Strashko <grygorii.strashko@ti.com>
> 
> Certain users cannot use the DMAengine API right now due to missing
> features in the core. A prime example is networking.
> 
> These users can use the glue layer interface to avoid misuse of the
> DMAengine API, and when the core gains the needed features they can be
> converted to use the generic API.

Can you add some notes on what all features does this layer implement..
Peter Ujfalusi Nov. 11, 2019, 7:39 a.m. UTC | #16
On 11/11/2019 6.21, Vinod Koul wrote:
> On 01-11-19, 10:41, Peter Ujfalusi wrote:
>> From: Grygorii Strashko <grygorii.strashko@ti.com>
> 
>> +config TI_K3_RINGACC
>> +	tristate "K3 Ring accelerator Sub System"
>> +	depends on ARCH_K3 || COMPILE_TEST
>> +	depends on TI_SCI_INTA_IRQCHIP
>> +	default y
> 
> You want to get an earful from Linus? We dont do default y on new stuff,
> never :)

OK

>> +struct k3_ring_rt_regs {
>> +	u32	resv_16[4];
>> +	u32	db;		/* RT Ring N Doorbell Register */
>> +	u32	resv_4[1];
>> +	u32	occ;		/* RT Ring N Occupancy Register */
>> +	u32	indx;		/* RT Ring N Current Index Register */
>> +	u32	hwocc;		/* RT Ring N Hardware Occupancy Register */
>> +	u32	hwindx;		/* RT Ring N Current Index Register */
> 
> nice comments, how about moving them up into kernel-doc style? (here and
> other places as well)

Sure, I'll convert the comments.

>> +struct k3_ring *k3_ringacc_request_ring(struct k3_ringacc *ringacc,
>> +					int id, u32 flags)
>> +{
>> +	int proxy_id = K3_RINGACC_PROXY_NOT_USED;
>> +
>> +	mutex_lock(&ringacc->req_lock);
>> +
>> +	if (id == K3_RINGACC_RING_ID_ANY) {
>> +		/* Request for any general purpose ring */
>> +		struct ti_sci_resource_desc *gp_rings =
>> +						&ringacc->rm_gp_range->desc[0];
>> +		unsigned long size;
>> +
>> +		size = gp_rings->start + gp_rings->num;
>> +		id = find_next_zero_bit(ringacc->rings_inuse, size,
>> +					gp_rings->start);
>> +		if (id == size)
>> +			goto error;
>> +	} else if (id < 0) {
>> +		goto error;
>> +	}
>> +
>> +	if (test_bit(id, ringacc->rings_inuse) &&
>> +	    !(ringacc->rings[id].flags & K3_RING_FLAG_SHARED))
>> +		goto error;
>> +	else if (ringacc->rings[id].flags & K3_RING_FLAG_SHARED)
>> +		goto out;
>> +
>> +	if (flags & K3_RINGACC_RING_USE_PROXY) {
>> +		proxy_id = find_next_zero_bit(ringacc->proxy_inuse,
>> +					      ringacc->num_proxies, 0);
>> +		if (proxy_id == ringacc->num_proxies)
>> +			goto error;
>> +	}
>> +
>> +	if (!try_module_get(ringacc->dev->driver->owner))
>> +		goto error;
> 
> should this not be one of the first things to do?

I'll move it.

> 
>> +
>> +	if (proxy_id != K3_RINGACC_PROXY_NOT_USED) {
>> +		set_bit(proxy_id, ringacc->proxy_inuse);
>> +		ringacc->rings[id].proxy_id = proxy_id;
>> +		dev_dbg(ringacc->dev, "Giving ring#%d proxy#%d\n", id,
>> +			proxy_id);
>> +	} else {
>> +		dev_dbg(ringacc->dev, "Giving ring#%d\n", id);
>> +	}
> 
> how bout removing else and doing common print?

When the proxy is used we want to print that as well; I think it is
cleaner to have separate prints for the two cases.

- Péter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki
Peter Ujfalusi Nov. 11, 2019, 8 a.m. UTC | #17
On 11/11/2019 6.39, Vinod Koul wrote:
> On 01-11-19, 10:41, Peter Ujfalusi wrote:
>> A DMA hardware can have a big cache or FIFO, and the amount of data
>> sitting in the DMA fabric can be of interest to the clients.
>>
>> For example in audio we want to know the delay in the data flow, and in
>> case the DMA has a significantly large FIFO/cache it can affect the
>> latency/delay
>>
>> Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
>> Reviewed-by: Tero Kristo <t-kristo@ti.com>
>> ---
>>  drivers/dma/dmaengine.h   | 8 ++++++++
>>  include/linux/dmaengine.h | 2 ++
>>  2 files changed, 10 insertions(+)
>>
>> diff --git a/drivers/dma/dmaengine.h b/drivers/dma/dmaengine.h
>> index 501c0b063f85..b0b97475707a 100644
>> --- a/drivers/dma/dmaengine.h
>> +++ b/drivers/dma/dmaengine.h
>> @@ -77,6 +77,7 @@ static inline enum dma_status dma_cookie_status(struct dma_chan *chan,
>>  		state->last = complete;
>>  		state->used = used;
>>  		state->residue = 0;
>> +		state->in_flight_bytes = 0;
>>  	}
>>  	return dma_async_is_complete(cookie, complete, used);
>>  }
>> @@ -87,6 +88,13 @@ static inline void dma_set_residue(struct dma_tx_state *state, u32 residue)
>>  		state->residue = residue;
>>  }
>>  
>> +static inline void dma_set_in_flight_bytes(struct dma_tx_state *state,
>> +					   u32 in_flight_bytes)
>> +{
>> +	if (state)
>> +		state->in_flight_bytes = in_flight_bytes;
>> +}
>> +
>>  struct dmaengine_desc_callback {
>>  	dma_async_tx_callback callback;
>>  	dma_async_tx_callback_result callback_result;
>> diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
>> index 0e8b426bbde9..c4c5219030a6 100644
>> --- a/include/linux/dmaengine.h
>> +++ b/include/linux/dmaengine.h
>> @@ -682,11 +682,13 @@ static inline struct dma_async_tx_descriptor *txd_next(struct dma_async_tx_descr
>>   * @residue: the remaining number of bytes left to transmit
>>   *	on the selected transfer for states DMA_IN_PROGRESS and
>>   *	DMA_PAUSED if this is implemented in the driver, else 0
>> + * @in_flight_bytes: amount of data in bytes cached by the DMA.
>>   */
>>  struct dma_tx_state {
>>  	dma_cookie_t last;
>>  	dma_cookie_t used;
>>  	u32 residue;
>> +	u32 in_flight_bytes;
> 
> Should we add this here or use the dmaengine_result()

Ideally at the time dmaengine_result is used (at tx completion callback)
there should be nothing in flight ;)

The reason why it is added to dma_tx_state is that clients can check
the number of cached bytes at any time while the DMA is running.
Audio needs this for cyclic mode and UART also needs to know it.

- Péter

Peter Ujfalusi Nov. 11, 2019, 8:33 a.m. UTC | #18
On 11/11/2019 7.28, Vinod Koul wrote:
> On 01-11-19, 10:41, Peter Ujfalusi wrote:
> 
>> +struct udma_chan {
>> +	struct virt_dma_chan vc;
>> +	struct dma_slave_config	cfg;
>> +	struct udma_dev *ud;
>> +	struct udma_desc *desc;
>> +	struct udma_desc *terminated_desc;
> 
> descriptor and not a list?

Yes, not a list. I have only one transfer (if any) submitted to
hardware. This is mostly due to the packet mode RX operation: there is
no prelinking support in UDMAP, so I need to have as many descriptors
queued up as there are sg elements.

I need to keep the terminated descriptor around to be able to free it up
_after_ UDMAP has returned it, to avoid the hardware modifying released
memory.

>> +	struct udma_static_tr static_tr;
>> +	char *name;
>> +
>> +	struct udma_tchan *tchan;
>> +	struct udma_rchan *rchan;
>> +	struct udma_rflow *rflow;
>> +
>> +	bool psil_paired;
>> +
>> +	int irq_num_ring;
>> +	int irq_num_udma;
>> +
>> +	bool cyclic;
>> +	bool paused;
>> +
>> +	enum udma_chan_state state;
>> +	struct completion teardown_completed;
>> +
>> +	u32 bcnt; /* number of bytes completed since the start of the channel */
>> +	u32 in_ring_cnt; /* number of descriptors in flight */
>> +
>> +	bool pkt_mode; /* TR or packet */
>> +	bool needs_epib; /* EPIB is needed for the communication or not */
>> +	u32 psd_size; /* size of Protocol Specific Data */
>> +	u32 metadata_size; /* (needs_epib ? 16:0) + psd_size */
>> +	u32 hdesc_size; /* Size of a packet descriptor in packet mode */
>> +	bool notdpkt; /* Suppress sending TDC packet */
>> +	int remote_thread_id;
>> +	u32 src_thread;
>> +	u32 dst_thread;
>> +	enum psil_endpoint_type ep_type;
>> +	bool enable_acc32;
>> +	bool enable_burst;
>> +	enum udma_tp_level channel_tpl; /* Channel Throughput Level */
>> +
>> +	/* dmapool for packet mode descriptors */
>> +	bool use_dma_pool;
>> +	struct dma_pool *hdesc_pool;
>> +
>> +	u32 id;
>> +	enum dma_transfer_direction dir;
> 
> why does channel have this, it already exists in descriptor

The channel cannot change its role; it is set when the channel is
requested. In the prep callbacks I check that the requested direction
matches the channel's direction.

>> +static irqreturn_t udma_udma_irq_handler(int irq, void *data)
>> +{
>> +	struct udma_chan *uc = data;
>> +
>> +	udma_tr_event_callback(uc);
> 
> any reason why we want to call a fn and not code here..?

No particular reason, I'll move them.

- Péter

Peter Ujfalusi Nov. 11, 2019, 8:47 a.m. UTC | #19
On 11/11/2019 6.47, Vinod Koul wrote:
> On 01-11-19, 10:41, Peter Ujfalusi wrote:
> 
>> --- /dev/null
>> +++ b/drivers/dma/ti/k3-psil.c
>> @@ -0,0 +1,97 @@
>> +// SPDX-License-Identifier: GPL-2.0
> 
> ...
> 
>> +extern struct psil_ep_map am654_ep_map;
>> +extern struct psil_ep_map j721e_ep_map;
>> +
>> +static DEFINE_MUTEX(ep_map_mutex);
>> +static struct psil_ep_map *soc_ep_map;
>> +
>> +struct psil_endpoint_config *psil_get_ep_config(u32 thread_id)
>> +{
>> +	int i;
>> +
>> +	mutex_lock(&ep_map_mutex);
>> +	if (!soc_ep_map) {
>> +		if (of_machine_is_compatible("ti,am654")) {
>> +			soc_ep_map = &am654_ep_map;
>> +		} else if (of_machine_is_compatible("ti,j721e")) {
>> +			soc_ep_map = &j721e_ep_map;
>> +		} else {
>> +			pr_err("PSIL: No compatible machine found for map\n");
>> +			return ERR_PTR(-ENOTSUPP);
>> +		}
>> +		pr_debug("%s: Using map for %s\n", __func__, soc_ep_map->name);
>> +	}
>> +	mutex_unlock(&ep_map_mutex);
>> +
>> +	if (thread_id & K3_PSIL_DST_THREAD_ID_OFFSET && soc_ep_map->dst) {
>> +		/* check in destination thread map */
>> +		for (i = 0; i < soc_ep_map->dst_count; i++) {
>> +			if (soc_ep_map->dst[i].thread_id == thread_id)
>> +				return &soc_ep_map->dst[i].ep_config;
>> +		}
>> +	}
>> +
>> +	thread_id &= ~K3_PSIL_DST_THREAD_ID_OFFSET;
>> +	if (soc_ep_map->src) {
>> +		for (i = 0; i < soc_ep_map->src_count; i++) {
>> +			if (soc_ep_map->src[i].thread_id == thread_id)
>> +				return &soc_ep_map->src[i].ep_config;
>> +		}
>> +	}
>> +
>> +	return ERR_PTR(-ENOENT);
>> +}
>> +EXPORT_SYMBOL(psil_get_ep_config);
> 
> This doesn't match the license of this module, we need it to be
> EXPORT_SYMBOL_GPL

Oops, will fix it.


- Péter

Vinod Koul Nov. 11, 2019, 9 a.m. UTC | #20
On 11-11-19, 10:33, Peter Ujfalusi wrote:
> On 11/11/2019 7.28, Vinod Koul wrote:
> > On 01-11-19, 10:41, Peter Ujfalusi wrote:

> >> +	struct udma_static_tr static_tr;
> >> +	char *name;
> >> +
> >> +	struct udma_tchan *tchan;
> >> +	struct udma_rchan *rchan;
> >> +	struct udma_rflow *rflow;
> >> +
> >> +	bool psil_paired;
> >> +
> >> +	int irq_num_ring;
> >> +	int irq_num_udma;
> >> +
> >> +	bool cyclic;
> >> +	bool paused;
> >> +
> >> +	enum udma_chan_state state;
> >> +	struct completion teardown_completed;
> >> +
> >> +	u32 bcnt; /* number of bytes completed since the start of the channel */
> >> +	u32 in_ring_cnt; /* number of descriptors in flight */
> >> +
> >> +	bool pkt_mode; /* TR or packet */
> >> +	bool needs_epib; /* EPIB is needed for the communication or not */
> >> +	u32 psd_size; /* size of Protocol Specific Data */
> >> +	u32 metadata_size; /* (needs_epib ? 16:0) + psd_size */
> >> +	u32 hdesc_size; /* Size of a packet descriptor in packet mode */
> >> +	bool notdpkt; /* Suppress sending TDC packet */
> >> +	int remote_thread_id;
> >> +	u32 src_thread;
> >> +	u32 dst_thread;
> >> +	enum psil_endpoint_type ep_type;
> >> +	bool enable_acc32;
> >> +	bool enable_burst;
> >> +	enum udma_tp_level channel_tpl; /* Channel Throughput Level */
> >> +
> >> +	/* dmapool for packet mode descriptors */
> >> +	bool use_dma_pool;
> >> +	struct dma_pool *hdesc_pool;
> >> +
> >> +	u32 id;
> >> +	enum dma_transfer_direction dir;
> > 
> > why does channel have this, it already exists in descriptor
> 
> The channel can not change role, it is set when it was requested. In the

how do you do this on set? The channel is requested, we do not know the
direction. When prep_ is invoked we know it..
Peter Ujfalusi Nov. 11, 2019, 9:12 a.m. UTC | #21
On 11/11/2019 11.00, Vinod Koul wrote:
> On 11-11-19, 10:33, Peter Ujfalusi wrote:
>> On 11/11/2019 7.28, Vinod Koul wrote:
>>> On 01-11-19, 10:41, Peter Ujfalusi wrote:
> 
>>>> +	struct udma_static_tr static_tr;
>>>> +	char *name;
>>>> +
>>>> +	struct udma_tchan *tchan;
>>>> +	struct udma_rchan *rchan;
>>>> +	struct udma_rflow *rflow;
>>>> +
>>>> +	bool psil_paired;
>>>> +
>>>> +	int irq_num_ring;
>>>> +	int irq_num_udma;
>>>> +
>>>> +	bool cyclic;
>>>> +	bool paused;
>>>> +
>>>> +	enum udma_chan_state state;
>>>> +	struct completion teardown_completed;
>>>> +
>>>> +	u32 bcnt; /* number of bytes completed since the start of the channel */
>>>> +	u32 in_ring_cnt; /* number of descriptors in flight */
>>>> +
>>>> +	bool pkt_mode; /* TR or packet */
>>>> +	bool needs_epib; /* EPIB is needed for the communication or not */
>>>> +	u32 psd_size; /* size of Protocol Specific Data */
>>>> +	u32 metadata_size; /* (needs_epib ? 16:0) + psd_size */
>>>> +	u32 hdesc_size; /* Size of a packet descriptor in packet mode */
>>>> +	bool notdpkt; /* Suppress sending TDC packet */
>>>> +	int remote_thread_id;
>>>> +	u32 src_thread;
>>>> +	u32 dst_thread;
>>>> +	enum psil_endpoint_type ep_type;
>>>> +	bool enable_acc32;
>>>> +	bool enable_burst;
>>>> +	enum udma_tp_level channel_tpl; /* Channel Throughput Level */
>>>> +
>>>> +	/* dmapool for packet mode descriptors */
>>>> +	bool use_dma_pool;
>>>> +	struct dma_pool *hdesc_pool;
>>>> +
>>>> +	u32 id;
>>>> +	enum dma_transfer_direction dir;
>>>
>>> why does channel have this, it already exists in descriptor
>>
>> The channel can not change role, it is set when it was requested. In the
> 
> how do you do this on set? The channel is requested, we do not know the
> direction. When prep_ is invoked we know it..

In UDMAP we must know it, as a channel can only transfer in one direction:

dmas = <&main_udmap 0xc400>, <&main_udmap 0x4400>;
dma-names = "tx", "rx";

0xc400 is a destination thread ID, so the 'tx' channel can only do
MEM_TO_DEV;
0x4400 is a source thread ID, so the 'rx' channel can only do DEV_TO_MEM.

We cannot switch direction at runtime.

- Péter

Peter Ujfalusi Nov. 11, 2019, 9:16 a.m. UTC | #22
On 11/11/2019 7.33, Vinod Koul wrote:
> On 01-11-19, 10:41, Peter Ujfalusi wrote:
> 
>> +static bool udma_dma_filter_fn(struct dma_chan *chan, void *param)
>> +{
>> +	struct psil_endpoint_config *ep_config;
>> +	struct udma_chan *uc;
>> +	struct udma_dev *ud;
>> +	u32 *args;
>> +
>> +	if (chan->device->dev->driver != &udma_driver.driver)
>> +		return false;
>> +
>> +	uc = to_udma_chan(chan);
>> +	ud = uc->ud;
>> +	args = param;
>> +	uc->remote_thread_id = args[0];
>> +
>> +	if (uc->remote_thread_id & K3_PSIL_DST_THREAD_ID_OFFSET)
>> +		uc->dir = DMA_MEM_TO_DEV;
>> +	else
>> +		uc->dir = DMA_DEV_TO_MEM;
> 
> Can you explain this a bit?

The UDMAP in K3 works between two PSI-L endpoints. The source and
destination need to be paired to allow data flow.
Source thread IDs are in the range 0x0000 - 0x7fff, while destination
thread IDs are 0x8000 - 0xffff.

If the remote thread ID has bit 15 set (0x8000) then the transfer
is MEM_TO_DEV and I need to pick one unused tchan for it. If the remote
is the source then it can be handled by an rchan.

dmas = <&main_udmap 0xc400>, <&main_udmap 0x4400>;
dma-names = "tx", "rx";

0xc400 is a destination thread ID, so it is MEM_TO_DEV
0x4400 is a source thread ID, so it is DEV_TO_MEM

Even in the MEM_TO_MEM case I need to pair two UDMAP channels: UDMAP
source threads start at offset 0x1000, UDMAP destination threads at
0x9000+.

Changing direction at runtime is hardly possible as it would involve
tearing down the channel, removing interrupts, destroying rings,
removing the PSI-L pairing and redoing everything.

>> +static int udma_remove(struct platform_device *pdev)
>> +{
>> +	struct udma_dev *ud = platform_get_drvdata(pdev);
>> +
>> +	of_dma_controller_free(pdev->dev.of_node);
>> +	dma_async_device_unregister(&ud->ddev);
>> +
>> +	/* Make sure that we did proper cleanup */
>> +	cancel_work_sync(&ud->purge_work);
>> +	udma_purge_desc_work(&ud->purge_work);
> 
> kill the vchan tasklets at it too please

Oh, I have missed that, I'll add it.

- Péter

Peter Ujfalusi Nov. 11, 2019, 9:40 a.m. UTC | #23
On 11/11/2019 8.06, Vinod Koul wrote:
> On 01-11-19, 10:41, Peter Ujfalusi wrote:
>> Split patch for review containing: channel rsource allocation and free
> 
> s/rsource/resource

I'll try to remember to fix up this temporary commit message; in the
end these split patches are going to be squashed into one commit when
things are ready to be applied.

>> +static int udma_tisci_tx_channel_config(struct udma_chan *uc)
>> +{
>> +	struct udma_dev *ud = uc->ud;
>> +	struct udma_tisci_rm *tisci_rm = &ud->tisci_rm;
>> +	const struct ti_sci_rm_udmap_ops *tisci_ops = tisci_rm->tisci_udmap_ops;
>> +	struct udma_tchan *tchan = uc->tchan;
>> +	int tc_ring = k3_ringacc_get_ring_id(tchan->tc_ring);
>> +	struct ti_sci_msg_rm_udmap_tx_ch_cfg req_tx = { 0 };
>> +	u32 mode, fetch_size;
>> +	int ret = 0;
>> +
>> +	if (uc->pkt_mode) {
>> +		mode = TI_SCI_RM_UDMAP_CHAN_TYPE_PKT_PBRR;
>> +		fetch_size = cppi5_hdesc_calc_size(uc->needs_epib, uc->psd_size,
>> +						   0);
>> +	} else {
>> +		mode = TI_SCI_RM_UDMAP_CHAN_TYPE_3RDP_PBRR;
>> +		fetch_size = sizeof(struct cppi5_desc_hdr_t);
>> +	}
>> +
>> +	req_tx.valid_params =
>> +		TI_SCI_MSG_VALUE_RM_UDMAP_CH_PAUSE_ON_ERR_VALID |
>> +		TI_SCI_MSG_VALUE_RM_UDMAP_CH_TX_FILT_EINFO_VALID |
>> +		TI_SCI_MSG_VALUE_RM_UDMAP_CH_TX_FILT_PSWORDS_VALID |
>> +		TI_SCI_MSG_VALUE_RM_UDMAP_CH_CHAN_TYPE_VALID |
>> +		TI_SCI_MSG_VALUE_RM_UDMAP_CH_TX_SUPR_TDPKT_VALID |
>> +		TI_SCI_MSG_VALUE_RM_UDMAP_CH_FETCH_SIZE_VALID |
>> +		TI_SCI_MSG_VALUE_RM_UDMAP_CH_CQ_QNUM_VALID;
> 
> bunch of these are repeat, you can define a COMMON_VALID_PARAMS and use
> that + specific ones..

OK, I'll try to sanitize these a bit.

>> +
>> +	req_tx.nav_id = tisci_rm->tisci_dev_id;
>> +	req_tx.index = tchan->id;
>> +	req_tx.tx_pause_on_err = 0;
>> +	req_tx.tx_filt_einfo = 0;
>> +	req_tx.tx_filt_pswords = 0;
> 
> i think initialization to 0 is superfluous

Indeed, I'll remove these.

>> +	req_tx.tx_chan_type = mode;
>> +	req_tx.tx_supr_tdpkt = uc->notdpkt;
>> +	req_tx.tx_fetch_size = fetch_size >> 2;
>> +	req_tx.txcq_qnum = tc_ring;
>> +	if (uc->ep_type == PSIL_EP_PDMA_XY) {
>> +		/* wait for peer to complete the teardown for PDMAs */
>> +		req_tx.valid_params |=
>> +				TI_SCI_MSG_VALUE_RM_UDMAP_CH_TX_TDTYPE_VALID;
>> +		req_tx.tx_tdtype = 1;
>> +	}
>> +
>> +	ret = tisci_ops->tx_ch_cfg(tisci_rm->tisci, &req_tx);
>> +	if (ret)
>> +		dev_err(ud->dev, "tchan%d cfg failed %d\n", tchan->id, ret);
>> +
>> +	return ret;
>> +}
>> +
>> +static int udma_tisci_rx_channel_config(struct udma_chan *uc)
>> +{
>> +	struct udma_dev *ud = uc->ud;
>> +	struct udma_tisci_rm *tisci_rm = &ud->tisci_rm;
>> +	const struct ti_sci_rm_udmap_ops *tisci_ops = tisci_rm->tisci_udmap_ops;
>> +	struct udma_rchan *rchan = uc->rchan;
>> +	int fd_ring = k3_ringacc_get_ring_id(rchan->fd_ring);
>> +	int rx_ring = k3_ringacc_get_ring_id(rchan->r_ring);
>> +	struct ti_sci_msg_rm_udmap_rx_ch_cfg req_rx = { 0 };
>> +	struct ti_sci_msg_rm_udmap_flow_cfg flow_req = { 0 };
>> +	u32 mode, fetch_size;
>> +	int ret = 0;
>> +
>> +	if (uc->pkt_mode) {
>> +		mode = TI_SCI_RM_UDMAP_CHAN_TYPE_PKT_PBRR;
>> +		fetch_size = cppi5_hdesc_calc_size(uc->needs_epib,
>> +							uc->psd_size, 0);
>> +	} else {
>> +		mode = TI_SCI_RM_UDMAP_CHAN_TYPE_3RDP_PBRR;
>> +		fetch_size = sizeof(struct cppi5_desc_hdr_t);
>> +	}
>> +
>> +	req_rx.valid_params =
>> +		TI_SCI_MSG_VALUE_RM_UDMAP_CH_PAUSE_ON_ERR_VALID |
>> +		TI_SCI_MSG_VALUE_RM_UDMAP_CH_FETCH_SIZE_VALID |
>> +		TI_SCI_MSG_VALUE_RM_UDMAP_CH_CQ_QNUM_VALID |
>> +		TI_SCI_MSG_VALUE_RM_UDMAP_CH_CHAN_TYPE_VALID |
>> +		TI_SCI_MSG_VALUE_RM_UDMAP_CH_RX_IGNORE_SHORT_VALID |
>> +		TI_SCI_MSG_VALUE_RM_UDMAP_CH_RX_IGNORE_LONG_VALID |
>> +		TI_SCI_MSG_VALUE_RM_UDMAP_CH_RX_FLOWID_START_VALID |
>> +		TI_SCI_MSG_VALUE_RM_UDMAP_CH_RX_FLOWID_CNT_VALID;
>> +
>> +	req_rx.nav_id = tisci_rm->tisci_dev_id;
>> +	req_rx.index = rchan->id;
>> +	req_rx.rx_fetch_size =  fetch_size >> 2;
>> +	req_rx.rxcq_qnum = rx_ring;
>> +	req_rx.rx_pause_on_err = 0;
>> +	req_rx.rx_chan_type = mode;
>> +	req_rx.rx_ignore_short = 0;
>> +	req_rx.rx_ignore_long = 0;
>> +	req_rx.flowid_start = 0;
>> +	req_rx.flowid_cnt = 0;
>> +
>> +	ret = tisci_ops->rx_ch_cfg(tisci_rm->tisci, &req_rx);
>> +	if (ret) {
>> +		dev_err(ud->dev, "rchan%d cfg failed %d\n", rchan->id, ret);
>> +		return ret;
>> +	}
>> +
>> +	flow_req.valid_params =
>> +		TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_EINFO_PRESENT_VALID |
>> +		TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_PSINFO_PRESENT_VALID |
>> +		TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_ERROR_HANDLING_VALID |
>> +		TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_DESC_TYPE_VALID |
>> +		TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_DEST_QNUM_VALID |
>> +		TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_SRC_TAG_HI_SEL_VALID |
>> +		TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_SRC_TAG_LO_SEL_VALID |
>> +		TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_DEST_TAG_HI_SEL_VALID |
>> +		TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_DEST_TAG_LO_SEL_VALID |
>> +		TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_FDQ0_SZ0_QNUM_VALID |
>> +		TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_FDQ1_QNUM_VALID |
>> +		TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_FDQ2_QNUM_VALID |
>> +		TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_FDQ3_QNUM_VALID;
>> +
>> +	flow_req.nav_id = tisci_rm->tisci_dev_id;
>> +	flow_req.flow_index = rchan->id;
>> +
>> +	if (uc->needs_epib)
>> +		flow_req.rx_einfo_present = 1;
>> +	else
>> +		flow_req.rx_einfo_present = 0;
>> +	if (uc->psd_size)
>> +		flow_req.rx_psinfo_present = 1;
>> +	else
>> +		flow_req.rx_psinfo_present = 0;
>> +	flow_req.rx_error_handling = 1;
>> +	flow_req.rx_desc_type = 0;
>> +	flow_req.rx_dest_qnum = rx_ring;
>> +	flow_req.rx_src_tag_hi_sel = 2;
>> +	flow_req.rx_src_tag_lo_sel = 4;
>> +	flow_req.rx_dest_tag_hi_sel = 5;
>> +	flow_req.rx_dest_tag_lo_sel = 4;
> 
> can we get rid of magic numbers here and elsewhere, or at least comment
> on what these mean..

True, I'll clean it up.

>> +static int udma_alloc_chan_resources(struct dma_chan *chan)
>> +{
>> +	struct udma_chan *uc = to_udma_chan(chan);
>> +	struct udma_dev *ud = to_udma_dev(chan->device);
>> +	const struct udma_match_data *match_data = ud->match_data;
>> +	struct k3_ring *irq_ring;
>> +	u32 irq_udma_idx;
>> +	int ret;
>> +
>> +	if (uc->pkt_mode || uc->dir == DMA_MEM_TO_MEM) {
>> +		uc->use_dma_pool = true;
>> +		/* in case of MEM_TO_MEM we have maximum of two TRs */
>> +		if (uc->dir == DMA_MEM_TO_MEM) {
>> +			uc->hdesc_size = cppi5_trdesc_calc_size(
>> +					sizeof(struct cppi5_tr_type15_t), 2);
>> +			uc->pkt_mode = false;
>> +		}
>> +	}
>> +
>> +	if (uc->use_dma_pool) {
>> +		uc->hdesc_pool = dma_pool_create(uc->name, ud->ddev.dev,
>> +						 uc->hdesc_size, ud->desc_align,
>> +						 0);
>> +		if (!uc->hdesc_pool) {
>> +			dev_err(ud->ddev.dev,
>> +				"Descriptor pool allocation failed\n");
>> +			uc->use_dma_pool = false;
>> +			return -ENOMEM;
>> +		}
>> +	}
>> +
>> +	/*
>> +	 * Make sure that the completion is in a known state:
>> +	 * No teardown, the channel is idle
>> +	 */
>> +	reinit_completion(&uc->teardown_completed);
>> +	complete_all(&uc->teardown_completed);
> 
> should we not complete first and then do reinit to bring a clean state?

The reason for this order is that udma_synchronize() checks the
completion: if the client requests the channel and then calls
terminate_all_sync() without issuing any transfer, no one else would
ever mark the completion as completed.
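The ordering can be modeled in userspace like this (the struct below only emulates the kernel completion semantics for illustration; it is not the real <linux/completion.h> API):

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Toy model of <linux/completion.h> semantics */
struct completion { unsigned int done; };

static void reinit_completion(struct completion *c) { c->done = 0; }
static void complete_all(struct completion *c) { c->done = UINT_MAX / 2; }
static bool completion_done(struct completion *c) { return c->done != 0; }

/*
 * alloc_chan_resources() first clears any stale state, then marks the
 * completion as done.  If the client now calls terminate_all_sync()
 * without ever issuing a transfer, the synchronize path that checks
 * teardown_completed returns immediately instead of blocking forever,
 * because nothing else would ever complete it.
 */
static void alloc_time_init(struct completion *teardown_completed)
{
	reinit_completion(teardown_completed);
	complete_all(teardown_completed);
}
```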

>> +	uc->state = UDMA_CHAN_IS_IDLE;
>> +
>> +	switch (uc->dir) {
>> +	case DMA_MEM_TO_MEM:
> 
> can you explain why an allocation should be channel dependent, shouldn't
> these things be done in prep_ calls?

A channel can not change direction: it is either MEM_TO_DEV, DEV_TO_MEM
or MEM_TO_MEM, and the direction is set when the channel is requested.

> I looked ahead and checked the prep_ calls and we can use any direction
> so this somehow doesn't make sense!

In the prep callbacks I am checking whether the requested direction
matches the channel direction.

I just can not change the channel direction at runtime.

- Péter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki
Peter Ujfalusi Nov. 11, 2019, 10:29 a.m. UTC | #24
On 11/11/2019 8.09, Vinod Koul wrote:
> On 01-11-19, 10:41, Peter Ujfalusi wrote:
> 
>> +/* Not much yet */
> 
> ?

Forgot to remove it when I implemented tx_status() ;)

> 
>> +static enum dma_status udma_tx_status(struct dma_chan *chan,
>> +				      dma_cookie_t cookie,
>> +				      struct dma_tx_state *txstate)
>> +{
>> +	struct udma_chan *uc = to_udma_chan(chan);
>> +	enum dma_status ret;
>> +	unsigned long flags;
>> +
>> +	spin_lock_irqsave(&uc->vc.lock, flags);
>> +
>> +	ret = dma_cookie_status(chan, cookie, txstate);
>> +
>> +	if (!udma_is_chan_running(uc))
>> +		ret = DMA_COMPLETE;
> 
> so a paused channel will result in dma complete status?

The channel is still enabled (running); the pause only sets a bit in the
channel's real-time control register.

- Péter

Peter Ujfalusi Nov. 11, 2019, 10:30 a.m. UTC | #25
On 11/11/2019 8.11, Vinod Koul wrote:
>> +config TI_K3_UDMA
>> +	tristate "Texas Instruments UDMA support"
>> +	depends on ARCH_K3 || COMPILE_TEST
>> +	depends on TI_SCI_PROTOCOL
>> +	depends on TI_SCI_INTA_IRQCHIP
>> +	select DMA_ENGINE
>> +	select DMA_VIRTUAL_CHANNELS
>> +	select TI_K3_RINGACC
>> +	select TI_K3_PSIL
>> +	default y
> 
> Again no default y!

Removed

> 

- Péter

Peter Ujfalusi Nov. 11, 2019, 10:31 a.m. UTC | #26
On 11/11/2019 8.12, Vinod Koul wrote:
> On 01-11-19, 10:41, Peter Ujfalusi wrote:
>> From: Grygorii Strashko <grygorii.strashko@ti.com>
>>
>> Certain users can not use the DMAengine API right now due to missing
>> features in the core. A prime example is networking.
>>
>> These users can use the glue layer interface to avoid misuse of the
>> DMAengine API, and when the core gains the needed features they can be
>> converted to use the generic API.
> 
> Can you add some notes on what features this layer implements?

In the commit message or in the code?

- Péter

Vinod Koul Nov. 12, 2019, 5:34 a.m. UTC | #27
On 11-11-19, 11:16, Peter Ujfalusi wrote:
> 
> 
> On 11/11/2019 7.33, Vinod Koul wrote:
> > On 01-11-19, 10:41, Peter Ujfalusi wrote:
> > 
> >> +static bool udma_dma_filter_fn(struct dma_chan *chan, void *param)
> >> +{
> >> +	struct psil_endpoint_config *ep_config;
> >> +	struct udma_chan *uc;
> >> +	struct udma_dev *ud;
> >> +	u32 *args;
> >> +
> >> +	if (chan->device->dev->driver != &udma_driver.driver)
> >> +		return false;
> >> +
> >> +	uc = to_udma_chan(chan);
> >> +	ud = uc->ud;
> >> +	args = param;
> >> +	uc->remote_thread_id = args[0];
> >> +
> >> +	if (uc->remote_thread_id & K3_PSIL_DST_THREAD_ID_OFFSET)
> >> +		uc->dir = DMA_MEM_TO_DEV;
> >> +	else
> >> +		uc->dir = DMA_DEV_TO_MEM;
> > 
> > Can you explain this a bit?
> 
> The UDMAP in K3 works between two PSI-L endpoints. The source and
> destination need to be paired to allow data flow.
> Source thread IDs are in the range of 0x0000 - 0x7fff, while destination
> thread IDs are 0x8000 - 0xffff.
> 
> If the remote thread ID has bit 15 set (0x8000) then the transfer
> is MEM_TO_DEV and I need to pick one unused tchan for it. If the remote
> is the source then it can be handled by an rchan.
> 
> dmas = <&main_udmap 0xc400>, <&main_udmap 0x4400>;
> dma-names = "tx", "rx";
> 
> 0xc400 is a destination thread ID, so it is MEM_TO_DEV
> 0x4400 is a source thread ID, so it is DEV_TO_MEM
> 
> Even in the MEM_TO_MEM case I need to pair two UDMAP channels:
> UDMAP source threads start at offset 0x1000, UDMAP destination
> threads at 0x9000+
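The thread-ID rules quoted above can be sketched as a small helper (userspace model; K3_PSIL_DST_THREAD_ID_OFFSET is assumed to be 0x8000, i.e. bit 15):

```c
#include <assert.h>
#include <stdint.h>

#define K3_PSIL_DST_THREAD_ID_OFFSET 0x8000 /* bit 15 marks a destination thread */

enum dma_transfer_direction { DMA_MEM_TO_DEV, DMA_DEV_TO_MEM };

/*
 * PSI-L source threads live in 0x0000 - 0x7fff, destination threads in
 * 0x8000 - 0xffff.  If the remote end is a destination, memory is the
 * source (MEM_TO_DEV, needs a tchan); otherwise the remote is the
 * source (DEV_TO_MEM, needs an rchan).
 */
static enum dma_transfer_direction psil_thread_dir(uint32_t remote_thread_id)
{
	if (remote_thread_id & K3_PSIL_DST_THREAD_ID_OFFSET)
		return DMA_MEM_TO_DEV;
	return DMA_DEV_TO_MEM;
}
```

For example, the "tx" specifier 0xc400 from the dmas property classifies as MEM_TO_DEV and the "rx" specifier 0x4400 as DEV_TO_MEM.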

Okay so a channel is set for a direction until teardown. Also this and
other patch comments are quite useful, can we add them here?

> Changing direction runtime is hardly possible as it would involve
> tearing down the channel, removing interrupts, destroying rings,
> removing the PSI-L pairing and redoing everything.

Okay, I would expect the prep_ calls to check the direction and reject
the call if the direction is different.
Vinod Koul Nov. 12, 2019, 5:36 a.m. UTC | #28
On 11-11-19, 12:29, Peter Ujfalusi wrote:
> On 11/11/2019 8.09, Vinod Koul wrote:
> > On 01-11-19, 10:41, Peter Ujfalusi wrote:

> >> +static enum dma_status udma_tx_status(struct dma_chan *chan,
> >> +				      dma_cookie_t cookie,
> >> +				      struct dma_tx_state *txstate)
> >> +{
> >> +	struct udma_chan *uc = to_udma_chan(chan);
> >> +	enum dma_status ret;
> >> +	unsigned long flags;
> >> +
> >> +	spin_lock_irqsave(&uc->vc.lock, flags);
> >> +
> >> +	ret = dma_cookie_status(chan, cookie, txstate);
> >> +
> >> +	if (!udma_is_chan_running(uc))
> >> +		ret = DMA_COMPLETE;
> > 
> > so a paused channel will result in dma complete status?
> 
>> The channel is still enabled (running); the pause only sets a bit in the
>> channel's real-time control register.

Okay, and in which cases will the channel not be running, i.e. return
DMA_COMPLETE above?
Vinod Koul Nov. 12, 2019, 5:37 a.m. UTC | #29
On 11-11-19, 12:31, Peter Ujfalusi wrote:
> 
> 
> On 11/11/2019 8.12, Vinod Koul wrote:
> > On 01-11-19, 10:41, Peter Ujfalusi wrote:
> >> From: Grygorii Strashko <grygorii.strashko@ti.com>
> >>
> >> Certain users can not use the DMAengine API right now due to missing
> >> features in the core. A prime example is networking.
> >>
> >> These users can use the glue layer interface to avoid misuse of the
> >> DMAengine API, and when the core gains the needed features they can be
> >> converted to use the generic API.
> > 
> > Can you add some notes on what features this layer implements?
> 
> In the commit message or in the code?

commit here so that we know what to expect.

Thanks
Peter Ujfalusi Nov. 12, 2019, 7:22 a.m. UTC | #30
On 12/11/2019 7.34, Vinod Koul wrote:
> On 11-11-19, 11:16, Peter Ujfalusi wrote:
>>
>>
>> On 11/11/2019 7.33, Vinod Koul wrote:
>>> On 01-11-19, 10:41, Peter Ujfalusi wrote:
>>>
>>>> +static bool udma_dma_filter_fn(struct dma_chan *chan, void *param)
>>>> +{
>>>> +	struct psil_endpoint_config *ep_config;
>>>> +	struct udma_chan *uc;
>>>> +	struct udma_dev *ud;
>>>> +	u32 *args;
>>>> +
>>>> +	if (chan->device->dev->driver != &udma_driver.driver)
>>>> +		return false;
>>>> +
>>>> +	uc = to_udma_chan(chan);
>>>> +	ud = uc->ud;
>>>> +	args = param;
>>>> +	uc->remote_thread_id = args[0];
>>>> +
>>>> +	if (uc->remote_thread_id & K3_PSIL_DST_THREAD_ID_OFFSET)
>>>> +		uc->dir = DMA_MEM_TO_DEV;
>>>> +	else
>>>> +		uc->dir = DMA_DEV_TO_MEM;
>>>
>>> Can you explain this a bit?
>>
>> The UDMAP in K3 works between two PSI-L endpoints. The source and
>> destination need to be paired to allow data flow.
>> Source thread IDs are in the range of 0x0000 - 0x7fff, while destination
>> thread IDs are 0x8000 - 0xffff.
>>
>> If the remote thread ID has bit 15 set (0x8000) then the transfer
>> is MEM_TO_DEV and I need to pick one unused tchan for it. If the remote
>> is the source then it can be handled by an rchan.
>>
>> dmas = <&main_udmap 0xc400>, <&main_udmap 0x4400>;
>> dma-names = "tx", "rx";
>>
>> 0xc400 is a destination thread ID, so it is MEM_TO_DEV
>> 0x4400 is a source thread ID, so it is DEV_TO_MEM
>>
>> Even in the MEM_TO_MEM case I need to pair two UDMAP channels:
>> UDMAP source threads start at offset 0x1000, UDMAP destination
>> threads at 0x9000+
> 
> Okay so a channel is set for a direction until teardown. Also this and
> other patch comments are quite useful, can we add them here?

The direction checks in the prep callbacks already print the reason when
a transfer is rejected due to a direction mismatch.

Having said that, I can add a comment to the udma_alloc_chan_resources()
function about this restriction, or better, a dev_dbg() saying that the
given channel is allocated for a given direction.

>> Changing direction runtime is hardly possible as it would involve
>> tearing down the channel, removing interrupts, destroying rings,
>> removing the PSI-L pairing and redoing everything.
> 
> okay I would expect the prep_ to check for direction and reject the call
> if direction is different.

They do check, udma_prep_slave_sg() and udma_prep_dma_cyclic():
if (dir != uc->dir) {
	dev_err(chan->device->dev,
		"%s: chan%d is for %s, not supporting %s\n",
		__func__, uc->id, udma_get_dir_text(uc->dir),
		udma_get_dir_text(dir));
	return NULL;
}

udma_prep_dma_memcpy():
if (uc->dir != DMA_MEM_TO_MEM) {
	dev_err(chan->device->dev,
		"%s: chan%d is for %s, not supporting %s\n",
		__func__, uc->id, udma_get_dir_text(uc->dir),
		udma_get_dir_text(DMA_MEM_TO_MEM));
	return NULL;
}

> 

- Péter

Peter Ujfalusi Nov. 12, 2019, 7:24 a.m. UTC | #31
On 12/11/2019 7.36, Vinod Koul wrote:
> On 11-11-19, 12:29, Peter Ujfalusi wrote:
>> On 11/11/2019 8.09, Vinod Koul wrote:
>>> On 01-11-19, 10:41, Peter Ujfalusi wrote:
> 
>>>> +static enum dma_status udma_tx_status(struct dma_chan *chan,
>>>> +				      dma_cookie_t cookie,
>>>> +				      struct dma_tx_state *txstate)
>>>> +{
>>>> +	struct udma_chan *uc = to_udma_chan(chan);
>>>> +	enum dma_status ret;
>>>> +	unsigned long flags;
>>>> +
>>>> +	spin_lock_irqsave(&uc->vc.lock, flags);
>>>> +
>>>> +	ret = dma_cookie_status(chan, cookie, txstate);
>>>> +
>>>> +	if (!udma_is_chan_running(uc))
>>>> +		ret = DMA_COMPLETE;
>>>
>>> so a paused channel will result in dma complete status?
>>
>> The channel is still enabled (running); the pause only sets a bit in the
>> channel's real-time control register.
> 
> Okay and which cases will channel not be running i.e., you return
> DMA_COMPLETE above?

After terminate_all or before the first issue_pending call, i.e. when
the channel is disabled.
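Taken together, the status decision in the quoted udma_tx_status() reduces to this simplified model (sketch only; the real callback also takes the channel lock and fills in the dma_tx_state):

```c
#include <assert.h>
#include <stdbool.h>

enum dma_status { DMA_COMPLETE, DMA_IN_PROGRESS };

/*
 * The channel is disabled (not running) only before the first
 * issue_pending and after terminate_all, so reporting DMA_COMPLETE
 * there is safe.  A paused channel is still enabled, so it keeps
 * reporting whatever dma_cookie_status() returned.
 */
static enum dma_status udma_status_model(enum dma_status cookie_status,
					 bool chan_running)
{
	if (!chan_running)
		return DMA_COMPLETE;
	return cookie_status;
}
```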

- Péter

Peter Ujfalusi Nov. 12, 2019, 7:25 a.m. UTC | #32
On 12/11/2019 7.37, Vinod Koul wrote:
> On 11-11-19, 12:31, Peter Ujfalusi wrote:
>>
>>
>> On 11/11/2019 8.12, Vinod Koul wrote:
>>> On 01-11-19, 10:41, Peter Ujfalusi wrote:
>>>> From: Grygorii Strashko <grygorii.strashko@ti.com>
>>>>
>>>> Certain users can not use the DMAengine API right now due to missing
>>>> features in the core. A prime example is networking.
>>>>
>>>> These users can use the glue layer interface to avoid misuse of the
>>>> DMAengine API, and when the core gains the needed features they can be
>>>> converted to use the generic API.
>>>
>>> Can you add some notes on what features this layer implements?
>>
>> In the commit message or in the code?
> 
> commit here so that we know what to expect.

Can you check whether the v5 commit message is sufficient? If not, I can
make it more verbose for v6.

- Péter
