
[v3,0/5] arm: aspeed: Add UART DMA support

Message ID 20230320081133.23655-1-chiawei_wang@aspeedtech.com

Message

ChiaWei Wang March 20, 2023, 8:11 a.m. UTC
This patch series adds an 8250 driver with DMA support for the AST26xx UART devices.

v3 changes:
 - add error check against dma_to_phys
 - revise UDMA binding
 - remove UDMA binding header
 - remove redundant header inclusions of UDMA driver
 - fix incorrect UDMA channel number (14->28)
 - place UDMA kconfig in alphabetical order
 - collect Acked-by tags

v2 changes:
 - re-write UDMA driver based on the DMAEngine framework
 - re-write 8250_aspeed driver with DMA support based on the 8250_dma implementation
 - remove virtual UART part as there is already a 8250_aspeed_vuart driver

Chia-Wei Wang (5):
  dt-bindings: serial: 8250: Add aspeed,ast2600-uart
  dt-bindings: dmaengine: Add AST2600 UDMA bindings
  dmaengine: aspeed: Add AST2600 UART DMA driver
  serial: 8250: Add AST2600 UART driver
  ARM: dts: aspeed-g6: Add UDMA node

 .../bindings/dma/aspeed,ast2600-udma.yaml     |  56 ++
 .../devicetree/bindings/serial/8250.yaml      |   1 +
 arch/arm/boot/dts/aspeed-g6.dtsi              |   9 +
 drivers/dma/Kconfig                           |   9 +
 drivers/dma/Makefile                          |   1 +
 drivers/dma/ast2600-udma.c                    | 534 ++++++++++++++++++
 drivers/tty/serial/8250/8250_aspeed.c         | 224 ++++++++
 drivers/tty/serial/8250/Kconfig               |   8 +
 drivers/tty/serial/8250/Makefile              |   1 +
 9 files changed, 843 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/dma/aspeed,ast2600-udma.yaml
 create mode 100644 drivers/dma/ast2600-udma.c
 create mode 100644 drivers/tty/serial/8250/8250_aspeed.c

Comments

Ilpo Järvinen March 20, 2023, 9:42 a.m. UTC | #1
On Mon, 20 Mar 2023, Chia-Wei Wang wrote:

> Add new UART driver with DMA support for Aspeed AST2600 SoCs.
> The drivers mainly prepare the dma instance based on the 8250_dma
> implementation to leverage the AST2600 UART DMA (UDMA) engine.
> 
> Signed-off-by: Chia-Wei Wang <chiawei_wang@aspeedtech.com>
> ---
>  drivers/tty/serial/8250/8250_aspeed.c | 224 ++++++++++++++++++++++++++
>  drivers/tty/serial/8250/Kconfig       |   8 +
>  drivers/tty/serial/8250/Makefile      |   1 +
>  3 files changed, 233 insertions(+)
>  create mode 100644 drivers/tty/serial/8250/8250_aspeed.c
> 
> diff --git a/drivers/tty/serial/8250/8250_aspeed.c b/drivers/tty/serial/8250/8250_aspeed.c
> new file mode 100644
> index 000000000000..04d0bf6fba28
> --- /dev/null
> +++ b/drivers/tty/serial/8250/8250_aspeed.c
> @@ -0,0 +1,224 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) ASPEED Technology Inc.
> + */
> +#include <linux/device.h>
> +#include <linux/io.h>
> +#include <linux/module.h>
> +#include <linux/serial_8250.h>
> +#include <linux/serial_reg.h>
> +#include <linux/of.h>
> +#include <linux/of_irq.h>
> +#include <linux/of_platform.h>
> +#include <linux/platform_device.h>
> +#include <linux/clk.h>
> +#include <linux/reset.h>
> +#include <linux/dma-mapping.h>
> +#include <linux/circ_buf.h>
> +#include <linux/tty_flip.h>
> +#include <linux/pm_runtime.h>
> +
> +#include "8250.h"
> +
> +#define DEVICE_NAME "aspeed-uart"
> +
> +struct ast8250_data {
> +	int line;
> +	int irq;
> +	u8 __iomem *regs;
> +	struct reset_control *rst;
> +	struct clk *clk;
> +#ifdef CONFIG_SERIAL_8250_DMA
> +	struct uart_8250_dma dma;
> +#endif
> +};
> +
> +#ifdef CONFIG_SERIAL_8250_DMA
> +static int ast8250_rx_dma(struct uart_8250_port *p);
> +
> +static void ast8250_rx_dma_complete(void *param)
> +{
> +	struct uart_8250_port *p = param;
> +	struct uart_8250_dma *dma = p->dma;
> +	struct tty_port *tty_port = &p->port.state->port;
> +	struct dma_tx_state	state;
> +	int	count;
> +
> +	dmaengine_tx_status(dma->rxchan, dma->rx_cookie, &state);
> +
> +	count = dma->rx_size - state.residue;
> +
> +	tty_insert_flip_string(tty_port, dma->rx_buf, count);
> +	p->port.icount.rx += count;
> +
> +	tty_flip_buffer_push(tty_port);
> +
> +	ast8250_rx_dma(p);
> +}
> +
> +static int ast8250_rx_dma(struct uart_8250_port *p)
> +{
> +	struct uart_8250_dma *dma = p->dma;
> +	struct dma_async_tx_descriptor *tx;
> +
> +	tx = dmaengine_prep_slave_single(dma->rxchan, dma->rx_addr,
> +					 dma->rx_size, DMA_DEV_TO_MEM,
> +					 DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
> +	if (!tx)
> +		return -EBUSY;

How does the DMA Rx "loop" restart when this is taken?

> +	tx->callback = ast8250_rx_dma_complete;
> +	tx->callback_param = p;
> +
> +	dma->rx_cookie = dmaengine_submit(tx);
> +
> +	dma_async_issue_pending(dma->rxchan);
> +
> +	return 0;
> +}
> +#endif

These 2 functions look very similar to what 8250_dma already offers you. The 
only difference I can see is the always-start DMA Rx behaviour, which could 
be handled by adding a capability flag into uart_8250_dma for those 
UARTs that can launch DMA Rx while the Rx queue is empty.

So, just use the standard 8250_dma functions and make the small 
capabilities flag tweak there. 

By using the stock functions you also avoid 8250_dma Rx and your DMA Rx 
racing like they currently would (8250_port assigns the functions from 
8250_dma when you don't specify the rx handler and the default 8250 irq 
handler will call into those standard 8250 DMA functions).


I'm curious about this HW and how it behaves under these two scenarios:
- When Rx is empty, does UART/DMA just sit there waiting forever?
- When a stream of incoming Rx characters suddenly ends, how does UART/DMA
  react? ...On 8250 UARTs I'm familiar with this triggers UART_IIR_TIMEOUT
  which you don't seem to handle.

When you provide answer to those two questions, I can try to help you 
further on how to integrate into the standard 8250 DMA code.

> +static int ast8250_handle_irq(struct uart_port *port)
> +{
> +	return serial8250_handle_irq(port, serial_port_in(port, UART_IIR));
> +}
> +
> +static int ast8250_startup(struct uart_port *port)
> +{
> +#ifdef CONFIG_SERIAL_8250_DMA

This ifdef'fery is entirely unnecessary.

> +	int rc;
> +	struct uart_8250_port *up = up_to_u8250p(port);

Reverse xmas tree order.

> +
> +	rc = serial8250_do_startup(port);
> +	if (rc)
> +		return rc;
> +
> +	/*
> +	 * The default RX DMA is launched upon rising DR bit.
> +	 *
> +	 * However, this can result in byte lost if UART FIFO has
> +	 * been overruned before the DMA engine gets prepared and
> +	 * read the data out. This is especially common when UART
> +	 * DMA is used for file transfer. Thus we initiate RX DMA
> +	 * as early as possible.
> +	 */
> +	if (up->dma)
> +		return ast8250_rx_dma(up);

Once you start using the general 8250 dma code and add the DMA Rx always
capabilities flag, this can go into serial8250_do_startup(). Since 
serial8250_rx_dma() always exists independent of CONFIG_SERIAL_8250_DMA, 
no include guards are necessary.

...After which you probably don't need this whole function anymore.

> +
> +	return 0;
> +#else
> +	return serial8250_do_startup(port);
> +#endif
> +}
> +
> +static void ast8250_shutdown(struct uart_port *port)
> +{
> +	return serial8250_do_shutdown(port);
> +}
> +
> +static int ast8250_probe(struct platform_device *pdev)
> +{
> +	int rc;

Put this as last.

> +	struct uart_8250_port uart = {};
> +	struct uart_port *port = &uart.port;
> +	struct device *dev = &pdev->dev;
> +	struct ast8250_data *data;
> +	struct resource *res;
> +
> +	data = devm_kzalloc(dev, sizeof(*data), GFP_KERNEL);
> +	if (!data)
> +		return -ENOMEM;
> +
> +	data->irq = platform_get_irq(pdev, 0);
> +	if (data->irq < 0)
> +		return data->irq;
> +
> +	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +	if (res == NULL) {
> +		dev_err(dev, "failed to get register base\n");
> +		return -ENODEV;
> +	}
> +
> +	data->regs = devm_ioremap(dev, res->start, resource_size(res));
> +	if (IS_ERR(data->regs)) {
> +		dev_err(dev, "failed to map registers\n");
> +		return PTR_ERR(data->regs);
> +	}
> +
> +	data->clk = devm_clk_get(dev, NULL);
> +	if (IS_ERR(data->clk)) {
> +		dev_err(dev, "failed to get clocks\n");
> +		return -ENODEV;
> +	}
> +
> +	rc = clk_prepare_enable(data->clk);
> +	if (rc) {
> +		dev_err(dev, "failed to enable clock\n");
> +		return rc;
> +	}
> +
> +	data->rst = devm_reset_control_get_optional_exclusive(dev, NULL);
> +	if (!IS_ERR(data->rst))
> +		reset_control_deassert(data->rst);
> +
> +	spin_lock_init(&port->lock);
> +	port->dev = dev;
> +	port->type = PORT_16550A;
> +	port->irq = data->irq;
> +	port->line = of_alias_get_id(dev->of_node, "serial");
> +	port->handle_irq = ast8250_handle_irq;
> +	port->mapbase = res->start;
> +	port->mapsize = resource_size(res);
> +	port->membase = data->regs;
> +	port->uartclk = clk_get_rate(data->clk);
> +	port->regshift = 2;
> +	port->iotype = UPIO_MEM32;
> +	port->flags = UPF_FIXED_TYPE | UPF_FIXED_PORT | UPF_SHARE_IRQ;
> +	port->startup = ast8250_startup;
> +	port->shutdown = ast8250_shutdown;
> +	port->private_data = data;

> +#ifdef CONFIG_SERIAL_8250_DMA
> +	data->dma.rxconf.src_maxburst = UART_XMIT_SIZE;
> +	data->dma.txconf.dst_maxburst = UART_XMIT_SIZE;
> +	uart.dma = &data->dma;
> +#endif

Add a setup function for this and make an empty function with the same 
name when CONFIG_SERIAL_8250_DMA is not there.

> +
> +	data->line = serial8250_register_8250_port(&uart);
> +	if (data->line < 0) {
> +		dev_err(dev, "failed to register 8250 port\n");
> +		return data->line;
> +	}
> +
> +	platform_set_drvdata(pdev, data);
> +
> +	return 0;
> +}
> +
> +static int ast8250_remove(struct platform_device *pdev)
> +{
> +	struct ast8250_data *data = platform_get_drvdata(pdev);
> +
> +	serial8250_unregister_port(data->line);
> +
> +	return 0;
> +}
> +
> +static const struct of_device_id ast8250_of_match[] = {
> +	{ .compatible = "aspeed,ast2600-uart" },
> +	{ },
> +};
> +
> +static struct platform_driver ast8250_platform_driver = {
> +	.driver = {
> +		.name = DEVICE_NAME,
> +		.of_match_table = ast8250_of_match,
> +	},
> +	.probe = ast8250_probe,
> +	.remove = ast8250_remove,
> +};
> +
> +module_platform_driver(ast8250_platform_driver);
> +
> +MODULE_AUTHOR("Chia-Wei Wang <chiawei_wang@aspeedtech.com>");
> +MODULE_LICENSE("GPL");
> +MODULE_DESCRIPTION("Aspeed 8250 UART Driver with DMA support");
ChiaWei Wang March 20, 2023, 10:19 a.m. UTC | #2
> From: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> Sent: Monday, March 20, 2023 5:43 PM
> 
> On Mon, 20 Mar 2023, Chia-Wei Wang wrote:
> 
> > Add new UART driver with DMA support for Aspeed AST2600 SoCs.
> > The drivers mainly prepare the dma instance based on the 8250_dma
> > implementation to leverage the AST2600 UART DMA (UDMA) engine.
> >
> > Signed-off-by: Chia-Wei Wang <chiawei_wang@aspeedtech.com>
> > ---
> >  drivers/tty/serial/8250/8250_aspeed.c | 224
> ++++++++++++++++++++++++++
> >  drivers/tty/serial/8250/Kconfig       |   8 +
> >  drivers/tty/serial/8250/Makefile      |   1 +
> >  3 files changed, 233 insertions(+)
> >  create mode 100644 drivers/tty/serial/8250/8250_aspeed.c
> >
> > diff --git a/drivers/tty/serial/8250/8250_aspeed.c
> > b/drivers/tty/serial/8250/8250_aspeed.c
> > new file mode 100644
> > index 000000000000..04d0bf6fba28
> > --- /dev/null
> > +++ b/drivers/tty/serial/8250/8250_aspeed.c
> > @@ -0,0 +1,224 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (C) ASPEED Technology Inc.
> > + */
> > +#include <linux/device.h>
> > +#include <linux/io.h>
> > +#include <linux/module.h>
> > +#include <linux/serial_8250.h>
> > +#include <linux/serial_reg.h>
> > +#include <linux/of.h>
> > +#include <linux/of_irq.h>
> > +#include <linux/of_platform.h>
> > +#include <linux/platform_device.h>
> > +#include <linux/clk.h>
> > +#include <linux/reset.h>
> > +#include <linux/dma-mapping.h>
> > +#include <linux/circ_buf.h>
> > +#include <linux/tty_flip.h>
> > +#include <linux/pm_runtime.h>
> > +
> > +#include "8250.h"
> > +
> > +#define DEVICE_NAME "aspeed-uart"
> > +
> > +struct ast8250_data {
> > +	int line;
> > +	int irq;
> > +	u8 __iomem *regs;
> > +	struct reset_control *rst;
> > +	struct clk *clk;
> > +#ifdef CONFIG_SERIAL_8250_DMA
> > +	struct uart_8250_dma dma;
> > +#endif
> > +};
> > +
> > +#ifdef CONFIG_SERIAL_8250_DMA
> > +static int ast8250_rx_dma(struct uart_8250_port *p);
> > +
> > +static void ast8250_rx_dma_complete(void *param) {
> > +	struct uart_8250_port *p = param;
> > +	struct uart_8250_dma *dma = p->dma;
> > +	struct tty_port *tty_port = &p->port.state->port;
> > +	struct dma_tx_state	state;
> > +	int	count;
> > +
> > +	dmaengine_tx_status(dma->rxchan, dma->rx_cookie, &state);
> > +
> > +	count = dma->rx_size - state.residue;
> > +
> > +	tty_insert_flip_string(tty_port, dma->rx_buf, count);
> > +	p->port.icount.rx += count;
> > +
> > +	tty_flip_buffer_push(tty_port);
> > +
> > +	ast8250_rx_dma(p);
> > +}
> > +
> > +static int ast8250_rx_dma(struct uart_8250_port *p) {
> > +	struct uart_8250_dma *dma = p->dma;
> > +	struct dma_async_tx_descriptor *tx;
> > +
> > +	tx = dmaengine_prep_slave_single(dma->rxchan, dma->rx_addr,
> > +					 dma->rx_size, DMA_DEV_TO_MEM,
> > +					 DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
> > +	if (!tx)
> > +		return -EBUSY;
> 
> How does the DMA Rx "loop" restart when this is taken?

The loop re-starts from ast8250_startup.

> 
> > +	tx->callback = ast8250_rx_dma_complete;
> > +	tx->callback_param = p;
> > +
> > +	dma->rx_cookie = dmaengine_submit(tx);
> > +
> > +	dma_async_issue_pending(dma->rxchan);
> > +
> > +	return 0;
> > +}
> > +#endif
> 
> These 2 functions look very similar to what 8250_dma offers for you. The only
> difference I could see is that always start DMA Rx thing which could be
> handled by adding some capability flag into uart_8250_dma for those UARTs
> that can launch DMA Rx while Rx queue is empty.
> 
> So, just use the standard 8250_dma functions and make the small capabilities
> flag tweak there.
> 
> By using the stock functions you also avoid 8250_dma Rx and your DMA Rx
> racing like they currently would (8250_port assigns the functions from
> 8250_dma when you don't specify the rx handler and the default 8250 irq
> handler will call into those standard 8250 DMA functions).

Yes, that is the difference.

Our customers usually use UDMA for file transfers over UART,
and I found that the leading bytes are easily lost due to the late start of the DMA engine.

In fact, I was looking for a way to make the default implementation always start RX DMA
instead of enabling it upon the DR bit rising. But no luck, and thus I added
ast8250_rx_dma. (The default 8250 ISR also calls into up->dma->rx_dma.)

If adding a new capability flag is the better way to go, I will implement it that way for further review.

> 
> 
> I'm curious about this HW and how it behaves under these two scenarios:
> - When Rx is empty, does UART/DMA just sit there waiting forever?

Yes.

> - When a stream of incoming Rx characters suddenly ends, how does
> UART/DMA
>   react? ...On 8250 UARTs I'm familiar with this triggers UART_IIR_TIMEOUT
>   which you don't seem to handle.

UDMA also has a timeout control.
If the data suddenly ends and the timeout occurs, UDMA triggers an interrupt.
The UDMA ISR then checks whether data is available using the DMA read/write pointers and invokes the callback if so.

> 
> When you provide answer to those two questions, I can try to help you further
> on how to integrate into the standard 8250 DMA code.

Thanks!
It would be great to use the default implementation and avoid mostly-duplicated code.

> 
> > +static int ast8250_handle_irq(struct uart_port *port) {
> > +	return serial8250_handle_irq(port, serial_port_in(port, UART_IIR));
> > +}
> > +
> > +static int ast8250_startup(struct uart_port *port) { #ifdef
> > +CONFIG_SERIAL_8250_DMA
> 
> This ifdef'fery is entirely unnecessary.

Will remove as suggested.

> 
> > +	int rc;
> > +	struct uart_8250_port *up = up_to_u8250p(port);
> 
> Reverse xmas tree order.

Will revise as suggested.

> 
> > +
> > +	rc = serial8250_do_startup(port);
> > +	if (rc)
> > +		return rc;
> > +
> > +	/*
> > +	 * The default RX DMA is launched upon rising DR bit.
> > +	 *
> > +	 * However, this can result in byte lost if UART FIFO has
> > +	 * been overruned before the DMA engine gets prepared and
> > +	 * read the data out. This is especially common when UART
> > +	 * DMA is used for file transfer. Thus we initiate RX DMA
> > +	 * as early as possible.
> > +	 */
> > +	if (up->dma)
> > +		return ast8250_rx_dma(up);
> 
> Once you start using the general 8250 dma code and add the DMA Rx always
> capabilities flag, this can go into serial8250_do_startup(). Since
> serial8250_rx_dma() always exists independent of CONFIG_SERIAL_8250_DMA,
> no include guards are necessary.
> 
> ...After which you probably don't need this whole function anymore.

Agree.

> 
> > +
> > +	return 0;
> > +#else
> > +	return serial8250_do_startup(port);
> > +#endif
> > +}
> > +
> > +static void ast8250_shutdown(struct uart_port *port) {
> > +	return serial8250_do_shutdown(port); }
> > +
> > +static int ast8250_probe(struct platform_device *pdev) {
> > +	int rc;
> 
> Put this as last.

Will revise as suggested.

> 
> > +	struct uart_8250_port uart = {};
> > +	struct uart_port *port = &uart.port;
> > +	struct device *dev = &pdev->dev;
> > +	struct ast8250_data *data;
> > +	struct resource *res;
> > +
> > +	data = devm_kzalloc(dev, sizeof(*data), GFP_KERNEL);
> > +	if (!data)
> > +		return -ENOMEM;
> > +
> > +	data->irq = platform_get_irq(pdev, 0);
> > +	if (data->irq < 0)
> > +		return data->irq;
> > +
> > +	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> > +	if (res == NULL) {
> > +		dev_err(dev, "failed to get register base\n");
> > +		return -ENODEV;
> > +	}
> > +
> > +	data->regs = devm_ioremap(dev, res->start, resource_size(res));
> > +	if (IS_ERR(data->regs)) {
> > +		dev_err(dev, "failed to map registers\n");
> > +		return PTR_ERR(data->regs);
> > +	}
> > +
> > +	data->clk = devm_clk_get(dev, NULL);
> > +	if (IS_ERR(data->clk)) {
> > +		dev_err(dev, "failed to get clocks\n");
> > +		return -ENODEV;
> > +	}
> > +
> > +	rc = clk_prepare_enable(data->clk);
> > +	if (rc) {
> > +		dev_err(dev, "failed to enable clock\n");
> > +		return rc;
> > +	}
> > +
> > +	data->rst = devm_reset_control_get_optional_exclusive(dev, NULL);
> > +	if (!IS_ERR(data->rst))
> > +		reset_control_deassert(data->rst);
> > +
> > +	spin_lock_init(&port->lock);
> > +	port->dev = dev;
> > +	port->type = PORT_16550A;
> > +	port->irq = data->irq;
> > +	port->line = of_alias_get_id(dev->of_node, "serial");
> > +	port->handle_irq = ast8250_handle_irq;
> > +	port->mapbase = res->start;
> > +	port->mapsize = resource_size(res);
> > +	port->membase = data->regs;
> > +	port->uartclk = clk_get_rate(data->clk);
> > +	port->regshift = 2;
> > +	port->iotype = UPIO_MEM32;
> > +	port->flags = UPF_FIXED_TYPE | UPF_FIXED_PORT | UPF_SHARE_IRQ;
> > +	port->startup = ast8250_startup;
> > +	port->shutdown = ast8250_shutdown;
> > +	port->private_data = data;
> 
> > +#ifdef CONFIG_SERIAL_8250_DMA
> > +	data->dma.rxconf.src_maxburst = UART_XMIT_SIZE;
> > +	data->dma.txconf.dst_maxburst = UART_XMIT_SIZE;
> > +	uart.dma = &data->dma;
> > +#endif
> 
> Add a setup function for this and make an empty function with the same name
> when CONFIG_SERIAL_8250_DMA is not there.

Will revise as suggested.

> 
> > +
> > +	data->line = serial8250_register_8250_port(&uart);
> > +	if (data->line < 0) {
> > +		dev_err(dev, "failed to register 8250 port\n");
> > +		return data->line;
> > +	}
> > +
> > +	platform_set_drvdata(pdev, data);
> > +
> > +	return 0;
> > +}
> > +
> > +static int ast8250_remove(struct platform_device *pdev) {
> > +	struct ast8250_data *data = platform_get_drvdata(pdev);
> > +
> > +	serial8250_unregister_port(data->line);
> > +
> > +	return 0;
> > +}
> > +
> > +static const struct of_device_id ast8250_of_match[] = {
> > +	{ .compatible = "aspeed,ast2600-uart" },
> > +	{ },
> > +};
> > +
> > +static struct platform_driver ast8250_platform_driver = {
> > +	.driver = {
> > +		.name = DEVICE_NAME,
> > +		.of_match_table = ast8250_of_match,
> > +	},
> > +	.probe = ast8250_probe,
> > +	.remove = ast8250_remove,
> > +};
> > +
> > +module_platform_driver(ast8250_platform_driver);
> > +

Thanks,
Chiawei
Ilpo Järvinen March 20, 2023, 10:48 a.m. UTC | #3
On Mon, 20 Mar 2023, ChiaWei Wang wrote:

> > From: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> > Sent: Monday, March 20, 2023 5:43 PM
> > 
> > On Mon, 20 Mar 2023, Chia-Wei Wang wrote:
> > 
> > > Add new UART driver with DMA support for Aspeed AST2600 SoCs.
> > > The drivers mainly prepare the dma instance based on the 8250_dma
> > > implementation to leverage the AST2600 UART DMA (UDMA) engine.
> > >
> > > Signed-off-by: Chia-Wei Wang <chiawei_wang@aspeedtech.com>
> > > ---
> > >  drivers/tty/serial/8250/8250_aspeed.c | 224
> > ++++++++++++++++++++++++++
> > >  drivers/tty/serial/8250/Kconfig       |   8 +
> > >  drivers/tty/serial/8250/Makefile      |   1 +
> > >  3 files changed, 233 insertions(+)
> > >  create mode 100644 drivers/tty/serial/8250/8250_aspeed.c
> > >
> > > diff --git a/drivers/tty/serial/8250/8250_aspeed.c
> > > b/drivers/tty/serial/8250/8250_aspeed.c
> > > new file mode 100644
> > > index 000000000000..04d0bf6fba28
> > > --- /dev/null
> > > +++ b/drivers/tty/serial/8250/8250_aspeed.c
> > > @@ -0,0 +1,224 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +/*
> > > + * Copyright (C) ASPEED Technology Inc.
> > > + */
> > > +#include <linux/device.h>
> > > +#include <linux/io.h>
> > > +#include <linux/module.h>
> > > +#include <linux/serial_8250.h>
> > > +#include <linux/serial_reg.h>
> > > +#include <linux/of.h>
> > > +#include <linux/of_irq.h>
> > > +#include <linux/of_platform.h>
> > > +#include <linux/platform_device.h>
> > > +#include <linux/clk.h>
> > > +#include <linux/reset.h>
> > > +#include <linux/dma-mapping.h>
> > > +#include <linux/circ_buf.h>
> > > +#include <linux/tty_flip.h>
> > > +#include <linux/pm_runtime.h>
> > > +
> > > +#include "8250.h"
> > > +
> > > +#define DEVICE_NAME "aspeed-uart"
> > > +
> > > +struct ast8250_data {
> > > +	int line;
> > > +	int irq;
> > > +	u8 __iomem *regs;
> > > +	struct reset_control *rst;
> > > +	struct clk *clk;
> > > +#ifdef CONFIG_SERIAL_8250_DMA
> > > +	struct uart_8250_dma dma;
> > > +#endif
> > > +};
> > > +
> > > +#ifdef CONFIG_SERIAL_8250_DMA
> > > +static int ast8250_rx_dma(struct uart_8250_port *p);
> > > +
> > > +static void ast8250_rx_dma_complete(void *param) {
> > > +	struct uart_8250_port *p = param;
> > > +	struct uart_8250_dma *dma = p->dma;
> > > +	struct tty_port *tty_port = &p->port.state->port;
> > > +	struct dma_tx_state	state;
> > > +	int	count;
> > > +
> > > +	dmaengine_tx_status(dma->rxchan, dma->rx_cookie, &state);
> > > +
> > > +	count = dma->rx_size - state.residue;
> > > +
> > > +	tty_insert_flip_string(tty_port, dma->rx_buf, count);
> > > +	p->port.icount.rx += count;
> > > +
> > > +	tty_flip_buffer_push(tty_port);
> > > +
> > > +	ast8250_rx_dma(p);
> > > +}
> > > +
> > > +static int ast8250_rx_dma(struct uart_8250_port *p) {
> > > +	struct uart_8250_dma *dma = p->dma;
> > > +	struct dma_async_tx_descriptor *tx;
> > > +
> > > +	tx = dmaengine_prep_slave_single(dma->rxchan, dma->rx_addr,
> > > +					 dma->rx_size, DMA_DEV_TO_MEM,
> > > +					 DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
> > > +	if (!tx)
> > > +		return -EBUSY;
> > 
> > How does the DMA Rx "loop" restart when this is taken?
> 
> The loop re-starts from ast8250_startup.

Why would startup get called again?

> > > +	tx->callback = ast8250_rx_dma_complete;
> > > +	tx->callback_param = p;
> > > +
> > > +	dma->rx_cookie = dmaengine_submit(tx);
> > > +
> > > +	dma_async_issue_pending(dma->rxchan);
> > > +
> > > +	return 0;
> > > +}
> > > +#endif
> > 
> > These 2 functions look very similar to what 8250_dma offers for you. The only
> > difference I could see is that always start DMA Rx thing which could be
> > handled by adding some capability flag into uart_8250_dma for those UARTs
> > that can launch DMA Rx while Rx queue is empty.
> > 
> > So, just use the standard 8250_dma functions and make the small capabilities
> > flag tweak there.
> > 
> > By using the stock functions you also avoid 8250_dma Rx and your DMA Rx
> > racing like they currently would (8250_port assigns the functions from
> > 8250_dma when you don't specify the rx handler and the default 8250 irq
> > handler will call into those standard 8250 DMA functions).
> 
> Yes for the difference described.
> 
> Our customers usually use UDMA for file-transmissions over UART.
> And I found the preceding bytes will get lost easily due to the late 
> start of DMA engine. 
>
> In fact, I was seeking the default implementation to always start RX DMA 
> instead of enabling it upon DR bit rising. But no luck and thus add 
> ast8250_rx_dma. (The default 8250 ISR also called into up->dma->rx_dma)
>
> If adding a new capability flag is the better way to go, I will try to 
> implement in that way for further review.

Yes it would be much better.

Add unsigned int capabilities into uart_8250_dma and put the necessary 
checks + code into general code. Don't add any #ifdef 
CONFIG_SERIAL_8250_DMA into 8250_port.c nor 8250_dma.c. Instead, if you 
feel a need for one, use the #ifdef ... #else ... #endif in 8250.h to
provide an empty static inline function for the #else case.

> > I'm curious about this HW and how it behaves under these two scenarios:
> > - When Rx is empty, does UART/DMA just sit there waiting forever?
> 
> Yes.

Okay.

> > - When a stream of incoming Rx characters suddenly ends, how does
> > UART/DMA
> >   react? ...On 8250 UARTs I'm familiar with this triggers UART_IIR_TIMEOUT
> >   which you don't seem to handle.
> 
> UDMA also has a timeout control.
> If the data suddenly ends and timeout occurs, UDMA will trigger an interrupt.
> UDMA ISR then check if there is data available using DMA read/write 
> pointers and invokes callback if any. 

Okay. And the UART side won't trigger any interrupts?

> > When you provide answer to those two questions, I can try to help you further
> > on how to integrate into the standard 8250 DMA code.
> 
> Thanks!
> It would be great using the default one to avoid mostly duplicated code.

You need to take a look at handle_rx_dma() to see what to do there. Probably 
just call ->rx_dma() unconditionally to prevent UART interrupts from 
interfering with DMA Rx. The restart of DMA Rx there is just a backup for 
when the DMA Rx "loop" has stopped due to an error.
Ilpo Järvinen March 20, 2023, 11:14 a.m. UTC | #4
On Mon, 20 Mar 2023, Chia-Wei Wang wrote:

> Aspeed AST2600 UART DMA (UDMA) includes 28 channels for DMA
> transmit and receive for each of the UART devices.
> 
> Signed-off-by: Chia-Wei Wang <chiawei_wang@aspeedtech.com>
> ---
> diff --git a/drivers/dma/ast2600-udma.c b/drivers/dma/ast2600-udma.c
> new file mode 100644
> index 000000000000..39117b26996d
> --- /dev/null
> +++ b/drivers/dma/ast2600-udma.c
> @@ -0,0 +1,534 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) ASPEED Technology Inc.
> + */
> +#include <linux/delay.h>
> +#include <linux/bitfield.h>
> +#include <linux/interrupt.h>
> +#include <linux/dmaengine.h>
> +#include <linux/dma-mapping.h>
> +#include <linux/platform_device.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +#include <linux/of_dma.h>
> +#include <linux/dma-direct.h>
> +#include "dmaengine.h"
> +
> +#define DEVICE_NAME	"aspeed-udma"
> +
> +/* register offset */
> +#define UDMA_TX_EN		0x000
> +#define UDMA_RX_EN		0x004
> +#define UDMA_TMOUT		0x00c
> +#define UDMA_TX_PTR_RST		0x020
> +#define UDMA_RX_PTR_RST		0x024
> +#define UDMA_TX_INT_EN		0x030
> +#define UDMA_TX_INT_STS		0x034
> +#define UDMA_RX_INT_EN		0x038
> +#define UDMA_RX_INT_STS		0x03c
> +
> +#define UDMA_CH_OFFS(x)		((x) * 0x10)
> +#define UDMA_CH_RPTR(x)		(0x040 + UDMA_CH_OFFS(x))
> +#define UDMA_CH_WPTR(x)		(0x044 + UDMA_CH_OFFS(x))
> +#define UDMA_CH_ADDR(x)		(0x048 + UDMA_CH_OFFS(x))
> +#define UDMA_CH_CTRL(x)		(0x04c + UDMA_CH_OFFS(x))
> +#define   UDMA_CH_CTRL_BUFSZ	GENMASK(1, 0)
> +
> +#define UDMA_MAX_BUFSZ	(0x10000)

Unnecessary parentheses.

> +#define UDMA_MAX_TXSZ	(UDMA_MAX_BUFSZ - 1)
> +
> +enum ast2600_udma_bufsz {
> +	UDMA_BUFSZ_1KB,
> +	UDMA_BUFSZ_4KB,
> +	UDMA_BUFSZ_16KB,
> +	UDMA_BUFSZ_64KB,
> +};
> +
> +struct ast2600_udma_desc {
> +	struct dma_async_tx_descriptor tx;
> +	dma_addr_t addr;
> +	unsigned int size;
> +};
> +
> +struct ast2600_udma_chan {
> +	struct dma_chan chan;
> +	struct ast2600_udma_desc ud;
> +	struct ast2600_udma *udma;
> +	uint32_t residue;
> +
> +	/* 4B-aligned local buffer for workaround */
> +	uint8_t *buf;
> +	dma_addr_t buf_addr;
> +
> +	bool is_tx;
> +};
> +
> +struct ast2600_udma {
> +	struct dma_device ddev;
> +	uint8_t __iomem *regs;
> +	int irq;
> +	struct ast2600_udma_chan *ucs;
> +	uint32_t n_ucs;
> +	spinlock_t lock;
> +};
> +
> +static struct ast2600_udma_chan *to_ast2600_udma_chan(struct dma_chan *chan)
> +{
> +	return container_of(chan, struct ast2600_udma_chan, chan);
> +}
> +
> +static int ast2600_udma_alloc_chan_resources(struct dma_chan *chan)
> +{
> +	dma_cookie_init(chan);
> +
> +	return 0;
> +}
> +
> +static dma_cookie_t ast2600_udma_tx_submit(struct dma_async_tx_descriptor *tx)
> +{
> +	return dma_cookie_assign(tx);
> +}
> +
> +/* consider only 8250_dma using dmaengine_prep_slave_single() */
> +static struct dma_async_tx_descriptor *ast2600_udma_prep_slave_sg(struct dma_chan *chan,
> +								  struct scatterlist *sgl,
> +								  unsigned int sg_len,
> +								  enum dma_transfer_direction dir,
> +								  unsigned long tx_flags,
> +								  void *context)
> +{
> +	struct device *dev = chan->device->dev;
> +	struct ast2600_udma_chan *uc = to_ast2600_udma_chan(chan);
> +	struct ast2600_udma_desc *ud = &uc->ud;
> +	phys_addr_t pa;
> +	void *va;
> +
> +	if (!is_slave_direction(dir)) {
> +		dev_err(dev, "direction is not slave mode\n");
> +		return NULL;
> +	}
> +
> +	if (sg_len != 1) {


> +		dev_err(dev, "scatter list length is not 1\n");
> +		return NULL;
> +	}
> +
> +	if (uc->is_tx && dir != DMA_MEM_TO_DEV) {
> +		dev_err(dev, "invalid direction to TX channel\n");
> +		return NULL;
> +	}
> +
> +	ud->addr = sg_dma_address(sgl);
> +	ud->size = sg_dma_len(sgl);
> +
> +	if (uc->is_tx) {
> +		if (ud->size > UDMA_MAX_TXSZ) {
> +			dev_err(dev, "invalid TX DMA SIZE");
> +			return NULL;
> +		}
> +
> +		/*
> +		 * UDMA is limited to 4B-aligned DMA addresses.
> +		 * Thus copy data to local 4B-aligned buffer if
> +		 * the source does not fit.
> +		 */
> +		if (ud->addr & 0x3) {

!IS_ALIGNED()

Remember to add include for it.

> +			pa = dma_to_phys(chan->device->dev, ud->addr);
> +			if (pa != (phys_addr_t)-1) {
> +				va = phys_to_virt(pa);
> +				memcpy(uc->buf, va, ud->size);
> +				ud->addr = uc->buf_addr;
> +			}
> +		}
> +	} else {
> +		/*
> +		 * UDMA RX buffer size is limited to 1/4/16/64 KB
> +		 * We use the lower bits to encode the buffer size
> +		 */
> +		switch (ud->size) {
> +		case 0x400:

SZ_1K, etc in linux/sizes.h.

> +			ud->size |= FIELD_PREP(UDMA_CH_CTRL_BUFSZ, UDMA_BUFSZ_1KB);
> +			break;
> +		case 0x1000:
> +			ud->size |= FIELD_PREP(UDMA_CH_CTRL_BUFSZ, UDMA_BUFSZ_4KB);
> +			break;
> +		case 0x4000:
> +			ud->size |= FIELD_PREP(UDMA_CH_CTRL_BUFSZ, UDMA_BUFSZ_16KB);
> +			break;
> +		case 0x10000:
> +			ud->size |= FIELD_PREP(UDMA_CH_CTRL_BUFSZ, UDMA_BUFSZ_64KB);
> +			break;
> +		default:
> +			dev_err(dev, "invalid RX DMA size\n");
> +			return NULL;
> +		}
> +	}
> +
> +	dma_async_tx_descriptor_init(&ud->tx, &uc->chan);
> +	ud->tx.tx_submit = ast2600_udma_tx_submit;
> +
> +	return &ud->tx;
> +}
> +
> +static void ast2600_udma_issue_pending(struct dma_chan *chan)
> +{
> +	unsigned long flags;
> +	uint32_t r_pr, r_is, r_ie, r_en, reg;
> +	uint32_t ch_id = chan->chan_id;
> +	uint32_t ch_bit = ch_id / 2;
> +	dma_addr_t rx_addr;
> +	uint32_t rx_size;
> +	struct ast2600_udma_chan *uc = to_ast2600_udma_chan(chan);
> +	struct ast2600_udma_desc *ud = &uc->ud;
> +	struct ast2600_udma *udma = uc->udma;
> +
> +	if (uc->is_tx) {
> +		r_pr = UDMA_TX_PTR_RST;
> +		r_is = UDMA_TX_INT_STS;
> +		r_ie = UDMA_TX_INT_EN;
> +		r_en = UDMA_TX_EN;
> +	} else {
> +		r_pr = UDMA_RX_PTR_RST;
> +		r_is = UDMA_RX_INT_STS;
> +		r_ie = UDMA_RX_INT_EN;
> +		r_en = UDMA_RX_EN;
> +	}
> +
> +	spin_lock_irqsave(&udma->lock, flags);
> +
> +	/* reset channel HW read/write pointer */
> +	writel(BIT(ch_bit), udma->regs + r_pr);
> +	writel(0, udma->regs + r_pr);
> +
> +	/* clear interrupt status */
> +	writel(BIT(ch_bit), udma->regs + r_is);
> +
> +	/* set transfer address & size */
> +	if (uc->is_tx) {
> +		writel(ud->addr, udma->regs + UDMA_CH_ADDR(ch_id));
> +		writel(ud->size, udma->regs + UDMA_CH_WPTR(ch_id));
> +		writel(UDMA_BUFSZ_64KB, udma->regs + UDMA_CH_CTRL(ch_id));
> +	} else {
> +		/*
> +		 * UDMA is limited to 4B-aligned addresses.
> +		 * Thus use local 4B-aligned buffer to get
> +		 * RX data and copy to the real destination
> +		 * after then.
> +		 */
> +		rx_addr = (ud->addr & 0x3) ? uc->buf_addr : ud->addr;

!IS_ALIGNED()

> +		rx_size = FIELD_GET(UDMA_CH_CTRL_BUFSZ, ud->size);
> +		writel(rx_addr, udma->regs + UDMA_CH_ADDR(ch_id));
> +		writel(rx_size, udma->regs + UDMA_CH_CTRL(ch_id));
> +	}
> +
> +	/* enable interrupt */
> +	reg = readl(udma->regs + r_ie) | BIT(ch_bit);

Usually, in this kind of constructs, the logic is put on its own line 
between readl and writel.

> +	writel(reg, udma->regs + r_ie);
> +
> +	/* start DMA */
> +	reg = readl(udma->regs + r_en) | BIT(ch_bit);
> +	writel(reg, udma->regs + r_en);
> +
> +	spin_unlock_irqrestore(&udma->lock, flags);
> +}
> +
> +static enum dma_status ast2600_udma_tx_status(struct dma_chan *chan,
> +		dma_cookie_t cookie, struct dma_tx_state *txstate)
> +{
> +	struct ast2600_udma_chan *uc = to_ast2600_udma_chan(chan);
> +	enum dma_status sts = dma_cookie_status(chan, cookie, txstate);
> +
> +	dma_set_residue(txstate, uc->residue);
> +
> +	return sts;
> +}
> +
> +static int ast2600_udma_pause(struct dma_chan *chan)
> +{
> +	unsigned long flags;
> +	uint32_t r_en, r_ie, reg;
> +	uint32_t ch_id = chan->chan_id;
> +	uint32_t ch_bit = ch_id / 2;
> +	struct ast2600_udma_chan *uc = to_ast2600_udma_chan(chan);
> +	struct ast2600_udma *udma = uc->udma;
> +
> +	if (uc->is_tx) {
> +		r_en = UDMA_TX_EN;
> +		r_ie = UDMA_TX_INT_EN;
> +	} else {
> +		r_en = UDMA_RX_EN;
> +		r_ie = UDMA_RX_INT_EN;
> +	}
> +
> +	spin_lock_irqsave(&udma->lock, flags);
> +
> +	reg = readl(udma->regs + r_en) & ~BIT(ch_bit);
> +	writel(reg, udma->regs + r_en);
> +
> +	reg = readl(udma->regs + r_ie) & ~BIT(ch_bit);
> +	writel(reg, udma->regs + r_ie);
> +
> +	spin_unlock_irqrestore(&udma->lock, flags);
> +
> +	return 0;
> +}
> +
> +static int ast2600_udma_resume(struct dma_chan *chan)
> +{
> +	unsigned long flags;
> +	uint32_t r_en, r_ie, reg;
> +	uint32_t ch_id = chan->chan_id;
> +	uint32_t ch_bit = ch_id / 2;
> +	struct ast2600_udma_chan *uc = to_ast2600_udma_chan(chan);
> +	struct ast2600_udma *udma = uc->udma;
> +
> +	if (uc->is_tx) {
> +		r_en = UDMA_TX_EN;
> +		r_ie = UDMA_TX_INT_EN;
> +	} else {
> +		r_en = UDMA_RX_EN;
> +		r_ie = UDMA_RX_INT_EN;
> +	}
> +
> +	spin_lock_irqsave(&udma->lock, flags);
> +
> +	reg = readl(udma->regs + r_en) | BIT(ch_bit);
> +	writel(reg, udma->regs + r_en);
> +
> +	reg = readl(udma->regs + r_ie) | BIT(ch_bit);
> +	writel(reg, udma->regs + r_ie);
> +
> +	spin_unlock_irqrestore(&udma->lock, flags);
> +
> +	return 0;
> +}
> +
> +static int ast2600_udma_terminate(struct dma_chan *chan)
> +{
> +	unsigned long flags;
> +	uint32_t r_pr, r_is, r_ie, r_en, reg;
> +	uint32_t ch_id = chan->chan_id;
> +	uint32_t ch_bit = ch_id / 2;
> +	struct ast2600_udma_chan *uc = to_ast2600_udma_chan(chan);
> +	struct ast2600_udma *udma = uc->udma;
> +
> +	if (uc->is_tx) {
> +		r_pr = UDMA_TX_PTR_RST;
> +		r_is = UDMA_TX_INT_STS;
> +		r_ie = UDMA_TX_INT_EN;
> +		r_en = UDMA_TX_EN;
> +	} else {
> +		r_pr = UDMA_RX_PTR_RST;
> +		r_is = UDMA_RX_INT_STS;
> +		r_ie = UDMA_RX_INT_EN;
> +		r_en = UDMA_RX_EN;
> +	}
> +
> +	spin_lock_irqsave(&udma->lock, flags);
> +
> +	/* disable DMA */
> +	reg = readl(udma->regs + r_en) & ~BIT(ch_bit);
> +	writel(reg, udma->regs + r_en);
> +
> +	/* disable interrupt */
> +	reg = readl(udma->regs + r_ie) & ~BIT(ch_bit);
> +	writel(reg, udma->regs + r_ie);
> +
> +	/* clear interrupt status */
> +	writel(BIT(ch_bit), udma->regs + r_is);
> +
> +	/* reset channel HW read/write pointer */
> +	writel(BIT(ch_bit), udma->regs + r_pr);
> +	writel(0, udma->regs + r_pr);
> +
> +	spin_unlock_irqrestore(&udma->lock, flags);
> +
> +	return 0;
> +}
> +
> +static irqreturn_t ast2600_udma_isr(int irq, void *arg)
> +{
> +	struct ast2600_udma *udma = arg;
> +	struct ast2600_udma_chan *uc;
> +	struct ast2600_udma_desc *ud;
> +	struct dma_async_tx_descriptor *tx;
> +	uint32_t sts, rptr, wptr;
> +	uint32_t ch_id, ch_bit;
> +	phys_addr_t pa;
> +	void *va;
> +
> +	/* handle TX interrupt */
> +	sts = readl(udma->regs + UDMA_TX_INT_STS);
> +	for_each_set_bit(ch_bit, (unsigned long *)&sts, (udma->n_ucs / 2)) {

Why not make sts unsigned long to avoid the cast?

Unnecessary parenthesis.

> +		ch_id = ch_bit << 1;
> +		rptr = readl(udma->regs + UDMA_CH_RPTR(ch_id));
> +		wptr = readl(udma->regs + UDMA_CH_WPTR(ch_id));
> +
> +		uc = &udma->ucs[ch_id];
> +		uc->residue = wptr - rptr;
> +
> +		ast2600_udma_terminate(&uc->chan);
> +
> +		tx = &uc->ud.tx;
> +		dma_cookie_complete(tx);
> +		dma_descriptor_unmap(tx);
> +		dmaengine_desc_get_callback_invoke(tx, NULL);
> +	}
> +
> +	/* handle RX interrupt */
> +	sts = readl(udma->regs + UDMA_RX_INT_STS);
> +	for_each_set_bit(ch_bit, (unsigned long *)&sts, udma->n_ucs / 2) {
> +		ch_id = (ch_bit << 1) + 1;
> +		wptr = readl(udma->regs + UDMA_CH_WPTR(ch_id));
> +
> +		uc = &udma->ucs[ch_id];
> +		ud = &uc->ud;
> +		tx = &ud->tx;
> +
> +		uc->residue = (ud->size & ~UDMA_CH_CTRL_BUFSZ) - wptr;
> +
> +		/* handle non-4B-aligned case */
> +		if (ud->addr & 0x3) {

!IS_ALIGNED()

> +			pa = dma_to_phys(uc->chan.device->dev, ud->addr);
> +			if (pa != (phys_addr_t)-1) {
> +				va = phys_to_virt(pa);
> +				memcpy(va, uc->buf, wptr);
> +			}
> +		}
> +
> +		ast2600_udma_terminate(&uc->chan);
> +
> +		dma_cookie_complete(tx);
> +		dma_descriptor_unmap(tx);
> +		dmaengine_desc_get_callback_invoke(tx, NULL);
> +	}
> +
> +	return IRQ_HANDLED;
> +}
> +
> +static int ast2600_udma_probe(struct platform_device *pdev)
> +{
> +	int i, rc;
> +	struct resource *res;
> +	struct ast2600_udma *udma;
> +	struct device *dev = &pdev->dev;
> +
> +	udma = devm_kzalloc(dev, sizeof(*udma), GFP_KERNEL);
> +	if (!udma)
> +		return -ENOMEM;
> +
> +	dma_cap_set(DMA_SLAVE, udma->ddev.cap_mask);
> +	udma->ddev.device_alloc_chan_resources = ast2600_udma_alloc_chan_resources;
> +	udma->ddev.device_prep_slave_sg = ast2600_udma_prep_slave_sg;
> +	udma->ddev.device_issue_pending = ast2600_udma_issue_pending;
> +	udma->ddev.device_tx_status = ast2600_udma_tx_status;
> +	udma->ddev.device_pause = ast2600_udma_pause;
> +	udma->ddev.device_resume = ast2600_udma_resume;
> +	udma->ddev.device_terminate_all = ast2600_udma_terminate;
> +	udma->ddev.src_addr_widths = BIT(DMA_SLAVE_BUSWIDTH_1_BYTE);
> +	udma->ddev.dst_addr_widths = BIT(DMA_SLAVE_BUSWIDTH_1_BYTE);
> +	udma->ddev.directions = BIT(DMA_DEV_TO_MEM) | BIT(DMA_MEM_TO_DEV);
> +	udma->ddev.residue_granularity = DMA_RESIDUE_GRANULARITY_BURST;
> +	udma->ddev.dev = dev;
> +	INIT_LIST_HEAD(&udma->ddev.channels);
> +
> +	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +	if (IS_ERR(res)) {
> +		dev_err(dev, "cannot get IO resource\n");
> +		return -ENODEV;
> +	}
> +
> +	udma->regs = devm_ioremap_resource(dev, res);
> +	if (IS_ERR(udma->regs)) {
> +		dev_err(dev, "cannot map IO registers\n");
> +		return PTR_ERR(udma->regs);
> +	}
> +
> +	/* timeout value: 0x200 * (PCLK * 14400) */
> +	writel(0x200, udma->regs + UDMA_TMOUT);
> +
> +	/* disable all for safety */
> +	writel(0x0, udma->regs + UDMA_TX_EN);
> +	writel(0x0, udma->regs + UDMA_RX_EN);
> +
> +	udma->irq = platform_get_irq(pdev, 0);
> +	if (udma->irq < 0)
> +		return udma->irq;
> +
> +	rc = devm_request_irq(&pdev->dev, udma->irq, ast2600_udma_isr,
> +			      IRQF_SHARED, DEVICE_NAME, udma);
> +	if (rc) {
> +		dev_err(dev, "cannot request IRQ\n");
> +		return rc;
> +	}
> +
> +	rc = of_property_read_u32(dev->of_node, "dma-channels", &udma->n_ucs);
> +	if (rc) {
> +		dev_err(dev, "cannot find number of channels\n");
> +		return rc;
> +	}
> +
> +	udma->ucs = devm_kzalloc(dev,
> +				 sizeof(struct ast2600_udma_chan) * udma->n_ucs, GFP_KERNEL);
> +	if (!udma->ucs)
> +		return -ENOMEM;
> +
> +	for (i = 0; i < udma->n_ucs; ++i) {
> +		udma->ucs[i].is_tx = !(i % 2);

& 1 would make more sense I think.

> +		udma->ucs[i].chan.device = &udma->ddev;
> +		udma->ucs[i].buf = dmam_alloc_coherent(dev, UDMA_MAX_BUFSZ,
> +						       &udma->ucs[i].buf_addr, GFP_KERNEL);
> +		if (!udma->ucs[i].buf)
> +			return -ENOMEM;
> +
> +		udma->ucs[i].udma = udma;
> +		list_add_tail(&udma->ucs[i].chan.device_node, &udma->ddev.channels);
> +	}
> +
> +	rc = dma_async_device_register(&udma->ddev);
> +	if (rc)
> +		return rc;
> +
> +	rc = of_dma_controller_register(dev->of_node, of_dma_xlate_by_chan_id, &udma->ddev);
> +	if (rc)
> +		return rc;
> +
> +	spin_lock_init(&udma->lock);
> +
> +	platform_set_drvdata(pdev, udma);
> +
> +	dev_info(dev, "module loaded\n");

Don't print anything when there's no error.

> +	return 0;
> +}
> +
> +static int ast2600_udma_remove(struct platform_device *pdev)
> +{
> +	struct device *dev = &pdev->dev;
> +	struct ast2600_udma *udma = platform_get_drvdata(pdev);
> +
> +	of_dma_controller_free(dev->of_node);
> +	dma_async_device_unregister(&udma->ddev);
> +
> +	dev_info(dev, "module removed\n");

Ditto.

> +
> +	return 0;
> +}
> +
> +static const struct of_device_id ast2600_udma_match[] = {
> +	{ .compatible = "aspeed,ast2600-udma" },
> +	{ },
> +};
> +
> +static struct platform_driver ast2600_udma_driver = {
> +	.probe = ast2600_udma_probe,
> +	.remove = ast2600_udma_remove,
> +	.driver = {
> +			.name = DEVICE_NAME,
> +			.of_match_table = ast2600_udma_match,
> +	},
> +};
> +
> +module_platform_driver(ast2600_udma_driver);
> +MODULE_LICENSE("GPL");
> +MODULE_AUTHOR("Chia-Wei Wang <chiawei_wang@aspeedtech.com");
>
ChiaWei Wang March 22, 2023, 8:34 a.m. UTC | #5
Will revise the coding style as suggested below in the v4 revision.

Thanks,
Chiawei

> From: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> Sent: Monday, March 20, 2023 7:14 PM
> 
> On Mon, 20 Mar 2023, Chia-Wei Wang wrote:
> 
> > Aspeed AST2600 UART DMA (UDMA) includes 28 channels for the DMA
> > transmission and recevie of each UART devices.
> >
> > Signed-off-by: Chia-Wei Wang <chiawei_wang@aspeedtech.com>
> > ---
> > diff --git a/drivers/dma/ast2600-udma.c b/drivers/dma/ast2600-udma.c
> > new file mode 100644 index 000000000000..39117b26996d
> > --- /dev/null
> > +++ b/drivers/dma/ast2600-udma.c
> > @@ -0,0 +1,534 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (C) ASPEED Technology Inc.
> > + */
> > +#include <linux/delay.h>
> > +#include <linux/bitfield.h>
> > +#include <linux/interrupt.h>
> > +#include <linux/dmaengine.h>
> > +#include <linux/dma-mapping.h>
> > +#include <linux/platform_device.h>
> > +#include <linux/module.h>
> > +#include <linux/of.h>
> > +#include <linux/of_dma.h>
> > +#include <linux/dma-direct.h>
> > +#include "dmaengine.h"
> > +
> > +#define DEVICE_NAME	"aspeed-udma"
> > +
> > +/* register offset */
> > +#define UDMA_TX_EN		0x000
> > +#define UDMA_RX_EN		0x004
> > +#define UDMA_TMOUT		0x00c
> > +#define UDMA_TX_PTR_RST		0x020
> > +#define UDMA_RX_PTR_RST		0x024
> > +#define UDMA_TX_INT_EN		0x030
> > +#define UDMA_TX_INT_STS		0x034
> > +#define UDMA_RX_INT_EN		0x038
> > +#define UDMA_RX_INT_STS		0x03c
> > +
> > +#define UDMA_CH_OFFS(x)		((x) * 0x10)
> > +#define UDMA_CH_RPTR(x)		(0x040 + UDMA_CH_OFFS(x))
> > +#define UDMA_CH_WPTR(x)		(0x044 + UDMA_CH_OFFS(x))
> > +#define UDMA_CH_ADDR(x)		(0x048 + UDMA_CH_OFFS(x))
> > +#define UDMA_CH_CTRL(x)		(0x04c + UDMA_CH_OFFS(x))
> > +#define   UDMA_CH_CTRL_BUFSZ	GENMASK(1, 0)
> > +
> > +#define UDMA_MAX_BUFSZ	(0x10000)
> 
> Unnecessary parenthesis.
> 
> > +#define UDMA_MAX_TXSZ	(UDMA_MAX_BUFSZ - 1)
> > +
> > +enum ast2600_udma_bufsz {
> > +	UDMA_BUFSZ_1KB,
> > +	UDMA_BUFSZ_4KB,
> > +	UDMA_BUFSZ_16KB,
> > +	UDMA_BUFSZ_64KB,
> > +};
> > +
> > +struct ast2600_udma_desc {
> > +	struct dma_async_tx_descriptor tx;
> > +	dma_addr_t addr;
> > +	unsigned int size;
> > +};
> > +
> > +struct ast2600_udma_chan {
> > +	struct dma_chan chan;
> > +	struct ast2600_udma_desc ud;
> > +	struct ast2600_udma *udma;
> > +	uint32_t residue;
> > +
> > +	/* 4B-aligned local buffer for workaround */
> > +	uint8_t *buf;
> > +	dma_addr_t buf_addr;
> > +
> > +	bool is_tx;
> > +};
> > +
> > +struct ast2600_udma {
> > +	struct dma_device ddev;
> > +	uint8_t __iomem *regs;
> > +	int irq;
> > +	struct ast2600_udma_chan *ucs;
> > +	uint32_t n_ucs;
> > +	spinlock_t lock;
> > +};
> > +
> > +static struct ast2600_udma_chan *to_ast2600_udma_chan(struct dma_chan *chan)
> > +{
> > +	return container_of(chan, struct ast2600_udma_chan, chan);
> > +}
> > +
> > +static int ast2600_udma_alloc_chan_resources(struct dma_chan *chan)
> > +{
> > +	dma_cookie_init(chan);
> > +
> > +	return 0;
> > +}
> > +
> > +static dma_cookie_t ast2600_udma_tx_submit(struct dma_async_tx_descriptor *tx)
> > +{
> > +	return dma_cookie_assign(tx);
> > +}
> > +
> > +/* consider only 8250_dma using dmaengine_prep_slave_single() */
> > +static struct dma_async_tx_descriptor *ast2600_udma_prep_slave_sg(struct dma_chan *chan,
> > +								  struct scatterlist *sgl,
> > +								  unsigned int sg_len,
> > +								  enum dma_transfer_direction dir,
> > +								  unsigned long tx_flags,
> > +								  void *context)
> > +{
> > +	struct device *dev = chan->device->dev;
> > +	struct ast2600_udma_chan *uc = to_ast2600_udma_chan(chan);
> > +	struct ast2600_udma_desc *ud = &uc->ud;
> > +	phys_addr_t pa;
> > +	void *va;
> > +
> > +	if (!is_slave_direction(dir)) {
> > +		dev_err(dev, "direction is not slave mode\n");
> > +		return NULL;
> > +	}
> > +
> > +	if (sg_len != 1) {
> 
> > 1
> 
> 
> --
>  i.
ChiaWei Wang March 22, 2023, 8:34 a.m. UTC | #6
> From: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> Sent: Monday, March 20, 2023 6:49 PM
> 
> On Mon, 20 Mar 2023, ChiaWei Wang wrote:
> 
> > > From: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> > > Sent: Monday, March 20, 2023 5:43 PM
> > >
> > > On Mon, 20 Mar 2023, Chia-Wei Wang wrote:
> > >
> > > > Add new UART driver with DMA support for Aspeed AST2600 SoCs.
> > > > The drivers mainly prepare the dma instance based on the 8250_dma
> > > > implementation to leverage the AST2600 UART DMA (UDMA) engine.
> > > >
> > > > Signed-off-by: Chia-Wei Wang <chiawei_wang@aspeedtech.com>
> > > > ---
> > > >  drivers/tty/serial/8250/8250_aspeed.c | 224
> > > ++++++++++++++++++++++++++
> > > >  drivers/tty/serial/8250/Kconfig       |   8 +
> > > >  drivers/tty/serial/8250/Makefile      |   1 +
> > > >  3 files changed, 233 insertions(+)  create mode 100644
> > > > drivers/tty/serial/8250/8250_aspeed.c
> > > >
> > > > diff --git a/drivers/tty/serial/8250/8250_aspeed.c
> > > > b/drivers/tty/serial/8250/8250_aspeed.c
> > > > new file mode 100644
> > > > index 000000000000..04d0bf6fba28
> > > > --- /dev/null
> > > > +++ b/drivers/tty/serial/8250/8250_aspeed.c
> > > > @@ -0,0 +1,224 @@
> > > > +// SPDX-License-Identifier: GPL-2.0
> > > > +/*
> > > > + * Copyright (C) ASPEED Technology Inc.
> > > > + */
> > > > +#include <linux/device.h>
> > > > +#include <linux/io.h>
> > > > +#include <linux/module.h>
> > > > +#include <linux/serial_8250.h>
> > > > +#include <linux/serial_reg.h>
> > > > +#include <linux/of.h>
> > > > +#include <linux/of_irq.h>
> > > > +#include <linux/of_platform.h>
> > > > +#include <linux/platform_device.h>
> > > > +#include <linux/clk.h>
> > > > +#include <linux/reset.h>
> > > > +#include <linux/dma-mapping.h>
> > > > +#include <linux/circ_buf.h>
> > > > +#include <linux/tty_flip.h>
> > > > +#include <linux/pm_runtime.h>
> > > > +
> > > > +#include "8250.h"
> > > > +
> > > > +#define DEVICE_NAME "aspeed-uart"
> > > > +
> > > > +struct ast8250_data {
> > > > +	int line;
> > > > +	int irq;
> > > > +	u8 __iomem *regs;
> > > > +	struct reset_control *rst;
> > > > +	struct clk *clk;
> > > > +#ifdef CONFIG_SERIAL_8250_DMA
> > > > +	struct uart_8250_dma dma;
> > > > +#endif
> > > > +};
> > > > +
> > > > +#ifdef CONFIG_SERIAL_8250_DMA
> > > > +static int ast8250_rx_dma(struct uart_8250_port *p);
> > > > +
> > > > +static void ast8250_rx_dma_complete(void *param)
> > > > +{
> > > > +	struct uart_8250_port *p = param;
> > > > +	struct uart_8250_dma *dma = p->dma;
> > > > +	struct tty_port *tty_port = &p->port.state->port;
> > > > +	struct dma_tx_state	state;
> > > > +	int	count;
> > > > +
> > > > +	dmaengine_tx_status(dma->rxchan, dma->rx_cookie, &state);
> > > > +
> > > > +	count = dma->rx_size - state.residue;
> > > > +
> > > > +	tty_insert_flip_string(tty_port, dma->rx_buf, count);
> > > > +	p->port.icount.rx += count;
> > > > +
> > > > +	tty_flip_buffer_push(tty_port);
> > > > +
> > > > +	ast8250_rx_dma(p);
> > > > +}
> > > > +
> > > > +static int ast8250_rx_dma(struct uart_8250_port *p)
> > > > +{
> > > > +	struct uart_8250_dma *dma = p->dma;
> > > > +	struct dma_async_tx_descriptor *tx;
> > > > +
> > > > +	tx = dmaengine_prep_slave_single(dma->rxchan, dma->rx_addr,
> > > > +					 dma->rx_size, DMA_DEV_TO_MEM,
> > > > +					 DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
> > > > +	if (!tx)
> > > > +		return -EBUSY;
> > >
> > > How does the DMA Rx "loop" restart when this is taken?
> >
> > The loop re-starts from ast8250_startup.
> 
> Why would startup get called again?

Sorry for the misleading answer; it relies on handle_rx_dma to restart the loop.

If UDMA suddenly shuts down, RX data stays in the UART FIFO and thus triggers an interrupt.
The ISR call chain, serial8250_handle_irq -> handle_rx_dma, will then restart the UDMA loop.

However, before UDMA is restarted, there is a window that suffers the same overrun issue mentioned before.
I think making the default code support "RX DMA AO (Always On)" would avoid producing more duplicated code with only small customizations.

> 
> > > > +	tx->callback = ast8250_rx_dma_complete;
> > > > +	tx->callback_param = p;
> > > > +
> > > > +	dma->rx_cookie = dmaengine_submit(tx);
> > > > +
> > > > +	dma_async_issue_pending(dma->rxchan);
> > > > +
> > > > +	return 0;
> > > > +}
> > > > +#endif
> > >
> > > These 2 functions look very similar to what 8250_dma offers for you.
> > > The only difference I could see is that always start DMA Rx thing
> > > which could be handled by adding some capability flag into
> > > uart_8250_dma for those UARTs that can launch DMA Rx while Rx queue is
> empty.
> > >
> > > So, just use the standard 8250_dma functions and make the small
> > > capabilities flag tweak there.
> > >
> > > By using the stock functions you also avoid 8250_dma Rx and your DMA
> > > Rx racing like they currently would (8250_port assigns the functions
> > > from 8250_dma when you don't specify the rx handler and the default
> > > 8250 irq handler will call into those standard 8250 DMA functions).
> >
> > Yes for the difference described.
> >
> > Our customers usually use UDMA for file-transmissions over UART.
> > And I found the preceding bytes will get lost easily due to the late
> > start of DMA engine.
> >
> > In fact, I was seeking the default implementation to always start RX
> > DMA instead of enabling it upon DR bit rising. But no luck and thus
> > add ast8250_rx_dma. (The default 8250 ISR also called into
> > up->dma->rx_dma)
> >
> > If adding a new capability flag is the better way to go, I will try to
> > implement in that way for further review.
> 
> Yes it would be much better.
> 
> Add unsigned int capabilities into uart_8250_dma and put the necessary
> checks + code into general code. Don't add any #ifdef
> CONFIG_SERIAL_8250_DMA into 8250_port.c nor 8250_dma.c. Instead, if you
> feel a need for one, use the #ifdef ... #else ... #endif in 8250.h to provide an
> empty static inline function for the #else case.
> 

Got it. I will get the patch prepared for your review in the v4 revision.

> > > I'm curious about this HW and how it behaves under these two scenarios:
> > > - When Rx is empty, does UART/DMA just sit there waiting forever?
> >
> > Yes.
> 
> Okay.
> 
> > > - When a stream of incoming Rx characters suddenly ends, how does
> > > UART/DMA
> > >   react? ...On 8250 UARTs I'm familiar with this triggers
> UART_IIR_TIMEOUT
> > >   which you don't seem to handle.
> >
> > UDMA also has a timeout control.
> > If the data suddenly ends and timeout occurs, UDMA will trigger an
> interrupt.
> > UDMA ISR then check if there is data available using DMA read/write
> > pointers and invokes callback if any.
> 
> Okay. And the UART side won't trigger any interrupts?

The UART side still triggers interrupts.

However, for RX data, UDMA keeps polling the DR bit and reads the data out of the UART FIFO.
Thus, when the serial8250_handle_irq ISR checks IIR, it sees no interrupt pending, as the data has already been taken.
The ISR then returns if there are no other events.

> 
> > > When you provide answer to those two questions, I can try to help
> > > you further on how to integrate into the standard 8250 DMA code.
> >
> > Thanks!
> > It would be great using the default one to avoid mostly duplicated code.
> 
> You need to take a look into handle_rx_dma() what to do there. Probably just
> call to ->rx_dma() unconditionally to prevent UART interrupts from messing up
> with DMA Rx. This restart for DMA Rx is just for backup if the DMA Rx "loop"
> stopped due to an error.

Yes. A new if-condition to support "RX DMA AO" will be needed.

Thanks for the review and feedback.
I will move on to the v4 revision with the default-code modification.

Regards,
Chiawei