mbox series

[0/4] drivers: misc: Add support for TMR Manager and Inject IPs

Message ID 20220628054338.1631516-1-appana.durga.rao@xilinx.com
Headers show
Series drivers: misc: Add support for TMR Manager and Inject IPs | expand

Message

Appana Durga Kedareswara rao June 28, 2022, 5:43 a.m. UTC
This patch series adds support for TMR Manager and TMR Inject IPs.
The TMR Manager is responsible for handling the TMR subsystem state,
including fault detection and error recovery. The core is triplicated
in each of the sub-blocks in the TMR subsystem, and provides majority
voting of its internal state.
The TMR Inject core provides functional fault injection by changing
selected MicroBlaze instructions, which provides the possibility to
verify that the TMR subsystem error detection and fault recovery logic
is working properly.

For more details about TMR Manager and Inject IPs please refer PG268[1].
[1]: https://docs.xilinx.com/r/en-US/pg268-tmr/Triple-Modular-Redundancy-TMR-v1.0-LogiCORE-IP-Product-Guide-PG268

This patch series has a dependency on the patch series
https://www.spinics.net/lists/arm-kernel/msg991887.html

Appana Durga Kedareswara rao (4):
  dt-bindings: misc: tmr-manager: Add device-tree binding for TMR
    Manager
  drivers: misc: Add Support for TMR Manager
  dt-bindings: misc: tmr-inject: Add device-tree binding for TMR Inject
  drivers: misc: Add Support for TMR Inject IP

 .../testing/sysfs-driver-xilinx-tmr-inject    |  12 +
 .../testing/sysfs-driver-xilinx-tmr-manager   |  18 ++
 .../bindings/misc/xlnx,tmr-inject.yaml        |  47 +++
 .../bindings/misc/xlnx,tmr-manager.yaml       |  48 +++
 MAINTAINERS                                   |  14 +
 drivers/misc/Kconfig                          |  20 ++
 drivers/misc/Makefile                         |   2 +
 drivers/misc/xilinx_tmr_inject.c              | 201 ++++++++++++
 drivers/misc/xilinx_tmr_manager.c             | 288 ++++++++++++++++++
 9 files changed, 650 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-driver-xilinx-tmr-inject
 create mode 100644 Documentation/ABI/testing/sysfs-driver-xilinx-tmr-manager
 create mode 100644 Documentation/devicetree/bindings/misc/xlnx,tmr-inject.yaml
 create mode 100644 Documentation/devicetree/bindings/misc/xlnx,tmr-manager.yaml
 create mode 100644 drivers/misc/xilinx_tmr_inject.c
 create mode 100644 drivers/misc/xilinx_tmr_manager.c

Comments

Greg KH July 1, 2022, 8:19 a.m. UTC | #1
On Tue, Jun 28, 2022 at 11:13:36AM +0530, Appana Durga Kedareswara rao wrote:
> The Triple Modular Redundancy(TMR) Manager is responsible for handling
> the TMR subsystem state, including fault detection and error recovery
> provides soft error detection, correction and recovery features.
> 
> This commit adds support for fault detection feature via
> sysfs interface.
> 
> Usage:
> To know the hardware status:
> cat /sys/devices/platform/amba_pl/44a10000.tmr_manager/status
> To know the break handler count(Error count):
> cat /sys/devices/platform/amba_pl/44a10000.tmr_manager/errcnt
> 
> Signed-off-by: Appana Durga Kedareswara rao <appana.durga.rao@xilinx.com>
> ---
>  .../testing/sysfs-driver-xilinx-tmr-manager   |  18 ++
>  MAINTAINERS                                   |   7 +
>  drivers/misc/Kconfig                          |  10 +
>  drivers/misc/Makefile                         |   1 +
>  drivers/misc/xilinx_tmr_manager.c             | 288 ++++++++++++++++++
>  5 files changed, 324 insertions(+)
>  create mode 100644 Documentation/ABI/testing/sysfs-driver-xilinx-tmr-manager
>  create mode 100644 drivers/misc/xilinx_tmr_manager.c
> 
> diff --git a/Documentation/ABI/testing/sysfs-driver-xilinx-tmr-manager b/Documentation/ABI/testing/sysfs-driver-xilinx-tmr-manager
> new file mode 100644
> index 000000000000..c2f615126201
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-driver-xilinx-tmr-manager
> @@ -0,0 +1,18 @@
> +What:		/sys/devices/platform/amba_pl/<dev>/status
> +Date:		June 2022
> +Contact:	appana.durga.rao@xilinx.com
> +Description:	This control file provides the status of the tmr manager
> +		useful for getting the status of fault.
> +		This file cannot be written.

Examples of what the data in this file will be?

> +
> +What:		/sys/devices/platform/amba_pl/<dev>/errcnt
> +Date:		June 2022
> +Contact:	appana.durga.rao@xilinx.com
> +Description:	This control file provides the fault detection count.
> +		This file cannot be written.

Again, example output?



> +
> +What:		/sys/devices/platform/amba_pl/<dev>/dis_block_break
> +Date:		June 2022
> +Contact:	appana.durga.rao@xilinx.com
> +Description:	This control file enables the break signal.
> +		This file is write only.
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 6cc825857722..3d29109788ed 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -12962,6 +12962,13 @@ W:	http://www.monstr.eu/fdt/
>  T:	git git://git.monstr.eu/linux-2.6-microblaze.git
>  F:	arch/microblaze/
>  
> +MICROBLAZE TMR MANAGER
> +M:	Appana Durga Kedareswara rao <appana.durga.rao@xilinx.com>
> +S:	Supported
> +F:	Documentation/ABI/testing/sysfs-driver-xilinx-tmr-manager
> +F:	Documentation/devicetree/bindings/misc/xlnx,tmr-manager.yaml
> +F:	drivers/misc/xilinx_tmr_manager.c
> +
>  MICROCHIP AT91 DMA DRIVERS
>  M:	Ludovic Desroches <ludovic.desroches@microchip.com>
>  M:	Tudor Ambarus <tudor.ambarus@microchip.com>
> diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
> index 41d2bb0ae23a..555ae2e33b91 100644
> --- a/drivers/misc/Kconfig
> +++ b/drivers/misc/Kconfig
> @@ -483,6 +483,16 @@ config OPEN_DICE
>  
>  	  If unsure, say N.
>  
> +config TMR_MANAGER
> +	bool "Select TMR Manager"
> +	depends on MICROBLAZE && MB_MANAGER
> +	help
> +	  This option enables the driver developed for TMR Manager. The Triple
> +	  Modular Redundancy(TMR) manager provides support for fault detection
> +	  via sysfs interface.
> +
> +	  Say N here unless you know what you are doing.
> +
>  source "drivers/misc/c2port/Kconfig"
>  source "drivers/misc/eeprom/Kconfig"
>  source "drivers/misc/cb710/Kconfig"
> diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
> index 70e800e9127f..28b9803f909b 100644
> --- a/drivers/misc/Makefile
> +++ b/drivers/misc/Makefile
> @@ -60,3 +60,4 @@ obj-$(CONFIG_XILINX_SDFEC)	+= xilinx_sdfec.o
>  obj-$(CONFIG_HISI_HIKEY_USB)	+= hisi_hikey_usb.o
>  obj-$(CONFIG_HI6421V600_IRQ)	+= hi6421v600-irq.o
>  obj-$(CONFIG_OPEN_DICE)		+= open-dice.o
> +obj-$(CONFIG_TMR_MANAGER)	+= xilinx_tmr_manager.o
> diff --git a/drivers/misc/xilinx_tmr_manager.c b/drivers/misc/xilinx_tmr_manager.c
> new file mode 100644
> index 000000000000..5a4562e4a9d4
> --- /dev/null
> +++ b/drivers/misc/xilinx_tmr_manager.c
> @@ -0,0 +1,288 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Xilinx TMR Subsystem.
> + *
> + * Copyright (C) 2022 Xilinx, Inc.
> + *
> + * Description:
> + * This driver is developed for TMR Manager,The Triple Modular Redundancy(TMR)
> + * Manager is responsible for handling the TMR subsystem state, including
> + * fault detection and error recovery. The core is triplicated in each of
> + * the sub-blocks in the TMR subsystem, and provides majority voting of
> + * its internal state provides soft error detection, correction and
> + * recovery. Error detection feature is provided through sysfs
> + * entries which allow the user to observer the TMR microblaze
> + * status.
> + */
> +
> +#include <asm/xilinx_mb_manager.h>
> +#include <linux/module.h>
> +#include <linux/of_device.h>
> +
> +/* TMR Manager Register offsets */
> +#define XTMR_MANAGER_CR_OFFSET		0x0
> +#define XTMR_MANAGER_FFR_OFFSET		0x4
> +#define XTMR_MANAGER_CMR0_OFFSET	0x8
> +#define XTMR_MANAGER_CMR1_OFFSET	0xC
> +#define XTMR_MANAGER_BDIR_OFFSET	0x10
> +#define XTMR_MANAGER_SEMIMR_OFFSET	0x1C
> +
> +/* Register Bitmasks/shifts */
> +#define XTMR_MANAGER_CR_MAGIC1_MASK	GENMASK(7, 0)
> +#define XTMR_MANAGER_CR_MAGIC2_MASK	GENMASK(15, 8)
> +#define XTMR_MANAGER_CR_RIR_MASK	BIT(16)
> +#define XTMR_MANAGER_FFR_LM12_MASK	BIT(0)
> +#define XTMR_MANAGER_FFR_LM13_MASK	BIT(1)
> +#define XTMR_MANAGER_FFR_LM23_MASK	BIT(2)
> +
> +#define XTMR_MANAGER_CR_MAGIC2_SHIFT	4
> +#define XTMR_MANAGER_CR_RIR_SHIFT	16
> +#define XTMR_MANAGER_CR_BB_SHIFT	18
> +
> +#define XTMR_MANAGER_MAGIC1_MAX_VAL	255
> +
> +/**
> + * struct xtmr_manager_dev - Driver data for TMR Manager
> + * @regs: device physical base address
> + * @dev: pointer to device struct
> + * @cr_val: control register value
> + * @magic1: Magic 1 hardware configuration value
> + * @err_cnt: error statistics count
> + * @phys_baseaddr: Physical base address
> + */
> +struct xtmr_manager_dev {
> +	void __iomem *regs;
> +	struct device *dev;

Is this the parent of this device?  Why not make a new device?  The
interaction here isn't well defined, what controls the lifespan of this
structure?

> +	u32 cr_val;
> +	u32 magic1;
> +	u32 err_cnt;
> +	uintptr_t phys_baseaddr;

This should be resource_size_t, not uintptr_t, right?


> +};
> +
> +/* IO accessors */
> +static inline void xtmr_manager_write(struct xtmr_manager_dev *xtmr_manager,
> +				      u32 addr, u32 value)
> +{
> +	iowrite32(value, xtmr_manager->regs + addr);
> +}
> +
> +static inline u32 xtmr_manager_read(struct xtmr_manager_dev *xtmr_manager,
> +				    u32 addr)
> +{
> +	return ioread32(xtmr_manager->regs + addr);
> +}
> +
> +/**
> + * xtmr_manager_unblock_break - unblocks the break signal
> + * @xtmr_manager: Pointer to xtmr_manager_dev structure
> + */
> +static void xtmr_manager_unblock_break(struct xtmr_manager_dev *xtmr_manager)
> +{
> +	xtmr_manager->cr_val &= ~(1 << XTMR_MANAGER_CR_BB_SHIFT);
> +	xtmr_manager_write(xtmr_manager, XTMR_MANAGER_CR_OFFSET,
> +			   xtmr_manager->cr_val);
> +}
> +
> +/**
> + * xmb_manager_reset_handler - clears the ffr register contents
> + * @priv: Private pointer
> + */
> +static void xmb_manager_reset_handler(void *priv)
> +{
> +	struct xtmr_manager_dev *xtmr_manager = (struct xtmr_manager_dev *)priv;
> +	/* Clear the FFR Register contents as a part of recovery process. */
> +	xtmr_manager_write(xtmr_manager, XTMR_MANAGER_FFR_OFFSET, 0);

Did you run checkpatch on this code?  Please do so.

> +}
> +
> +/**

No need for kernel doc formatting for static functions, right?

> + * xmb_manager_update_errcnt - update the error inject count
> + * @priv: Private pointer
> + */
> +static void xmb_manager_update_errcnt(void *priv)

Use a real pointer, not a void one please.  We want type safe functions.

> +{
> +	struct xtmr_manager_dev *xtmr_manager = (struct xtmr_manager_dev *)priv;

And even if you did have a real pointer, you never need to cast from a
void *.

> +
> +	xtmr_manager->err_cnt++;
> +}
> +
> +static ssize_t errcnt_show(struct device *dev, struct device_attribute *attr,
> +			   char *buf)
> +{
> +	struct xtmr_manager_dev *xtmr_manager = dev_get_drvdata(dev);
> +
> +	return sprintf(buf, "%x\n", xtmr_manager->err_cnt);

sysfs_emit() here and everywhere please.

> +}
> +static DEVICE_ATTR_RO(errcnt);
> +
> +static ssize_t status_show(struct device *dev, struct device_attribute *attr,
> +			   char *buf)
> +{
> +	struct xtmr_manager_dev *xtmr_manager = dev_get_drvdata(dev);
> +	size_t ffr;
> +	int len = 0;
> +
> +	ffr = xtmr_manager_read(xtmr_manager, XTMR_MANAGER_FFR_OFFSET);
> +	if ((ffr & XTMR_MANAGER_FFR_LM12_MASK) == XTMR_MANAGER_FFR_LM12_MASK) {
> +		len += sprintf(buf + len,
> +			       "Lockstep mismatch between processor 1 and 2\n");
> +	}
> +
> +	if ((ffr & XTMR_MANAGER_FFR_LM13_MASK) == XTMR_MANAGER_FFR_LM13_MASK) {
> +		len += sprintf(buf + len,
> +			       "Lockstep mismatch between processor 1 and 3\n");
> +	}
> +
> +	if ((ffr & XTMR_MANAGER_FFR_LM23_MASK) == XTMR_MANAGER_FFR_LM23_MASK) {
> +		len += sprintf(buf + len,
> +			       "Lockstep mismatch between processor 2 and 3\n");

That feels like a lot more than "one bit of information", who is going
to parse this and do something with this information?


> +	}
> +
> +	return len;
> +}
> +static DEVICE_ATTR_RO(status);
> +
> +static ssize_t dis_block_break_store(struct device *dev,
> +				     struct device_attribute *attr,
> +				     const char *buf, size_t size)
> +{
> +	struct xtmr_manager_dev *xtmr_manager = dev_get_drvdata(dev);
> +	int ret;
> +	long value;
> +
> +	ret = kstrtoul(buf, 16, &value);
> +	if (ret)
> +		return ret;
> +
> +	if (value > 1)
> +		return -EINVAL;
> +
> +	xtmr_manager_unblock_break(xtmr_manager);

No error checking?

> +	return size;
> +}
> +static DEVICE_ATTR_WO(dis_block_break);
> +
> +static struct attribute *xtmr_manager_attrs[] = {
> +	&dev_attr_dis_block_break.attr,
> +	&dev_attr_status.attr,
> +	&dev_attr_errcnt.attr,
> +	NULL,
> +};
> +ATTRIBUTE_GROUPS(xtmr_manager);
> +
> +static void xtmr_manager_init(struct xtmr_manager_dev *xtmr_manager)
> +{
> +	/* Clear the SEM interrupt mask register to disable the interrupt */
> +	xtmr_manager_write(xtmr_manager, XTMR_MANAGER_SEMIMR_OFFSET, 0);
> +
> +	/* Allow recovery reset by default */
> +	xtmr_manager->cr_val = (1 << XTMR_MANAGER_CR_RIR_SHIFT) |
> +				xtmr_manager->magic1;
> +	xtmr_manager_write(xtmr_manager, XTMR_MANAGER_CR_OFFSET,
> +			   xtmr_manager->cr_val);
> +	/*
> +	 * Configure Break Delay Initialization Register to zero so that
> +	 * break occurs immediately
> +	 */
> +	xtmr_manager_write(xtmr_manager, XTMR_MANAGER_BDIR_OFFSET, 0);
> +
> +	/*
> +	 * To come out of break handler need to block the break signal
> +	 * in the tmr manager, update the xtmr_manager cr_val for the same
> +	 */
> +	xtmr_manager->cr_val |= (1 << XTMR_MANAGER_CR_BB_SHIFT);
> +
> +	/*
> +	 * When the break vector gets asserted because of error injection,
> +	 * the break signal must be blocked before exiting from the
> +	 * break handler, Below api updates the TMR manager address and
> +	 * control register and error counter callback arguments,
> +	 * which will be used by the break handler to block the
> +	 * break and call the callback function.
> +	 */
> +	xmb_manager_register(xtmr_manager->phys_baseaddr, xtmr_manager->cr_val,
> +			     xmb_manager_update_errcnt,
> +			     xtmr_manager, xmb_manager_reset_handler);

No error handling?

> +}
> +
> +/**
> + * xtmr_manager_probe - Driver probe function
> + * @pdev: Pointer to the platform_device structure
> + *
> + * This is the driver probe routine. It does all the memory
> + * allocation and creates sysfs entries for the device.
> + *
> + * Return: 0 on success and failure value on error
> + */
> +static int xtmr_manager_probe(struct platform_device *pdev)
> +{
> +	struct xtmr_manager_dev *xtmr_manager;
> +	struct resource *res;
> +	int err;
> +
> +	xtmr_manager = devm_kzalloc(&pdev->dev, sizeof(*xtmr_manager),
> +				    GFP_KERNEL);
> +	if (!xtmr_manager)
> +		return -ENOMEM;
> +
> +	xtmr_manager->dev = &pdev->dev;
> +
> +	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +	xtmr_manager->regs =  devm_ioremap_resource(xtmr_manager->dev, res);
> +	if (IS_ERR(xtmr_manager->regs))
> +		return PTR_ERR(xtmr_manager->regs);
> +
> +	xtmr_manager->phys_baseaddr = res->start;
> +
> +	err = of_property_read_u32(pdev->dev.of_node, "xlnx,magic1",
> +				   &xtmr_manager->magic1);
> +	if (err < 0) {
> +		dev_err(&pdev->dev, "unable to read xlnx,magic1 property");
> +		return err;
> +	} else if (xtmr_manager->magic1 > XTMR_MANAGER_MAGIC1_MAX_VAL) {

No need for else.

> +		dev_err(&pdev->dev, "invalid xlnx,magic1 property value");
> +		return err;

but err is not an error value here, right?

> +	}
> +
> +	/* Initialize TMR Manager */
> +	xtmr_manager_init(xtmr_manager);
> +
> +	err = sysfs_create_groups(&xtmr_manager->dev->kobj,
> +				  xtmr_manager_groups);

You just raced with userspace and lost.  NEVER deal with raw kobjects,
that's a huge sign that you are doing something wrong in a driver.

Please use the proper default groups pointer for your driver to do this
correctly so that the driver core will handle this for you.

> +	if (err < 0) {
> +		dev_err(&pdev->dev, "unable to create sysfs entries\n");
> +		return err;
> +	}
> +
> +	platform_set_drvdata(pdev, xtmr_manager);
> +
> +	return 0;
> +}
> +
> +static int xtmr_manager_remove(struct platform_device *pdev)
> +{
> +	sysfs_remove_groups(&pdev->dev.kobj, xtmr_manager_groups);

Again, never touch a kobject in a driver.

thanks,

greg k-h
Appana Durga Kedareswara rao July 4, 2022, 4:07 a.m. UTC | #2
Hi Greg,

Thanks for the review.

On 01/07/22 1:49 pm, Greg KH wrote:
> On Tue, Jun 28, 2022 at 11:13:36AM +0530, Appana Durga Kedareswara rao wrote:
>> The Triple Modular Redundancy(TMR) Manager is responsible for handling
>> the TMR subsystem state, including fault detection and error recovery
>> provides soft error detection, correction and recovery features.
>>
>> This commit adds support for fault detection feature via
>> sysfs interface.
>>
>> Usage:
>> To know the hardware status:
>> cat /sys/devices/platform/amba_pl/44a10000.tmr_manager/status
>> To know the break handler count(Error count):
>> cat /sys/devices/platform/amba_pl/44a10000.tmr_manager/errcnt
>>
>> Signed-off-by: Appana Durga Kedareswara rao <appana.durga.rao@xilinx.com>
>> ---
>>   .../testing/sysfs-driver-xilinx-tmr-manager   |  18 ++
>>   MAINTAINERS                                   |   7 +
>>   drivers/misc/Kconfig                          |  10 +
>>   drivers/misc/Makefile                         |   1 +
>>   drivers/misc/xilinx_tmr_manager.c             | 288 ++++++++++++++++++
>>   5 files changed, 324 insertions(+)
>>   create mode 100644 Documentation/ABI/testing/sysfs-driver-xilinx-tmr-manager
>>   create mode 100644 drivers/misc/xilinx_tmr_manager.c
>>
>> diff --git a/Documentation/ABI/testing/sysfs-driver-xilinx-tmr-manager b/Documentation/ABI/testing/sysfs-driver-xilinx-tmr-manager
>> new file mode 100644
>> index 000000000000..c2f615126201
>> --- /dev/null
>> +++ b/Documentation/ABI/testing/sysfs-driver-xilinx-tmr-manager
>> @@ -0,0 +1,18 @@
>> +What:		/sys/devices/platform/amba_pl/<dev>/status
>> +Date:		June 2022
>> +Contact:	appana.durga.rao@xilinx.com
>> +Description:	This control file provides the status of the tmr manager
>> +		useful for getting the status of fault.
>> +		This file cannot be written.
> 
> Examples of what the data in this file will be?
> 

Okay will fix in v2.
>> +
>> +What:		/sys/devices/platform/amba_pl/<dev>/errcnt
>> +Date:		June 2022
>> +Contact:	appana.durga.rao@xilinx.com
>> +Description:	This control file provides the fault detection count.
>> +		This file cannot be written.
> 
> Again, example output?
> 
> 

sure will update in v2.
> 
>> +
>> +What:		/sys/devices/platform/amba_pl/<dev>/dis_block_break
>> +Date:		June 2022
>> +Contact:	appana.durga.rao@xilinx.com
>> +Description:	This control file enables the break signal.
>> +		This file is write only.
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index 6cc825857722..3d29109788ed 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -12962,6 +12962,13 @@ W:	http://www.monstr.eu/fdt/
>>   T:	git git://git.monstr.eu/linux-2.6-microblaze.git
>>   F:	arch/microblaze/
>>   
>> +MICROBLAZE TMR MANAGER
>> +M:	Appana Durga Kedareswara rao <appana.durga.rao@xilinx.com>
>> +S:	Supported
>> +F:	Documentation/ABI/testing/sysfs-driver-xilinx-tmr-manager
>> +F:	Documentation/devicetree/bindings/misc/xlnx,tmr-manager.yaml
>> +F:	drivers/misc/xilinx_tmr_manager.c
>> +
>>   MICROCHIP AT91 DMA DRIVERS
>>   M:	Ludovic Desroches <ludovic.desroches@microchip.com>
>>   M:	Tudor Ambarus <tudor.ambarus@microchip.com>
>> diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
>> index 41d2bb0ae23a..555ae2e33b91 100644
>> --- a/drivers/misc/Kconfig
>> +++ b/drivers/misc/Kconfig
>> @@ -483,6 +483,16 @@ config OPEN_DICE
>>   
>>   	  If unsure, say N.
>>   
>> +config TMR_MANAGER
>> +	bool "Select TMR Manager"
>> +	depends on MICROBLAZE && MB_MANAGER
>> +	help
>> +	  This option enables the driver developed for TMR Manager. The Triple
>> +	  Modular Redundancy(TMR) manager provides support for fault detection
>> +	  via sysfs interface.
>> +
>> +	  Say N here unless you know what you are doing.
>> +
>>   source "drivers/misc/c2port/Kconfig"
>>   source "drivers/misc/eeprom/Kconfig"
>>   source "drivers/misc/cb710/Kconfig"
>> diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
>> index 70e800e9127f..28b9803f909b 100644
>> --- a/drivers/misc/Makefile
>> +++ b/drivers/misc/Makefile
>> @@ -60,3 +60,4 @@ obj-$(CONFIG_XILINX_SDFEC)	+= xilinx_sdfec.o
>>   obj-$(CONFIG_HISI_HIKEY_USB)	+= hisi_hikey_usb.o
>>   obj-$(CONFIG_HI6421V600_IRQ)	+= hi6421v600-irq.o
>>   obj-$(CONFIG_OPEN_DICE)		+= open-dice.o
>> +obj-$(CONFIG_TMR_MANAGER)	+= xilinx_tmr_manager.o
>> diff --git a/drivers/misc/xilinx_tmr_manager.c b/drivers/misc/xilinx_tmr_manager.c
>> new file mode 100644
>> index 000000000000..5a4562e4a9d4
>> --- /dev/null
>> +++ b/drivers/misc/xilinx_tmr_manager.c
>> @@ -0,0 +1,288 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Xilinx TMR Subsystem.
>> + *
>> + * Copyright (C) 2022 Xilinx, Inc.
>> + *
>> + * Description:
>> + * This driver is developed for TMR Manager,The Triple Modular Redundancy(TMR)
>> + * Manager is responsible for handling the TMR subsystem state, including
>> + * fault detection and error recovery. The core is triplicated in each of
>> + * the sub-blocks in the TMR subsystem, and provides majority voting of
>> + * its internal state provides soft error detection, correction and
>> + * recovery. Error detection feature is provided through sysfs
>> + * entries which allow the user to observer the TMR microblaze
>> + * status.
>> + */
>> +
>> +#include <asm/xilinx_mb_manager.h>
>> +#include <linux/module.h>
>> +#include <linux/of_device.h>
>> +
>> +/* TMR Manager Register offsets */
>> +#define XTMR_MANAGER_CR_OFFSET		0x0
>> +#define XTMR_MANAGER_FFR_OFFSET		0x4
>> +#define XTMR_MANAGER_CMR0_OFFSET	0x8
>> +#define XTMR_MANAGER_CMR1_OFFSET	0xC
>> +#define XTMR_MANAGER_BDIR_OFFSET	0x10
>> +#define XTMR_MANAGER_SEMIMR_OFFSET	0x1C
>> +
>> +/* Register Bitmasks/shifts */
>> +#define XTMR_MANAGER_CR_MAGIC1_MASK	GENMASK(7, 0)
>> +#define XTMR_MANAGER_CR_MAGIC2_MASK	GENMASK(15, 8)
>> +#define XTMR_MANAGER_CR_RIR_MASK	BIT(16)
>> +#define XTMR_MANAGER_FFR_LM12_MASK	BIT(0)
>> +#define XTMR_MANAGER_FFR_LM13_MASK	BIT(1)
>> +#define XTMR_MANAGER_FFR_LM23_MASK	BIT(2)
>> +
>> +#define XTMR_MANAGER_CR_MAGIC2_SHIFT	4
>> +#define XTMR_MANAGER_CR_RIR_SHIFT	16
>> +#define XTMR_MANAGER_CR_BB_SHIFT	18
>> +
>> +#define XTMR_MANAGER_MAGIC1_MAX_VAL	255
>> +
>> +/**
>> + * struct xtmr_manager_dev - Driver data for TMR Manager
>> + * @regs: device physical base address
>> + * @dev: pointer to device struct
>> + * @cr_val: control register value
>> + * @magic1: Magic 1 hardware configuration value
>> + * @err_cnt: error statistics count
>> + * @phys_baseaddr: Physical base address
>> + */
>> +struct xtmr_manager_dev {
>> +	void __iomem *regs;
>> +	struct device *dev;
> 
> Is this the parent of this device?  Why not make a new device?  The
> interaction here isn't well defined, what controls the lifespan of this
> structure?
> 

It is parent of this device,
Agree with your comments will fix in v2.
>> +	u32 cr_val;
>> +	u32 magic1;
>> +	u32 err_cnt;
>> +	uintptr_t phys_baseaddr;
> 
> This should be resource_size_t, not uintptr_t, right?
> 

Okay will fix in v2.

> 
>> +};
>> +
>> +/* IO accessors */
>> +static inline void xtmr_manager_write(struct xtmr_manager_dev *xtmr_manager,
>> +				      u32 addr, u32 value)
>> +{
>> +	iowrite32(value, xtmr_manager->regs + addr);
>> +}
>> +
>> +static inline u32 xtmr_manager_read(struct xtmr_manager_dev *xtmr_manager,
>> +				    u32 addr)
>> +{
>> +	return ioread32(xtmr_manager->regs + addr);
>> +}
>> +
>> +/**
>> + * xtmr_manager_unblock_break - unblocks the break signal
>> + * @xtmr_manager: Pointer to xtmr_manager_dev structure
>> + */
>> +static void xtmr_manager_unblock_break(struct xtmr_manager_dev *xtmr_manager)
>> +{
>> +	xtmr_manager->cr_val &= ~(1 << XTMR_MANAGER_CR_BB_SHIFT);
>> +	xtmr_manager_write(xtmr_manager, XTMR_MANAGER_CR_OFFSET,
>> +			   xtmr_manager->cr_val);
>> +}
>> +
>> +/**
>> + * xmb_manager_reset_handler - clears the ffr register contents
>> + * @priv: Private pointer
>> + */
>> +static void xmb_manager_reset_handler(void *priv)
>> +{
>> +	struct xtmr_manager_dev *xtmr_manager = (struct xtmr_manager_dev *)priv;
>> +	/* Clear the FFR Register contents as a part of recovery process. */
>> +	xtmr_manager_write(xtmr_manager, XTMR_MANAGER_FFR_OFFSET, 0);
> 
> Did you run checkpatch on this code?  Please do so.
> 

Yes i didn't get any warnings from the script.

>> +}
>> +
>> +/**
> 
> No need for kernel doc formatting for static functions, right?
> 
will remove in v2.

>> + * xmb_manager_update_errcnt - update the error inject count
>> + * @priv: Private pointer
>> + */
>> +static void xmb_manager_update_errcnt(void *priv)
> 
> Use a real pointer, not a void one please.  We want type safe functions.
> 

will fix in v2.
>> +{
>> +	struct xtmr_manager_dev *xtmr_manager = (struct xtmr_manager_dev *)priv;
> 
> And even if you did have a real pointer, you never need to cast from a
> void *.
>

will fix in v2.

>> +
>> +	xtmr_manager->err_cnt++;
>> +}
>> +
>> +static ssize_t errcnt_show(struct device *dev, struct device_attribute *attr,
>> +			   char *buf)
>> +{
>> +	struct xtmr_manager_dev *xtmr_manager = dev_get_drvdata(dev);
>> +
>> +	return sprintf(buf, "%x\n", xtmr_manager->err_cnt);
> 
> sysfs_emit() here and everywhere please.
> 

sure will fix in v2.
>> +}
>> +static DEVICE_ATTR_RO(errcnt);
>> +
>> +static ssize_t status_show(struct device *dev, struct device_attribute *attr,
>> +			   char *buf)
>> +{
>> +	struct xtmr_manager_dev *xtmr_manager = dev_get_drvdata(dev);
>> +	size_t ffr;
>> +	int len = 0;
>> +
>> +	ffr = xtmr_manager_read(xtmr_manager, XTMR_MANAGER_FFR_OFFSET);
>> +	if ((ffr & XTMR_MANAGER_FFR_LM12_MASK) == XTMR_MANAGER_FFR_LM12_MASK) {
>> +		len += sprintf(buf + len,
>> +			       "Lockstep mismatch between processor 1 and 2\n");
>> +	}
>> +
>> +	if ((ffr & XTMR_MANAGER_FFR_LM13_MASK) == XTMR_MANAGER_FFR_LM13_MASK) {
>> +		len += sprintf(buf + len,
>> +			       "Lockstep mismatch between processor 1 and 3\n");
>> +	}
>> +
>> +	if ((ffr & XTMR_MANAGER_FFR_LM23_MASK) == XTMR_MANAGER_FFR_LM23_MASK) {
>> +		len += sprintf(buf + len,
>> +			       "Lockstep mismatch between processor 2 and 3\n");
> 
> That feels like a lot more than "one bit of information", who is going
> to parse this and do something with this information?
> 
> 

It is just to convey the TMR Manager status to the user.

>> +	}
>> +
>> +	return len;
>> +}
>> +static DEVICE_ATTR_RO(status);
>> +
>> +static ssize_t dis_block_break_store(struct device *dev,
>> +				     struct device_attribute *attr,
>> +				     const char *buf, size_t size)
>> +{
>> +	struct xtmr_manager_dev *xtmr_manager = dev_get_drvdata(dev);
>> +	int ret;
>> +	long value;
>> +
>> +	ret = kstrtoul(buf, 16, &value);
>> +	if (ret)
>> +		return ret;
>> +
>> +	if (value > 1)
>> +		return -EINVAL;
>> +
>> +	xtmr_manager_unblock_break(xtmr_manager);
> 
> No error checking?
> 

will fix in v2.

>> +	return size;
>> +}
>> +static DEVICE_ATTR_WO(dis_block_break);
>> +
>> +static struct attribute *xtmr_manager_attrs[] = {
>> +	&dev_attr_dis_block_break.attr,
>> +	&dev_attr_status.attr,
>> +	&dev_attr_errcnt.attr,
>> +	NULL,
>> +};
>> +ATTRIBUTE_GROUPS(xtmr_manager);
>> +
>> +static void xtmr_manager_init(struct xtmr_manager_dev *xtmr_manager)
>> +{
>> +	/* Clear the SEM interrupt mask register to disable the interrupt */
>> +	xtmr_manager_write(xtmr_manager, XTMR_MANAGER_SEMIMR_OFFSET, 0);
>> +
>> +	/* Allow recovery reset by default */
>> +	xtmr_manager->cr_val = (1 << XTMR_MANAGER_CR_RIR_SHIFT) |
>> +				xtmr_manager->magic1;
>> +	xtmr_manager_write(xtmr_manager, XTMR_MANAGER_CR_OFFSET,
>> +			   xtmr_manager->cr_val);
>> +	/*
>> +	 * Configure Break Delay Initialization Register to zero so that
>> +	 * break occurs immediately
>> +	 */
>> +	xtmr_manager_write(xtmr_manager, XTMR_MANAGER_BDIR_OFFSET, 0);
>> +
>> +	/*
>> +	 * To come out of break handler need to block the break signal
>> +	 * in the tmr manager, update the xtmr_manager cr_val for the same
>> +	 */
>> +	xtmr_manager->cr_val |= (1 << XTMR_MANAGER_CR_BB_SHIFT);
>> +
>> +	/*
>> +	 * When the break vector gets asserted because of error injection,
>> +	 * the break signal must be blocked before exiting from the
>> +	 * break handler, Below api updates the TMR manager address and
>> +	 * control register and error counter callback arguments,
>> +	 * which will be used by the break handler to block the
>> +	 * break and call the callback function.
>> +	 */
>> +	xmb_manager_register(xtmr_manager->phys_baseaddr, xtmr_manager->cr_val,
>> +			     xmb_manager_update_errcnt,
>> +			     xtmr_manager, xmb_manager_reset_handler);
> 
> No error handling?
>

will fix in v2.

>> +}
>> +
>> +/**
>> + * xtmr_manager_probe - Driver probe function
>> + * @pdev: Pointer to the platform_device structure
>> + *
>> + * This is the driver probe routine. It does all the memory
>> + * allocation and creates sysfs entries for the device.
>> + *
>> + * Return: 0 on success and failure value on error
>> + */
>> +static int xtmr_manager_probe(struct platform_device *pdev)
>> +{
>> +	struct xtmr_manager_dev *xtmr_manager;
>> +	struct resource *res;
>> +	int err;
>> +
>> +	xtmr_manager = devm_kzalloc(&pdev->dev, sizeof(*xtmr_manager),
>> +				    GFP_KERNEL);
>> +	if (!xtmr_manager)
>> +		return -ENOMEM;
>> +
>> +	xtmr_manager->dev = &pdev->dev;
>> +
>> +	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>> +	xtmr_manager->regs =  devm_ioremap_resource(xtmr_manager->dev, res);
>> +	if (IS_ERR(xtmr_manager->regs))
>> +		return PTR_ERR(xtmr_manager->regs);
>> +
>> +	xtmr_manager->phys_baseaddr = res->start;
>> +
>> +	err = of_property_read_u32(pdev->dev.of_node, "xlnx,magic1",
>> +				   &xtmr_manager->magic1);
>> +	if (err < 0) {
>> +		dev_err(&pdev->dev, "unable to read xlnx,magic1 property");
>> +		return err;
>> +	} else if (xtmr_manager->magic1 > XTMR_MANAGER_MAGIC1_MAX_VAL) {
> 
> No need for else.

ok will fix in v2
> 
>> +		dev_err(&pdev->dev, "invalid xlnx,magic1 property value");
>> +		return err;
> 
> but err is not an error value here, right?
> 

Yes will return proper error code in v2
>> +	}
>> +
>> +	/* Initialize TMR Manager */
>> +	xtmr_manager_init(xtmr_manager);
>> +
>> +	err = sysfs_create_groups(&xtmr_manager->dev->kobj,
>> +				  xtmr_manager_groups);
> 
> You just raced with userspace and lost.  NEVER deal with raw kobjects,
> that's a huge sign that you are doing something wrong in a driver.
> 
> Please use the proper default groups pointer for your driver to do this
> correctly so that the driver core will handle this for you.
> 

Okay will fix in v2.

>> +	if (err < 0) {
>> +		dev_err(&pdev->dev, "unable to create sysfs entries\n");
>> +		return err;
>> +	}
>> +
>> +	platform_set_drvdata(pdev, xtmr_manager);
>> +
>> +	return 0;
>> +}
>> +
>> +static int xtmr_manager_remove(struct platform_device *pdev)
>> +{
>> +	sysfs_remove_groups(&pdev->dev.kobj, xtmr_manager_groups);
> 
> Again, never touch a kobject in a driver.
> 

Sure will fix in v2.

Regards,
Kedar.
> thanks,
> 
> greg k-h
Zhou Furong July 5, 2022, 2:39 a.m. UTC | #3
>> +static struct attribute *xtmr_manager_attrs[] = {
>> +	&dev_attr_dis_block_break.attr,
>> +	&dev_attr_status.attr,
>> +	&dev_attr_errcnt.attr,
>> +	NULL,
remove comma after NULL as it's last one