[v5,11/11] cxl/pci: Add callback to log AER correctable error

Message ID 166984638949.2804499.1293428014191809830.stgit@djiang5-desk3.ch.intel.com
State New

Commit Message

Dave Jiang Nov. 30, 2022, 10:13 p.m. UTC
Add AER error handler callback to read the correctable error status
register for the CXL device. Log the error as a trace event and clear the
error. For CXL devices, the driver also needs to write back to the AER CE
status register to clear the unmasked CEs.

See CXL spec rev3.0 8.2.4.16 for Correctable Error Status Register.

Suggested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---

v5:
- Update cor_error_log() to cor_error_detected().

 drivers/cxl/pci.c |   20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)
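
The read/write-back sequence described in the commit message can be sketched in userspace as follows. This is only an illustration, assuming the status register has write-1-to-clear semantics (which the write-back in the patch implies). The names fake_ras_reg, fake_readl(), fake_writel(), fake_cor_error_detected() and the 0x7f mask are hypothetical stand-ins, not the kernel's MMIO accessors or the real CXL_RAS_CORRECTABLE_STATUS_MASK.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical mask: the low 7 bits stand in for the defined CE bits. */
#define FAKE_COR_STATUS_MASK 0x7fu

/* Simulated status register, assumed to be write-1-to-clear (RW1C). */
struct fake_ras_reg {
	uint32_t status;
};

/* Stand-in for an MMIO read: returns the current register value. */
static uint32_t fake_readl(const struct fake_ras_reg *reg)
{
	return reg->status;
}

/* Stand-in for an MMIO write: a 1 clears that bit, a 0 leaves it alone. */
static void fake_writel(uint32_t val, struct fake_ras_reg *reg)
{
	reg->status &= ~val;
}

/*
 * Read the status, clear only the masked bits that were set, and log
 * the value that was read. Returns 0 when no masked bit was pending.
 */
static uint32_t fake_cor_error_detected(struct fake_ras_reg *reg)
{
	uint32_t status = fake_readl(reg);

	if (!(status & FAKE_COR_STATUS_MASK))
		return 0;

	fake_writel(status & FAKE_COR_STATUS_MASK, reg);
	printf("correctable error status: 0x%08x\n", (unsigned int)status);
	return status;
}
```

Writing back only the bits that were read as set means that, in this model, an error latched between the read and the write is not cleared unseen, and bits outside the mask are left untouched.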

Comments

Bjorn Helgaas Nov. 30, 2022, 10:47 p.m. UTC | #1
On Wed, Nov 30, 2022 at 03:13:45PM -0700, Dave Jiang wrote:
> Add AER error handler callback to read the correctable error status
> register for the CXL device. Log the error as a trace event and clear the
> error. For CXL devices, the driver also needs to write back to the AER CE
> status register to clear the unmasked CEs.

"AER CE status register" points in the wrong direction.

> See CXL spec rev3.0 8.2.4.16 for Correctable Error Status Register.

>  static const struct pci_error_handlers cxl_error_handlers = {
>  	.error_detected	= cxl_error_detected,
>  	.slot_reset	= cxl_slot_reset,
>  	.resume		= cxl_error_resume,
> +	.cor_error_detected	= cxl_correctable_error_logging,

It makes grep/cscope a little more useful when the function name
includes the struct member name, e.g., "cxl_cor_error_detected".

Bjorn
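
For illustration, the naming convention suggested above might look like the sketch below. It is a standalone mock, not kernel code: fake_pci_dev, fake_pci_error_handlers, fake_cxl_error_handlers and fake_report_cor_error() are hypothetical stand-ins for the real PCI core types and dispatch path, and only the handler name cxl_cor_error_detected comes from the review comment.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev. */
struct fake_pci_dev {
	int cor_errors_seen;
};

/* Hypothetical stand-in for struct pci_error_handlers. */
struct fake_pci_error_handlers {
	void (*cor_error_detected)(struct fake_pci_dev *pdev);
};

/* The function name repeats the struct member name it is assigned to. */
static void cxl_cor_error_detected(struct fake_pci_dev *pdev)
{
	pdev->cor_errors_seen++;
}

static const struct fake_pci_error_handlers fake_cxl_error_handlers = {
	.cor_error_detected = cxl_cor_error_detected,
};

/* Mock dispatch: call the callback only if the driver provides one. */
static void fake_report_cor_error(const struct fake_pci_error_handlers *h,
				  struct fake_pci_dev *pdev)
{
	if (h && h->cor_error_detected)
		h->cor_error_detected(pdev);
}
```

With this naming, a search for "cor_error_detected" finds both the struct member and every driver's implementation of it.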

Patch

diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 11f842df9807..ffebd997dc15 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -622,10 +622,30 @@  static void cxl_error_resume(struct pci_dev *pdev)
 		 dev->driver ? "successful" : "failed");
 }
 
+static void cxl_correctable_error_logging(struct pci_dev *pdev)
+{
+	struct cxl_dev_state *cxlds = pci_get_drvdata(pdev);
+	struct cxl_memdev *cxlmd = cxlds->cxlmd;
+	struct device *dev = &cxlmd->dev;
+	void __iomem *addr;
+	u32 status;
+
+	if (!cxlds->regs.ras)
+		return;
+
+	addr = cxlds->regs.ras + CXL_RAS_CORRECTABLE_STATUS_OFFSET;
+	status = readl(addr);
+	if (status & CXL_RAS_CORRECTABLE_STATUS_MASK) {
+		writel(status & CXL_RAS_CORRECTABLE_STATUS_MASK, addr);
+		trace_cxl_aer_correctable_error(dev_name(dev), status);
+	}
+}
+
 static const struct pci_error_handlers cxl_error_handlers = {
 	.error_detected	= cxl_error_detected,
 	.slot_reset	= cxl_slot_reset,
 	.resume		= cxl_error_resume,
+	.cor_error_detected	= cxl_correctable_error_logging,
 };
 
 static struct pci_driver cxl_pci_driver = {