[v4,0/4] occ: fsi and hwmon: Extract and provide the SBEFIFO FFDC

Message ID 20211019205307.36946-1-eajames@linux.ibm.com

Message

Eddie James Oct. 19, 2021, 8:53 p.m. UTC
Currently, users have no way to obtain the FFDC (First Failure Data
Capture) provided by the SBEFIFO when an operation fails. To remedy this,
add code in the FSI OCC driver to store this FFDC in the user's response
buffer and set the response length accordingly.

On the hwmon side, there is a need at the application level to perform
side-band operations in response to SBE errors. Therefore, add a new
binary sysfs file that provides the FFDC (or lack thereof) when there is
an SBEFIFO error. Now applications can take action when an SBE error is
detected.
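
For illustration, a user-space consumer of the new binary file might look
roughly like the sketch below. It is not part of the series. The sysfs path,
the attribute name "ffdc", the instance number, and the big-endian word order
are all assumptions; the only details taken from this series are that the file
is binary, that it may be empty, and that SBE words are four bytes.

```python
from pathlib import Path

# Assumed location of the attribute; check the ABI document added by this
# series for the real name. The instance suffix ".1" is hypothetical.
FFDC_PATH = Path("/sys/bus/platform/devices/occ-hwmon.1/ffdc")

SBE_WORD_SIZE = 4  # SBE words are four bytes, per the v4 changelog


def read_ffdc(path):
    """Read the file to EOF and return the raw bytes (b"" if empty)."""
    with open(path, "rb") as f:
        return f.read()


def ffdc_words(data):
    """Split FFDC bytes into 32-bit words; big-endian order is assumed.

    Trailing bytes that do not fill a whole word are ignored.
    """
    usable = len(data) - (len(data) % SBE_WORD_SIZE)
    return [
        int.from_bytes(data[i:i + SBE_WORD_SIZE], "big")
        for i in range(0, usable, SBE_WORD_SIZE)
    ]


if __name__ == "__main__" and FFDC_PATH.exists():
    raw = read_ffdc(FFDC_PATH)
    if raw:
        print("FFDC: %d bytes" % len(raw))
        for word in ffdc_words(raw):
            print("  0x%08x" % word)
    else:
        print("no FFDC available")
```

Reading to EOF in one pass matters here, since the error flag is only cleared
once the whole FFDC has been read.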

Changes since v3:
 - Rebase
 - Add a check for valid FFDC length
 - Add comments about SBE words being four bytes

Changes since v2:
 - Add documentation

Changes since v1:
 - Remove the magic value that indicated an SBE/SBEFIFO error with no
   FFDC.
 - Remove binary sysfs state management and instead just clear the error
   flag when the whole FFDC has been read.

Eddie James (4):
  fsi: occ: Use a large buffer for responses
  fsi: occ: Store the SBEFIFO FFDC in the user response buffer
  docs: ABI: testing: Document the OCC hwmon FFDC binary interface
  hwmon: (occ) Provide the SBEFIFO FFDC in binary sysfs

 .../sysfs-bus-platform-devices-occ-hwmon      |  13 ++
 drivers/fsi/fsi-occ.c                         | 164 +++++++++---------
 drivers/hwmon/occ/p9_sbe.c                    |  86 ++++++++-
 include/linux/fsi-occ.h                       |   2 +
 4 files changed, 186 insertions(+), 79 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-bus-platform-devices-occ-hwmon

Comments

Joel Stanley Oct. 21, 2021, 11:25 p.m. UTC
On Tue, 19 Oct 2021 at 20:53, Eddie James <eajames@linux.ibm.com> wrote:
>
> Currently, users have no way to obtain the FFDC (First Failure Data
> Capture) provided by the SBEFIFO when an operation fails. To remedy this,
> add code in the FSI OCC driver to store this FFDC in the user's response
> buffer and set the response length accordingly.
>
> On the hwmon side, there is a need at the application level to perform
> side-band operations in response to SBE errors. Therefore, add a new
> binary sysfs file that provides the FFDC (or lack thereof) when there is
> an SBEFIFO error. Now applications can take action when an SBE error is
> detected.

Thanks, I've merged these. I took the chance to add some of your
responses to the commit messages as they were useful.

>
> Changes since v3:
>  - Rebase
>  - Add a check for valid FFDC length
>  - Add comments about SBE words being four bytes
>
> Changes since v2:
>  - Add documentation
>
> Changes since v1:
>  - Remove the magic value that indicated an SBE/SBEFIFO error with no
>    FFDC.
>  - Remove binary sysfs state management and instead just clear the error
>    flag when the whole FFDC has been read.
>
> Eddie James (4):
>   fsi: occ: Use a large buffer for responses
>   fsi: occ: Store the SBEFIFO FFDC in the user response buffer
>   docs: ABI: testing: Document the OCC hwmon FFDC binary interface
>   hwmon: (occ) Provide the SBEFIFO FFDC in binary sysfs
>
>  .../sysfs-bus-platform-devices-occ-hwmon      |  13 ++
>  drivers/fsi/fsi-occ.c                         | 164 +++++++++---------
>  drivers/hwmon/occ/p9_sbe.c                    |  86 ++++++++-
>  include/linux/fsi-occ.h                       |   2 +
>  4 files changed, 186 insertions(+), 79 deletions(-)
>  create mode 100644 Documentation/ABI/testing/sysfs-bus-platform-devices-occ-hwmon
>
> --
> 2.27.0
>