[00/11] File system wide monitoring

Message ID 20210521024134.1032503-1-krisman@collabora.com

Message

Gabriel Krisman Bertazi May 21, 2021, 2:41 a.m. UTC
Hi,

This series follows up on my previous proposal [1] to support file system
wide monitoring.  As suggested by Amir, this proposal drops the ring
buffer in favor of a single slot associated with each mark.  This
simplifies the implementation a bit, as you can see in the code.

As a reminder, this proposal is limited to an interface for
administrators to monitor the health of a file system, instead of a
generic interface for file errors.  Therefore, this doesn't solve the
problem of writeback errors or the need to watch a specific subtree.

In comparison to the previous RFC, this implementation also drops the
per-fs data and location, and leaves those as future extensions.

* Implementation

The feature is implemented on top of fanotify, as a new type of fanotify
mark, FAN_ERROR, which a file system monitoring tool can register to
receive error notifications.  When an error occurs, a new notification is
generated, followed by this additional info field:

 - FS generic data: A file system agnostic structure that has a generic
 error code and identifies the filesystem.  Basically, it lets
 userspace know something happened on a monitored filesystem.  Because
 only the first error since the last read is recorded, this also
 includes a counter of errors that happened since the last read.
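
The single-slot behavior described above (keep the first error, count
the rest, reset on read) can be modeled roughly as follows.  This is
only a userspace sketch for illustration: struct error_slot,
slot_report() and slot_read() are hypothetical names and are not the
structures or helpers used in the patches.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the per-mark slot; not the kernel's structure. */
struct error_slot {
	bool     pending;     /* true while an unread error is stored */
	int32_t  first_error; /* generic code of the first unread error */
	uint32_t error_count; /* errors reported since the last read */
};

/* Report an error: only the first code is kept, but every one is counted. */
static void slot_report(struct error_slot *slot, int32_t error)
{
	if (!slot->pending) {
		slot->pending = true;
		slot->first_error = error;
	}
	slot->error_count++;
}

/*
 * Read the slot and clear it.  Returns false when nothing is pending,
 * in which case *error and *count are left untouched.
 */
static bool slot_read(struct error_slot *slot, int32_t *error, uint32_t *count)
{
	if (!slot->pending)
		return false;

	*error = slot->first_error;
	*count = slot->error_count;

	slot->pending = false;
	slot->first_error = 0;
	slot->error_count = 0;
	return true;
}
```

With this model, two reports followed by one read yield the first
error code and a count of two; a second read finds nothing pending.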

* Testing

This was tested by watching notifications flowing from an intentionally
corrupted filesystem in different places.  In addition, other events
were watched in an attempt to detect regressions.

Is there a specific testsuite for fanotify I should be running?

* Patches

This patchset is divided as follows: patches 1 through 5 refactor
fsnotify/fanotify in preparation for FS_ERROR/FAN_ERROR; patches 6 and
7 implement the FS_ERROR API for filesystems to report errors; patch 8
adds support for FAN_ERROR in fanotify; patch 9 is an example
implementation for ext4; patches 10 and 11 provide sample userspace code
and documentation.

I also pushed the full series to:

  https://gitlab.collabora.com/krisman/linux -b fanotify-notifications-single-slot

[1] https://lwn.net/Articles/854545/

Cc: Darrick J. Wong <djwong@kernel.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Dave Chinner <david@fromorbit.com>
Cc: jack@suse.com
To: amir73il@gmail.com
Cc: dhowells@redhat.com
Cc: khazhy@google.com
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-ext4@vger.kernel.org

Gabriel Krisman Bertazi (11):
  fanotify: Fold event size calculation to its own function
  fanotify: Split fsid check from other fid mode checks
  fanotify: Simplify directory sanity check in DFID_NAME mode
  fanotify: Expose fanotify_mark
  inotify: Don't force FS_IN_IGNORED
  fsnotify: Support FS_ERROR event type
  fsnotify: Introduce helpers to send error_events
  fanotify: Introduce FAN_ERROR event
  ext4: Send notifications on error
  samples: Add fs error monitoring example
  Documentation: Document the FAN_ERROR event

 .../admin-guide/filesystem-monitoring.rst     |  52 +++++
 Documentation/admin-guide/index.rst           |   1 +
 fs/ext4/super.c                               |   8 +
 fs/notify/fanotify/fanotify.c                 |  80 ++++++-
 fs/notify/fanotify/fanotify.h                 |  38 +++-
 fs/notify/fanotify/fanotify_user.c            | 213 ++++++++++++++----
 fs/notify/inotify/inotify_user.c              |   6 +-
 include/linux/fanotify.h                      |   6 +-
 include/linux/fsnotify.h                      |  13 ++
 include/linux/fsnotify_backend.h              |  15 +-
 include/uapi/linux/fanotify.h                 |  10 +
 samples/Kconfig                               |   8 +
 samples/Makefile                              |   1 +
 samples/fanotify/Makefile                     |   3 +
 samples/fanotify/fs-monitor.c                 |  91 ++++++++
 15 files changed, 485 insertions(+), 60 deletions(-)
 create mode 100644 Documentation/admin-guide/filesystem-monitoring.rst
 create mode 100644 samples/fanotify/Makefile
 create mode 100644 samples/fanotify/fs-monitor.c

Comments

Amir Goldstein May 21, 2021, 8:31 a.m. UTC | #1
On Fri, May 21, 2021 at 5:42 AM Gabriel Krisman Bertazi
<krisman@collabora.com> wrote:
>
> Is there a specific testsuite for fanotify I should be running?

LTP is where we maintain the fsnotify regression tests,
specifically the inotify* and fanotify* tests.

All in all the series looks good, give or take some implementation
details.

One general comment about UAPI (CC linux-api) -
I think Darrick has proposed to report ino/gen instead of only ino.
I personally think it would be a shame not to reuse the already existing
FAN_EVENT_INFO_TYPE_FID record format, but I can understand why
you did not want to go there:
1. Not all error reports carry inode information
2. Not all filesystems support file handles
3. Any other reason that I missed?

My proposal is that in cases where the group was initialized with
FAN_REPORT_FID (which implies the fs supports file handles) AND the error
report does carry inode information, record fanotify_info in
fanotify_error_event and report a FAN_EVENT_INFO_TYPE_FID record in
addition to the FAN_EVENT_INFO_TYPE_ERROR record to the user.
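
The rule proposed above can be sketched as a small decision function.
This is an illustration only: the REC_* flags and records_for_error()
are hypothetical names made up for this sketch, not identifiers from
the series or from fanotify.

```c
#include <stdbool.h>

/* Hypothetical flags naming the info records attached to one event. */
enum {
	REC_ERROR = 1 << 0, /* stands in for the FAN_EVENT_INFO_TYPE_ERROR record */
	REC_FID   = 1 << 1, /* stands in for the FAN_EVENT_INFO_TYPE_FID record */
};

/*
 * The error record is always reported.  The FID record is added only
 * when the group asked for FAN_REPORT_FID and the error report carries
 * inode information.
 */
static unsigned int records_for_error(bool group_reports_fid, bool has_inode)
{
	unsigned int records = REC_ERROR;

	if (group_reports_fid && has_inode)
		records |= REC_FID;

	return records;
}
```

An error without inode information, or a group without
FAN_REPORT_FID, gets only the error record under this rule.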

I am not insisting on this change, but I think it won't add much complexity
to your implementation and it will allow more flexibility to the API going
forward.

However, for the time being, if you want to avoid the UAPI discussion,
I don't mind if you disallow a FAN_ERROR mark for a group with
FAN_REPORT_FID.

In all likelihood, the tool monitoring a filesystem for errors will not care
about other events, so it shouldn't care about FAN_REPORT_FID anyway.
I'd like to hear what others think about this point as well.

Thanks,
Amir.
Theodore Ts'o May 22, 2021, 11:25 p.m. UTC | #2
Hi Gabriel,

Quick question: what userspace program are you using to test this
feature?  Do you have a custom testing program you are using?  If so,
could you share it?

Many thanks!!

						- Ted

Ian Kent May 24, 2021, 3:06 a.m. UTC | #3
On Thu, 2021-05-20 at 22:41 -0400, Gabriel Krisman Bertazi wrote:
> Hi,
> 
> This series follows up on my previous proposal [1] to support file system
> wide monitoring.  As suggested by Amir, this proposal drops the ring
> buffer in favor of a single slot associated with each mark.  This
> simplifies the implementation a bit, as you can see in the code.

I get the need for simplification but I'm wondering where this
will end up.

I also know kernel space to user space error communication has
been a concern for quite a while now.

And, from that, there are a couple of things that occur to me.

One is that the standard errno is often not enough to give
sufficiently accurate error reports.

It seems to me that, in the long run, there needs to be a way
for sub-systems to register errors that they will use to report
events (with associated text description) so they can be more
informative. That's probably not as simple as it sounds due to
things like error number clashes, etc. OTOH that mechanism could
be used to avoid using text strings in notifications, provided
there was a matching user space library, thereby reducing
the size of the event report.

Another aspect, also related to the limitations of error reporting
in general, is the way the information could be used. Again, not a
simple thing to do or grok, but would probably require some way of
grouping errors that are related in a stack like manner for user
space inference engines to analyse. Yes, this is very much out of
scope but is a big picture long term usefulness type of notion.

And I don't know how error storms occurring as a side effect of
some fairly serious problem could be handled ... 

So not really related to the current implementation but a comment
to try and get people's thoughts about where this is heading in
the long run.

Ian
Gabriel Krisman Bertazi May 24, 2021, 3:19 p.m. UTC | #4
"Theodore Y. Ts'o" <tytso@mit.edu> writes:

> Hi Gabriel,
>
> Quick question; what userspace program are you using to test this
> feature?  Do you have a custom testing program you are using?  If so,
> could share it?

Hello Ted,

I'm using the program in patch 10 to watch and print notifications,
along with corrupt filesystems. I trigger operations via the command line
and watch the reports flow. I have slightly modified the sample code to
test marks disappearing at inopportune times, but that's trivial to
recreate with the sample code.

I plan to write more automated tests for LTP, once we settle on this
design.