
[4/5] block: add trace point when fdatasync fails

Message ID 20210415135851.862406-5-berrange@redhat.com
State New
Series block, migration: improve debugging of migration bdrv_flush failure

Commit Message

Daniel P. Berrangé April 15, 2021, 1:58 p.m. UTC
A flush failure is a critical failure scenario for some operations.
For example, it will prevent migration from completing, as it will
make vm_stop() report an error. Thus it is important to have a
trace point present for debugging.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 block/file-posix.c | 2 ++
 block/trace-events | 1 +
 2 files changed, 3 insertions(+)

Comments

Dr. David Alan Gilbert April 15, 2021, 4:34 p.m. UTC | #1
* Daniel P. Berrangé (berrange@redhat.com) wrote:
> A flush failure is a critical failure scenario for some operations.
> For example, it will prevent migration from completing, as it will
> make vm_stop() report an error. Thus it is important to have a
> trace point present for debugging.
> 
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>

I have to admit I hadn't thought that could fail; the fact that we're
debugging something where it does suggests it's a good idea!


Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

> ---
>  block/file-posix.c | 2 ++
>  block/trace-events | 1 +
>  2 files changed, 3 insertions(+)
> 
> diff --git a/block/file-posix.c b/block/file-posix.c
> index 99cf452f84..6aafeda44f 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -1362,6 +1362,8 @@ static int handle_aiocb_flush(void *opaque)
>  
>      ret = qemu_fdatasync(aiocb->aio_fildes);
>      if (ret == -1) {
> +        trace_file_flush_fdatasync_failed(errno);
> +
>          /* There is no clear definition of the semantics of a failing fsync(),
>           * so we may have to assume the worst. The sad truth is that this
>           * assumption is correct for Linux. Some pages are now probably marked
> diff --git a/block/trace-events b/block/trace-events
> index 1a12d634e2..c8a943e992 100644
> --- a/block/trace-events
> +++ b/block/trace-events
> @@ -206,6 +206,7 @@ file_copy_file_range(void *bs, int src, int64_t src_off, int dst, int64_t dst_of
>  file_FindEjectableOpticalMedia(const char *media) "Matching using %s"
>  file_setup_cdrom(const char *partition) "Using %s as optical disc"
>  file_hdev_is_sg(int type, int version) "SG device found: type=%d, version=%d"
> +file_flush_fdatasync_failed(int err) "errno %d"
>  
>  # sheepdog.c
>  sheepdog_reconnect_to_sdog(void) "Wait for connection to be established"
> -- 
> 2.30.2
>

Patch

diff --git a/block/file-posix.c b/block/file-posix.c
index 99cf452f84..6aafeda44f 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1362,6 +1362,8 @@  static int handle_aiocb_flush(void *opaque)
 
     ret = qemu_fdatasync(aiocb->aio_fildes);
     if (ret == -1) {
+        trace_file_flush_fdatasync_failed(errno);
+
         /* There is no clear definition of the semantics of a failing fsync(),
          * so we may have to assume the worst. The sad truth is that this
          * assumption is correct for Linux. Some pages are now probably marked
diff --git a/block/trace-events b/block/trace-events
index 1a12d634e2..c8a943e992 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -206,6 +206,7 @@  file_copy_file_range(void *bs, int src, int64_t src_off, int dst, int64_t dst_of
 file_FindEjectableOpticalMedia(const char *media) "Matching using %s"
 file_setup_cdrom(const char *partition) "Using %s as optical disc"
 file_hdev_is_sg(int type, int version) "SG device found: type=%d, version=%d"
+file_flush_fdatasync_failed(int err) "errno %d"
 
 # sheepdog.c
 sheepdog_reconnect_to_sdog(void) "Wait for connection to be established"
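
For readers who want to see the pattern outside the QEMU tree, the following is a minimal standalone sketch, not the QEMU implementation. It assumes a POSIX system that provides fdatasync(). The names demo_trace_flush_failed() and demo_flush() are hypothetical; the first stands in for the generated trace_file_flush_fdatasync_failed() helper and reuses the "errno %d" format from the trace-events entry. Saving errno into a local variable before reporting it is a choice made in this sketch only.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical stand-in for the generated trace helper. It prints to
 * stderr using the same "errno %d" format as the trace-events entry. */
static void demo_trace_flush_failed(int err)
{
    fprintf(stderr, "file_flush_fdatasync_failed errno %d\n", err);
}

/* Flush fd to stable storage. Returns 0 on success, or a negative
 * errno value on failure after emitting the trace line. */
static int demo_flush(int fd)
{
    if (fdatasync(fd) == -1) {
        /* Capture errno first: later library calls may overwrite it. */
        int saved_errno = errno;

        demo_trace_flush_failed(saved_errno);
        return -saved_errno;
    }
    return 0;
}
```

Calling demo_flush() with an invalid descriptor such as -1 makes fdatasync() fail with EBADF, so the function emits one trace line and returns -EBADF. This gives a quick way to exercise the failure path without a real storage error.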