Message ID | 20210415135851.862406-5-berrange@redhat.com |
---|---|
State | New |
Series | block, migration: improve debugging of migration bdrv_flush failure |
* Daniel P. Berrangé (berrange@redhat.com) wrote:
> A flush failure is a critical failure scenario for some operations.
> For example, it will prevent migration from completing, as it will
> make vm_stop() report an error. Thus it is important to have a
> trace point present for debugging.
>
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>

I'd have had to admit to not having thought that would fail; the fact
we're debugging something where it does, suggests it's a good idea!

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

> ---
>  block/file-posix.c | 2 ++
>  block/trace-events | 1 +
>  2 files changed, 3 insertions(+)
>
> diff --git a/block/file-posix.c b/block/file-posix.c
> index 99cf452f84..6aafeda44f 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -1362,6 +1362,8 @@ static int handle_aiocb_flush(void *opaque)
>
>      ret = qemu_fdatasync(aiocb->aio_fildes);
>      if (ret == -1) {
> +        trace_file_flush_fdatasync_failed(errno);
> +
>          /* There is no clear definition of the semantics of a failing fsync(),
>           * so we may have to assume the worst. The sad truth is that this
>           * assumption is correct for Linux. Some pages are now probably marked
> diff --git a/block/trace-events b/block/trace-events
> index 1a12d634e2..c8a943e992 100644
> --- a/block/trace-events
> +++ b/block/trace-events
> @@ -206,6 +206,7 @@ file_copy_file_range(void *bs, int src, int64_t src_off, int dst, int64_t dst_of
>  file_FindEjectableOpticalMedia(const char *media) "Matching using %s"
>  file_setup_cdrom(const char *partition) "Using %s as optical disc"
>  file_hdev_is_sg(int type, int version) "SG device found: type=%d, version=%d"
> +file_flush_fdatasync_failed(int err) "errno %d"
>
>  # sheepdog.c
>  sheepdog_reconnect_to_sdog(void) "Wait for connection to be established"
> --
> 2.30.2
>
diff --git a/block/file-posix.c b/block/file-posix.c
index 99cf452f84..6aafeda44f 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1362,6 +1362,8 @@ static int handle_aiocb_flush(void *opaque)

     ret = qemu_fdatasync(aiocb->aio_fildes);
     if (ret == -1) {
+        trace_file_flush_fdatasync_failed(errno);
+
         /* There is no clear definition of the semantics of a failing fsync(),
          * so we may have to assume the worst. The sad truth is that this
          * assumption is correct for Linux. Some pages are now probably marked
diff --git a/block/trace-events b/block/trace-events
index 1a12d634e2..c8a943e992 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -206,6 +206,7 @@ file_copy_file_range(void *bs, int src, int64_t src_off, int dst, int64_t dst_of
 file_FindEjectableOpticalMedia(const char *media) "Matching using %s"
 file_setup_cdrom(const char *partition) "Using %s as optical disc"
 file_hdev_is_sg(int type, int version) "SG device found: type=%d, version=%d"
+file_flush_fdatasync_failed(int err) "errno %d"

 # sheepdog.c
 sheepdog_reconnect_to_sdog(void) "Wait for connection to be established"
A flush failure is a critical failure scenario for some operations.
For example, it will prevent migration from completing, as it will
make vm_stop() report an error. Thus it is important to have a
trace point present for debugging.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 block/file-posix.c | 2 ++
 block/trace-events | 1 +
 2 files changed, 3 insertions(+)
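
For readers without a QEMU tree at hand, the pattern the patch adds can be sketched in standalone C: call fdatasync(), and on failure record errno through a trace hook before anything else can overwrite it. This is only an illustration under stated assumptions. The names sketch_trace_flush_failed(), sketch_flush() and last_traced_errno are hypothetical and are not part of QEMU; the real trace_file_flush_fdatasync_failed() is generated from block/trace-events and goes through QEMU's tracing backends rather than stderr.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical: remembers the last errno passed to the trace hook so a
 * caller (or a test) can inspect it. QEMU has no such variable. */
static int last_traced_errno;

/* Hypothetical stand-in for the generated trace function; it prints to
 * stderr using the same "errno %d" format as the new trace event. */
static void sketch_trace_flush_failed(int err)
{
    last_traced_errno = err;
    fprintf(stderr, "file_flush_fdatasync_failed errno %d (%s)\n",
            err, strerror(err));
}

/* Flush fd to stable storage. Returns 0 on success or -errno on failure,
 * emitting the trace as the first action on the failure path. */
static int sketch_flush(int fd)
{
    if (fdatasync(fd) == -1) {
        int saved_errno = errno; /* save before any call can clobber it */
        sketch_trace_flush_failed(saved_errno);
        return -saved_errno;
    }
    return 0;
}
```

Calling sketch_flush() on an invalid descriptor such as -1 takes the failure path and returns -EBADF, which makes the otherwise silent failure visible in the same way the trace point does for handle_aiocb_flush().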