Patchwork [V2] get_tmp_filename: add explicit error message

Submitter Fabien Chouteau
Date Feb. 6, 2013, 2:17 p.m.
Message ID <1360160243-31611-1-git-send-email-chouteau@adacore.com>
Permalink /patch/218633/
State New

Comments

Fabien Chouteau - Feb. 6, 2013, 2:17 p.m.
Signed-off-by: Fabien Chouteau <chouteau@adacore.com>
---
 block.c |   15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)
Stefan Hajnoczi - Feb. 18, 2013, 1:47 p.m.
On Wed, Feb 06, 2013 at 03:17:23PM +0100, Fabien Chouteau wrote:
> Signed-off-by: Fabien Chouteau <chouteau@adacore.com>
> ---
>  block.c |   15 ++++++++++++---
>  1 file changed, 12 insertions(+), 3 deletions(-)

Markus: Any more feedback on this patch?

> diff --git a/block.c b/block.c
> index ba67c0d..79fe01b 100644
> --- a/block.c
> +++ b/block.c
> @@ -428,9 +428,16 @@ int get_tmp_filename(char *filename, int size)
>      /* GetTempFileName requires that its output buffer (4th param)
>         have length MAX_PATH or greater.  */
>      assert(size >= MAX_PATH);
> -    return (GetTempPath(MAX_PATH, temp_dir)
> -            && GetTempFileName(temp_dir, "qem", 0, filename)
> -            ? 0 : -GetLastError());
> +    if (GetTempPath(MAX_PATH, temp_dir) == 0) {
> +        error_report("%s: GetTempPath() error: %d\n", __func__, GetLastError());
> +        return -GetLastError();
> +    }
> +    if (GetTempFileName(temp_dir, "qem", 0, filename) == 0) {
> +        error_report("%s: GetTempFileName(%s) error: %d\n", __func__, temp_dir,
> +                GetLastError());
> +        return -GetLastError();
> +    }
> +    return 0;
>  #else
>      int fd;
>      const char *tmpdir;
> @@ -442,9 +449,11 @@ int get_tmp_filename(char *filename, int size)
>      }
>      fd = mkstemp(filename);
>      if (fd < 0) {
> +        error_report("%s: mkstemp() error: %s\n", __func__, strerror(errno));
>          return -errno;
>      }
>      if (close(fd) != 0) {
> +        error_report("%s: close() error: %s\n", __func__, strerror(errno));
>          unlink(filename);
>          return -errno;
>      }
> -- 
> 1.7.9.5
>
Markus Armbruster - Feb. 18, 2013, 4:37 p.m.
I agree with you that the existing error reporting is too unspecific in
many cases, and I applaud your attempt to do something about it, but I'm
afraid this patch creates as many problems as it solves.  Details below.

Fabien Chouteau <chouteau@adacore.com> writes:

> Signed-off-by: Fabien Chouteau <chouteau@adacore.com>
> ---
>  block.c |   15 ++++++++++++---
>  1 file changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/block.c b/block.c
> index ba67c0d..79fe01b 100644
> --- a/block.c
> +++ b/block.c
> @@ -428,9 +428,16 @@ int get_tmp_filename(char *filename, int size)
>      /* GetTempFileName requires that its output buffer (4th param)
>         have length MAX_PATH or greater.  */
>      assert(size >= MAX_PATH);
> -    return (GetTempPath(MAX_PATH, temp_dir)
> -            && GetTempFileName(temp_dir, "qem", 0, filename)
> -            ? 0 : -GetLastError());
> +    if (GetTempPath(MAX_PATH, temp_dir) == 0) {
> +        error_report("%s: GetTempPath() error: %d\n", __func__, GetLastError());
> +        return -GetLastError();
> +    }
> +    if (GetTempFileName(temp_dir, "qem", 0, filename) == 0) {
> +        error_report("%s: GetTempFileName(%s) error: %d\n", __func__, temp_dir,
> +                GetLastError());
> +        return -GetLastError();
> +    }
> +    return 0;
>  #else
>      int fd;
>      const char *tmpdir;
> @@ -442,9 +449,11 @@ int get_tmp_filename(char *filename, int size)
>      }
>      fd = mkstemp(filename);
>      if (fd < 0) {
> +        error_report("%s: mkstemp() error: %s\n", __func__, strerror(errno));
>          return -errno;
>      }
>      if (close(fd) != 0) {
> +        error_report("%s: close() error: %s\n", __func__, strerror(errno));
>          unlink(filename);
>          return -errno;
>      }

In my review of v1, I wrote "The function's (implied) contract is to
return an error code without printing anything.  If you want to change
the contract to include reporting the error, you [...] have to
demonstrate that all callers are happy with the change of contract."  So
let's check the two callers of get_tmp_filename():

1. bdrv_open()

   Complex function, can fail in many ways.  Returns an error code.
   Does not report errors; that's left to its callers.

   Your patch effectively changes bdrv_open() to report the error in one
   of its failure modes.

   For callers that report bdrv_open() failure to the user, we then get
   two error messages: the one you add, followed by a less specific one
   from further up the call chain.  Reporting the same error multiple
   times is not nice.

   For callers that neglect to report bdrv_open() failure to the user
   even though they should (if such buggy callers exist), you fix the
   problem for one failure mode only.

   For callers that handle bdrv_open() failure without reporting it to
   the user, you add an unwanted error message.  Not good.  You haven't
   demonstrated that no such callers exist.

2. vvfat.c's enable_write_target()

   bdrv_vvfat's bdrv_file_open() method vvfat_open() is the only caller.
   It's called by bdrv_open() (covered by 1.), and by bdrv_file_open().
   Like bdrv_open(), bdrv_file_open() returns an error code, and leaves
   error reporting to its caller.  Same issues as above.

Apart from these fundamental gaps, the new error message needs polish.
Say mkstemp() fails ENOSPC.  Gets reported roughly like this:

    qemu-system-x86_64: -drive file=f16.img: get_tmp_filename: mkstemp() error: No space left on device
    qemu-system-x86_64: -drive file=f16.img: could not open disk image f16.img: No space left on device

The second message talks to the user in user terms.  That's proper.  The
first one talks source code instead.  From a user's point of view,
"get_tmp_filename" and "mkstemp() error" are gobbledygook.  At best,
they can help him guess what the problem might be.
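
The contract described in this review (helpers return an error code and stay silent, and one caller near the top reports once, in user terms) can be sketched as follows. This is a minimal POSIX-only illustration, not QEMU code: make_tmp_file() and open_image() are hypothetical stand-ins for get_tmp_filename() and a top-level caller of bdrv_open(), and the "sketch.XXXXXX" template is made up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for get_tmp_filename(): creates an empty temporary
 * file, returns 0 or a negative errno value, and prints nothing itself. */
static int make_tmp_file(char *filename, size_t size)
{
    const char *tmpdir = getenv("TMPDIR");
    int fd, n;

    assert(filename != NULL && size > 0);
    if (!tmpdir) {
        tmpdir = "/tmp";
    }
    n = snprintf(filename, size, "%s/sketch.XXXXXX", tmpdir);
    if (n < 0 || (size_t)n >= size) {
        return -EOVERFLOW;          /* template does not fit in the buffer */
    }
    fd = mkstemp(filename);
    if (fd < 0) {
        return -errno;
    }
    if (close(fd) != 0) {
        int saved_errno = errno;    /* unlink() below may overwrite errno */
        unlink(filename);
        return -saved_errno;
    }
    return 0;
}

/* Hypothetical top-level caller: the only place that talks to the user,
 * so each failure is reported exactly once. */
static int open_image(const char *image, char *tmp, size_t size)
{
    int ret = make_tmp_file(tmp, size);

    if (ret < 0) {
        fprintf(stderr, "could not open disk image %s: %s\n",
                image, strerror(-ret));
    }
    return ret;
}
```

With this split, a caller that handles the failure silently (for example by retrying) can call make_tmp_file() directly and no unwanted message appears.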
Fabien Chouteau - Feb. 19, 2013, 10:28 a.m.
On 02/18/2013 05:37 PM, Markus Armbruster wrote:
> I agree with you that the existing error reporting is too unspecific in
> many cases, and I applaud your attempt to do something about it, but I'm
> afraid this patch creates as many problems as it solves.  Details below.
>
>
> In my review of v1, I wrote "The function's (implied) contract is to
> return an error code without printing anything.  If you want to change
> the contract to include reporting the error, you [...] have to
> demonstrate that all callers are happy with the change of contract."  So
> let's check the two callers of get_tmp_filename():
>
> 1. bdrv_open()
>
>    Complex function, can fail in many ways.  Returns an error code.
>    Does not report errors; that's left to its callers.
>
>    Your patch effectively changes bdrv_open() to report the error in one
>    of its failure modes.
>
>    For callers that report bdrv_open() failure to the user, we then get
>    two error messages: the one you add, followed by a less specific one
>    from further up the call chain.  Reporting the same error multiple
>    times is not nice.

It seems that your point of view is very Linux-centric; on Windows we
didn't get any error message, just "Operation not permitted" for every
error in bdrv_open. I've spent 15 mins trying to find the exact location
of the error, going into the complex call tree of bdrv, and it's not the
first time.

Maybe I should just put the error message in the Windows code. So it
doesn't duplicate on Linux.

>
>    For callers that neglect to report bdrv_open() failure to the user
>    even though they should (if such buggy callers exist), you fix the
>    problem for one failure mode only.

One could say that it's already something. I'd like to have the time to
add error messages for all possible failures, but unfortunately I don't.

BTW, it looks like a common rule in QEMU: we never check the error code
from Win32 API (maybe because there's no way to efficiently report those
errors).

> Apart from these fundamental gaps, the new error message needs polish.
> Say mkstemp() fails ENOSPC.  Gets reported roughly like this:
>
>     qemu-system-x86_64: -drive file=f16.img: get_tmp_filename: mkstemp() error: No space left on device
>     qemu-system-x86_64: -drive file=f16.img: could not open disk image f16.img: No space left on device
>
> The second message talks to the user in user terms.  That's proper.  The
> first one talks source code instead.  From a user's point of view,
> "get_tmp_filename" and "mkstemp() error" are gobbledygook.  At best,
> they can help him guess what the problem might be.
>

I know this error message is not user friendly, but (again) it's still
better than 15 mins of digging in the code...

I don't want to spend much time on this small issue. If you consider
that it creates more problems than it solves, that's fine. I understand
your concerns and wanted to expose mine. Anyway the patch will remain in
our branch until a better solution is found.

Regards,
Markus Armbruster - Feb. 19, 2013, 12:01 p.m.
Fabien Chouteau <chouteau@adacore.com> writes:

> On 02/18/2013 05:37 PM, Markus Armbruster wrote:
>> I agree with you that the existing error reporting is too unspecific in
>> many cases, and I applaud your attempt to do something about it, but I'm
>> afraid this patch creates as many problems as it solves.  Details below.
>>
>>
>> In my review of v1, I wrote "The function's (implied) contract is to
>> return an error code without printing anything.  If you want to change
>> the contract to include reporting the error, you [...] have to
>> demonstrate that all callers are happy with the change of contract."  So
>> let's check the two callers of get_tmp_filename():
>>
>> 1. bdrv_open()
>>
>>    Complex function, can fail in many ways.  Returns an error code.
>>    Does not report errors; that's left to its callers.
>>
>>    Your patch effectively changes bdrv_open() to report the error in one
>>    of its failure modes.
>>
>>    For callers that report bdrv_open() failure to the user, we then get
>>    two error messages: the one you add, followed by a less specific one
>>    from further up the call chain.  Reporting the same error multiple
>>    times is not nice.
>
> It seems that your point of view is very Linux-centric; on Windows we
> didn't get any error message, just "Operation not permitted" for every
> error in bdrv_open.

Sounds like a bug, namely bdrv_open() returning bogus error codes on
Windows at least for some failures.

>                     I've spent 15 mins trying to find the exact location
> of the error, going into the complex call tree of bdrv, and it's not the
> first time.

Yes, the error reporting is really unhelpful in many cases, and not just
under Windows.

> Maybe I should just put the error message in the Windows code. So it
> doesn't duplicate on Linux.
>
>>
>>    For callers that neglect to report bdrv_open() failure to the user
>>    even though they should (if such buggy callers exist), you fix the
>>    problem for one failure mode only.
>
> One could say that it's already something. I'd like to have the time to
> add error messages for all possible failures, but unfortunately I don't.
>
> BTW, it looks like a common rule in QEMU: we never check the error code
> from Win32 API (maybe because there's no way to efficiently report those
> errors).

I doubt there are fundamental difficulties with the Windows API.  No,
the root problem with QEMU under Windows is lack of maintainers.  Until
somebody cares enough to take on Windows maintenance, it'll continue to
languish.

>> Apart from these fundamental gaps, the new error message needs polish.
>> Say mkstemp() fails ENOSPC.  Gets reported roughly like this:
>>
>>     qemu-system-x86_64: -drive file=f16.img: get_tmp_filename: mkstemp() error: No space left on device
>>     qemu-system-x86_64: -drive file=f16.img: could not open disk image f16.img: No space left on device
>>
>> The second message talks to the user in user terms.  That's proper.  The
>> first one talks source code instead.  From a user's point of view,
>> "get_tmp_filename" and "mkstemp() error" are gobbledygook.  At best,
>> they can help him guess what the problem might be.
>>
>
> I know this error message is not user friendly, but (again) it's still
> better than 15 mins of digging in the code...

No argument :)

> I don't want to spend much time on this small issue. If you consider
> that it creates more problems than it solves, that's fine. I understand
> your concerns and wanted to expose mine. Anyway the patch will remain in
> our branch until a better solution is found.

Fair enough & thanks for doing as much as you did.
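
The thread does not establish why Windows users see "Operation not permitted" for every failure. One plausible reading, offered here as an assumption, is that -GetLastError() returns a negated Win32 error code, which callers then interpret as a negated errno value even though the two number spaces are unrelated. A translation step along these lines would keep the "return an error code" contract intact. The helper name win32_error_to_errno() and the W32_ constants are hypothetical. The numeric values mirror the corresponding ERROR_* definitions in the Windows headers, and the mapping is deliberately incomplete.

```c
#include <errno.h>

/* Local copies of a few Win32 error codes so the sketch builds anywhere;
 * real code would use the ERROR_* names from <windows.h> instead. */
enum {
    W32_ERROR_FILE_NOT_FOUND    = 2,
    W32_ERROR_PATH_NOT_FOUND    = 3,
    W32_ERROR_ACCESS_DENIED     = 5,
    W32_ERROR_NOT_ENOUGH_MEMORY = 8,
    W32_ERROR_FILE_EXISTS       = 80,
    W32_ERROR_DISK_FULL         = 112
};

/* Hypothetical helper: translate a GetLastError() value into a positive
 * errno value. Codes without an obvious counterpart fall back to EIO. */
static int win32_error_to_errno(unsigned long code)
{
    switch (code) {
    case W32_ERROR_FILE_NOT_FOUND:
    case W32_ERROR_PATH_NOT_FOUND:
        return ENOENT;
    case W32_ERROR_ACCESS_DENIED:
        return EACCES;
    case W32_ERROR_NOT_ENOUGH_MEMORY:
        return ENOMEM;
    case W32_ERROR_FILE_EXISTS:
        return EEXIST;
    case W32_ERROR_DISK_FULL:
        return ENOSPC;
    default:
        return EIO;
    }
}
```

A Windows branch could then return -win32_error_to_errno(GetLastError()), so that strerror() further up the call chain produces a message matching the actual failure.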

Patch

diff --git a/block.c b/block.c
index ba67c0d..79fe01b 100644
--- a/block.c
+++ b/block.c
@@ -428,9 +428,16 @@  int get_tmp_filename(char *filename, int size)
     /* GetTempFileName requires that its output buffer (4th param)
        have length MAX_PATH or greater.  */
     assert(size >= MAX_PATH);
-    return (GetTempPath(MAX_PATH, temp_dir)
-            && GetTempFileName(temp_dir, "qem", 0, filename)
-            ? 0 : -GetLastError());
+    if (GetTempPath(MAX_PATH, temp_dir) == 0) {
+        error_report("%s: GetTempPath() error: %d\n", __func__, GetLastError());
+        return -GetLastError();
+    }
+    if (GetTempFileName(temp_dir, "qem", 0, filename) == 0) {
+        error_report("%s: GetTempFileName(%s) error: %d\n", __func__, temp_dir,
+                GetLastError());
+        return -GetLastError();
+    }
+    return 0;
 #else
     int fd;
     const char *tmpdir;
@@ -442,9 +449,11 @@  int get_tmp_filename(char *filename, int size)
     }
     fd = mkstemp(filename);
     if (fd < 0) {
+        error_report("%s: mkstemp() error: %s\n", __func__, strerror(errno));
         return -errno;
     }
     if (close(fd) != 0) {
+        error_report("%s: close() error: %s\n", __func__, strerror(errno));
         unlink(filename);
         return -errno;
     }
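
A note on the error paths in this patch, separate from the contract question discussed in the thread: the close() branch calls unlink() before returning -errno, and the Windows branch calls GetLastError() a second time after error_report() has run. Any call made in between may change the saved error indicator, so the value returned can differ from the one that described the failure. A common remedy is to capture the error once, immediately after the failing call. The sketch below shows this for the POSIX case only. close_or_cleanup() is a hypothetical helper, not part of block.c.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: close fd and, if that fails, remove the file.
 * Returns 0 on success or the negated errno value of the close() failure. */
static int close_or_cleanup(int fd, const char *filename)
{
    if (close(fd) != 0) {
        int saved_errno = errno;    /* capture before anything else runs */

        unlink(filename);           /* may itself fail and overwrite errno */
        return -saved_errno;
    }
    return 0;
}
```

The same pattern would apply on Windows by storing the result of GetLastError() in a local variable and using that variable for both the message and the return value.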