diff mbox

[02/14] qerror: add new errors

Message ID 1338387301-10074-3-git-send-email-lcapitulino@redhat.com
State New
Headers show

Commit Message

Luiz Capitulino May 30, 2012, 2:14 p.m. UTC
New errors for write() and open() failures. Will be used by the
next commits.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
---
 qerror.c | 28 ++++++++++++++++++++++++++++
 qerror.h | 21 +++++++++++++++++++++
 2 files changed, 49 insertions(+)

Comments

Paolo Bonzini May 31, 2012, 10:42 a.m. UTC | #1
Il 30/05/2012 16:14, Luiz Capitulino ha scritto:
> New errors for write() and open() failures. Will be used by the
> next commits.

Ouch.  We have already these errors:

#define QERR_OPEN_FILE_FAILED \
    "{ 'class': 'OpenFileFailed', 'data': { 'filename': %s } }"

#define QERR_SOCKET_CONNECT_FAILED \
    "{ 'class': 'SockConnectFailed', 'data': {} }"

#define QERR_SOCKET_LISTEN_FAILED \
    "{ 'class': 'SockListenFailed', 'data': {} }"

#define QERR_SOCKET_BIND_FAILED \
    "{ 'class': 'SockBindFailed', 'data': {} }"

#define QERR_SOCKET_CREATE_FAILED \
    "{ 'class': 'SockCreateFailed', 'data': {} }"


So let's just add QERR_FILE_WRITE_FAILED.

What about errno values?  Let's add an enum QemuErrno and convert host
errnos to that enum.  Enums are sent as strings, so they are neutral to
the OS of the host and client.  And the client (if it desires) can
convert back to an errno value and use strerror to provide a
human-readable, easily internationalizable error message.

Errors are not QAPI-ized yet, so we can add errno values to the above
five errors too.

What more can you ask for?

Paolo
Luiz Capitulino May 31, 2012, 2:06 p.m. UTC | #2
On Thu, 31 May 2012 12:42:34 +0200
Paolo Bonzini <pbonzini@redhat.com> wrote:

> Il 30/05/2012 16:14, Luiz Capitulino ha scritto:
> > New errors for write() and open() failures. Will be used by the
> > next commits.
> 
> Ouch.  We have already these errors:
> 
> #define QERR_OPEN_FILE_FAILED \
>     "{ 'class': 'OpenFileFailed', 'data': { 'filename': %s } }"
>
> #define QERR_SOCKET_CONNECT_FAILED \
>     "{ 'class': 'SockConnectFailed', 'data': {} }"
> 
> #define QERR_SOCKET_LISTEN_FAILED \
>     "{ 'class': 'SockListenFailed', 'data': {} }"
> 
> #define QERR_SOCKET_BIND_FAILED \
>     "{ 'class': 'SockBindFailed', 'data': {} }"
> 
> #define QERR_SOCKET_CREATE_FAILED \
>     "{ 'class': 'SockCreateFailed', 'data': {} }"
> 
> 
> So let's just add QERR_FILE_WRITE_FAILED.

All of them are bad because they don't tell why the operation has failed (more
on errno below). Honestly, I don't remember seeing these. I would have opposed
to merging them.

> What about errno values?  Let's add an enum QemuErrno and convert host
> errnos to that enum.  Enums are sent as strings, so they are neutral to
> the OS of the host and client.  And the client (if it desires) can
> convert back to an errno value and use strerror to provide a
> human-readable, easily internationalizable error message.

That's more or less what this patch does and the path we're taking with
qerror. For errors that have an errno correspondent we're adding classes
with errno names and this can be converted to an enum or whatever you wish.

Now, using strerror() as a human description has been debated a lot in the
past and has been declined (mostly by Anthony). I'm in favor of it, but I'm
afraid I'm not the one to be convinced.

> Errors are not QAPI-ized yet, so we can add errno values to the above
> five errors too.

We've preferred adding new errors instead of adding errno values to
existing errors. I don't remember exactly why, but I personally don't care.

> What more can you ask for?

I'm not asking for anything :)
Paolo Bonzini May 31, 2012, 2:25 p.m. UTC | #3
Il 31/05/2012 16:06, Luiz Capitulino ha scritto:
>> What about errno values?  Let's add an enum QemuErrno and convert host
>> errnos to that enum.  Enums are sent as strings, so they are neutral to
>> the OS of the host and client.  And the client (if it desires) can
>> convert back to an errno value and use strerror to provide a
>> human-readable, easily internationalizable error message.
> 
> That's more or less what this patch does and the path we're taking with
> qerror. For errors that have an errno correspondent we're adding classes
> with errno names and this can be converted to an enum or whatever you wish.
> 
> Now, using strerror() as a human description has been debated a lot in the
> past and has been declined (mostly by Anthony). I'm in favor of it, but I'm
> afraid I'm not the one to be convinced.

QEMU should not use strerror(), I agree with Anthony on this one.  But
if the client wants to convert QMP errno values to host errno values and
use strerror() on those, why not.

>> Errors are not QAPI-ized yet, so we can add errno values to the above
>> five errors too.
> 
> We've preferred adding new errors instead of adding errno values to
> existing errors. I don't remember exactly why, but I personally don't care.

Then let's not do it anymore. :)

Paolo
Luiz Capitulino May 31, 2012, 2:31 p.m. UTC | #4
On Thu, 31 May 2012 16:25:48 +0200
Paolo Bonzini <pbonzini@redhat.com> wrote:

> Il 31/05/2012 16:06, Luiz Capitulino ha scritto:
> >> What about errno values?  Let's add an enum QemuErrno and convert host
> >> errnos to that enum.  Enums are sent as strings, so they are neutral to
> >> the OS of the host and client.  And the client (if it desires) can
> >> convert back to an errno value and use strerror to provide a
> >> human-readable, easily internationalizable error message.
> > 
> > That's more or less what this patch does and the path we're taking with
> > qerror. For errors that have an errno correspondent we're adding classes
> > with errno names and this can be converted to an enum or whatever you wish.
> > 
> > Now, using strerror() as a human description has been debated a lot in the
> > past and has been declined (mostly by Anthony). I'm in favor of it, but I'm
> > afraid I'm not the one to be convinced.
> 
> QEMU should not use strerror(), I agree with Anthony on this one.  But
> if the client wants to convert QMP errno values to host errno values and
> use strerror() on those, why not.

Ah, that's ok. But the client can do it already. Of course it has not know
how we're mapping errno to strings, because we don't have "EINVAL" as a
error name we have "InvalidArgument".

> >> Errors are not QAPI-ized yet, so we can add errno values to the above
> >> five errors too.
> > 
> > We've preferred adding new errors instead of adding errno values to
> > existing errors. I don't remember exactly why, but I personally don't care.
> 
> Then let's not do it anymore. :)

We already did, changing this now will probably be worse.
Paolo Bonzini May 31, 2012, 2:54 p.m. UTC | #5
Il 31/05/2012 16:31, Luiz Capitulino ha scritto:
>>>> Errors are not QAPI-ized yet, so we can add errno values to the above
>>>> five errors too.
>>>
>>> We've preferred adding new errors instead of adding errno values to
>>> existing errors. I don't remember exactly why, but I personally don't care.
>>
>> Then let's not do it anymore. :)
> 
> We already did, changing this now will probably be worse.

Wait, I think you're conflating two things.

One is "do not shoehorn errors into errno values".  So for QOM invalid values we
have PropertyValueBad, not a generic InvalidArgument value.  We convert everything
to Error rather than returning negative errno values and then returning generic
error codes, because those would be ugly and non-descriptive.  I agree with that.

The other is "when errors come straight from the OS, _do_ use errno values".
This is for OpenFileFailed, for the new socket errors and so on.  This is what
I am proposing.

These two rules together match what most other programs do.

For example:

    $ sed UUUU
    sed: -e expression #1, char 1: unknown command: `U'
                                   ^^^^^^^^^^^^^^^^     error type
         ^^^^^^^^^^^^^  ^       ^                   ^^^ arguments

in JSON:

    { 'error': 'UnknownCommand',
      'source' : { 'type': 'expression',
                   'data': { 'number': 1, 'char': 1 } },
      'command': 'U'
    }

It does not say

    sed: parsing program: No such system call

(aka ENOSYS).  So far that's obvious. :)  Now let's look at the second part, OS
errors:

    $ echo | sed p > /dev/full
    sed: couldn't flush stdout: No space left on device
         ^^^^^^^^^^^^^^                                 error type
                        ^^^^^^  ^^^^^^^^^^^^^^^^^^^^^^^ arguments

That would become, in JSON:

    { 'error': 'FlushFailed',
      'file': 'stdout',
      'os_error': 'enospc' }

Here is similar example:

    $ yes | sed p > /dev/full
    sed: couldn't write 1 item to stdout: No space left on device
         ^^^^^^^^^^^^^^                                            error type
                        ^         ^^^^^^  ^^^^^^^^^^^^^^^^^^^^^^^  arguments

    { 'error': 'WriteFailed',
      'file': 'stdout', 'count': 1,
      'os_error': 'enospc' }


Note how restricting everything to a single NoSpaceLeftOnDevice error would
force you to drop information such as filename or count.

Is it clearer?

Paolo
Kevin Wolf May 31, 2012, 3:07 p.m. UTC | #6
Am 31.05.2012 16:06, schrieb Luiz Capitulino:
> On Thu, 31 May 2012 12:42:34 +0200
> Paolo Bonzini <pbonzini@redhat.com> wrote:
>> Errors are not QAPI-ized yet, so we can add errno values to the above
>> five errors too.
> 
> We've preferred adding new errors instead of adding errno values to
> existing errors. I don't remember exactly why, but I personally don't care.

The problem is that you really need more information. You have at least
the information _what_ went wrong ("could not open image"), and usually
also _why_ it went wrong ("permission denied"). You may want to add even
more fields, but this is the minimum.

You cannot represent both in a single error code. Either you have a
FileOpenFailed error that has an errno field for the reason, or you have
a PermissionDenied error that has some field that tells us that opening
the file has failed (and probably the name of the file).

What Anthony has been suggesting so far is the PermissionDenied error
without any additional information. And that is not enough, especially
when you consider a QMP command like 'transaction' that modifies
multiple files at once.

Kevin
Luiz Capitulino May 31, 2012, 3:44 p.m. UTC | #7
On Thu, 31 May 2012 16:54:47 +0200
Paolo Bonzini <pbonzini@redhat.com> wrote:

> Il 31/05/2012 16:31, Luiz Capitulino ha scritto:
> >>>> Errors are not QAPI-ized yet, so we can add errno values to the above
> >>>> five errors too.
> >>>
> >>> We've preferred adding new errors instead of adding errno values to
> >>> existing errors. I don't remember exactly why, but I personally don't care.
> >>
> >> Then let's not do it anymore. :)
> > 
> > We already did, changing this now will probably be worse.
> 
> Wait, I think you're conflating two things.
> 
> One is "do not shoehorn errors into errno values".  So for QOM invalid values we
> have PropertyValueBad, not a generic InvalidArgument value.  We convert everything
> to Error rather than returning negative errno values and then returning generic
> error codes, because those would be ugly and non-descriptive.  I agree with that.
> 
> The other is "when errors come straight from the OS, _do_ use errno values".
> This is for OpenFileFailed, for the new socket errors and so on.  This is what
> I am proposing.
> 
> These two rules together match what most other programs do.

[ cutting examples, hope to not cut anything important ]

> (aka ENOSYS).  So far that's obvious. :)  Now let's look at the second part, OS
> errors:
> 
>     $ echo | sed p > /dev/full
>     sed: couldn't flush stdout: No space left on device
>          ^^^^^^^^^^^^^^                                 error type
>                         ^^^^^^  ^^^^^^^^^^^^^^^^^^^^^^^ arguments
> 
> That would become, in JSON:
> 
>     { 'error': 'FlushFailed',
>       'file': 'stdout',
>       'os_error': 'enospc' }

I'm more or less fine with either way. I have some concerns on carrying
cruft if we keep changing our minds, but that's probably affordable. I also
think Anthony's way (ie. what we do today) is simpler, but again I wouldn't
oppose to do what you're suggesting.

Actually, I did propose something similar in the past but Anthony objected.

> Here is similar example:
> 
>     $ yes | sed p > /dev/full
>     sed: couldn't write 1 item to stdout: No space left on device
>          ^^^^^^^^^^^^^^                                            error type
>                         ^         ^^^^^^  ^^^^^^^^^^^^^^^^^^^^^^^  arguments
> 
>     { 'error': 'WriteFailed',
>       'file': 'stdout', 'count': 1,
>       'os_error': 'enospc' }
> 
> 
> Note how restricting everything to a single NoSpaceLeftOnDevice error would
> force you to drop information such as filename or count.

We could have optional arguments to NoSpaceLeftOnDevice. Could bloat the error,
or not (as most error scenarios can be common to each other).
Paolo Bonzini May 31, 2012, 3:49 p.m. UTC | #8
Il 31/05/2012 17:44, Luiz Capitulino ha scritto:
>> One is "do not shoehorn errors into errno values".  So for QOM invalid values we
>> have PropertyValueBad, not a generic InvalidArgument value.  We convert everything
>> to Error rather than returning negative errno values and then returning generic
>> error codes, because those would be ugly and non-descriptive.  I agree with that.
>>
>> The other is "when errors come straight from the OS, _do_ use errno values".
>> This is for OpenFileFailed, for the new socket errors and so on.  This is what
>> I am proposing.
>>
>> These two rules together match what most other programs do.
>>
>>     $ echo | sed p > /dev/full
>>     sed: couldn't flush stdout: No space left on device
>>          ^^^^^^^^^^^^^^                                 error type
>>                         ^^^^^^  ^^^^^^^^^^^^^^^^^^^^^^^ arguments
>>
>> That would become, in JSON:
>>
>>     { 'error': 'FlushFailed',
>>       'file': 'stdout',
>>       'os_error': 'enospc' }
> 
> Actually, I did propose something similar in the past but Anthony objected.

Looks like in the meanwhile we moved closer to this mechanism
(OpenFileFailed, Sock*, etc.), except we have no clear way to identify
_what_ error actually happened rather than _where_.

> We could have optional arguments to NoSpaceLeftOnDevice. Could bloat the error,
> or not (as most error scenarios can be common to each other).

Not as soon as you start reusing the same errors for QOM or similar
scenarios (like PermissionDenied).

Paolo
Luiz Capitulino May 31, 2012, 4:08 p.m. UTC | #9
On Thu, 31 May 2012 17:49:42 +0200
Paolo Bonzini <pbonzini@redhat.com> wrote:

> Il 31/05/2012 17:44, Luiz Capitulino ha scritto:
> >> One is "do not shoehorn errors into errno values".  So for QOM invalid values we
> >> have PropertyValueBad, not a generic InvalidArgument value.  We convert everything
> >> to Error rather than returning negative errno values and then returning generic
> >> error codes, because those would be ugly and non-descriptive.  I agree with that.
> >>
> >> The other is "when errors come straight from the OS, _do_ use errno values".
> >> This is for OpenFileFailed, for the new socket errors and so on.  This is what
> >> I am proposing.
> >>
> >> These two rules together match what most other programs do.
> >>
> >>     $ echo | sed p > /dev/full
> >>     sed: couldn't flush stdout: No space left on device
> >>          ^^^^^^^^^^^^^^                                 error type
> >>                         ^^^^^^  ^^^^^^^^^^^^^^^^^^^^^^^ arguments
> >>
> >> That would become, in JSON:
> >>
> >>     { 'error': 'FlushFailed',
> >>       'file': 'stdout',
> >>       'os_error': 'enospc' }
> > 
> > Actually, I did propose something similar in the past but Anthony objected.
> 
> Looks like in the meanwhile we moved closer to this mechanism
> (OpenFileFailed, Sock*, etc.), except we have no clear way to identify
> _what_ error actually happened rather than _where_.

In fact, it's the other way around. OpenFileFailed, for example, is an old
error. We thought it would be ok to have it that way (also because of shallow
QMP conversion, as it would take ages to convert all possible errors). But today
it's clear it's bad and we're trying to move away from it.

The socket ones repeat this, but it's probably because people usually go
for the generic error first because having detailed errors with qerror is
cumbersome. I have some ideas to improve it that could _mitigate_ that problem,
like having a schema file qapi-errors.json:

 { 'error': 'InvalidParameter', 'data': { 'name': 'str' } }

vs. having to edit qerror.[ch]. This is not difficult to do, and I have a rfc
for it already.

But I really don't have the ultimate answer to the question of how to make
error reporting and infrastructure near perfect.
Paolo Bonzini May 31, 2012, 4:10 p.m. UTC | #10
Il 31/05/2012 18:08, Luiz Capitulino ha scritto:
> On Thu, 31 May 2012 17:49:42 +0200
> Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
>> Il 31/05/2012 17:44, Luiz Capitulino ha scritto:
>>>> One is "do not shoehorn errors into errno values".  So for QOM invalid values we
>>>> have PropertyValueBad, not a generic InvalidArgument value.  We convert everything
>>>> to Error rather than returning negative errno values and then returning generic
>>>> error codes, because those would be ugly and non-descriptive.  I agree with that.
>>>>
>>>> The other is "when errors come straight from the OS, _do_ use errno values".
>>>> This is for OpenFileFailed, for the new socket errors and so on.  This is what
>>>> I am proposing.
>>>>
>>>> These two rules together match what most other programs do.
>>>>
>>>>     $ echo | sed p > /dev/full
>>>>     sed: couldn't flush stdout: No space left on device
>>>>          ^^^^^^^^^^^^^^                                 error type
>>>>                         ^^^^^^  ^^^^^^^^^^^^^^^^^^^^^^^ arguments
>>>>
>>>> That would become, in JSON:
>>>>
>>>>     { 'error': 'FlushFailed',
>>>>       'file': 'stdout',
>>>>       'os_error': 'enospc' }
>>>
>>> Actually, I did propose something similar in the past but Anthony objected.
>>
>> Looks like in the meanwhile we moved closer to this mechanism
>> (OpenFileFailed, Sock*, etc.), except we have no clear way to identify
>> _what_ error actually happened rather than _where_.
> 
> In fact, it's the other way around. OpenFileFailed, for example, is an old
> error. We thought it would be ok to have it that way (also because of shallow
> QMP conversion, as it would take ages to convert all possible errors). But today
> it's clear it's bad and we're trying to move away from it.

Is it so clear?  It's bad that we do not have a way to report OS errors
precisely.  But what _is_ absolutely clear, is that we need to report
the "what" (EACCES etc.), but also either the "where" (filename) or the
"when" (bind/connect/...)

Adding an errno to the *Failed errors really seems like a no-brainer to me.

Paolo
Kevin Wolf June 1, 2012, 12:40 p.m. UTC | #11
Am 31.05.2012 18:08, schrieb Luiz Capitulino:
> On Thu, 31 May 2012 17:49:42 +0200
> Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
>> Il 31/05/2012 17:44, Luiz Capitulino ha scritto:
>>>> One is "do not shoehorn errors into errno values".  So for QOM invalid values we
>>>> have PropertyValueBad, not a generic InvalidArgument value.  We convert everything
>>>> to Error rather than returning negative errno values and then returning generic
>>>> error codes, because those would be ugly and non-descriptive.  I agree with that.
>>>>
>>>> The other is "when errors come straight from the OS, _do_ use errno values".
>>>> This is for OpenFileFailed, for the new socket errors and so on.  This is what
>>>> I am proposing.
>>>>
>>>> These two rules together match what most other programs do.
>>>>
>>>>     $ echo | sed p > /dev/full
>>>>     sed: couldn't flush stdout: No space left on device
>>>>          ^^^^^^^^^^^^^^                                 error type
>>>>                         ^^^^^^  ^^^^^^^^^^^^^^^^^^^^^^^ arguments
>>>>
>>>> That would become, in JSON:
>>>>
>>>>     { 'error': 'FlushFailed',
>>>>       'file': 'stdout',
>>>>       'os_error': 'enospc' }
>>>
>>> Actually, I did propose something similar in the past but Anthony objected.
>>
>> Looks like in the meanwhile we moved closer to this mechanism
>> (OpenFileFailed, Sock*, etc.), except we have no clear way to identify
>> _what_ error actually happened rather than _where_.
> 
> In fact, it's the other way around. OpenFileFailed, for example, is an old
> error. We thought it would be ok to have it that way (also because of shallow
> QMP conversion, as it would take ages to convert all possible errors). But today
> it's clear it's bad and we're trying to move away from it.

It's not all that clear for me. Or actually, it is rather clear that
it's basically the right thing, but some additional information is required.

> The socket ones repeat this, but it's probably because people usually go
> for the generic error first because having detailed errors with qerror is
> cumbersome. I have some ideas to improve it that could _mitigate_ that problem,
> like having a schema file qapi-errors.json:
> 
>  { 'error': 'InvalidParameter', 'data': { 'name': 'str' } }

Errors that are as simple as that are useless if they are all you get.
Even for InvalidParameter this is true when you have more than a flat
arguments dict.

Maybe what is required in order to fully describe an error is a whole
chain of error objects that describe which error caused which other
action to fail:

{ 'error': 'TransactionCommandFailed', 'data': {
  'command': 'blockdev-snapshot-sync',
  'cause': {
      'error': 'FileOpenFailed', 'data' : {
        'filename': '/tmp/foo.img',
        'cause': {
          'error': 'PermissionDenied', 'data': {}
        }
      }
   }
}

Not sure if that would be easy to process for a QMP client, though...

Kevin
Luiz Capitulino June 13, 2012, 5:49 p.m. UTC | #12
On Thu, 31 May 2012 16:54:47 +0200
Paolo Bonzini <pbonzini@redhat.com> wrote:

> Wait, I think you're conflating two things.
> 
> One is "do not shoehorn errors into errno values".  So for QOM invalid values we
> have PropertyValueBad, not a generic InvalidArgument value.  We convert everything
> to Error rather than returning negative errno values and then returning generic
> error codes, because those would be ugly and non-descriptive.  I agree with that.
> 
> The other is "when errors come straight from the OS, _do_ use errno values".
> This is for OpenFileFailed, for the new socket errors and so on.  This is what
> I am proposing.

[...]

>     $ echo | sed p > /dev/full
>     sed: couldn't flush stdout: No space left on device
>          ^^^^^^^^^^^^^^                                 error type
>                         ^^^^^^  ^^^^^^^^^^^^^^^^^^^^^^^ arguments
> 
> That would become, in JSON:
> 
>     { 'error': 'FlushFailed',
>       'file': 'stdout',
>       'os_error': 'enospc' }

This is not a new discussion and what we're doing today is to return errno
as a QError class name. So, for the example above we'd return something like:

 { 'error': 'NoSpace' }

It's possible to add new optional values if you need more information, but
I know that that's not what you're asking.

I mostly agree that your version would be better, the only problem I see
is that this is probably going to mess a bit more our API as we have been
doing like my example above for some time.

Anthony, the current design was mostly influenced by you and you had
objections on doing what Paolo and Kevin are suggesting. What do you think?
Luiz Capitulino June 15, 2012, 2:46 p.m. UTC | #13
On Wed, 13 Jun 2012 14:49:10 -0300
Luiz Capitulino <lcapitulino@redhat.com> wrote:

> On Thu, 31 May 2012 16:54:47 +0200
> Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
> > Wait, I think you're conflating two things.
> > 
> > One is "do not shoehorn errors into errno values".  So for QOM invalid values we
> > have PropertyValueBad, not a generic InvalidArgument value.  We convert everything
> > to Error rather than returning negative errno values and then returning generic
> > error codes, because those would be ugly and non-descriptive.  I agree with that.
> > 
> > The other is "when errors come straight from the OS, _do_ use errno values".
> > This is for OpenFileFailed, for the new socket errors and so on.  This is what
> > I am proposing.
> 
> [...]
> 
> >     $ echo | sed p > /dev/full
> >     sed: couldn't flush stdout: No space left on device
> >          ^^^^^^^^^^^^^^                                 error type
> >                         ^^^^^^  ^^^^^^^^^^^^^^^^^^^^^^^ arguments
> > 
> > That would become, in JSON:
> > 
> >     { 'error': 'FlushFailed',
> >       'file': 'stdout',
> >       'os_error': 'enospc' }
> 
> This is not a new discussion and what we're doing today is to return errno
> as a QError class name. So, for the example above we'd return something like:
> 
>  { 'error': 'NoSpace' }
> 
> It's possible to add new optional values if you need more information, but
> I know that that's not what you're asking.
> 
> I mostly agree that your version would be better, the only problem I see
> is that this is probably going to mess a bit more our API as we have been
> doing like my example above for some time.
> 
> Anthony, the current design was mostly influenced by you and you had
> objections on doing what Paolo and Kevin are suggesting. What do you think?

Ping?

We have to reach a consensus of this because this is holding qapi conversions.
Anthony Liguori June 15, 2012, 4:52 p.m. UTC | #14
On 06/13/2012 12:49 PM, Luiz Capitulino wrote:
> On Thu, 31 May 2012 16:54:47 +0200
> Paolo Bonzini<pbonzini@redhat.com>  wrote:
>
>> Wait, I think you're conflating two things.
>>
>> One is "do not shoehorn errors into errno values".  So for QOM invalid values we
>> have PropertyValueBad, not a generic InvalidArgument value.  We convert everything
>> to Error rather than returning negative errno values and then returning generic
>> error codes, because those would be ugly and non-descriptive.  I agree with that.
>>
>> The other is "when errors come straight from the OS, _do_ use errno values".
>> This is for OpenFileFailed, for the new socket errors and so on.  This is what
>> I am proposing.
>
> [...]
>
>>      $ echo | sed p>  /dev/full
>>      sed: couldn't flush stdout: No space left on device
>>           ^^^^^^^^^^^^^^                                 error type
>>                          ^^^^^^  ^^^^^^^^^^^^^^^^^^^^^^^ arguments
>>
>> That would become, in JSON:
>>
>>      { 'error': 'FlushFailed',
>>        'file': 'stdout',
>>        'os_error': 'enospc' }
>
> This is not a new discussion and what we're doing today is to return errno
> as a QError class name. So, for the example above we'd return something like:
>
>   { 'error': 'NoSpace' }


No, you're confusing things I think.  { 'error': 'NoSpace' } is bad.  errno is 
not an intrinsically bad thing but errno critically relies on the *caller* to 
understand the context that the error has occurred in.  Just returning { 
'error': 'NoSpace' } is not good enough in QMP because the caller doesn't know 
the context.  What was the command doing such that that error was returning?

In many cases, errno has different meanings depending on the context.  EINVAL is 
a good example of this.

The devil is in the details here.  Having an error like:

{ 'error': 'OpenFileFailed', 'file': 'filename', 'mode': 'r/w', 'os_error': 
'enospc' }

is actually pretty reasonable for something like a memory dump command where the 
user specifies a file.

OTOH, for something complex like live snapshotting which many involve opening 
multiple files, it may not be good enough.

So context is really everything here.

Regards,

Anthony Liguori

>
> It's possible to add new optional values if you need more information, but
> I know that that's not what you're asking.
>
> I mostly agree that your version would be better, the only problem I see
> is that this is probably going to mess a bit more our API as we have been
> doing like my example above for some time.
>
> Anthony, the current design was mostly influenced by you and you had
> objections on doing what Paolo and Kevin are suggesting. What do you think?
>
Paolo Bonzini June 15, 2012, 4:55 p.m. UTC | #15
Il 15/06/2012 18:52, Anthony Liguori ha scritto:
> Having an error like:
> 
> { 'error': 'OpenFileFailed', 'file': 'filename', 'mode': 'r/w',
> 'os_error': 'enospc' }
> 
> is actually pretty reasonable for something like a memory dump command
> where the user specifies a file.
> 
> OTOH, for something complex like live snapshotting which many involve
> opening multiple files, it may not be good enough.
> 
> So context is really everything here.

I agree, though I think this is the least of the problems in reporting
errors from complex commands such as transaction. :)

So I guess we can proceed with errno values, yuppy!

Paolo
Luiz Capitulino June 15, 2012, 5:02 p.m. UTC | #16
On Fri, 15 Jun 2012 11:52:57 -0500
Anthony Liguori <aliguori@us.ibm.com> wrote:

> On 06/13/2012 12:49 PM, Luiz Capitulino wrote:
> > On Thu, 31 May 2012 16:54:47 +0200
> > Paolo Bonzini<pbonzini@redhat.com>  wrote:
> >
> >> Wait, I think you're conflating two things.
> >>
> >> One is "do not shoehorn errors into errno values".  So for QOM invalid values we
> >> have PropertyValueBad, not a generic InvalidArgument value.  We convert everything
> >> to Error rather than returning negative errno values and then returning generic
> >> error codes, because those would be ugly and non-descriptive.  I agree with that.
> >>
> >> The other is "when errors come straight from the OS, _do_ use errno values".
> >> This is for OpenFileFailed, for the new socket errors and so on.  This is what
> >> I am proposing.
> >
> > [...]
> >
> >>      $ echo | sed p>  /dev/full
> >>      sed: couldn't flush stdout: No space left on device
> >>           ^^^^^^^^^^^^^^                                 error type
> >>                          ^^^^^^  ^^^^^^^^^^^^^^^^^^^^^^^ arguments
> >>
> >> That would become, in JSON:
> >>
> >>      { 'error': 'FlushFailed',
> >>        'file': 'stdout',
> >>        'os_error': 'enospc' }
> >
> > This is not a new discussion and what we're doing today is to return errno
> > as a QError class name. So, for the example above we'd return something like:
> >
> >   { 'error': 'NoSpace' }
> 
> 
> No, you're confusing things I think.  { 'error': 'NoSpace' } is bad.  errno is 
> not an intrinsically bad thing but errno critically relies on the *caller* to 
> understand the context that the error has occurred in.  Just returning { 
> 'error': 'NoSpace' } is not good enough in QMP because the caller doesn't know 
> the context.  What was the command doing such that that error was returning?

The screendump command (not merged yet) can return it during open() or write().

> In many cases, errno has different meanings depending on the context.  EINVAL is 
> a good example of this.
> 
> The devil is in the details here.  Having an error like:
> 
> { 'error': 'OpenFileFailed', 'file': 'filename', 'mode': 'r/w', 'os_error': 
> 'enospc' }
> 
> is actually pretty reasonable for something like a memory dump command where the 
> user specifies a file.

And WriteFailed, ReadFailed, etc? This is what Paolo is suggesting I think. If
you agree on doing this, then I'll switch the screendump conversion to do this
and we'll have to extend existing errors.

And, I'm afraid that some not so new qapi conversions did the NoSpace example
above...

> OTOH, for something complex like live snapshotting which many involve opening 
> multiple files, it may not be good enough.

Kevin, I think you had a different proposal for the live snapshot command?
Daniel P. Berrangé June 15, 2012, 5:03 p.m. UTC | #17
On Fri, Jun 15, 2012 at 11:52:57AM -0500, Anthony Liguori wrote:
> On 06/13/2012 12:49 PM, Luiz Capitulino wrote:
> No, you're confusing things I think.  { 'error': 'NoSpace' } is bad.
> errno is not an intrinsically bad thing but errno critically relies
> on the *caller* to understand the context that the error has
> occurred in.  Just returning { 'error': 'NoSpace' } is not good
> enough in QMP because the caller doesn't know the context.  What was
> the command doing such that that error was returning?
> 
> In many cases, errno has different meanings depending on the
> context.  EINVAL is a good example of this.
> 
> The devil is in the details here.  Having an error like:
> 
> { 'error': 'OpenFileFailed', 'file': 'filename', 'mode': 'r/w',
> 'os_error': 'enospc' }
> 
> is actually pretty reasonable for something like a memory dump
> command where the user specifies a file.

I can't help thinking that we're still over-engineering the error
reporting for QMP, and that really all we need is a reasonably
coarse error code/class, and an informal string.

eg,

   { 'error': "SystemError", msg = "failed to open file '/foo/bar' for writing: no space on device" }

   { 'error': "DNSError", msg = "unable to resolve hostname 'foo': cannot reach nameserver"}

   etc

In libvirt we started with a ridiculously complicated virErrorPtr
struct, which no one ever remembered to fill our details in, or
filledout details inconsistently. These days we only ever bother
with a coarse error class, and a string, and in the case of a
system error, we also include the raw errno value.

Pretty much all common APIs / languages focus primarily on just
an error code/class and a informal string too, with the odd
exception eg Python's OSException provides you the errno value
too

Are any users of QMP actually asking for this kind of advanced
error reporting ?  From libvirt's POV we're perfectly content
with just an error class & string.

Regards,
Daniel
Kevin Wolf June 15, 2012, 5:23 p.m. UTC | #18
Am 15.06.2012 19:02, schrieb Luiz Capitulino:
> On Fri, 15 Jun 2012 11:52:57 -0500
> Anthony Liguori <aliguori@us.ibm.com> wrote:
> 
>> On 06/13/2012 12:49 PM, Luiz Capitulino wrote:
>>> On Thu, 31 May 2012 16:54:47 +0200
>>> Paolo Bonzini<pbonzini@redhat.com>  wrote:
>>>
>>>> Wait, I think you're conflating two things.
>>>>
>>>> One is "do not shoehorn errors into errno values".  So for QOM invalid values we
>>>> have PropertyValueBad, not a generic InvalidArgument value.  We convert everything
>>>> to Error rather than returning negative errno values and then returning generic
>>>> error codes, because those would be ugly and non-descriptive.  I agree with that.
>>>>
>>>> The other is "when errors come straight from the OS, _do_ use errno values".
>>>> This is for OpenFileFailed, for the new socket errors and so on.  This is what
>>>> I am proposing.
>>>
>>> [...]
>>>
>>>>      $ echo | sed p>  /dev/full
>>>>      sed: couldn't flush stdout: No space left on device
>>>>           ^^^^^^^^^^^^^^                                 error type
>>>>                          ^^^^^^  ^^^^^^^^^^^^^^^^^^^^^^^ arguments
>>>>
>>>> That would become, in JSON:
>>>>
>>>>      { 'error': 'FlushFailed',
>>>>        'file': 'stdout',
>>>>        'os_error': 'enospc' }
>>>
>>> This is not a new discussion and what we're doing today is to return errno
>>> as a QError class name. So, for the example above we'd return something like:
>>>
>>>   { 'error': 'NoSpace' }
>>
>>
>> No, you're confusing things I think.  { 'error': 'NoSpace' } is bad.  errno is 
>> not an intrinsically bad thing but errno critically relies on the *caller* to 
>> understand the context that the error has occurred in.  Just returning { 
>> 'error': 'NoSpace' } is not good enough in QMP because the caller doesn't know 
>> the context.  What was the command doing such that that error was returning?
> 
> The screendump command (not merged yet) can return it during open() or write().
> 
>> In many cases, errno has different meanings depending on the context.  EINVAL is 
>> a good example of this.
>>
>> The devil is in the details here.  Having an error like:
>>
>> { 'error': 'OpenFileFailed', 'file': 'filename', 'mode': 'r/w', 'os_error': 
>> 'enospc' }
>>
>> is actually pretty reasonable for something like a memory dump command where the 
>> user specifies a file.
> 
> And WriteFailed, ReadFailed, etc? This is what Paolo is suggesting I think. If
> you agree on doing this, then I'll switch the screendump conversion to do this
> and we'll have to extend existing errors.
> 
> And, I'm afraid that some not so new qapi conversions did the NoSpace example
> above...
> 
>> OTOH, for something complex like live snapshotting which many involve opening 
>> multiple files, it may not be good enough.
> 
> Kevin, I think you had a different proposal for the live snapshot command?

http://lists.gnu.org/archive/html/qemu-devel/2012-06/msg00035.html

You mean this one? Note that this wouldn't only be for live snapshots,
but a general approach to error handling, even though most command
wouldn't nest very deep.

I'm not sure whether I'm really proposing this, I just wanted to bring
it up as a possible alternative to be discussed. I'm pretty sure that it
would work and it would cover everything we need (which I'm not for the
other proposals). But I'm not quite sure if it would be easy enough to
use for a QMP client.

Kevin
Anthony Liguori June 15, 2012, 5:48 p.m. UTC | #19
On 06/15/2012 11:55 AM, Paolo Bonzini wrote:
> Il 15/06/2012 18:52, Anthony Liguori ha scritto:
>> Having an error like:
>>
>> { 'error': 'OpenFileFailed', 'file': 'filename', 'mode': 'r/w',
>> 'os_error': 'enospc' }
>>
>> is actually pretty reasonable for something like a memory dump command
>> where the user specifies a file.
>>
>> OTOH, for something complex like live snapshotting which many involve
>> opening multiple files, it may not be good enough.
>>
>> So context is really everything here.
>
> I agree, though I think this is the least of the problems in reporting
> errors from complex commands such as transaction. :)
>
> So I guess we can proceed with errno values, yuppy!

Yes, but I reserve the right to nack abuses :-)

Regards,

Anthony Liguori

>
> Paolo
>
Paolo Bonzini June 15, 2012, 7:11 p.m. UTC | #20
Il 15/06/2012 19:48, Anthony Liguori ha scritto:
>> So I guess we can proceed with errno values, yuppy!
> 
> Yes, but I reserve the right to nack abuses :-)

I will help, don't worry.

Paolo
Markus Armbruster June 18, 2012, 3:41 p.m. UTC | #21
"Daniel P. Berrange" <berrange@redhat.com> writes:

> On Fri, Jun 15, 2012 at 11:52:57AM -0500, Anthony Liguori wrote:
>> On 06/13/2012 12:49 PM, Luiz Capitulino wrote:
>> No, you're confusing things I think.  { 'error': 'NoSpace' } is bad.
>> errno is not an intrinsically bad thing but errno critically relies
>> on the *caller* to understand the context that the error has
>> occurred in.  Just returning { 'error': 'NoSpace' } is not good
>> enough in QMP because the caller doesn't know the context.  What was
>> the command doing such that that error was returning?
>> 
>> In many cases, errno has different meanings depending on the
>> context.  EINVAL is a good example of this.
>> 
>> The devil is in the details here.  Having an error like:
>> 
>> { 'error': 'OpenFileFailed', 'file': 'filename', 'mode': 'r/w',
>> 'os_error': 'enospc' }
>> 
>> is actually pretty reasonable for something like a memory dump
>> command where the user specifies a file.
>
> I can't help thinking that we're still over-engineering the error
> reporting for QMP, and that really all we need is a reasonably
> coarse error code/class, and an informal string.
>
> eg,
>
>    { 'error': "SystemError", msg = "failed to open file '/foo/bar' for writing: no space on device" }
>
>    { 'error': "DNSError", msg = "unable to resolve hostname 'foo': cannot reach nameserver"}
>
>    etc
>
> In libvirt we started with a ridiculously complicated virErrorPtr
> struct, which no one ever remembered to fill our details in, or
> filledout details inconsistently. These days we only ever bother
> with a coarse error class, and a string, and in the case of a
> system error, we also include the raw errno value.

Good match for real-world error handling, which is usually a minor
variation of

    if (didn't work)
        if (retry might fix it)
            retry
        else if (I got a plan B to try)
            try plan B
        else
            punt to human

Error information used:

1. whether it failed

2. whether a failure is transient or permanent

3. a description of the failure fit for human consumption

> Pretty much all common APIs / languages focus primarily on just
> an error code/class and a informal string too, with the odd
> exception eg Python's OSException provides you the errno value
> too
>
> Are any users of QMP actually asking for this kind of advanced
> error reporting ?  From libvirt's POV we're perfectly content
> with just an error class & string.

Real users, please, not theoretical ones.
Anthony Liguori June 18, 2012, 6:31 p.m. UTC | #22
On 06/18/2012 10:41 AM, Markus Armbruster wrote:
> "Daniel P. Berrange"<berrange@redhat.com>  writes:
>
>> On Fri, Jun 15, 2012 at 11:52:57AM -0500, Anthony Liguori wrote:
>>> On 06/13/2012 12:49 PM, Luiz Capitulino wrote:
>>> No, you're confusing things I think.  { 'error': 'NoSpace' } is bad.
>>> errno is not an intrinsically bad thing but errno critically relies
>>> on the *caller* to understand the context that the error has
>>> occurred in.  Just returning { 'error': 'NoSpace' } is not good
>>> enough in QMP because the caller doesn't know the context.  What was
>>> the command doing such that that error was returning?
>>>
>>> In many cases, errno has different meanings depending on the
>>> context.  EINVAL is a good example of this.
>>>
>>> The devil is in the details here.  Having an error like:
>>>
>>> { 'error': 'OpenFileFailed', 'file': 'filename', 'mode': 'r/w',
>>> 'os_error': 'enospc' }
>>>
>>> is actually pretty reasonable for something like a memory dump
>>> command where the user specifies a file.
>>
>> I can't help thinking that we're still over-engineering the error
>> reporting for QMP, and that really all we need is a reasonably
>> coarse error code/class, and an informal string.
>>
>> eg,
>>
>>     { 'error': "SystemError", msg = "failed to open file '/foo/bar' for writing: no space on device" }
>>
>>     { 'error': "DNSError", msg = "unable to resolve hostname 'foo': cannot reach nameserver"}
>>
>>     etc
>>
>> In libvirt we started with a ridiculously complicated virErrorPtr
>> struct, which no one ever remembered to fill our details in, or
>> filledout details inconsistently. These days we only ever bother
>> with a coarse error class, and a string, and in the case of a
>> system error, we also include the raw errno value.
>
> Good match for real-world error handling, which is usually a minor
> variation of
>
>      if (didn't work)
>          if (retry might fix it)
>              retry
>          else if (I got a plan B to try)
>              try plan B
>          else
>              punt to human
>
> Error information used:
>
> 1. whether it failed
>
> 2. whether a failure is transient or permanent
>
> 3. a description of the failure fit for human consumption
>
>> Pretty much all common APIs / languages focus primarily on just
>> an error code/class and a informal string too, with the odd
>> exception eg Python's OSException provides you the errno value
>> too
>>
>> Are any users of QMP actually asking for this kind of advanced
>> error reporting ?  From libvirt's POV we're perfectly content
>> with just an error class&  string.
>
> Real users, please, not theoretical ones.

Irrespective of anything else, I think it's safe to say the experiment of "rich 
errors" has been a failure.  We still have way too many places using error_report.

As I mentioned in another thread, I think we should:

1) Introduce a GENERIC_ERROR QError type.  It could have a 'domain' and a 'msg' 
field.

2) Focus on converting users of error_report over to use propagated Error objects.

We shouldn't/can't change existing QError users.  We also shouldn't consider 
changing the wire protocol.  But for new error users, we should/can relax the 
reported errors.

We need a clear support policy on whether the contents of 'msg' are stable or 
not too.

Regards,

Anthony Liguori
Kevin Wolf June 19, 2012, 7:39 a.m. UTC | #23
Am 18.06.2012 20:31, schrieb Anthony Liguori:
> Irrespective of anything else, I think it's safe to say the experiment of "rich 
> errors" has been a failure.  We still have way too many places using error_report.
> 
> As I mentioned in another thread, I think we should:
> 
> 1) Introduce a GENERIC_ERROR QError type.  It could have a 'domain' and a 'msg' 
> field.
> 
> 2) Focus on converting users of error_report over to use propagated Error objects.
> 
> We shouldn't/can't change existing QError users.  We also shouldn't consider 
> changing the wire protocol.  But for new error users, we should/can relax the 
> reported errors.
> 
> We need a clear support policy on whether the contents of 'msg' are stable or 
> not too.

Another point that you used to bring up in earlier discussions is
translated error messages. If we start returning error messages that are
meant to displayed to the user, should we get your gettext patches
applied which you did for the GTK backend? libvirt would then have to
pay attention to start qemu with the same locale as the client has.

Kevin
Daniel P. Berrangé June 19, 2012, 9:20 a.m. UTC | #24
On Tue, Jun 19, 2012 at 09:39:34AM +0200, Kevin Wolf wrote:
> Am 18.06.2012 20:31, schrieb Anthony Liguori:
> > Irrespective of anything else, I think it's safe to say the experiment of "rich 
> > errors" has been a failure.  We still have way too many places using error_report.
> > 
> > As I mentioned in another thread, I think we should:
> > 
> > 1) Introduce a GENERIC_ERROR QError type.  It could have a 'domain' and a 'msg' 
> > field.
> > 
> > 2) Focus on converting users of error_report over to use propagated Error objects.
> > 
> > We shouldn't/can't change existing QError users.  We also shouldn't consider 
> > changing the wire protocol.  But for new error users, we should/can relax the 
> > reported errors.
> > 
> > We need a clear support policy on whether the contents of 'msg' are stable or 
> > not too.
> 
> Another point that you used to bring up in earlier discussions is
> translated error messages. If we start returning error messages that are
> meant to displayed to the user, should we get your gettext patches
> applied which you did for the GTK backend? libvirt would then have to
> pay attention to start qemu with the same locale as the client has.

You can't really start the VM in the same locale as the client app,
because there's no persistent 1:N relationship between libvirt clients
and VMs - it is M:N, so you can't choose a single VM. In addition there
is a bunch of work that libvirt does against VMs in contexts that have
no associated client. You just have to have 1 system wide locale for
all QEMU VMs on a host and libvirt.

Regards,
Daniel
Kevin Wolf June 19, 2012, 9:31 a.m. UTC | #25
Am 19.06.2012 11:20, schrieb Daniel P. Berrange:
> On Tue, Jun 19, 2012 at 09:39:34AM +0200, Kevin Wolf wrote:
>> Am 18.06.2012 20:31, schrieb Anthony Liguori:
>>> Irrespective of anything else, I think it's safe to say the experiment of "rich 
>>> errors" has been a failure.  We still have way too many places using error_report.
>>>
>>> As I mentioned in another thread, I think we should:
>>>
>>> 1) Introduce a GENERIC_ERROR QError type.  It could have a 'domain' and a 'msg' 
>>> field.
>>>
>>> 2) Focus on converting users of error_report over to use propagated Error objects.
>>>
>>> We shouldn't/can't change existing QError users.  We also shouldn't consider 
>>> changing the wire protocol.  But for new error users, we should/can relax the 
>>> reported errors.
>>>
>>> We need a clear support policy on whether the contents of 'msg' are stable or 
>>> not too.
>>
>> Another point that you used to bring up in earlier discussions is
>> translated error messages. If we start returning error messages that are
>> meant to displayed to the user, should we get your gettext patches
>> applied which you did for the GTK backend? libvirt would then have to
>> pay attention to start qemu with the same locale as the client has.
> 
> You can't really start the VM in the same locale as the client app,
> because there's no persistent 1:N relationship between libvirt clients
> and VMs - it is M:N, so you can't choose a single VM. In addition there
> is a bunch of work that libvirt does against VMs in contexts that have
> no associated client. You just have to have 1 system wide locale for
> all QEMU VMs on a host and libvirt.

Good point. So if we ever needed it, we would have to introduce a
monitor command to switch. But in most cases client and server locale
should be the same anyway, so I think we can ignore that part for the start.

Kevin
Luiz Capitulino June 19, 2012, 1:12 p.m. UTC | #26
On Mon, 18 Jun 2012 13:31:52 -0500
Anthony Liguori <anthony@codemonkey.ws> wrote:

> >> Are any users of QMP actually asking for this kind of advanced
> >> error reporting ?  From libvirt's POV we're perfectly content
> >> with just an error class&  string.
> >
> > Real users, please, not theoretical ones.
> 
> Irrespective of anything else, I think it's safe to say the experiment of "rich 
> errors" has been a failure.  

Yes, I fully agree.

> We still have way too many places using error_report.
> 
> As I mentioned in another thread, I think we should:
> 
> 1) Introduce a GENERIC_ERROR QError type.  It could have a 'domain' and a 'msg' 
> field.
> 
> 2) Focus on converting users of error_report over to use propagated Error objects.

I agree with this too and the conversion itself can mostly be automated
I think. However, I think this is a related, but different problem (more below).

> We shouldn't/can't change existing QError users.  We also shouldn't consider 
> changing the wire protocol.  But for new error users, we should/can relax the 
> reported errors.

Can we agree on what 'relax' actually means?

In the very beginning of QMP, Markus had an idea of making errors absurdly
simple iirc it was just three general classes and a message (am I right,
Markus)?

Daniel seems to suggest something along these lines too. However, in my
understanding we're going to have two kinds of errors:

 1. OS errors: system calls or library functions errors. They will look
    like this:

     { "error": "OpenFileFailed", "filename": "/tmp/foo",
       "os-error": "nospace" }

    This means that, for every system call we're going to have a FOOFailed.
    Not sure this is reasonable.

 2. Anything else that doesn't fall in item 2, iow command specific errors,
    like InvalidBlockFormat.

Is this we really want to have? This is an honest question.

Btw, I think we first have to decide what we really want and afterwards we
discuss compatibility. I'm not saying we'll break it, but we might be able
to move forward and still maintain compatibility depending on what we want.

> We need a clear support policy on whether the contents of 'msg' are stable or 
> not too.

It's already declared on the qmp spec as not stable:

- The "desc" member is a human-readable error message. Clients should
  not attempt to parse this message.

Also, the qmp-commands.txt is very strong on error compatibility:

    3. Errors, in special, are not documented. Applications should NOT check
       for specific errors classes or data (it's strongly recommended to only
       check for the "error" key)

This is a bit unrealistic today though, as this was written when we were
still unsure about QMP's future and errors are getting documented in the
schema anyway.
Luiz Capitulino June 20, 2012, 5:48 p.m. UTC | #27
Yet another thread fork.

After talking with Daniel and Markus about QMP errors (which is not just about
QMP, as this affects QEMU as whole), I've put together the proposal below.

I'll discuss three points. First, the error format and classes. Second, the
internal API and third compatibility.

Don't be afraid about the length of this email, only the first section is long
but it mostly contains error classes listings.

1. Error format and classes

We can keep the same error format we have today, which is:

 { "error": { "class": json-string,
              "data": json-object,
              "desc": json-string }, "id": json-value }

  where 'data', 'desc' and 'id' are optional fields.

However, we'd change how we use 'desc' and our error classes. 'desc' would
become a string which is filled by a printf-like function (see section 2) and
we'd replace all error classes we have today by the following ones:

  o ParameterError: any error which involves a bad parameter. Replaces
    InvalidParameter, InvalidParameterCombination, InvalidParameterType,
    InvalidParameterValue, MissingParameter

  o SystemError: syscall or library errors. Replaces BufferOverrun,
    IOError, OpenFileFailed, PermissionDenied, TooManyFiles,
    SockConnectInprogress, SockConnectFailed, SockListenFailed,
    SockBindFailed, SockCreateFailed.

    This error can include an optional 'os-error' field in the 'data'
    member (see section 2).

  o QEMUError: errors that are internal to QEMU, like AmbiguousPath,
    BadBusForDevice, BusNoHotplug, BusNotFound, CommandDisabled,
    CommandNotFound, DuplicateId, FeatureDisabled, JSONParseError,
    JSONParsing, KVMMissingCap, MigrationActive, MigrationNotSupported,
    MigrationExpected, NoBusForDevice, NotSupported, PropertyValueBad,
    PropertyValueInUse, PropertyValueNotFound, PropertyValueNotPowerOf2,
    PropertyValueOutOfRange, ResetRequired, SetPasswdFailed, Unsupported,
    VirtFSFeatureBlocksMigration, VNCServerFailed

  o UndefinedError: the same it's today, undefined :)

Now, there are two important points to be observed:

 - We check for DeviceEncrypted in hmp.c and fetch its parameters. This
   probably indicates that we might want to create specialized classes
   when necessary

 - I don't know where to put all the DeviceFoo errors, but they probably fit
   in QEMUError or a new class like DeviceError

2. Internal API

This is very straightforward:

 o error_set_deprecated();

   Current error_set(), case we keep it for compatibility (see section 3).

 o error_set(ErrorClass err_class, const char *fmt, ...);

   Main function, sets the error classes and allow for a free human-readable
   error message.

 o error_set_sys_fmt(int err_no, const char *fmt, ...);
 o error_set_sys(int err_no);

   Helpers for setting SystemError errors. error_set_sys() gets the 'desc'
   string from strerror().

3. Compatibility

We have two options here:

 1. Maintain the current errors and implement the new way only for new
    commands (includes commands merged for 1.2).

      Pros: maintains compatibility and won't require any code churn
      Cons: we'll have two different errors schemas in use at the same
            time and will be carrying garbage forward

 2. Do a full conversion to the new way.

      Pros: we drop bad code and avoid pollution (good for Rio+20)
      Cons: possibly breaks compatibility and will require a lot of code
            churn up front
Anthony Liguori June 20, 2012, 6:46 p.m. UTC | #28
On 06/20/2012 12:48 PM, Luiz Capitulino wrote:
> Yet another thread fork.
>
> After talking with Daniel and Markus about QMP errors (which is not just about
> QMP, as this affects QEMU as whole), I've put together the proposal below.
>
> I'll discuss three points. First, the error format and classes. Second, the
> internal API and third compatibility.
>
> Don't be afraid about the length of this email, only the first section is long
> but it mostly contains error classes listings.
>
> 1. Error format and classes
>
> We can keep the same error format we have today, which is:
>
>   { "error": { "class": json-string,
>                "data": json-object,
>                "desc": json-string }, "id": json-value }
>
>    where 'data', 'desc' and 'id' are optional fields.
>
> However, we'd change how we use 'desc' and our error classes. 'desc' would
> become a string which is filled by a printf-like function (see section 2) and
> we'd replace all error classes we have today by the following ones:
>
>    o ParameterError: any error which involves a bad parameter. Replaces
>      InvalidParameter, InvalidParameterCombination, InvalidParameterType,
>      InvalidParameterValue, MissingParameter
>
>    o SystemError: syscall or library errors. Replaces BufferOverrun,
>      IOError, OpenFileFailed, PermissionDenied, TooManyFiles,
>      SockConnectInprogress, SockConnectFailed, SockListenFailed,
>      SockBindFailed, SockCreateFailed.
>
>      This error can include an optional 'os-error' field in the 'data'
>      member (see section 2).
>
>    o QEMUError: errors that are internal to QEMU, like AmbiguousPath,
>      BadBusForDevice, BusNoHotplug, BusNotFound, CommandDisabled,
>      CommandNotFound, DuplicateId, FeatureDisabled, JSONParseError,
>      JSONParsing, KVMMissingCap, MigrationActive, MigrationNotSupported,
>      MigrationExpected, NoBusForDevice, NotSupported, PropertyValueBad,
>      PropertyValueInUse, PropertyValueNotFound, PropertyValueNotPowerOf2,
>      PropertyValueOutOfRange, ResetRequired, SetPasswdFailed, Unsupported,
>      VirtFSFeatureBlocksMigration, VNCServerFailed
>
>    o UndefinedError: the same it's today, undefined :)
>
> Now, there are two important points to be observed:
>
>   - We check for DeviceEncrypted in hmp.c and fetch its parameters. This
>     probably indicates that we might want to create specialized classes
>     when necessary
>
>   - I don't know where to put all the DeviceFoo errors, but they probably fit
>     in QEMUError or a new class like DeviceError
>
> 2. Internal API
>
> This is very straightforward:
>
>   o error_set_deprecated();
>
>     Current error_set(), case we keep it for compatibility (see section 3).
>
>   o error_set(ErrorClass err_class, const char *fmt, ...);
>
>     Main function, sets the error classes and allow for a free human-readable
>     error message.
>
>   o error_set_sys_fmt(int err_no, const char *fmt, ...);
>   o error_set_sys(int err_no);
>
>     Helpers for setting SystemError errors. error_set_sys() gets the 'desc'
>     string from strerror().

Um, why not just do:

#define GENERIC_ERROR "{'class': 'GenericError', 'data': { 'domain': %s, 'msg': %s}"

And then just use:

error_set(errp, GENERIC_ERROR, SOME_DOMAIN, "This operation failed!");

If you want to make a little wrapper that does:

static void error_set_generic(Error **errp, const char *domain, const char *msg, 
...);

That's fine too.

But let's not overdo this and let's absolutely not change existing code! 
There's simply no need to go through a convert-everything project here.

Regards,

Anthony Liguori
Luiz Capitulino June 20, 2012, 7:40 p.m. UTC | #29
On Wed, 20 Jun 2012 13:46:08 -0500
Anthony Liguori <anthony@codemonkey.ws> wrote:

> On 06/20/2012 12:48 PM, Luiz Capitulino wrote:
> > Yet another thread fork.
> >
> > After talking with Daniel and Markus about QMP errors (which is not just about
> > QMP, as this affects QEMU as whole), I've put together the proposal below.
> >
> > I'll discuss three points. First, the error format and classes. Second, the
> > internal API and third compatibility.
> >
> > Don't be afraid about the length of this email, only the first section is long
> > but it mostly contains error classes listings.
> >
> > 1. Error format and classes
> >
> > We can keep the same error format we have today, which is:
> >
> >   { "error": { "class": json-string,
> >                "data": json-object,
> >                "desc": json-string }, "id": json-value }
> >
> >    where 'data', 'desc' and 'id' are optional fields.
> >
> > However, we'd change how we use 'desc' and our error classes. 'desc' would
> > become a string which is filled by a printf-like function (see section 2) and
> > we'd replace all error classes we have today by the following ones:
> >
> >    o ParameterError: any error which involves a bad parameter. Replaces
> >      InvalidParameter, InvalidParameterCombination, InvalidParameterType,
> >      InvalidParameterValue, MissingParameter
> >
> >    o SystemError: syscall or library errors. Replaces BufferOverrun,
> >      IOError, OpenFileFailed, PermissionDenied, TooManyFiles,
> >      SockConnectInprogress, SockConnectFailed, SockListenFailed,
> >      SockBindFailed, SockCreateFailed.
> >
> >      This error can include an optional 'os-error' field in the 'data'
> >      member (see section 2).
> >
> >    o QEMUError: errors that are internal to QEMU, like AmbiguousPath,
> >      BadBusForDevice, BusNoHotplug, BusNotFound, CommandDisabled,
> >      CommandNotFound, DuplicateId, FeatureDisabled, JSONParseError,
> >      JSONParsing, KVMMissingCap, MigrationActive, MigrationNotSupported,
> >      MigrationExpected, NoBusForDevice, NotSupported, PropertyValueBad,
> >      PropertyValueInUse, PropertyValueNotFound, PropertyValueNotPowerOf2,
> >      PropertyValueOutOfRange, ResetRequired, SetPasswdFailed, Unsupported,
> >      VirtFSFeatureBlocksMigration, VNCServerFailed
> >
> >    o UndefinedError: the same it's today, undefined :)
> >
> > Now, there are two important points to be observed:
> >
> >   - We check for DeviceEncrypted in hmp.c and fetch its parameters. This
> >     probably indicates that we might want to create specialized classes
> >     when necessary
> >
> >   - I don't know where to put all the DeviceFoo errors, but they probably fit
> >     in QEMUError or a new class like DeviceError
> >
> > 2. Internal API
> >
> > This is very straightforward:
> >
> >   o error_set_deprecated();
> >
> >     Current error_set(), case we keep it for compatibility (see section 3).
> >
> >   o error_set(ErrorClass err_class, const char *fmt, ...);
> >
> >     Main function, sets the error classes and allow for a free human-readable
> >     error message.
> >
> >   o error_set_sys_fmt(int err_no, const char *fmt, ...);
> >   o error_set_sys(int err_no);
> >
> >     Helpers for setting SystemError errors. error_set_sys() gets the 'desc'
> >     string from strerror().
> 
> Um, why not just do:
> 
> #define GENERIC_ERROR "{'class': 'GenericError', 'data': { 'domain': %s, 'msg': %s}"
> 
> And then just use:
> 
> error_set(errp, GENERIC_ERROR, SOME_DOMAIN, "This operation failed!");
> 
> If you want to make a little wrapper that does:
> 
> static void error_set_generic(Error **errp, const char *domain, const char *msg, 
> ...);
> 
> That's fine too.

Would 'domain' be one of the classes I suggested above?

I'm not sure this is better because it suggests that all classes we have today
are still valid. My main goal is to simplify, so keep using the current classes
goes against that. I think we should deprecate the current errors (vs. adding a
new one to the pile).

Also, maybe it's just how I'm interpreting it, but GenericError reminds of
UndefinedError in the sense that we'd prefer commands to use more specific
errors instead.

Actually, it's not clear to me when a command should use GenericError vs. a more
specific error?

> But let's not overdo this and let's absolutely not change existing code! 
> There's simply no need to go through a convert-everything project here.

I'm fine with keeping the current code, but I think this proposal overly
simplifies something that we're already overdoing.
Anthony Liguori June 20, 2012, 7:47 p.m. UTC | #30
On 06/20/2012 02:40 PM, Luiz Capitulino wrote:
> On Wed, 20 Jun 2012 13:46:08 -0500
> Anthony Liguori<anthony@codemonkey.ws>  wrote:
>
>> On 06/20/2012 12:48 PM, Luiz Capitulino wrote:
>>> Yet another thread fork.
>>>
>>> After talking with Daniel and Markus about QMP errors (which is not just about
>>> QMP, as this affects QEMU as whole), I've put together the proposal below.
>>>
>>> I'll discuss three points. First, the error format and classes. Second, the
>>> internal API and third compatibility.
>>>
>>> Don't be afraid about the length of this email, only the first section is long
>>> but it mostly contains error classes listings.
>>>
>>> 1. Error format and classes
>>>
>>> We can keep the same error format we have today, which is:
>>>
>>>    { "error": { "class": json-string,
>>>                 "data": json-object,
>>>                 "desc": json-string }, "id": json-value }
>>>
>>>     where 'data', 'desc' and 'id' are optional fields.
>>>
>>> However, we'd change how we use 'desc' and our error classes. 'desc' would
>>> become a string which is filled by a printf-like function (see section 2) and
>>> we'd replace all error classes we have today by the following ones:
>>>
>>>     o ParameterError: any error which involves a bad parameter. Replaces
>>>       InvalidParameter, InvalidParameterCombination, InvalidParameterType,
>>>       InvalidParameterValue, MissingParameter
>>>
>>>     o SystemError: syscall or library errors. Replaces BufferOverrun,
>>>       IOError, OpenFileFailed, PermissionDenied, TooManyFiles,
>>>       SockConnectInprogress, SockConnectFailed, SockListenFailed,
>>>       SockBindFailed, SockCreateFailed.
>>>
>>>       This error can include an optional 'os-error' field in the 'data'
>>>       member (see section 2).
>>>
>>>     o QEMUError: errors that are internal to QEMU, like AmbiguousPath,
>>>       BadBusForDevice, BusNoHotplug, BusNotFound, CommandDisabled,
>>>       CommandNotFound, DuplicateId, FeatureDisabled, JSONParseError,
>>>       JSONParsing, KVMMissingCap, MigrationActive, MigrationNotSupported,
>>>       MigrationExpected, NoBusForDevice, NotSupported, PropertyValueBad,
>>>       PropertyValueInUse, PropertyValueNotFound, PropertyValueNotPowerOf2,
>>>       PropertyValueOutOfRange, ResetRequired, SetPasswdFailed, Unsupported,
>>>       VirtFSFeatureBlocksMigration, VNCServerFailed
>>>
>>>     o UndefinedError: the same it's today, undefined :)
>>>
>>> Now, there are two important points to be observed:
>>>
>>>    - We check for DeviceEncrypted in hmp.c and fetch its parameters. This
>>>      probably indicates that we might want to create specialized classes
>>>      when necessary
>>>
>>>    - I don't know where to put all the DeviceFoo errors, but they probably fit
>>>      in QEMUError or a new class like DeviceError
>>>
>>> 2. Internal API
>>>
>>> This is very straightforward:
>>>
>>>    o error_set_deprecated();
>>>
>>>      Current error_set(), case we keep it for compatibility (see section 3).
>>>
>>>    o error_set(ErrorClass err_class, const char *fmt, ...);
>>>
>>>      Main function, sets the error classes and allow for a free human-readable
>>>      error message.
>>>
>>>    o error_set_sys_fmt(int err_no, const char *fmt, ...);
>>>    o error_set_sys(int err_no);
>>>
>>>      Helpers for setting SystemError errors. error_set_sys() gets the 'desc'
>>>      string from strerror().
>>
>> Um, why not just do:
>>
>> #define GENERIC_ERROR "{'class': 'GenericError', 'data': { 'domain': %s, 'msg': %s}"
>>
>> And then just use:
>>
>> error_set(errp, GENERIC_ERROR, SOME_DOMAIN, "This operation failed!");
>>
>> If you want to make a little wrapper that does:
>>
>> static void error_set_generic(Error **errp, const char *domain, const char *msg,
>> ...);
>>
>> That's fine too.
>
> Would 'domain' be one of the classes I suggested above?

No.  We shouldn't attempt to design new error classes.  Just let it evolve 
naturally.

> I'm not sure this is better because it suggests that all classes we have today
> are still valid.

Yes, they are still valid.  We cannot and should not change any of the error 
behavior we have today.

> My main goal is to simplify,

My main goal is to get rid of all the printf() droppings we have scattered 
through the code base.  There is absolutely not benefit in touching the current 
users of Error.  They work fine.  We need to focus our attention on cleaning up 
the crap that we have left.

> Also, maybe it's just how I'm interpreting it, but GenericError reminds of
> UndefinedError in the sense that we'd prefer commands to use more specific
> errors instead.

Using strings mean that errors are basically useless from a programmatic 
perspective.  So yeah, it's just like using UndefinedError.  Except we may have 
a few additional kinds of "UndefinedErrors" by having domains.

> I'm fine with keeping the current code, but I think this proposal overly
> simplifies something that we're already overdoing.

We have something that works--why should we change it in any way?  There's no 
maintenance burden here.

We need to focus on the parts of the code that don't work--all of the random 
users of printf/error_report.

Regards,

Anthony Liguori
Luiz Capitulino June 20, 2012, 8:13 p.m. UTC | #31
On Wed, 20 Jun 2012 14:47:14 -0500
Anthony Liguori <anthony@codemonkey.ws> wrote:

> > I'm not sure this is better because it suggests that all classes we have today
> > are still valid.
> 
> Yes, they are still valid.  We cannot and should not change any of the error 
> behavior we have today.

We can keep them and do the new way only for new commands.

> > My main goal is to simplify,
> 
> My main goal is to get rid of all the printf() droppings we have scattered 
> through the code base.  There is absolutely not benefit in touching the current 
> users of Error.  They work fine.  We need to focus our attention on cleaning up 
> the crap that we have left.

We're talking about two different problems.

First, I agree about the issue with printf() and also agree about GenericError
being a solution to fix that. Although it's not clear to me how 'domain' should
be used and maybe we could extend UndefinedError to contain a user string and
use that instead.

The second problem is the 'rich error reporting has failed' one. This proposal
tries to address that. Declaring the current errors as deprecated doesn't
require us changing current users.

> > Also, maybe it's just how I'm interpreting it, but GenericError reminds of
> > UndefinedError in the sense that we'd prefer commands to use more specific
> > errors instead.
> 
> Using strings mean that errors are basically useless from a programmatic 
> perspective.  So yeah, it's just like using UndefinedError.  Except we may have 
> a few additional kinds of "UndefinedErrors" by having domains.
> 
> > I'm fine with keeping the current code, but I think this proposal overly
> > simplifies something that we're already overdoing.
> 
> We have something that works--why should we change it in any way?  There's no 
> maintenance burden here.

Because it got too complicated. We have 70+ error classes and a lot to come
with keep the current way.  It's already very difficult to choose the correct
class for an error and I don't doubt clients have a similar trouble in their end.
Besides, several errors are missing information to be really useful, like the
errno discussion.

I agree the burden is very little to maintain the current errors and I'm fine
with it, but I think we should move away from having hundreds of not so useful
classes.
Daniel P. Berrangé June 21, 2012, 12:42 p.m. UTC | #32
On Wed, Jun 20, 2012 at 02:48:38PM -0300, Luiz Capitulino wrote:
> Yet another thread fork.
> 
> After talking with Daniel and Markus about QMP errors (which is not just about
> QMP, as this affects QEMU as whole), I've put together the proposal below.
> 
> I'll discuss three points. First, the error format and classes. Second, the
> internal API and third compatibility.
> 
> Don't be afraid about the length of this email, only the first section is long
> but it mostly contains error classes listings.
> 
> 1. Error format and classes
> 
> We can keep the same error format we have today, which is:
> 
>  { "error": { "class": json-string,
>               "data": json-object,
>               "desc": json-string }, "id": json-value }
> 
>   where 'data', 'desc' and 'id' are optional fields.

Yep, that is good.

> However, we'd change how we use 'desc' and our error classes. 'desc' would
> become a string which is filled by a printf-like function (see section 2) and

Good, using a printf-like string for desc is the big change
I wanted to see in QMP errors.

> we'd replace all error classes we have today by the following ones:

Nooo, that's going a bit to far in simplification....

>   o ParameterError: any error which involves a bad parameter. Replaces
>     InvalidParameter, InvalidParameterCombination, InvalidParameterType,
>     InvalidParameterValue, MissingParameter
> 
>   o SystemError: syscall or library errors. Replaces BufferOverrun,
>     IOError, OpenFileFailed, PermissionDenied, TooManyFiles,
>     SockConnectInprogress, SockConnectFailed, SockListenFailed,
>     SockBindFailed, SockCreateFailed.
> 
>     This error can include an optional 'os-error' field in the 'data'
>     member (see section 2).
> 
>   o QEMUError: errors that are internal to QEMU, like AmbiguousPath,
>     BadBusForDevice, BusNoHotplug, BusNotFound, CommandDisabled,
>     CommandNotFound, DuplicateId, FeatureDisabled, JSONParseError,
>     JSONParsing, KVMMissingCap, MigrationActive, MigrationNotSupported,
>     MigrationExpected, NoBusForDevice, NotSupported, PropertyValueBad,
>     PropertyValueInUse, PropertyValueNotFound, PropertyValueNotPowerOf2,
>     PropertyValueOutOfRange, ResetRequired, SetPasswdFailed, Unsupported,
>     VirtFSFeatureBlocksMigration, VNCServerFailed
> 
>   o UndefinedError: the same it's today, undefined :)

There is a balance to be struck here - previously we were tending
to invent a new error class for every conceivable error condition.
This proposal meanwhile is swinging too far to the other extreme
having only 4/5 classes. There is a balance to be had here.

It is perfectly reasonable, and indeed useful, to have distinct
errors like CommandNotFound, CommandDisabled, and so on.  What
we shouldn't do, however, is do things like invent a new error
for every possible errno value. The examples of  PropertyValueNotFound,
PropertyValueNotPowerOf2, PropertyValueOutOfRange are cases where
we invented too many codes, and in the new world we would do
something like

   PropertyValueInvalid
     msg = "Value must be a power of 2"
     msg = "Value must be in range 1-30"
     msg = "Value not found"

We have to just use prudent judgement to decide whether to use
an existing error, or create a new one.

In libvirt we have always reserved the right to change the error
code reported for particular scenarios. So, for example, alot of
our errors started out as "InternalError" (equiv to UndefinedError)
but over time we have changed some to more specialized values
eg "InvalidOperation", "ConfigNotSupported" and so on.

We should probably explicitly note that any use of "UndefinedError"
is liable to be changed in a future QEMU release.

> Now, there are two important points to be observed:
> 
>  - We check for DeviceEncrypted in hmp.c and fetch its parameters. This
>    probably indicates that we might want to create specialized classes
>    when necessary
> 
>  - I don't know where to put all the DeviceFoo errors, but they probably fit
>    in QEMUError or a new class like DeviceError
> 3. Compatibility
> 
> We have two options here:
> 
>  1. Maintain the current errors and implement the new way only for new
>     commands (includes commands merged for 1.2).
> 
>       Pros: maintains compatibility and won't require any code churn
>       Cons: we'll have two different errors schemas in use at the same
>             time and will be carrying garbage forward
> 
>  2. Do a full conversion to the new way.
> 
>       Pros: we drop bad code and avoid pollution (good for Rio+20)
>       Cons: possibly breaks compatibility and will require a lot of code
>             churn up front

Just maintain existing usage, but apply appropriate judgement for new
conversions.

Regards,
Daniel
Luiz Capitulino June 25, 2012, 7:56 p.m. UTC | #33
On Thu, 21 Jun 2012 13:42:19 +0100
"Daniel P. Berrange" <berrange@redhat.com> wrote:

[...]

> > However, we'd change how we use 'desc' and our error classes. 'desc' would
> > become a string which is filled by a printf-like function (see section 2) and
> 
> Good, using a printf-like string for desc is the big change
> I wanted to see in QMP errors.
> 
> > we'd replace all error classes we have today by the following ones:
> 
> Nooo, that's going a bit to far in simplification....

[...]

> There is a balance to be struck here - previously we were tending
> to invent a new error class for every conceivable error condition.
> This proposal meanwhile is swinging too far to the other extreme
> having only 4/5 classes. There is a balance to be had here.
> 
> It is perfectly reasonable, and indeed useful, to have distinct
> errors like CommandNotFound, CommandDisabled, and so on.  What
> we shouldn't do, however, is do things like invent a new error
> for every possible errno value. The examples of  PropertyValueNotFound,
> PropertyValueNotPowerOf2, PropertyValueOutOfRange are cases where
> we invented too many codes, and in the new world we would do
> something like
> 
>    PropertyValueInvalid
>      msg = "Value must be a power of 2"
>      msg = "Value must be in range 1-30"
>      msg = "Value not found"
> 
> We have to just use prudent judgement to decide whether to use
> an existing error, or create a new one.

This seems a good middle ground too me, but let's elaborate it further.

I think what you're proposing boils down to three points:

 1. Allow printf-like strings for desc

 2. Avoid creating a new error class for each errno

 3. Don't create similar error classes

Item 3 seems quite obvious, but we've failed doing that and the main reason
for that is probably because desc is too strict today, so people end up
creating new errors just to have a different desc.

This means that having a printf-like string will help. I propose having
the following function for this:

 error_set_fmt(Error **errp, ErrorClass err_class, const char *fmt, ...);

Or, we could rename the current error_set() function to error_set_deprecated()
and use the above function only (renamed to error_set()).

Also, I believe this fits well with some internal improvements I'm planning
for some time now, like having qapi-errors.json (I already have a PoC for this,
but haven't posted it yet).

Lastly, regarding item 2, I think we already have consensus that returning:

 { "error": "WriteFailed", "data": { "os-error": "eio" },
   "desc": "input/output error while writing to '/tmp/forty-two'" }

Is better than returning:

 { "error": "IOError" }
Markus Armbruster June 26, 2012, 7:54 a.m. UTC | #34
Luiz Capitulino <lcapitulino@redhat.com> writes:

> On Thu, 21 Jun 2012 13:42:19 +0100
> "Daniel P. Berrange" <berrange@redhat.com> wrote:
>
> [...]
>
>> > However, we'd change how we use 'desc' and our error classes. 'desc' would
>> > become a string which is filled by a printf-like function (see section 2) and
>> 
>> Good, using a printf-like string for desc is the big change
>> I wanted to see in QMP errors.
>> 
>> > we'd replace all error classes we have today by the following ones:
>> 
>> Nooo, that's going a bit to far in simplification....
>
> [...]
>
>> There is a balance to be struck here - previously we were tending
>> to invent a new error class for every conceivable error condition.
>> This proposal meanwhile is swinging too far to the other extreme
>> having only 4/5 classes. There is a balance to be had here.
>> 
>> It is perfectly reasonable, and indeed useful, to have distinct
>> errors like CommandNotFound, CommandDisabled, and so on.  What

"Useful" means "users put it to use".  Since we have users, we can work
with *data* on users' needs rather than hypothesises.

>> we shouldn't do, however, is do things like invent a new error
>> for every possible errno value. The examples of  PropertyValueNotFound,
>> PropertyValueNotPowerOf2, PropertyValueOutOfRange are cases where
>> we invented too many codes, and in the new world we would do
>> something like
>> 
>>    PropertyValueInvalid
>>      msg = "Value must be a power of 2"
>>      msg = "Value must be in range 1-30"
>>      msg = "Value not found"
>> 
>> We have to just use prudent judgement to decide whether to use
>> an existing error, or create a new one.

If a user wants to handle a failure mode (or a set of failure modes)
differently than others, then it needs its own error.

To get sane errors rather than the byzantine mess we have now, co-evolve
them with users.

> This seems a good middle ground too me, but let's elaborate it further.
>
> I think what you're proposing boils down to three points:
>
>  1. Allow printf-like strings for desc

I want this badly, because it gives me a fighting chance at emitting
descriptions fit for human consumption.

>  2. Avoid creating a new error class for each errno

Silliness avoidance.

>  3. Don't create similar error classes
>
> Item 3 seems quite obvious, but we've failed doing that and the main reason
> for that is probably because desc is too strict today, so people end up
> creating new errors just to have a different desc.

Yes, that's one reason.  The other reason is following bad examples and
try to emit the most "complete" error object conceivable, whether users
need it or not.  Generally results in completely baroque error objects.

> This means that having a printf-like string will help. I propose having
> the following function for this:
>
>  error_set_fmt(Error **errp, ErrorClass err_class, const char *fmt, ...);
>
> Or, we could rename the current error_set() function to error_set_deprecated()
> and use the above function only (renamed to error_set()).

I'm very much in favour of giving the preferred error reporting function
the most obvious and concise name: error_set().  That means renaming the
squatter.

> Also, I believe this fits well with some internal improvements I'm planning
> for some time now, like having qapi-errors.json (I already have a PoC for this,
> but haven't posted it yet).
>
> Lastly, regarding item 2, I think we already have consensus that returning:
>
>  { "error": "WriteFailed", "data": { "os-error": "eio" },
>    "desc": "input/output error while writing to '/tmp/forty-two'" }
>
> Is better than returning:
>
>  { "error": "IOError" }

It's better only if users actually care for the difference.

In your example: after EIO, you're generally hosed.  What differences in
handling read vs. write error do you envisage?

[Snipped part restored:]
>> In libvirt we have always reserved the right to change the error
>> code reported for particular scenarios. So, for example, alot of
>> our errors started out as "InternalError" (equiv to UndefinedError)
>> but over time we have changed some to more specialized values
>> eg "InvalidOperation", "ConfigNotSupported" and so on.

Do you reserve the right to make changes other than from InternalError
to a more specific one?

>> We should probably explicitly note that any use of "UndefinedError"
>> is liable to be changed in a future QEMU release.

Makes sense to me.
Markus Armbruster June 26, 2012, 8:01 a.m. UTC | #35
Anthony Liguori <anthony@codemonkey.ws> writes:

> Um, why not just do:
>
> #define GENERIC_ERROR "{'class': 'GenericError', 'data': { 'domain': %s, 'msg': %s}"
>
> And then just use:
>
> error_set(errp, GENERIC_ERROR, SOME_DOMAIN, "This operation failed!");

Please explain the intended use of domain, and why it is necessary,
i.e. why class alone won't do.

[...]
Daniel P. Berrangé June 26, 2012, 9:35 a.m. UTC | #36
On Tue, Jun 26, 2012 at 09:54:21AM +0200, Markus Armbruster wrote:
> Luiz Capitulino <lcapitulino@redhat.com> writes:
> 
> > On Thu, 21 Jun 2012 13:42:19 +0100
> > "Daniel P. Berrange" <berrange@redhat.com> wrote:
> >
> > [...]
> >
> >> > However, we'd change how we use 'desc' and our error classes. 'desc' would
> >> > become a string which is filled by a printf-like function (see section 2) and
> >> 
> >> Good, using a printf-like string for desc is the big change
> >> I wanted to see in QMP errors.
> >> 
> >> > we'd replace all error classes we have today by the following ones:
> >> 
> >> Nooo, that's going a bit to far in simplification....
> >
> > [...]
> >
> >> There is a balance to be struck here - previously we were tending
> >> to invent a new error class for every conceivable error condition.
> >> This proposal meanwhile is swinging too far to the other extreme
> >> having only 4/5 classes. There is a balance to be had here.
> >> 
> >> It is perfectly reasonable, and indeed useful, to have distinct
> >> errors like CommandNotFound, CommandDisabled, and so on.  What
> 
> "Useful" means "users put it to use".  Since we have users, we can work
> with *data* on users' needs rather than hypothesises.
> 
> >> we shouldn't do, however, is do things like invent a new error
> >> for every possible errno value. The examples of  PropertyValueNotFound,
> >> PropertyValueNotPowerOf2, PropertyValueOutOfRange are cases where
> >> we invented too many codes, and in the new world we would do
> >> something like
> >> 
> >>    PropertyValueInvalid
> >>      msg = "Value must be a power of 2"
> >>      msg = "Value must be in range 1-30"
> >>      msg = "Value not found"
> >> 
> >> We have to just use prudent judgement to decide whether to use
> >> an existing error, or create a new one.
> 
> If a user wants to handle a failure mode (or a set of failure modes)
> differently than others, then it needs its own error.
> 
> To get sane errors rather than the byzantine mess we have now, co-evolve
> them with users.
> 
> > This seems a good middle ground too me, but let's elaborate it further.
> >
> > I think what you're proposing boils down to three points:
> >
> >  1. Allow printf-like strings for desc
> 
> I want this badly, because it gives me a fighting chance at emitting
> descriptions fit for human consumption.
> 
> >  2. Avoid creating a new error class for each errno
> 
> Silliness avoidance.
> 
> >  3. Don't create similar error classes
> >
> > Item 3 seems quite obvious, but we've failed doing that and the main reason
> > for that is probably because desc is too strict today, so people end up
> > creating new errors just to have a different desc.
> 
> Yes, that's one reason.  The other reason is following bad examples and
> try to emit the most "complete" error object conceivable, whether users
> need it or not.  Generally results in completely baroque error objects.
> 
> > This means that having a printf-like string will help. I propose having
> > the following function for this:
> >
> >  error_set_fmt(Error **errp, ErrorClass err_class, const char *fmt, ...);
> >
> > Or, we could rename the current error_set() function to error_set_deprecated()
> > and use the above function only (renamed to error_set()).
> 
> I'm very much in favour of giving the preferred error reporting function
> the most obvious and concise name: error_set().  That means renaming the
> squatter.
> 
> > Also, I believe this fits well with some internal improvements I'm planning
> > for some time now, like having qapi-errors.json (I already have a PoC for this,
> > but haven't posted it yet).
> >
> > Lastly, regarding item 2, I think we already have consensus that returning:
> >
> >  { "error": "WriteFailed", "data": { "os-error": "eio" },
> >    "desc": "input/output error while writing to '/tmp/forty-two'" }
> >
> > Is better than returning:
> >
> >  { "error": "IOError" }
> 
> It's better only if users actually care for the difference.
> 
> In your example: after EIO, you're generally hosed.  What differences in
> handling read vs. write error do you envisage?
> 
> [Snipped part restored:]
> >> In libvirt we have always reserved the right to change the error
> >> code reported for particular scenarios. So, for example, alot of
> >> our errors started out as "InternalError" (equiv to UndefinedError)
> >> but over time we have changed some to more specialized values
> >> eg "InvalidOperation", "ConfigNotSupported" and so on.
> 
> Do you reserve the right to make changes other than from InternalError
> to a more specific one?

If I'm perfectly honest, then yes. We have tried not to gratuitouslyy
change errors being reported, but they have definitely evolved over
time, so more distinct error scenarios are reported, where previously
many things would have been lumped under one code.

> >> We should probably explicitly note that any use of "UndefinedError"
> >> is liable to be changed in a future QEMU release.
> 
> Makes sense to me.

Daniel
Markus Armbruster June 26, 2012, 11:21 a.m. UTC | #37
"Daniel P. Berrange" <berrange@redhat.com> writes:

> On Tue, Jun 26, 2012 at 09:54:21AM +0200, Markus Armbruster wrote:
>> Luiz Capitulino <lcapitulino@redhat.com> writes:
>> 
>> > On Thu, 21 Jun 2012 13:42:19 +0100
>> > "Daniel P. Berrange" <berrange@redhat.com> wrote:
[...]
>> >> In libvirt we have always reserved the right to change the error
>> >> code reported for particular scenarios. So, for example, alot of
>> >> our errors started out as "InternalError" (equiv to UndefinedError)
>> >> but over time we have changed some to more specialized values
>> >> eg "InvalidOperation", "ConfigNotSupported" and so on.
>> 
>> Do you reserve the right to make changes other than from InternalError
>> to a more specific one?
>
> If I'm perfectly honest, then yes. We have tried not to gratuitouslyy
> change errors being reported, but they have definitely evolved over
> time, so more distinct error scenarios are reported, where previously
> many things would have been lumped under one code.

Pragmatic co-evolution of errors with their actual use.  Makes sense to
me, and I'd recommend we do the same in QEMU.

[...]
Anthony Liguori June 26, 2012, 4:25 p.m. UTC | #38
On 06/26/2012 06:21 AM, Markus Armbruster wrote:
> "Daniel P. Berrange"<berrange@redhat.com>  writes:
>
>> On Tue, Jun 26, 2012 at 09:54:21AM +0200, Markus Armbruster wrote:
>>> Luiz Capitulino<lcapitulino@redhat.com>  writes:
>>>
>>>> On Thu, 21 Jun 2012 13:42:19 +0100
>>>> "Daniel P. Berrange"<berrange@redhat.com>  wrote:
> [...]
>>>>> In libvirt we have always reserved the right to change the error
>>>>> code reported for particular scenarios. So, for example, alot of
>>>>> our errors started out as "InternalError" (equiv to UndefinedError)
>>>>> but over time we have changed some to more specialized values
>>>>> eg "InvalidOperation", "ConfigNotSupported" and so on.
>>>
>>> Do you reserve the right to make changes other than from InternalError
>>> to a more specific one?
>>
>> If I'm perfectly honest, then yes. We have tried not to gratuitouslyy
>> change errors being reported, but they have definitely evolved over
>> time, so more distinct error scenarios are reported, where previously
>> many things would have been lumped under one code.
>
> Pragmatic co-evolution of errors with their actual use.  Makes sense to
> me, and I'd recommend we do the same in QEMU.

Just to be clear, what we're discussing is:

> diff --git a/qerror.c b/qerror.c
> index 92c4eff..ebe2eb0 100644
> --- a/qerror.c
> +++ b/qerror.c
> @@ -328,6 +328,10 @@ static const QErrorStringTable qerror_table[] = {
>          .error_fmt = QERR_SOCKET_CREATE_FAILED,
>          .desc      = "Failed to create socket",
>      },
> +    {
> +        .error_fmt = QERR_UNDEFINED_ERROR,
> +        .desc      = "Undefined Error: %(message)",
> +    },
>      {}
>  };
>
> diff --git a/qerror.h b/qerror.h
> index b4c8758..da8f086 100644
> --- a/qerror.h
> +++ b/qerror.h
> @@ -266,4 +266,7 @@ QError *qobject_to_qerror(const QObject *obj);
>  #define QERR_SOCKET_CREATE_FAILED \
>      "{ 'class': 'SockCreateFailed', 'data': {} }"
>
> +#define QERR_UNDEFINED_ERROR \
> +    "{ 'class': 'UndefinedError', 'data': { 'message': %s } }"
> +
>  #endif /* QERROR_H */

Nothing more, nothing less.  No changes to wire protocol, no changes to existing 
error uses, etc.

Regards,

Anthony Liguori

>
> [...]
>
>
Markus Armbruster June 28, 2012, 7:50 a.m. UTC | #39
Anthony Liguori <anthony@codemonkey.ws> writes:

> On 06/26/2012 06:21 AM, Markus Armbruster wrote:
>> "Daniel P. Berrange"<berrange@redhat.com>  writes:
>>
>>> On Tue, Jun 26, 2012 at 09:54:21AM +0200, Markus Armbruster wrote:
>>>> Luiz Capitulino<lcapitulino@redhat.com>  writes:
>>>>
>>>>> On Thu, 21 Jun 2012 13:42:19 +0100
>>>>> "Daniel P. Berrange"<berrange@redhat.com>  wrote:
>> [...]
>>>>>> In libvirt we have always reserved the right to change the error
>>>>>> code reported for particular scenarios. So, for example, alot of
>>>>>> our errors started out as "InternalError" (equiv to UndefinedError)
>>>>>> but over time we have changed some to more specialized values
>>>>>> eg "InvalidOperation", "ConfigNotSupported" and so on.
>>>>
>>>> Do you reserve the right to make changes other than from InternalError
>>>> to a more specific one?
>>>
>>> If I'm perfectly honest, then yes. We have tried not to gratuitouslyy
>>> change errors being reported, but they have definitely evolved over
>>> time, so more distinct error scenarios are reported, where previously
>>> many things would have been lumped under one code.
>>
>> Pragmatic co-evolution of errors with their actual use.  Makes sense to
>> me, and I'd recommend we do the same in QEMU.
>
> Just to be clear, what we're discussing is:
>
>> diff --git a/qerror.c b/qerror.c
>> index 92c4eff..ebe2eb0 100644
>> --- a/qerror.c
>> +++ b/qerror.c
>> @@ -328,6 +328,10 @@ static const QErrorStringTable qerror_table[] = {
>>          .error_fmt = QERR_SOCKET_CREATE_FAILED,
>>          .desc      = "Failed to create socket",
>>      },
>> +    {
>> +        .error_fmt = QERR_UNDEFINED_ERROR,
>> +        .desc      = "Undefined Error: %(message)",
>> +    },
>>      {}
>>  };
>>
>> diff --git a/qerror.h b/qerror.h
>> index b4c8758..da8f086 100644
>> --- a/qerror.h
>> +++ b/qerror.h
>> @@ -266,4 +266,7 @@ QError *qobject_to_qerror(const QObject *obj);
>>  #define QERR_SOCKET_CREATE_FAILED \
>>      "{ 'class': 'SockCreateFailed', 'data': {} }"
>>
>> +#define QERR_UNDEFINED_ERROR \
>> +    "{ 'class': 'UndefinedError', 'data': { 'message': %s } }"
>> +
>>  #endif /* QERROR_H */
>
> Nothing more, nothing less.  No changes to wire protocol, no changes
> to existing error uses, etc.

What we're discussing is whatever people bring up, actually.

So far, there haven't been any calls for a change of the wire protocol.

There has been an agreement of sorts that the current "rich errors" have
been "a failure" (your words).  Doesn't mean we all agree on the nature
of the failure.

Several people pointed out that:

* our "rich" errors are so cumbersome to emit that adoption has been
  slow,

* our "rich" errors' human-readable descriptions lead to piss-poor
  human-readable error messages[*] (pardon my french), and

* the "richness" has no known uses after 2+ years.

Perhaps your idea of the rich errors failure stops at the first item
(slow adoption).  Perhaps you even entertain hopes of solving the
adoption problem by first asking for adoption of "simplified rich
errors", and once that's done, push again for your original design.

I disagree with both notions.

Slow adoption is a *symptom*, not a cause: it's been slow, because the
costs and drawbacks outweigh the benefits.  In other words, it's been
slow, because people have had the good sense not to do something that's
plainly not worth it.

I agree converting existing uses of the failed rich errors API to
whatever new API we come up with before anything else isn't useful.

But I challenge the idea that we must not change existing uses of "rich"
error reporting ever.  When such a use produces bad human-readable
messages, that's a bug, and the most expedient fix could well be a
switch to the new, saner API, even when that drops a bit of the richness
(which after all has no known uses).

For me, known suffering of human users weighs more heavily than
hypothetical suffering of hypothetical tools.


[*] This is why I refuse to work on error conversions.  Turning decent
error messages into garbage is just too frustrating for me.  Yes, my
life would be easier if I didn't care for users.
Paolo Bonzini July 2, 2012, 8:03 a.m. UTC | #40
Il 20/06/2012 19:48, Luiz Capitulino ha scritto:
> However, we'd change how we use 'desc' and our error classes. 'desc' would
> become a string which is filled by a printf-like function (see section 2) and
> we'd replace all error classes we have today by the following ones:
> 
>   o ParameterError: any error which involves a bad parameter. Replaces
>     InvalidParameter, InvalidParameterCombination, InvalidParameterType,
>     InvalidParameterValue, MissingParameter

PropertyValueNotFound, PropertyValueNotPowerOf2,
PropertyValueOutOfRange, AmbiguousPath, BadBusForDevice, BusNotFound,
CommandNotFound, DuplicateId

> 
>   o SystemError: syscall or library errors. Replaces BufferOverrun,
>     IOError, OpenFileFailed, PermissionDenied, TooManyFiles,
>     SockConnectInprogress, SockConnectFailed, SockListenFailed,
>     SockBindFailed, SockCreateFailed.
> 
>     This error can include an optional 'os-error' field in the 'data'
>     member (see section 2).
> 
>   o QEMUError: errors that are internal to QEMU, like AmbiguousPath,
>     BadBusForDevice, BusNoHotplug, BusNotFound, CommandDisabled,
>     CommandNotFound, DuplicateId, FeatureDisabled, JSONParseError,
>     JSONParsing, KVMMissingCap, MigrationActive, MigrationNotSupported,
>     MigrationExpected, NoBusForDevice, NotSupported, PropertyValueBad,
>     PropertyValueInUse, PropertyValueNotFound, PropertyValueNotPowerOf2,
>     PropertyValueOutOfRange, ResetRequired, SetPasswdFailed, Unsupported,
>     VirtFSFeatureBlocksMigration, VNCServerFailed

I agree with Daniel that this is going a bit too far, in particular for
the last example.  Many of these are actually ParameterErrors, and we
can lump all of these in a small number of meaningful classes.  Keeping
in mind Markus's "check failure-check transient-report", my proposal is:

* ParameterError is "the" non-transient caused by user error or a bug in
management, hopefully the former: the ones you gave above, but also
PropertyValueNotFound, PropertyValueNotPowerOf2,
PropertyValueOutOfRange, AmbiguousPath, BadBusForDevice, BusNotFound,
CommandNotFound, DuplicateId.  Perhaps PropertyValueInUse too.

* FeatureNotSupportedError is "the" non-transient error caused by user
or admin configuration (FeatureDisabled, Unsupported, CommandDisabled,
MigrationNotSupported, NotSupported, VirtFSFeatureBlocksMigration).

* SystemError could be transient or not, but is kept separate because it
is pretty much the only case in which a rich error is useful (the
richness will for now be limited to errno, perhaps in the future a file
name or socket address)

* InvalidStateError is generally caused by the interaction with other
commands, could be fixed by sending some commands and retrying:
DeviceEncrypted, MigrationActive, MigrationExpected, NotSupported,
PropertyValueInUse, ResetRequired.

* JSONParseError should be a separate error, also because it may be very
difficult to recover.

There are some cases where I'm a bit doubtful about how to proceed, but
for these we can keep the current rich errors.  One example is
InvalidPassword.  In some cases, we could use events to remove the need
for rich errors.  For example, DeviceEncrypted could be changed to a
KEY_REQUESTED event that is sent early for all devices, rather than at
"cont" time.

Paolo
Anthony Liguori July 2, 2012, 12:47 p.m. UTC | #41
On 06/28/2012 02:50 AM, Markus Armbruster wrote:
> Anthony Liguori<anthony@codemonkey.ws>  writes:
>
>> On 06/26/2012 06:21 AM, Markus Armbruster wrote:
>>> "Daniel P. Berrange"<berrange@redhat.com>   writes:
>>>
>>>> On Tue, Jun 26, 2012 at 09:54:21AM +0200, Markus Armbruster wrote:
>>>>> Luiz Capitulino<lcapitulino@redhat.com>   writes:
>>>>>
>>>>>> On Thu, 21 Jun 2012 13:42:19 +0100
>>>>>> "Daniel P. Berrange"<berrange@redhat.com>   wrote:
>>> [...]
>>>>>>> In libvirt we have always reserved the right to change the error
>>>>>>> code reported for particular scenarios. So, for example, alot of
>>>>>>> our errors started out as "InternalError" (equiv to UndefinedError)
>>>>>>> but over time we have changed some to more specialized values
>>>>>>> eg "InvalidOperation", "ConfigNotSupported" and so on.
>>>>>
>>>>> Do you reserve the right to make changes other than from InternalError
>>>>> to a more specific one?
>>>>
>>>> If I'm perfectly honest, then yes. We have tried not to gratuitouslyy
>>>> change errors being reported, but they have definitely evolved over
>>>> time, so more distinct error scenarios are reported, where previously
>>>> many things would have been lumped under one code.
>>>
>>> Pragmatic co-evolution of errors with their actual use.  Makes sense to
>>> me, and I'd recommend we do the same in QEMU.
>>
>> Just to be clear, what we're discussing is:
>>
>>> diff --git a/qerror.c b/qerror.c
>>> index 92c4eff..ebe2eb0 100644
>>> --- a/qerror.c
>>> +++ b/qerror.c
>>> @@ -328,6 +328,10 @@ static const QErrorStringTable qerror_table[] = {
>>>           .error_fmt = QERR_SOCKET_CREATE_FAILED,
>>>           .desc      = "Failed to create socket",
>>>       },
>>> +    {
>>> +        .error_fmt = QERR_UNDEFINED_ERROR,
>>> +        .desc      = "Undefined Error: %(message)",
>>> +    },
>>>       {}
>>>   };
>>>
>>> diff --git a/qerror.h b/qerror.h
>>> index b4c8758..da8f086 100644
>>> --- a/qerror.h
>>> +++ b/qerror.h
>>> @@ -266,4 +266,7 @@ QError *qobject_to_qerror(const QObject *obj);
>>>   #define QERR_SOCKET_CREATE_FAILED \
>>>       "{ 'class': 'SockCreateFailed', 'data': {} }"
>>>
>>> +#define QERR_UNDEFINED_ERROR \
>>> +    "{ 'class': 'UndefinedError', 'data': { 'message': %s } }"
>>> +
>>>   #endif /* QERROR_H */
>>
>> Nothing more, nothing less.  No changes to wire protocol, no changes
>> to existing error uses, etc.
>
> What we're discussing is whatever people bring up, actually.
>
> So far, there haven't been any calls for a change of the wire protocol.

Changing all existing errors and repurposing 'desc' is effectively changing the 
wire protocol.

>
> There has been an agreement of sorts that the current "rich errors" have
> been "a failure" (your words).  Doesn't mean we all agree on the nature
> of the failure.
>
> Several people pointed out that:
>
> * our "rich" errors are so cumbersome to emit that adoption has been
>    slow,
>
> * our "rich" errors' human-readable descriptions lead to piss-poor
>    human-readable error messages[*] (pardon my french), and
>
> * the "richness" has no known uses after 2+ years.

Do you know of *any* QMP user beyond libvirt?  Let's face it, QMP is a protocol 
for libvirt.  What it looks like shouldn't be dictated by anything other than 
what libvirt wants/needs.

If libvirt isn't going to do more than look at an error type, then there's no 
point in providing more than that.

> Perhaps your idea of the rich errors failure stops at the first item
> (slow adoption).  Perhaps you even entertain hopes of solving the
> adoption problem by first asking for adoption of "simplified rich
> errors", and once that's done, push again for your original design.
>
> I disagree with both notions.
>
> Slow adoption is a *symptom*, not a cause: it's been slow, because the
> costs and drawbacks outweigh the benefits.  In other words, it's been
> slow, because people have had the good sense not to do something that's
> plainly not worth it.
>
> I agree converting existing uses of the failed rich errors API to
> whatever new API we come up with before anything else isn't useful.
>
> But I challenge the idea that we must not change existing uses of "rich"
> error reporting ever.

I think you're missing the forest for the trees.

The real problem we need to solve is not the quality of error messages. 
Honestly, we have all of the tools today to improve error messages.

The problem we need to solve is error propagation because without it, we cannot 
have asynchronous commands.  We keep ending up with extremely complex/cumbersome 
management interfaces that we need to support forever because we still have 
out-of-band errors that cannot be associated with a specific QMP command.

So I don't want to see us focus a bunch of effort on rewriting existing error 
users.  Is that a hard and fast rule?  Of course not.  If it makes sense to 
tweak an error here and there, that's fine.

But the problem to solve is killing off error_report.

Regards,

Anthony Liguori

   When such a use produces bad human-readable
> messages, that's a bug, and the most expedient fix could well be a
> switch to the new, saner API, even when that drops a bit of the richness
> (which after all has no known uses).
>
> For me, known suffering of human users weighs more heavily than
> hypothetical suffering of hypothetical tools.
>
>
> [*] This is why I refuse to work on error conversions.  Turning decent
> error messages into garbage is just too frustrating for me.  Yes, my
> life would be easier if I didn't care for users.
Luiz Capitulino July 2, 2012, 1:47 p.m. UTC | #42
On Mon, 02 Jul 2012 07:47:56 -0500
Anthony Liguori <anthony@codemonkey.ws> wrote:

> > So far, there haven't been any calls for a change of the wire protocol.
> 
> Changing all existing errors and repurposing 'desc' is effectively changing the 
> wire protocol.

We're not going to change existing errors and I disagree that making 'desc'
a "free" string is an incompatible change, as we note in the spec that it
shouldn't be parsed.

> > There has been an agreement of sorts that the current "rich errors" have
> > been "a failure" (your words).  Doesn't mean we all agree on the nature
> > of the failure.
> >
> > Several people pointed out that:
> >
> > * our "rich" errors are so cumbersome to emit that adoption has been
> >    slow,
> >
> > * our "rich" errors' human-readable descriptions lead to piss-poor
> >    human-readable error messages[*] (pardon my french), and
> >
> > * the "richness" has no known uses after 2+ years.
> 
> Do you know of *any* QMP user beyond libvirt?  

Apart from casual people scripting around QMP, no I don't. But that doesn't
imply they don't exist.

> Let's face it, QMP is a protocol 
> for libvirt.  What it looks like shouldn't be dictated by anything other than 
> what libvirt wants/needs.
> 
> If libvirt isn't going to do more than look at an error type, then there's no 
> point in providing more than that.

They have asked us to turn 'desc' into a free string and I believe this is
a good change.

> The real problem we need to solve is not the quality of error messages. 
> Honestly, we have all of the tools today to improve error messages.
> 
> The problem we need to solve is error propagation because without it, we cannot 
> have asynchronous commands.  We keep ending up with extremely complex/cumbersome 
> management interfaces that we need to support forever because we still have 
> out-of-band errors that cannot be associated with a specific QMP command.
> 
> So I don't want to see us focus a bunch of effort on rewriting existing error 
> users.  Is that a hard and fast rule?  Of course not.  If it makes sense to 
> tweak an error here and there, that's fine.
> 
> But the problem to solve is killing off error_report.

I agree this is an important problem too.
diff mbox

Patch

diff --git a/qerror.c b/qerror.c
index 6c78759..97670a3 100644
--- a/qerror.c
+++ b/qerror.c
@@ -152,6 +152,14 @@  static const QErrorStringTable qerror_table[] = {
         .desc      = "The feature '%(name)' is not enabled",
     },
     {
+        .error_fmt = QERR_FILE_TOO_BIG,
+        .desc      = "File exceeds maxium file size limit",
+    },
+    {
+        .error_fmt = QERR_INVALID_ACCESS,
+        .desc      = "The access is invalid",
+    },
+    {
         .error_fmt = QERR_INVALID_BLOCK_FORMAT,
         .desc      = "Invalid block format '%(name)'",
     },
@@ -213,10 +221,22 @@  static const QErrorStringTable qerror_table[] = {
         .desc      = "Parameter '%(name)' is missing",
     },
     {
+        .error_fmt = QERR_NAME_TOO_LONG,
+        .desc      = "Name is too long",
+    },
+    {
         .error_fmt = QERR_NO_BUS_FOR_DEVICE,
         .desc      = "No '%(bus)' bus found for device '%(device)'",
     },
     {
+        .error_fmt = QERR_NO_SPACE,
+        .desc      = "No space left in device",
+    },
+    {
+        .error_fmt = QERR_NO_FILE_DIR,
+        .desc      = "No such file or directory",
+    },
+    {
         .error_fmt = QERR_NOT_SUPPORTED,
         .desc      = "Not supported",
     },
@@ -275,6 +295,10 @@  static const QErrorStringTable qerror_table[] = {
         .desc      = "QMP input object member '%(member)' is unexpected",
     },
     {
+        .error_fmt = QERR_READ_ONLY_FS,
+        .desc      = "File System is read-only",
+    },
+    {
         .error_fmt = QERR_RESET_REQUIRED,
         .desc      = "Resetting the Virtual Machine is required",
     },
@@ -283,6 +307,10 @@  static const QErrorStringTable qerror_table[] = {
         .desc      = "Could not set password",
     },
     {
+        .error_fmt = QERR_TOO_MANY_FILES_PROC,
+        .desc      = "Too many open files by process",
+    },
+    {
         .error_fmt = QERR_TOO_MANY_FILES_SYS,
         .desc      = "Too many open files in system",
     },
diff --git a/qerror.h b/qerror.h
index fb7c642..d157e68 100644
--- a/qerror.h
+++ b/qerror.h
@@ -136,6 +136,12 @@  QError *qobject_to_qerror(const QObject *obj);
 #define QERR_FEATURE_DISABLED \
     "{ 'class': 'FeatureDisabled', 'data': { 'name': %s } }"
 
+#define QERR_FILE_TOO_BIG \
+    "{ 'class': 'FileTooBig', 'data': {} }"
+
+#define QERR_INVALID_ACCESS \
+    "{ 'class': 'InvalidAccess', 'data': {} }"
+
 #define QERR_INVALID_BLOCK_FORMAT \
     "{ 'class': 'InvalidBlockFormat', 'data': { 'name': %s } }"
 
@@ -181,9 +187,18 @@  QError *qobject_to_qerror(const QObject *obj);
 #define QERR_MISSING_PARAMETER \
     "{ 'class': 'MissingParameter', 'data': { 'name': %s } }"
 
+#define QERR_NAME_TOO_LONG \
+    "{ 'class': 'NameTooLong', 'data': {} }"
+
 #define QERR_NO_BUS_FOR_DEVICE \
     "{ 'class': 'NoBusForDevice', 'data': { 'device': %s, 'bus': %s } }"
 
+#define QERR_NO_SPACE \
+    "{ 'class': 'NoSpace', 'data': {} }"
+
+#define QERR_NO_FILE_DIR \
+    "{ 'class': 'NoSuchFileOrDirectory', 'data': {} }"
+
 #define QERR_NOT_SUPPORTED \
     "{ 'class': 'NotSupported', 'data': {} }"
 
@@ -227,12 +242,18 @@  QError *qobject_to_qerror(const QObject *obj);
 #define QERR_QMP_EXTRA_MEMBER \
     "{ 'class': 'QMPExtraInputObjectMember', 'data': { 'member': %s } }"
 
+#define QERR_READ_ONLY_FS \
+    "{ 'class': 'ReadOnlyFileSystem', 'data': {} }"
+
 #define QERR_RESET_REQUIRED \
     "{ 'class': 'ResetRequired', 'data': {} }"
 
 #define QERR_SET_PASSWD_FAILED \
     "{ 'class': 'SetPasswdFailed', 'data': {} }"
 
+#define QERR_TOO_MANY_FILES_PROC \
+    "{ 'class': 'TooManyFilesByProcess', 'data': {} }"
+
 #define QERR_TOO_MANY_FILES_SYS \
     "{ 'class': 'TooManyFilesInSystem', 'data': {} }"