Patchwork [6/9] QError: Add qdev not found error

login
register
mail settings
Submitter Luiz Capitulino
Date Oct. 13, 2009, 4:57 p.m.
Message ID <1255453026-18637-7-git-send-email-lcapitulino@redhat.com>
Download mbox | patch
Permalink /patch/35878/
State New
Headers show

Comments

Luiz Capitulino - Oct. 13, 2009, 4:57 p.m.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
---
 qerror.c |   12 ++++++++++++
 qerror.h |    1 +
 2 files changed, 13 insertions(+), 0 deletions(-)
Hollis Blanchard - Oct. 14, 2009, 11:02 p.m.
On Tue, 2009-10-13 at 13:57 -0300, Luiz Capitulino wrote:
> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
> ---
>  qerror.c |   12 ++++++++++++
>  qerror.h |    1 +
>  2 files changed, 13 insertions(+), 0 deletions(-)
> 
> diff --git a/qerror.c b/qerror.c
> index bbea770..88a8208 100644
> --- a/qerror.c
> +++ b/qerror.c
> @@ -24,11 +24,23 @@ static const QType qerror_type = {
>      .destroy = qerror_destroy_obj,
>  };
> 
> +static void qemu_err_qdev_nodev(const QError *qerror)
> +{
> +    QDict *qdict = qobject_to_qdict(qerror->data);
> +    qemu_error("Device \"%s\" not found.  Try -device '?' for a list.\n",
> +               qdict_get_str(qdict, "name"));
> +}
> +
>  static QErrorTable qerror_table[] = {
>      {
>          .code   = QERR_UNKNOWN,
>          .desc   = "unknown error",
>      },
> +    {
> +        .code   = QERR_QDEV_NFOUND,
> +        .desc   = "device not found",
> +        .user_print = qemu_err_qdev_nodev,
> +    },
>  };
> 
>  /**
> diff --git a/qerror.h b/qerror.h
> index ed25ef1..820f25e 100644
> --- a/qerror.h
> +++ b/qerror.h
> @@ -21,6 +21,7 @@
>   */
>  typedef enum QErrorCode {
>      QERR_UNKNOWN,
> +    QERR_QDEV_NFOUND,
>      QERR_MAX,
>  } QErrorCode;

I'm not really seeing the point: what is gained by moving the error text
from the original site into this function-inside-a-structure?
Luiz Capitulino - Oct. 15, 2009, 1:34 p.m.
On Wed, 14 Oct 2009 16:02:10 -0700
Hollis Blanchard <hollisb@us.ibm.com> wrote:

> On Tue, 2009-10-13 at 13:57 -0300, Luiz Capitulino wrote:
> > Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
> > ---
> >  qerror.c |   12 ++++++++++++
> >  qerror.h |    1 +
> >  2 files changed, 13 insertions(+), 0 deletions(-)
> > 
> > diff --git a/qerror.c b/qerror.c
> > index bbea770..88a8208 100644
> > --- a/qerror.c
> > +++ b/qerror.c
> > @@ -24,11 +24,23 @@ static const QType qerror_type = {
> >      .destroy = qerror_destroy_obj,
> >  };
> > 
> > +static void qemu_err_qdev_nodev(const QError *qerror)
> > +{
> > +    QDict *qdict = qobject_to_qdict(qerror->data);
> > +    qemu_error("Device \"%s\" not found.  Try -device '?' for a list.\n",
> > +               qdict_get_str(qdict, "name"));
> > +}
> > +
> >  static QErrorTable qerror_table[] = {
> >      {
> >          .code   = QERR_UNKNOWN,
> >          .desc   = "unknown error",
> >      },
> > +    {
> > +        .code   = QERR_QDEV_NFOUND,
> > +        .desc   = "device not found",
> > +        .user_print = qemu_err_qdev_nodev,
> > +    },
> >  };
> > 
> >  /**
> > diff --git a/qerror.h b/qerror.h
> > index ed25ef1..820f25e 100644
> > --- a/qerror.h
> > +++ b/qerror.h
> > @@ -21,6 +21,7 @@
> >   */
> >  typedef enum QErrorCode {
> >      QERR_UNKNOWN,
> > +    QERR_QDEV_NFOUND,
> >      QERR_MAX,
> >  } QErrorCode;
> 
> I'm not really seeing the point: what is gained by moving the error text
> from the original site into this function-inside-a-structure?

 Compatibility and a way of having pretty printing functions w/o
breaking the machine protocol.
Hollis Blanchard - Oct. 15, 2009, 5:16 p.m.
On Thu, 2009-10-15 at 10:34 -0300, Luiz Capitulino wrote:
> On Wed, 14 Oct 2009 16:02:10 -0700
> Hollis Blanchard <hollisb@us.ibm.com> wrote:
> 
> > On Tue, 2009-10-13 at 13:57 -0300, Luiz Capitulino wrote:
> > > Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
> > > ---
> > >  qerror.c |   12 ++++++++++++
> > >  qerror.h |    1 +
> > >  2 files changed, 13 insertions(+), 0 deletions(-)
> > > 
> > > diff --git a/qerror.c b/qerror.c
> > > index bbea770..88a8208 100644
> > > --- a/qerror.c
> > > +++ b/qerror.c
> > > @@ -24,11 +24,23 @@ static const QType qerror_type = {
> > >      .destroy = qerror_destroy_obj,
> > >  };
> > > 
> > > +static void qemu_err_qdev_nodev(const QError *qerror)
> > > +{
> > > +    QDict *qdict = qobject_to_qdict(qerror->data);
> > > +    qemu_error("Device \"%s\" not found.  Try -device '?' for a list.\n",
> > > +               qdict_get_str(qdict, "name"));
> > > +}
> > > +
> > >  static QErrorTable qerror_table[] = {
> > >      {
> > >          .code   = QERR_UNKNOWN,
> > >          .desc   = "unknown error",
> > >      },
> > > +    {
> > > +        .code   = QERR_QDEV_NFOUND,
> > > +        .desc   = "device not found",
> > > +        .user_print = qemu_err_qdev_nodev,
> > > +    },
> > >  };
> > > 
> > >  /**
> > > diff --git a/qerror.h b/qerror.h
> > > index ed25ef1..820f25e 100644
> > > --- a/qerror.h
> > > +++ b/qerror.h
> > > @@ -21,6 +21,7 @@
> > >   */
> > >  typedef enum QErrorCode {
> > >      QERR_UNKNOWN,
> > > +    QERR_QDEV_NFOUND,
> > >      QERR_MAX,
> > >  } QErrorCode;
> > 
> > I'm not really seeing the point: what is gained by moving the error text
> > from the original site into this function-inside-a-structure?
> 
>  Compatibility and a way of having pretty printing functions w/o
> breaking the machine protocol.

Huh? Compatibility with what? I don't understand this "pretty printing"
comment either.

You could easily have
	qemu_error(code, "device not found");
throughout the code, and transmit that as
	{ "error": { "code": code, "desc": "device not found" } }
Luiz Capitulino - Oct. 15, 2009, 5:52 p.m.
On Thu, 15 Oct 2009 10:16:00 -0700
Hollis Blanchard <hollisb@us.ibm.com> wrote:

> On Thu, 2009-10-15 at 10:34 -0300, Luiz Capitulino wrote:
> > On Wed, 14 Oct 2009 16:02:10 -0700
> > Hollis Blanchard <hollisb@us.ibm.com> wrote:
> > 
> > > On Tue, 2009-10-13 at 13:57 -0300, Luiz Capitulino wrote:
> > > > Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
> > > > ---
> > > >  qerror.c |   12 ++++++++++++
> > > >  qerror.h |    1 +
> > > >  2 files changed, 13 insertions(+), 0 deletions(-)
> > > > 
> > > > diff --git a/qerror.c b/qerror.c
> > > > index bbea770..88a8208 100644
> > > > --- a/qerror.c
> > > > +++ b/qerror.c
> > > > @@ -24,11 +24,23 @@ static const QType qerror_type = {
> > > >      .destroy = qerror_destroy_obj,
> > > >  };
> > > > 
> > > > +static void qemu_err_qdev_nodev(const QError *qerror)
> > > > +{
> > > > +    QDict *qdict = qobject_to_qdict(qerror->data);
> > > > +    qemu_error("Device \"%s\" not found.  Try -device '?' for a list.\n",
> > > > +               qdict_get_str(qdict, "name"));
> > > > +}
> > > > +
> > > >  static QErrorTable qerror_table[] = {
> > > >      {
> > > >          .code   = QERR_UNKNOWN,
> > > >          .desc   = "unknown error",
> > > >      },
> > > > +    {
> > > > +        .code   = QERR_QDEV_NFOUND,
> > > > +        .desc   = "device not found",
> > > > +        .user_print = qemu_err_qdev_nodev,
> > > > +    },
> > > >  };
> > > > 
> > > >  /**
> > > > diff --git a/qerror.h b/qerror.h
> > > > index ed25ef1..820f25e 100644
> > > > --- a/qerror.h
> > > > +++ b/qerror.h
> > > > @@ -21,6 +21,7 @@
> > > >   */
> > > >  typedef enum QErrorCode {
> > > >      QERR_UNKNOWN,
> > > > +    QERR_QDEV_NFOUND,
> > > >      QERR_MAX,
> > > >  } QErrorCode;
> > > 
> > > I'm not really seeing the point: what is gained by moving the error text
> > > from the original site into this function-inside-a-structure?
> > 
> >  Compatibility and a way of having pretty printing functions w/o
> > breaking the machine protocol.
> 
> Huh? Compatibility with what?

 With existing errors. I'm avoiding changing them because existing
applications which parse QEMU output may rely on them.

 On the other hand, I'm not sure if this is a hard requirement.

> I don't understand this "pretty printing" comment either.

 The structured error output can sometimes be too rigid for humans,
this qdev error is an example. When we fail we pass the 'Try -device'
hint, which doesn't make sense for the protocol.

 Also, it's not uncommon to have error strings like this:

monitor_printf(mon, "husb: host usb device %d.%d is already open\n",
                      bus_num, addr);

 Which is what I call 'pretty printing'.
 
> You could easily have
> 	qemu_error(code, "device not found");

 This doesn't solve the problems I mentioned above, besides I don't
see why you need to specify both, the error code and the description,
they describe the same thing.

 Also, they must not change after the protocol gets into production,
having them defined in the same place will help.
Hollis Blanchard - Oct. 15, 2009, 6:13 p.m.
On Thu, 2009-10-15 at 14:52 -0300, Luiz Capitulino wrote:
> 
> > You could easily have
> >       qemu_error(code, "device not found");
> 
>  This doesn't solve the problems I mentioned above, besides I don't
> see why you need to specify both, the error code and the description,
> they describe the same thing.

This is how FTP and other protocols work, for example.

You supplied a perfect example:
	qemu_error(404, "husb: host usb device %d.%d is already open\n",
                      bus_num, addr);

In this case, an interactive client could only display the "pretty"
text.

A machine client would interpret the code as Markus described in a
previous mail (temporary server error, permanent server error, etc). In
case of a permanent server error, the client could *also* pass through
the pretty text to display to the user.

>  Also, they must not change after the protocol gets into production,
> having them defined in the same place will help.

This seems really unnecessary and too much abstraction.

Aside from that, we should certainly be able to change the pretty text,
for example, to provide additional clarification to the user. The
machine-interpreted code, on the other hand, wouldn't change.
Luiz Capitulino - Oct. 15, 2009, 7:08 p.m.
On Thu, 15 Oct 2009 11:13:53 -0700
Hollis Blanchard <hollisb@us.ibm.com> wrote:

> On Thu, 2009-10-15 at 14:52 -0300, Luiz Capitulino wrote:
> > 
> > > You could easily have
> > >       qemu_error(code, "device not found");
> > 
> >  This doesn't solve the problems I mentioned above, besides I don't
> > see why you need to specify both, the error code and the description,
> > they describe the same thing.
> 
> This is how FTP and other protocols work, for example.

 I meant in the qemu_error() call, we will certainly emit them both
in the protocol.

> You supplied a perfect example:
> 	qemu_error(404, "husb: host usb device %d.%d is already open\n",
>                       bus_num, addr);
> 
> In this case, an interactive client could only display the "pretty"
> text.
> 
> A machine client would interpret the code as Markus described in a
> previous mail (temporary server error, permanent server error, etc). In
> case of a permanent server error, the client could *also* pass through
> the pretty text to display to the user.

 Ok, so this design is really different from what I had in mind.

 In your design error codes are more like errors classes, right? So,
using your example, 404 could have the meaning 'already open, temporary',
this way it can be used with USB, PCI or even for files.

 How do you plan to emit it on the wire? Like this:

{ "error": { "code": 404,
             "desc": "husb: host usb device 0.12 is already open" } }

 In case I'm right, there are two problems with it:

1. It's hard to retrieve the variable information
2. Maybe minor, but I see too much information for a protocol

 In the design I'm proposing, *each* different error will get a code,
no problem to have classes of errors though, but the point is that
error text doesn't go on the call to qemu_error().

 The above example can be emitted like this:

{ "error": { "code": 12
             "desc": "device already open",
             "data": { "bus": 0, "address": 12 } } }

 Note that this also can be reused by any bus, as the "data" information
is built at error time and can contain anything.

> >  Also, they must not change after the protocol gets into production,
> > having them defined in the same place will help.
> 
> This seems really unnecessary and too much abstraction.

 I don't agree it's unnecessary, but I agree it has too much abstraction
and making the errors harder to report will make developers tentative
in not reporting errors at all (Markus argument).

 Maybe we need clearer requirements?

> Aside from that, we should certainly be able to change the pretty text,
> for example, to provide additional clarification to the user. The
> machine-interpreted code, on the other hand, wouldn't change.

 How do you plan to do it in the design you're proposing? If you
use the text from qemu_error() to build the user and protocol
outputs, then you can't change it.

 In current QError this is possible.
Hollis Blanchard - Oct. 15, 2009, 8:13 p.m.
On Thu, 2009-10-15 at 16:08 -0300, Luiz Capitulino wrote:
> On Thu, 15 Oct 2009 11:13:53 -0700
> Hollis Blanchard <hollisb@us.ibm.com> wrote:
> 
> > On Thu, 2009-10-15 at 14:52 -0300, Luiz Capitulino wrote:
> > > 
> > > > You could easily have
> > > >       qemu_error(code, "device not found");
> > > 
> > >  This doesn't solve the problems I mentioned above, besides I don't
> > > see why you need to specify both, the error code and the description,
> > > they describe the same thing.
> > 
> > This is how FTP and other protocols work, for example.
> 
>  I meant in the qemu_error() call, we will certainly emit them both
> in the protocol.
> 
> > You supplied a perfect example:
> > 	qemu_error(404, "husb: host usb device %d.%d is already open\n",
> >                       bus_num, addr);
> > 
> > In this case, an interactive client could only display the "pretty"
> > text.
> > 
> > A machine client would interpret the code as Markus described in a
> > previous mail (temporary server error, permanent server error, etc). In
> > case of a permanent server error, the client could *also* pass through
> > the pretty text to display to the user.
> 
>  Ok, so this design is really different from what I had in mind.

I'm starting to see that. :)

>  In your design error codes are more like errors classes, right? So,
> using your example, 404 could have the meaning 'already open, temporary',
> this way it can be used with USB, PCI or even for files.
> 
>  How do you plan to emit it on the wire? Like this:
> 
> { "error": { "code": 404,
>              "desc": "husb: host usb device 0.12 is already open" } }
> 
>  In case I'm right, there are two problems with it:
> 
> 1. It's hard to retrieve the variable information
> 2. Maybe minor, but I see too much information for a protocol

I'm really confused by your objections. This is simply HTTP-style
"Status Code" and "Reason
Phrase" (http://tools.ietf.org/html/rfc2616#section-6.1.1) in JSON form.

1. The client receiving the response clearly has a JSON interpreter, so
how is it hard to retrieve anything?
2. Have you ever used software that said "Failed." without explanation?
Unless you're suggesting that all clients would build in their own
code->description tables (I hope to god you're not :), we must transmit
the description in the protocol.

> > Aside from that, we should certainly be able to change the pretty text,
> > for example, to provide additional clarification to the user. The
> > machine-interpreted code, on the other hand, wouldn't change.
> 
>  How do you plan to do it in the design you're proposing? If you
> use the text from qemu_error() to build the user and protocol
> outputs, then you can't change it.

Today it looks like this:
        C: open host USB device foo
        S: error 404, host USB device foo is already open

Tomorrow it could look like this:
        C: open host USB device foo
        S: error 404, host USB device foo is already open by PID 27

As described in HTTP's section 6.1.1:
   The Status-Code is intended
   for use by automata and the Reason-Phrase is intended for the human
   user. The client is not required to examine or display the Reason-
   Phrase.

Since the automata only interprets the status code, nothing breaks, and
the user benefits from more information.
Anthony Liguori - Oct. 15, 2009, 8:57 p.m.
>>> Aside from that, we should certainly be able to change the pretty text,
>>> for example, to provide additional clarification to the user. The
>>> machine-interpreted code, on the other hand, wouldn't change.
>>>       
>>  How do you plan to do it in the design you're proposing? If you
>> use the text from qemu_error() to build the user and protocol
>> outputs, then you can't change it.
>>     
>
> Today it looks like this:
>         C: open host USB device foo
>         S: error 404, host USB device foo is already open
>
> Tomorrow it could look like this:
>         C: open host USB device foo
>         S: error 404, host USB device foo is already open by PID 27
>   

What's tough about this sort of error handling is that it's not 
conducive to localization.  It's not unusual for the server to have a 
different locale than the client so you really need the client to be 
able to translate error messages into meaningful messages in the client 
locale.

This is the typical argument for highly structured error reporting.
Hollis Blanchard - Oct. 15, 2009, 9:18 p.m.
On Thu, 2009-10-15 at 15:57 -0500, Anthony Liguori wrote:
> >>> Aside from that, we should certainly be able to change the pretty text,
> >>> for example, to provide additional clarification to the user. The
> >>> machine-interpreted code, on the other hand, wouldn't change.
> >>>       
> >>  How do you plan to do it in the design you're proposing? If you
> >> use the text from qemu_error() to build the user and protocol
> >> outputs, then you can't change it.
> >>     
> >
> > Today it looks like this:
> >         C: open host USB device foo
> >         S: error 404, host USB device foo is already open
> >
> > Tomorrow it could look like this:
> >         C: open host USB device foo
> >         S: error 404, host USB device foo is already open by PID 27
> >   
> 
> What's tough about this sort of error handling is that it's not 
> conducive to localization.  It's not unusual for the server to have a 
> different locale than the client so you really need the client to be 
> able to translate error messages into meaningful messages in the client 
> locale.

It's hard enough to keep localized strings updated for every release
within a single code tree. Beyond that, you expect that every *client*
will update its localized strings for every *server* release? If you
agree that is an unrealistic expectation, then even a "highly
structured" reporting mechanism won't help.

On the other hand, if you truly want to invest the energy required to
localize qemu (the server), I can imagine setting the per-client locale
as a monitor command.
Anthony Liguori - Oct. 15, 2009, 9:27 p.m.
hollisb@linux.vnet.ibm.com wrote:
> It's hard enough to keep localized strings updated for every release
> within a single code tree. Beyond that, you expect that every *client*
> will update its localized strings for every *server* release?

What I meant by highly structured is that you wouldn't pass a string at 
all.  There would just be an error code and whatever information went 
along with that error code.  It would be up to the client to decide how 
to present a message to the user.

 From a usability perspective, this is better both for localization but 
also to ensure a consistent user experience with respect to how errors 
are described to a user.

Basically, I think the conventional wisdom in UI design is that you 
never want to pass error messages from a  server directly to a user.
Hollis Blanchard - Oct. 15, 2009, 10:44 p.m.
On Thu, 2009-10-15 at 16:27 -0500, Anthony Liguori wrote:
> hollisb@linux.vnet.ibm.com wrote:
> > It's hard enough to keep localized strings updated for every release
> > within a single code tree. Beyond that, you expect that every *client*
> > will update its localized strings for every *server* release?
> 
> What I meant by highly structured is that you wouldn't pass a string at 
> all.  There would just be an error code and whatever information went 
> along with that error code.  It would be up to the client to decide how 
> to present a message to the user.
> 
>  From a usability perspective, this is better both for localization but 
> also to ensure a consistent user experience with respect to how errors 
> are described to a user.
> 
> Basically, I think the conventional wisdom in UI design is that you 
> never want to pass error messages from a  server directly to a user.

I think it's also conventional wisdom not to show the user vague "some
error occurred" messages. :)

How about this (basically what Paolo suggested):

{ "error": { "code": 12,
             "desc": "device %{bus}:%{address} already open",
             "data": { "bus": 0, "address": 12 } } }

'desc' *may* be used by the client, or may be replaced with a localized
version. This ensures that users need never be confronted with an
"Error: 12" dialog box; a client can at least display some (English)
information for unknown errors.

I am now seeing how Luiz got to where he did with the original patch:
once you have unique error codes, why embed the format string at the
call site? (I think this would have been a better patch description in
the first place, rather than unspecified "compatibility".)

However, I think the part that's still bothering me is the user_print
function pointer. It seems to me that there should be no difference
between the text sent to a remote client and the text displayed to a
local user, and if that were true it would remove some ugliness.
Gerd Hoffmann - Oct. 16, 2009, 7:30 a.m.
Hi,

> { "error": { "code": 404,
>               "desc": "husb: host usb device 0.12 is already open" } }
>
>   In case I'm right, there are two problems with it:
>
> 1. It's hard to retrieve the variable information

Markus asked it already, I'm asking again:

*Do we actually need the variable information?*

The point is that the information usually is available from context 
anyway, i.e. if you get the reply above from a command like 'usb_add 
host:0.12' you know it is host device 0.12 *without* parsing reply 
because you know you've asked for host device 0.12.

cheers,
   Gerd
Paolo Bonzini - Oct. 16, 2009, 8:02 a.m.
On 10/15/2009 09:08 PM, Luiz Capitulino wrote:
> { "error": { "code": 12
>               "desc": "device already open",
>               "data": { "bus": 0, "address": 12 } } }
>
>   Note that this also can be reused by any bus, as the "data" information
> is built at error time and can contain anything.

The "desc" is not even necessary on the wire.

Paolo
Paolo Bonzini - Oct. 16, 2009, 8:06 a.m.
On 10/16/2009 12:44 AM, Hollis Blanchard wrote:
> How about this (basically what Paolo suggested):
>
> { "error": { "code": 12,
>               "desc": "device %{bus}:%{address} already open",
>               "data": { "bus": 0, "address": 12 } } }
>
> 'desc'*may*  be used by the client, or may be replaced with a localized
> version.

I would say that desc need not go on the wire too.  The client might not 
even want to show the same string to the user, for example they may want 
to say "mouse already" open.

The "device %{bus}:%{address} already open" would be strictly inside 
QEMU, for consumption of the monitor interface.  Of course since the 
server is in QEMU too it makes sense to consolidate it in the same 
struct, but this does not mean that everything in the struct needs to go 
on the wire.

Paolo
Luiz Capitulino - Oct. 16, 2009, 12:39 p.m.
On Fri, 16 Oct 2009 09:30:35 +0200
Gerd Hoffmann <kraxel@redhat.com> wrote:

>    Hi,
> 
> > { "error": { "code": 404,
> >               "desc": "husb: host usb device 0.12 is already open" } }
> >
> >   In case I'm right, there are two problems with it:
> >
> > 1. It's hard to retrieve the variable information
> 
> Markus asked it already, I'm asking again:
> 
> *Do we actually need the variable information?*

 That's a good question and to answer it properly I think we should
have to go over all errors and check if any of them generate
information which is not available from the context.

 There is also the compatibility issue, I'm not sure if it's ok
to change the error output, we would have to check at least the
majors QEMU users.
Luiz Capitulino - Oct. 16, 2009, 1:05 p.m.
On Fri, 16 Oct 2009 10:06:10 +0200
Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 10/16/2009 12:44 AM, Hollis Blanchard wrote:
> > How about this (basically what Paolo suggested):
> >
> > { "error": { "code": 12,
> >               "desc": "device %{bus}:%{address} already open",
> >               "data": { "bus": 0, "address": 12 } } }
> >
> > 'desc'*may*  be used by the client, or may be replaced with a localized
> > version.
> 
> I would say that desc need not go on the wire too.  The client might not 
> even want to show the same string to the user, for example they may want 
> to say "mouse already" open.
> 
> The "device %{bus}:%{address} already open" would be strictly inside 
> QEMU, for consumption of the monitor interface.  Of course since the 
> server is in QEMU too it makes sense to consolidate it in the same 
> struct, but this does not mean that everything in the struct needs to go 
> on the wire.

 This is what my current proposal does, "desc" goes on the wire
because it's a simple description but messages for users consumption
are printed by .user_print.

 I think we could make this very simple if we use a solution along
the lines suggested by Hollis and do _not_ allow variable information
(ie. 'handler data') to go on the wire.

 I mean, we could let current errors as they are but add error codes
and new error descriptions to be used by the protocol only.

 For example, a call to:

monitor_printf(mon, "husb: host usb device %d.%d is already open\n",
                      bus_num, addr);

 Would become:

qemu_error_structed(404, "husb: host usb device %d.%d is already open\n",
                          bus_num, addr);

 When in user protocol the message is printed normally, when in protocol
mode the error code is used to index the error table and on the wire
we emit:

{ "error": { "code": 404, "desc": "device already open" } }

 Now I need to know if it's ok to have such simple error information
on the wire.
Paolo Bonzini - Oct. 16, 2009, 1:34 p.m.
On 10/16/2009 02:39 PM, Luiz Capitulino wrote:
>   That's a good question and to answer it properly I think we should
> have to go over all errors and check if any of them generate
> information which is not available from the context.

As more communication (including errors) becomes asynchronous, the 
probability of this happening will quickly go to 1.

I think we just need good QObject printf/scanf like functions.

Paolo
Anthony Liguori - Oct. 16, 2009, 1:37 p.m.
Gerd Hoffmann wrote:
>   Hi,
>
>> { "error": { "code": 404,
>>               "desc": "husb: host usb device 0.12 is already open" } }
>>
>>   In case I'm right, there are two problems with it:
>>
>> 1. It's hard to retrieve the variable information
>
> Markus asked it already, I'm asking again:
>
> *Do we actually need the variable information?*
>
> The point is that the information usually is available from context 
> anyway, i.e. if you get the reply above from a command like 'usb_add 
> host:0.12' you know it is host device 0.12 *without* parsing reply 
> because you know you've asked for host device 0.12.

I would recommend treating errors like exceptions.  In fact, if I were 
writing a python client library, this is exactly how I would model it.

In fact, it would make sense to switch code to some sort of class name.  
This would be the exception class.  The data argument would be the 
exception object.  desc would basically be __str__() method although 
more limited.

Very often, exceptions have very little data associated with them.  
Sometimes it's useful to include a bit of info though.
Anthony Liguori - Oct. 16, 2009, 1:39 p.m.
Paolo Bonzini wrote:
> On 10/16/2009 12:44 AM, Hollis Blanchard wrote:
>> How about this (basically what Paolo suggested):
>>
>> { "error": { "code": 12,
>>               "desc": "device %{bus}:%{address} already open",
>>               "data": { "bus": 0, "address": 12 } } }
>>
>> 'desc'*may*  be used by the client, or may be replaced with a localized
>> version.
>
> I would say that desc need not go on the wire too.  The client might 
> not even want to show the same string to the user, for example they 
> may want to say "mouse already" open.
>
> The "device %{bus}:%{address} already open" would be strictly inside 
> QEMU, for consumption of the monitor interface.  Of course since the 
> server is in QEMU too it makes sense to consolidate it in the same 
> struct, but this does not mean that everything in the struct needs to 
> go on the wire.

Agreed.  It could also go in a client library and we could provide 
something equivalent to strerror().  The advantage of putting this in a 
client library is that we could potentially localize it.
Luiz Capitulino - Oct. 16, 2009, 2:17 p.m.
On Fri, 16 Oct 2009 08:37:57 -0500
Anthony Liguori <aliguori@us.ibm.com> wrote:

> Gerd Hoffmann wrote:
> >   Hi,
> >
> >> { "error": { "code": 404,
> >>               "desc": "husb: host usb device 0.12 is already open" } }
> >>
> >>   In case I'm right, there are two problems with it:
> >>
> >> 1. It's hard to retrieve the variable information
> >
> > Markus asked it already, I'm asking again:
> >
> > *Do we actually need the variable information?*
> >
> > The point is that the information usually is available from context 
> > anyway, i.e. if you get the reply above from a command like 'usb_add 
> > host:0.12' you know it is host device 0.12 *without* parsing reply 
> > because you know you've asked for host device 0.12.
> 
> I would recommend treating errors like exceptions.  In fact, if I were 
> writing a python client library, this is exactly how I would model it.
> 
> In fact, it would make sense to switch code to some sort of class name.  
> This would be the exception class.  The data argument would be the 
> exception object.  desc would basically be __str__() method although 
> more limited.

 This seems useful (and a lot of C projects end up creating their
"error object" anyway), but I'm getting concerned.

 First, and more important, we have our immediate needs. Which is to have
most of the protocol code merged in 15 days.

 Second, I fear we are going too far with the objects idea. It
solves the monitor's problem well and although can be used by
other subsystems I wonder if pushing them to the extreme like that
is the way to go.
Paolo Bonzini - Oct. 16, 2009, 5:28 p.m.
On 10/16/2009 04:17 PM, Luiz Capitulino wrote:
>   Second, I fear we are going too far with the objects idea. It
> solves the monitor's problem well and although can be used by
> other subsystems I wonder if pushing them to the extreme like that
> is the way to go.

I think that in practice Anthony is asking you only to write the 
stringified enum name (e.g. "QEMU_ERROR_NODEV"), instead of the enum 
value (i.e. 2). :-)

I agree about what the immediate needs are.  On the other hand, making 
things too complicated to extend later is not a good idea...

Paolo, who's amazed at the amount of design changes that went into QEMU 
since last June...
Anthony Liguori - Oct. 16, 2009, 5:47 p.m.
Paolo Bonzini wrote:
> On 10/16/2009 04:17 PM, Luiz Capitulino wrote:
>>   Second, I fear we are going too far with the objects idea. It
>> solves the monitor's problem well and although can be used by
>> other subsystems I wonder if pushing them to the extreme like that
>> is the way to go.
>
> I think that in practice Anthony is asking you only to write the 
> stringified enum name (e.g. "QEMU_ERROR_NODEV"), instead of the enum 
> value (i.e. 2). :-)

Yup.

> I agree about what the immediate needs are.  On the other hand, making 
> things too complicated to extend later is not a good idea...

There are just a few things we have to get right and we should take our 
time on those.  The error semantics are one of those important things 
that we'll regret in the future if we rush.
Jamie Lokier - Oct. 18, 2009, 4:25 a.m.
Anthony Liguori wrote:
> hollisb@linux.vnet.ibm.com wrote:
> >It's hard enough to keep localized strings updated for every release
> >within a single code tree. Beyond that, you expect that every *client*
> >will update its localized strings for every *server* release?
> 
> What I meant by highly structured is that you wouldn't pass a string at 
> all.  There would just be an error code and whatever information went 
> along with that error code.  It would be up to the client to decide how 
> to present a message to the user.

The main problem with that is when you add a new error, several things
need to be added in different places, including in clients, so it
discourages adding new errors and existing ones are abused.

Down that road lies "Error, something failed", as well as "File not
found" for things which aren't files.

The practice of providing a string at the call site encourages
messages to be informative and accurate first, with localisation and
error-specific actions added later as needed.

The manual for GNU gettext explains quite well why gettext takes a
message string as argument, instead of a "message code".  Imho, a
similar case can be made for error messages at call sites.

> From a usability perspective, this is better both for localization but 
> also to ensure a consistent user experience with respect to how errors 
> are described to a user.

It's not great for localisation if N clients all maintain separate,
different sets of localisation for the same messages, especially if
the set of messages is changing regularly too from one qemu version to
the next.

It's also not a consistent user experience, if each client has a
different message for the same obscure errors.

Only the most widely used clients are likely to get localised, and
even those will get out of date quickly, e.g. when using a distro's
client with a newer-than-distry qemu.

> Basically, I think the conventional wisdom in UI design is that you 
> never want to pass error messages from a  server directly to a user.

You can probably guess I don't agree with that.  I'm not sure if it is
actually conventional wisdom, either.  It happens a lot in popular
things with highly refined UIs like web browsers, FTP clients and the
like.  They tend to _add_ their own explanation to the server-provided
error messages, but show the server's message too, especially for
uncommon error types that the client doesn't know about (because it
can't know them all).

-- Jamie
Jamie Lokier - Oct. 18, 2009, 4:28 a.m.
Paolo Bonzini wrote:
> On 10/15/2009 09:08 PM, Luiz Capitulino wrote:
> >{ "error": { "code": 12
> >              "desc": "device already open",
> >              "data": { "bus": 0, "address": 12 } } }
> >
> >  Note that this also can be reused by any bus, as the "data" information
> >is built at error time and can contain anything.
> 
> The "desc" is not even necessary on the wire.

When you send an error code that that client doesn't know yet (because
you can't update every client immediately), it'll be very helpful to
users to see "device already open" instead of "unknown error 12".

-- Jamie
Jamie Lokier - Oct. 18, 2009, 4:34 a.m.
Jamie Lokier wrote:
> Paolo Bonzini wrote:
> > On 10/15/2009 09:08 PM, Luiz Capitulino wrote:
> > >{ "error": { "code": 12
> > >              "desc": "device already open",
> > >              "data": { "bus": 0, "address": 12 } } }
> > >
> > >  Note that this also can be reused by any bus, as the "data" information
> > >is built at error time and can contain anything.
> > 
> > The "desc" is not even necessary on the wire.
> 
> When you send an error code that that client doesn't know yet (because
> you can't update every client immediately), it'll be very helpful to
> users to see "device already open" instead of "unknown error 12".

About that technique in general.  It works much better when the client
and server are managed together, for example as a single project, or
by the same people working on both.

Then you can keep the client's set of error codes in sync with the
server in every version.

But that's not possible when there are umpteen clients maintained by
other people on their own schedule, each used by users who may combine
them with newer servers.  Which I gather is something that the new
monitor protocol is intended to support.

-- Jamie
Paolo Bonzini - Oct. 18, 2009, 12:17 p.m.
On 10/18/2009 06:25 AM, Jamie Lokier wrote:
> The manual for GNU gettext explains quite well why gettext takes a
> message string as argument, instead of a "message code".  Imho, a
> similar case can be made for error messages at call sites.

That's true.  However here we have the case of having errors consumed by 
programs as well as users, so we want something that can be easily made 
into language bindings.

In other words, this situation is much more similar to errno/strerror, 
than to a compiler error message (which will often be used by other 
programs, but where the actual error text will be read by a person; this 
can and should use gettext).

Paolo
Daniel P. Berrange - Oct. 19, 2009, 10:25 a.m.
On Fri, Oct 16, 2009 at 10:05:44AM -0300, Luiz Capitulino wrote:
> On Fri, 16 Oct 2009 10:06:10 +0200
> Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
> > On 10/16/2009 12:44 AM, Hollis Blanchard wrote:
> > > How about this (basically what Paolo suggested):
> > >
> > > { "error": { "code": 12,
> > >               "desc": "device %{bus}:%{address} already open",
> > >               "data": { "bus": 0, "address": 12 } } }
> > >
> > > 'desc'*may*  be used by the client, or may be replaced with a localized
> > > version.
> > 
> > I would say that desc need not go on the wire too.  The client might not 
> > even want to show the same string to the user, for example they may want 
> > to say "mouse already" open.
> > 
> > The "device %{bus}:%{address} already open" would be strictly inside 
> > QEMU, for consumption of the monitor interface.  Of course since the 
> > server is in QEMU too it makes sense to consolidate it in the same 
> > struct, but this does not mean that everything in the struct needs to go 
> > on the wire.
> 
>  This is what my current proposal does, "desc" goes on the wire
> because it's a simple description but messages for users consumption
> are printed by .user_print.
> 
>  I think we could make this very simple if we use a solution along
> the lines suggested by Hollis and do _not_ allow variable information
> (ie. 'handler data') to go on the wire.
> 
>  I mean, we could let current errors as they are but add error codes
> and new error descriptions to be used by the protocol only.
> 
>  For example, a call to:
> 
> monitor_printf(mon, "husb: host usb device %d.%d is already open\n",
>                       bus_num, addr);
> 
>  Would become:
> 
> qemu_error_structed(404, "husb: host usb device %d.%d is already open\n",
>                           bus_num, addr);
> 
>  When in user protocol the message is printed normally, when in protocol
> mode the error code is used to index the error table and on the wire
> we emit:
> 
> { "error": { "code": 404, "desc": "device already open" } }
> 
>  Now I need to know if it's ok to have such simple error information
> on the wire.

In so much as its providing an error code + message I think that is
sufficient, but I think we should take care to ensure that the error
description contains enough information to allow useful troubleshooting.
As such doing a simple index lookup from error code -> message is not
sufficient, because it would be throwing away potentially important
error data.

Consider a user files a bug report attaching client app logs which
show the monitor command issued and the it error returned. Ask yourself,
is that enough information to locate where in the code the error came
from & can you make a reasonable attempt at identifying a root cause. 

In your example above, the usb_add command already included the bus_num
+ addr, so there'd be no need to also include that data in the reply.
But consider when QEMU actually tries to open the host USB device, there
are easily 10-20 different things that could go wrong, and many even
simple system calls like open() could generate many different errors.
It is important that this fine detail be reported back to the client
app for developers to have a good attempt at troubleshooting

Regards,
Daniel
Luiz Capitulino - Oct. 19, 2009, 12:28 p.m.
On Mon, 19 Oct 2009 11:25:19 +0100
"Daniel P. Berrange" <berrange@redhat.com> wrote:

> On Fri, Oct 16, 2009 at 10:05:44AM -0300, Luiz Capitulino wrote:
> > On Fri, 16 Oct 2009 10:06:10 +0200
> > Paolo Bonzini <pbonzini@redhat.com> wrote:
> > 
> > > On 10/16/2009 12:44 AM, Hollis Blanchard wrote:
> > > > How about this (basically what Paolo suggested):
> > > >
> > > > { "error": { "code": 12,
> > > >               "desc": "device %{bus}:%{address} already open",
> > > >               "data": { "bus": 0, "address": 12 } } }
> > > >
> > > > 'desc'*may*  be used by the client, or may be replaced with a localized
> > > > version.
> > > 
> > > I would say that desc need not go on the wire too.  The client might not 
> > > even want to show the same string to the user, for example they may want 
> > > to say "mouse already" open.
> > > 
> > > The "device %{bus}:%{address} already open" would be strictly inside 
> > > QEMU, for consumption of the monitor interface.  Of course since the 
> > > server is in QEMU too it makes sense to consolidate it in the same 
> > > struct, but this does not mean that everything in the struct needs to go 
> > > on the wire.
> > 
> >  This is what my current proposal does, "desc" goes on the wire
> > because it's a simple description but messages for users consumption
> > are printed by .user_print.
> > 
> >  I think we could make this very simple if we use a solution along
> > the lines suggested by Hollis and do _not_ allow variable information
> > (ie. 'handler data') to go on the wire.
> > 
> >  I mean, we could let current errors as they are but add error codes
> > and new error descriptions to be used by the protocol only.
> > 
> >  For example, a call to:
> > 
> > monitor_printf(mon, "husb: host usb device %d.%d is already open\n",
> >                       bus_num, addr);
> > 
> >  Would become:
> > 
> > qemu_error_structed(404, "husb: host usb device %d.%d is already open\n",
> >                           bus_num, addr);
> > 
> >  When in user protocol the message is printed normally, when in protocol
> > mode the error code is used to index the error table and on the wire
> > we emit:
> > 
> > { "error": { "code": 404, "desc": "device already open" } }
> > 
> >  Now I need to know if it's ok to have such simple error information
> > on the wire.
> 
> In so much as its providing an error code + message I think that is
> sufficient, but I think we should take care to ensure that the error
> description contains enough information to allow useful troubleshooting.
> As such doing a simple index lookup from error code -> message is not
> sufficient, because it would be throwing away potentially important
> error data.

 Yes, it's going to throw away important info. We could have a big enough
table with all possible errors, but this seems insane at first.

 Another solution could be to change the semantics of the error data,
instead of passing to it information which is already there we could pass
additional info which is not part of the original message.

 On the other hand, I think this will make the usage of qemu_error_structed()
a bit confusing.

> Consider a user files a bug report attaching client app logs which
> show the monitor command issued and the it error returned. Ask yourself,
> is that enough information to locate where in the code the error came
> from & can you make a reasonable attempt at identifying a root cause. 
> 
> In your example above, the usb_add command already included the bus_num
> + addr, so there'd be no need to also include that data in the reply.
> But consider when QEMU actually tries to open the host USB device, there
> are easily 10-20 different things that could go wrong, and many even
> simple system calls like open() could generate many different errors.
> It is important that this fine detail be reported back to the client
> app for developers to have a good attempt at troubleshooting

 I agree, although there is going to be a huge effort in the conversion
work, as I think error handling in QEMU is not that good.
Daniel P. Berrange - Oct. 19, 2009, 12:42 p.m.
On Mon, Oct 19, 2009 at 10:28:17AM -0200, Luiz Capitulino wrote:
> On Mon, 19 Oct 2009 11:25:19 +0100
> "Daniel P. Berrange" <berrange@redhat.com> wrote:
> 
> > On Fri, Oct 16, 2009 at 10:05:44AM -0300, Luiz Capitulino wrote:
> > > On Fri, 16 Oct 2009 10:06:10 +0200
> > > Paolo Bonzini <pbonzini@redhat.com> wrote:
> > > 
> > > > On 10/16/2009 12:44 AM, Hollis Blanchard wrote:
> > > > > How about this (basically what Paolo suggested):
> > > > >
> > > > > { "error": { "code": 12,
> > > > >               "desc": "device %{bus}:%{address} already open",
> > > > >               "data": { "bus": 0, "address": 12 } } }
> > > > >
> > > > > 'desc'*may*  be used by the client, or may be replaced with a localized
> > > > > version.
> > > > 
> > > > I would say that desc need not go on the wire too.  The client might not 
> > > > even want to show the same string to the user, for example they may want 
> > > > to say "mouse already" open.
> > > > 
> > > > The "device %{bus}:%{address} already open" would be strictly inside 
> > > > QEMU, for consumption of the monitor interface.  Of course since the 
> > > > server is in QEMU too it makes sense to consolidate it in the same 
> > > > struct, but this does not mean that everything in the struct needs to go 
> > > > on the wire.
> > > 
> > >  This is what my current proposal does, "desc" goes on the wire
> > > because it's a simple description but messages for users consumption
> > > are printed by .user_print.
> > > 
> > >  I think we could make this very simple if we use a solution along
> > > the lines suggested by Hollis and do _not_ allow variable information
> > > (ie. 'handler data') to go on the wire.
> > > 
> > >  I mean, we could let current errors as they are but add error codes
> > > and new error descriptions to be used by the protocol only.
> > > 
> > >  For example, a call to:
> > > 
> > > monitor_printf(mon, "husb: host usb device %d.%d is already open\n",
> > >                       bus_num, addr);
> > > 
> > >  Would become:
> > > 
> > > qemu_error_structed(404, "husb: host usb device %d.%d is already open\n",
> > >                           bus_num, addr);
> > > 
> > >  When in user protocol the message is printed normally, when in protocol
> > > mode the error code is used to index the error table and on the wire
> > > we emit:
> > > 
> > > { "error": { "code": 404, "desc": "device already open" } }
> > > 
> > >  Now I need to know if it's ok to have such simple error information
> > > on the wire.
> > 
> > In so much as its providing an error code + message I think that is
> > sufficient, but I think we should take care to ensure that the error
> > description contains enough information to allow useful troubleshooting.
> > As such doing a simple index lookup from error code -> message is not
> > sufficient, because it would be throwing away potentially important
> > error data.
> 
>  Yes, it's going to throw away important info. We could have a big enough
> table with all possible errors, but this seems insane at first.
> 
>  Another solution could be to change the semantics of the error data,
> instead of passing to it information which is already there we could pass
> additional info which is not part of the original message.
> 
>  On the other hand, I think this will make the usage of qemu_error_structed()
> a bit confusing.

Why not include all of it, the error code, the generic string for the
error code, and then the formatted specific error string the monitor
currently has. eg

  { "error": { "code": 404,
               "desc": "device already open",
               "detail": "husb: host usb device 001.003 is already open" } }

Since the latter error messages all already exist, it should make it easier
to adapt, and allow clients flexibility in how they report/handle the 
errors whether to show the detail error to users, or just the generic msg

Daniel
Hollis Blanchard - Oct. 19, 2009, 4:50 p.m.
On Sun, 2009-10-18 at 14:17 +0200, Paolo Bonzini wrote:
> On 10/18/2009 06:25 AM, Jamie Lokier wrote:
> > The manual for GNU gettext explains quite well why gettext takes a
> > message string as argument, instead of a "message code".  Imho, a
> > similar case can be made for error messages at call sites.
> 
> That's true.  However here we have the case of having errors consumed by 
> programs as well as users, so we want something that can be easily made 
> into language bindings.
> 
> In other words, this situation is much more similar to errno/strerror, 
> than to a compiler error message (which will often be used by other 
> programs, but where the actual error text will be read by a person; this 
> can and should use gettext).

So to extend your analogy, I think we're reaching the conclusion that
both errno *and* string should be returned to the client, right? The
client can localize based on errno, and in addition optionally provide
the server-provided string to the user.
Paolo Bonzini - Oct. 19, 2009, 9:16 p.m.
On 10/19/2009 06:50 PM, Hollis Blanchard wrote:
> On Sun, 2009-10-18 at 14:17 +0200, Paolo Bonzini wrote:
>> On 10/18/2009 06:25 AM, Jamie Lokier wrote:
>>> The manual for GNU gettext explains quite well why gettext takes a
>>> message string as argument, instead of a "message code".  Imho, a
>>> similar case can be made for error messages at call sites.
>>
>> That's true.  However here we have the case of having errors consumed by
>> programs as well as users, so we want something that can be easily made
>> into language bindings.
>>
>> In other words, this situation is much more similar to errno/strerror,
>> than to a compiler error message (which will often be used by other
>> programs, but where the actual error text will be read by a person; this
>> can and should use gettext).
>
> So to extend your analogy, I think we're reaching the conclusion that
> both errno *and* string should be returned to the client, right?

I'd say so, yes.  And any parameters too, all as JSON.  I think errno is 
just one kind of data that can be in an exception, and you want 
different exceptions for not opening a sound device or not opening an 
image file (even if both may fail with EACCES or ENOENT).

Paolo

Patch

diff --git a/qerror.c b/qerror.c
index bbea770..88a8208 100644
--- a/qerror.c
+++ b/qerror.c
@@ -24,11 +24,23 @@  static const QType qerror_type = {
     .destroy = qerror_destroy_obj,
 };
 
+static void qemu_err_qdev_nodev(const QError *qerror)
+{
+    QDict *qdict = qobject_to_qdict(qerror->data);
+    qemu_error("Device \"%s\" not found.  Try -device '?' for a list.\n",
+               qdict_get_str(qdict, "name"));
+}
+
 static QErrorTable qerror_table[] = {
     {
         .code   = QERR_UNKNOWN,
         .desc   = "unknown error",
     },
+    {
+        .code   = QERR_QDEV_NFOUND,
+        .desc   = "device not found",
+        .user_print = qemu_err_qdev_nodev,
+    },
 };
 
 /**
diff --git a/qerror.h b/qerror.h
index ed25ef1..820f25e 100644
--- a/qerror.h
+++ b/qerror.h
@@ -21,6 +21,7 @@ 
  */
 typedef enum QErrorCode {
     QERR_UNKNOWN,
+    QERR_QDEV_NFOUND,
     QERR_MAX,
 } QErrorCode;