Patchwork [3/4] QError: Don't abort on multiple faults

Submitter Luiz Capitulino
Date Feb. 4, 2010, 8:13 p.m.
Message ID <1265314396-6583-4-git-send-email-lcapitulino@redhat.com>
Permalink /patch/44553/
State New

Comments

Luiz Capitulino - Feb. 4, 2010, 8:13 p.m.
Ideally, Monitor code should report an error only once and
return the error information up the call chain.

To assure that this happens as expected and that no error is
lost, we have an assert() in qemu_error_internal().

However, we still have not fully converted handlers using
monitor_printf() to report errors. As there can be multiple
monitor_printf() calls on an error, the assertion is easily
triggered when debugging is enabled; and we will get a memory
leak if it's not.

The solution to this problem is to allow multiple faults by only
reporting the first one, and to release the additional error objects.

A better mechanism to report multiple errors to programmers is
underway.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
---
 monitor.c |    9 +++++++--
 1 files changed, 7 insertions(+), 2 deletions(-)
Markus Armbruster - Feb. 5, 2010, 9:15 a.m.
Luiz Capitulino <lcapitulino@redhat.com> writes:

> Ideally, Monitor code should report an error only once and
> return the error information up the call chain.
>
> To assure that this happens as expected and that no error is
> lost, we have an assert() in qemu_error_internal().
>
> However, we still have not fully converted handlers using
> monitor_printf() to report errors. As there can be multiple
> monitor_printf() calls on an error, the assertion is easily
> triggered when debugging is enabled; and we will get a memory
> leak if it's not.
>
> The solution to this problem is to allow multiple faults by only
> reporting the first one, and to release the additional error objects.

I want this badly.

[...]
Markus Armbruster - Feb. 5, 2010, 2:21 p.m.
Markus Armbruster <armbru@redhat.com> writes:

> Luiz Capitulino <lcapitulino@redhat.com> writes:
>
>> Ideally, Monitor code should report an error only once and
>> return the error information up the call chain.
>>
>> To assure that this happens as expected and that no error is
>> lost, we have an assert() in qemu_error_internal().
>>
>> However, we still have not fully converted handlers using
>> monitor_printf() to report errors. As there can be multiple
>> monitor_printf() calls on an error, the assertion is easily
>> triggered when debugging is enabled; and we will get a memory
>> leak if it's not.
>>
>> The solution to this problem is to allow multiple faults by only
>> reporting the first one, and to release the additional error objects.
>
> I want this badly.
>
> [...]

Let me elaborate a bit.  While this patch is a much wanted improvement,
what I *really* want is something else.

Right now, we have 41 uses of qemu_error_new().  We still have >300 uses
of monitor_printf(), many of them errors.  Plus some 100 uses of
qemu_error(), which boils down to monitor_printf() when running within a
monitor.  Not to mention >1000 uses of stderr.

To convert a monitor handler to QError, we have to make it report
exactly one error on every unsuccessful path, with qemu_error_new().
That's not too hard.  Then we have to ensure it does not call
monitor_printf() directly (not hard either) or indirectly (ouch).  I say
"ouch", because those prints can hide behind long call chains, in code
shared with other users.  Cleaning up all those stray prints will take
time.

Without this patch, a stray print is fatal, unless it happens to be the
only one *and* there is no real error.

With this patch, we survive, but the UndefinedError triggered by the
stray print displaces any later real error.

What I really want is that stray prints do not mess with my real errors.
Luiz Capitulino - Feb. 5, 2010, 2:44 p.m.
On Fri, 05 Feb 2010 15:21:13 +0100
Markus Armbruster <armbru@redhat.com> wrote:

> Markus Armbruster <armbru@redhat.com> writes:
> 
> > Luiz Capitulino <lcapitulino@redhat.com> writes:
> >
> >> Ideally, Monitor code should report an error only once and
> >> return the error information up the call chain.
> >>
> >> To assure that this happens as expected and that no error is
> >> lost, we have an assert() in qemu_error_internal().
> >>
> >> However, we still have not fully converted handlers using
> >> monitor_printf() to report errors. As there can be multiple
> >> monitor_printf() calls on an error, the assertion is easily
> >> triggered when debugging is enabled; and we will get a memory
> >> leak if it's not.
> >>
> >> The solution to this problem is to allow multiple faults by only
> >> reporting the first one, and to release the additional error objects.
> >
> > I want this badly.
> >
> > [...]
> 
> Let me elaborate a bit.  While this patch is a much wanted improvement,
> what I *really* want is something else.
> 
> Right now, we have 41 uses of qemu_error_new().  We still have >300 uses
> of monitor_printf(), many of them errors.  Plus some 100 uses of
> qemu_error(), which boils down to monitor_printf() when running within a
> monitor.  Not to mention >1000 uses of stderr.
> 
> To convert a monitor handler to QError, we have to make it report
> exactly one error on every unsuccessful path, with qemu_error_new().
> That's not too hard.  Then we have to ensure it does not call
> monitor_printf() directly (not hard either) or indirectly (ouch).  I say
> "ouch", because those prints can hide behind long call chains, in code
> shared with other users.  Cleaning up all those stray prints will take
> time.

 As we have talked, this situation will be improved by making cmd_new
return an error code, right?

 I've started working on it already, patches will be sent soon.

> Without this patch, a stray print is fatal, unless it happens to be the
> only one *and* there is no real error.
> 
> With this patch, we survive, but the UndefinedError triggered by the
> stray print displaces any later real error.
> 
> What I really want is that stray prints do not mess with my real errors.

 There are two issues here:

1. In command handlers stray prints _usually_ report errors. If we go
with shallow conversion, I believe that clients should be informed
(in the form of an undefined error) that monitor_printf() has been
called

2. We have agreed that multiple faults are not allowed and that
reporting only the first one is fine. In shallow conversion (or
even in buggy conversions) we can get multiple faults and we have
to handle it

 So, the situation will be improved by my next series as we can
use the return code to 'audit' qemu_error_new() usage (which
includes monitor_printf() calls).
Markus Armbruster - Feb. 5, 2010, 3:15 p.m.
Luiz Capitulino <lcapitulino@redhat.com> writes:

> On Fri, 05 Feb 2010 15:21:13 +0100
> Markus Armbruster <armbru@redhat.com> wrote:
>
>> Markus Armbruster <armbru@redhat.com> writes:
>> 
>> > Luiz Capitulino <lcapitulino@redhat.com> writes:
>> >
>> >> Ideally, Monitor code should report an error only once and
>> >> return the error information up the call chain.
>> >>
>> >> To assure that this happens as expected and that no error is
>> >> lost, we have an assert() in qemu_error_internal().
>> >>
>> >> However, we still have not fully converted handlers using
>> >> monitor_printf() to report errors. As there can be multiple
>> >> monitor_printf() calls on an error, the assertion is easily
>> >> triggered when debugging is enabled; and we will get a memory
>> >> leak if it's not.
>> >>
>> >> The solution to this problem is to allow multiple faults by only
>> >> reporting the first one, and to release the additional error objects.
>> >
>> > I want this badly.
>> >
>> > [...]
>> 
>> Let me elaborate a bit.  While this patch is a much wanted improvement,
>> what I *really* want is something else.
>> 
>> Right now, we have 41 uses of qemu_error_new().  We still have >300 uses
>> of monitor_printf(), many of them errors.  Plus some 100 uses of
>> qemu_error(), which boils down to monitor_printf() when running within a
>> monitor.  Not to mention >1000 uses of stderr.
>> 
>> To convert a monitor handler to QError, we have to make it report
>> exactly one error on every unsuccessful path, with qemu_error_new().
>> That's not too hard.  Then we have to ensure it does not call
>> monitor_printf() directly (not hard either) or indirectly (ouch).  I say
>> "ouch", because those prints can hide behind long call chains, in code
>> shared with other users.  Cleaning up all those stray prints will take
>> time.
>
>  As we have talked, this situation will be improved by making cmd_new
> return an error code, right?

Yes.

>  I've started working on it already, patches will be sent soon.

Excellent.

>> Without this patch, a stray print is fatal, unless it happens to be the
>> only one *and* there is no real error.
>> 
>> With this patch, we survive, but the UndefinedError triggered by the
>> stray print displaces any later real error.
>> 
>> What I really want is that stray prints do not mess with my real errors.
>
>  There are two issues here:
>
> 1. In command handlers stray prints _usually_ report errors. If we go
> with shallow conversion, I believe that clients should be informed
> (in the form of an undefined error) that monitor_printf() has been
> called

It's not so easy.

A command should report an error if and only if it really failed.
Reporting an error even though the command succeeded is just as bad as
not reporting an error when it failed.

Barring bugs, a handler *knows* whether it got the job done or not.  It
can and should communicate that knowledge up by returning status.  I
understand one of your next patch series will do that.

If a handler returns failure, and we haven't reported an error, we must
report UndefinedError whether we had stray prints or not.

If a handler returns success, we should *not* report UndefinedError just
because it had stray prints.  A stray print does not necessarily imply
command failure, and hence a stray print should not make an otherwise
successful command fail.

Adding a suitable mechanism to alert developers to stray prints is fine.
But abusing error replies for that is not.

> 2. We have agreed that multiple faults are not allowed and that
> reporting only the first one is fine. In shallow conversion (or
> even in buggy conversions) we can get multiple faults and we have
> to handle it

Reporting only the first one sucks when it's an UndefinedError triggered
by a stray print, while one of the other ones is the real error.

>  So, the situation will be improved by my next series as we can
> use the return code to 'audit' qemu_error_new() usage (which
> includes monitor_printf() calls).

Yes.
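
The reply policy argued for in this message (an error reply if and only if the handler reports failure, with stray prints having no influence on the outcome) could be sketched roughly as below. This is only an illustration: every name in it (SketchReply, sketch_choose_reply, the handler_ret convention of 0 for success) is hypothetical and not part of monitor.c, and the treatment of a handler that returns success after reporting an error is an assumption, since the thread does not settle that case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reply kinds; not QEMU identifiers. */
typedef enum {
    REPLY_SUCCESS,
    REPLY_REPORTED_ERROR,   /* the error the handler itself reported */
    REPLY_UNDEFINED_ERROR   /* handler failed without reporting anything */
} SketchReply;

/*
 * Decide what to send to the client.
 *   handler_ret    - 0 on success, non-zero on failure (assumed convention)
 *   reported_error - description of the reported error, or NULL if none
 *   stray_prints   - number of stray prints seen while the handler ran
 */
static SketchReply sketch_choose_reply(int handler_ret,
                                       const char *reported_error,
                                       int stray_prints)
{
    assert(stray_prints >= 0);
    (void)stray_prints;   /* deliberately ignored: prints do not imply failure */

    if (handler_ret == 0) {
        /* Assumption: success wins even if an error object was set. */
        return REPLY_SUCCESS;
    }
    return reported_error ? REPLY_REPORTED_ERROR : REPLY_UNDEFINED_ERROR;
}
```

With this shape, sketch_choose_reply(0, NULL, 3) yields REPLY_SUCCESS and sketch_choose_reply(-1, NULL, 0) yields REPLY_UNDEFINED_ERROR, which matches the two cases spelled out above.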
Luiz Capitulino - Feb. 5, 2010, 5:07 p.m.
On Fri, 05 Feb 2010 16:15:56 +0100
Markus Armbruster <armbru@redhat.com> wrote:

> Luiz Capitulino <lcapitulino@redhat.com> writes:
> 
> > On Fri, 05 Feb 2010 15:21:13 +0100
> > Markus Armbruster <armbru@redhat.com> wrote:
> >
> >> Markus Armbruster <armbru@redhat.com> writes:
> >> 
> >> > Luiz Capitulino <lcapitulino@redhat.com> writes:
> >> >
> >> >> Ideally, Monitor code should report an error only once and
> >> >> return the error information up the call chain.
> >> >>
> >> >> To assure that this happens as expected and that no error is
> >> >> lost, we have an assert() in qemu_error_internal().
> >> >>
> >> >> However, we still have not fully converted handlers using
> >> >> monitor_printf() to report errors. As there can be multiple
> >> >> monitor_printf() calls on an error, the assertion is easily
> >> >> triggered when debugging is enabled; and we will get a memory
> >> >> leak if it's not.
> >> >>
> >> >> The solution to this problem is to allow multiple faults by only
> >> >> reporting the first one, and to release the additional error objects.
> >> >
> >> > I want this badly.
> >> >
> >> > [...]
> >> 
> >> Let me elaborate a bit.  While this patch is a much wanted improvement,
> >> what I *really* want is something else.
> >> 
> >> Right now, we have 41 uses of qemu_error_new().  We still have >300 uses
> >> of monitor_printf(), many of them errors.  Plus some 100 uses of
> >> qemu_error(), which boils down to monitor_printf() when running within a
> >> monitor.  Not to mention >1000 uses of stderr.
> >> 
> >> To convert a monitor handler to QError, we have to make it report
> >> exactly one error on every unsuccessful path, with qemu_error_new().
> >> That's not too hard.  Then we have to ensure it does not call
> >> monitor_printf() directly (not hard either) or indirectly (ouch).  I say
> >> "ouch", because those prints can hide behind long call chains, in code
> >> shared with other users.  Cleaning up all those stray prints will take
> >> time.
> >
> >  As we have talked, this situation will be improved by making cmd_new
> > return an error code, right?
> 
> Yes.
> 
> >  I've started working on it already, patches will be sent soon.
> 
> Excellent.
> 
> >> Without this patch, a stray print is fatal, unless it happens to be the
> >> only one *and* there is no real error.
> >> 
> >> With this patch, we survive, but the UndefinedError triggered by the
> >> stray print displaces any later real error.
> >> 
> >> What I really want is that stray prints do not mess with my real errors.
> >
> >  There are two issues here:
> >
> > 1. In command handlers stray prints _usually_ report errors. If we go
> > with shallow conversion, I believe that clients should be informed
> > (in the form of an undefined error) that monitor_printf() has been
> > called
> 
> It's not so easy.
> 
> A command should report an error if and only if it really failed.
> Reporting an error even though the command succeeded is just as bad as
> not reporting an error when it failed.
> 
> Barring bugs, a handler *knows* whether it got the job done or not.  It
> can and should communicate that knowledge up by returning status.  I
> understand one of your next patch series will do that.

 Yes.

> If a handler returns failure, and we haven't reported an error, we must
> report UndefinedError whether we had stray prints or not.

 Agreed, and the stray prints have to be reported to the developer.

> If a handler returns success, we should *not* report UndefinedError just
> because it had stray prints.  A stray print does not necessarily imply
> command failure, and hence a stray print should not make an otherwise
> successful command fail.

 Right, but a stray print which is not reporting an error and which is
not just OK, is returning data.

 We have to detect this, because returning data has to be done through
the QObject API, and a shallow conversion must not miss that.

 I'm not saying we should use UndefinedError, I'm saying we have to
have a mechanism to detect this reliably. Unfortunately this can
only be detected at run time.
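
One possible shape for such a run-time check, given purely as a sketch: count the prints that happen while a converted handler runs, and keep that count apart from the error state so that it can be shown to developers without changing the reply. All names here (SketchMon, sketch_printf, sketch_run_handler, sketch_noisy_handler) are hypothetical, and the fixed-size output buffer merely stands in for the real monitor output path.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical monitor state; not the real Monitor struct. */
typedef struct SketchMon {
    int in_converted_handler;   /* set while a converted handler runs */
    int stray_prints;           /* prints seen during that handler */
    char out[256];              /* stand-in for the monitor output */
} SketchMon;

/* Stand-in for monitor_printf(): prints, and notes stray calls. */
static void sketch_printf(SketchMon *mon, const char *fmt, ...)
{
    va_list ap;

    assert(mon != NULL);
    if (mon->in_converted_handler) {
        mon->stray_prints++;    /* recorded only; error state is untouched */
    }
    va_start(ap, fmt);
    vsnprintf(mon->out, sizeof(mon->out), fmt, ap);
    va_end(ap);
}

/* Run a handler with stray-print accounting and pass its status through. */
static int sketch_run_handler(SketchMon *mon, int (*handler)(SketchMon *))
{
    int ret;

    mon->stray_prints = 0;
    mon->in_converted_handler = 1;
    ret = handler(mon);
    mon->in_converted_handler = 0;
    return ret;
}

/* Example handler that succeeds but still prints directly. */
static int sketch_noisy_handler(SketchMon *mon)
{
    sketch_printf(mon, "device %s\n", "ide0");
    return 0;
}
```

After sketch_run_handler(&mon, sketch_noisy_handler), the handler's return value is still 0 while mon.stray_prints is 1, so the caller can warn the developer without turning a successful command into a failure.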

Patch

diff --git a/monitor.c b/monitor.c
index cb7eb65..c8b63aa 100644
--- a/monitor.c
+++ b/monitor.c
@@ -4625,8 +4625,13 @@  void qemu_error_internal(const char *file, int linenr, const char *func,
         QDECREF(qerror);
         break;
     case ERR_SINK_MONITOR:
-        assert(qemu_error_sink->mon->error == NULL);
-        qemu_error_sink->mon->error = qerror;
+        /* report only the first error */
+        if (!qemu_error_sink->mon->error) {
+            qemu_error_sink->mon->error = qerror;
+        } else {
+            /* XXX: warn the programmer */
+            QDECREF(qerror);
+        }
         break;
     }
 }
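
For readers who want to try the behaviour outside of QEMU, the following is a minimal stand-alone sketch of the same first-error-wins logic. It does not use the real QError or Monitor types: SketchError, SketchMonitor and the sketch_* functions are hypothetical stand-ins, and the dropped counter is an addition for illustration that the patch does not have.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a reference-counted QError. */
typedef struct SketchError {
    int refcnt;
    char desc[64];
} SketchError;

/* Hypothetical stand-in for the monitor's error slot. */
typedef struct SketchMonitor {
    SketchError *error;   /* first error reported, or NULL */
    int dropped;          /* additional errors that were released */
} SketchMonitor;

static SketchError *sketch_error_new(const char *desc)
{
    SketchError *err = calloc(1, sizeof(*err));

    assert(err != NULL);
    err->refcnt = 1;
    /* calloc() zeroed the buffer, so the copy stays NUL-terminated. */
    strncpy(err->desc, desc, sizeof(err->desc) - 1);
    return err;
}

/* Plays the role of QDECREF(): free the object on the last reference. */
static void sketch_error_unref(SketchError *err)
{
    if (err && --err->refcnt == 0) {
        free(err);
    }
}

/* Mirrors the patched ERR_SINK_MONITOR branch: keep only the first error. */
static void sketch_report(SketchMonitor *mon, SketchError *err)
{
    assert(mon != NULL);
    if (!mon->error) {
        mon->error = err;
    } else {
        mon->dropped++;
        sketch_error_unref(err);   /* release instead of leaking */
    }
}
```

Reporting "UndefinedError" followed by "DeviceNotFound" leaves "UndefinedError" in mon.error and releases the second object, which is also the displacement effect raised in the comments: whichever error arrives first is the one the client sees.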