Message ID | 46e453b90b9822456f6ffb97f9c03ccbf36a0214.1386203851.git.peter.crosthwaite@xilinx.com |
---|---|
State | New |
Headers | show |
Peter Crosthwaite <peter.crosthwaite@xilinx.com> writes: > Add a special Error * that can be passed to error handling APIs to > signal that any errors are fatal and should abort QEMU. There are two > advantages to this: > > - allows for brevity when wishing to assert success of Error ** > accepting APIs. No need for this pattern: > Error * local_err = NULL; > api_call(foo, bar, &local_err); > assert_no_error(local_err); > This also removes the need for _nofail variants of APIs with > asserting call sites now reduced to 1LOC. > - SIGABRT happens from within the offending API. When a fatal error > occurs in an API call (when the caller is asserting sucess) failure > often means the API itself is broken. With the abort happening in the > API call now, the stack frames into the call are available at debug > time. In the assert_no_error scheme the abort happens after the fact. > > The exact semantic is that when an error is raised, if the argument > Error ** matches &error_abort, then the abort occurs immediately. The > error messaged is reported. > > For error_propagate, if the destination error is &error_abort, then > the abort happens at propagation time. > > Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com> > --- > changed since v1: > Delayed assertions that *errp == NULL. Care to explain why you want to delay these assertions? I'm not sure I get it... [...] > @@ -31,7 +33,6 @@ void error_set(Error **errp, ErrorClass err_class, const char *fmt, ...) > if (errp == NULL) { > return; > } > - assert(*errp == NULL); > > err = g_malloc0(sizeof(*err)); > > @@ -40,6 +41,12 @@ void error_set(Error **errp, ErrorClass err_class, const char *fmt, ...) > va_end(ap); > err->err_class = err_class; > > + if (errp == &error_abort) { > + error_report("%s", error_get_pretty(err)); > + abort(); > + } > + > + assert(*errp == NULL); > *errp = err; > } > [...]
On 12/05/2013 03:13 AM, Markus Armbruster wrote: >> >> For error_propagate, if the destination error is &error_abort, then >> the abort happens at propagation time. >> >> Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com> >> --- >> changed since v1: >> Delayed assertions that *errp == NULL. > > Care to explain why you want to delay these assertions? I'm not sure I > get it... error_abort as a global variable is always NULL. > > [...] >> @@ -31,7 +33,6 @@ void error_set(Error **errp, ErrorClass err_class, const char *fmt, ...) >> if (errp == NULL) { >> return; >> } >> - assert(*errp == NULL); So *&error_abort is null and this assertion would fire, unless we delay the check for NULL... >> >> err = g_malloc0(sizeof(*err)); >> >> @@ -40,6 +41,12 @@ void error_set(Error **errp, ErrorClass err_class, const char *fmt, ...) >> va_end(ap); >> err->err_class = err_class; >> >> + if (errp == &error_abort) { >> + error_report("%s", error_get_pretty(err)); >> + abort(); >> + } >> + >> + assert(*errp == NULL); ...until after the check for &error_abort.
Eric Blake <eblake@redhat.com> writes: > On 12/05/2013 03:13 AM, Markus Armbruster wrote: > >>> >>> For error_propagate, if the destination error is &error_abort, then >>> the abort happens at propagation time. >>> >>> Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com> >>> --- >>> changed since v1: >>> Delayed assertions that *errp == NULL. >> >> Care to explain why you want to delay these assertions? I'm not sure I >> get it... > > error_abort as a global variable is always NULL. > >> >> [...] >>> @@ -31,7 +33,6 @@ void error_set(Error **errp, ErrorClass err_class, const char *fmt, ...) >>> if (errp == NULL) { >>> return; >>> } >>> - assert(*errp == NULL); > > So *&error_abort is null and this assertion would fire, unless we delay > the check for NULL... Err, one of us is confused :) When errp == &error_abort, then *errp should be null. If it isn't, then something got stored in error_abort, which is quite wrong. Leaving the assertion where it is catches that. >>> >>> err = g_malloc0(sizeof(*err)); >>> >>> @@ -40,6 +41,12 @@ void error_set(Error **errp, ErrorClass >>> err_class, const char *fmt, ...) >>> va_end(ap); >>> err->err_class = err_class; >>> >>> + if (errp == &error_abort) { >>> + error_report("%s", error_get_pretty(err)); >>> + abort(); >>> + } >>> + >>> + assert(*errp == NULL); > > ...until after the check for &error_abort.
On Fri, Dec 6, 2013 at 9:59 PM, Markus Armbruster <armbru@redhat.com> wrote: > Eric Blake <eblake@redhat.com> writes: > >> On 12/05/2013 03:13 AM, Markus Armbruster wrote: >> >>>> >>>> For error_propagate, if the destination error is &error_abort, then >>>> the abort happens at propagation time. >>>> >>>> Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com> >>>> --- >>>> changed since v1: >>>> Delayed assertions that *errp == NULL. >>> >>> Care to explain why you want to delay these assertions? I'm not sure I >>> get it... >> >> error_abort as a global variable is always NULL. >> >>> >>> [...] >>>> @@ -31,7 +33,6 @@ void error_set(Error **errp, ErrorClass err_class, const char *fmt, ...) >>>> if (errp == NULL) { >>>> return; >>>> } >>>> - assert(*errp == NULL); >> >> So *&error_abort is null and this assertion would fire, unless we delay >> the check for NULL... The assertion evaluates to true so there is no issue leaving it where it is. I am perhaps being overly defensive ... > > Err, one of us is confused :) > > When errp == &error_abort, then *errp should be null. If it isn't, then > something got stored in error_abort, which is quite wrong. Leaving the > assertion where it is catches that. > In a rather obscure way. Following on from the "what happens if someone overwrites error_abort" discussion, I was going for the "limp on" approach when someone stores to error abort. My thinking is that error abort is potentially corruptable, given it's whole reason to be is to trap fatally broken code. Hardening it against a bad pointer deref leading up to the fatal error it actually traps may make sense. Although I'm probably chasing ghosts there. Happy to revert however depending on consensus. Both v1 and v2 are of equivalent functionality in the 99.99% case and I have no strong opinion one way or the other. Votes welcome :) Regards, Peter >>>> >>>> err = g_malloc0(sizeof(*err)); >>>> >>>> @@ -40,6 +41,12 @@ void error_set(Error **errp, ErrorClass >>>> err_class, const char *fmt, ...) >>>> va_end(ap); >>>> err->err_class = err_class; >>>> >>>> + if (errp == &error_abort) { >>>> + error_report("%s", error_get_pretty(err)); >>>> + abort(); >>>> + } >>>> + >>>> + assert(*errp == NULL); >> >> ...until after the check for &error_abort. >
Peter Crosthwaite <peter.crosthwaite@xilinx.com> writes: > On Fri, Dec 6, 2013 at 9:59 PM, Markus Armbruster <armbru@redhat.com> wrote: >> Eric Blake <eblake@redhat.com> writes: >> >>> On 12/05/2013 03:13 AM, Markus Armbruster wrote: >>> >>>>> >>>>> For error_propagate, if the destination error is &error_abort, then >>>>> the abort happens at propagation time. >>>>> >>>>> Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com> >>>>> --- >>>>> changed since v1: >>>>> Delayed assertions that *errp == NULL. >>>> >>>> Care to explain why you want to delay these assertions? I'm not sure I >>>> get it... >>> >>> error_abort as a global variable is always NULL. >>> >>>> >>>> [...] >>>>> @@ -31,7 +33,6 @@ void error_set(Error **errp, ErrorClass err_class, const char *fmt, ...) >>>>> if (errp == NULL) { >>>>> return; >>>>> } >>>>> - assert(*errp == NULL); >>> >>> So *&error_abort is null and this assertion would fire, unless we delay >>> the check for NULL... > > The assertion evaluates to true so there is no issue leaving it where > it is. I am perhaps being overly defensive ... > >> >> Err, one of us is confused :) >> >> When errp == &error_abort, then *errp should be null. If it isn't, then >> something got stored in error_abort, which is quite wrong. Leaving the >> assertion where it is catches that. >> > > In a rather obscure way. Following on from the "what happens if > someone overwrites error_abort" discussion, I was going for the "limp > on" approach when someone stores to error abort. My thinking is that > error abort is potentially corruptable, given it's whole reason to be > is to trap fatally broken code. Hardening it against a bad pointer > deref leading up to the fatal error it actually traps may make sense. > Although I'm probably chasing ghosts there. Happy to revert however > depending on consensus. Both v1 and v2 are of equivalent functionality > in the 99.99% case and I have no strong opinion one way or the other. > > Votes welcome :) I'd leave the assertion where it is now.
diff --git a/include/qapi/error.h b/include/qapi/error.h index 7d4c696..c0f0c3b 100644 --- a/include/qapi/error.h +++ b/include/qapi/error.h @@ -95,4 +95,10 @@ void error_propagate(Error **dst_err, Error *local_err); */ void error_free(Error *err); +/** + * If passed to error_set and friends, abort(). + */ + +extern Error *error_abort; + #endif diff --git a/util/error.c b/util/error.c index ec0faa6..129bc1e 100644 --- a/util/error.c +++ b/util/error.c @@ -23,6 +23,8 @@ struct Error ErrorClass err_class; }; +Error *error_abort; + void error_set(Error **errp, ErrorClass err_class, const char *fmt, ...) { Error *err; @@ -31,7 +33,6 @@ void error_set(Error **errp, ErrorClass err_class, const char *fmt, ...) if (errp == NULL) { return; } - assert(*errp == NULL); err = g_malloc0(sizeof(*err)); @@ -40,6 +41,12 @@ void error_set(Error **errp, ErrorClass err_class, const char *fmt, ...) va_end(ap); err->err_class = err_class; + if (errp == &error_abort) { + error_report("%s", error_get_pretty(err)); + abort(); + } + + assert(*errp == NULL); *errp = err; } @@ -53,7 +60,6 @@ void error_set_errno(Error **errp, int os_errno, ErrorClass err_class, if (errp == NULL) { return; } - assert(*errp == NULL); err = g_malloc0(sizeof(*err)); @@ -68,6 +74,12 @@ void error_set_errno(Error **errp, int os_errno, ErrorClass err_class, va_end(ap); err->err_class = err_class; + if (errp == &error_abort) { + error_report("%s", error_get_pretty(err)); + abort(); + } + + assert(*errp == NULL); *errp = err; } @@ -88,7 +100,6 @@ void error_set_win32(Error **errp, int win32_err, ErrorClass err_class, if (errp == NULL) { return; } - assert(*errp == NULL); err = g_malloc0(sizeof(*err)); @@ -106,6 +117,12 @@ void error_set_win32(Error **errp, int win32_err, ErrorClass err_class, va_end(ap); err->err_class = err_class; + if (errp == &error_abort) { + error_report("%s", error_get_pretty(err)); + abort(); + } + + assert(*errp == NULL); *errp = err; } @@ -147,7 +164,10 @@ void error_free(Error *err) void error_propagate(Error **dst_err, Error *local_err) { - if (dst_err && !*dst_err) { + if (local_err && dst_err == &error_abort) { + error_report("%s", error_get_pretty(local_err)); + abort(); + } else if (dst_err && !*dst_err) { *dst_err = local_err; } else if (local_err) { error_free(local_err);
Add a special Error * that can be passed to error handling APIs to signal that any errors are fatal and should abort QEMU. There are two advantages to this: - allows for brevity when wishing to assert success of Error ** accepting APIs. No need for this pattern: Error * local_err = NULL; api_call(foo, bar, &local_err); assert_no_error(local_err); This also removes the need for _nofail variants of APIs with asserting call sites now reduced to 1LOC. - SIGABRT happens from within the offending API. When a fatal error occurs in an API call (when the caller is asserting sucess) failure often means the API itself is broken. With the abort happening in the API call now, the stack frames into the call are available at debug time. In the assert_no_error scheme the abort happens after the fact. The exact semantic is that when an error is raised, if the argument Error ** matches &error_abort, then the abort occurs immediately. The error messaged is reported. For error_propagate, if the destination error is &error_abort, then the abort happens at propagation time. Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com> --- changed since v1: Delayed assertions that *errp == NULL. include/qapi/error.h | 6 ++++++ util/error.c | 28 ++++++++++++++++++++++++---- 2 files changed, 30 insertions(+), 4 deletions(-)