diff mbox

[v2,1/6] error: Add error_abort

Message ID 46e453b90b9822456f6ffb97f9c03ccbf36a0214.1386203851.git.peter.crosthwaite@xilinx.com
State New
Headers show

Commit Message

Peter Crosthwaite Dec. 5, 2013, 12:48 a.m. UTC
Add a special Error * that can be passed to error handling APIs to
signal that any errors are fatal and should abort QEMU. There are two
advantages to this:

- allows for brevity when wishing to assert success of Error **
  accepting APIs. No need for this pattern:
        Error * local_err = NULL;
        api_call(foo, bar, &local_err);
        assert_no_error(local_err);
  This also removes the need for _nofail variants of APIs with
  asserting call sites now reduced to 1LOC.
- SIGABRT happens from within the offending API. When a fatal error
  occurs in an API call (when the caller is asserting sucess) failure
  often means the API itself is broken. With the abort happening in the
  API call now, the stack frames into the call are available at debug
  time. In the assert_no_error scheme the abort happens after the fact.

The exact semantic is that when an error is raised, if the argument
Error ** matches &error_abort, then the abort occurs immediately. The
error messaged is reported.

For error_propagate, if the destination error is &error_abort, then
the abort happens at propagation time.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
---
changed since v1:
Delayed assertions that *errp == NULL.

 include/qapi/error.h |  6 ++++++
 util/error.c         | 28 ++++++++++++++++++++++++----
 2 files changed, 30 insertions(+), 4 deletions(-)

Comments

Markus Armbruster Dec. 5, 2013, 10:13 a.m. UTC | #1
Peter Crosthwaite <peter.crosthwaite@xilinx.com> writes:

> Add a special Error * that can be passed to error handling APIs to
> signal that any errors are fatal and should abort QEMU. There are two
> advantages to this:
>
> - allows for brevity when wishing to assert success of Error **
>   accepting APIs. No need for this pattern:
>         Error * local_err = NULL;
>         api_call(foo, bar, &local_err);
>         assert_no_error(local_err);
>   This also removes the need for _nofail variants of APIs with
>   asserting call sites now reduced to 1LOC.
> - SIGABRT happens from within the offending API. When a fatal error
>   occurs in an API call (when the caller is asserting sucess) failure
>   often means the API itself is broken. With the abort happening in the
>   API call now, the stack frames into the call are available at debug
>   time. In the assert_no_error scheme the abort happens after the fact.
>
> The exact semantic is that when an error is raised, if the argument
> Error ** matches &error_abort, then the abort occurs immediately. The
> error messaged is reported.
>
> For error_propagate, if the destination error is &error_abort, then
> the abort happens at propagation time.
>
> Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
> ---
> changed since v1:
> Delayed assertions that *errp == NULL.

Care to explain why you want to delay these assertions?  I'm not sure I
get it...

[...]
> @@ -31,7 +33,6 @@ void error_set(Error **errp, ErrorClass err_class, const char *fmt, ...)
>      if (errp == NULL) {
>          return;
>      }
> -    assert(*errp == NULL);
>  
>      err = g_malloc0(sizeof(*err));
>  
> @@ -40,6 +41,12 @@ void error_set(Error **errp, ErrorClass err_class, const char *fmt, ...)
>      va_end(ap);
>      err->err_class = err_class;
>  
> +    if (errp == &error_abort) {
> +        error_report("%s", error_get_pretty(err));
> +        abort();
> +    }
> +
> +    assert(*errp == NULL);
>      *errp = err;
>  }
>  
[...]
Eric Blake Dec. 5, 2013, 1:21 p.m. UTC | #2
On 12/05/2013 03:13 AM, Markus Armbruster wrote:

>>
>> For error_propagate, if the destination error is &error_abort, then
>> the abort happens at propagation time.
>>
>> Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
>> ---
>> changed since v1:
>> Delayed assertions that *errp == NULL.
> 
> Care to explain why you want to delay these assertions?  I'm not sure I
> get it...

error_abort as a global variable is always NULL.

> 
> [...]
>> @@ -31,7 +33,6 @@ void error_set(Error **errp, ErrorClass err_class, const char *fmt, ...)
>>      if (errp == NULL) {
>>          return;
>>      }
>> -    assert(*errp == NULL);

So *&error_abort is null and this assertion would fire, unless we delay
the check for NULL...

>>  
>>      err = g_malloc0(sizeof(*err));
>>  
>> @@ -40,6 +41,12 @@ void error_set(Error **errp, ErrorClass err_class, const char *fmt, ...)
>>      va_end(ap);
>>      err->err_class = err_class;
>>  
>> +    if (errp == &error_abort) {
>> +        error_report("%s", error_get_pretty(err));
>> +        abort();
>> +    }
>> +
>> +    assert(*errp == NULL);

...until after the check for &error_abort.
Markus Armbruster Dec. 6, 2013, 11:59 a.m. UTC | #3
Eric Blake <eblake@redhat.com> writes:

> On 12/05/2013 03:13 AM, Markus Armbruster wrote:
>
>>>
>>> For error_propagate, if the destination error is &error_abort, then
>>> the abort happens at propagation time.
>>>
>>> Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
>>> ---
>>> changed since v1:
>>> Delayed assertions that *errp == NULL.
>> 
>> Care to explain why you want to delay these assertions?  I'm not sure I
>> get it...
>
> error_abort as a global variable is always NULL.
>
>> 
>> [...]
>>> @@ -31,7 +33,6 @@ void error_set(Error **errp, ErrorClass err_class, const char *fmt, ...)
>>>      if (errp == NULL) {
>>>          return;
>>>      }
>>> -    assert(*errp == NULL);
>
> So *&error_abort is null and this assertion would fire, unless we delay
> the check for NULL...

Err, one of us is confused :)

When errp == &error_abort, then *errp should be null.  If it isn't, then
something got stored in error_abort, which is quite wrong.  Leaving the
assertion where it is catches that.

>>>  
>>>      err = g_malloc0(sizeof(*err));
>>>  
>>> @@ -40,6 +41,12 @@ void error_set(Error **errp, ErrorClass
>>> err_class, const char *fmt, ...)
>>>      va_end(ap);
>>>      err->err_class = err_class;
>>>  
>>> +    if (errp == &error_abort) {
>>> +        error_report("%s", error_get_pretty(err));
>>> +        abort();
>>> +    }
>>> +
>>> +    assert(*errp == NULL);
>
> ...until after the check for &error_abort.
Peter Crosthwaite Dec. 6, 2013, 12:43 p.m. UTC | #4
On Fri, Dec 6, 2013 at 9:59 PM, Markus Armbruster <armbru@redhat.com> wrote:
> Eric Blake <eblake@redhat.com> writes:
>
>> On 12/05/2013 03:13 AM, Markus Armbruster wrote:
>>
>>>>
>>>> For error_propagate, if the destination error is &error_abort, then
>>>> the abort happens at propagation time.
>>>>
>>>> Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
>>>> ---
>>>> changed since v1:
>>>> Delayed assertions that *errp == NULL.
>>>
>>> Care to explain why you want to delay these assertions?  I'm not sure I
>>> get it...
>>
>> error_abort as a global variable is always NULL.
>>
>>>
>>> [...]
>>>> @@ -31,7 +33,6 @@ void error_set(Error **errp, ErrorClass err_class, const char *fmt, ...)
>>>>      if (errp == NULL) {
>>>>          return;
>>>>      }
>>>> -    assert(*errp == NULL);
>>
>> So *&error_abort is null and this assertion would fire, unless we delay
>> the check for NULL...

The assertion evaluates to true so there is no issue leaving it where
it is. I am perhaps being overly defensive ...

>
> Err, one of us is confused :)
>
> When errp == &error_abort, then *errp should be null.  If it isn't, then
> something got stored in error_abort, which is quite wrong.  Leaving the
> assertion where it is catches that.
>

In a rather obscure way. Following on from the "what happens if
someone overwrites error_abort" discussion, I was going for the "limp
on" approach when someone stores to error abort. My thinking is that
error abort is potentially corruptable, given it's whole reason to be
is to trap fatally broken code. Hardening it against a bad pointer
deref leading up to the fatal error it actually traps may make sense.
Although I'm probably chasing ghosts there. Happy to revert however
depending on consensus. Both v1 and v2 are of equivalent functionality
in the 99.99% case and I have no strong opinion one way or the other.

Votes welcome :)

Regards,
Peter

>>>>
>>>>      err = g_malloc0(sizeof(*err));
>>>>
>>>> @@ -40,6 +41,12 @@ void error_set(Error **errp, ErrorClass
>>>> err_class, const char *fmt, ...)
>>>>      va_end(ap);
>>>>      err->err_class = err_class;
>>>>
>>>> +    if (errp == &error_abort) {
>>>> +        error_report("%s", error_get_pretty(err));
>>>> +        abort();
>>>> +    }
>>>> +
>>>> +    assert(*errp == NULL);
>>
>> ...until after the check for &error_abort.
>
Markus Armbruster Dec. 6, 2013, 1:16 p.m. UTC | #5
Peter Crosthwaite <peter.crosthwaite@xilinx.com> writes:

> On Fri, Dec 6, 2013 at 9:59 PM, Markus Armbruster <armbru@redhat.com> wrote:
>> Eric Blake <eblake@redhat.com> writes:
>>
>>> On 12/05/2013 03:13 AM, Markus Armbruster wrote:
>>>
>>>>>
>>>>> For error_propagate, if the destination error is &error_abort, then
>>>>> the abort happens at propagation time.
>>>>>
>>>>> Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
>>>>> ---
>>>>> changed since v1:
>>>>> Delayed assertions that *errp == NULL.
>>>>
>>>> Care to explain why you want to delay these assertions?  I'm not sure I
>>>> get it...
>>>
>>> error_abort as a global variable is always NULL.
>>>
>>>>
>>>> [...]
>>>>> @@ -31,7 +33,6 @@ void error_set(Error **errp, ErrorClass err_class, const char *fmt, ...)
>>>>>      if (errp == NULL) {
>>>>>          return;
>>>>>      }
>>>>> -    assert(*errp == NULL);
>>>
>>> So *&error_abort is null and this assertion would fire, unless we delay
>>> the check for NULL...
>
> The assertion evaluates to true so there is no issue leaving it where
> it is. I am perhaps being overly defensive ...
>
>>
>> Err, one of us is confused :)
>>
>> When errp == &error_abort, then *errp should be null.  If it isn't, then
>> something got stored in error_abort, which is quite wrong.  Leaving the
>> assertion where it is catches that.
>>
>
> In a rather obscure way. Following on from the "what happens if
> someone overwrites error_abort" discussion, I was going for the "limp
> on" approach when someone stores to error abort. My thinking is that
> error abort is potentially corruptable, given it's whole reason to be
> is to trap fatally broken code. Hardening it against a bad pointer
> deref leading up to the fatal error it actually traps may make sense.
> Although I'm probably chasing ghosts there. Happy to revert however
> depending on consensus. Both v1 and v2 are of equivalent functionality
> in the 99.99% case and I have no strong opinion one way or the other.
>
> Votes welcome :)

I'd leave the assertion where it is now.
diff mbox

Patch

diff --git a/include/qapi/error.h b/include/qapi/error.h
index 7d4c696..c0f0c3b 100644
--- a/include/qapi/error.h
+++ b/include/qapi/error.h
@@ -95,4 +95,10 @@  void error_propagate(Error **dst_err, Error *local_err);
  */
 void error_free(Error *err);
 
+/**
+ * If passed to error_set and friends, abort().
+ */
+
+extern Error *error_abort;
+
 #endif
diff --git a/util/error.c b/util/error.c
index ec0faa6..129bc1e 100644
--- a/util/error.c
+++ b/util/error.c
@@ -23,6 +23,8 @@  struct Error
     ErrorClass err_class;
 };
 
+Error *error_abort;
+
 void error_set(Error **errp, ErrorClass err_class, const char *fmt, ...)
 {
     Error *err;
@@ -31,7 +33,6 @@  void error_set(Error **errp, ErrorClass err_class, const char *fmt, ...)
     if (errp == NULL) {
         return;
     }
-    assert(*errp == NULL);
 
     err = g_malloc0(sizeof(*err));
 
@@ -40,6 +41,12 @@  void error_set(Error **errp, ErrorClass err_class, const char *fmt, ...)
     va_end(ap);
     err->err_class = err_class;
 
+    if (errp == &error_abort) {
+        error_report("%s", error_get_pretty(err));
+        abort();
+    }
+
+    assert(*errp == NULL);
     *errp = err;
 }
 
@@ -53,7 +60,6 @@  void error_set_errno(Error **errp, int os_errno, ErrorClass err_class,
     if (errp == NULL) {
         return;
     }
-    assert(*errp == NULL);
 
     err = g_malloc0(sizeof(*err));
 
@@ -68,6 +74,12 @@  void error_set_errno(Error **errp, int os_errno, ErrorClass err_class,
     va_end(ap);
     err->err_class = err_class;
 
+    if (errp == &error_abort) {
+        error_report("%s", error_get_pretty(err));
+        abort();
+    }
+
+    assert(*errp == NULL);
     *errp = err;
 }
 
@@ -88,7 +100,6 @@  void error_set_win32(Error **errp, int win32_err, ErrorClass err_class,
     if (errp == NULL) {
         return;
     }
-    assert(*errp == NULL);
 
     err = g_malloc0(sizeof(*err));
 
@@ -106,6 +117,12 @@  void error_set_win32(Error **errp, int win32_err, ErrorClass err_class,
     va_end(ap);
     err->err_class = err_class;
 
+    if (errp == &error_abort) {
+        error_report("%s", error_get_pretty(err));
+        abort();
+    }
+
+    assert(*errp == NULL);
     *errp = err;
 }
 
@@ -147,7 +164,10 @@  void error_free(Error *err)
 
 void error_propagate(Error **dst_err, Error *local_err)
 {
-    if (dst_err && !*dst_err) {
+    if (local_err && dst_err == &error_abort) {
+        error_report("%s", error_get_pretty(local_err));
+        abort();
+    } else if (dst_err && !*dst_err) {
         *dst_err = local_err;
     } else if (local_err) {
         error_free(local_err);