[v6,1/5] util: Introduce error reporting functions with fatal/abort

Message ID 87io26ulip.fsf@blackfin.pond.sub.org
State New

Commit Message

Markus Armbruster Feb. 3, 2016, 10:38 a.m. UTC
Thomas Huth <thuth@redhat.com> writes:

> On 03.02.2016 10:48, Markus Armbruster wrote:
>> David Gibson <david@gibson.dropbear.id.au> writes:
>> 
>>> On Tue, Feb 02, 2016 at 10:47:35PM +0100, Thomas Huth wrote:
>>>> On 02.02.2016 19:53, Markus Armbruster wrote:
>>>>> Lluís Vilanova <vilanova@ac.upc.edu> writes:
>>>> ...
>>>>
>>>>>> diff --git a/include/qemu/error-report.h b/include/qemu/error-report.h
>>>>>> index 7ab2355..6c2f142 100644
>>>>>> --- a/include/qemu/error-report.h
>>>>>> +++ b/include/qemu/error-report.h
>>>>>> @@ -43,4 +43,23 @@ void error_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
>>>>>>  const char *error_get_progname(void);
>>>>>>  extern bool enable_timestamp_msg;
>>>>>>  
>>>>>> +/* Report message and exit with error */
>>>>>> +void QEMU_NORETURN error_vreport_fatal(const char *fmt, va_list ap) GCC_FMT_ATTR(1, 0);
>>>>>> +void QEMU_NORETURN error_report_fatal(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
>>>>>
>>>>> This lets people write things like
>>>>>
>>>>>     error_report_fatal("The sky is falling");
>>>>>
>>>>> instead of
>>>>>
>>>>>     error_report("The sky is falling");
>>>>>     exit(1);
>>>>>
>>>>> or
>>>>>
>>>>>     fprintf(stderr, "The sky is falling\n");
>>>>>     exit(1);
>>>>>
>>>>> I don't think that's an improvement in clarity.
>>>>
>>>> The problem is not the existing code, but that in a couple of new
>>>> patches, I've now already seen that people are trying to use
>>>>
>>>>      error_setg(&error_fatal, ... );
>>>
>>> So, I don't actually see any real advantage to error_report_fatal(...)
>>> over error_setg(&error_fatal, ...).
>> 
>> I do.  Compare:
>> 
>> (a) error_report(...);
>>     exit(1);
>> 
>> (b) error_report_fatal(...);
>> 
>> (c) error_setg(&error_fatal, ...);
>> 
>> In my opinion, (a) is clearest: even a relatively clueless reader will
>> know what exit(1) does, can guess what error_report() approximately
>> does, and doesn't need to know what it does exactly.  (b) is slightly
>> less obvious, and (c) is positively opaque.
>> 
>> Let's stick to the obvious (a) and be done with it.
>
> Ok, (a) is fine for me too, as long as we avoid (c). Lluís, could you
> maybe add that information to your patch that updates the HACKING text?

I feel such detailed advice belongs in error.h.  Sketch appended.

If that doesn't succeed in keeping (c) out, make checkpatch flag it.

> (and sorry for the fuss with error_report_fatal() ... I thought it would
> be a good solution to avoid (c), but if (a) is preferred instead, then
> we should go with that solution instead).
>
> And, by the way, what about the spots that currently already use
> error_setg(&error_abort, ....) ? Should they be turned into
> error_report() + abort() instead? Or only abort(), without error
> message, since abort() is only about programming errors?

As I wrote in my first reply to this thread, I'd like them to be cleaned
up to just abort() or assert().

I like assert(), because it gives me exactly what I can use to debug the
programming error: a core dump (if enabled) and a source location
(useful when no core dump).  I never bought the argument that we should
use abort() instead of assert(0) because "what if NDEBUG?!?".  If you
define NDEBUG, our 600+ abort()s won't save you from our 4000+
assert()s.
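
To illustrate the NDEBUG point, here is a minimal C sketch, not QEMU code.
The helper name assertions_enabled() is made up for this example.  It relies
only on the standard behaviour that assert() expands to nothing when NDEBUG
is defined before <assert.h> is included.

```c
#include <assert.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns 1 when assert() is compiled in, 0 when NDEBUG removed it.
 */
static int assertions_enabled(void)
{
    int evaluated = 0;

    /*
     * Deliberate side effect inside assert(): the assignment is only
     * executed when assertions are active.  Don't do this in real code.
     */
    assert((evaluated = 1));
    return evaluated;
}
```

Building with -DNDEBUG makes this return 0.  Every assert() in the tree
vanishes the same way, while the abort() calls stay in place.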

Comments

Lluís Vilanova Feb. 3, 2016, 1:42 p.m. UTC | #1
Markus Armbruster writes:

> Thomas Huth <thuth@redhat.com> writes:
>> On 03.02.2016 10:48, Markus Armbruster wrote:
>>> David Gibson <david@gibson.dropbear.id.au> writes:
>>> 
>>>> On Tue, Feb 02, 2016 at 10:47:35PM +0100, Thomas Huth wrote:
>>>>> On 02.02.2016 19:53, Markus Armbruster wrote:
>>>>>> Lluís Vilanova <vilanova@ac.upc.edu> writes:
>>>>> ...
>>>>> 
>>>>>>> diff --git a/include/qemu/error-report.h b/include/qemu/error-report.h
>>>>>>> index 7ab2355..6c2f142 100644
>>>>>>> --- a/include/qemu/error-report.h
>>>>>>> +++ b/include/qemu/error-report.h
>>>>>>> @@ -43,4 +43,23 @@ void error_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
>>>>>>> const char *error_get_progname(void);
>>>>>>> extern bool enable_timestamp_msg;
>>>>>>> 
>>>>>>> +/* Report message and exit with error */
>>>>>>> +void QEMU_NORETURN error_vreport_fatal(const char *fmt, va_list ap) GCC_FMT_ATTR(1, 0);
>>>>>>> +void QEMU_NORETURN error_report_fatal(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
>>>>>> 
>>>>>> This lets people write things like
>>>>>> 
>>>>>> error_report_fatal("The sky is falling");
>>>>>> 
>>>>>> instead of
>>>>>> 
>>>>>> error_report("The sky is falling");
>>>>>> exit(1);
>>>>>> 
>>>>>> or
>>>>>> 
>>>>>> fprintf(stderr, "The sky is falling\n");
>>>>>> exit(1);
>>>>>> 
>>>>>> I don't think that's an improvement in clarity.
>>>>> 
>>>>> The problem is not the existing code, but that in a couple of new
>>>>> patches, I've now already seen that people are trying to use
>>>>> 
>>>>> error_setg(&error_fatal, ... );
>>>> 
>>>> So, I don't actually see any real advantage to error_report_fatal(...)
>>>> over error_setg(&error_fatal, ...).
>>> 
>>> I do.  Compare:
>>> 
>>> (a) error_report(...);
>>> exit(1);
>>> 
>>> (b) error_report_fatal(...);
>>> 
>>> (c) error_setg(&error_fatal, ...);
>>> 
>>> In my opinion, (a) is clearest: even a relatively clueless reader will
>>> know what exit(1) does, can guess what error_report() approximately
>>> does, and doesn't need to know what it does exactly.  (b) is slightly
>>> less obvious, and (c) is positively opaque.
>>> 
>>> Let's stick to the obvious (a) and be done with it.
>> 
>> Ok, (a) is fine for me too, as long as we avoid (c). Lluís, could you
>> maybe add that information to your patch that updates the HACKING text?

> I feel such detailed advice belongs in error.h.  Sketch appended.

> If that doesn't succeed in keeping (c) out, make checkpatch flag it.

>> (and sorry for the fuss with error_report_fatal() ... I thought it would
>> be a good solution to avoid (c), but if (a) is preferred instead, then
>> we should go with that solution instead).

I can easily change that, no problem. I'm just happy consensus is landing on
this subject.


>> And, by the way, what about the spots that currently already use
>> error_setg(&error_abort, ....) ? Should they be turned into
>> error_report() + abort() instead? Or only abort(), without error
>> message, since abort() is only about programming errors?

> As I wrote in my first reply to this thread, I'd like them to be cleaned
> up to just abort() or assert().

> I like assert(), because it gives me exactly what I can use to debug the
> programming error: a core dump (if enabled) and a source location
> (useful when no core dump).  I never bought the argument that we should
> use abort() instead of assert(0) because "what if NDEBUG?!?".  If you
> define NDEBUG, our 600+ abort()s won't save you from our 4000+
> assert()s.

Sorry, but I don't buy the argument of, "I prefer assert() because there's
already lots of them". To me, there's a semantic difference between debug builds
and regular ones (aka, assert vs abort). Also, I think it adds to the confusion
that assert and abort seem to be used interchangeably in the code.

What about this definition?

* exit(): user-triggered errors
* abort(): general programming errors
* assert(): additional sanity/consistency checks against programming errors

Now, abort & assert have an overlap. Should we discourage one in favour of the
other?

Also:

* error_report_fatal ensures the same exit code is always used (otherwise it can
  fail with inconsistent error codes)
* error_report_abort brings the code information of assert into abort

But of course, I'm happy either way :)


> diff --git a/include/qapi/error.h b/include/qapi/error.h
> index 45d6c72..ea7e74f 100644
> --- a/include/qapi/error.h
> +++ b/include/qapi/error.h
> @@ -162,6 +162,9 @@ ErrorClass error_get_class(const Error *err);
>   * human-readable error message is made from printf-style @fmt, ...
>   * The resulting message should be a single phrase, with no newline or
>   * trailing punctuation.
> + * Please don't error_setg(&error_fatal, ...), use error_report() and
> + * exit(), because that's more obvious.
> + * Likewise, don't error_setg(&error_abort, ...), use assert().
>   */
>  #define error_setg(errp, fmt, ...)                              \
>      error_setg_internal((errp), __FILE__, __LINE__, __func__,   \
> @@ -213,6 +216,8 @@ void error_setg_win32_internal(Error **errp,
>   * the error object.
>   * Else, move the error object from @local_err to *@dst_errp.
>   * On return, @local_err is invalid.
> + * Please don't error_propagate(&error_fatal, ...), use
> + * error_report_err() and exit(), because that's more obvious.
>   */
>  void error_propagate(Error **dst_errp, Error *local_err);
 
> @@ -291,12 +296,14 @@ void error_set_internal(Error **errp,
>      GCC_FMT_ATTR(6, 7);
 
>  /*
> - * Pass to error_setg() & friends to abort() on error.
> + * Special error destination to abort on error.
> + * See error_setg() and error_propagate() for details.
>   */
>  extern Error *error_abort;
 
>  /*
> - * Pass to error_setg() & friends to exit(1) on error.
> + * Special error destination to exit(1) on error.
> + * See error_setg() and error_propagate() for details.
>   */
>  extern Error *error_fatal;
 
I see, this will make it clearer for people looking for functions without
reading HACKING. I can add this and reference it from the document.


Thanks,
  Lluis
Markus Armbruster Feb. 3, 2016, 2:34 p.m. UTC | #2
Lluís Vilanova <vilanova@ac.upc.edu> writes:

> Markus Armbruster writes:
>
>> Thomas Huth <thuth@redhat.com> writes:
>>> On 03.02.2016 10:48, Markus Armbruster wrote:
>>>> David Gibson <david@gibson.dropbear.id.au> writes:
>>>> 
>>>>> On Tue, Feb 02, 2016 at 10:47:35PM +0100, Thomas Huth wrote:
>>>>>> On 02.02.2016 19:53, Markus Armbruster wrote:
>>>>>>> Lluís Vilanova <vilanova@ac.upc.edu> writes:
>>>>>> ...
>>>>>> 
>>>>>>>> diff --git a/include/qemu/error-report.h b/include/qemu/error-report.h
>>>>>>>> index 7ab2355..6c2f142 100644
>>>>>>>> --- a/include/qemu/error-report.h
>>>>>>>> +++ b/include/qemu/error-report.h
>>>>>>>> @@ -43,4 +43,23 @@ void error_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
>>>>>>>> const char *error_get_progname(void);
>>>>>>>> extern bool enable_timestamp_msg;
>>>>>>>> 
>>>>>>>> +/* Report message and exit with error */
>>>>>>>> +void QEMU_NORETURN error_vreport_fatal(const char *fmt, va_list ap) GCC_FMT_ATTR(1, 0);
>>>>>>>> +void QEMU_NORETURN error_report_fatal(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
>>>>>>> 
>>>>>>> This lets people write things like
>>>>>>> 
>>>>>>> error_report_fatal("The sky is falling");
>>>>>>> 
>>>>>>> instead of
>>>>>>> 
>>>>>>> error_report("The sky is falling");
>>>>>>> exit(1);
>>>>>>> 
>>>>>>> or
>>>>>>> 
>>>>>>> fprintf(stderr, "The sky is falling\n");
>>>>>>> exit(1);
>>>>>>> 
>>>>>>> I don't think that's an improvement in clarity.
>>>>>> 
>>>>>> The problem is not the existing code, but that in a couple of new
>>>>>> patches, I've now already seen that people are trying to use
>>>>>> 
>>>>>> error_setg(&error_fatal, ... );
>>>>> 
>>>>> So, I don't actually see any real advantage to error_report_fatal(...)
>>>>> over error_setg(&error_fatal, ...).
>>>> 
>>>> I do.  Compare:
>>>> 
>>>> (a) error_report(...);
>>>> exit(1);
>>>> 
>>>> (b) error_report_fatal(...);
>>>> 
>>>> (c) error_setg(&error_fatal, ...);
>>>> 
>>>> In my opinion, (a) is clearest: even a relatively clueless reader will
>>>> know what exit(1) does, can guess what error_report() approximately
>>>> does, and doesn't need to know what it does exactly.  (b) is slightly
>>>> less obvious, and (c) is positively opaque.
>>>> 
>>>> Let's stick to the obvious (a) and be done with it.
>>> 
>>> Ok, (a) is fine for me too, as long as we avoid (c). Lluís, could you
>>> maybe add that information to your patch that updates the HACKING text?
>
>> I feel such detailed advice belongs in error.h.  Sketch appended.
>
>> If that doesn't succeed in keeping (c) out, make checkpatch flag it.
>
>>> (and sorry for the fuss with error_report_fatal() ... I thought it would
>>> be a good solution to avoid (c), but if (a) is preferred instead, then
>>> we should go with that solution instead).
>
> I can easily change that, no problem. I'm just happy consensus is landing on
> this subject.
>
>
>>> And, by the way, what about the spots that currently already use
>>> error_setg(&error_abort, ....) ? Should they be turned into
>>> error_report() + abort() instead? Or only abort(), without error
>>> message, since abort() is only about programming errors?
>
>> As I wrote in my first reply to this thread, I'd like them to be cleaned
>> up to just abort() or assert().
>
>> I like assert(), because it gives me exactly what I can use to debug the
>> programming error: a core dump (if enabled) and a source location
>> (useful when no core dump).  I never bought the argument that we should
>> use abort() instead of assert(0) because "what if NDEBUG?!?".  If you
>> define NDEBUG, our 600+ abort()s won't save you from our 4000+
>> assert()s.
>
> Sorry, but I don't buy the argument of, "I prefer assert() because there's
> already lots of them". To me, there's a semantic difference between debug builds
> and regular ones (aka, assert vs abort).

That's not what I said :)

In the past, people have argued in favor of abort() by pointing to
NDEBUG.  I don't buy that argument, but me not buying it is not why I
prefer assert().  I do because it prints additional information that's
occasionally useful.

>                                          Also, I think it adds to the confusion
> that assert and abort seem to be used interchangeably in the code.

For better or worse, we overwhelmingly use abort() instead of assert(0),
but don't use if (!good) abort() instead of assert(good).  Doesn't make
sense to me, but my appetite for tree-wide changes and the debates that
go with them has limits.

> What about this definition?
>
> * exit(): user-triggered errors
> * abort(): general programming errors
> * assert(): additional sanity/consistency checks against programming errors
>
> Now, abort & assert have an overlap. Should we discourage one in favour of the
> other?

I can't see how to decide whether a programming error is "general" or
"additional", or why an "additional" error deserves a message
pointing to source code, but a "general" one does not.

> Also:
>
> * error_report_fatal ensures the same exit code is always used (otherwise it can
>   fail with inconsistent error codes)

What if you *want* to use a different exit code?

But I grant you that we should almost always use exit(1) for fatal
errors.  And in fact we do!  There are a bunch of misguided exit(-1) in
the code, but git-log -S'exit(-1)' finds only half a dozen offending
commits since 2013, and none since 2015, so preventing more seems to be
a mostly solved problem.
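
Why exit(-1) is misguided: on POSIX systems only the low 8 bits of the
status reach the parent, so it shows up as 255.  A small sketch, assuming a
POSIX system with fork() and waitpid(); observed_exit_status() is a name made
up for this example, not something in the tree.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run exit(code) in a child process and return the
 * status the parent observes, or -1 if something went wrong.
 */
static int observed_exit_status(int code)
{
    int wstatus;
    pid_t pid;

    fflush(NULL);               /* avoid duplicating buffered output */
    pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        exit(code);             /* child: terminate with the given code */
    }
    if (waitpid(pid, &wstatus, 0) < 0 || !WIFEXITED(wstatus)) {
        return -1;
    }
    return WEXITSTATUS(wstatus);
}
```

With this, observed_exit_status(1) gives 1, while observed_exit_status(-1)
gives 255.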

> * error_report_abort brings the code information of assert into abort

If you want your crashes to print source location information, don't
reinvent the wheel, just use assert().

&error_abort can't because the interesting spot isn't where we decide to
abort, but where the error got created.
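
A rough sketch of that point, using a stripped-down stand-in for the Error
object.  DemoError, DEMO_ERROR_MAKE() and both functions are hypothetical,
not QEMU's API.  The only thing borrowed from error.h is the idea, visible in
the error_setg() macro, of capturing __FILE__, __LINE__ and __func__ at the
place where the error is created.

```c
#include <stdio.h>

/* Hypothetical, simplified error object: message plus creation site. */
typedef struct DemoError {
    const char *file;
    int line;
    const char *func;
    const char *msg;
} DemoError;

/* The macro expands at the call site, so it records the creator's location. */
#define DEMO_ERROR_MAKE(message) \
    ((DemoError){ __FILE__, __LINE__, __func__, (message) })

/* The error is created here ... */
static DemoError demo_create_error(void)
{
    return DEMO_ERROR_MAKE("the sky is falling");
}

/* ... and can be described anywhere later, still naming the creator. */
static int demo_describe(const DemoError *err, char *buf, size_t size)
{
    return snprintf(buf, size, "%s:%d: %s: %s",
                    err->file, err->line, err->func, err->msg);
}
```

An assert() placed where the abort is finally decided would report that
later spot instead, which is rarely the interesting one.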

> But of course, I'm happy either way :)
>
>
>> diff --git a/include/qapi/error.h b/include/qapi/error.h
>> index 45d6c72..ea7e74f 100644
>> --- a/include/qapi/error.h
>> +++ b/include/qapi/error.h
>> @@ -162,6 +162,9 @@ ErrorClass error_get_class(const Error *err);
>>   * human-readable error message is made from printf-style @fmt, ...
>>   * The resulting message should be a single phrase, with no newline or
>>   * trailing punctuation.
>> + * Please don't error_setg(&error_fatal, ...), use error_report() and
>> + * exit(), because that's more obvious.
>> + * Likewise, don't error_setg(&error_abort, ...), use assert().
>>   */
>>  #define error_setg(errp, fmt, ...)                              \
>>      error_setg_internal((errp), __FILE__, __LINE__, __func__,   \
>> @@ -213,6 +216,8 @@ void error_setg_win32_internal(Error **errp,
>>   * the error object.
>>   * Else, move the error object from @local_err to *@dst_errp.
>>   * On return, @local_err is invalid.
>> + * Please don't error_propagate(&error_fatal, ...), use
>> + * error_report_err() and exit(), because that's more obvious.
>>   */
>>  void error_propagate(Error **dst_errp, Error *local_err);
>  
>> @@ -291,12 +296,14 @@ void error_set_internal(Error **errp,
>>      GCC_FMT_ATTR(6, 7);
>  
>>  /*
>> - * Pass to error_setg() & friends to abort() on error.
>> + * Special error destination to abort on error.
>> + * See error_setg() and error_propagate() for details.
>>   */
>>  extern Error *error_abort;
>  
>>  /*
>> - * Pass to error_setg() & friends to exit(1) on error.
>> + * Special error destination to exit(1) on error.
>> + * See error_setg() and error_propagate() for details.
>>   */
>>  extern Error *error_fatal;
>  
> I see, this will make it clearer for people looking for functions without
> reading HACKING. I can add this and reference it from the document.

If you like, I can post it as a formal patch you can then include in
your series.
Lluís Vilanova Feb. 3, 2016, 3:11 p.m. UTC | #3
Markus Armbruster writes:

> Lluís Vilanova <vilanova@ac.upc.edu> writes:
>> Markus Armbruster writes:
>> 
>>> Thomas Huth <thuth@redhat.com> writes:
>>>> On 03.02.2016 10:48, Markus Armbruster wrote:
>>>>> David Gibson <david@gibson.dropbear.id.au> writes:
>>>>> 
>>>>>> On Tue, Feb 02, 2016 at 10:47:35PM +0100, Thomas Huth wrote:
>>>>>>> On 02.02.2016 19:53, Markus Armbruster wrote:
>>>>>>>> Lluís Vilanova <vilanova@ac.upc.edu> writes:
>>>>>>> ...
>>>>>>> 
>>>>>>>>> diff --git a/include/qemu/error-report.h b/include/qemu/error-report.h
>>>>>>>>> index 7ab2355..6c2f142 100644
>>>>>>>>> --- a/include/qemu/error-report.h
>>>>>>>>> +++ b/include/qemu/error-report.h
>>>>>>>>> @@ -43,4 +43,23 @@ void error_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
>>>>>>>>> const char *error_get_progname(void);
>>>>>>>>> extern bool enable_timestamp_msg;
>>>>>>>>> 
>>>>>>>>> +/* Report message and exit with error */
>>>>>>>>> +void QEMU_NORETURN error_vreport_fatal(const char *fmt, va_list ap) GCC_FMT_ATTR(1, 0);
>>>>>>>>> +void QEMU_NORETURN error_report_fatal(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
>>>>>>>> 
>>>>>>>> This lets people write things like
>>>>>>>> 
>>>>>>>> error_report_fatal("The sky is falling");
>>>>>>>> 
>>>>>>>> instead of
>>>>>>>> 
>>>>>>>> error_report("The sky is falling");
>>>>>>>> exit(1);
>>>>>>>> 
>>>>>>>> or
>>>>>>>> 
>>>>>>>> fprintf(stderr, "The sky is falling\n");
>>>>>>>> exit(1);
>>>>>>>> 
>>>>>>>> I don't think that's an improvement in clarity.
>>>>>>> 
>>>>>>> The problem is not the existing code, but that in a couple of new
>>>>>>> patches, I've now already seen that people are trying to use
>>>>>>> 
>>>>>>> error_setg(&error_fatal, ... );
>>>>>> 
>>>>>> So, I don't actually see any real advantage to error_report_fatal(...)
>>>>>> over error_setg(&error_fatal, ...).
>>>>> 
>>>>> I do.  Compare:
>>>>> 
>>>>> (a) error_report(...);
>>>>> exit(1);
>>>>> 
>>>>> (b) error_report_fatal(...);
>>>>> 
>>>>> (c) error_setg(&error_fatal, ...);
>>>>> 
>>>>> In my opinion, (a) is clearest: even a relatively clueless reader will
>>>>> know what exit(1) does, can guess what error_report() approximately
>>>>> does, and doesn't need to know what it does exactly.  (b) is slightly
>>>>> less obvious, and (c) is positively opaque.
>>>>> 
>>>>> Let's stick to the obvious (a) and be done with it.
>>>> 
>>>> Ok, (a) is fine for me too, as long as we avoid (c). Lluís, could you
>>>> maybe add that information to your patch that updates the HACKING text?
>> 
>>> I feel such detailed advice belongs in error.h.  Sketch appended.
>> 
>>> If that doesn't succeed in keeping (c) out, make checkpatch flag it.
>> 
>>>> (and sorry for the fuss with error_report_fatal() ... I thought it would
>>>> be a good solution to avoid (c), but if (a) is preferred instead, then
>>>> we should go with that solution instead).
>> 
>> I can easily change that, no problem. I'm just happy consensus is landing on
>> this subject.
>> 
>> 
>>>> And, by the way, what about the spots that currently already use
>>>> error_setg(&error_abort, ....) ? Should they be turned into
>>>> error_report() + abort() instead? Or only abort(), without error
>>>> message, since abort() is only about programming errors?
>> 
>>> As I wrote in my first reply to this thread, I'd like them to be cleaned
>>> up to just abort() or assert().
>> 
>>> I like assert(), because it gives me exactly what I can use to debug the
>>> programming error: a core dump (if enabled) and a source location
>>> (useful when no core dump).  I never bought the argument that we should
>>> use abort() instead of assert(0) because "what if NDEBUG?!?".  If you
>>> define NDEBUG, our 600+ abort()s won't save you from our 4000+
>>> assert()s.
>> 
>> Sorry, but I don't buy the argument of, "I prefer assert() because there's
>> already lots of them". To me, there's a semantic difference between debug builds
>> and regular ones (aka, assert vs abort).

> That's not what I said :)

> In the past, people have argued in favor of abort() by pointing to
> NDEBUG.  I don't buy that argument, but me not buying it is not why I
> prefer assert().  I do because it prints additional information that's
> occasionally useful.

>> Also, I think it adds to the confusion
>> that assert and abort seem to be used interchangeably in the code.

> For better or worse, we overwhelmingly use abort() instead of assert(0),
> but don't use if (!good) abort() instead of assert(good).  Doesn't make
> sense to me, but my appetite for tree-wide changes and the debates that
> go with them has limits.

>> What about this definition?
>> 
>> * exit(): user-triggered errors
>> * abort(): general programming errors
>> * assert(): additional sanity/consistency checks against programming errors
>> 
>> Now, abort & assert have an overlap. Should we discourage one in favour of the
>> other?

> I can't see how to decide whether a programming error is "general" or
> "additional", or why an "additional" error deserves a message
> pointing to source code, but a "general" one does not.

>> Also:
>> 
>> * error_report_fatal ensures the same exit code is always used (otherwise it can
>> fail with inconsistent error codes)

> What if you *want* to use a different exit code?

> But I grant you that we should almost always use exit(1) for fatal
> errors.  And in fact we do!  There are a bunch of misguided exit(-1) in
> the code, but git-log -S'exit(-1)' finds only half a dozen offending
> commits since 2013, and none since 2015, so preventing more seems to be
> a mostly solved problem.

>> * error_report_abort brings the code information of assert into abort

> If you want your crashes to print source location information, don't
> reinvent the wheel, just use assert().

> &error_abort can't because the interesting spot isn't where we decide to
> abort, but where the error got created.

Fair enough. I don't want a flame on style either, although I might look like
wanting one :)


>> But of course, I'm happy either way :)
>> 
>> 
>>> diff --git a/include/qapi/error.h b/include/qapi/error.h
>>> index 45d6c72..ea7e74f 100644
>>> --- a/include/qapi/error.h
>>> +++ b/include/qapi/error.h
>>> @@ -162,6 +162,9 @@ ErrorClass error_get_class(const Error *err);
>>> * human-readable error message is made from printf-style @fmt, ...
>>> * The resulting message should be a single phrase, with no newline or
>>> * trailing punctuation.
>>> + * Please don't error_setg(&error_fatal, ...), use error_report() and
>>> + * exit(), because that's more obvious.
>>> + * Likewise, don't error_setg(&error_abort, ...), use assert().
>>> */
>>> #define error_setg(errp, fmt, ...)                              \
>>> error_setg_internal((errp), __FILE__, __LINE__, __func__,   \
>>> @@ -213,6 +216,8 @@ void error_setg_win32_internal(Error **errp,
>>> * the error object.
>>> * Else, move the error object from @local_err to *@dst_errp.
>>> * On return, @local_err is invalid.
>>> + * Please don't error_propagate(&error_fatal, ...), use
>>> + * error_report_err() and exit(), because that's more obvious.
>>> */
>>> void error_propagate(Error **dst_errp, Error *local_err);
>> 
>>> @@ -291,12 +296,14 @@ void error_set_internal(Error **errp,
>>> GCC_FMT_ATTR(6, 7);
>> 
>>> /*
>>> - * Pass to error_setg() & friends to abort() on error.
>>> + * Special error destination to abort on error.
>>> + * See error_setg() and error_propagate() for details.
>>> */
>>> extern Error *error_abort;
>> 
>>> /*
>>> - * Pass to error_setg() & friends to exit(1) on error.
>>> + * Special error destination to exit(1) on error.
>>> + * See error_setg() and error_propagate() for details.
>>> */
>>> extern Error *error_fatal;
>> 
>> I see, this will make it clearer for people looking for functions without
>> reading HACKING. I can add this and reference it from the document.

> If you like, I can post it as a formal patch you can then include in
> your series.

That'd be great. Please cc me when you send it.

Thanks,
  Lluis
Markus Armbruster Feb. 3, 2016, 6:06 p.m. UTC | #4
Lluís Vilanova <vilanova@ac.upc.edu> writes:

> Markus Armbruster writes:
>
>> Lluís Vilanova <vilanova@ac.upc.edu> writes:
>>> Markus Armbruster writes:
>>> 
>>>> Thomas Huth <thuth@redhat.com> writes:
>>>>> On 03.02.2016 10:48, Markus Armbruster wrote:
>>>>>> David Gibson <david@gibson.dropbear.id.au> writes:
>>>>>> 
>>>>>>> On Tue, Feb 02, 2016 at 10:47:35PM +0100, Thomas Huth wrote:
>>>>>>>> On 02.02.2016 19:53, Markus Armbruster wrote:
>>>>>>>>> Lluís Vilanova <vilanova@ac.upc.edu> writes:
>>>>>>>> ...
>>>>>>>> 
>>>>>>>>>> diff --git a/include/qemu/error-report.h b/include/qemu/error-report.h
>>>>>>>>>> index 7ab2355..6c2f142 100644
>>>>>>>>>> --- a/include/qemu/error-report.h
>>>>>>>>>> +++ b/include/qemu/error-report.h
>>>>>>>>>> @@ -43,4 +43,23 @@ void error_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
>>>>>>>>>> const char *error_get_progname(void);
>>>>>>>>>> extern bool enable_timestamp_msg;
>>>>>>>>>> 
>>>>>>>>>> +/* Report message and exit with error */
>>>>>>>>>> +void QEMU_NORETURN error_vreport_fatal(const char *fmt, va_list ap) GCC_FMT_ATTR(1, 0);
>>>>>>>>>> +void QEMU_NORETURN error_report_fatal(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
>>>>>>>>> 
>>>>>>>>> This lets people write things like
>>>>>>>>> 
>>>>>>>>> error_report_fatal("The sky is falling");
>>>>>>>>> 
>>>>>>>>> instead of
>>>>>>>>> 
>>>>>>>>> error_report("The sky is falling");
>>>>>>>>> exit(1);
>>>>>>>>> 
>>>>>>>>> or
>>>>>>>>> 
>>>>>>>>> fprintf(stderr, "The sky is falling\n");
>>>>>>>>> exit(1);
>>>>>>>>> 
>>>>>>>>> I don't think that's an improvement in clarity.
>>>>>>>> 
>>>>>>>> The problem is not the existing code, but that in a couple of new
>>>>>>>> patches, I've now already seen that people are trying to use
>>>>>>>> 
>>>>>>>> error_setg(&error_fatal, ... );
>>>>>>> 
>>>>>>> So, I don't actually see any real advantage to error_report_fatal(...)
>>>>>>> over error_setg(&error_fatal, ...).
>>>>>> 
>>>>>> I do.  Compare:
>>>>>> 
>>>>>> (a) error_report(...);
>>>>>> exit(1);
>>>>>> 
>>>>>> (b) error_report_fatal(...);
>>>>>> 
>>>>>> (c) error_setg(&error_fatal, ...);
>>>>>> 
>>>>>> In my opinion, (a) is clearest: even a relatively clueless reader will
>>>>>> know what exit(1) does, can guess what error_report() approximately
>>>>>> does, and doesn't need to know what it does exactly.  (b) is slightly
>>>>>> less obvious, and (c) is positively opaque.
>>>>>> 
>>>>>> Let's stick to the obvious (a) and be done with it.
>>>>> 
>>>>> Ok, (a) is fine for me too, as long as we avoid (c). Lluís, could you
>>>>> maybe add that information to your patch that updates the HACKING text?
>>> 
>>>> I feel such detailed advice belongs in error.h.  Sketch appended.
>>> 
>>>> If that doesn't succeed in keeping (c) out, make checkpatch flag it.
>>> 
>>>>> (and sorry for the fuss with error_report_fatal() ... I thought it would
>>>>> be a good solution to avoid (c), but if (a) is preferred instead, then
>>>>> we should go with that solution instead).
>>> 
>>> I can easily change that, no problem. I'm just happy consensus is landing on
>>> this subject.
>>> 
>>> 
>>>>> And, by the way, what about the spots that currently already use
>>>>> error_setg(&error_abort, ....) ? Should they be turned into
>>>>> error_report() + abort() instead? Or only abort(), without error
>>>>> message, since abort() is only about programming errors?
>>> 
>>>> As I wrote in my first reply to this thread, I'd like them to be cleaned
>>>> up to just abort() or assert().
>>> 
>>>> I like assert(), because it gives me exactly what I can use to debug the
>>>> programming error: a core dump (if enabled) and a source location
>>>> (useful when no core dump).  I never bought the argument that we should
>>>> use abort() instead of assert(0) because "what if NDEBUG?!?".  If you
>>>> define NDEBUG, our 600+ abort()s won't save you from our 4000+
>>>> assert()s.
>>> 
>>> Sorry, but I don't buy the argument of, "I prefer assert() because there's
>>> already lots of them". To me, there's a semantic difference between debug builds
>>> and regular ones (aka, assert vs abort).
>
>> That's not what I said :)
>
>> In the past, people have argued in favor of abort() by pointing to
>> NDEBUG.  I don't buy that argument, but me not buying it is not why I
>> prefer assert().  I do because it prints additional information that's
>> occasionally useful.
>
>>> Also, I think it adds to the confusion
>>> that assert and abort seem to be used interchangeably in the code.
>
>> For better or worse, we overwhelmingly use abort() instead of assert(0),
>> but don't use if (!good) abort() instead of assert(good).  Doesn't make
>> sense to me, but my appetite for tree-wide changes and the debates that
>> go with them has limits.
>
>>> What about this definition?
>>> 
>>> * exit(): user-triggered errors
>>> * abort(): general programming errors
>>> * assert(): additional sanity/consistency checks against programming errors
>>> 
>>> Now, abort & assert have an overlap. Should we discourage one in favour of the
>>> other?
>
>> I can't see how to decide whether a programming error is "general" or
>> "additional", or why an "additional" error deserves a message
>> pointing to source code, but a "general" one does not.
>
>>> Also:
>>> 
>>> * error_report_fatal ensures the same exit code is always used (otherwise it can
>>> fail with inconsistent error codes)
>
>> What if you *want* to use a different exit code?
>
>> But I grant you that we should almost always use exit(1) for fatal
>> errors.  And in fact we do!  There are a bunch of misguided exit(-1) in
>> the code, but git-log -S'exit(-1)' finds only half a dozen offending
>> commits since 2013, and none since 2015, so preventing more seems to be
>> a mostly solved problem.
>
>>> * error_report_abort brings the code information of assert into abort
>
>> If you want your crashes to print source location information, don't
>> reinvent the wheel, just use assert().
>
>> &error_abort can't because the interesting spot isn't where we decide to
>> abort, but where the error got created.
>
> Fair enough. I don't want a flame on style either, although I might look like
> wanting one :)

I think we're having a civil, constructive discussion on error handling
and reporting that happens to include stylistic aspects :)

>>> But of course, I'm happy either way :)
>>> 
>>> 
>>>> diff --git a/include/qapi/error.h b/include/qapi/error.h
>>>> index 45d6c72..ea7e74f 100644
>>>> --- a/include/qapi/error.h
>>>> +++ b/include/qapi/error.h
>>>> @@ -162,6 +162,9 @@ ErrorClass error_get_class(const Error *err);
>>>> * human-readable error message is made from printf-style @fmt, ...
>>>> * The resulting message should be a single phrase, with no newline or
>>>> * trailing punctuation.
>>>> + * Please don't error_setg(&error_fatal, ...), use error_report() and
>>>> + * exit(), because that's more obvious.
>>>> + * Likewise, don't error_setg(&error_abort, ...), use assert().
>>>> */
>>>> #define error_setg(errp, fmt, ...)                              \
>>>> error_setg_internal((errp), __FILE__, __LINE__, __func__,   \
>>>> @@ -213,6 +216,8 @@ void error_setg_win32_internal(Error **errp,
>>>> * the error object.
>>>> * Else, move the error object from @local_err to *@dst_errp.
>>>> * On return, @local_err is invalid.
>>>> + * Please don't error_propagate(&error_fatal, ...), use
>>>> + * error_report_err() and exit(), because that's more obvious.
>>>> */
>>>> void error_propagate(Error **dst_errp, Error *local_err);
>>> 
>>>> @@ -291,12 +296,14 @@ void error_set_internal(Error **errp,
>>>> GCC_FMT_ATTR(6, 7);
>>> 
>>>> /*
>>>> - * Pass to error_setg() & friends to abort() on error.
>>>> + * Special error destination to abort on error.
>>>> + * See error_setg() and error_propagate() for details.
>>>> */
>>>> extern Error *error_abort;
>>> 
>>>> /*
>>>> - * Pass to error_setg() & friends to exit(1) on error.
>>>> + * Special error destination to exit(1) on error.
>>>> + * See error_setg() and error_propagate() for details.
>>>> */
>>>> extern Error *error_fatal;
>>> 
>>> I see, this will make it clearer for people looking for functions without
>>> reading HACKING. I can add this and reference it from the document.
>
>> If you like, I can post it as a formal patch you can then include in
>> your series.
>
> That'd be great. Please cc me when you send it.

Done: [PATCH 0/2] error: Documentation updates

Patch

diff --git a/include/qapi/error.h b/include/qapi/error.h
index 45d6c72..ea7e74f 100644
--- a/include/qapi/error.h
+++ b/include/qapi/error.h
@@ -162,6 +162,9 @@  ErrorClass error_get_class(const Error *err);
  * human-readable error message is made from printf-style @fmt, ...
  * The resulting message should be a single phrase, with no newline or
  * trailing punctuation.
+ * Please don't error_setg(&error_fatal, ...), use error_report() and
+ * exit(), because that's more obvious.
+ * Likewise, don't error_setg(&error_abort, ...), use assert().
  */
 #define error_setg(errp, fmt, ...)                              \
     error_setg_internal((errp), __FILE__, __LINE__, __func__,   \
@@ -213,6 +216,8 @@  void error_setg_win32_internal(Error **errp,
  * the error object.
  * Else, move the error object from @local_err to *@dst_errp.
  * On return, @local_err is invalid.
+ * Please don't error_propagate(&error_fatal, ...), use
+ * error_report_err() and exit(), because that's more obvious.
  */
 void error_propagate(Error **dst_errp, Error *local_err);
 
@@ -291,12 +296,14 @@  void error_set_internal(Error **errp,
     GCC_FMT_ATTR(6, 7);
 
 /*
- * Pass to error_setg() & friends to abort() on error.
+ * Special error destination to abort on error.
+ * See error_setg() and error_propagate() for details.
  */
 extern Error *error_abort;
 
 /*
- * Pass to error_setg() & friends to exit(1) on error.
+ * Special error destination to exit(1) on error.
+ * See error_setg() and error_propagate() for details.
  */
 extern Error *error_fatal;