Patchwork [2/4] check-qjson: Fix up a few bogus comments

Submitter Markus Armbruster
Date March 14, 2013, 5:49 p.m.
Message ID <1363283360-26220-3-git-send-email-armbru@redhat.com>
Permalink /patch/227764/
State New

Comments

Markus Armbruster - March 14, 2013, 5:49 p.m.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
---
 tests/check-qjson.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)
Laszlo Ersek - March 21, 2013, 8:06 p.m.
I don't understand what's going on here.

On 03/14/13 18:49, Markus Armbruster wrote:
> Signed-off-by: Markus Armbruster <armbru@redhat.com>
> ---
>  tests/check-qjson.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/tests/check-qjson.c b/tests/check-qjson.c
> index ec85a0c..852124a 100644
> --- a/tests/check-qjson.c
> +++ b/tests/check-qjson.c
> @@ -4,7 +4,7 @@
>   *
>   * Authors:
>   *  Anthony Liguori   <aliguori@us.ibm.com>
> - *  Markus Armbruster <armbru@redhat.com>,
> + *  Markus Armbruster <armbru@redhat.com>
>   *
>   * This work is licensed under the terms of the GNU LGPL, version 2.1 or later.
>   * See the COPYING.LIB file in the top-level directory.
> @@ -462,8 +462,7 @@ static void utf8_string(void)
>          },
>          /* 3.3.4  5-byte sequence with last byte missing (U+0000) */
>          {
> -            /* invalid */
> -            "\"\xF8\x80\x80\x80\"", /* bug: not corrected */
> +            "\"\xF8\x80\x80\x80\"",

In this test case, we use an invalid UTF-8 sequence in a JSON string
literal (json_in). So "/* invalid */" could be justified; perhaps it's
just too laconic.

The "/* bug: not corrected */" comment seems indeed wrong, "json_in" is
*input*, there's nothing to correct on it.

>              NULL,                   /* bug: rejected */

When the JSON parser rejects the invalid sequence, it's actually
correct. So why the "bug" comment? Are we expecting (according to the
comment in utf8_string()) U+FFFD?


>              "\"\\u8000\\uFFFF\"",   /* bug: want "\"\\uFFFF\"" */
>              "\xF8\x80\x80\x80",

In this test we're trying to format a UTF-8 byte sequence (utf8_in) as a
JSON string. The source is invalid, so the JSON formatter should either
reject it or emit a U+FFFD in its place. The actual JSON output is
probably wrong, hence the "bug" part is OK, but I thought the replacement
we expected was U+FFFD, not U+FFFF, which would make that part of the
comment wrong. ... Hm, OK, the leading comment has some notes on this as well.

So here is what your patch does:
- removes the halfway OK comment "/* invalid */" -- I think it wasn't
really wrong, but I won't miss it,
- removes a comment that is in fact bogus,
- (removes a stray comma).

My eyes are bleeding.

Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Markus Armbruster - March 22, 2013, 1:27 p.m.
Laszlo Ersek <lersek@redhat.com> writes:

> I don't understand what's going on here.
>
> On 03/14/13 18:49, Markus Armbruster wrote:
>> Signed-off-by: Markus Armbruster <armbru@redhat.com>
>> ---
>>  tests/check-qjson.c | 5 ++---
>>  1 file changed, 2 insertions(+), 3 deletions(-)
>> 
>> diff --git a/tests/check-qjson.c b/tests/check-qjson.c
>> index ec85a0c..852124a 100644
>> --- a/tests/check-qjson.c
>> +++ b/tests/check-qjson.c
>> @@ -4,7 +4,7 @@
>>   *
>>   * Authors:
>>   *  Anthony Liguori   <aliguori@us.ibm.com>
>> - *  Markus Armbruster <armbru@redhat.com>,
>> + *  Markus Armbruster <armbru@redhat.com>
>>   *
>>   * This work is licensed under the terms of the GNU LGPL, version 2.1 or later.
>>   * See the COPYING.LIB file in the top-level directory.
>> @@ -462,8 +462,7 @@ static void utf8_string(void)
>>          },
>>          /* 3.3.4  5-byte sequence with last byte missing (U+0000) */
>>          {
>> -            /* invalid */
>> -            "\"\xF8\x80\x80\x80\"", /* bug: not corrected */
>> +            "\"\xF8\x80\x80\x80\"",
>
> In this test case, we use an invalid UTF-8 sequence in a JSON string
> literal (json_in). So "/* invalid */" could be justified; perhaps it's
> just too laconic.

There are many more invalid sequences in other test cases.  I have no
idea why I tacked /* invalid */ to this one.

For what it's worth, the comment right above should make it clear enough
that the sequence is invalid.

> The "/* bug: not corrected */" comment seems indeed wrong, "json_in" is
> *input*, there's nothing to correct on it.

Exactly.

>>              NULL,                   /* bug: rejected */
>
> When the JSON parser rejects the invalid sequence, it's actually
> correct. So why the "bug" comment? Are we expecting (according to the
> comment in utf8_string()) U+FFFD?

        /*
         * Bug markers used here:
         * - bug: not corrected
         *   JSON parser fails to correct invalid sequence(s)
         * - bug: rejected
         *   JSON parser rejects invalid sequence(s)
--->     *   We may choose to define this as feature
         * - bug: want "..."
         *   JSON parser produces incorrect result, this is the
         *   correct one, assuming replacement character U+FFFF
         *   We may choose to reject instead of replace
         */

The comments here take care not to pass judgement on what the JSON
parser should do for invalid sequences.  Obviously, it should either
consistently reject them, or consistently replace them with a suitable
replacement character.

If I understand Anthony correctly, his advice is to always reject.
That's what I intend to do should I actually get around to fixing the
parser.
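
The "always reject" policy can be sketched as a strict validity check that
runs before the parser accepts a string. This is only a sketch, not QEMU's
parser code: the names utf8_decode() and utf8_valid() are hypothetical, and
the rules (no overlong forms, no surrogates, nothing above U+10FFFF, no lead
bytes 0xF8..0xFF) are my reading of RFC 3629.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: decode one code point from s[0..n).
 * Returns the number of bytes consumed, or -1 if the sequence is invalid.
 */
static int utf8_decode(const unsigned char *s, size_t n, uint32_t *cp)
{
    /* Smallest code point that legitimately needs a sequence of length i */
    static const uint32_t min_cp[] = { 0, 0, 0x80, 0x800, 0x10000 };
    uint32_t c;
    int len, i;

    if (n == 0) {
        return -1;
    }
    if (s[0] < 0x80) {
        len = 1; c = s[0];
    } else if ((s[0] & 0xE0) == 0xC0) {
        len = 2; c = s[0] & 0x1F;
    } else if ((s[0] & 0xF0) == 0xE0) {
        len = 3; c = s[0] & 0x0F;
    } else if ((s[0] & 0xF8) == 0xF0) {
        len = 4; c = s[0] & 0x07;
    } else {
        return -1;          /* stray continuation byte, or lead 0xF8..0xFF */
    }
    if ((size_t)len > n) {
        return -1;          /* truncated sequence */
    }
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) {
            return -1;      /* not a continuation byte */
        }
        c = (c << 6) | (s[i] & 0x3F);
    }
    if (c < min_cp[len] || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF)) {
        return -1;          /* overlong, out of range, or surrogate */
    }
    *cp = c;
    return len;
}

/* Reject policy: 1 if all n bytes form valid UTF-8, 0 otherwise */
static int utf8_valid(const char *str, size_t n)
{
    const unsigned char *s = (const unsigned char *)str;
    uint32_t cp;

    while (n > 0) {
        int len = utf8_decode(s, n, &cp);
        if (len < 0) {
            return 0;
        }
        s += len;
        n -= (size_t)len;
    }
    return 1;
}
```

With this check, the "\xF8\x80\x80\x80" input from test 3.3.4 is rejected at
its first byte, because 0xF8 is not a valid lead byte. A replacing parser
would instead substitute a replacement character at that point and continue.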

>>              "\"\\u8000\\uFFFF\"",   /* bug: want "\"\\uFFFF\"" */
>>              "\xF8\x80\x80\x80",
>
> In this test we're trying to format a UTF-8 byte sequence (utf8_in) as a
> JSON string. The source is invalid, so the JSON formatter should either
> reject it or emit a U+FFFD in its place. The actual JSON output is
> probably wrong, hence the "bug" part is OK, but I thought the replacement
> we expected was U+FFFD, not U+FFFF, which would make that part of the
> comment wrong. ... Hm, OK, the leading comment has some notes on this as well.

When I wrote this test case, I thought the JSON parser *intentionally*
replaced invalid sequences with U+FFFF.  Only later did I learn that the
FFFF comes from unintended sign extension %-}
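
For readers unfamiliar with this class of bug, here is a minimal sketch of
how sign extension can turn a high byte into FFxx. It is not the code from
QEMU's JSON code, and it does not trace the exact path that yields FFFF
there; widen_signed() and widen_unsigned() are hypothetical names used only
for illustration.

```c
#include <stdint.h>

/*
 * Buggy pattern: the byte is held in a signed char, so values 0x80..0xFF
 * are negative.  Widening a negative value to 16 bits fills the upper
 * byte with ones.
 */
static uint16_t widen_signed(signed char byte)
{
    return (uint16_t)byte;
}

/*
 * Fixed pattern: go through unsigned char first, so the upper byte of
 * the result is always zero.
 */
static uint16_t widen_unsigned(signed char byte)
{
    return (uint16_t)(unsigned char)byte;
}
```

On platforms where plain char is signed, the same thing happens when a byte
read through a char pointer is used in arithmetic or passed to a printf-style
function without first being cast to unsigned char.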

> So here is what your patch does:
> - removes the halfway OK comment "/* invalid */" -- I think it wasn't
> really wrong, but I won't miss it,
> - removes a comment that is in fact bogus,
> - (removes a stray comma).
>
> My eyes are bleeding.

/me hands over tissues

> Reviewed-by: Laszlo Ersek <lersek@redhat.com>

Thanks!

Patch

diff --git a/tests/check-qjson.c b/tests/check-qjson.c
index ec85a0c..852124a 100644
--- a/tests/check-qjson.c
+++ b/tests/check-qjson.c
@@ -4,7 +4,7 @@ 
  *
  * Authors:
  *  Anthony Liguori   <aliguori@us.ibm.com>
- *  Markus Armbruster <armbru@redhat.com>,
+ *  Markus Armbruster <armbru@redhat.com>
  *
  * This work is licensed under the terms of the GNU LGPL, version 2.1 or later.
  * See the COPYING.LIB file in the top-level directory.
@@ -462,8 +462,7 @@  static void utf8_string(void)
         },
         /* 3.3.4  5-byte sequence with last byte missing (U+0000) */
         {
-            /* invalid */
-            "\"\xF8\x80\x80\x80\"", /* bug: not corrected */
+            "\"\xF8\x80\x80\x80\"",
             NULL,                   /* bug: rejected */
             "\"\\u8000\\uFFFF\"",   /* bug: want "\"\\uFFFF\"" */
             "\xF8\x80\x80\x80",