diff mbox

[4/4] qobject: Output valid JSON for non-finite numbers

Message ID 1465526889-8339-5-git-send-email-eblake@redhat.com
State New
Headers show

Commit Message

Eric Blake June 10, 2016, 2:48 a.m. UTC
It's better to give downstream clients a valid JSON string,
even if they are semantically expecting a number, than it is
to give them a bare keyword extension that can cause a
lexical error.

Of course, as long as we don't recognize (certain) strings as valid
numbers during a conversion to QObject, this means our extension
of accepting bare keywords for non-finite numbers cannot undergo
a round trip (once converted into a string, we never get back to
a QFloat).  However, non-finite input is rare enough that it's
not worth bothering with at the moment.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 qobject/qjson.c     | 15 ++++++++++++---
 tests/check-qjson.c | 11 +++++------
 docs/qmp-spec.txt   |  2 +-
 3 files changed, 18 insertions(+), 10 deletions(-)

Comments

Markus Armbruster June 16, 2016, 4:17 p.m. UTC | #1
Eric Blake <eblake@redhat.com> writes:

> It's better to give downstream clients a valid JSON string,
> even if they are semantically expecting a number, than it is
> to give them a bare keyword extension that can cause a
> lexical error.

Incompatible change.  If all clients are choking on non-finite numbers,
then the incompatibility is an improvement.  If a client exists that
groks non-finite numbers, ...  Absence is always hard to show.

Moreover, it turns query-qmp-schema into a liar: the schema it returns
claims a certain member of the reply has "type": "number", and then we
go on to send a string anyway.

> Of course, as long as we don't recognize (certain) strings as valid
> numbers during a conversion to QObject,

That would be even crazier!

>                                         this means our extension
> of accepting bare keywords for non-finite numbers cannot undergo
> a round trip (once converted into a string, we never get back to
> a QFloat).  However, non-finite input is rare enough that it's
> not worth bothering with at the moment.
>
> Signed-off-by: Eric Blake <eblake@redhat.com>

I'm afraid the only sane solution is to find all uses of number in QMP
output, audit the code producing them, then assert isfinite() in the
monitor.  For commands without a side effect, we could fail the command
instead of tripping an assertion.  We'd have to declare such commands.

Let's examine the occurences of "number" in output of query-qmp-schema,
or actually in the qmp-introspect.c that gets generated with -u:

* Object q_obj_migrate_set_downtime-arg member value: input

* Builtin number: d'uh!

* Object MigrationStats member mbps: in output of query-migrate

* Object XBZRLECacheStats member overflow: likewise

* Object KeyValue case number: not a type.

* Object BlockDeviceTimedStats members avg_rd_queue_depth,
  avg_wr_queue_depth: in output of query-blockstats

* Enum CommandLineParameterType member: not a type

* Enum JSONType member: not a type

* Enum KeyValueKind: not a type

* Object PciBusInfo member: not a type

So it's just query-migrate and query-blockstats.
Eric Blake June 17, 2016, 3:06 a.m. UTC | #2
On 06/16/2016 10:17 AM, Markus Armbruster wrote:
> Eric Blake <eblake@redhat.com> writes:
> 
>> It's better to give downstream clients a valid JSON string,
>> even if they are semantically expecting a number, than it is
>> to give them a bare keyword extension that can cause a
>> lexical error.
> 
> Incompatible change.  If all clients are choking on non-finite numbers,
> then the incompatibility is an improvement.  If a client exists that
> groks non-finite numbers, ...  Absence is always hard to show.

The 'id' field is an outlier - there, we replay the user's input with no
contextual interpretation (however, we DO reserve the right to reorder
the keys in the dicts that we replay, and to canonicalize UTF-8 text or
otherwise alter their input to something "equivalent").

> 
> Moreover, it turns query-qmp-schema into a liar: the schema it returns
> claims a certain member of the reply has "type": "number", and then we
> go on to send a string anyway.

The 'id' field is documented as sending ANY JSON value, so if we argue
that canonicalizing their extension input of a bare inf into a proper
JSON string on output is reasonable, then we may want this patch in
addition to adding assertions that none of the QMP commands with
introspectible 'number' ever output non-finite values.

> 
>> Of course, as long as we don't recognize (certain) strings as valid
>> numbers during a conversion to QObject,
> 
> That would be even crazier!
> 
>>                                         this means our extension
>> of accepting bare keywords for non-finite numbers cannot undergo
>> a round trip (once converted into a string, we never get back to
>> a QFloat).  However, non-finite input is rare enough that it's
>> not worth bothering with at the moment.
>>
>> Signed-off-by: Eric Blake <eblake@redhat.com>
> 
> I'm afraid the only sane solution is to find all uses of number in QMP
> output, audit the code producing them, then assert isfinite() in the
> monitor.  For commands without a side effect, we could fail the command
> instead of tripping an assertion.  We'd have to declare such commands.
> 
> Let's examine the occurences of "number" in output of query-qmp-schema,
> or actually in the qmp-introspect.c that gets generated with -u:
> 
> * Object q_obj_migrate_set_downtime-arg member value: input

Even though it's not output, it does need to be checked that it will
behave sanely with Inf or NaN input if we extend our parser to allow
those (behaving sanely may include a graceful error that the input was
out of range).

> 
> * Builtin number: d'uh!
> 
> * Object MigrationStats member mbps: in output of query-migrate
> 
> * Object XBZRLECacheStats member overflow: likewise
> 
> * Object KeyValue case number: not a type.
> 
> * Object BlockDeviceTimedStats members avg_rd_queue_depth,
>   avg_wr_queue_depth: in output of query-blockstats
> 
> * Enum CommandLineParameterType member: not a type
> 
> * Enum JSONType member: not a type
> 
> * Enum KeyValueKind: not a type
> 
> * Object PciBusInfo member: not a type
> 
> So it's just query-migrate and query-blockstats.
> 

Okay, looks like I need to respin this, and the rest of my JSON output
visitor on top of it, with this audit done first.
Markus Armbruster June 17, 2016, 8:14 a.m. UTC | #3
Eric Blake <eblake@redhat.com> writes:

> On 06/16/2016 10:17 AM, Markus Armbruster wrote:
>> Eric Blake <eblake@redhat.com> writes:
>> 
>>> It's better to give downstream clients a valid JSON string,
>>> even if they are semantically expecting a number, than it is
>>> to give them a bare keyword extension that can cause a
>>> lexical error.
>> 
>> Incompatible change.  If all clients are choking on non-finite numbers,
>> then the incompatibility is an improvement.  If a client exists that
>> groks non-finite numbers, ...  Absence is always hard to show.
>
> The 'id' field is an outlier - there, we replay the user's input with no
> contextual interpretation (however, we DO reserve the right to reorder
> the keys in the dicts that we replay, and to canonicalize UTF-8 text or
> otherwise alter their input to something "equivalent").

Yes, the response's id must the the same JSON value, but it needn't be
the same text.

>> Moreover, it turns query-qmp-schema into a liar: the schema it returns
>> claims a certain member of the reply has "type": "number", and then we
>> go on to send a string anyway.
>
> The 'id' field is documented as sending ANY JSON value, so if we argue
> that canonicalizing their extension input of a bare inf into a proper
> JSON string on output is reasonable, then we may want this patch in
> addition to adding assertions that none of the QMP commands with
> introspectible 'number' ever output non-finite values.

I read this thrice, and I'm still not sure I got the argument :)

>>> Of course, as long as we don't recognize (certain) strings as valid
>>> numbers during a conversion to QObject,
>> 
>> That would be even crazier!
>> 
>>>                                         this means our extension
>>> of accepting bare keywords for non-finite numbers cannot undergo
>>> a round trip (once converted into a string, we never get back to
>>> a QFloat).  However, non-finite input is rare enough that it's
>>> not worth bothering with at the moment.
>>>
>>> Signed-off-by: Eric Blake <eblake@redhat.com>
>> 
>> I'm afraid the only sane solution is to find all uses of number in QMP
>> output, audit the code producing them, then assert isfinite() in the
>> monitor.  For commands without a side effect, we could fail the command
>> instead of tripping an assertion.  We'd have to declare such commands.
>> 
>> Let's examine the occurences of "number" in output of query-qmp-schema,
>> or actually in the qmp-introspect.c that gets generated with -u:
>> 
>> * Object q_obj_migrate_set_downtime-arg member value: input
>
> Even though it's not output, it does need to be checked that it will
> behave sanely with Inf or NaN input if we extend our parser to allow
> those (behaving sanely may include a graceful error that the input was
> out of range).

Yes, *if* we extend QMP.

>> 
>> * Builtin number: d'uh!
>> 
>> * Object MigrationStats member mbps: in output of query-migrate
>> 
>> * Object XBZRLECacheStats member overflow: likewise
>> 
>> * Object KeyValue case number: not a type.
>> 
>> * Object BlockDeviceTimedStats members avg_rd_queue_depth,
>>   avg_wr_queue_depth: in output of query-blockstats
>> 
>> * Enum CommandLineParameterType member: not a type
>> 
>> * Enum JSONType member: not a type
>> 
>> * Enum KeyValueKind: not a type
>> 
>> * Object PciBusInfo member: not a type
>> 
>> So it's just query-migrate and query-blockstats.
>> 
>
> Okay, looks like I need to respin this, and the rest of my JSON output
> visitor on top of it, with this audit done first.

Audit, plus isfinite() assertions to guard the JSON output.

The (misnamed) QMP output visitor shouldn't assert, because it can
legitimately be used for purposes other than QMP.  Only the actual
conversion to JSON should assert.  Currently, to_json().  With your JSON
output visitor, it would be qstring_append_json_number(), or its caller.
diff mbox

Patch

diff --git a/qobject/qjson.c b/qobject/qjson.c
index ef160d2..465b080 100644
--- a/qobject/qjson.c
+++ b/qobject/qjson.c
@@ -12,6 +12,7 @@ 
  */

 #include "qemu/osdep.h"
+#include <math.h>
 #include "qapi/qmp/json-lexer.h"
 #include "qapi/qmp/json-parser.h"
 #include "qapi/qmp/json-streamer.h"
@@ -236,19 +237,27 @@  static void to_json(const QObject *obj, QString *str, int pretty, int indent)
     }
     case QTYPE_QFLOAT: {
         QFloat *val = qobject_to_qfloat(obj);
+        double d = qfloat_get_double(val);
         char buffer[1024];
         int len;

+        if (!isfinite(d)) {
+            /* Always output valid JSON - a semantic error due to a
+             * string where a number was expected is better than a
+             * lexical error from a strict parser */
+            snprintf(buffer, sizeof(buffer), "\"%f\"", d);
+            qstring_append(str, buffer);
+            break;
+        }
+
         /* FIXME: snprintf() is locale dependent; but JSON requires
          * numbers to be formatted as if in the C locale. Dependence
          * on C locale is a pervasive issue in QEMU. */
-        /* FIXME: This risks printing Inf or NaN, which are not valid
-         * JSON values. */
         /* FIXME: the default precision of 6 for %f often causes
          * rounding errors; we should be using DBL_DECIMAL_DIG (17),
          * and only rounding to a shorter number if the result would
          * still produce the same floating point value.  */
-        len = snprintf(buffer, sizeof(buffer), "%f", qfloat_get_double(val));
+        len = snprintf(buffer, sizeof(buffer), "%f", d);
         while (len > 0 && buffer[len - 1] == '0') {
             len--;
         }
diff --git a/tests/check-qjson.c b/tests/check-qjson.c
index 95adf64..b0a9178 100644
--- a/tests/check-qjson.c
+++ b/tests/check-qjson.c
@@ -974,12 +974,12 @@  static void non_finite_number(void)
         double decoded;
         const char *canonical;
     } test_cases[] = {
-        { "nan", NAN },
-        { "NaN", NAN, "nan" },
+        { "nan", NAN, "\"nan\"" },
+        { "NaN", NAN, "\"nan\"" },
         /* Not all libc implementations can round-trip '-nan' */
         /* We do not support forms like 'NAN(0)' */
-        { "inf", INFINITY },
-        { "-INFINITY", -INFINITY, "-inf" },
+        { "inf", INFINITY, "\"inf\"" },
+        { "-INFINITY", -INFINITY, "\"-inf\"" },
         { },
     };

@@ -1002,8 +1002,7 @@  static void non_finite_number(void)
         }

         str = qobject_to_json(obj);
-        g_assert_cmpstr(qstring_get_str(str), ==,
-                        test_cases[i].canonical ?: test_cases[i].encoded);
+        g_assert_cmpstr(qstring_get_str(str), ==, test_cases[i].canonical);
         QDECREF(str);
         QDECREF(qfloat);
     }
diff --git a/docs/qmp-spec.txt b/docs/qmp-spec.txt
index bfba431..75c68d9 100644
--- a/docs/qmp-spec.txt
+++ b/docs/qmp-spec.txt
@@ -53,7 +53,7 @@  double quoting on output.

 As an extension, the server understands case-insensitive forms of
 non-finite numbers, such as "NaN", "Inf", and "-infinity" (but not
-"NaN(...)").
+"NaN(...)"); however, such numbers are output as a json-string.

 2.1 General Definitions
 -----------------------