diff mbox

[RFC,v2,26/47] qapi-types: Convert to QAPISchemaVisitor, fixing flat unions

Message ID 1435782155-31412-27-git-send-email-armbru@redhat.com
State New
Headers show

Commit Message

Markus Armbruster July 1, 2015, 8:22 p.m. UTC
Fixes flat unions to get the base's base members.  Test case is from
commit 2fc0043, in qapi-schema-test.json:

    { 'union': 'UserDefFlatUnion',
      'base': 'UserDefUnionBase',
      'discriminator': 'enum1',
      'data': { 'value1' : 'UserDefA',
                'value2' : 'UserDefB',
                'value3' : 'UserDefB' } }

    { 'struct': 'UserDefUnionBase',
      'base': 'UserDefZero',
      'data': { 'string': 'str', 'enum1': 'EnumOne' } }

    { 'struct': 'UserDefZero',
      'data': { 'integer': 'int' } }

Patch's effect on UserDefFlatUnion:

     struct UserDefFlatUnion {
    +    int64_t integer;
         char *string;
         EnumOne enum1;
         /* union tag is EnumOne enum1 */
        union {
            void *data;
            UserDefA *value1;
            UserDefB *value2;
            UserDefB *value3;
        };
    };

Flat union visitors remain broken.  They'll be fixed next.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
---
 scripts/qapi-types.py                   | 268 +++++++++++++-------------------
 tests/qapi-schema/qapi-schema-test.json |   4 +-
 2 files changed, 114 insertions(+), 158 deletions(-)

Comments

Eric Blake July 22, 2015, 5:34 p.m. UTC | #1
On 07/01/2015 02:22 PM, Markus Armbruster wrote:
> Fixes flat unions to get the base's base members.  Test case is from
> commit 2fc0043, in qapi-schema-test.json:
> 

> 
> Flat union visitors remain broken.  They'll be fixed next.

Sadly, the generated files had a huge diffstat, making it very hard to
determine if this change is sane:

 qapi-types.c                        | 1412 ++++++-------
 qapi-types.h                        | 3705
++++++++++++++++++++----------------
 qga/qapi-generated/qga-qapi-types.c |  110 -
 qga/qapi-generated/qga-qapi-types.h |  542 +++--
 4 files changed, 3208 insertions(+), 2561 deletions(-)

At first, it looks easy, as in:

qapi-types.c:
+const int BlockdevRef_qtypes[QTYPE_MAX] = {
...
 const char *const BlockdevRefKind_lookup[] = {
...
-const int BlockdevRef_qtypes[QTYPE_MAX] = {
...

(that is, the new visitor outputs the two arrays in a different order -
I can live with that).

Then it gets a bit crazy when using normal diff algorithms:

-void qapi_free_int32List(int32List *obj)
+void qapi_free_ACPIOSTInfo(ACPIOSTInfo *obj)

where declarations are rearranged (so the previous commit, which
attempted to sort declarations to make this one easier, did not quite
succeed).

Even in qapi-types.h things were rearranged:


-#ifndef QAPI_TYPES_BUILTIN_STRUCT_DECL
-#define QAPI_TYPES_BUILTIN_STRUCT_DECL
+#ifndef QAPI_TYPES_BUILTIN
+#define QAPI_TYPES_BUILTIN


-typedef struct int32List {
+typedef struct boolList boolList;
+
+struct boolList {

I can understand the minor change in #ifdef name, and even the
separation between typedef and struct declaration (even though the old
approach of doing it all at once works).  But the overall diff in the
generated files would be easier to review if done in stages, and either
make 25/47 sort closer to what this patch does, or add yet more
prerequisite patches that further tweak sorting.  Ideally, this patch
will be a lot easier to review if the generated code is much closer to
the pre-patch version (that is, separating sorting changes from other
changes).  On the other hand, yes, it's kind of a pain to patch old code
to do things differently just before throwing it away with the new code
in its place, so it becomes a judgment call of how confident we want to
be that the new implementation isn't breaking things.

I was able to make a bit more sense of things by using git's 'diff
patience' algorithm, which showed things more as code motion rather than
one-or-two-line changes to lots of common boilerplate, but even that
diffstat is still big:

 qapi-types1.c | 2030 +++++++++++++++----------------
 qapi-types1.h | 3707
+++++++++++++++++++++++++++++++++------------------------
 2 files changed, 3148 insertions(+), 2589 deletions(-)

That said, I'll still at least review this code by inspection, and
things still compile fine, so although I'm reluctant to give R-b while
the patch is in RFC stage (because I didn't want to take the time to be
certain the results are the same amidst so much churn), they are mostly
sane.

> 
> Signed-off-by: Markus Armbruster <armbru@redhat.com>
> ---
>  scripts/qapi-types.py                   | 268 +++++++++++++-------------------
>  tests/qapi-schema/qapi-schema-test.json |   4 +-
>  2 files changed, 114 insertions(+), 158 deletions(-)
> 
> diff --git a/scripts/qapi-types.py b/scripts/qapi-types.py
> index a48ad9c..d6185c6 100644
> --- a/scripts/qapi-types.py
> +++ b/scripts/qapi-types.py
> @@ -2,88 +2,67 @@
>  # QAPI types generator
>  #
>  # Copyright IBM, Corp. 2011
> +# Copyright (c) 2013-2015 Red Hat Inc.
>  #
>  # Authors:
>  #  Anthony Liguori <aliguori@us.ibm.com>
> +#  Markus Armbruster <armbru@redhat.com>
>  #
>  # This work is licensed under the terms of the GNU GPL, version 2.
>  # See the COPYING file in the top-level directory.
>  
> -from ordereddict import OrderedDict
>  from qapi import *
>  
> -def generate_fwd_builtin(name):
> -    return mcgen('''
> -
> -typedef struct %(name)sList {

For example, the change of splitting this into:

typedef struct %(name)sList;

struct %(name)sList {

done as a separate patch would make the generated diff of this patch
smaller.


>  
> -def generate_struct_fields(members):
> +def gen_struct_field(name, typ, optional):
>      ret = ''
>  
> -    for argname, argentry, optional in parse_args(members):
> -        if optional:
> -            ret += mcgen('''
> +    if optional:
> +        ret += mcgen('''

(sometimes it's annoying that python requires indentation changes when
changing control flow; for C files, sometimes I split a patch that
merely adds {} and indentation from another patch that changes logic, so
that the logic change isn't drowned by the reindentation)

> +class QAPISchemaGenTypeVisitor(QAPISchemaVisitor):
> +    def __init__(self):
> +        self.decl = None
> +        self.defn = None
> +        self.fwdecl = None
> +        self.fwdefn = None
> +        self.btin = None
> +    def visit_begin(self):
> +        self.decl = ''
> +        self.defn = ''
> +        self.fwdecl = ''
> +        self.fwdefn = ''
> +        self.btin = guardstart('QAPI_TYPES_BUILTIN')
> +    def visit_end(self):
> +        self.decl = self.fwdecl + self.decl
> +        self.fwdecl = None
> +        self.defn = self.fwdefn + self.defn
> +        self.fwdefn = None
> +        # To avoid header dependency hell, we always generate
> +        # declarations for built-in types in our header files and
> +        # simply guard them.
> +        self.btin += guardend('QAPI_TYPES_BUILTIN')
> +        self.decl = self.btin + self.decl
> +        self.btin = None
> +        # Doesn't work for cases where we link in multiple objects
> +        # that have the functions defined, so generate them only with
> +        # option -b (do_builtins).

Does this comment...

> +    def _gen_type_cleanup(self, name):
> +        self.decl += generate_type_cleanup_decl(name)
> +        self.defn += generate_type_cleanup(name)
> +    def visit_enum_type(self, name, info, values):
> +        self.fwdecl += generate_enum(name, values)
> +        self.fwdefn += generate_enum_lookup(name, values)
> +    def visit_array_type(self, name, info, element_type):
> +        if isinstance(element_type, QAPISchemaBuiltinType):
> +            self.btin += gen_fwd_object_or_array(name)
> +            self.btin += gen_array(name, element_type)
> +            self.btin += generate_type_cleanup_decl(name)
> +            if do_builtins:
> +                self.defn += generate_type_cleanup(name)

...fit better here?

> +        else:
> +            self.fwdecl += gen_fwd_object_or_array(name)
> +            self.decl += gen_array(name, element_type)
> +            self._gen_type_cleanup(name)
> +    def visit_object_type(self, name, info, base, members, variants):
> +        if info:

So to make sure I understand, this ensures that implicit types (those
handled merely by parameters in a function or as members of a struct)
have no declarations required in the -types files.

> +            self.fwdecl += gen_fwd_object_or_array(name)
> +            if variants:
> +                self.decl += gen_union(name, base, variants)

Is it worth 'assert not members' at this point?

> +            else:
> +                self.decl += gen_struct(name, base, members)
> +            self._gen_type_cleanup(name)
> +    def visit_alternate_type(self, name, info, variants):
> +        self.fwdecl += gen_fwd_object_or_array(name)
> +        self.fwdefn += gen_alternate_qtypes(name, variants)
> +        self.decl += gen_union(name, None, variants)
> +        self.decl += gen_alternate_qtypes_decl(name)
> +        self._gen_type_cleanup(name)
> +

The visitor looks reasonable from inspection review.

>  do_builtins = False
>  
>  (input_file, output_dir, do_c, do_h, prefix, opts) = \
> @@ -325,77 +348,10 @@ fdecl.write(mcgen('''
>  #include <stdint.h>
>  '''))
>  
> -exprs = QAPISchema(input_file).get_exprs()
> -
> -fdecl.write(guardstart("QAPI_TYPES_BUILTIN_STRUCT_DECL"))
> -for typename in builtin_types.keys():
> -    fdecl.write(generate_fwd_builtin(typename))
> -fdecl.write(guardend("QAPI_TYPES_BUILTIN_STRUCT_DECL"))

This is a place where adding sorting in 25/47 (or a separate patch)
would make the impact on generated code from this patch smaller.

> -
> -for expr in exprs:
> -    ret = ""
> -    if expr.has_key('struct'):
> -        ret += generate_fwd_struct(expr['struct'])
> -    elif expr.has_key('enum'):
> -        ret += generate_enum(expr['enum'], expr['data'])
> -        ret += generate_fwd_enum_struct(expr['enum'])
> -        fdef.write(generate_enum_lookup(expr['enum'], expr['data']))
> -    elif expr.has_key('union'):
> -        ret += generate_fwd_struct(expr['union'])
> -        enum_define = discriminator_find_enum_define(expr)
> -        if not enum_define:
> -            ret += generate_enum('%sKind' % expr['union'], expr['data'].keys())
> -            fdef.write(generate_enum_lookup('%sKind' % expr['union'],
> -                                            expr['data'].keys()))
> -    elif expr.has_key('alternate'):
> -        ret += generate_fwd_struct(expr['alternate'])
> -        ret += generate_enum('%sKind' % expr['alternate'], expr['data'].keys())
> -        fdef.write(generate_enum_lookup('%sKind' % expr['alternate'],
> -                                        expr['data'].keys()))

Tweaking the order of _qtypes[] vs. _lookup[] here in a pre-req patch
would also help review of the diff generated by this patch.

> -        fdef.write(generate_alternate_qtypes(expr))
> -    else:
> -        continue
> -    fdecl.write(ret)
> -
> -# to avoid header dependency hell, we always generate declarations
> -# for built-in types in our header files and simply guard them
> -fdecl.write(guardstart("QAPI_TYPES_BUILTIN_CLEANUP_DECL"))
> -for typename in builtin_types.keys():
> -    fdecl.write(generate_type_cleanup_decl(typename + "List"))
> -fdecl.write(guardend("QAPI_TYPES_BUILTIN_CLEANUP_DECL"))
> -
> -# ...this doesn't work for cases where we link in multiple objects that
> -# have the functions defined, so we use -b option to provide control
> -# over these cases
> -if do_builtins:
> -    for typename in builtin_types.keys():
> -        fdef.write(generate_type_cleanup(typename + "List"))

Another iteration worth sorting in an earlier patch.

> -
> -for expr in exprs:
> -    ret = ""
> -    if expr.has_key('struct'):
> -        ret += generate_struct(expr) + "\n"
> -        ret += generate_type_cleanup_decl(expr['struct'] + "List")
> -        fdef.write(generate_type_cleanup(expr['struct'] + "List"))
> -        ret += generate_type_cleanup_decl(expr['struct'])
> -        fdef.write(generate_type_cleanup(expr['struct']))

I think part of the large size of the generated diff comes from whether
array fooList types are emitted in a different order via the visitor.

> -    elif expr.has_key('union'):
> -        ret += generate_union(expr, 'union') + "\n"
> -        ret += generate_type_cleanup_decl(expr['union'] + "List")
> -        fdef.write(generate_type_cleanup(expr['union'] + "List"))
> -        ret += generate_type_cleanup_decl(expr['union'])
> -        fdef.write(generate_type_cleanup(expr['union']))
> -    elif expr.has_key('alternate'):
> -        ret += generate_union(expr, 'alternate') + "\n"
> -        ret += generate_type_cleanup_decl(expr['alternate'] + "List")
> -        fdef.write(generate_type_cleanup(expr['alternate'] + "List"))
> -        ret += generate_type_cleanup_decl(expr['alternate'])
> -        fdef.write(generate_type_cleanup(expr['alternate']))
> -    elif expr.has_key('enum'):
> -        ret += "\n" + generate_type_cleanup_decl(expr['enum'] + "List")
> -        fdef.write(generate_type_cleanup(expr['enum'] + "List"))
> -    else:
> -        continue
> -    fdecl.write(ret)
> +schema = QAPISchema(input_file)
> +gen = QAPISchemaGenTypeVisitor()
> +schema.visit(gen)
> +fdef.write(gen.defn)
> +fdecl.write(gen.decl)
>  
>  close_output(fdef, fdecl)
> diff --git a/tests/qapi-schema/qapi-schema-test.json b/tests/qapi-schema/qapi-schema-test.json
> index a9e5aab..257b4d4 100644
> --- a/tests/qapi-schema/qapi-schema-test.json
> +++ b/tests/qapi-schema/qapi-schema-test.json
> @@ -39,8 +39,8 @@
>    'data': { 'value1' : 'UserDefA',
>              'value2' : 'UserDefB',
>              'value3' : 'UserDefB' } }
> -# FIXME generated struct UserDefFlatUnion has members for direct base
> -# UserDefUnionBase, but lacks members for indirect base UserDefZero
> +# FIXME generated visit_type_UserDefFlatUnion_fields() fails to visit
> +# members of indirect base UserDefZero
>  
>  { 'struct': 'UserDefUnionBase',
>    'base': 'UserDefZero',
> 

Overall, moving in the right direction.
Eric Blake July 22, 2015, 8:07 p.m. UTC | #2
On 07/22/2015 11:34 AM, Eric Blake wrote:

> 
> -typedef struct int32List {
> +typedef struct boolList boolList;
> +
> +struct boolList {
> 
> I can understand the minor change in #ifdef name, and even the
> separation between typedef and struct declaration (even though the old
> approach of doing it all at once works).

Actually, I don't see any clear-cut policy on the matter (all HACKING
says is that "Typedefs are used to eliminate the redundant 'struct'
keyword", but does not say how aggressively they must be used).  In
fact, it looks like most of qemu.git prefers the more compact
representation, while your patch goes to the less-common one:

$ git grep 'typedef struct .* {' | wc
   1370    5549   71105
$ git grep 'typedef struct .*;' | wc
    624    2515   41449

However, one thing that your style has going for it, that was not
possible with the compact version, is to use:

 typedef struct intList intList;
 struct intList {
     union {
         int64_t value;
         uint64_t padding;
     };
-    struct intList *next;
+    intList *next;
 };

that is, because your typedef is separate, you have the option of being
able to drop the use of 'struct' from within the actual intList struct.
 Again, if we want that, it would be better to do that as a separate commit.
Markus Armbruster July 27, 2015, 3:59 p.m. UTC | #3
Eric Blake <eblake@redhat.com> writes:

> On 07/01/2015 02:22 PM, Markus Armbruster wrote:
>> Fixes flat unions to get the base's base members.  Test case is from
>> commit 2fc0043, in qapi-schema-test.json:
>> 
>
>> 
>> Flat union visitors remain broken.  They'll be fixed next.
>
> Sadly, the generated files had a huge diffstat, making it very hard to
> determine if this change is sane:
>
>  qapi-types.c                        | 1412 ++++++-------
>  qapi-types.h                        | 3705 ++++++++++++++++++++----------------
>  qga/qapi-generated/qga-qapi-types.c |  110 -
>  qga/qapi-generated/qga-qapi-types.h |  542 +++--
>  4 files changed, 3208 insertions(+), 2561 deletions(-)
>
> At first, it looks easy, as in:
>
> qapi-types.c:
> +const int BlockdevRef_qtypes[QTYPE_MAX] = {
> ...
>  const char *const BlockdevRefKind_lookup[] = {
> ...
> -const int BlockdevRef_qtypes[QTYPE_MAX] = {
> ...
>
> (that is, the new visitor outputs the two arrays in a different order -
> I can live with that).

The old code generates enum lookup tables for implicit union or
alternate enumerations from within the union or alternate code.

The new code doesn't, because it makes all these enums full-fledged
types, so the ordinary enum generator does everything we need.

And that's why generating stuff in alphabetical doesn't help here: the
new types get inserted in the sorted sequence, but that's not where the
old code generated the implicit enum stuff.

> Then it gets a bit crazy when using normal diff algorithms:
>
> -void qapi_free_int32List(int32List *obj)
> +void qapi_free_ACPIOSTInfo(ACPIOSTInfo *obj)
>
> where declarations are rearranged (so the previous commit, which
> attempted to sort declarations to make this one easier, did not quite
> succeed).

I found that quite disappointing myself, but I didn't have better ideas
at the time.

> Even in qapi-types.h things were rearranged:
>
>
> -#ifndef QAPI_TYPES_BUILTIN_STRUCT_DECL
> -#define QAPI_TYPES_BUILTIN_STRUCT_DECL
> +#ifndef QAPI_TYPES_BUILTIN
> +#define QAPI_TYPES_BUILTIN
>
>
> -typedef struct int32List {
> +typedef struct boolList boolList;
> +
> +struct boolList {
>
> I can understand the minor change in #ifdef name, and even the
> separation between typedef and struct declaration (even though the old
> approach of doing it all at once works).

Artifact of my attempt to DRY: I sometimes need the typedef in one place
(say .h) and the actual struct in another place (say .c).  I make do
with the functions generating that even when I want both of them in the
same place.

>                                           But the overall diff in the
> generated files would be easier to review if done in stages, and either
> make 25/47 sort closer to what this patch does, or add yet more
> prerequisite patches that further tweak sorting.  Ideally, this patch
> will be a lot easier to review if the generated code is much closer to
> the pre-patch version (that is, separating sorting changes from other
> changes).  On the other hand, yes, it's kind of a pain to patch old code
> to do things differently just before throwing it away with the new code
> in its place,

Especially when you're throwing away the old code because you can't
stand working with it :)

>               so it becomes a judgment call of how confident we want to
> be that the new implementation isn't breaking things.

Different idea: I hack up a script to split file FOO into one part per
top-level syntactical construct (typedef, function, whatever), and store
each part in a file named like NUM-NAME-FOO.  The script will most
likely be hairy, so use diff FOO <(cat *-FOO) to verify it works.  Then
diff old/*-NAME-FOO new/*-NAME-FOO to see what changed.

Writing the script will take non-trivial effort, but I'll try if you
think it's needed.

For what it's worth, I sanity-checked with my "diff sorted" trick
mentioned in my reply to your review of PATCH 25.

> I was able to make a bit more sense of things by using git's 'diff
> patience' algorithm, which showed things more as code motion rather than
> one-or-two-line changes to lots of common boilerplate, but even that
> diffstat is still big:
>
>  qapi-types1.c | 2030 +++++++++++++++----------------
>  qapi-types1.h | 3707 +++++++++++++++++++++++++++++++++------------------------
>  2 files changed, 3148 insertions(+), 2589 deletions(-)
>
> That said, I'll still at least review this code by inspection, and
> things still compile fine, so although I'm reluctant to give R-b while
> the patch is in RFC stage (because I didn't want to take the time to be
> certain the results are the same amidst so much churn), they are mostly
> sane.

Fair enough.

>> Signed-off-by: Markus Armbruster <armbru@redhat.com>
>> ---
>>  scripts/qapi-types.py | 268 +++++++++++++-------------------
>>  tests/qapi-schema/qapi-schema-test.json |   4 +-
>>  2 files changed, 114 insertions(+), 158 deletions(-)
>> 
>> diff --git a/scripts/qapi-types.py b/scripts/qapi-types.py
>> index a48ad9c..d6185c6 100644
>> --- a/scripts/qapi-types.py
>> +++ b/scripts/qapi-types.py
>> @@ -2,88 +2,67 @@
>>  # QAPI types generator
>>  #
>>  # Copyright IBM, Corp. 2011
>> +# Copyright (c) 2013-2015 Red Hat Inc.
>>  #
>>  # Authors:
>>  #  Anthony Liguori <aliguori@us.ibm.com>
>> +#  Markus Armbruster <armbru@redhat.com>
>>  #
>>  # This work is licensed under the terms of the GNU GPL, version 2.
>>  # See the COPYING file in the top-level directory.
>>  
>> -from ordereddict import OrderedDict
>>  from qapi import *
>>  
>> -def generate_fwd_builtin(name):
>> -    return mcgen('''
>> -
>> -typedef struct %(name)sList {
>
> For example, the change of splitting this into:
>
> typedef struct %(name)sList;
>
> struct %(name)sList {
>
> done as a separate patch would make the generated diff of this patch
> smaller.

I'm willing to mess with the old code *locally* to reduce the diffs, if
that helps.

>> -def generate_struct_fields(members):
>> +def gen_struct_field(name, typ, optional):
>>      ret = ''
>>  
>> -    for argname, argentry, optional in parse_args(members):
>> -        if optional:
>> -            ret += mcgen('''
>> +    if optional:
>> +        ret += mcgen('''
>
> (sometimes it's annoying that python requires indentation changes when
> changing control flow; for C files, sometimes I split a patch that
> merely adds {} and indentation from another patch that changes logic, so
> that the logic change isn't drowned by the reindentation)

I sometimes use C-c C-w in diff-mode to view a patch hunk without its
whitespace changes.

>> +class QAPISchemaGenTypeVisitor(QAPISchemaVisitor):
>> +    def __init__(self):
>> +        self.decl = None
>> +        self.defn = None
>> +        self.fwdecl = None
>> +        self.fwdefn = None
>> +        self.btin = None
>> +    def visit_begin(self):
>> +        self.decl = ''
>> +        self.defn = ''
>> +        self.fwdecl = ''
>> +        self.fwdefn = ''
>> +        self.btin = guardstart('QAPI_TYPES_BUILTIN')
>> +    def visit_end(self):
>> +        self.decl = self.fwdecl + self.decl
>> +        self.fwdecl = None
>> +        self.defn = self.fwdefn + self.defn
>> +        self.fwdefn = None
>> +        # To avoid header dependency hell, we always generate
>> +        # declarations for built-in types in our header files and
>> +        # simply guard them.
>> +        self.btin += guardend('QAPI_TYPES_BUILTIN')
>> +        self.decl = self.btin + self.decl
>> +        self.btin = None
>> +        # Doesn't work for cases where we link in multiple objects
>> +        # that have the functions defined, so generate them only with
>> +        # option -b (do_builtins).
>
> Does this comment...
>
>> +    def _gen_type_cleanup(self, name):
>> +        self.decl += generate_type_cleanup_decl(name)
>> +        self.defn += generate_type_cleanup(name)
>> +    def visit_enum_type(self, name, info, values):
>> +        self.fwdecl += generate_enum(name, values)
>> +        self.fwdefn += generate_enum_lookup(name, values)
>> +    def visit_array_type(self, name, info, element_type):
>> +        if isinstance(element_type, QAPISchemaBuiltinType):
>> +            self.btin += gen_fwd_object_or_array(name)
>> +            self.btin += gen_array(name, element_type)
>> +            self.btin += generate_type_cleanup_decl(name)
>> +            if do_builtins:
>> +                self.defn += generate_type_cleanup(name)
>
> ...fit better here?

If I rephrase it, probably yes.

>> +        else:
>> +            self.fwdecl += gen_fwd_object_or_array(name)
>> +            self.decl += gen_array(name, element_type)
>> +            self._gen_type_cleanup(name)
>> +    def visit_object_type(self, name, info, base, members, variants):
>> +        if info:
>
> So to make sure I understand, this ensures that implicit types (those
> handled merely by parameters in a function or as members of a struct)
> have no declarations required in the -types files.

Yes, (not info) means the object type is implicit.  We create implicit
object types for command parameters, command returns, and event
paramaters that aren't type names.

We don't generate C for these types, they're purely internal.  We
generate *parameter lists* for them.

Hmm, that's actually not true for command returns.  And current master
duly bombs if I add such a command to qapi-schema-test.json:

    { 'command': 'user_def_cmd4', 'returns': { 'a': 'int' } }

      GEN   tests/test-qmp-commands.h
    Traceback (most recent call last):
      File "/work/armbru/qemu/scripts/qapi-commands.py", line 360, in <module>
        ret = generate_command_decl(cmd['command'], arglist, ret_type) + "\n"
      File "/work/armbru/qemu/scripts/qapi-commands.py", line 29, in generate_command_decl
        ret_type=c_type(ret_type), name=c_name(name),
      File "/work/armbru/qemu/scripts/qapi.py", line 924, in c_type
        assert isinstance(value, str) and value != ""
    AssertionError

The easy fix is to outlaw it.

>> +            self.fwdecl += gen_fwd_object_or_array(name)
>> +            if variants:
>> +                self.decl += gen_union(name, base, variants)
>
> Is it worth 'assert not members' at this point?

Yes, with a TODO, because it's an artificial restriction I'd like to
lift some day.

>> +            else:
>> +                self.decl += gen_struct(name, base, members)
>> +            self._gen_type_cleanup(name)
>> +    def visit_alternate_type(self, name, info, variants):
>> +        self.fwdecl += gen_fwd_object_or_array(name)
>> +        self.fwdefn += gen_alternate_qtypes(name, variants)
>> +        self.decl += gen_union(name, None, variants)
>> +        self.decl += gen_alternate_qtypes_decl(name)
>> +        self._gen_type_cleanup(name)
>> +
>
> The visitor looks reasonable from inspection review.
>
>>  do_builtins = False
>>  
>>  (input_file, output_dir, do_c, do_h, prefix, opts) = \
>> @@ -325,77 +348,10 @@ fdecl.write(mcgen('''
>>  #include <stdint.h>
>>  '''))
>>  
>> -exprs = QAPISchema(input_file).get_exprs()
>> -
>> -fdecl.write(guardstart("QAPI_TYPES_BUILTIN_STRUCT_DECL"))
>> -for typename in builtin_types.keys():
>> -    fdecl.write(generate_fwd_builtin(typename))
>> -fdecl.write(guardend("QAPI_TYPES_BUILTIN_STRUCT_DECL"))
>
> This is a place where adding sorting in 25/47 (or a separate patch)
> would make the impact on generated code from this patch smaller.

You mean iterating over sorted(builtin_types.keys())?  I can try and see
whether it helps.

>> -
>> -for expr in exprs:
>> -    ret = ""
>> -    if expr.has_key('struct'):
>> -        ret += generate_fwd_struct(expr['struct'])
>> -    elif expr.has_key('enum'):
>> -        ret += generate_enum(expr['enum'], expr['data'])
>> -        ret += generate_fwd_enum_struct(expr['enum'])
>> -        fdef.write(generate_enum_lookup(expr['enum'], expr['data']))
>> -    elif expr.has_key('union'):
>> -        ret += generate_fwd_struct(expr['union'])
>> -        enum_define = discriminator_find_enum_define(expr)
>> -        if not enum_define:
>> - ret += generate_enum('%sKind' % expr['union'],
>> expr['data'].keys())
>> -            fdef.write(generate_enum_lookup('%sKind' % expr['union'],
>> -                                            expr['data'].keys()))
>> -    elif expr.has_key('alternate'):
>> -        ret += generate_fwd_struct(expr['alternate'])
>> - ret += generate_enum('%sKind' % expr['alternate'],
>> expr['data'].keys())
>> -        fdef.write(generate_enum_lookup('%sKind' % expr['alternate'],
>> -                                        expr['data'].keys()))
>
> Tweaking the order of _qtypes[] vs. _lookup[] here in a pre-req patch
> would also help review of the diff generated by this patch.

Can try and see.

>> -        fdef.write(generate_alternate_qtypes(expr))
>> -    else:
>> -        continue
>> -    fdecl.write(ret)
>> -
>> -# to avoid header dependency hell, we always generate declarations
>> -# for built-in types in our header files and simply guard them
>> -fdecl.write(guardstart("QAPI_TYPES_BUILTIN_CLEANUP_DECL"))
>> -for typename in builtin_types.keys():
>> -    fdecl.write(generate_type_cleanup_decl(typename + "List"))
>> -fdecl.write(guardend("QAPI_TYPES_BUILTIN_CLEANUP_DECL"))
>> -
>> -# ...this doesn't work for cases where we link in multiple objects that
>> -# have the functions defined, so we use -b option to provide control
>> -# over these cases
>> -if do_builtins:
>> -    for typename in builtin_types.keys():
>> -        fdef.write(generate_type_cleanup(typename + "List"))
>
> Another iteration worth sorting in an earlier patch.

Can try and see.

>> -
>> -for expr in exprs:
>> -    ret = ""
>> -    if expr.has_key('struct'):
>> -        ret += generate_struct(expr) + "\n"
>> -        ret += generate_type_cleanup_decl(expr['struct'] + "List")
>> -        fdef.write(generate_type_cleanup(expr['struct'] + "List"))
>> -        ret += generate_type_cleanup_decl(expr['struct'])
>> -        fdef.write(generate_type_cleanup(expr['struct']))
>
> I think part of the large size of the generated diff comes from whether
> array fooList types are emitted in a different order via the visitor.

That's harder to avoid.  The old code generates them from within the
code generating the element type.  The new code doesn't, because it
makes all array types full-fledged types, which get visited like any
other type.

>> -    elif expr.has_key('union'):
>> -        ret += generate_union(expr, 'union') + "\n"
>> -        ret += generate_type_cleanup_decl(expr['union'] + "List")
>> -        fdef.write(generate_type_cleanup(expr['union'] + "List"))
>> -        ret += generate_type_cleanup_decl(expr['union'])
>> -        fdef.write(generate_type_cleanup(expr['union']))
>> -    elif expr.has_key('alternate'):
>> -        ret += generate_union(expr, 'alternate') + "\n"
>> -        ret += generate_type_cleanup_decl(expr['alternate'] + "List")
>> -        fdef.write(generate_type_cleanup(expr['alternate'] + "List"))
>> -        ret += generate_type_cleanup_decl(expr['alternate'])
>> -        fdef.write(generate_type_cleanup(expr['alternate']))
>> -    elif expr.has_key('enum'):
>> -        ret += "\n" + generate_type_cleanup_decl(expr['enum'] + "List")
>> -        fdef.write(generate_type_cleanup(expr['enum'] + "List"))
>> -    else:
>> -        continue
>> -    fdecl.write(ret)
>> +schema = QAPISchema(input_file)
>> +gen = QAPISchemaGenTypeVisitor()
>> +schema.visit(gen)
>> +fdef.write(gen.defn)
>> +fdecl.write(gen.decl)
>>  
>>  close_output(fdef, fdecl)
>> diff --git a/tests/qapi-schema/qapi-schema-test.json b/tests/qapi-schema/qapi-schema-test.json
>> index a9e5aab..257b4d4 100644
>> --- a/tests/qapi-schema/qapi-schema-test.json
>> +++ b/tests/qapi-schema/qapi-schema-test.json
>> @@ -39,8 +39,8 @@
>>    'data': { 'value1' : 'UserDefA',
>>              'value2' : 'UserDefB',
>>              'value3' : 'UserDefB' } }
>> -# FIXME generated struct UserDefFlatUnion has members for direct base
>> -# UserDefUnionBase, but lacks members for indirect base UserDefZero
>> +# FIXME generated visit_type_UserDefFlatUnion_fields() fails to visit
>> +# members of indirect base UserDefZero
>>  
>>  { 'struct': 'UserDefUnionBase',
>>    'base': 'UserDefZero',
>> 
>
> Overall, moving in the right direction.

Thanks!
Eric Blake July 29, 2015, 11:11 p.m. UTC | #4
On 07/01/2015 02:22 PM, Markus Armbruster wrote:
> Fixes flat unions to get the base's base members.  Test case is from
> commit 2fc0043, in qapi-schema-test.json:
> 

> -def generate_alternate_qtypes(expr):
> +def gen_alternate_qtypes_decl(name):
> +    return mcgen('''
>  
> -    name = expr['alternate']
> -    members = expr['data']
> +extern const int %(c_name)s_qtypes[];
> +''',
> +                 c_name=c_name(name))
>  
> +def gen_alternate_qtypes(name, variants):
>      ret = mcgen('''
>  
>  const int %(name)s_qtypes[QTYPE_MAX] = {
>  ''',
>                  name=c_name(name))
>  
> -    for key in members:
> -        qtype = find_alternate_member_qtype(members[key])
> -        assert qtype, "Invalid alternate member"
> +    for var in variants.variants:
> +        qtype = var.type.alternate_qtype()
> +        assert qtype

I think I found a couple more corner case bugs here. We are using C99
initialization of the array; for example:

const int BlockdevRef_qtypes[QTYPE_MAX] = {
    [QTYPE_QDICT] = BLOCKDEV_REF_KIND_DEFINITION,
    [QTYPE_QSTRING] = BLOCKDEV_REF_KIND_REFERENCE,
};

but paired with an enum that starts at 0:

typedef enum BlockdevRefKind {
    BLOCKDEV_REF_KIND_DEFINITION = 0,
    BLOCKDEV_REF_KIND_REFERENCE = 1,
    BLOCKDEV_REF_KIND_MAX = 2,
} BlockdevRefKind;


and that means that every QTYPE_ constant that we don't specify in
_qtypes[] is also assigned the value 0 (aka BLOCKDEV_REF_KIND_DEFINITION
in this example).  In operation, calling something like:

{"execute":"blockdev-add","arguments":{"options":
 {"driver":"raw","id":"a","file":true}}}

which is invalid per the .json description ("file" must be string or
object, not boolean), still manages to get past visit_get_next_type()
with success, and fall through to the 0th branch of the switch.  If that
0th branch happens to be a struct (as it is for BlockdevRef), then we
fortunately catch the error on the very next parse call, where
qmp_input_start_struct() complains:

{"error": {"class": "GenericError", "desc": "Invalid parameter type for
'file', expected: QDict"}}

But what happens if the 0th branch is mapped to a different parser, as
would be the case if one of the alternate's branches is 'number'?  In
particular, qmp_input_type_number() accepts BOTH QFloat and QInt types.
 So, if we have this qapi:
 { 'alternate': 'Foo', 'data': { 'a': 'str', 'b': 'number' } }
but pass in an integer, visit_get_next_type() will see a qtype of QInt,
but Foo_qtypes[QTYPE_QINT] will be 0 (due to default initialization) and
we will wrongly try to visit the 0th branch (FOO_KIND_A) and fail (the
string parser doesn't like ints) even though the parse should succeed by
using the FOO_KIND_B branch.

Interestingly, this means that if we ever write an alternate type that
accepts both 'int' and 'number' (we have not attempted that so far),
then the number branch will only be taken for inputs that don't also
look like ints (normally, 'number' accepts anything numeric). Maybe that
means we should document and enforce that 'number' and 'int' cannot be
mixed in the same alternate?

So, the bugs are: visit_get_next_type() can't tell the difference
between a *_qtypes[] lookup that was explicitly initialized to 0 from
one that was accidentally left that way, and therefore can't report
failure for an unexpected type (but mostly mitigated by the fact that
always returning 0 means the parser will attempt to parse the first
branch of the alternate and gracefully fail); and that we don't properly
handle QInt for an alternate that accepts 'number' but not 'int'.

I don't think either bug has to be fixed in your series, although you
may want to add tests.

The first bug could be resolved by guaranteeing that the _qtypes[] array
has non-zero values for the explicitly initialized lookups, and teaching
visit_get_next_type() that a lookup that produces 0 meant that an
unexpected type was encountered.  Perhaps by changing the creation of
_qtypes[] in qapi-types.c to list:

const int BlockdevRef_qtypes[QTYPE_MAX] = {
  [QTYPE_QDICT] = BLOCKDEV_REF_KIND_DEFINITION + 1,
  [QTYPE_QSTRING] = BLOCKDEV_REF_KIND_REFERENCE + 1,
};

and then having visit_get_next_type() subtract one after verifying a
non-zero value was looked up.  Or perhaps leave _qtypes alone, and
instead change the alternate enum to have a placeholder at 0:

typedef enum BlockdevRefKind {
    BLOCKDEV_REF_KIND_INVALID = 0,
    BLOCKDEV_REF_KIND_DEFINITION = 1,
    BLOCKDEV_REF_KIND_REFERENCE = 2,
    BLOCKDEV_REF_KIND_MAX = 3,
} BlockdevRefKind;

and then teaching the generator for visit_type_BlockdevRef() to emit an
error if branch 0 is hit.

Fixing the second bug probably entails teaching the generator that if an
alternate contains 'number' but not 'int', then we need [QTYPE_QINT] to
map to the same lookup value as [QTYPE_QNUMBER].
Markus Armbruster July 30, 2015, 6:42 a.m. UTC | #5
Eric Blake <eblake@redhat.com> writes:

> On 07/01/2015 02:22 PM, Markus Armbruster wrote:
>> Fixes flat unions to get the base's base members.  Test case is from
>> commit 2fc0043, in qapi-schema-test.json:
>> 
>
>> -def generate_alternate_qtypes(expr):
>> +def gen_alternate_qtypes_decl(name):
>> +    return mcgen('''
>>  
>> -    name = expr['alternate']
>> -    members = expr['data']
>> +extern const int %(c_name)s_qtypes[];
>> +''',
>> +                 c_name=c_name(name))
>>  
>> +def gen_alternate_qtypes(name, variants):
>>      ret = mcgen('''
>>  
>>  const int %(name)s_qtypes[QTYPE_MAX] = {
>>  ''',
>>                  name=c_name(name))
>>  
>> -    for key in members:
>> -        qtype = find_alternate_member_qtype(members[key])
>> -        assert qtype, "Invalid alternate member"
>> +    for var in variants.variants:
>> +        qtype = var.type.alternate_qtype()
>> +        assert qtype
>
> I think I found a couple more corner case bugs here. We are using C99
> initialization of the array; for example:
>
> const int BlockdevRef_qtypes[QTYPE_MAX] = {
>     [QTYPE_QDICT] = BLOCKDEV_REF_KIND_DEFINITION,
>     [QTYPE_QSTRING] = BLOCKDEV_REF_KIND_REFERENCE,
> };
>
> but paired with an enum that starts at 0:
>
> typedef enum BlockdevRefKind {
>     BLOCKDEV_REF_KIND_DEFINITION = 0,
>     BLOCKDEV_REF_KIND_REFERENCE = 1,
>     BLOCKDEV_REF_KIND_MAX = 2,
> } BlockdevRefKind;
>
>
> and that means that every QTYPE_ constant that we don't specify in
> _qtypes[] is also assigned the value 0 (aka BLOCKDEV_REF_KIND_DEFINITION

I see where this is going...

> in this example).  In operation, calling something like:
>
> {"execute":"blockdev-add","arguments":{"options":
>  {"driver":"raw","id":"a","file":true}}}
>
> which is invalid per the .json description ("file" must be string or
> object, not boolean), still manages to get past visit_get_next_type()
> with success, and fall through to the 0th branch of the switch.  If that
> 0th branch happens to be a struct (as it is for BlockdevRef), then we
> fortunately catch the error on the very next parse call, where
> qmp_input_start_struct() complains:
>
> {"error": {"class": "GenericError", "desc": "Invalid parameter type for
> 'file', expected: QDict"}}

A lucky case.

> But what happens if the 0th branch is mapped to a different parser, as
> would be the case if one of the alternate's branches is 'number'?  In
> particular, qmp_input_type_number() accepts BOTH QFloat and QInt types.
>  So, if we have this qapi:
>  { 'alternate': 'Foo', 'data': { 'a': 'str', 'b': 'number' } }
> but pass in an integer, visit_get_next_type() will see a qtype of QInt,
> but Foo_qtypes[QTYPE_QINT] will be 0 (due to default initialization) and
> we will wrongly try to visit the 0th branch (FOO_KIND_A) and fail (the
> string parser doesn't like ints) even though the parse should succeed by
> using the FOO_KIND_B branch.

Yup, bug.

> Interestingly, this means that if we ever write an alternate type that
> accepts both 'int' and 'number' (we have not attempted that so far),
> then the number branch will only be taken for inputs that don't also
> look like ints (normally, 'number' accepts anything numeric). Maybe that
> means we should document and enforce that 'number' and 'int' cannot be
> mixed in the same alternate?

Even if we outlaw mixing the two, I'm afraid we still have this bug: an
alternate with a 'number' member rejects input that gets parsed as
QTYPE_QINT.

Let's simply make alternates behave sanely:

    alternate has      case selected for
    'int'  'number'    QTYPE_QINT  QTYPE_QFLOAT
      no        no     error       error
      no       yes     'number'    'number'
     yes        no     'int'       error
     yes       yes     'int'       'number'

> So, the bugs are: visit_get_next_type() can't tell the difference
> between a *_qtypes[] lookup that was explicitly initialized to 0 from
> one that was accidentally left that way, and therefore can't report
> failure for an unexpected type (but mostly mitigated by the fact that
> always returning 0 means the parser will attempt to parse the first
> branch of the alternate and gracefully fail);

Yes.

>                                               and that we don't properly
> handle QInt for an alternate that accepts 'number' but not 'int'.

Yes.

> I don't think either bug has to be fixed in your series, although you
> may want to add tests.

Agree.

> The first bug could be resolved by guaranteeing that the _qtypes[] array
> has non-zero values for the explicitly initialized lookups, and teaching
> visit_get_next_type() that a lookup that produces 0 meant that an
> unexpected type was encountered.  Perhaps by changing the creation of
> _qtypes[] in qapi-types.c to list:
>
> const int BlockdevRef_qtypes[QTYPE_MAX] = {
>   [QTYPE_QDICT] = BLOCKDEV_REF_KIND_DEFINITION + 1,
>   [QTYPE_QSTRING] = BLOCKDEV_REF_KIND_REFERENCE + 1,
> };
>
> and then having visit_get_next_type() subtract one after verifying a
> non-zero value was looked up.

+ 1 works, because the element type is int, not BlockdevRefKind.  It's
int so it can serve as argument for visit_get_next_type()'s parameter
const int *qtypes.

The + 1, - 1 business could be mildly confusing.  We could set all
unused elements to -1 instead:

    const int BlockdevRef_qtypes[QTYPE_MAX] = {
        [QTYPE_NONE] = -1,
        [QTYPE_QNULL] = -1,
        [QTYPE_QINT] = -1,
        [QTYPE_QSTRING] = BLOCKDEV_REF_KIND_REFERENCE + 1,
        [QTYPE_QDICT] = BLOCKDEV_REF_KIND_DEFINITION + 1,
        [QTYPE_QLIST] = -1,
        [QTYPE_QFLOAT] = -1,
        [QTYPE_QBOOL] = -1,
    };

I wouldn't recommend that in hand-written code, but generating it should
be fine.

>                                Or perhaps leave _qtypes alone, and
> instead change the alternate enum to have a placeholder at 0:
>
> typedef enum BlockdevRefKind {
>     BLOCKDEV_REF_KIND_INVALID = 0,
>     BLOCKDEV_REF_KIND_DEFINITION = 1,
>     BLOCKDEV_REF_KIND_REFERENCE = 2,
>     BLOCKDEV_REF_KIND_MAX = 3,
> } BlockdevRefKind;
>
> and then teaching the generator for visit_type_BlockdevRef() to emit an
> error if branch 0 is hit.

BLOCKDEV_REF_KIND_INVALID could get in the way elsewhere, e.g. with
-Wswitch.

> Fixing the second bug probably entails teaching the generator that if an
> alternate contains 'number' but not 'int', then we need [QTYPE_QINT] to
> map to the same lookup value as [QTYPE_QNUMBER].

Add test eight test cases from my table above, then fix the generator to
make them pass.
Eric Blake July 30, 2015, 12:46 p.m. UTC | #6
On 07/30/2015 12:42 AM, Markus Armbruster wrote:

>> But what happens if the 0th branch is mapped to a different parser, as
>> would be the case if one of the alternate's branches is 'number'?  In
>> particular, qmp_input_type_number() accepts BOTH QFloat and QInt types.
>>  So, if we have this qapi:
>>  { 'alternate': 'Foo', 'data': { 'a': 'str', 'b': 'number' } }
>> but pass in an integer, visit_get_next_type() will see a qtype of QInt,
>> but Foo_qtypes[QTYPE_QINT] will be 0 (due to default initialization) and
>> we will wrongly try to visit the 0th branch (FOO_KIND_A) and fail (the
>> string parser doesn't like ints) even though the parse should succeed by
>> using the FOO_KIND_B branch.
> 
> Yup, bug.

And it's an order-dependent bug - merely declaring 'b' first makes it
appear to work correctly.

> 
>> Interestingly, this means that if we ever write an alternate type that
>> accepts both 'int' and 'number' (we have not attempted that so far),
>> then the number branch will only be taken for inputs that don't also
>> look like ints (normally, 'number' accepts anything numeric). Maybe that
>> means we should document and enforce that 'number' and 'int' cannot be
>> mixed in the same alternate?
> 
> Even if we outlaw mixing the two, I'm afraid we still have this bug: an
> alternate with a 'number' member rejects input that gets parsed as
> QTYPE_QINT.
> 
> Let's simply make alternates behave sanely:
> 
>     alternate has      case selected for
>     'int'  'number'    QTYPE_QINT  QTYPE_QFLOAT
>       no        no     error       error
>       no       yes     'number'    'number'
>      yes        no     'int'       error
>      yes       yes     'int'       'number'

Works for me.


> 
> + 1 works, because the element type is int, not BlockdevRefKind.  It's
> int so it can serve as argument for visit_get_next_type()'s parameter
> const int *qtypes.
> 
> The + 1, - 1 business could be mildly confusing.  We could set all
> unused elements to -1 instead:

Or, we could ditch the qtypes lookup altogether, and merely create the
alternate enum as a non-consecutive QTYPE mapping, for one less level of
indirection, as in:

typedef enum BlockdevRefKind {
    BLOCKDEV_REF_DEFINITION = QTYPE_QOBJECT,
    BLOCKDEV_REF_REFERENCE = QTYPE_QSTRING,
};

then rewrite visit_get_next_type() to directly return the qtype, as well
as rewrite the generated switch statement in visit_type_BlockdevRef() to
directly inspect the qtypes it cares about.  In fact, that's the
approach I'm currently playing with.

> Add test eight test cases from my table above, then fix the generator to
> make them pass.

I hope to post an RFC followup patch along those lines later today.
Markus Armbruster July 30, 2015, 3:53 p.m. UTC | #7
Eric Blake <eblake@redhat.com> writes:

> On 07/30/2015 12:42 AM, Markus Armbruster wrote:
>
>>> But what happens if the 0th branch is mapped to a different parser, as
>>> would be the case if one of the alternate's branches is 'number'?  In
>>> particular, qmp_input_type_number() accepts BOTH QFloat and QInt types.
>>>  So, if we have this qapi:
>>>  { 'alternate': 'Foo', 'data': { 'a': 'str', 'b': 'number' } }
>>> but pass in an integer, visit_get_next_type() will see a qtype of QInt,
>>> but Foo_qtypes[QTYPE_QINT] will be 0 (due to default initialization) and
>>> we will wrongly try to visit the 0th branch (FOO_KIND_A) and fail (the
>>> string parser doesn't like ints) even though the parse should succeed by
>>> using the FOO_KIND_B branch.
>> 
>> Yup, bug.
>
> And it's an order-dependent bug - merely declaring 'b' first makes it
> appear to work correctly.
>
>> 
>>> Interestingly, this means that if we ever write an alternate type that
>>> accepts both 'int' and 'number' (we have not attempted that so far),
>>> then the number branch will only be taken for inputs that don't also
>>> look like ints (normally, 'number' accepts anything numeric). Maybe that
>>> means we should document and enforce that 'number' and 'int' cannot be
>>> mixed in the same alternate?
>> 
>> Even if we outlaw mixing the two, I'm afraid we still have this bug: an
>> alternate with a 'number' member rejects input that gets parsed as
>> QTYPE_QINT.
>> 
>> Let's simply make alternates behave sanely:
>> 
>>     alternate has      case selected for
>>     'int'  'number'    QTYPE_QINT  QTYPE_QFLOAT
>>       no        no     error       error
>>       no       yes     'number'    'number'
>>      yes        no     'int'       error
>>      yes       yes     'int'       'number'
>
> Works for me.
>
>
>> 
>> + 1 works, because the element type is int, not BlockdevRefKind.  It's
>> int so it can serve as argument for visit_get_next_type()'s parameter
>> const int *qtypes.
>> 
>> The + 1, - 1 business could be mildly confusing.  We could set all
>> unused elements to -1 instead:
>
> Or, we could ditch the qtypes lookup altogether, and merely create the
> alternate enum as a non-consecutive QTYPE mapping, for one less level of
> indirection, as in:
>
> typedef enum BlockdevRefKind {
>     BLOCKDEV_REF_DEFINITION = QTYPE_QOBJECT,

QTYPE_QDICT, but I get what you mean.

>     BLOCKDEV_REF_REFERENCE = QTYPE_QSTRING,
> };
>
> then rewrite visit_get_next_type() to directly return the qtype, as well
> as rewrite the generated switch statement in visit_type_BlockdevRef() to
> directly inspect the qtypes it cares about.  In fact, that's the
> approach I'm currently playing with.

Hmm.  The schema currently doesn't let you control enumeration values.
qapi-code-gen.txt specifies:

    The enumeration values [...] are encoded as C enum integral values
    in generated code.  While the C code starts numbering at 0, it is
    better to use explicit comparisons to enum values than implicit
    comparisons to 0; the C code will also include a generated enum
    member ending in _MAX for tracking the size of the enum, useful when
    using common functions for converting between strings and enum
    values.

Strictly speaking, this needn't apply to implicit enums like
BlockdevRefKind.  But is it a good idea to make them different?

Hmm, your new BlockdevRefKind is basically a subset of qtype_code with
the members renamed.  Could we simply use qtype_code directly?

>> Add test eight test cases from my table above, then fix the generator to
>> make them pass.
>
> I hope to post an RFC followup patch along those lines later today.
Eric Blake July 30, 2015, 4:36 p.m. UTC | #8
On 07/30/2015 09:53 AM, Markus Armbruster wrote:

>> Or, we could ditch the qtypes lookup altogether, and merely create the
>> alternate enum as a non-consecutive QTYPE mapping, for one less level of
>> indirection, as in:
>>
>> typedef enum BlockdevRefKind {
>>     BLOCKDEV_REF_DEFINITION = QTYPE_QOBJECT,
> 
> QTYPE_QDICT, but I get what you mean.
> 
>>     BLOCKDEV_REF_REFERENCE = QTYPE_QSTRING,
>> };
>>
>> then rewrite visit_get_next_type() to directly return the qtype, as well
>> as rewrite the generated switch statement in visit_type_BlockdevRef() to
>> directly inspect the qtypes it cares about.  In fact, that's the
>> approach I'm currently playing with.
> 
> Hmm.  The schema currently doesn't let you control enumeration values.

For public enums. But the enum for alternate is never public - you never
send the name of the branch over the wire, so we don't need to stringize
the name anywhere.

> qapi-code-gen.txt specifies:
> 
>     The enumeration values [...] are encoded as C enum integral values
>     in generated code.  While the C code starts numbering at 0, it is
>     better to use explicit comparisons to enum values than implicit
>     comparisons to 0; the C code will also include a generated enum
>     member ending in _MAX for tracking the size of the enum, useful when
>     using common functions for converting between strings and enum
>     values.
> 
> Strictly speaking, this needn't apply to implicit enums like
> BlockdevRefKind.  But is it a good idea to make them different?

I think so, but maybe under a different name (maybe
BLOCKDEV_REF_REFERENCE_QTYPE) to make it more obvious that this is not
the usual generated enum, but special to the alternate.

> 
> Hmm, your new BlockdevRefKind is basically a subset of qtype_code with
> the members renamed.  Could we simply use qtype_code directly?

We could, except that clients that manipulate the generated struct then
have to know the qtype mapping directly; while keeping symbolic names
lets them do 'foo->type = BLOCKDEV_REF_REFERENCE; foo->reference = xyz;'
as a nice visual indicator of which union member within the struct is
being assigned according to the discriminator.

I guess I'll see how much code currently manipulates the generated
structs (I already recall from other patches in this series that
blockdev played a bit loose by  validating that the QMP was okay and
then using QDict for everything else rather than the generated struct)
and make my decision when posting my RFC patch.
Eric Blake July 30, 2015, 9:51 p.m. UTC | #9
On 07/30/2015 10:36 AM, Eric Blake wrote:
> On 07/30/2015 09:53 AM, Markus Armbruster wrote:
> 
>>> Or, we could ditch the qtypes lookup altogether, and merely create the
>>> alternate enum as a non-consecutive QTYPE mapping, for one less level of
>>> indirection, as in:
>>>
>>> typedef enum BlockdevRefKind {
>>>     BLOCKDEV_REF_DEFINITION = QTYPE_QOBJECT,
>>
>> QTYPE_QDICT, but I get what you mean.
>>
>>>     BLOCKDEV_REF_REFERENCE = QTYPE_QSTRING,
>>> };
>>>

>> Hmm, your new BlockdevRefKind is basically a subset of qtype_code with
>> the members renamed.  Could we simply use qtype_code directly?
> 
> We could, except that clients that manipulate the generated struct then
> have to know the qtype mapping directly; while keeping symbolic names
> lets them do 'foo->type = BLOCKDEV_REF_REFERENCE; foo->reference = xyz;'
> as a nice visual indicator of which union member within the struct is
> being assigned according to the discriminator.
> 
> I guess I'll see how much code currently manipulates the generated
> structs (I already recall from other patches in this series that
> blockdev played a bit loose by  validating that the QMP was okay and
> then using QDict for everything else rather than the generated struct)
> and make my decision when posting my RFC patch.

Turns out that using it directly was easier, and less code:
http://thread.gmane.org/gmane.comp.emulators.qemu/353204/focus=354008
diff mbox

Patch

diff --git a/scripts/qapi-types.py b/scripts/qapi-types.py
index a48ad9c..d6185c6 100644
--- a/scripts/qapi-types.py
+++ b/scripts/qapi-types.py
@@ -2,88 +2,67 @@ 
 # QAPI types generator
 #
 # Copyright IBM, Corp. 2011
+# Copyright (c) 2013-2015 Red Hat Inc.
 #
 # Authors:
 #  Anthony Liguori <aliguori@us.ibm.com>
+#  Markus Armbruster <armbru@redhat.com>
 #
 # This work is licensed under the terms of the GNU GPL, version 2.
 # See the COPYING file in the top-level directory.
 
-from ordereddict import OrderedDict
 from qapi import *
 
-def generate_fwd_builtin(name):
-    return mcgen('''
-
-typedef struct %(name)sList {
-    union {
-        %(type)s value;
-        uint64_t padding;
-    };
-    struct %(name)sList *next;
-} %(name)sList;
-''',
-                 type=c_type(name),
-                 name=name)
-
-def generate_fwd_struct(name):
+def gen_fwd_object_or_array(name):
     return mcgen('''
 
 typedef struct %(name)s %(name)s;
-
-typedef struct %(name)sList {
-    union {
-        %(name)s *value;
-        uint64_t padding;
-    };
-    struct %(name)sList *next;
-} %(name)sList;
 ''',
                  name=c_name(name))
 
-def generate_fwd_enum_struct(name):
+def gen_array(name, element_type):
     return mcgen('''
 
-typedef struct %(name)sList {
+struct %(name)s {
     union {
-        %(name)s value;
+        %(c_type)s value;
         uint64_t padding;
     };
-    struct %(name)sList *next;
-} %(name)sList;
+    struct %(name)s *next;
+};
 ''',
-                 name=c_name(name))
+                 name=c_name(name), c_type=element_type.c_type())
 
-def generate_struct_fields(members):
+def gen_struct_field(name, typ, optional):
     ret = ''
 
-    for argname, argentry, optional in parse_args(members):
-        if optional:
-            ret += mcgen('''
+    if optional:
+        ret += mcgen('''
     bool has_%(c_name)s;
 ''',
-                         c_name=c_name(argname))
-        ret += mcgen('''
+                     c_name=c_name(name))
+    ret += mcgen('''
     %(c_type)s %(c_name)s;
 ''',
-                     c_type=c_type(argentry), c_name=c_name(argname))
-
+                 c_type=typ.c_type(), c_name=c_name(name))
     return ret
 
-def generate_struct(expr):
+def generate_struct_fields(members):
+    ret = ''
 
-    structname = expr.get('struct', "")
-    members = expr['data']
-    base = expr.get('base')
+    for memb in members:
+        ret += gen_struct_field(memb.name, memb.type, memb.optional)
+    return ret
 
+def gen_struct(name, base, members):
     ret = mcgen('''
 
 struct %(name)s {
 ''',
-          name=c_name(structname))
+                name=c_name(name))
 
     if base:
-        ret += generate_struct_fields({'base': base})
+        ret += gen_struct_field('base', base, False)
 
     ret += generate_struct_fields(members)
 
@@ -156,46 +135,38 @@  typedef enum %(name)s {
 
     return enum_decl + lookup_decl
 
-def generate_alternate_qtypes(expr):
+def gen_alternate_qtypes_decl(name):
+    return mcgen('''
 
-    name = expr['alternate']
-    members = expr['data']
+extern const int %(c_name)s_qtypes[];
+''',
+                 c_name=c_name(name))
 
+def gen_alternate_qtypes(name, variants):
     ret = mcgen('''
 
 const int %(name)s_qtypes[QTYPE_MAX] = {
 ''',
                 name=c_name(name))
 
-    for key in members:
-        qtype = find_alternate_member_qtype(members[key])
-        assert qtype, "Invalid alternate member"
+    for var in variants.variants:
+        qtype = var.type.alternate_qtype()
+        assert qtype
 
         ret += mcgen('''
     [%(qtype)s] = %(enum_const)s,
 ''',
                      qtype = qtype,
-                     enum_const = c_enum_const(name + 'Kind', key))
+                     enum_const=c_enum_const(variants.tag_member.type.name,
+                                             var.name))
 
     ret += mcgen('''
 };
 ''')
     return ret
 
-
-def generate_union(expr, meta):
-
-    name = c_name(expr[meta])
-    typeinfo = expr['data']
-
-    base = expr.get('base')
-    discriminator = expr.get('discriminator')
-
-    enum_define = discriminator_find_enum_define(expr)
-    if enum_define:
-        discriminator_type_name = enum_define['enum_name']
-    else:
-        discriminator_type_name = '%sKind' % (name)
+def gen_union(name, base, variants):
+    name = c_name(name)
 
     ret = mcgen('''
 
@@ -203,56 +174,49 @@  struct %(name)s {
 ''',
                 name=name)
     if base:
-        base_fields = find_struct(base)['data']
-        ret += generate_struct_fields(base_fields)
+        ret += generate_struct_fields(base.members)
         ret += mcgen('''
     /* union tag is %(discriminator_type_name)s %(c_name)s */
 ''',
-                     discriminator_type_name=c_name(discriminator_type_name),
-                     c_name=c_name(discriminator))
+                     discriminator_type_name=c_name(variants.tag_member.type.name),
+                     c_name=c_name(variants.tag_member.name))
     else:
-        assert not discriminator
         ret += mcgen('''
     %(discriminator_type_name)s kind;
 ''',
-                     discriminator_type_name=c_name(discriminator_type_name))
+                     discriminator_type_name=c_name(variants.tag_member.type.name))
 
     ret += mcgen('''
     union {
         void *data;
 ''')
 
-    for key in typeinfo:
+    for var in variants.variants:
         ret += mcgen('''
         %(c_type)s %(c_name)s;
 ''',
-                     c_type=c_type(typeinfo[key]),
-                     c_name=c_name(key))
+                     c_type=var.type.c_type(),
+                     c_name=c_name(var.name))
 
     ret += mcgen('''
     };
 };
 ''')
-    if meta == 'alternate':
-        ret += mcgen('''
-extern const int %(name)s_qtypes[];
-''',
-            name=name)
-
 
     return ret
 
 def generate_type_cleanup_decl(name):
     ret = mcgen('''
-void qapi_free_%(name)s(%(c_type)s obj);
+
+void qapi_free_%(name)s(%(name)s *obj);
 ''',
-                c_type=c_type(name), name=c_name(name))
+                name=c_name(name))
     return ret
 
 def generate_type_cleanup(name):
     ret = mcgen('''
 
-void qapi_free_%(name)s(%(c_type)s obj)
+void qapi_free_%(name)s(%(name)s *obj)
 {
     QapiDeallocVisitor *md;
     Visitor *v;
@@ -267,9 +231,68 @@  void qapi_free_%(name)s(%(c_type)s obj)
     qapi_dealloc_visitor_cleanup(md);
 }
 ''',
-                c_type=c_type(name), name=c_name(name))
+                name=c_name(name))
     return ret
 
+class QAPISchemaGenTypeVisitor(QAPISchemaVisitor):
+    def __init__(self):
+        self.decl = None
+        self.defn = None
+        self.fwdecl = None
+        self.fwdefn = None
+        self.btin = None
+    def visit_begin(self):
+        self.decl = ''
+        self.defn = ''
+        self.fwdecl = ''
+        self.fwdefn = ''
+        self.btin = guardstart('QAPI_TYPES_BUILTIN')
+    def visit_end(self):
+        self.decl = self.fwdecl + self.decl
+        self.fwdecl = None
+        self.defn = self.fwdefn + self.defn
+        self.fwdefn = None
+        # To avoid header dependency hell, we always generate
+        # declarations for built-in types in our header files and
+        # simply guard them.
+        self.btin += guardend('QAPI_TYPES_BUILTIN')
+        self.decl = self.btin + self.decl
+        self.btin = None
+        # Doesn't work for cases where we link in multiple objects
+        # that have the functions defined, so generate them only with
+        # option -b (do_builtins).
+    def _gen_type_cleanup(self, name):
+        self.decl += generate_type_cleanup_decl(name)
+        self.defn += generate_type_cleanup(name)
+    def visit_enum_type(self, name, info, values):
+        self.fwdecl += generate_enum(name, values)
+        self.fwdefn += generate_enum_lookup(name, values)
+    def visit_array_type(self, name, info, element_type):
+        if isinstance(element_type, QAPISchemaBuiltinType):
+            self.btin += gen_fwd_object_or_array(name)
+            self.btin += gen_array(name, element_type)
+            self.btin += generate_type_cleanup_decl(name)
+            if do_builtins:
+                self.defn += generate_type_cleanup(name)
+        else:
+            self.fwdecl += gen_fwd_object_or_array(name)
+            self.decl += gen_array(name, element_type)
+            self._gen_type_cleanup(name)
+    def visit_object_type(self, name, info, base, members, variants):
+        if info:
+            self.fwdecl += gen_fwd_object_or_array(name)
+            if variants:
+                self.decl += gen_union(name, base, variants)
+            else:
+                self.decl += gen_struct(name, base, members)
+            self._gen_type_cleanup(name)
+    def visit_alternate_type(self, name, info, variants):
+        self.fwdecl += gen_fwd_object_or_array(name)
+        self.fwdefn += gen_alternate_qtypes(name, variants)
+        self.decl += gen_union(name, None, variants)
+        self.decl += gen_alternate_qtypes_decl(name)
+        self._gen_type_cleanup(name)
+
 do_builtins = False
 
 (input_file, output_dir, do_c, do_h, prefix, opts) = \
@@ -325,77 +348,10 @@  fdecl.write(mcgen('''
 #include <stdint.h>
 '''))
 
-exprs = QAPISchema(input_file).get_exprs()
-
-fdecl.write(guardstart("QAPI_TYPES_BUILTIN_STRUCT_DECL"))
-for typename in builtin_types.keys():
-    fdecl.write(generate_fwd_builtin(typename))
-fdecl.write(guardend("QAPI_TYPES_BUILTIN_STRUCT_DECL"))
-
-for expr in exprs:
-    ret = ""
-    if expr.has_key('struct'):
-        ret += generate_fwd_struct(expr['struct'])
-    elif expr.has_key('enum'):
-        ret += generate_enum(expr['enum'], expr['data'])
-        ret += generate_fwd_enum_struct(expr['enum'])
-        fdef.write(generate_enum_lookup(expr['enum'], expr['data']))
-    elif expr.has_key('union'):
-        ret += generate_fwd_struct(expr['union'])
-        enum_define = discriminator_find_enum_define(expr)
-        if not enum_define:
-            ret += generate_enum('%sKind' % expr['union'], expr['data'].keys())
-            fdef.write(generate_enum_lookup('%sKind' % expr['union'],
-                                            expr['data'].keys()))
-    elif expr.has_key('alternate'):
-        ret += generate_fwd_struct(expr['alternate'])
-        ret += generate_enum('%sKind' % expr['alternate'], expr['data'].keys())
-        fdef.write(generate_enum_lookup('%sKind' % expr['alternate'],
-                                        expr['data'].keys()))
-        fdef.write(generate_alternate_qtypes(expr))
-    else:
-        continue
-    fdecl.write(ret)
-
-# to avoid header dependency hell, we always generate declarations
-# for built-in types in our header files and simply guard them
-fdecl.write(guardstart("QAPI_TYPES_BUILTIN_CLEANUP_DECL"))
-for typename in builtin_types.keys():
-    fdecl.write(generate_type_cleanup_decl(typename + "List"))
-fdecl.write(guardend("QAPI_TYPES_BUILTIN_CLEANUP_DECL"))
-
-# ...this doesn't work for cases where we link in multiple objects that
-# have the functions defined, so we use -b option to provide control
-# over these cases
-if do_builtins:
-    for typename in builtin_types.keys():
-        fdef.write(generate_type_cleanup(typename + "List"))
-
-for expr in exprs:
-    ret = ""
-    if expr.has_key('struct'):
-        ret += generate_struct(expr) + "\n"
-        ret += generate_type_cleanup_decl(expr['struct'] + "List")
-        fdef.write(generate_type_cleanup(expr['struct'] + "List"))
-        ret += generate_type_cleanup_decl(expr['struct'])
-        fdef.write(generate_type_cleanup(expr['struct']))
-    elif expr.has_key('union'):
-        ret += generate_union(expr, 'union') + "\n"
-        ret += generate_type_cleanup_decl(expr['union'] + "List")
-        fdef.write(generate_type_cleanup(expr['union'] + "List"))
-        ret += generate_type_cleanup_decl(expr['union'])
-        fdef.write(generate_type_cleanup(expr['union']))
-    elif expr.has_key('alternate'):
-        ret += generate_union(expr, 'alternate') + "\n"
-        ret += generate_type_cleanup_decl(expr['alternate'] + "List")
-        fdef.write(generate_type_cleanup(expr['alternate'] + "List"))
-        ret += generate_type_cleanup_decl(expr['alternate'])
-        fdef.write(generate_type_cleanup(expr['alternate']))
-    elif expr.has_key('enum'):
-        ret += "\n" + generate_type_cleanup_decl(expr['enum'] + "List")
-        fdef.write(generate_type_cleanup(expr['enum'] + "List"))
-    else:
-        continue
-    fdecl.write(ret)
+schema = QAPISchema(input_file)
+gen = QAPISchemaGenTypeVisitor()
+schema.visit(gen)
+fdef.write(gen.defn)
+fdecl.write(gen.decl)
 
 close_output(fdef, fdecl)
diff --git a/tests/qapi-schema/qapi-schema-test.json b/tests/qapi-schema/qapi-schema-test.json
index a9e5aab..257b4d4 100644
--- a/tests/qapi-schema/qapi-schema-test.json
+++ b/tests/qapi-schema/qapi-schema-test.json
@@ -39,8 +39,8 @@ 
   'data': { 'value1' : 'UserDefA',
             'value2' : 'UserDefB',
             'value3' : 'UserDefB' } }
-# FIXME generated struct UserDefFlatUnion has members for direct base
-# UserDefUnionBase, but lacks members for indirect base UserDefZero
+# FIXME generated visit_type_UserDefFlatUnion_fields() fails to visit
+# members of indirect base UserDefZero
 
 { 'struct': 'UserDefUnionBase',
   'base': 'UserDefZero',