diff mbox

[RFC,v2,21/47] qapi: New QAPISchema intermediate reperesentation

Message ID 1435782155-31412-22-git-send-email-armbru@redhat.com
State New
Headers show

Commit Message

Markus Armbruster July 1, 2015, 8:22 p.m. UTC
The QAPI code generators work with a syntax tree (nested dictionaries)
plus a few symbol tables (also dictionaties) on the side.

They have clearly outgrown these simple data structures.  There's lots
of rummaging around in dictionaries, and information is recomputed on
the fly.  For the work I'm going to do, I want more clearly defined
and more convenient interfaces.

Going forward, I also want less coupling between the back-ends and the
syntax tree, to make messing with the syntax easier.

Create a bunch of classes to represent QAPI schemata.

Have the QAPISchema initializer call the parser, then walk the syntax
tree to create the new internal representation, and finally perform
semantic analysis.

Shortcut: the semantic analysis still relies on existing check_exprs()
to do the actual semantic checking.  All this code needs to move into
the classes.  Mark as TODO.

We generate array types eagerly, even though most of them aren't used.
Mark as TODO.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
---
 scripts/qapi-commands.py       |   2 +-
 scripts/qapi-event.py          |   2 +-
 scripts/qapi-types.py          |   2 +-
 scripts/qapi-visit.py          |   2 +-
 scripts/qapi.py                | 351 +++++++++++++++++++++++++++++++++++++++--
 tests/qapi-schema/test-qapi.py |   2 +-
 6 files changed, 347 insertions(+), 14 deletions(-)

Comments

Eric Blake July 21, 2015, 8:32 p.m. UTC | #1
On 07/01/2015 02:22 PM, Markus Armbruster wrote:
> The QAPI code generators work with a syntax tree (nested dictionaries)
> plus a few symbol tables (also dictionaties) on the side.

s/dictionaties/dictionaries/

> 
> They have clearly outgrown these simple data structures.  There's lots
> of rummaging around in dictionaries, and information is recomputed on
> the fly.  For the work I'm going to do, I want more clearly defined
> and more convenient interfaces.
> 
> Going forward, I also want less coupling between the back-ends and the
> syntax tree, to make messing with the syntax easier.

Indeed - should be a lot easier to add new qapi .json syntax sugar on
the front-end, or to tweak generated C layout on the backend, without
close coupling between the two.  Particularly when adding front-end
sugar extensions that still normalize into the same backend structs.

> 
> Create a bunch of classes to represent QAPI schemata.
> 
> Have the QAPISchema initializer call the parser, then walk the syntax
> tree to create the new internal representation, and finally perform
> semantic analysis.
> 
> Shortcut: the semantic analysis still relies on existing check_exprs()
> to do the actual semantic checking.  All this code needs to move into
> the classes.  Mark as TODO.
> 
> We generate array types eagerly, even though most of them aren't used.
> Mark as TODO.
> 

No change to generated files at this stage in the game (this is mostly
additive, then later patches shed their old ways of doing things by
using this).  Good division of labor between patches.

> Signed-off-by: Markus Armbruster <armbru@redhat.com>
> ---
>  scripts/qapi-commands.py       |   2 +-
>  scripts/qapi-event.py          |   2 +-
>  scripts/qapi-types.py          |   2 +-
>  scripts/qapi-visit.py          |   2 +-
>  scripts/qapi.py                | 351 +++++++++++++++++++++++++++++++++++++++--
>  tests/qapi-schema/test-qapi.py |   2 +-
>  6 files changed, 347 insertions(+), 14 deletions(-)
> 

> +++ b/scripts/qapi.py

> +
> +#
> +# Schema compiler frontend
> +#
> +
> +class QAPISchemaEntity(object):
> +    def __init__(self, name, info):
> +        assert isinstance(name, str)
> +        self.name = name
> +        self.info = info
> +    def check(self, schema):
> +        pass
> +
> +class QAPISchemaType(QAPISchemaEntity):
> +    pass
> +
> +class QAPISchemaBuiltinType(QAPISchemaType):
> +    def __init__(self, name):
> +        QAPISchemaType.__init__(self, name, None)
> +
> +class QAPISchemaEnumType(QAPISchemaType):
> +    def __init__(self, name, info, values):
> +        QAPISchemaType.__init__(self, name, info)
> +        for v in values:
> +            assert isinstance(v, str)
> +        self.values = values

worth a check() method to ensure values don't collide?  Especially since
you already do that in QAPISchemaObjectTypeMember.check()

> +
> +class QAPISchemaArrayType(QAPISchemaType):
> +    def __init__(self, name, info, element_type):
> +        QAPISchemaType.__init__(self, name, info)
> +        assert isinstance(element_type, str)
> +        self.element_type_name = element_type
> +        self.element_type = None
> +    def check(self, schema):
> +        self.element_type = schema.lookup_type(self.element_type_name)
> +        assert self.element_type
> +
> +class QAPISchemaObjectType(QAPISchemaType):
> +    def __init__(self, name, info, base, local_members, variants):
> +        QAPISchemaType.__init__(self, name, info)
> +        assert base == None or isinstance(base, str)
> +        for m in local_members:
> +            assert isinstance(m, QAPISchemaObjectTypeMember)
> +        if variants != None:
> +            assert isinstance(variants, QAPISchemaObjectTypeVariants)
> +        self.base_name = base
> +        self.base = None
> +        self.local_members = local_members
> +        self.variants = variants
> +        self.members = None
> +    def check(self, schema):
> +        if self.members:
> +            return                      # already checked
> +        assert self.members == None     # not running in cycles
> +        self.members = False            # mark as being checked

Interesting, but makes sense (since we allow self-referential nesting,
populating our own members may require revisiting the same type).  Cute
that python allows both None and False as distinct non-true settings, so
that you end up using the variable as a tri-state.

> +        if self.base_name:
> +            self.base = schema.lookup_type(self.base_name)
> +            self.base.check(schema)

Do you need 'assert self.base' here, similar to how you did it in
QAPISchemaArrayType.check()?  I guess you'll still get a python
exception that None.check() doesn't exist if the lookup_type failed, so
it just depends on what error message you want.

> +            members = list(self.base.members)

For that matter, do you want to assert isinstance(self.base,
QAPISchemaObjectType), since lookup_type can return other types, but
other child classes of QAPISchemaType do not have a .members?

> +        else:
> +            members = []
> +        seen = {}
> +        for m in members:
> +            seen[m.name] = m
> +        for m in self.local_members:
> +            m.check(schema, members, seen)
> +        if self.variants:
> +            self.variants.check(schema, members, seen)
> +        self.members = members
> +
> +class QAPISchemaObjectTypeMember(object):

Is 'object' the right base class, or should it be QAPISchemaEntity?

> +    def __init__(self, name, typ, optional):
> +        assert isinstance(name, str)
> +        assert isinstance(typ, str)

I guess that means all callers flatten array types into an unambiguous
string first.

> +        assert isinstance(optional, bool)
> +        self.name = name
> +        self.type_name = typ
> +        self.type = None
> +        self.optional = optional
> +    def check(self, schema, all_members, seen):
> +        assert self.name not in seen
> +        self.type = schema.lookup_type(self.type_name)
> +        assert self.type
> +        all_members.append(self)
> +        seen[self.name] = self
> +
> +class QAPISchemaObjectTypeVariants(object):

Is 'object' the right base class, or should it be QAPISchemaEntity?

> +    def __init__(self, tag_name, tag_enum, variants):
> +        assert tag_name == None or isinstance(tag_name, str)
> +        assert tag_enum == None or isinstance(tag_enum, str)
> +        for v in variants:
> +            assert isinstance(v, QAPISchemaObjectTypeVariant)
> +        self.tag_name = tag_name
> +        if tag_name:
> +            assert not tag_enum
> +            self.tag_member = None
> +        else:
> +            self.tag_member = QAPISchemaObjectTypeMember('kind', tag_enum,
> +                                                         False)

Nice - the tag is either part of the base struct (flat union) or
implicitly created (plain union).  By the time we are done with check(),
we then have both its type (possibly generated implicitly) and its name.

> +        self.variants = variants
> +    def check(self, schema, members, seen):
> +        if self.tag_name:
> +            self.tag_member = seen[self.tag_name]
> +        else:
> +            self.tag_member.check(schema, members, seen)

A little bit tricky here in that flat unions (tag_name was set) must
find the tag from the members already loaded from the base class, while
plain unions (tag_name is None) add their own implicit member into the
union.  But it works.

> +        assert isinstance(self.tag_member.type, QAPISchemaEnumType)
> +        for v in self.variants:
> +            vseen = dict(seen)
> +            v.check(schema, self.tag_member.type, vseen)
> +
> +class QAPISchemaObjectTypeVariant(QAPISchemaObjectTypeMember):
> +    def __init__(self, name, typ, flat):
> +        QAPISchemaObjectTypeMember.__init__(self, name, typ, False)
> +        assert isinstance(flat, bool)
> +        self.flat = flat
> +    def check(self, schema, tag_type, seen):
> +        QAPISchemaObjectTypeMember.check(self, schema, [], seen)
> +        assert self.name in tag_type.values
> +        if self.flat:
> +            self.type.check(schema)
> +            assert isinstance(self.type, QAPISchemaObjectType)

Do we really want to be tracking self.flat for each variant?  After all,
docs/qapi-code-gen.txt already describes the mapping from simple union
into flat union (as if the flat union had a base class with single
member 'kind' of the right type, then each branch of the union composed
of an implicit object with a lone member 'data' of the correct type).
In other words, is it any better to just normalize into that form now,
such that each QAPISchemaObjectTypeVariant is merely a (often
one-element) list of name:type members being added to the overall
QAPISchemaObject?  But I guess it remains to be seen how you use
self.flat before knowing if it is worth normalizing away from it.

> +
> +class QAPISchemaAlternateType(QAPISchemaType):
> +    def __init__(self, name, info, variants):
> +        QAPISchemaType.__init__(self, name, info)
> +        assert isinstance(variants, QAPISchemaObjectTypeVariants)
> +        assert not variants.tag_name
> +        for v in variants.variants:
> +            assert not v.flat

Hmm. You are using .flat after all.

> +        self.variants = variants
> +    def check(self, schema):
> +        self.variants.check(schema, [], {})
> +
> +class QAPISchemaCommand(QAPISchemaEntity):
> +    def __init__(self, name, info, args, rets, gen, success_response):
> +        QAPISchemaEntity.__init__(self, name, info)
> +        assert not args or isinstance(args, str)
> +        assert not rets or isinstance(rets, str)

Again, this means that the caller has already had to convert qapi
dictionaries into an implicit type and pass in that type name here.
Works for me.

> +        self.args_name = args
> +        self.args = None
> +        self.rets_name = rets
> +        self.rets = None
> +        self.gen = gen
> +        self.success_response = success_response
> +    def check(self, schema):
> +        if self.args_name:
> +            self.args = schema.lookup_type(self.args_name)
> +            assert isinstance(self.args, QAPISchemaObjectType)
> +            assert not self.args.variants # not implemented
> +            self.args.check(schema)
> +        if self.rets_name:
> +            self.rets = schema.lookup_type(self.rets_name)
> +            assert isinstance(self.rets, QAPISchemaType)
> +            self.rets.check(schema)

Matches our existing difference in requiring a dictionary for arguments,
but allowing the return of any type (although we now have a whitelist to
prevent too many additions of non-dictionaries).

> +
> +class QAPISchemaEvent(QAPISchemaEntity):
> +    def __init__(self, name, info, data):
> +        QAPISchemaEntity.__init__(self, name, info)
> +        assert not data or isinstance(data, str)
> +        self.data_name = data
> +        self.data = None
> +    def check(self, schema):
> +        if self.data_name:
> +            self.data = schema.lookup_type(self.data_name)
> +            assert isinstance(self.data, QAPISchemaObjectType)
> +            self.data.check(schema)
> +
> +class QAPISchema(object):
> +    def __init__(self, fname):
> +        try:
> +            self.exprs = check_exprs(QAPISchemaParser(open(fname, "r")).exprs)
> +        except (QAPISchemaError, QAPIExprError), err:
> +            print >>sys.stderr, err
> +            exit(1)
> +        self.entity_dict = {}
> +        self._def_predefineds()
> +        self._def_exprs()
> +        self.check()
> +
> +    def get_exprs(self):
> +        return [expr_elem['expr'] for expr_elem in self.exprs]
> +
> +    def _def_entity(self, ent):
> +        assert ent.name not in self.entity_dict
> +        self.entity_dict[ent.name] = ent
> +
> +    def lookup_entity(self, name, typ=None):
> +        ent = self.entity_dict.get(name)
> +        if typ and not isinstance(ent, typ):
> +            return None
> +        return ent

Ah, so you enforce a shared namespace between types, commands, and
events (all three are in self.entity_dict), but can use the typ
parameter to allow limiting a lookup to just types.

> +
> +    def lookup_type(self, name):
> +        return self.lookup_entity(name, QAPISchemaType)
> +
> +    def _def_builtin_type(self, name):
> +        self._def_entity(QAPISchemaBuiltinType(name))
> +        if name != '**':
> +            self._make_array_type(name) # TODO really needed?
> +
> +    def _def_predefineds(self):
> +        for t in ['str', 'number', 'int', 'int8', 'int16', 'int32', 'int64',
> +                  'uint8', 'uint16', 'uint32', 'uint64', 'size', 'bool', '**']:
> +            self._def_builtin_type(t)
> +
> +    def _make_implicit_enum_type(self, name, values):
> +        name = name + 'Kind'
> +        self._def_entity(QAPISchemaEnumType(name, None, values))
> +        return name
> +
> +    def _make_array_type(self, element_type):
> +        name = element_type + 'List'
> +        if not self.lookup_type(name):
> +            self._def_entity(QAPISchemaArrayType(name, None, element_type))
> +        return name
> +
> +    def _make_implicit_object_type(self, name, role, members):
> +        name = ':obj-%s-%s' % (name, role)
> +        self._def_entity(QAPISchemaObjectType(name, None, None, members, None))
> +        return name

We may extend the qapi .json syntax to allow anonymous types in more
places, but so far, this matches that all existing use of anonymous
types are for structs, not unions.

> +
> +    def _def_enum_type(self, expr, info):
> +        name = expr['enum']
> +        data = expr['data']
> +        self._def_entity(QAPISchemaEnumType(name, info, data))
> +        self._make_array_type(name) # TODO really needed?
> +
> +    def _make_member(self, name, typ):
> +        optional = False
> +        if name.startswith('*'):
> +            name = name[1:]
> +            optional = True
> +        if isinstance(typ, list):
> +            assert len(typ) == 1
> +            typ = self._make_array_type(typ[0])
> +        return QAPISchemaObjectTypeMember(name, typ, optional)
> +
> +    def _make_members(self, data):
> +        return [self._make_member(key, data[key]) for key in data.keys()]
> +
> +    def _def_struct_type(self, expr, info):
> +        name = expr['struct']
> +        base = expr.get('base')
> +        data = expr['data']
> +        self._def_entity(QAPISchemaObjectType(name, info, base,
> +                                              self._make_members(data),
> +                                              None))
> +        self._make_array_type(name) # TODO really needed?
> +
> +    def _make_flat_variant(self, case, typ):
> +        return QAPISchemaObjectTypeVariant(case, typ, True)
> +
> +    def _make_flat_variants(self, tag_name, data):
> +        variants = [self._make_flat_variant(key, data[key])
> +                    for key in data.keys()]
> +        return QAPISchemaObjectTypeVariants(tag_name, None, variants)
> +
> +    def _make_simple_variant(self, case, typ):
> +        if isinstance(typ, list):
> +            assert len(typ) == 1
> +            typ = self._make_array_type(typ[0])
> +        return QAPISchemaObjectTypeVariant(case, typ, False)
> +
> +    def _make_simple_variants(self, type_name, data):
> +        variants = [self._make_simple_variant(key, data[key])
> +                    for key in data.keys()]
> +        enum = self._make_implicit_enum_type(type_name,
> +                                             [v.name for v in variants])
> +        return QAPISchemaObjectTypeVariants(None, enum, variants)

Again, I wonder if normalizing simple unions into flat unions (to get
rid of the need for tracking .flat) would make later use any easier, or
if it would just make introspection QMP output more verbose.

> +
> +    def _def_union_type(self, expr, info):
> +        name = expr['union']
> +        data = expr['data']
> +        tag_name = expr.get('discriminator')
> +        if tag_name:
> +            base = expr['base']
> +            variants = self._make_flat_variants(tag_name, data)
> +        else:
> +            base = None
> +            variants = self._make_simple_variants(name, data)
> +        self._def_entity(QAPISchemaObjectType(name, info, base,
> +                                              self._make_members(OrderedDict()),
> +                                              variants))
> +        self._make_array_type(name) # TODO really needed?
> +
> +    def _def_alternate_type(self, expr, info):
> +        name = expr['alternate']
> +        data = expr['data']
> +        self._def_entity(QAPISchemaAlternateType(name, info,
> +                                    self._make_simple_variants(name, data)))
> +        self._make_array_type(name) # TODO really needed?
> +
> +    def _def_command(self, expr, info):
> +        name = expr['command']
> +        args = expr.get('data')
> +        rets = expr.get('returns')
> +        gen = expr.get('gen', True)
> +        success_response = expr.get('success-response', True)
> +        if args and not isinstance(args, str):
> +            args = self._make_implicit_object_type(name, 'args',
> +                                                   self._make_members(args))

If I write { 'command':'Foo', 'data':{} }, does that try and create an
implicit type, with no members?  I'm wondering if it is any simpler to
write args = expr.get('data', {}), and have _make_implicit_object_type()
do the right thing when presented with an empty list of members.

> +        if rets and isinstance(rets, list):
> +            assert len(rets) == 1
> +            rets = self._make_array_type(rets[0])
> +        elif rets and not isinstance(rets, str):
> +            rets = self._make_implicit_object_type(name, 'rets',
> +                                                   self._make_members(rets))
> +        self._def_entity(QAPISchemaCommand(name, info, args, rets, gen,
> +                                           success_response))
> +
> +    def _def_event(self, expr, info):
> +        name = expr['event']
> +        data = expr.get('data')
> +        if data and not isinstance(data, str):
> +            data = self._make_implicit_object_type(name, 'data',
> +                                                   self._make_members(data))
> +        self._def_entity(QAPISchemaEvent(name, info, data))
> +
> +    def _def_exprs(self):
> +        for expr_elem in self.exprs:
> +            expr = expr_elem['expr']
> +            info = expr_elem['info']
> +            if 'enum' in expr:
> +                self._def_enum_type(expr, info)
> +            elif 'struct' in expr:
> +                self._def_struct_type(expr, info)
> +            elif 'union' in expr:
> +                self._def_union_type(expr, info)
> +            elif 'alternate' in expr:
> +                self._def_alternate_type(expr, info)
> +            elif 'command' in expr:
> +                self._def_command(expr, info)
> +            elif 'event' in expr:
> +                self._def_event(expr, info)
> +            else:
> +                assert False
> +
> +    def check(self):
> +        for ent in self.entity_dict.values():
> +            ent.check(self)

Overall seems like a very nice layout.

I'll reserve R-b until I see how the rest of the series uses this object
hierarchy, but it looks like you won't have to change very much of this
patch when you do your next spin of the series.
Markus Armbruster July 27, 2015, 9:23 a.m. UTC | #2
Eric Blake <eblake@redhat.com> writes:

> On 07/01/2015 02:22 PM, Markus Armbruster wrote:
>> The QAPI code generators work with a syntax tree (nested dictionaries)
>> plus a few symbol tables (also dictionaties) on the side.
>
> s/dictionaties/dictionaries/

Will fix.

>> They have clearly outgrown these simple data structures.  There's lots
>> of rummaging around in dictionaries, and information is recomputed on
>> the fly.  For the work I'm going to do, I want more clearly defined
>> and more convenient interfaces.
>> 
>> Going forward, I also want less coupling between the back-ends and the
>> syntax tree, to make messing with the syntax easier.
>
> Indeed - should be a lot easier to add new qapi .json syntax sugar on
> the front-end, or to tweak generated C layout on the backend, without
> close coupling between the two.  Particularly when adding front-end
> sugar extensions that still normalize into the same backend structs.
>
>> 
>> Create a bunch of classes to represent QAPI schemata.
>> 
>> Have the QAPISchema initializer call the parser, then walk the syntax
>> tree to create the new internal representation, and finally perform
>> semantic analysis.
>> 
>> Shortcut: the semantic analysis still relies on existing check_exprs()
>> to do the actual semantic checking.  All this code needs to move into
>> the classes.  Mark as TODO.
>> 
>> We generate array types eagerly, even though most of them aren't used.
>> Mark as TODO.
>> 
>
> No change to generated files at this stage in the game (this is mostly
> additive, then later patches shed their old ways of doing things by
> using this).  Good division of labor between patches.

I'll steal from this paragraph for my commit message.

>> Signed-off-by: Markus Armbruster <armbru@redhat.com>
>> ---
>>  scripts/qapi-commands.py       |   2 +-
>>  scripts/qapi-event.py          |   2 +-
>>  scripts/qapi-types.py          |   2 +-
>>  scripts/qapi-visit.py          |   2 +-
>>  scripts/qapi.py                | 351 +++++++++++++++++++++++++++++++++++++++--
>>  tests/qapi-schema/test-qapi.py |   2 +-
>>  6 files changed, 347 insertions(+), 14 deletions(-)
>> 
>
>> +++ b/scripts/qapi.py
>
>> +
>> +#
>> +# Schema compiler frontend
>> +#
>> +
>> +class QAPISchemaEntity(object):
>> +    def __init__(self, name, info):
>> +        assert isinstance(name, str)
>> +        self.name = name
>> +        self.info = info
>> +    def check(self, schema):
>> +        pass
>> +
>> +class QAPISchemaType(QAPISchemaEntity):
>> +    pass
>> +
>> +class QAPISchemaBuiltinType(QAPISchemaType):
>> +    def __init__(self, name):
>> +        QAPISchemaType.__init__(self, name, None)
>> +
>> +class QAPISchemaEnumType(QAPISchemaType):
>> +    def __init__(self, name, info, values):
>> +        QAPISchemaType.__init__(self, name, info)
>> +        for v in values:
>> +            assert isinstance(v, str)
>> +        self.values = values
>
> worth a check() method to ensure values don't collide?  Especially since
> you already do that in QAPISchemaObjectTypeMember.check()

Can do.

Aside: everything the check() methods assert must be established by
semantic analysis.  Moving semantic analysis into the classes as
outlined in the commit message will hopefully make that obvious enough
to permit dropping the assertions.

>> +
>> +class QAPISchemaArrayType(QAPISchemaType):
>> +    def __init__(self, name, info, element_type):
>> +        QAPISchemaType.__init__(self, name, info)
>> +        assert isinstance(element_type, str)
>> +        self.element_type_name = element_type
>> +        self.element_type = None
>> +    def check(self, schema):
>> +        self.element_type = schema.lookup_type(self.element_type_name)
>> +        assert self.element_type
>> +
>> +class QAPISchemaObjectType(QAPISchemaType):
>> +    def __init__(self, name, info, base, local_members, variants):
>> +        QAPISchemaType.__init__(self, name, info)
>> +        assert base == None or isinstance(base, str)
>> +        for m in local_members:
>> +            assert isinstance(m, QAPISchemaObjectTypeMember)
>> +        if variants != None:
>> +            assert isinstance(variants, QAPISchemaObjectTypeVariants)
>> +        self.base_name = base
>> +        self.base = None
>> +        self.local_members = local_members
>> +        self.variants = variants
>> +        self.members = None
>> +    def check(self, schema):
>> +        if self.members:
>> +            return                      # already checked
>> +        assert self.members == None     # not running in cycles
>> +        self.members = False            # mark as being checked
>
> Interesting, but makes sense (since we allow self-referential nesting,
> populating our own members may require revisiting the same type).

Yes.  Another way to put it: for a recursive type like this, we better
check the recursion terminates.

>                                                                    Cute
> that python allows both None and False as distinct non-true settings, so
> that you end up using the variable as a tri-state.

Yup :)

>> +        if self.base_name:
>> +            self.base = schema.lookup_type(self.base_name)
>> +            self.base.check(schema)
>
> Do you need 'assert self.base' here, similar to how you did it in
> QAPISchemaArrayType.check()?  I guess you'll still get a python
> exception that None.check() doesn't exist if the lookup_type failed, so
> it just depends on what error message you want.

I don't see a hard need, but consistency is nice.  Since I generally
assert elsewhere, I'll add an assertion here.

>> +            members = list(self.base.members)
>
> For that matter, do you want to assert isinstance(self.base,
> QAPISchemaObjectType), since lookup_type can return other types, but
> other child classes of QAPISchemaType do not have a .members?

Yes, that's the appropriate assertion.

>> +        else:
>> +            members = []
>> +        seen = {}
>> +        for m in members:
>> +            seen[m.name] = m
>> +        for m in self.local_members:
>> +            m.check(schema, members, seen)
>> +        if self.variants:
>> +            self.variants.check(schema, members, seen)
>> +        self.members = members
>> +
>> +class QAPISchemaObjectTypeMember(object):
>
> Is 'object' the right base class, or should it be QAPISchemaEntity?

No, it's really object.  A schema's entities can be object types, other
types, commands and events, but not members.  A schema doesn't have
"members", only an object type has.

>> +    def __init__(self, name, typ, optional):
>> +        assert isinstance(name, str)
>> +        assert isinstance(typ, str)
>
> I guess that means all callers flatten array types into an unambiguous
> string first.

Yes.

Conscious design decision: every type in the internal representation has
a unique name.  If the concrete syntax lets you define an anonymous
type, we make up a name in the internal representation.  One less
special case to worry about.

>> +        assert isinstance(optional, bool)
>> +        self.name = name
>> +        self.type_name = typ
>> +        self.type = None
>> +        self.optional = optional
>> +    def check(self, schema, all_members, seen):
>> +        assert self.name not in seen
>> +        self.type = schema.lookup_type(self.type_name)
>> +        assert self.type
>> +        all_members.append(self)
>> +        seen[self.name] = self
>> +
>> +class QAPISchemaObjectTypeVariants(object):
>
> Is 'object' the right base class, or should it be QAPISchemaEntity?

As for QAPISchemaObjectTypeMember above, object is appropriate.

>> +    def __init__(self, tag_name, tag_enum, variants):
>> +        assert tag_name == None or isinstance(tag_name, str)
>> +        assert tag_enum == None or isinstance(tag_enum, str)
>> +        for v in variants:
>> +            assert isinstance(v, QAPISchemaObjectTypeVariant)
>> +        self.tag_name = tag_name
>> +        if tag_name:
>> +            assert not tag_enum
>> +            self.tag_member = None
>> +        else:
>> +            self.tag_member = QAPISchemaObjectTypeMember('kind', tag_enum,
>> +                                                         False)
>
> Nice - the tag is either part of the base struct (flat union) or
> implicitly created (plain union).  By the time we are done with check(),
> we then have both its type (possibly generated implicitly) and its name.

... and we don't have to know where they came from anymore.

>> +        self.variants = variants
>> +    def check(self, schema, members, seen):
>> +        if self.tag_name:
>> +            self.tag_member = seen[self.tag_name]
>> +        else:
>> +            self.tag_member.check(schema, members, seen)
>
> A little bit tricky here in that flat unions (tag_name was set) must
> find the tag from the members already loaded from the base class, while
> plain unions (tag_name is None) add their own implicit member into the
> union.  But it works.
>
>> +        assert isinstance(self.tag_member.type, QAPISchemaEnumType)
>> +        for v in self.variants:
>> +            vseen = dict(seen)
>> +            v.check(schema, self.tag_member.type, vseen)
>> +
>> +class QAPISchemaObjectTypeVariant(QAPISchemaObjectTypeMember):
>> +    def __init__(self, name, typ, flat):
>> +        QAPISchemaObjectTypeMember.__init__(self, name, typ, False)
>> +        assert isinstance(flat, bool)
>> +        self.flat = flat
>> +    def check(self, schema, tag_type, seen):
>> +        QAPISchemaObjectTypeMember.check(self, schema, [], seen)
>> +        assert self.name in tag_type.values
>> +        if self.flat:
>> +            self.type.check(schema)
>> +            assert isinstance(self.type, QAPISchemaObjectType)
>
> Do we really want to be tracking self.flat for each variant?  After all,
> docs/qapi-code-gen.txt already describes the mapping from simple union
> into flat union (as if the flat union had a base class with single
> member 'kind' of the right type, then each branch of the union composed
> of an implicit object with a lone member 'data' of the correct type).
> In other words, is it any better to just normalize into that form now,
> such that each QAPISchemaObjectTypeVariant is merely a (often
> one-element) list of name:type members being added to the overall
> QAPISchemaObject?

I tried to do exactly that, but got bogged down in special cases and
copped out.  Then I went on vacation, and now I don't remember the exact
problems anymore %-}

I guess / hope it's just relatively pointless differences in the
generated C code I didn't want to get rid of at this time.  The series
is long and hairy enough as it is...

>                    But I guess it remains to be seen how you use
> self.flat before knowing if it is worth normalizing away from it.

At least introspect.json is oblivious of it.

>> +
>> +class QAPISchemaAlternateType(QAPISchemaType):
>> +    def __init__(self, name, info, variants):
>> +        QAPISchemaType.__init__(self, name, info)
>> +        assert isinstance(variants, QAPISchemaObjectTypeVariants)
>> +        assert not variants.tag_name
>> +        for v in variants.variants:
>> +            assert not v.flat
>
> Hmm. You are using .flat after all.
>
>> +        self.variants = variants
>> +    def check(self, schema):
>> +        self.variants.check(schema, [], {})
>> +
>> +class QAPISchemaCommand(QAPISchemaEntity):
>> +    def __init__(self, name, info, args, rets, gen, success_response):
>> +        QAPISchemaEntity.__init__(self, name, info)
>> +        assert not args or isinstance(args, str)
>> +        assert not rets or isinstance(rets, str)
>
> Again, this means that the caller has already had to convert qapi
> dictionaries into an implicit type and pass in that type name here.
> Works for me.
>
>> +        self.args_name = args
>> +        self.args = None
>> +        self.rets_name = rets
>> +        self.rets = None
>> +        self.gen = gen
>> +        self.success_response = success_response
>> +    def check(self, schema):
>> +        if self.args_name:
>> +            self.args = schema.lookup_type(self.args_name)
>> +            assert isinstance(self.args, QAPISchemaObjectType)
>> +            assert not self.args.variants # not implemented
>> +            self.args.check(schema)
>> +        if self.rets_name:
>> +            self.rets = schema.lookup_type(self.rets_name)
>> +            assert isinstance(self.rets, QAPISchemaType)
>> +            self.rets.check(schema)
>
> Matches our existing difference in requiring a dictionary for arguments,
> but allowing the return of any type (although we now have a whitelist to
> prevent too many additions of non-dictionaries).

Yes.  In my opinion, the difference between arguments and returns is a
design blemish (a function maps from domain to codomain, and both are
just sets, so any differences are likely pointless baggage), but we need
to work with what we've got.

>> +
>> +class QAPISchemaEvent(QAPISchemaEntity):
>> +    def __init__(self, name, info, data):
>> +        QAPISchemaEntity.__init__(self, name, info)
>> +        assert not data or isinstance(data, str)
>> +        self.data_name = data
>> +        self.data = None
>> +    def check(self, schema):
>> +        if self.data_name:
>> +            self.data = schema.lookup_type(self.data_name)
>> +            assert isinstance(self.data, QAPISchemaObjectType)
>> +            self.data.check(schema)
>> +
>> +class QAPISchema(object):
>> +    def __init__(self, fname):
>> +        try:
>> + self.exprs = check_exprs(QAPISchemaParser(open(fname, "r")).exprs)
>> +        except (QAPISchemaError, QAPIExprError), err:
>> +            print >>sys.stderr, err
>> +            exit(1)
>> +        self.entity_dict = {}
>> +        self._def_predefineds()
>> +        self._def_exprs()
>> +        self.check()
>> +
>> +    def get_exprs(self):
>> +        return [expr_elem['expr'] for expr_elem in self.exprs]
>> +
>> +    def _def_entity(self, ent):
>> +        assert ent.name not in self.entity_dict
>> +        self.entity_dict[ent.name] = ent
>> +
>> +    def lookup_entity(self, name, typ=None):
>> +        ent = self.entity_dict.get(name)
>> +        if typ and not isinstance(ent, typ):
>> +            return None
>> +        return ent
>
> Ah, so you enforce a shared namespace between types, commands, and
> events (all three are in self.entity_dict), but can use the typ
> parameter to allow limiting a lookup to just types.

Yes.  It's a convenience feature.

>> +
>> +    def lookup_type(self, name):
>> +        return self.lookup_entity(name, QAPISchemaType)
>> +
>> +    def _def_builtin_type(self, name):
>> +        self._def_entity(QAPISchemaBuiltinType(name))
>> +        if name != '**':
>> +            self._make_array_type(name) # TODO really needed?
>> +
>> +    def _def_predefineds(self):
>> +        for t in ['str', 'number', 'int', 'int8', 'int16', 'int32', 'int64',
>> + 'uint8', 'uint16', 'uint32', 'uint64', 'size', 'bool', '**']:
>> +            self._def_builtin_type(t)
>> +
>> +    def _make_implicit_enum_type(self, name, values):
>> +        name = name + 'Kind'
>> +        self._def_entity(QAPISchemaEnumType(name, None, values))
>> +        return name
>> +
>> +    def _make_array_type(self, element_type):
>> +        name = element_type + 'List'
>> +        if not self.lookup_type(name):
>> +            self._def_entity(QAPISchemaArrayType(name, None, element_type))
>> +        return name
>> +
>> +    def _make_implicit_object_type(self, name, role, members):
>> +        name = ':obj-%s-%s' % (name, role)
>> +        self._def_entity(QAPISchemaObjectType(name, None, None, members, None))
>> +        return name
>
> We may extend the qapi .json syntax to allow anonymous types in more
> places, but so far, this matches that all existing use of anonymous
> types are for structs, not unions.

Yes.

If we unify struct and union in the concrete syntax like we do in this
internal representation, then anonymous unions (really: anonymous
objects with variants) may well fall out naturally.

>> +
>> +    def _def_enum_type(self, expr, info):
>> +        name = expr['enum']
>> +        data = expr['data']
>> +        self._def_entity(QAPISchemaEnumType(name, info, data))
>> +        self._make_array_type(name) # TODO really needed?
>> +
>> +    def _make_member(self, name, typ):
>> +        optional = False
>> +        if name.startswith('*'):
>> +            name = name[1:]
>> +            optional = True
>> +        if isinstance(typ, list):
>> +            assert len(typ) == 1
>> +            typ = self._make_array_type(typ[0])
>> +        return QAPISchemaObjectTypeMember(name, typ, optional)
>> +
>> +    def _make_members(self, data):
>> +        return [self._make_member(key, data[key]) for key in data.keys()]
>> +
>> +    def _def_struct_type(self, expr, info):
>> +        name = expr['struct']
>> +        base = expr.get('base')
>> +        data = expr['data']
>> +        self._def_entity(QAPISchemaObjectType(name, info, base,
>> +                                              self._make_members(data),
>> +                                              None))
>> +        self._make_array_type(name) # TODO really needed?
>> +
>> +    def _make_flat_variant(self, case, typ):
>> +        return QAPISchemaObjectTypeVariant(case, typ, True)
>> +
>> +    def _make_flat_variants(self, tag_name, data):
>> +        variants = [self._make_flat_variant(key, data[key])
>> +                    for key in data.keys()]
>> +        return QAPISchemaObjectTypeVariants(tag_name, None, variants)
>> +
>> +    def _make_simple_variant(self, case, typ):
>> +        if isinstance(typ, list):
>> +            assert len(typ) == 1
>> +            typ = self._make_array_type(typ[0])
>> +        return QAPISchemaObjectTypeVariant(case, typ, False)
>> +
>> +    def _make_simple_variants(self, type_name, data):
>> +        variants = [self._make_simple_variant(key, data[key])
>> +                    for key in data.keys()]
>> +        enum = self._make_implicit_enum_type(type_name,
>> +                                             [v.name for v in variants])
>> +        return QAPISchemaObjectTypeVariants(None, enum, variants)
>
> Again, I wonder if normalizing simple unions into flat unions (to get
> rid of the need for tracking .flat) would make later use any easier, or
> if it would just make introspection QMP output more verbose.

They *are* normalized in introspection output!

>> +
>> +    def _def_union_type(self, expr, info):
>> +        name = expr['union']
>> +        data = expr['data']
>> +        tag_name = expr.get('discriminator')
>> +        if tag_name:
>> +            base = expr['base']
>> +            variants = self._make_flat_variants(tag_name, data)
>> +        else:
>> +            base = None
>> +            variants = self._make_simple_variants(name, data)
>> +        self._def_entity(QAPISchemaObjectType(name, info, base,
>> + self._make_members(OrderedDict()),
>> +                                              variants))
>> +        self._make_array_type(name) # TODO really needed?
>> +
>> +    def _def_alternate_type(self, expr, info):
>> +        name = expr['alternate']
>> +        data = expr['data']
>> +        self._def_entity(QAPISchemaAlternateType(name, info,
>> +                                    self._make_simple_variants(name, data)))
>> +        self._make_array_type(name) # TODO really needed?
>> +
>> +    def _def_command(self, expr, info):
>> +        name = expr['command']
>> +        args = expr.get('data')
>> +        rets = expr.get('returns')
>> +        gen = expr.get('gen', True)
>> +        success_response = expr.get('success-response', True)
>> +        if args and not isinstance(args, str):
>> +            args = self._make_implicit_object_type(name, 'args',
>> +                                                   self._make_members(args))
>
> If I write { 'command':'Foo', 'data':{} }, does that try and create an
> implicit type, with no members?

tests/qapi-schema/qapi-schema-test.json tests this:

    { 'command': 'user_def_cmd', 'data': {} }

args = expr.get('data') gets an empty OrderedDict().

Since an empty dictionary counts as false, we skip calling
_make_implicit_object_type(), and pass the empty dictionary to
QAPISchemaCommand().  Works, because the class consistently treats any
false args the same as None.  tests/qapi-schema/qapi-schema-test.out:

    command user_def_cmd None -> None
       gen=True success_response=True

I'll consider cleaning this up a bit.

>                                  I'm wondering if it is any simpler to
> write args = expr.get('data', {}), and have _make_implicit_object_type()
> do the right thing when presented with an empty list of members.
>
>> +        if rets and isinstance(rets, list):
>> +            assert len(rets) == 1
>> +            rets = self._make_array_type(rets[0])
>> +        elif rets and not isinstance(rets, str):
>> +            rets = self._make_implicit_object_type(name, 'rets',
>> +                                                   self._make_members(rets))
>> +        self._def_entity(QAPISchemaCommand(name, info, args, rets, gen,
>> +                                           success_response))
>> +
>> +    def _def_event(self, expr, info):
>> +        name = expr['event']
>> +        data = expr.get('data')
>> +        if data and not isinstance(data, str):
>> +            data = self._make_implicit_object_type(name, 'data',
>> +                                                   self._make_members(data))
>> +        self._def_entity(QAPISchemaEvent(name, info, data))
>> +
>> +    def _def_exprs(self):
>> +        for expr_elem in self.exprs:
>> +            expr = expr_elem['expr']
>> +            info = expr_elem['info']
>> +            if 'enum' in expr:
>> +                self._def_enum_type(expr, info)
>> +            elif 'struct' in expr:
>> +                self._def_struct_type(expr, info)
>> +            elif 'union' in expr:
>> +                self._def_union_type(expr, info)
>> +            elif 'alternate' in expr:
>> +                self._def_alternate_type(expr, info)
>> +            elif 'command' in expr:
>> +                self._def_command(expr, info)
>> +            elif 'event' in expr:
>> +                self._def_event(expr, info)
>> +            else:
>> +                assert False
>> +
>> +    def check(self):
>> +        for ent in self.entity_dict.values():
>> +            ent.check(self)
>
> Overall seems like a very nice layout.
>
> I'll reserve R-b until I see how the rest of the series uses this object
> hierarchy, but it looks like you won't have to change very much of this
> patch when you do your next spin of the series.

Thanks!
Eric Blake July 27, 2015, 2:01 p.m. UTC | #3
On 07/27/2015 03:23 AM, Markus Armbruster wrote:

>>>
>>> Create a bunch of classes to represent QAPI schemata.
>>>
>>> Have the QAPISchema initializer call the parser, then walk the syntax
>>> tree to create the new internal representation, and finally perform
>>> semantic analysis.
>>>

>>> +class QAPISchemaObjectTypeVariant(QAPISchemaObjectTypeMember):
>>> +    def __init__(self, name, typ, flat):
>>> +        QAPISchemaObjectTypeMember.__init__(self, name, typ, False)
>>> +        assert isinstance(flat, bool)
>>> +        self.flat = flat
>>> +    def check(self, schema, tag_type, seen):
>>> +        QAPISchemaObjectTypeMember.check(self, schema, [], seen)
>>> +        assert self.name in tag_type.values
>>> +        if self.flat:
>>> +            self.type.check(schema)
>>> +            assert isinstance(self.type, QAPISchemaObjectType)
>>
>> Do we really want to be tracking self.flat for each variant?  After all,
>> docs/qapi-code-gen.txt already describes the mapping from simple union
>> into flat union (as if the flat union had a base class with single
>> member 'kind' of the right type, then each branch of the union composed
>> of an implicit object with a lone member 'data' of the correct type).
>> In other words, is it any better to just normalize into that form now,
>> such that each QAPISchemaObjectTypeVariant is merely a (often
>> one-element) list of name:type members being added to the overall
>> QAPISchemaObject?
> 
> I tried to do exactly that, but got bogged down in special cases and
> copped out.  Then I went on vacation, and now I don't remember the exact
> problems anymore %-}
> 
> I guess / hope it's just relatively pointless differences in the
> generated C code I didn't want to get rid of at this time.  The series
> is long and hairy enough as it is...
> 
>>                    But I guess it remains to be seen how you use
>> self.flat before knowing if it is worth normalizing away from it.
> 
> At least introspect.json is oblivious of it.

Yeah, that was my conclusion by the end of the series - we still had
enough special cases where we generate different code for simple unions
than for flat unions. It would be possible to merge the generator and
simplify things further, but at this point, it is best done as a
follow-up series because it will touch lots of C code (anything that
uses a simple union would have to convert to the common representation).

And you actually did reference .flat in the patch that first added
qapi-introspect.py (good that it did not leak to the introspection
output, but it did have to be considered when figuring out what to
output); again, something you may want to rework before polishing this
into v3, or something you may end up just documenting as a possible
cleanup for a future series. But after having reviewed the whole series
now, I'm able to live with your use of .flat, if only because it makes
the initial conversion faster.

>>> +
>>> +    def lookup_entity(self, name, typ=None):
>>> +        ent = self.entity_dict.get(name)
>>> +        if typ and not isinstance(ent, typ):
>>> +            return None
>>> +        return ent
>>
>> Ah, so you enforce a shared namespace between types, commands, and
>> events (all three are in self.entity_dict), but can use the typ
>> parameter to allow limiting a lookup to just types.
> 
> Yes.  It's a convenience feature.

Documentation comments never hurt :) (You did mention in the cover
letter that part of the reason this was still RFC was that it was
lacking documentation)
diff mbox

Patch

diff --git a/scripts/qapi-commands.py b/scripts/qapi-commands.py
index 223216d..3cdbd97 100644
--- a/scripts/qapi-commands.py
+++ b/scripts/qapi-commands.py
@@ -276,7 +276,7 @@  for o, a in opts:
     if o in ("-m", "--middle"):
         middle_mode = True
 
-exprs = parse_schema(input_file)
+exprs = QAPISchema(input_file).get_exprs()
 commands = filter(lambda expr: expr.has_key('command'), exprs)
 commands = filter(lambda expr: not expr.has_key('gen'), commands)
 
diff --git a/scripts/qapi-event.py b/scripts/qapi-event.py
index 7f238df..aec2d32 100644
--- a/scripts/qapi-event.py
+++ b/scripts/qapi-event.py
@@ -263,7 +263,7 @@  fdecl.write(mcgen('''
 ''',
                   prefix=prefix))
 
-exprs = parse_schema(input_file)
+exprs = QAPISchema(input_file).get_exprs()
 
 event_enum_name = c_name(prefix + "QAPIEvent", protect=False)
 event_enum_values = []
diff --git a/scripts/qapi-types.py b/scripts/qapi-types.py
index 4cc17da..a48ad9c 100644
--- a/scripts/qapi-types.py
+++ b/scripts/qapi-types.py
@@ -325,7 +325,7 @@  fdecl.write(mcgen('''
 #include <stdint.h>
 '''))
 
-exprs = parse_schema(input_file)
+exprs = QAPISchema(input_file).get_exprs()
 
 fdecl.write(guardstart("QAPI_TYPES_BUILTIN_STRUCT_DECL"))
 for typename in builtin_types.keys():
diff --git a/scripts/qapi-visit.py b/scripts/qapi-visit.py
index 427819a..a52a572 100644
--- a/scripts/qapi-visit.py
+++ b/scripts/qapi-visit.py
@@ -442,7 +442,7 @@  fdecl.write(mcgen('''
 ''',
                   prefix=prefix))
 
-exprs = parse_schema(input_file)
+exprs = QAPISchema(input_file).get_exprs()
 
 # to avoid header dependency hell, we always generate declarations
 # for built-in types in our header files and simply guard them
diff --git a/scripts/qapi.py b/scripts/qapi.py
index 390ccd0..a80892e 100644
--- a/scripts/qapi.py
+++ b/scripts/qapi.py
@@ -302,6 +302,7 @@  class QAPISchemaParser(object):
 
 #
 # Semantic analysis of schema expressions
+# TODO fold into QAPISchema
 #
 
 def find_base_fields(base):
@@ -751,15 +752,347 @@  def check_exprs(exprs):
         else:
             assert False, 'unexpected meta type'
 
-    return map(lambda expr_elem: expr_elem['expr'], exprs)
-
-def parse_schema(fname):
-    try:
-        schema = QAPISchemaParser(open(fname, "r"))
-        return check_exprs(schema.exprs)
-    except (QAPISchemaError, QAPIExprError), e:
-        print >>sys.stderr, e
-        exit(1)
+    return exprs
+
+#
+# Schema compiler frontend
+#
+
+class QAPISchemaEntity(object):
+    def __init__(self, name, info):
+        assert isinstance(name, str)
+        self.name = name
+        self.info = info
+    def check(self, schema):
+        pass
+
+class QAPISchemaType(QAPISchemaEntity):
+    pass
+
+class QAPISchemaBuiltinType(QAPISchemaType):
+    def __init__(self, name):
+        QAPISchemaType.__init__(self, name, None)
+
+class QAPISchemaEnumType(QAPISchemaType):
+    def __init__(self, name, info, values):
+        QAPISchemaType.__init__(self, name, info)
+        for v in values:
+            assert isinstance(v, str)
+        self.values = values
+
+class QAPISchemaArrayType(QAPISchemaType):
+    def __init__(self, name, info, element_type):
+        QAPISchemaType.__init__(self, name, info)
+        assert isinstance(element_type, str)
+        self.element_type_name = element_type
+        self.element_type = None
+    def check(self, schema):
+        self.element_type = schema.lookup_type(self.element_type_name)
+        assert self.element_type
+
+class QAPISchemaObjectType(QAPISchemaType):
+    def __init__(self, name, info, base, local_members, variants):
+        QAPISchemaType.__init__(self, name, info)
+        assert base == None or isinstance(base, str)
+        for m in local_members:
+            assert isinstance(m, QAPISchemaObjectTypeMember)
+        if variants != None:
+            assert isinstance(variants, QAPISchemaObjectTypeVariants)
+        self.base_name = base
+        self.base = None
+        self.local_members = local_members
+        self.variants = variants
+        self.members = None
+    def check(self, schema):
+        if self.members:
+            return                      # already checked
+        assert self.members == None     # not running in cycles
+        self.members = False            # mark as being checked
+        if self.base_name:
+            self.base = schema.lookup_type(self.base_name)
+            self.base.check(schema)
+            members = list(self.base.members)
+        else:
+            members = []
+        seen = {}
+        for m in members:
+            seen[m.name] = m
+        for m in self.local_members:
+            m.check(schema, members, seen)
+        if self.variants:
+            self.variants.check(schema, members, seen)
+        self.members = members
+
+class QAPISchemaObjectTypeMember(object):
+    def __init__(self, name, typ, optional):
+        assert isinstance(name, str)
+        assert isinstance(typ, str)
+        assert isinstance(optional, bool)
+        self.name = name
+        self.type_name = typ
+        self.type = None
+        self.optional = optional
+    def check(self, schema, all_members, seen):
+        assert self.name not in seen
+        self.type = schema.lookup_type(self.type_name)
+        assert self.type
+        all_members.append(self)
+        seen[self.name] = self
+
+class QAPISchemaObjectTypeVariants(object):
+    def __init__(self, tag_name, tag_enum, variants):
+        assert tag_name == None or isinstance(tag_name, str)
+        assert tag_enum == None or isinstance(tag_enum, str)
+        for v in variants:
+            assert isinstance(v, QAPISchemaObjectTypeVariant)
+        self.tag_name = tag_name
+        if tag_name:
+            assert not tag_enum
+            self.tag_member = None
+        else:
+            self.tag_member = QAPISchemaObjectTypeMember('kind', tag_enum,
+                                                         False)
+        self.variants = variants
+    def check(self, schema, members, seen):
+        if self.tag_name:
+            self.tag_member = seen[self.tag_name]
+        else:
+            self.tag_member.check(schema, members, seen)
+        assert isinstance(self.tag_member.type, QAPISchemaEnumType)
+        for v in self.variants:
+            vseen = dict(seen)
+            v.check(schema, self.tag_member.type, vseen)
+
+class QAPISchemaObjectTypeVariant(QAPISchemaObjectTypeMember):
+    def __init__(self, name, typ, flat):
+        QAPISchemaObjectTypeMember.__init__(self, name, typ, False)
+        assert isinstance(flat, bool)
+        self.flat = flat
+    def check(self, schema, tag_type, seen):
+        QAPISchemaObjectTypeMember.check(self, schema, [], seen)
+        assert self.name in tag_type.values
+        if self.flat:
+            self.type.check(schema)
+            assert isinstance(self.type, QAPISchemaObjectType)
+
+class QAPISchemaAlternateType(QAPISchemaType):
+    def __init__(self, name, info, variants):
+        QAPISchemaType.__init__(self, name, info)
+        assert isinstance(variants, QAPISchemaObjectTypeVariants)
+        assert not variants.tag_name
+        for v in variants.variants:
+            assert not v.flat
+        self.variants = variants
+    def check(self, schema):
+        self.variants.check(schema, [], {})
+
+class QAPISchemaCommand(QAPISchemaEntity):
+    def __init__(self, name, info, args, rets, gen, success_response):
+        QAPISchemaEntity.__init__(self, name, info)
+        assert not args or isinstance(args, str)
+        assert not rets or isinstance(rets, str)
+        self.args_name = args
+        self.args = None
+        self.rets_name = rets
+        self.rets = None
+        self.gen = gen
+        self.success_response = success_response
+    def check(self, schema):
+        if self.args_name:
+            self.args = schema.lookup_type(self.args_name)
+            assert isinstance(self.args, QAPISchemaObjectType)
+            assert not self.args.variants # not implemented
+            self.args.check(schema)
+        if self.rets_name:
+            self.rets = schema.lookup_type(self.rets_name)
+            assert isinstance(self.rets, QAPISchemaType)
+            self.rets.check(schema)
+
+class QAPISchemaEvent(QAPISchemaEntity):
+    def __init__(self, name, info, data):
+        QAPISchemaEntity.__init__(self, name, info)
+        assert not data or isinstance(data, str)
+        self.data_name = data
+        self.data = None
+    def check(self, schema):
+        if self.data_name:
+            self.data = schema.lookup_type(self.data_name)
+            assert isinstance(self.data, QAPISchemaObjectType)
+            self.data.check(schema)
+
+class QAPISchema(object):
+    def __init__(self, fname):
+        try:
+            self.exprs = check_exprs(QAPISchemaParser(open(fname, "r")).exprs)
+        except (QAPISchemaError, QAPIExprError), err:
+            print >>sys.stderr, err
+            exit(1)
+        self.entity_dict = {}
+        self._def_predefineds()
+        self._def_exprs()
+        self.check()
+
+    def get_exprs(self):
+        return [expr_elem['expr'] for expr_elem in self.exprs]
+
+    def _def_entity(self, ent):
+        assert ent.name not in self.entity_dict
+        self.entity_dict[ent.name] = ent
+
+    def lookup_entity(self, name, typ=None):
+        ent = self.entity_dict.get(name)
+        if typ and not isinstance(ent, typ):
+            return None
+        return ent
+
+    def lookup_type(self, name):
+        return self.lookup_entity(name, QAPISchemaType)
+
+    def _def_builtin_type(self, name):
+        self._def_entity(QAPISchemaBuiltinType(name))
+        if name != '**':
+            self._make_array_type(name) # TODO really needed?
+
+    def _def_predefineds(self):
+        for t in ['str', 'number', 'int', 'int8', 'int16', 'int32', 'int64',
+                  'uint8', 'uint16', 'uint32', 'uint64', 'size', 'bool', '**']:
+            self._def_builtin_type(t)
+
+    def _make_implicit_enum_type(self, name, values):
+        name = name + 'Kind'
+        self._def_entity(QAPISchemaEnumType(name, None, values))
+        return name
+
+    def _make_array_type(self, element_type):
+        name = element_type + 'List'
+        if not self.lookup_type(name):
+            self._def_entity(QAPISchemaArrayType(name, None, element_type))
+        return name
+
+    def _make_implicit_object_type(self, name, role, members):
+        name = ':obj-%s-%s' % (name, role)
+        self._def_entity(QAPISchemaObjectType(name, None, None, members, None))
+        return name
+
+    def _def_enum_type(self, expr, info):
+        name = expr['enum']
+        data = expr['data']
+        self._def_entity(QAPISchemaEnumType(name, info, data))
+        self._make_array_type(name) # TODO really needed?
+
+    def _make_member(self, name, typ):
+        optional = False
+        if name.startswith('*'):
+            name = name[1:]
+            optional = True
+        if isinstance(typ, list):
+            assert len(typ) == 1
+            typ = self._make_array_type(typ[0])
+        return QAPISchemaObjectTypeMember(name, typ, optional)
+
+    def _make_members(self, data):
+        return [self._make_member(key, data[key]) for key in data.keys()]
+
+    def _def_struct_type(self, expr, info):
+        name = expr['struct']
+        base = expr.get('base')
+        data = expr['data']
+        self._def_entity(QAPISchemaObjectType(name, info, base,
+                                              self._make_members(data),
+                                              None))
+        self._make_array_type(name) # TODO really needed?
+
+    def _make_flat_variant(self, case, typ):
+        return QAPISchemaObjectTypeVariant(case, typ, True)
+
+    def _make_flat_variants(self, tag_name, data):
+        variants = [self._make_flat_variant(key, data[key])
+                    for key in data.keys()]
+        return QAPISchemaObjectTypeVariants(tag_name, None, variants)
+
+    def _make_simple_variant(self, case, typ):
+        if isinstance(typ, list):
+            assert len(typ) == 1
+            typ = self._make_array_type(typ[0])
+        return QAPISchemaObjectTypeVariant(case, typ, False)
+
+    def _make_simple_variants(self, type_name, data):
+        variants = [self._make_simple_variant(key, data[key])
+                    for key in data.keys()]
+        enum = self._make_implicit_enum_type(type_name,
+                                             [v.name for v in variants])
+        return QAPISchemaObjectTypeVariants(None, enum, variants)
+
+    def _def_union_type(self, expr, info):
+        name = expr['union']
+        data = expr['data']
+        tag_name = expr.get('discriminator')
+        if tag_name:
+            base = expr['base']
+            variants = self._make_flat_variants(tag_name, data)
+        else:
+            base = None
+            variants = self._make_simple_variants(name, data)
+        self._def_entity(QAPISchemaObjectType(name, info, base,
+                                              self._make_members(OrderedDict()),
+                                              variants))
+        self._make_array_type(name) # TODO really needed?
+
+    def _def_alternate_type(self, expr, info):
+        name = expr['alternate']
+        data = expr['data']
+        self._def_entity(QAPISchemaAlternateType(name, info,
+                                    self._make_simple_variants(name, data)))
+        self._make_array_type(name) # TODO really needed?
+
+    def _def_command(self, expr, info):
+        name = expr['command']
+        args = expr.get('data')
+        rets = expr.get('returns')
+        gen = expr.get('gen', True)
+        success_response = expr.get('success-response', True)
+        if args and not isinstance(args, str):
+            args = self._make_implicit_object_type(name, 'args',
+                                                   self._make_members(args))
+        if rets and isinstance(rets, list):
+            assert len(rets) == 1
+            rets = self._make_array_type(rets[0])
+        elif rets and not isinstance(rets, str):
+            rets = self._make_implicit_object_type(name, 'rets',
+                                                   self._make_members(rets))
+        self._def_entity(QAPISchemaCommand(name, info, args, rets, gen,
+                                           success_response))
+
+    def _def_event(self, expr, info):
+        name = expr['event']
+        data = expr.get('data')
+        if data and not isinstance(data, str):
+            data = self._make_implicit_object_type(name, 'data',
+                                                   self._make_members(data))
+        self._def_entity(QAPISchemaEvent(name, info, data))
+
+    def _def_exprs(self):
+        for expr_elem in self.exprs:
+            expr = expr_elem['expr']
+            info = expr_elem['info']
+            if 'enum' in expr:
+                self._def_enum_type(expr, info)
+            elif 'struct' in expr:
+                self._def_struct_type(expr, info)
+            elif 'union' in expr:
+                self._def_union_type(expr, info)
+            elif 'alternate' in expr:
+                self._def_alternate_type(expr, info)
+            elif 'command' in expr:
+                self._def_command(expr, info)
+            elif 'event' in expr:
+                self._def_event(expr, info)
+            else:
+                assert False
+
+    def check(self):
+        for ent in self.entity_dict.values():
+            ent.check(self)
 
 #
 # Code generation helpers
diff --git a/tests/qapi-schema/test-qapi.py b/tests/qapi-schema/test-qapi.py
index 634ef2d..461c713 100644
--- a/tests/qapi-schema/test-qapi.py
+++ b/tests/qapi-schema/test-qapi.py
@@ -16,7 +16,7 @@  import os
 import sys
 
 try:
-    exprs = parse_schema(sys.argv[1])
+    exprs = QAPISchema(sys.argv[1]).get_exprs()
 except SystemExit:
     raise