diff mbox

[v4,2/5] qapi: add qapi-introspect.py code generator

Message ID 1390488396-16538-3-git-send-email-akong@redhat.com
State New
Headers show

Commit Message

Amos Kong Jan. 23, 2014, 2:46 p.m. UTC
This is a code generator for qapi introspection. It will parse
qapi-schema.json, extend schema definitions and generate a schema
table with metadata, it references to the new structs which we used
to describe dynamic data structs.  The metadata will help C code to
allocate right structs and provide useful information to management
to checking suported feature and QMP commandline detail. The schema
table will be saved to qapi-introspect.h.

The $(prefix) is used to as a namespace to keep the generated code
from one schema/code-generation separated from others so code and
be generated from multiple schemas with clobbering previously
created code.

Signed-off-by: Amos Kong <akong@redhat.com>
---
 .gitignore                      |   1 +
 Makefile                        |   5 +-
 docs/qmp-full-introspection.txt |  17 ++++
 scripts/qapi-introspect.py      | 172 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 194 insertions(+), 1 deletion(-)
 create mode 100644 scripts/qapi-introspect.py

Comments

Fam Zheng Jan. 24, 2014, 9:12 a.m. UTC | #1
On Thu, 01/23 22:46, Amos Kong wrote:
> This is a code generator for qapi introspection. It will parse
> qapi-schema.json, extend schema definitions and generate a schema
> table with metadata, it references to the new structs which we used
> to describe dynamic data structs.  The metadata will help C code to
> allocate right structs and provide useful information to management
> to checking suported feature and QMP commandline detail. The schema
> table will be saved to qapi-introspect.h.
> 
> The $(prefix) is used to as a namespace to keep the generated code

s/used to as/used as/

> from one schema/code-generation separated from others so code and
> be generated from multiple schemas with clobbering previously

s/with/without/

> created code.
> 
> Signed-off-by: Amos Kong <akong@redhat.com>
> ---
>  .gitignore                      |   1 +
>  Makefile                        |   5 +-
>  docs/qmp-full-introspection.txt |  17 ++++
>  scripts/qapi-introspect.py      | 172 ++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 194 insertions(+), 1 deletion(-)
>  create mode 100644 scripts/qapi-introspect.py
> 
> diff --git a/.gitignore b/.gitignore
> index 1c9d63d..de3cb80 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -22,6 +22,7 @@ linux-headers/asm
>  qapi-generated
>  qapi-types.[ch]
>  qapi-visit.[ch]
> +qapi-introspect.h
>  qmp-commands.h
>  qmp-marshal.c
>  qemu-doc.html
> diff --git a/Makefile b/Makefile
> index bdff4e4..1dac5e7 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -45,7 +45,7 @@ endif
>  endif
>  
>  GENERATED_HEADERS = config-host.h qemu-options.def
> -GENERATED_HEADERS += qmp-commands.h qapi-types.h qapi-visit.h
> +GENERATED_HEADERS += qmp-commands.h qapi-types.h qapi-visit.h qapi-introspect.h
>  GENERATED_SOURCES += qmp-marshal.c qapi-types.c qapi-visit.c
>  
>  GENERATED_HEADERS += trace/generated-events.h
> @@ -229,6 +229,9 @@ $(SRC_PATH)/qapi-schema.json $(SRC_PATH)/scripts/qapi-visit.py $(qapi-py)
>  qmp-commands.h qmp-marshal.c :\
>  $(SRC_PATH)/qapi-schema.json $(SRC_PATH)/scripts/qapi-commands.py $(qapi-py)
>  	$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-commands.py $(gen-out-type) -m -o "." < $<, "  GEN   $@")
> +qapi-introspect.h:\
> +$(SRC_PATH)/qapi-schema.json $(SRC_PATH)/scripts/qapi-introspect.py $(qapi-py)
> +	$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-introspect.py $(gen-out-type) -o "." < $<, "  GEN   $@")
>  
>  QGALIB_GEN=$(addprefix qga/qapi-generated/, qga-qapi-types.h qga-qapi-visit.h qga-qmp-commands.h)
>  $(qga-obj-y) qemu-ga.o: $(QGALIB_GEN)
> diff --git a/docs/qmp-full-introspection.txt b/docs/qmp-full-introspection.txt
> index d2cf7b3..8ecbc0c 100644
> --- a/docs/qmp-full-introspection.txt
> +++ b/docs/qmp-full-introspection.txt
> @@ -42,3 +42,20 @@ types.
>  
>  'anonymous-struct' will be used to describe arbitrary structs
>  (dictionary, list or string).
> +
> +== Avoid dead loop in recursive extending ==
> +
> +We have four types (ImageInfo, BlockStats, PciDeviceInfo, ObjectData)
> +that uses themself in their own define data directly or indirectly,

s/themself/themselves/
s/define data/definition/

> +we will not repeatedly extend them to avoid dead loop.
> +
> +We use a 'parents List' to record the visit path, type name of each
> +extended node will be saved to the List.
> +
> +Append type name to the list before extending, and remove type name
> +from the list after extending.
> +
> +If the type name is already extended in parents List, we won't extend
> +it repeatedly for avoiding dead loop.

This "parents" list detail is not reflected in the generated information,
right?  I think it's good enough to describe that "type will not be extented
more than once in a schema, when there's direct or indirect recursive
type composition".

> +
> +'recursive' indicates if the type is extended or not.
> diff --git a/scripts/qapi-introspect.py b/scripts/qapi-introspect.py
> new file mode 100644
> index 0000000..03179fa
> --- /dev/null
> +++ b/scripts/qapi-introspect.py
> @@ -0,0 +1,172 @@
> +#
> +# QAPI introspection info generator
> +#
> +# Copyright (C) 2014 Red Hat, Inc.
> +#
> +# Authors:
> +#  Amos Kong <akong@redhat.com>
> +#
> +# This work is licensed under the terms of the GNU GPLv2.
> +# See the COPYING.LIB file in the top-level directory.
> +
> +from ordereddict import OrderedDict
> +from qapi import *
> +import sys
> +import os
> +import getopt
> +import errno
> +
> +
> +try:
> +    opts, args = getopt.gnu_getopt(sys.argv[1:], "hp:o:",
> +                                   ["header", "prefix=", "output-dir="])
> +except getopt.GetoptError, err:
> +    print str(err)
> +    sys.exit(1)
> +
> +output_dir = ""
> +prefix = ""
> +h_file = 'qapi-introspect.h'
> +
> +do_h = False
> +
> +for o, a in opts:
> +    if o in ("-p", "--prefix"):
> +        prefix = a

Is this option used in your series?

Thanks,
Fam

> +    elif o in ("-o", "--output-dir"):
> +        output_dir = a + "/"
> +    elif o in ("-h", "--header"):
> +        do_h = True
> +
> +h_file = output_dir + prefix + h_file
> +
> +try:
> +    os.makedirs(output_dir)
> +except os.error, e:
> +    if e.errno != errno.EEXIST:
> +        raise
> +
> +def maybe_open(really, name, opt):
> +    if really:
> +        return open(name, opt)
> +    else:
> +        import StringIO
> +        return StringIO.StringIO()
> +
> +fdecl = maybe_open(do_h, h_file, 'w')
> +
> +fdecl.write(mcgen('''
> +/* AUTOMATICALLY GENERATED, DO NOT MODIFY */
> +
> +/*
> + * Head file to store parsed information of QAPI schema
> + *
> + * Copyright (C) 2014 Red Hat, Inc.
> + *
> + * Authors:
> + *  Amos Kong <akong@redhat.com>
> + *
> + * This work is licensed under the terms of the GNU LGPL, version 2.1 or later.
> + * See the COPYING.LIB file in the top-level directory.
> + *
> + */
> +
> +#ifndef %(guard)s
> +#define %(guard)s
> +
> +''',
> +                  guard=guardname(h_file)))
> +
> +def extend_schema(expr, parents=[], member=True):
> +    ret = {}
> +    recu = 'False'
> +    name = ""
> +
> +    if type(expr) is OrderedDict:
> +        if not member:
> +            e = expr.popitem(last=False)
> +            typ = e[0]
> +            name = e[1]
> +        else:
> +            typ = "anonymous-struct"
> +
> +        if typ == 'enum':
> +            for key in expr.keys():
> +                ret[key] = expr[key]
> +        else:
> +            ret = {}
> +            for key in expr.keys():
> +                ret[key], parents = extend_schema(expr[key], parents)
> +
> +    elif type(expr) is list:
> +        typ = 'anonymous-struct'
> +        ret = []
> +        for i in expr:
> +            tmp, parents = extend_schema(i, parents)
> +            ret.append(tmp)
> +    elif type(expr) is str:
> +        name = expr
> +        if schema_dict.has_key(expr) and expr not in parents:
> +            parents.append(expr)
> +            typ = schema_dict[expr][1]
> +            recu = 'True'
> +            ret, parents = extend_schema(schema_dict[expr][0].copy(),
> +                                         parents, False)
> +            parents.remove(expr)
> +            ret['_obj_recursive'] = 'True'
> +            return ret, parents
> +        else:
> +            return expr, parents
> +
> +    return {'_obj_member': "%s" % member, '_obj_type': typ,
> +            '_obj_name': name, '_obj_recursive': recu,
> +            '_obj_data': ret}, parents
> +
> +
> +exprs = parse_schema(sys.stdin)
> +schema_dict = {}
> +
> +for expr in exprs:
> +    if expr.has_key('type') or expr.has_key('enum') or expr.has_key('union'):
> +        e = expr.copy()
> +
> +        first = e.popitem(last=False)
> +        schema_dict[first[1]] = [expr.copy(), first[0]]
> +
> +fdecl.write('''const char *const qmp_schema_table[] = {
> +''')
> +
> +def convert(odict):
> +    d = {}
> +    for k, v in odict.items():
> +        if type(v) is OrderedDict:
> +            d[k] = convert(v)
> +        elif type(v) is list:
> +            l = []
> +            for j in v:
> +                if type(j) is OrderedDict:
> +                    l.append(convert(j))
> +                else:
> +                    l.append(j)
> +            d[k] = l
> +        else:
> +            d[k] = v
> +    return d
> +
> +count = 0
> +for expr in exprs:
> +    fdecl.write('''    /* %s */
> +''' % expr)
> +
> +    expr, parents = extend_schema(expr, [], False)
> +    fdecl.write('''    "%s",
> +
> +''' % convert(expr))
> +
> +fdecl.write('''    NULL };
> +
> +#endif
> +''')
> +
> +fdecl.flush()
> +fdecl.close()
> -- 
> 1.8.4.2
> 
>
Amos Kong Jan. 24, 2014, 9:34 a.m. UTC | #2
On Fri, Jan 24, 2014 at 05:12:12PM +0800, Fam Zheng wrote:
> On Thu, 01/23 22:46, Amos Kong wrote:
> > This is a code generator for qapi introspection. It will parse
> > qapi-schema.json, extend schema definitions and generate a schema
> > table with metadata, it references to the new structs which we used
> > to describe dynamic data structs.  The metadata will help C code to
> > allocate right structs and provide useful information to management
> > to checking suported feature and QMP commandline detail. The schema
> > table will be saved to qapi-introspect.h.
> > 
> > The $(prefix) is used to as a namespace to keep the generated code
> 
> s/used to as/used as/
> 
> > from one schema/code-generation separated from others so code and
> > be generated from multiple schemas with clobbering previously
> 
> s/with/without/
> 
> > created code.
> > 
> > Signed-off-by: Amos Kong <akong@redhat.com>
> > ---
> >  .gitignore                      |   1 +
> >  Makefile                        |   5 +-
> >  docs/qmp-full-introspection.txt |  17 ++++
> >  scripts/qapi-introspect.py      | 172 ++++++++++++++++++++++++++++++++++++++++
> >  4 files changed, 194 insertions(+), 1 deletion(-)
> >  create mode 100644 scripts/qapi-introspect.py
> > 
> > diff --git a/.gitignore b/.gitignore
> > index 1c9d63d..de3cb80 100644
> > --- a/.gitignore
> > +++ b/.gitignore
> > @@ -22,6 +22,7 @@ linux-headers/asm
> >  qapi-generated
> >  qapi-types.[ch]
> >  qapi-visit.[ch]
> > +qapi-introspect.h
> >  qmp-commands.h
> >  qmp-marshal.c
> >  qemu-doc.html
> > diff --git a/Makefile b/Makefile
> > index bdff4e4..1dac5e7 100644
> > --- a/Makefile
> > +++ b/Makefile
> > @@ -45,7 +45,7 @@ endif
> >  endif
> >  
> >  GENERATED_HEADERS = config-host.h qemu-options.def
> > -GENERATED_HEADERS += qmp-commands.h qapi-types.h qapi-visit.h
> > +GENERATED_HEADERS += qmp-commands.h qapi-types.h qapi-visit.h qapi-introspect.h
> >  GENERATED_SOURCES += qmp-marshal.c qapi-types.c qapi-visit.c
> >  
> >  GENERATED_HEADERS += trace/generated-events.h
> > @@ -229,6 +229,9 @@ $(SRC_PATH)/qapi-schema.json $(SRC_PATH)/scripts/qapi-visit.py $(qapi-py)
> >  qmp-commands.h qmp-marshal.c :\
> >  $(SRC_PATH)/qapi-schema.json $(SRC_PATH)/scripts/qapi-commands.py $(qapi-py)
> >  	$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-commands.py $(gen-out-type) -m -o "." < $<, "  GEN   $@")
> > +qapi-introspect.h:\
> > +$(SRC_PATH)/qapi-schema.json $(SRC_PATH)/scripts/qapi-introspect.py $(qapi-py)
> > +	$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-introspect.py $(gen-out-type) -o "." < $<, "  GEN   $@")
> >  
> >  QGALIB_GEN=$(addprefix qga/qapi-generated/, qga-qapi-types.h qga-qapi-visit.h qga-qmp-commands.h)
> >  $(qga-obj-y) qemu-ga.o: $(QGALIB_GEN)
> > diff --git a/docs/qmp-full-introspection.txt b/docs/qmp-full-introspection.txt
> > index d2cf7b3..8ecbc0c 100644
> > --- a/docs/qmp-full-introspection.txt
> > +++ b/docs/qmp-full-introspection.txt
> > @@ -42,3 +42,20 @@ types.
> >  
> >  'anonymous-struct' will be used to describe arbitrary structs
> >  (dictionary, list or string).
> > +
> > +== Avoid dead loop in recursive extending ==
> > +
> > +We have four types (ImageInfo, BlockStats, PciDeviceInfo, ObjectData)
> > +that uses themself in their own define data directly or indirectly,
> 
> s/themself/themselves/
> s/define data/definition/
> 
> > +we will not repeatedly extend them to avoid dead loop.
> > +
> > +We use a 'parents List' to record the visit path, type name of each
> > +extended node will be saved to the List.
> > +
> > +Append type name to the list before extending, and remove type name
> > +from the list after extending.
> > +
> > +If the type name is already extended in parents List, we won't extend
> > +it repeatedly for avoiding dead loop.
> 
> This "parents" list detail is not reflected in the generated information,
> right?

Not, it just used to help the extending.

>  I think it's good enough to describe that "type will not be extented
> more than once in a schema, when there's direct or indirect recursive
> type composition".

It's more meaningful.
 
> > +
> > +'recursive' indicates if the type is extended or not.
> > diff --git a/scripts/qapi-introspect.py b/scripts/qapi-introspect.py
> > new file mode 100644
> > index 0000000..03179fa
> > --- /dev/null
> > +++ b/scripts/qapi-introspect.py
> > @@ -0,0 +1,172 @@
> > +#
> > +# QAPI introspection info generator
> > +#
> > +# Copyright (C) 2014 Red Hat, Inc.
> > +#
> > +# Authors:
> > +#  Amos Kong <akong@redhat.com>
> > +#
> > +# This work is licensed under the terms of the GNU GPLv2.
> > +# See the COPYING.LIB file in the top-level directory.
> > +
> > +from ordereddict import OrderedDict
> > +from qapi import *
> > +import sys
> > +import os
> > +import getopt
> > +import errno
> > +
> > +
> > +try:
> > +    opts, args = getopt.gnu_getopt(sys.argv[1:], "hp:o:",
> > +                                   ["header", "prefix=", "output-dir="])
> > +except getopt.GetoptError, err:
> > +    print str(err)
> > +    sys.exit(1)
> > +
> > +output_dir = ""
> > +prefix = ""
> > +h_file = 'qapi-introspect.h'
> > +
> > +do_h = False
> > +
> > +for o, a in opts:
> > +    if o in ("-p", "--prefix"):
> > +        prefix = a
> 
> Is this option used in your series?

Not, I will remove it.
 
> Thanks,
> Fam
 
Thanks for your review.
Amos
Eric Blake Feb. 4, 2014, 12:15 a.m. UTC | #3
On 01/23/2014 07:46 AM, Amos Kong wrote:
> This is a code generator for qapi introspection. It will parse
> qapi-schema.json, extend schema definitions and generate a schema
> table with metadata, it references to the new structs which we used
> to describe dynamic data structs.  The metadata will help C code to
> allocate right structs and provide useful information to management
> to checking suported feature and QMP commandline detail. The schema

s/suported/supported/

> table will be saved to qapi-introspect.h.
> 
> The $(prefix) is used to as a namespace to keep the generated code
> from one schema/code-generation separated from others so code and
> be generated from multiple schemas with clobbering previously
> created code.
> 
> Signed-off-by: Amos Kong <akong@redhat.com>
> ---
>  .gitignore                      |   1 +
>  Makefile                        |   5 +-
>  docs/qmp-full-introspection.txt |  17 ++++
>  scripts/qapi-introspect.py      | 172 ++++++++++++++++++++++++++++++++++++++++

I am NOT a python expert.  But what I _can_ do is apply this patch and
review the generated code for sanity.

>  4 files changed, 194 insertions(+), 1 deletion(-)
>  create mode 100644 scripts/qapi-introspect.py
> 

> +++ b/docs/qmp-full-introspection.txt
> @@ -42,3 +42,20 @@ types.
>  
>  'anonymous-struct' will be used to describe arbitrary structs
>  (dictionary, list or string).
> +
> +== Avoid dead loop in recursive extending ==

I'm still not convinced whether recursive extending is necessary.

Let's step back and look at the big picture: if libvirt asks for the
ENTIRE schema in one go, then it would be nice to have the
representation as compact as possible (minimal time sending data across
the wire, minimal memory consumed).  Libvirt can then create its own
hash table of type references encountered as it consumes the JSON, and
do its own recursive lookups by reading from its hash table on an
as-needed basis.  Recursive expansion avoids the need to look up type
references, but explodes the size of the data sent over the wire.

On the other hand, if we support filtering by one type, then I agree
that getting all details about that type in one command is nicer than
having to issue a command, notice unknown type references, then issue
followup commands to learn about them.  So in this regards, having qemu
do the expansion minimizes back-and-forth traffic.  BUT, should the
expansion be inlined (ie. the return is an array of 1 type, but that
type contains several further layers of JSON giving the full expansion
of every type referenced), or is it smarter to do the expansion via
returning multiple types even though libvirt only asked about one (as an
example, return an array of 10 types, where the first array entry is the
requested type, and the remaining 9 types fill out the type references
mentioned by the first array entry).

> +
> +We have four types (ImageInfo, BlockStats, PciDeviceInfo, ObjectData)
> +that uses themself in their own define data directly or indirectly,

s/uses/themself/use themselves/

> +we will not repeatedly extend them to avoid dead loop.

s/dead loop/infinite loop/

> +
> +We use a 'parents List' to record the visit path, type name of each
> +extended node will be saved to the List.
> +
> +Append type name to the list before extending, and remove type name
> +from the list after extending.
> +
> +If the type name is already extended in parents List, we won't extend
> +it repeatedly for avoiding dead loop.
> +
> +'recursive' indicates if the type is extended or not.

Ah, now we get to the definition of an extended type that I was asking
about in 1/5.

> diff --git a/scripts/qapi-introspect.py b/scripts/qapi-introspect.py
> new file mode 100644
> index 0000000..03179fa
> --- /dev/null
> +++ b/scripts/qapi-introspect.py
> @@ -0,0 +1,172 @@
> +#
> +# QAPI introspection info generator
> +#
> +# Copyright (C) 2014 Red Hat, Inc.
> +#
> +# Authors:
> +#  Amos Kong <akong@redhat.com>
> +#
> +# This work is licensed under the terms of the GNU GPLv2.

Any reason you can't use GPLv2+ (that is, use the "or later" clause)?
(Red Hat already has blanket approval for any contributions from
employees to be relicensed to GPLv2+, but if you copied code from other
files with a tighter license and non-RH contributor, it's harder to use
that blanket approval, so it's easier to ask you now up front.)

> +fdecl.write(mcgen('''
> +/* AUTOMATICALLY GENERATED, DO NOT MODIFY */
> +
> +/*
> + * Head file to store parsed information of QAPI schema
> + *
> + * Copyright (C) 2014 Red Hat, Inc.
> + *
> + * Authors:
> + *  Amos Kong <akong@redhat.com>
> + *
> + * This work is licensed under the terms of the GNU LGPL, version 2.1 or later.

Or even license your .py under LGPLv2+, since the generated code is
under that license?

> +def extend_schema(expr, parents=[], member=True):
> +    ret = {}
> +    recu = 'False'
> +    name = ""
> +
> +    if type(expr) is OrderedDict:
> +        if not member:
> +            e = expr.popitem(last=False)
> +            typ = e[0]
> +            name = e[1]
> +        else:
> +            typ = "anonymous-struct"
> +
> +        if typ == 'enum':
> +            for key in expr.keys():
> +                ret[key] = expr[key]
> +        else:
> +            ret = {}
> +            for key in expr.keys():
> +                ret[key], parents = extend_schema(expr[key], parents)

This is where I question whether extend by default is the right action.

> +
> +    elif type(expr) is list:
> +        typ = 'anonymous-struct'
> +        ret = []
> +        for i in expr:
> +            tmp, parents = extend_schema(i, parents)
> +            ret.append(tmp)
> +    elif type(expr) is str:
> +        name = expr
> +        if schema_dict.has_key(expr) and expr not in parents:
> +            parents.append(expr)
> +            typ = schema_dict[expr][1]
> +            recu = 'True'
> +            ret, parents = extend_schema(schema_dict[expr][0].copy(),
> +                                         parents, False)
> +            parents.remove(expr)
> +            ret['_obj_recursive'] = 'True'
> +            return ret, parents
> +        else:
> +            return expr, parents
> +
> +    return {'_obj_member': "%s" % member, '_obj_type': typ,
> +            '_obj_name': name, '_obj_recursive': recu,
> +            '_obj_data': ret}, parents
> +

What's with the leading underscore?

> +
> +exprs = parse_schema(sys.stdin)
> +schema_dict = {}
> +
> +for expr in exprs:
> +    if expr.has_key('type') or expr.has_key('enum') or expr.has_key('union'):
> +        e = expr.copy()
> +

We'll eventually need to cover event types.

> +        first = e.popitem(last=False)
> +        schema_dict[first[1]] = [expr.copy(), first[0]]
> +
> +fdecl.write('''const char *const qmp_schema_table[] = {
> +''')
> +
> +def convert(odict):
> +    d = {}
> +    for k, v in odict.items():
> +        if type(v) is OrderedDict:
> +            d[k] = convert(v)
> +        elif type(v) is list:
> +            l = []
> +            for j in v:
> +                if type(j) is OrderedDict:
> +                    l.append(convert(j))
> +                else:
> +                    l.append(j)
> +            d[k] = l
> +        else:
> +            d[k] = v
> +    return d
> +
> +count = 0
> +for expr in exprs:
> +    fdecl.write('''    /* %s */
> +''' % expr)
> +
> +    expr, parents = extend_schema(expr, [], False)
> +    fdecl.write('''    "%s",
> +
> +''' % convert(expr))
> +
> +fdecl.write('''    NULL };
> +
> +#endif
> +''')
> +
> +fdecl.flush()
> +fdecl.close()
> 

Like I said, I think my review will be more helpful by looking at the
generated file; I'll follow up in another email (probably tomorrow,
since it's now late for me) with more comments once I actually finish that.
Eric Blake Feb. 11, 2014, 12:35 a.m. UTC | #4
On 02/03/2014 05:15 PM, Eric Blake wrote:
> On 01/23/2014 07:46 AM, Amos Kong wrote:
>> This is a code generator for qapi introspection. It will parse
>> qapi-schema.json, extend schema definitions and generate a schema
>> table with metadata, it references to the new structs which we used
>> to describe dynamic data structs.  The metadata will help C code to
>> allocate right structs and provide useful information to management
>> to checking suported feature and QMP commandline detail. The schema
> 
> s/suported/supported/
> 
>> table will be saved to qapi-introspect.h.
>>

> 
> I am NOT a python expert.  But what I _can_ do is apply this patch and
> review the generated code for sanity.


>> +fdecl.write(mcgen('''
>> +/* AUTOMATICALLY GENERATED, DO NOT MODIFY */
>> +
>> +/*
>> + * Head file to store parsed information of QAPI schema

s/Head/Header/

> Like I said, I think my review will be more helpful by looking at the
> generated file; I'll follow up in another email (probably tomorrow,
> since it's now late for me) with more comments once I actually finish that.
> 

It took me longer than planned to get back to this.  But here goes my
impressions of the generated file:

-rw-rw-r--. 1 eblake eblake 667643 Feb 10 16:16 qapi-introspect.h
-rw-rw-r--. 1 eblake eblake 126261 Feb  7 17:12 qapi-schema.json
-rw-rw-r--. 1 eblake eblake  80170 Feb  7 17:12 qapi-types.h

Wow, that's a LOT of code.  Why does it take 6 times more C code than
what qapi itself represented everything in?  Are we going too far with
inlining type information?  A larger file MIGHT be okay, if that's what
it takes to make C code that is expressive of the information at hand
(after all, the whole point of our qapi is to give us some shorthand, so
that we can quickly define types in less syntax than C) - but I want to
be sure that we really need that much content.  For comparison, the
generated qapi-types.h is smaller than the input; sure, that's in part
due to the comments being stripped out of the input file, but it's
evidence that the C code representation of qapi should be about the same
size as the input file, not approaching an order of magnitude larger.

const char *const qmp_schema_table[] = {
    /* OrderedDict([('enum', 'ErrorClass'), ('data', ['GenericError',
'CommandNotFound', 'DeviceEncrypted', 'DeviceNotActive',
'DeviceNotFound', 'KVMMissingCap'])]) */
    "{'_obj_member': 'False', '_obj_type': 'enum', '_obj_name':
'ErrorClass', '_obj_data': {'data': ['GenericError', 'CommandNotFound',
'DeviceEncrypted', 'DeviceNotActive', 'DeviceNotFound',
'KVMMissingCap']}, '_obj_recursive': 'False'}",

    /* OrderedDict([('command', 'add_client'), ('data',
OrderedDict([('protocol', 'str'), ('fdname', 'str'), ('*skipauth',
'bool'), ('*tls', 'bool')]))]) */
    "{'_obj_member': 'False', '_obj_type': 'command', '_obj_name':
'add_client', '_obj_data': {'data': {'_obj_type': 'anonymous-struct',
'_obj_member': 'True', '_obj_name': '', '_obj_data': {'*skipauth':
'bool', 'protocol': 'str', 'fdname': 'str', '*tls': 'bool'},
'_obj_recursive': 'False'}}, '_obj_recursive': 'False'}",

Long lines!  Just because it's generated doesn't mean it can't have nice
line wraps.  Make your generator output some strategic whitespace, so
that a human perusing the file stands a chance of understanding it (look
at qapi-types.h for comparison).

No sorting?  This looks like you just dumped the output in hash-table
order.  Please sort the array by type and/or name, so that if we add
filtering, the C code can then do O(log n) lookup via bsearch rather
than an O(n) linear crawl (or maybe even multiple tables, one per type,
with each table sorted by name within that type).

The comments before each string entry is redundant.  Cut your file in
half by listing only what the C compiler cares about - after all the
information is supposed to be self-describing enough that if the comment
actually added anything, we failed at introspecting enough useful
information to the user.

A flat-out array of pre-compiled strings?  I guess it makes generating
the output of your command a little faster (just replay the pre-computed
strings, instead of having to stringify a QObject), but it is lousy if
you ever have to process the data differently.  I was totally expecting
an array of structs.  And not just any struct, but an array of the C
struct that gets generated from the QAPI code.  That is, since qapi is
_already_ the mechanism for generating decent C code structs, and we
want introspection to be self-describing, then your 'struct DataObject'
from qapi-types.h (as generated by patch 1/5) should already be
sufficient as the base of your array.  Or maybe we make an array of yet
one more layer of type:

typedef struct QIntrospection QIntrospection;
struct QIntrospection {
    DataObject data;
    const char *string;
};

so that the C code has access to both the qapi struct and the
pre-rendered string, and can thus get at whichever piece of information
is needed (array[i].data.name when filtering by name, array[i].string
when outputting pre-formatted text).

What I find most appalling about the generated file is that each entry
of the array is a JSON string, but that the JSON string does NOT match
the JSON that would be sent over the wire when sending a DataObject.
Thus, the only way your generated file is usable is if you run the
string through a JSON parser, peel out the portions of the string you
need, and then reconstruct things back into the QMP command.  You REALLY
want either a DataObject[] or a QIntrospection[], so that you DON'T have
to parse strings in your C code.  The python code should have already
generated the header file with everything already placed in C structs
and/or QMP wire format strings in the format that best suits your needs!

That is, I think you want something more like this (rendering the first
two lines that I quoted above in pseudo-C):

const DataObject qmp_schema_table[] = {
    { .has_name = true,
      .name = "ErrorClass",
      .kind = DATA_OBJECT_KIND_ENUMERATION,
      .enumeration = { .value = "GenericError",
             .next = { .value = "CommandNotFound",
             .next = { .value = "DeviceEncrypted",
             .next = { .value = "DeviceNotActive",
             .next = { .value = "DeviceNotFound",
             .next = { .value = "KVMMissingCap",
             .next = NULL } } } } } },
      .has_recursive = false,
    },
    { .has_name = true,
      .name = "add_client",
      .kind = DATA_OBJECT_KIND_COMMAND,
      .command = {
          .has_data = true,
          .data = { .value = { .type = { .kind =
DATA_OBJECT_MEMBER_TYPE_KIND_REFERENCE, .reference = "str" },
                               .has_name = true,
                               .name = "protocol",
                               .has_optional = false,
                               .has_recursive = false },
          .next = { .value = { .type = { .kind =
DATA_OBJECT_MEMBER_TYPE_KIND_REFERENCE, .reference = "bool" },
                               .has_name = true,
                               .name = "skipauth",
                               .has_optional = true,
                               .optional = true,
                               .has_recursive = false },
          ...
          .next = NULL }...},
          .has_returns = false,
          .has_gen = false,
        },
      .has_recursive = false,
    },

and so on.  Of course, as I typed that, I realize that you can't
actually initialize DataObjectCommand* and other object pointers via
simple {} initializers; so you actually need to be more verbose and use
some C99 typed initializers:

...
.enumeration = &(DataObjectEnumeration) { .value = "GenericError",
       .next = &(DataObjectEnumeration) { .value = "CommandNotFound",
...

Or maybe all you need is:

const QIntrospection qmp_schema_table[] = {
    { .name = "ErrorClass",
      .str = "{\"name\":\"ErrorClass\","
              "\"type\":\"enumeration\","
              "\"data\":["
                 "\"GenericError\","
                 "\"CommandNotFound\","
                 "\"DeviceEncrypted\","
                 "\"DeviceNotFound\","
                 "\"KVMMissingCap\""
              "]}"
    },
    { .name = "add_client",
      .str = "{\"name\":\"add_client\","
              "\"type\":\"command\","
              "\"data\":{"
                ...

with just the object name and pre-rendered string in each array entry.
But my point remains - let the python code generate something USEFUL,
and not something that the C code has to re-parse.  If you're going to
store a string, store it in the format that QMP will already hand over
the wire.
diff mbox

Patch

diff --git a/.gitignore b/.gitignore
index 1c9d63d..de3cb80 100644
--- a/.gitignore
+++ b/.gitignore
@@ -22,6 +22,7 @@  linux-headers/asm
 qapi-generated
 qapi-types.[ch]
 qapi-visit.[ch]
+qapi-introspect.h
 qmp-commands.h
 qmp-marshal.c
 qemu-doc.html
diff --git a/Makefile b/Makefile
index bdff4e4..1dac5e7 100644
--- a/Makefile
+++ b/Makefile
@@ -45,7 +45,7 @@  endif
 endif
 
 GENERATED_HEADERS = config-host.h qemu-options.def
-GENERATED_HEADERS += qmp-commands.h qapi-types.h qapi-visit.h
+GENERATED_HEADERS += qmp-commands.h qapi-types.h qapi-visit.h qapi-introspect.h
 GENERATED_SOURCES += qmp-marshal.c qapi-types.c qapi-visit.c
 
 GENERATED_HEADERS += trace/generated-events.h
@@ -229,6 +229,9 @@  $(SRC_PATH)/qapi-schema.json $(SRC_PATH)/scripts/qapi-visit.py $(qapi-py)
 qmp-commands.h qmp-marshal.c :\
 $(SRC_PATH)/qapi-schema.json $(SRC_PATH)/scripts/qapi-commands.py $(qapi-py)
 	$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-commands.py $(gen-out-type) -m -o "." < $<, "  GEN   $@")
+qapi-introspect.h:\
+$(SRC_PATH)/qapi-schema.json $(SRC_PATH)/scripts/qapi-introspect.py $(qapi-py)
+	$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-introspect.py $(gen-out-type) -o "." < $<, "  GEN   $@")
 
 QGALIB_GEN=$(addprefix qga/qapi-generated/, qga-qapi-types.h qga-qapi-visit.h qga-qmp-commands.h)
 $(qga-obj-y) qemu-ga.o: $(QGALIB_GEN)
diff --git a/docs/qmp-full-introspection.txt b/docs/qmp-full-introspection.txt
index d2cf7b3..8ecbc0c 100644
--- a/docs/qmp-full-introspection.txt
+++ b/docs/qmp-full-introspection.txt
@@ -42,3 +42,20 @@  types.
 
 'anonymous-struct' will be used to describe arbitrary structs
 (dictionary, list or string).
+
+== Avoid dead loop in recursive extending ==
+
+We have four types (ImageInfo, BlockStats, PciDeviceInfo, ObjectData)
+that uses themself in their own define data directly or indirectly,
+we will not repeatedly extend them to avoid dead loop.
+
+We use a 'parents List' to record the visit path, type name of each
+extended node will be saved to the List.
+
+Append type name to the list before extending, and remove type name
+from the list after extending.
+
+If the type name is already extended in parents List, we won't extend
+it repeatedly for avoiding dead loop.
+
+'recursive' indicates if the type is extended or not.
diff --git a/scripts/qapi-introspect.py b/scripts/qapi-introspect.py
new file mode 100644
index 0000000..03179fa
--- /dev/null
+++ b/scripts/qapi-introspect.py
@@ -0,0 +1,172 @@ 
+#
+# QAPI introspection info generator
+#
+# Copyright (C) 2014 Red Hat, Inc.
+#
+# Authors:
+#  Amos Kong <akong@redhat.com>
+#
+# This work is licensed under the terms of the GNU GPLv2.
+# See the COPYING.LIB file in the top-level directory.
+
+from ordereddict import OrderedDict
+from qapi import *
+import sys
+import os
+import getopt
+import errno
+
+
+try:
+    opts, args = getopt.gnu_getopt(sys.argv[1:], "hp:o:",
+                                   ["header", "prefix=", "output-dir="])
+except getopt.GetoptError, err:
+    print str(err)
+    sys.exit(1)
+
+output_dir = ""
+prefix = ""
+h_file = 'qapi-introspect.h'
+
+do_h = False
+
+for o, a in opts:
+    if o in ("-p", "--prefix"):
+        prefix = a
+    elif o in ("-o", "--output-dir"):
+        output_dir = a + "/"
+    elif o in ("-h", "--header"):
+        do_h = True
+
+h_file = output_dir + prefix + h_file
+
+try:
+    os.makedirs(output_dir)
+except os.error, e:
+    if e.errno != errno.EEXIST:
+        raise
+
+def maybe_open(really, name, opt):
+    if really:
+        return open(name, opt)
+    else:
+        import StringIO
+        return StringIO.StringIO()
+
+fdecl = maybe_open(do_h, h_file, 'w')
+
+fdecl.write(mcgen('''
+/* AUTOMATICALLY GENERATED, DO NOT MODIFY */
+
+/*
+ * Head file to store parsed information of QAPI schema
+ *
+ * Copyright (C) 2014 Red Hat, Inc.
+ *
+ * Authors:
+ *  Amos Kong <akong@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.1 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ *
+ */
+
+#ifndef %(guard)s
+#define %(guard)s
+
+''',
+                  guard=guardname(h_file)))
+
+def extend_schema(expr, parents=[], member=True):
+    ret = {}
+    recu = 'False'
+    name = ""
+
+    if type(expr) is OrderedDict:
+        if not member:
+            e = expr.popitem(last=False)
+            typ = e[0]
+            name = e[1]
+        else:
+            typ = "anonymous-struct"
+
+        if typ == 'enum':
+            for key in expr.keys():
+                ret[key] = expr[key]
+        else:
+            ret = {}
+            for key in expr.keys():
+                ret[key], parents = extend_schema(expr[key], parents)
+
+    elif type(expr) is list:
+        typ = 'anonymous-struct'
+        ret = []
+        for i in expr:
+            tmp, parents = extend_schema(i, parents)
+            ret.append(tmp)
+    elif type(expr) is str:
+        name = expr
+        if schema_dict.has_key(expr) and expr not in parents:
+            parents.append(expr)
+            typ = schema_dict[expr][1]
+            recu = 'True'
+            ret, parents = extend_schema(schema_dict[expr][0].copy(),
+                                         parents, False)
+            parents.remove(expr)
+            ret['_obj_recursive'] = 'True'
+            return ret, parents
+        else:
+            return expr, parents
+
+    return {'_obj_member': "%s" % member, '_obj_type': typ,
+            '_obj_name': name, '_obj_recursive': recu,
+            '_obj_data': ret}, parents
+
+
+exprs = parse_schema(sys.stdin)
+schema_dict = {}
+
+for expr in exprs:
+    if expr.has_key('type') or expr.has_key('enum') or expr.has_key('union'):
+        e = expr.copy()
+
+        first = e.popitem(last=False)
+        schema_dict[first[1]] = [expr.copy(), first[0]]
+
+fdecl.write('''const char *const qmp_schema_table[] = {
+''')
+
+def convert(odict):
+    d = {}
+    for k, v in odict.items():
+        if type(v) is OrderedDict:
+            d[k] = convert(v)
+        elif type(v) is list:
+            l = []
+            for j in v:
+                if type(j) is OrderedDict:
+                    l.append(convert(j))
+                else:
+                    l.append(j)
+            d[k] = l
+        else:
+            d[k] = v
+    return d
+
+count = 0
+for expr in exprs:
+    fdecl.write('''    /* %s */
+''' % expr)
+
+    expr, parents = extend_schema(expr, [], False)
+    fdecl.write('''    "%s",
+
+''' % convert(expr))
+
+fdecl.write('''    NULL };
+
+#endif
+''')
+
+fdecl.flush()
+fdecl.close()