diff mbox

[v5,06/28] qapi: Add some union tests

Message ID 1427227433-5030-7-git-send-email-eblake@redhat.com
State New
Headers show

Commit Message

Eric Blake March 24, 2015, 8:03 p.m. UTC
Demonstrate that the qapi generator doesn't deal well with unions
that aren't up to par. Later patches will update the expected
reseults as the generator is made stricter.

Of particular note, we currently allow 'base' without 'discriminator'
as a way to create a simple union with a base class.  However, none
of the existing QMP or QGA interfaces use it (it is exercised only
in the testsuite).  Meanwhile, it would be nice to allow
'discriminator':'EnumType' without 'base' for creating a simple union
that is type-safe rather than open-coded to a generated enum (right
now, we are only type-safe when using a flat union, but that uses
a different syntax of 'discriminator':'member-name' which requires
a base class containing a 'member-name' enum field).  If both 'base'
and 'discriminator' are optional, then converting a simple union
with base class to a type-safe simple union with an enum discriminator
would not be possible.  So my plan is to get rid of 'base' without
'discriminator' later in the series; this will be no real loss, as any
union that needs additional common fields can always use a flat
union.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 tests/Makefile                                           | 12 +++++++++---
 tests/qapi-schema/alternate-array.err                    |  0
 tests/qapi-schema/alternate-array.exit                   |  1 +
 tests/qapi-schema/alternate-array.json                   |  8 ++++++++
 tests/qapi-schema/alternate-array.out                    |  4 ++++
 tests/qapi-schema/alternate-base.err                     |  0
 tests/qapi-schema/alternate-base.exit                    |  1 +
 tests/qapi-schema/alternate-base.json                    |  7 +++++++
 tests/qapi-schema/alternate-base.out                     |  4 ++++
 tests/qapi-schema/alternate-clash.err                    |  0
 tests/qapi-schema/alternate-clash.exit                   |  1 +
 tests/qapi-schema/alternate-clash.json                   |  4 ++++
 tests/qapi-schema/alternate-clash.out                    |  3 +++
 tests/qapi-schema/alternate-conflict-dict.err            |  0
 tests/qapi-schema/alternate-conflict-dict.exit           |  1 +
 tests/qapi-schema/alternate-conflict-dict.json           |  9 +++++++++
 tests/qapi-schema/alternate-conflict-dict.out            |  6 ++++++
 tests/qapi-schema/alternate-conflict-string.err          |  0
 tests/qapi-schema/alternate-conflict-string.exit         |  1 +
 tests/qapi-schema/alternate-conflict-string.json         |  8 ++++++++
 tests/qapi-schema/alternate-conflict-string.out          |  5 +++++
 tests/qapi-schema/alternate-good.err                     |  0
 tests/qapi-schema/alternate-good.exit                    |  1 +
 tests/qapi-schema/alternate-good.json                    | 10 ++++++++++
 tests/qapi-schema/alternate-good.out                     |  6 ++++++
 tests/qapi-schema/alternate-nested.err                   |  0
 tests/qapi-schema/alternate-nested.exit                  |  1 +
 tests/qapi-schema/alternate-nested.json                  |  7 +++++++
 tests/qapi-schema/alternate-nested.out                   |  5 +++++
 tests/qapi-schema/alternate-unknown.err                  |  0
 tests/qapi-schema/alternate-unknown.exit                 |  1 +
 tests/qapi-schema/alternate-unknown.json                 |  4 ++++
 tests/qapi-schema/alternate-unknown.out                  |  3 +++
 tests/qapi-schema/flat-union-bad-base.err                |  1 +
 tests/qapi-schema/flat-union-bad-base.exit               |  1 +
 tests/qapi-schema/flat-union-bad-base.json               | 13 +++++++++++++
 tests/qapi-schema/flat-union-bad-base.out                |  0
 tests/qapi-schema/flat-union-bad-discriminator.err       |  0
 tests/qapi-schema/flat-union-bad-discriminator.exit      |  1 +
 tests/qapi-schema/flat-union-bad-discriminator.json      | 14 ++++++++++++++
 tests/qapi-schema/flat-union-bad-discriminator.out       | 10 ++++++++++
 tests/qapi-schema/flat-union-base-union.err              |  1 +
 tests/qapi-schema/flat-union-base-union.exit             |  1 +
 tests/qapi-schema/flat-union-base-union.json             | 15 +++++++++++++++
 tests/qapi-schema/flat-union-base-union.out              |  0
 tests/qapi-schema/flat-union-no-base.err                 |  2 +-
 tests/qapi-schema/flat-union-no-base.json                |  7 ++++---
 tests/qapi-schema/flat-union-optional-discriminator.err  |  0
 tests/qapi-schema/flat-union-optional-discriminator.exit |  1 +
 tests/qapi-schema/flat-union-optional-discriminator.json |  9 +++++++++
 tests/qapi-schema/flat-union-optional-discriminator.out  |  5 +++++
 tests/qapi-schema/union-bad-branch.err                   |  0
 tests/qapi-schema/union-bad-branch.exit                  |  1 +
 tests/qapi-schema/union-bad-branch.json                  |  8 ++++++++
 tests/qapi-schema/union-bad-branch.out                   |  6 ++++++
 tests/qapi-schema/union-base-no-discriminator.err        |  0
 tests/qapi-schema/union-base-no-discriminator.exit       |  1 +
 tests/qapi-schema/union-base-no-discriminator.json       | 14 ++++++++++++++
 tests/qapi-schema/union-base-no-discriminator.out        |  8 ++++++++
 tests/qapi-schema/union-invalid-base.err                 |  2 +-
 tests/qapi-schema/union-invalid-base.json                |  4 +++-
 tests/qapi-schema/union-max.err                          |  0
 tests/qapi-schema/union-max.exit                         |  1 +
 tests/qapi-schema/union-max.json                         |  3 +++
 tests/qapi-schema/union-max.out                          |  3 +++
 tests/qapi-schema/union-optional-branch.err              |  0
 tests/qapi-schema/union-optional-branch.exit             |  1 +
 tests/qapi-schema/union-optional-branch.json             |  2 ++
 tests/qapi-schema/union-optional-branch.out              |  3 +++
 tests/qapi-schema/union-unknown.err                      |  0
 tests/qapi-schema/union-unknown.exit                     |  1 +
 tests/qapi-schema/union-unknown.json                     |  3 +++
 tests/qapi-schema/union-unknown.out                      |  3 +++
 73 files changed, 249 insertions(+), 9 deletions(-)
 create mode 100644 tests/qapi-schema/alternate-array.err
 create mode 100644 tests/qapi-schema/alternate-array.exit
 create mode 100644 tests/qapi-schema/alternate-array.json
 create mode 100644 tests/qapi-schema/alternate-array.out
 create mode 100644 tests/qapi-schema/alternate-base.err
 create mode 100644 tests/qapi-schema/alternate-base.exit
 create mode 100644 tests/qapi-schema/alternate-base.json
 create mode 100644 tests/qapi-schema/alternate-base.out
 create mode 100644 tests/qapi-schema/alternate-clash.err
 create mode 100644 tests/qapi-schema/alternate-clash.exit
 create mode 100644 tests/qapi-schema/alternate-clash.json
 create mode 100644 tests/qapi-schema/alternate-clash.out
 create mode 100644 tests/qapi-schema/alternate-conflict-dict.err
 create mode 100644 tests/qapi-schema/alternate-conflict-dict.exit
 create mode 100644 tests/qapi-schema/alternate-conflict-dict.json
 create mode 100644 tests/qapi-schema/alternate-conflict-dict.out
 create mode 100644 tests/qapi-schema/alternate-conflict-string.err
 create mode 100644 tests/qapi-schema/alternate-conflict-string.exit
 create mode 100644 tests/qapi-schema/alternate-conflict-string.json
 create mode 100644 tests/qapi-schema/alternate-conflict-string.out
 create mode 100644 tests/qapi-schema/alternate-good.err
 create mode 100644 tests/qapi-schema/alternate-good.exit
 create mode 100644 tests/qapi-schema/alternate-good.json
 create mode 100644 tests/qapi-schema/alternate-good.out
 create mode 100644 tests/qapi-schema/alternate-nested.err
 create mode 100644 tests/qapi-schema/alternate-nested.exit
 create mode 100644 tests/qapi-schema/alternate-nested.json
 create mode 100644 tests/qapi-schema/alternate-nested.out
 create mode 100644 tests/qapi-schema/alternate-unknown.err
 create mode 100644 tests/qapi-schema/alternate-unknown.exit
 create mode 100644 tests/qapi-schema/alternate-unknown.json
 create mode 100644 tests/qapi-schema/alternate-unknown.out
 create mode 100644 tests/qapi-schema/flat-union-bad-base.err
 create mode 100644 tests/qapi-schema/flat-union-bad-base.exit
 create mode 100644 tests/qapi-schema/flat-union-bad-base.json
 create mode 100644 tests/qapi-schema/flat-union-bad-base.out
 create mode 100644 tests/qapi-schema/flat-union-bad-discriminator.err
 create mode 100644 tests/qapi-schema/flat-union-bad-discriminator.exit
 create mode 100644 tests/qapi-schema/flat-union-bad-discriminator.json
 create mode 100644 tests/qapi-schema/flat-union-bad-discriminator.out
 create mode 100644 tests/qapi-schema/flat-union-base-union.err
 create mode 100644 tests/qapi-schema/flat-union-base-union.exit
 create mode 100644 tests/qapi-schema/flat-union-base-union.json
 create mode 100644 tests/qapi-schema/flat-union-base-union.out
 create mode 100644 tests/qapi-schema/flat-union-optional-discriminator.err
 create mode 100644 tests/qapi-schema/flat-union-optional-discriminator.exit
 create mode 100644 tests/qapi-schema/flat-union-optional-discriminator.json
 create mode 100644 tests/qapi-schema/flat-union-optional-discriminator.out
 create mode 100644 tests/qapi-schema/union-bad-branch.err
 create mode 100644 tests/qapi-schema/union-bad-branch.exit
 create mode 100644 tests/qapi-schema/union-bad-branch.json
 create mode 100644 tests/qapi-schema/union-bad-branch.out
 create mode 100644 tests/qapi-schema/union-base-no-discriminator.err
 create mode 100644 tests/qapi-schema/union-base-no-discriminator.exit
 create mode 100644 tests/qapi-schema/union-base-no-discriminator.json
 create mode 100644 tests/qapi-schema/union-base-no-discriminator.out
 create mode 100644 tests/qapi-schema/union-max.err
 create mode 100644 tests/qapi-schema/union-max.exit
 create mode 100644 tests/qapi-schema/union-max.json
 create mode 100644 tests/qapi-schema/union-max.out
 create mode 100644 tests/qapi-schema/union-optional-branch.err
 create mode 100644 tests/qapi-schema/union-optional-branch.exit
 create mode 100644 tests/qapi-schema/union-optional-branch.json
 create mode 100644 tests/qapi-schema/union-optional-branch.out
 create mode 100644 tests/qapi-schema/union-unknown.err
 create mode 100644 tests/qapi-schema/union-unknown.exit
 create mode 100644 tests/qapi-schema/union-unknown.json
 create mode 100644 tests/qapi-schema/union-unknown.out

Comments

Markus Armbruster March 26, 2015, 1:18 p.m. UTC | #1
Eric Blake <eblake@redhat.com> writes:

> Demonstrate that the qapi generator doesn't deal well with unions
> that aren't up to par. Later patches will update the expected
> reseults as the generator is made stricter.
>
> Of particular note, we currently allow 'base' without 'discriminator'
> as a way to create a simple union with a base class.  However, none
> of the existing QMP or QGA interfaces use it (it is exercised only
> in the testsuite).

qapi-code-gen.txt documents 'base' only for flat unions.  We should
either document its use in simple unions, or outlaw it.

>                     Meanwhile, it would be nice to allow
> 'discriminator':'EnumType' without 'base' for creating a simple union
> that is type-safe rather than open-coded to a generated enum (right
> now, we are only type-safe when using a flat union, but that uses
> a different syntax of 'discriminator':'member-name' which requires
> a base class containing a 'member-name' enum field).

I'm afraid I don't get you.  Can you give examples?

>                                                       If both 'base'
> and 'discriminator' are optional, then converting a simple union
> with base class to a type-safe simple union with an enum discriminator
> would not be possible.  So my plan is to get rid of 'base' without
> 'discriminator' later in the series;

Aha: you're going to outlaw 'base' in simple unions.  Yes, please.

>                                      this will be no real loss, as any
> union that needs additional common fields can always use a flat
> union.

The mathematical concept behind unions is the sum type.

We have three implementations, and we call them simple, flat, anonymous.

Anonymous unions are implicitly tagged.  They come with the obvious
restrictions required to make implicit tagging work.

The other two are explicitly tagged.  The difference between them is
just notation.  I like my unions flat, because for me the extra wrapping
is just notational overhead.

In particular, I can define simple unions in terms of flat ones by
restricting all union cases to a single member named 'data'.  They're
not implemented that way, but that's a historical accident.  Simple
unions are a redundant concept.

This proves your "can always use a flat union" proposition.  The only
way they can earn their keep is making the schema easier to read.  I
haven't thought about that.

Another historical accident is how we express members common to all
union cases: base type.  Probably just because complex types already had
them.  Are you planning to change anything there?

[...]
> diff --git a/tests/qapi-schema/alternate-nested.err b/tests/qapi-schema/alternate-nested.err
> new file mode 100644
> index 0000000..e69de29
> diff --git a/tests/qapi-schema/alternate-nested.exit b/tests/qapi-schema/alternate-nested.exit
> new file mode 100644
> index 0000000..573541a
> --- /dev/null
> +++ b/tests/qapi-schema/alternate-nested.exit
> @@ -0,0 +1 @@
> +0
> diff --git a/tests/qapi-schema/alternate-nested.json b/tests/qapi-schema/alternate-nested.json
> new file mode 100644
> index 0000000..d5812bf
> --- /dev/null
> +++ b/tests/qapi-schema/alternate-nested.json
> @@ -0,0 +1,7 @@
> +# FIXME: we should reject a nested anonymous union branch

Same reason we want to reject nested / anonymous complex types
elsewhere?  Or is there a deeper reason?

> +{ 'union': 'Union1',
> +  'discriminator': {},
> +  'data': { 'name': 'str', 'value': 'int' } }
> +{ 'union': 'Union2',
> +  'discriminator': {},
> +  'data': { 'nested': 'Union1' } }
> diff --git a/tests/qapi-schema/alternate-nested.out b/tests/qapi-schema/alternate-nested.out
> new file mode 100644
> index 0000000..0137c1f
> --- /dev/null
> +++ b/tests/qapi-schema/alternate-nested.out
> @@ -0,0 +1,5 @@
> +[OrderedDict([('union', 'Union1'), ('discriminator', OrderedDict()), ('data', OrderedDict([('name', 'str'), ('value', 'int')]))]),
> + OrderedDict([('union', 'Union2'), ('discriminator', OrderedDict()), ('data', OrderedDict([('nested', 'Union1')]))])]
> +[{'enum_name': 'Union1Kind', 'enum_values': None},
> + {'enum_name': 'Union2Kind', 'enum_values': None}]
> +[]
[...]
> diff --git a/tests/qapi-schema/flat-union-bad-base.err b/tests/qapi-schema/flat-union-bad-base.err
> new file mode 100644
> index 0000000..5962ff4
> --- /dev/null
> +++ b/tests/qapi-schema/flat-union-bad-base.err
> @@ -0,0 +1 @@
> +tests/qapi-schema/flat-union-bad-base.json:9: Base 'OrderedDict([('enum1', 'TestEnum'), ('kind', 'str')])' is not a valid type
> diff --git a/tests/qapi-schema/flat-union-bad-base.exit b/tests/qapi-schema/flat-union-bad-base.exit
> new file mode 100644
> index 0000000..d00491f
> --- /dev/null
> +++ b/tests/qapi-schema/flat-union-bad-base.exit
> @@ -0,0 +1 @@
> +1
> diff --git a/tests/qapi-schema/flat-union-bad-base.json b/tests/qapi-schema/flat-union-bad-base.json
> new file mode 100644
> index 0000000..d69168f
> --- /dev/null
> +++ b/tests/qapi-schema/flat-union-bad-base.json
> @@ -0,0 +1,13 @@
> +# we require the base to be an existing complex type
> +# FIXME: should we allow an anonymous inline base type?

What do you mean by "anonymous inline base type"?

> +{ 'enum': 'TestEnum',
> +  'data': [ 'value1', 'value2' ] }
> +{ 'type': 'TestTypeA',
> +  'data': { 'string': 'str' } }
> +{ 'type': 'TestTypeB',
> +  'data': { 'integer': 'int' } }
> +{ 'union': 'TestUnion',
> +  'base': { 'enum1': 'TestEnum', 'kind': 'str' },
> +  'discriminator': 'TestEnum',
> +  'data': { 'kind1': 'TestTypeA',
> +            'kind2': 'TestTypeB' } }
> diff --git a/tests/qapi-schema/flat-union-bad-base.out b/tests/qapi-schema/flat-union-bad-base.out
> new file mode 100644
> index 0000000..e69de29
[...]

"Doesn't deal well" is an understatement :)

Since all my questions are about your intentions and rationale:

Reviewed-by: Markus Armbruster <armbru@redhat.com>
Markus Armbruster March 26, 2015, 1:23 p.m. UTC | #2
One more:

[...]
> diff --git a/tests/qapi-schema/alternate-conflict-string.json b/tests/qapi-schema/alternate-conflict-string.json
> new file mode 100644
> index 0000000..5fd1a47
> --- /dev/null
> +++ b/tests/qapi-schema/alternate-conflict-string.json
> @@ -0,0 +1,8 @@
> +# FIXME: we should reject anonymous unions with multiple string-like branches
> +{ 'enum': 'Enum',
> +  'data': [ 'hello', 'world' ] }
> +{ 'union': 'MyUnion',
> +  'discriminator': {},
> +  'data': { 'one': 'str',
> +	    'two': 'Enum' } }
> +

/home/armbru/work/qemu/.git/rebase-apply/patch:325: new blank line at EOF.

[...]
Eric Blake March 26, 2015, 1:51 p.m. UTC | #3
On 03/26/2015 07:23 AM, Markus Armbruster wrote:
> One more:
> 
> [...]
>> diff --git a/tests/qapi-schema/alternate-conflict-string.json b/tests/qapi-schema/alternate-conflict-string.json
>> new file mode 100644
>> index 0000000..5fd1a47
>> --- /dev/null
>> +++ b/tests/qapi-schema/alternate-conflict-string.json
>> @@ -0,0 +1,8 @@
>> +# FIXME: we should reject anonymous unions with multiple string-like branches
>> +{ 'enum': 'Enum',
>> +  'data': [ 'hello', 'world' ] }
>> +{ 'union': 'MyUnion',
>> +  'discriminator': {},
>> +  'data': { 'one': 'str',
>> +	    'two': 'Enum' } }
>> +
> 
> /home/armbru/work/qemu/.git/rebase-apply/patch:325: new blank line at EOF.

Huh. I thought I had git set up to reject me from making commits like
that locally, but obviously not.
http://wiki.qemu.org/Contribute/SubmitAPatch should probably mention the
magic one-time setup to use to turn this type of checking on...
Eric Blake March 26, 2015, 3:04 p.m. UTC | #4
On 03/26/2015 07:18 AM, Markus Armbruster wrote:
> Eric Blake <eblake@redhat.com> writes:
> 
>> Demonstrate that the qapi generator doesn't deal well with unions
>> that aren't up to par. Later patches will update the expected
>> reseults as the generator is made stricter.
>>
>> Of particular note, we currently allow 'base' without 'discriminator'
>> as a way to create a simple union with a base class.  However, none
>> of the existing QMP or QGA interfaces use it (it is exercised only
>> in the testsuite).
> 
> qapi-code-gen.txt documents 'base' only for flat unions.  We should
> either document its use in simple unions, or outlaw it.

I went with outlaw later in the series, and the rest of my commit
message was an attempt to explain my reasoning in that choice.  But I
can certainly try to improve the wording, if we need a respin.

> 
>>                     Meanwhile, it would be nice to allow
>> 'discriminator':'EnumType' without 'base' for creating a simple union
>> that is type-safe rather than open-coded to a generated enum (right
>> now, we are only type-safe when using a flat union, but that uses
>> a different syntax of 'discriminator':'member-name' which requires
>> a base class containing a 'member-name' enum field).
> 
> I'm afraid I don't get you.  Can you give examples?

Using this common code with the appropriate union for each example:
{ 'command':'foo', 'data':UNION }

Right now, we have flat unions which are required to be type-safe (all
branches MUST map back to the enum type of the discriminator, enforced
by the generator, so that if the enum later adds a member, the union
must also be updated to match):

[1]
{ 'union':'Safe', 'base':'Base', 'discriminator':'switch',
  'data':{ 'one':'str', 'two':'int' }}

{"execute":"foo", "arguments":{'switch':'one', 'data':'hello'}}

and simple unions which cannot be typesafe (the branches of the union
are open-coded - even if they correlate to an existing enum, there is
nothing enforcing that extensions to the enum be reflected into the union):

[2]
{ 'union':'SimpleButOpenCoded',
  'data':{ 'one': 'str', 'two':'int' }}

{"execute":"foo", "arguments":{'type':'one', 'data':'hello'}}

I'm hoping to add as a followup series a variant of simple unions that
is type-safe:

[3]
{ 'union':'SimpleAndSafe', 'discriminator':'MyEnum',
  'data':{ 'one':'str', 'two':'int' }}

{"execute":"foo", "arguments":{'type':'one', 'data':'hello'}}


But the existing, unused-except-in-testsuite, notion of a simple union
with a base class looks like:

[4]
{ 'type':'Shared', 'data':{'common':'int'}}
{ 'union':'SimpleWithBase', 'base':'Shared',
  'data':{ 'one':'str', 'two':'int' }}

{"execute":"foo", "arguments":{'common':1, 'type':'one', 'data':'hello'}}

If we were to allow the addition of 'discriminator':'EnumType' to a
simple union [3], but then add that discriminator to an existing case of
a simple union with base [4], it would look like:

{ 'type':'Shared', 'data':{'common':'int'}}
{ 'union':'SimpleWithBaseAndDiscriminator', 'base':'Shared',
  'discriminator':'MyEnum',
  'data':{ 'one':'str', 'two':'int' }}

Yuck.  That is indistinguishable from flat unions [1], except by whether
discriminator names an enum type or a member of the base class.

> 
>>                                                       If both 'base'
>> and 'discriminator' are optional, then converting a simple union
>> with base class to a type-safe simple union with an enum discriminator
>> would not be possible.  So my plan is to get rid of 'base' without
>> 'discriminator' later in the series;
> 
> Aha: you're going to outlaw 'base' in simple unions.  Yes, please.

Okay, you came to my desired conclusion; it's just that my wording in
the middle didn't help.

> 
>>                                      this will be no real loss, as any
>> union that needs additional common fields can always use a flat
>> union.
> 
> The mathematical concept behind unions is the sum type.
> 
> We have three implementations, and we call them simple, flat, anonymous.
> 
> Anonymous unions are implicitly tagged.  They come with the obvious
> restrictions required to make implicit tagging work.

and get renamed to 'alternate' later in the series, so they are not
worth worrying about here.

> 
> The other two are explicitly tagged.  The difference between them is
> just notation.  I like my unions flat, because for me the extra wrapping
> is just notational overhead.
> 
> In particular, I can define simple unions in terms of flat ones by
> restricting all union cases to a single member named 'data'.  They're
> not implemented that way, but that's a historical accident.  Simple
> unions are a redundant concept.

Cool.  Or more concretely,

{ 'union': 'Simple', 'data': { 'one': 'str', 'two': 'int' } }

is identical on the wire to:

{ 'enum': 'MyEnum', 'data': ['one', 'two'] }
{ 'type': 'Base', 'data': { 'type': 'MyEnum' } }
{ 'type': 'Branch1', 'data': { 'data': 'str' } }
{ 'type': 'Branch2', 'data': { 'data': 'int' } }
{ 'union': 'Flat', 'base': 'Base', 'discriminator': 'type',
  'data': { 'one': 'Branch1', 'two': 'Branch2' } }

Probably worth mentioning in my commit message.

Hmm - that makes me wonder - do we support non-dict branches in a flat
union?  The usual use case of a flat union is that all dictionary keys
in the branch's dictionary are treated as keys at the top level of the
resulting overall union; but a non-dictionary branch has no keys to
flatten into the top level.  I may need to tweak a subsequent patch to
ensure that flat union branches can only use dictionaries (while simple
unions can use any type).

> 
> This proves your "can always use a flat union" proposition.  The only
> way they can earn their keep is making the schema easier to read.  I
> haven't thought about that.
> 
> Another historical accident is how we express members common to all
> union cases: base type.  Probably just because complex types already had
> them.  Are you planning to change anything there?

Maybe, per your own review.  More at point [5] below...

> 
> [...]
>> diff --git a/tests/qapi-schema/alternate-nested.err b/tests/qapi-schema/alternate-nested.err
>> new file mode 100644
>> index 0000000..e69de29
>> diff --git a/tests/qapi-schema/alternate-nested.exit b/tests/qapi-schema/alternate-nested.exit
>> new file mode 100644
>> index 0000000..573541a
>> --- /dev/null
>> +++ b/tests/qapi-schema/alternate-nested.exit
>> @@ -0,0 +1 @@
>> +0
>> diff --git a/tests/qapi-schema/alternate-nested.json b/tests/qapi-schema/alternate-nested.json
>> new file mode 100644
>> index 0000000..d5812bf
>> --- /dev/null
>> +++ b/tests/qapi-schema/alternate-nested.json
>> @@ -0,0 +1,7 @@
>> +# FIXME: we should reject a nested anonymous union branch
> 
> Same reason we want to reject nested / anonymous complex types
> elsewhere?  Or is there a deeper reason?

Similar to the reason as we want to reject {'command':'foo',
'data':'alternate-type'} for allowing non-dictionaries - we aren't
currently using it, and it's easier to reject than to worry about making
corner cases of the generator work on something we won't use.

Also, if I have an alternate A that chooses between string and dict,
then want to create an alternate  B that chooses between int and
alternate A, will that even generate correct code?  An alternate
represents multiple QObject types at once, so determining the QObject
type of the next token while parsing the code for union B would have to
worry about BOTH cases of nested union A.

So the FIXME is correct, and later in the series, I nuke this as
unsupported.


>> +++ b/tests/qapi-schema/flat-union-bad-base.json
>> @@ -0,0 +1,13 @@
>> +# we require the base to be an existing complex type
>> +# FIXME: should we allow an anonymous inline base type?
> 
> What do you mean by "anonymous inline base type"?

[5] basically, that the following example could be legal shorthand...

> 
>> +{ 'enum': 'TestEnum',
>> +  'data': [ 'value1', 'value2' ] }
>> +{ 'type': 'TestTypeA',
>> +  'data': { 'string': 'str' } }
>> +{ 'type': 'TestTypeB',
>> +  'data': { 'integer': 'int' } }
>> +{ 'union': 'TestUnion',
>> +  'base': { 'enum1': 'TestEnum', 'kind': 'str' },
>> +  'discriminator': 'TestEnum',
>> +  'data': { 'kind1': 'TestTypeA',
>> +            'kind2': 'TestTypeB' } }

where the { 'enum1':'TestEnum', 'kind':'str' } anonymous type defined at
'base' should be usable instead of having to name the type with a more
verbose:

{ 'type': 'Base', 'data': {'enum1':'TestEnum', 'kind':'str' }}
{ 'union': 'TestUnion', 'base':'Base', ... }

or in other words, that unions behave more like 'command' allowing
'data':{dict} as an anonymous type in place of 'data':'name'.

>> diff --git a/tests/qapi-schema/flat-union-bad-base.out b/tests/qapi-schema/flat-union-bad-base.out
>> new file mode 100644
>> index 0000000..e69de29
> [...]
> 
> "Doesn't deal well" is an understatement :)
> 
> Since all my questions are about your intentions and rationale:
> 
> Reviewed-by: Markus Armbruster <armbru@redhat.com>
> 
>
Markus Armbruster March 26, 2015, 3:58 p.m. UTC | #5
Eric Blake <eblake@redhat.com> writes:

> On 03/26/2015 07:23 AM, Markus Armbruster wrote:
>> One more:
>> 
>> [...]
>>> diff --git a/tests/qapi-schema/alternate-conflict-string.json
>>> b/tests/qapi-schema/alternate-conflict-string.json
>>> new file mode 100644
>>> index 0000000..5fd1a47
>>> --- /dev/null
>>> +++ b/tests/qapi-schema/alternate-conflict-string.json
>>> @@ -0,0 +1,8 @@
>>> +# FIXME: we should reject anonymous unions with multiple
>>> string-like branches
>>> +{ 'enum': 'Enum',
>>> +  'data': [ 'hello', 'world' ] }
>>> +{ 'union': 'MyUnion',
>>> +  'discriminator': {},
>>> +  'data': { 'one': 'str',
>>> +	    'two': 'Enum' } }
>>> +
>> 
>> /home/armbru/work/qemu/.git/rebase-apply/patch:325: new blank line at EOF.
>
> Huh. I thought I had git set up to reject me from making commits like
> that locally, but obviously not.

There's another one in PATCH 13:

/home/armbru/work/qemu/.git/rebase-apply/patch:156: new blank line at EOF.

> http://wiki.qemu.org/Contribute/SubmitAPatch should probably mention the
> magic one-time setup to use to turn this type of checking on...

Feel free to add it :)
Markus Armbruster March 27, 2015, 12:30 p.m. UTC | #6
Eric Blake <eblake@redhat.com> writes:

> On 03/26/2015 07:18 AM, Markus Armbruster wrote:
>> Eric Blake <eblake@redhat.com> writes:
>> 
>>> Demonstrate that the qapi generator doesn't deal well with unions
>>> that aren't up to par. Later patches will update the expected
>>> reseults as the generator is made stricter.
>>>
>>> Of particular note, we currently allow 'base' without 'discriminator'
>>> as a way to create a simple union with a base class.  However, none
>>> of the existing QMP or QGA interfaces use it (it is exercised only
>>> in the testsuite).
>> 
>> qapi-code-gen.txt documents 'base' only for flat unions.  We should
>> either document its use in simple unions, or outlaw it.
>
> I went with outlaw later in the series, and the rest of my commit
> message was an attempt to explain my reasoning in that choice.  But I
> can certainly try to improve the wording, if we need a respin.

If you respin, I suggest to add that it's undocumented.

>> 
>>>                     Meanwhile, it would be nice to allow
>>> 'discriminator':'EnumType' without 'base' for creating a simple union
>>> that is type-safe rather than open-coded to a generated enum (right
>>> now, we are only type-safe when using a flat union, but that uses
>>> a different syntax of 'discriminator':'member-name' which requires
>>> a base class containing a 'member-name' enum field).
>> 
>> I'm afraid I don't get you.  Can you give examples?
>
> Using this common code with the appropriate union for each example:
> { 'command':'foo', 'data':UNION }
>
> Right now, we have flat unions which are required to be type-safe (all
> branches MUST map back to the enum type of the discriminator, enforced
> by the generator, so that if the enum later adds a member, the union
> must also be updated to match):
>
> [1]
> { 'union':'Safe', 'base':'Base', 'discriminator':'switch',
>   'data':{ 'one':'str', 'two':'int' }}
>
> {"execute":"foo", "arguments":{'switch':'one', 'data':'hello'}}

As far as I can tell, the generator rejects flat union 'data' members
that aren't discriminator values.  It doesn't require the union to cover
all discriminator members.  Makes sense to me, because the union may not
have additional data for some discriminator values.

Your claim "so that if the enum later adds a member, the union must also
be updated to match" appears to be wrong.

> and simple unions which cannot be typesafe (the branches of the union
> are open-coded - even if they correlate to an existing enum, there is
> nothing enforcing that extensions to the enum be reflected into the union):
>
> [2]
> { 'union':'SimpleButOpenCoded',
>   'data':{ 'one': 'str', 'two':'int' }}
>
> {"execute":"foo", "arguments":{'type':'one', 'data':'hello'}}

Ah, now I get what you mean!  We have a user-defined enum and the simple
union's implicit enum, and no way to connect the two.  In C, we get two
distinct enum types.

Perhaps you can clarify the commit message a bit.  Or maybe just drop
the part that confused me.  Okay since the relevant FIXME doesn't pass
judgement:

# FIXME: either allow base in non-flat unions, or diagnose missing discriminator

> I'm hoping to add as a followup series a variant of simple unions that
> is type-safe:
>
> [3]
> { 'union':'SimpleAndSafe', 'discriminator':'MyEnum',
>   'data':{ 'one':'str', 'two':'int' }}
>
> {"execute":"foo", "arguments":{'type':'one', 'data':'hello'}}
>
> But the existing, unused-except-in-testsuite, notion of a simple union
> with a base class looks like:
>
> [4]
> { 'type':'Shared', 'data':{'common':'int'}}
> { 'union':'SimpleWithBase', 'base':'Shared',
>   'data':{ 'one':'str', 'two':'int' }}
>
> {"execute":"foo", "arguments":{'common':1, 'type':'one', 'data':'hello'}}
>
> If we were to allow the addition of 'discriminator':'EnumType' to a
> simple union [3], but then add that discriminator to an existing case of
> a simple union with base [4], it would look like:
>
> { 'type':'Shared', 'data':{'common':'int'}}
> { 'union':'SimpleWithBaseAndDiscriminator', 'base':'Shared',
>   'discriminator':'MyEnum',
>   'data':{ 'one':'str', 'two':'int' }}
>
> Yuck.  That is indistinguishable from flat unions [1], except by whether
> discriminator names an enum type or a member of the base class.

Too subtle.  The difference should be syntactic, so it's immediately
visible.

>>>                                                       If both 'base'
>>> and 'discriminator' are optional, then converting a simple union
>>> with base class to a type-safe simple union with an enum discriminator
>>> would not be possible.  So my plan is to get rid of 'base' without
>>> 'discriminator' later in the series;
>> 
>> Aha: you're going to outlaw 'base' in simple unions.  Yes, please.
>
> Okay, you came to my desired conclusion; it's just that my wording in
> the middle didn't help.
>
>> 
>>>                                      this will be no real loss, as any
>>> union that needs additional common fields can always use a flat
>>> union.
>> 
>> The mathematical concept behind unions is the sum type.
>> 
>> We have three implementations, and we call them simple, flat, anonymous.
>> 
>> Anonymous unions are implicitly tagged.  They come with the obvious
>> restrictions required to make implicit tagging work.
>
> and get renamed to 'alternate' later in the series, so they are not
> worth worrying about here.

Yes.

>> The other two are explicitly tagged.  The difference between them is
>> just notation.  I like my unions flat, because for me the extra wrapping
>> is just notational overhead.
>> 
>> In particular, I can define simple unions in terms of flat ones by
>> restricting all union cases to a single member named 'data'.  They're
>> not implemented that way, but that's a historical accident.  Simple
>> unions are a redundant concept.
>
> Cool.  Or more concretely,
>
> { 'union': 'Simple', 'data': { 'one': 'str', 'two': 'int' } }
>
> is identical on the wire to:
>
> { 'enum': 'MyEnum', 'data': ['one', 'two'] }
> { 'type': 'Base', 'data': { 'type': 'MyEnum' } }
> { 'type': 'Branch1', 'data': { 'data': 'str' } }
> { 'type': 'Branch2', 'data': { 'data': 'int' } }
> { 'union': 'Flat', 'base': 'Base', 'discriminator': 'type',
>   'data': { 'one': 'Branch1', 'two': 'Branch2' } }
>
> Probably worth mentioning in my commit message.

Yes, please.

> Hmm - that makes me wonder - do we support non-dict branches in a flat
> union?  The usual use case of a flat union is that all dictionary keys
> in the branch's dictionary are treated as keys at the top level of the
> resulting overall union; but a non-dictionary branch has no keys to
> flatten into the top level.

I think you just showed why non-dictionary branches make no sense.

>                              I may need to tweak a subsequent patch to
> ensure that flat union branches can only use dictionaries (while simple
> unions can use any type).

Yes, please.  Follow-up patch okay, if that's easier for you.

>> This proves your "can always use a flat union" proposition.  The only
>> way they can earn their keep is making the schema easier to read.  I
>> haven't thought about that.
>> 
>> Another historical accident is how we express members common to all
>> union cases: base type.  Probably just because complex types already had
>> them.  Are you planning to change anything there?
>
> Maybe, per your own review.  More at point [5] below...
>
>> 
>> [...]
>>> diff --git a/tests/qapi-schema/alternate-nested.err
>>> b/tests/qapi-schema/alternate-nested.err
>>> new file mode 100644
>>> index 0000000..e69de29
>>> diff --git a/tests/qapi-schema/alternate-nested.exit
>>> b/tests/qapi-schema/alternate-nested.exit
>>> new file mode 100644
>>> index 0000000..573541a
>>> --- /dev/null
>>> +++ b/tests/qapi-schema/alternate-nested.exit
>>> @@ -0,0 +1 @@
>>> +0
>>> diff --git a/tests/qapi-schema/alternate-nested.json
>>> b/tests/qapi-schema/alternate-nested.json
>>> new file mode 100644
>>> index 0000000..d5812bf
>>> --- /dev/null
>>> +++ b/tests/qapi-schema/alternate-nested.json
>>> @@ -0,0 +1,7 @@
>>> +# FIXME: we should reject a nested anonymous union branch
>> 
>> Same reason we want to reject nested / anonymous complex types
>> elsewhere?  Or is there a deeper reason?
>
> Similar to the reason as we want to reject {'command':'foo',
> 'data':'alternate-type'} for allowing non-dictionaries - we aren't
> currently using it, and it's easier to reject than to worry about making
> corner cases of the generator work on something we won't use.

Okay, that makes sense.

> Also, if I have an alternate A that chooses between string and dict,
> then want to create an alternate  B that chooses between int and
> alternate A, will that even generate correct code?  An alternate
> represents multiple QObject types at once, so determining the QObject
> type of the next token while parsing the code for union B would have to
> worry about BOTH cases of nested union A.
>
> So the FIXME is correct, and later in the series, I nuke this as
> unsupported.

Good.

>>> +++ b/tests/qapi-schema/flat-union-bad-base.json
>>> @@ -0,0 +1,13 @@
>>> +# we require the base to be an existing complex type
>>> +# FIXME: should we allow an anonymous inline base type?
>> 
>> What do you mean by "anonymous inline base type"?
>
> [5] basically, that the following example could be legal shorthand...
>
>> 
>>> +{ 'enum': 'TestEnum',
>>> +  'data': [ 'value1', 'value2' ] }
>>> +{ 'type': 'TestTypeA',
>>> +  'data': { 'string': 'str' } }
>>> +{ 'type': 'TestTypeB',
>>> +  'data': { 'integer': 'int' } }
>>> +{ 'union': 'TestUnion',
>>> +  'base': { 'enum1': 'TestEnum', 'kind': 'str' },
>>> +  'discriminator': 'TestEnum',

Shouldn't this be 'discriminator': 'enum1'?

>>> +  'data': { 'kind1': 'TestTypeA',
>>> +            'kind2': 'TestTypeB' } }
>
> where the { 'enum1':'TestEnum', 'kind':'str' } anonymous type defined at
> 'base' should be usable instead of having to name the type with a more
> verbose:
>
> { 'type': 'Base', 'data': {'enum1':'TestEnum', 'kind':'str' }}
> { 'union': 'TestUnion', 'base':'Base', ... }
>
> or in other words, that unions behave more like 'command' allowing
> 'data':{dict} as an anonymous type in place of 'data':'name'.

Got it.

Since there's nothing currently broken, I wouldn't label the question
"should we allow an anonymous inline base type?" FIXME.

>>> diff --git a/tests/qapi-schema/flat-union-bad-base.out
>>> b/tests/qapi-schema/flat-union-bad-base.out
>>> new file mode 100644
>>> index 0000000..e69de29
>> [...]
>> 
>> "Doesn't deal well" is an understatement :)
>> 
>> Since all my questions are about your intentions and rationale:
>> 
>> Reviewed-by: Markus Armbruster <armbru@redhat.com>

R-by stands, but I encourage you to polish the commit message and drop
the FIXME: from the comment in flat-union-bad-base.json.
Eric Blake March 27, 2015, 7:47 p.m. UTC | #7
On 03/27/2015 06:30 AM, Markus Armbruster wrote:
> Eric Blake <eblake@redhat.com> writes:
> 
>> On 03/26/2015 07:18 AM, Markus Armbruster wrote:
>>> Eric Blake <eblake@redhat.com> writes:
>>>
>>>> Demonstrate that the qapi generator doesn't deal well with unions
>>>> that aren't up to par. Later patches will update the expected
>>>> reseults as the generator is made stricter.

Oh my. s/reseults/results/

Looks like I'll be doing a v6 to add R-b and clean up these nits, as it
is turning into too large a collection to ask a maintainer to do on my
behalf.  I'll try and split up some of the patches where I crammed
multiple things into one commit, as part of the respin.

>>>>
>>>> Of particular note, we currently allow 'base' without 'discriminator'
>>>> as a way to create a simple union with a base class.  However, none
>>>> of the existing QMP or QGA interfaces use it (it is exercised only
>>>> in the testsuite).
>>>
>>> qapi-code-gen.txt documents 'base' only for flat unions.  We should
>>> either document its use in simple unions, or outlaw it.
>>
>> I went with outlaw later in the series, and the rest of my commit
>> message was an attempt to explain my reasoning in that choice.  But I
>> can certainly try to improve the wording, if we need a respin.
> 
> If you respin, I suggest to add that it's undocumented.

Sure.


>> I'm hoping to add as a followup series a variant of simple unions that
>> is type-safe:
>>
>> [3]
>> { 'union':'SimpleAndSafe', 'discriminator':'MyEnum',
>>   'data':{ 'one':'str', 'two':'int' }}
>>
>> Hmm - that makes me wonder - do we support non-dict branches in a flat
>> union?  The usual use case of a flat union is that all dictionary keys
>> in the branch's dictionary are treated as keys at the top level of the
>> resulting overall union; but a non-dictionary branch has no keys to
>> flatten into the top level.
> 
> I think you just showed why non-dictionary branches make no sense.
> 
>>                              I may need to tweak a subsequent patch to
>> ensure that flat union branches can only use dictionaries (while simple
>> unions can use any type).
> 
> Yes, please.  Follow-up patch okay, if that's easier for you.

At this point, it may indeed be easier (keep the front half of the patch
down to the bare minimum, and just make the entire series longer, rather
than trying to rebase things into perfection).

>>> What do you mean by "anonymous inline base type"?
>>
>> [5] basically, that the following example could be legal shorthand...
>>
>>>
>>>> +{ 'enum': 'TestEnum',
>>>> +  'data': [ 'value1', 'value2' ] }
>>>> +{ 'type': 'TestTypeA',
>>>> +  'data': { 'string': 'str' } }
>>>> +{ 'type': 'TestTypeB',
>>>> +  'data': { 'integer': 'int' } }
>>>> +{ 'union': 'TestUnion',
>>>> +  'base': { 'enum1': 'TestEnum', 'kind': 'str' },
>>>> +  'discriminator': 'TestEnum',
> 
> Shouldn't this be 'discriminator': 'enum1'?

Yes.  Good catch.

> 
> Since there's nothing currently broken, I wouldn't label the question
> "should we allow an anonymous inline base type?" FIXME.

As part of my respin, I'll change any FIXME into something more benign
like TODO or RFC if it is merely intended to document a possible future
direction and not and existing bug to be corrected by the series (so
that the remaining additions of FIXME in the earlier part of the series
will all disappear by the end of the series).

>>> Since all my questions are about your intentions and rationale:
>>>
>>> Reviewed-by: Markus Armbruster <armbru@redhat.com>
> 
> R-by stands, but I encourage you to polish the commit message and drop
> the FIXME: from the comment in flat-union-bad-base.json.

I think we've racked up enough to warrant a respin; I'll keep your R-b
where the only things I touch are obviously trivial, but it may mean a
few patches come through without R-b.
Eric Blake March 30, 2015, 10:45 p.m. UTC | #8
On 03/26/2015 09:58 AM, Markus Armbruster wrote:

>>> /home/armbru/work/qemu/.git/rebase-apply/patch:325: new blank line at EOF.
>>
>> Huh. I thought I had git set up to reject me from making commits like
>> that locally, but obviously not.
> 
> There's another one in PATCH 13:
> 
> /home/armbru/work/qemu/.git/rebase-apply/patch:156: new blank line at EOF.
> 
>> http://wiki.qemu.org/Contribute/SubmitAPatch should probably mention the
>> magic one-time setup to use to turn this type of checking on...
> 
> Feel free to add it :)

The wiki already mentions how:
*
[http://blog.vmsplice.net/2011/03/how-to-automatically-run-checkpatchpl.html
Automate a checkpatch run on commit]

but I had failed to 'chmod +x .git/hooks/pre-commit' to actually use it.
Kevin Wolf March 31, 2015, 5:13 p.m. UTC | #9
Am 26.03.2015 um 16:04 hat Eric Blake geschrieben:
> On 03/26/2015 07:18 AM, Markus Armbruster wrote:
> > Eric Blake <eblake@redhat.com> writes:
> >>                     Meanwhile, it would be nice to allow
> >> 'discriminator':'EnumType' without 'base' for creating a simple union
> >> that is type-safe rather than open-coded to a generated enum (right
> >> now, we are only type-safe when using a flat union, but that uses
> >> a different syntax of 'discriminator':'member-name' which requires
> >> a base class containing a 'member-name' enum field).
> > 
> > I'm afraid I don't get you.  Can you give examples?
> 
> Using this common code with the appropriate union for each example:
> { 'command':'foo', 'data':UNION }
> 
> Right now, we have flat unions which are required to be type-safe (all
> branches MUST map back to the enum type of the discriminator, enforced
> by the generator, so that if the enum later adds a member, the union
> must also be updated to match):
> 
> [1]
> { 'union':'Safe', 'base':'Base', 'discriminator':'switch',
>   'data':{ 'one':'str', 'two':'int' }}
> 
> {"execute":"foo", "arguments":{'switch':'one', 'data':'hello'}}
> 
> and simple unions which cannot be typesafe (the branches of the union
> are open-coded - even if they correlate to an existing enum, there is
> nothing enforcing that extensions to the enum be reflected into the union):
> 
> [2]
> { 'union':'SimpleButOpenCoded',
>   'data':{ 'one': 'str', 'two':'int' }}
> 
> {"execute":"foo", "arguments":{'type':'one', 'data':'hello'}}
> 
> I'm hoping to add as a followup series a variant of simple unions that
> is type-safe:
> 
> [3]
> { 'union':'SimpleAndSafe', 'discriminator':'MyEnum',
>   'data':{ 'one':'str', 'two':'int' }}
> 
> {"execute":"foo", "arguments":{'type':'one', 'data':'hello'}}

This would overload 'discriminator' with two different meanings:

* In case [1] it refers to one key in the JSON object which contains the
  name of the union branch to select. That is, it is the _name_ of the
  key used as a discriminator.

* In case [3], the 'type' key is used in the JSON object to select a
  union branch. 'discriminator' describes the _valid values_ of it, i.e.
  the branch names.

We shouldn't mix these meanings. If you need [3], we could call it
'discriminator-type' or something like that. If both options are
consistently used for simple and flat unions, you end up with these
rules:

* 'discriminator' is the key that is used to select the union branch.

  - For flat unions it is required and must refer to an explicitly
    declared key in the base type.

  - For simple unions it is optional and defaults to 'type'. The key is
    implicitly created in the union type.

* 'discrimiator-type' describes the valid values of 'discriminator',
  either by referring to an enum type or to 'str'.

  - For flat unions, this must match the type of the explicit
    declaration of the discriminator field. It is optional and defauls
    to the only valid value.

  - For simple unions it is optional, too, and defaults to 'str'.

  - If it is the name of an enum type, that enum type is reused and the
    declared union branches must match the valid values of the enum.

  - If it is 'str', a new enum is generated, and all the declared union
    branches are used as valid values.

There's quite some duplication in it where we need to make sure that the
schema matches in all places, but without an explicit declaration of the
disriminator field in simple unions, this seems to be the best I can
come up with.

> But the existing, unused-except-in-testsuite, notion of a simple union
> with a base class looks like:
> 
> [4]
> { 'type':'Shared', 'data':{'common':'int'}}
> { 'union':'SimpleWithBase', 'base':'Shared',
>   'data':{ 'one':'str', 'two':'int' }}
> 
> {"execute":"foo", "arguments":{'common':1, 'type':'one', 'data':'hello'}}
> 
> If we were to allow the addition of 'discriminator':'EnumType' to a
> simple union [3], but then add that discriminator to an existing case of
> a simple union with base [4], it would look like:
> 
> { 'type':'Shared', 'data':{'common':'int'}}
> { 'union':'SimpleWithBaseAndDiscriminator', 'base':'Shared',
>   'discriminator':'MyEnum',
>   'data':{ 'one':'str', 'two':'int' }}
> 
> Yuck.  That is indistinguishable from flat unions [1], except by whether
> discriminator names an enum type or a member of the base class.

Which could even be ambiguous, couldn't it?

> > In particular, I can define simple unions in terms of flat ones by
> > restricting all union cases to a single member named 'data'.  They're
> > not implemented that way, but that's a historical accident.  Simple
> > unions are a redundant concept.
> 
> Cool.  Or more concretely,
> 
> { 'union': 'Simple', 'data': { 'one': 'str', 'two': 'int' } }
> 
> is identical on the wire to:
> 
> { 'enum': 'MyEnum', 'data': ['one', 'two'] }
> { 'type': 'Base', 'data': { 'type': 'MyEnum' } }
> { 'type': 'Branch1', 'data': { 'data': 'str' } }
> { 'type': 'Branch2', 'data': { 'data': 'int' } }
> { 'union': 'Flat', 'base': 'Base', 'discriminator': 'type',
>   'data': { 'one': 'Branch1', 'two': 'Branch2' } }

Perhaps we should expose all unions for schema introspection in a way
similar to this (possibly not splitting out Branch1 and Branch2 as
independent types, though). We would just have to give a name to
implicitly generated enums and base types.

Kevin
Eric Blake March 31, 2015, 6:15 p.m. UTC | #10
On 03/31/2015 11:13 AM, Kevin Wolf wrote:

>> [3]
>> { 'union':'SimpleAndSafe', 'discriminator':'MyEnum',
>>   'data':{ 'one':'str', 'two':'int' }}
>>
>> {"execute":"foo", "arguments":{'type':'one', 'data':'hello'}}
> 
> This would overload 'discriminator' with two different meanings:
> 
> * In case [1] it refers to one key in the JSON object which contains the
>   name of the union branch to select. That is, it is the _name_ of the
>   key used as a discriminator.
> 
> * In case [3], the 'type' key is used in the JSON object to select a
>   union branch. 'discriminator' describes the _valid values_ of it, i.e.
>   the branch names.
> 
> We shouldn't mix these meanings. If you need [3], we could call it
> 'discriminator-type' or something like that. If both options are
> consistently used for simple and flat unions, you end up with these
> rules:
> 

Nice idea. Good thing I haven't actually implemented the extension yet.

> * 'discriminator' is the key that is used to select the union branch.
> 
>   - For flat unions it is required and must refer to an explicitly
>     declared key in the base type.
> 
>   - For simple unions it is optional and defaults to 'type'. The key is
>     implicitly created in the union type.
> 
> * 'discrimiator-type' describes the valid values of 'discriminator',
>   either by referring to an enum type or to 'str'.

Referring to 'str' should only be allowed for implicit enums.  We
already require it to resolve to an enum for flat unions.

> 
>   - For flat unions, this must match the type of the explicit
>     declaration of the discriminator field. It is optional and defauls
>     to the only valid value.
> 
>   - For simple unions it is optional, too, and defaults to 'str'.
> 
>   - If it is the name of an enum type, that enum type is reused and the
>     declared union branches must match the valid values of the enum.
> 
>   - If it is 'str', a new enum is generated, and all the declared union
>     branches are used as valid values.

Sounds reasonable to me.  But even with this approach, I still don't
know that we have room for a 'base' type on a simple union.  I guess the
joy of future extensions is that we can make legal whatever we want, and
that this series is adding enough tests to make it obvious whether those
new additions break any existing test cases.

> 
> There's quite some duplication in it where we need to make sure that the
> schema matches in all places, but without an explicit declaration of the
> disriminator field in simple unions, this seems to be the best I can
> come up with.

Yeah, it's a bit more verbose than the overloaded version, but may be
worth it.

> 
>> But the existing, unused-except-in-testsuite, notion of a simple union
>> with a base class looks like:
>>
>> [4]
>> { 'type':'Shared', 'data':{'common':'int'}}
>> { 'union':'SimpleWithBase', 'base':'Shared',
>>   'data':{ 'one':'str', 'two':'int' }}
>>
>> {"execute":"foo", "arguments":{'common':1, 'type':'one', 'data':'hello'}}
>>
>> If we were to allow the addition of 'discriminator':'EnumType' to a
>> simple union [3], but then add that discriminator to an existing case of
>> a simple union with base [4], it would look like:
>>
>> { 'type':'Shared', 'data':{'common':'int'}}
>> { 'union':'SimpleWithBaseAndDiscriminator', 'base':'Shared',
>>   'discriminator':'MyEnum',
>>   'data':{ 'one':'str', 'two':'int' }}
>>
>> Yuck.  That is indistinguishable from flat unions [1], except by whether
>> discriminator names an enum type or a member of the base class.
> 
> Which could even be ambiguous, couldn't it?

My whole argument for why I _don't_ want to allow base types for simple
unions.

> 
>>> In particular, I can define simple unions in terms of flat ones by
>>> restricting all union cases to a single member named 'data'.  They're
>>> not implemented that way, but that's a historical accident.  Simple
>>> unions are a redundant concept.
>>
>> Cool.  Or more concretely,
>>
>> { 'union': 'Simple', 'data': { 'one': 'str', 'two': 'int' } }
>>
>> is identical on the wire to:
>>
>> { 'enum': 'MyEnum', 'data': ['one', 'two'] }
>> { 'type': 'Base', 'data': { 'type': 'MyEnum' } }
>> { 'type': 'Branch1', 'data': { 'data': 'str' } }
>> { 'type': 'Branch2', 'data': { 'data': 'int' } }
>> { 'union': 'Flat', 'base': 'Base', 'discriminator': 'type',
>>   'data': { 'one': 'Branch1', 'two': 'Branch2' } }
> 
> Perhaps we should expose all unions for schema introspection in a way
> similar to this (possibly not splitting out Branch1 and Branch2 as
> independent types, though). We would just have to give a name to
> implicitly generated enums and base types.

So maybe we update the schema to allow anonymous types.  That is:

{ 'union': 'Simple',
  'data': { 'one': { 'name': 'str', 'value': 'int' } } }

would be nicer than the current requirement of:

{ 'type': 'Branch1', 'data': { 'name': 'str', 'value': 'int' } }
{ 'union': 'Simple',
  'data': { 'one': 'Branch1' } }
Eric Blake March 31, 2015, 6:31 p.m. UTC | #11
On 03/31/2015 12:15 PM, Eric Blake wrote:

>>> Cool.  Or more concretely,
>>>
>>> { 'union': 'Simple', 'data': { 'one': 'str', 'two': 'int' } }
>>>
>>> is identical on the wire to:
>>>
>>> { 'enum': 'MyEnum', 'data': ['one', 'two'] }
>>> { 'type': 'Base', 'data': { 'type': 'MyEnum' } }
>>> { 'type': 'Branch1', 'data': { 'data': 'str' } }
>>> { 'type': 'Branch2', 'data': { 'data': 'int' } }
>>> { 'union': 'Flat', 'base': 'Base', 'discriminator': 'type',
>>>   'data': { 'one': 'Branch1', 'two': 'Branch2' } }
>>
>> Perhaps we should expose all unions for schema introspection in a way
>> similar to this (possibly not splitting out Branch1 and Branch2 as
>> independent types, though). We would just have to give a name to
>> implicitly generated enums and base types.
> 
> So maybe we update the schema to allow anonymous types.  That is:
> 
> { 'union': 'Simple',
>   'data': { 'one': { 'name': 'str', 'value': 'int' } } }
> 
> would be nicer than the current requirement of:
> 
> { 'type': 'Branch1', 'data': { 'name': 'str', 'value': 'int' } }
> { 'union': 'Simple',
>   'data': { 'one': 'Branch1' } }

Hmm; maybe we could support a notion of a 'common' dictionary for simple
unions, containing all non-discriminator fields present in all branches,
where (using anonymous types for compactness), and assuming the enum
definition:

{ 'union': 'Simple',
  'common': { 'readonly': 'bool' },
  'data': { 'one': 'str', 'two': 'int' } }

is shorthand for:

{ 'union': 'Flat',
  'base': { 'readonly': 'bool', 'type': 'Enum' },
  'discriminator': 'type',
  'data': { 'one': { 'data': 'str' },
            'two': { 'data': 'int' } } }

that is, where 'base' for flat unions is the union of the 'common' type
and 'discriminator':'discriminator-type' member of simple unions.

At any rate, any changes along these lines will be a later series.  Time
for me to get back to work on publishing v6 :)
Kevin Wolf March 31, 2015, 6:34 p.m. UTC | #12
Am 31.03.2015 um 20:15 hat Eric Blake geschrieben:
> On 03/31/2015 11:13 AM, Kevin Wolf wrote:
> 
> >> [3]
> >> { 'union':'SimpleAndSafe', 'discriminator':'MyEnum',
> >>   'data':{ 'one':'str', 'two':'int' }}
> >>
> >> {"execute":"foo", "arguments":{'type':'one', 'data':'hello'}}
> > 
> > This would overload 'discriminator' with two different meanings:
> > 
> > * In case [1] it refers to one key in the JSON object which contains the
> >   name of the union branch to select. That is, it is the _name_ of the
> >   key used as a discriminator.
> > 
> > * In case [3], the 'type' key is used in the JSON object to select a
> >   union branch. 'discriminator' describes the _valid values_ of it, i.e.
> >   the branch names.
> > 
> > We shouldn't mix these meanings. If you need [3], we could call it
> > 'discriminator-type' or something like that. If both options are
> > consistently used for simple and flat unions, you end up with these
> > rules:
> > 
> 
> Nice idea. Good thing I haven't actually implemented the extension yet.
> 
> > * 'discriminator' is the key that is used to select the union branch.
> > 
> >   - For flat unions it is required and must refer to an explicitly
> >     declared key in the base type.
> > 
> >   - For simple unions it is optional and defaults to 'type'. The key is
> >     implicitly created in the union type.
> > 
> > * 'discrimiator-type' describes the valid values of 'discriminator',
> >   either by referring to an enum type or to 'str'.
> 
> Referring to 'str' should only be allowed for implicit enums.  We
> already require it to resolve to an enum for flat unions.
> 
> > 
> >   - For flat unions, this must match the type of the explicit
> >     declaration of the discriminator field. It is optional and defauls
> >     to the only valid value.
> > 
> >   - For simple unions it is optional, too, and defaults to 'str'.
> > 
> >   - If it is the name of an enum type, that enum type is reused and the
> >     declared union branches must match the valid values of the enum.
> > 
> >   - If it is 'str', a new enum is generated, and all the declared union
> >     branches are used as valid values.
> 
> Sounds reasonable to me.  But even with this approach, I still don't
> know that we have room for a 'base' type on a simple union.  I guess the
> joy of future extensions is that we can make legal whatever we want, and
> that this series is adding enough tests to make it obvious whether those
> new additions break any existing test cases.

I think we have room for it, but probably still no use case. And they
are kind of ugly with mixing top-level fields for the base type and
nested ones for the branch-specific fields. (To be honest, I find all
non-flat unions ugly, but mixing is even uglier.)

So I certainly don't object to removing simple unions with base.

> > There's quite some duplication in it where we need to make sure that the
> > schema matches in all places, but without an explicit declaration of the
> > disriminator field in simple unions, this seems to be the best I can
> > come up with.
> 
> Yeah, it's a bit more verbose than the overloaded version, but may be
> worth it.
> 
> > 
> >> But the existing, unused-except-in-testsuite, notion of a simple union
> >> with a base class looks like:
> >>
> >> [4]
> >> { 'type':'Shared', 'data':{'common':'int'}}
> >> { 'union':'SimpleWithBase', 'base':'Shared',
> >>   'data':{ 'one':'str', 'two':'int' }}
> >>
> >> {"execute":"foo", "arguments":{'common':1, 'type':'one', 'data':'hello'}}
> >>
> >> If we were to allow the addition of 'discriminator':'EnumType' to a
> >> simple union [3], but then add that discriminator to an existing case of
> >> a simple union with base [4], it would look like:
> >>
> >> { 'type':'Shared', 'data':{'common':'int'}}
> >> { 'union':'SimpleWithBaseAndDiscriminator', 'base':'Shared',
> >>   'discriminator':'MyEnum',
> >>   'data':{ 'one':'str', 'two':'int' }}
> >>
> >> Yuck.  That is indistinguishable from flat unions [1], except by whether
> >> discriminator names an enum type or a member of the base class.
> > 
> > Which could even be ambiguous, couldn't it?
> 
> My whole argument for why I _don't_ want to allow base types for simple
> unions.

Essentially the problem is that we distinguish between simple and flat
unions only by looking at currently arbitrarily restricted options. So
another option to solve it would be a boolean 'flat', which we would set
as false for all existing simple unions, and let it default to true.
That would give us nicer QMP for new unions, too.

(I still don't think we need to save simple unions with base types, but
it's a thought I wanted to mention anyway because I like flat unions.)

> >>> In particular, I can define simple unions in terms of flat ones by
> >>> restricting all union cases to a single member named 'data'.  They're
> >>> not implemented that way, but that's a historical accident.  Simple
> >>> unions are a redundant concept.
> >>
> >> Cool.  Or more concretely,
> >>
> >> { 'union': 'Simple', 'data': { 'one': 'str', 'two': 'int' } }
> >>
> >> is identical on the wire to:
> >>
> >> { 'enum': 'MyEnum', 'data': ['one', 'two'] }
> >> { 'type': 'Base', 'data': { 'type': 'MyEnum' } }
> >> { 'type': 'Branch1', 'data': { 'data': 'str' } }
> >> { 'type': 'Branch2', 'data': { 'data': 'int' } }
> >> { 'union': 'Flat', 'base': 'Base', 'discriminator': 'type',
> >>   'data': { 'one': 'Branch1', 'two': 'Branch2' } }
> > 
> > Perhaps we should expose all unions for schema introspection in a way
> > similar to this (possibly not splitting out Branch1 and Branch2 as
> > independent types, though). We would just have to give a name to
> > implicitly generated enums and base types.
> 
> So maybe we update the schema to allow anonymous types.  That is:
> 
> { 'union': 'Simple',
>   'data': { 'one': { 'name': 'str', 'value': 'int' } } }
> 
> would be nicer than the current requirement of:
> 
> { 'type': 'Branch1', 'data': { 'name': 'str', 'value': 'int' } }
> { 'union': 'Simple',
>   'data': { 'one': 'Branch1' } }

Hm... You just removed them, right? Maybe my suggestion isn't that
constructive then. :-)

Kevin
Markus Armbruster March 31, 2015, 8:46 p.m. UTC | #13
Kevin Wolf <kwolf@redhat.com> writes:

> Am 26.03.2015 um 16:04 hat Eric Blake geschrieben:
>> On 03/26/2015 07:18 AM, Markus Armbruster wrote:
>> > Eric Blake <eblake@redhat.com> writes:
>> >>                     Meanwhile, it would be nice to allow
>> >> 'discriminator':'EnumType' without 'base' for creating a simple union
>> >> that is type-safe rather than open-coded to a generated enum (right
>> >> now, we are only type-safe when using a flat union, but that uses
>> >> a different syntax of 'discriminator':'member-name' which requires
>> >> a base class containing a 'member-name' enum field).
>> > 
>> > I'm afraid I don't get you.  Can you give examples?
>> 
>> Using this common code with the appropriate union for each example:
>> { 'command':'foo', 'data':UNION }
>> 
>> Right now, we have flat unions which are required to be type-safe (all
>> branches MUST map back to the enum type of the discriminator, enforced
>> by the generator, so that if the enum later adds a member, the union
>> must also be updated to match):
>> 
>> [1]
>> { 'union':'Safe', 'base':'Base', 'discriminator':'switch',
>>   'data':{ 'one':'str', 'two':'int' }}
>> 
>> {"execute":"foo", "arguments":{'switch':'one', 'data':'hello'}}
>> 
>> and simple unions which cannot be typesafe (the branches of the union
>> are open-coded - even if they correlate to an existing enum, there is
>> nothing enforcing that extensions to the enum be reflected into the union):
>> 
>> [2]
>> { 'union':'SimpleButOpenCoded',
>>   'data':{ 'one': 'str', 'two':'int' }}
>> 
>> {"execute":"foo", "arguments":{'type':'one', 'data':'hello'}}
>> 
>> I'm hoping to add as a followup series a variant of simple unions that
>> is type-safe:
>> 
>> [3]
>> { 'union':'SimpleAndSafe', 'discriminator':'MyEnum',
>>   'data':{ 'one':'str', 'two':'int' }}
>> 
>> {"execute":"foo", "arguments":{'type':'one', 'data':'hello'}}
>
> This would overload 'discriminator' with two different meanings:
>
> * In case [1] it refers to one key in the JSON object which contains the
>   name of the union branch to select. That is, it is the _name_ of the
>   key used as a discriminator.
>
> * In case [3], the 'type' key is used in the JSON object to select a
>   union branch. 'discriminator' describes the _valid values_ of it, i.e.
>   the branch names.
>
> We shouldn't mix these meanings. If you need [3], we could call it
> 'discriminator-type' or something like that. If both options are
> consistently used for simple and flat unions, you end up with these
> rules:
>
> * 'discriminator' is the key that is used to select the union branch.
>
>   - For flat unions it is required and must refer to an explicitly
>     declared key in the base type.
>
>   - For simple unions it is optional and defaults to 'type'. The key is
>     implicitly created in the union type.
>
> * 'discrimiator-type' describes the valid values of 'discriminator',
>   either by referring to an enum type or to 'str'.
>
>   - For flat unions, this must match the type of the explicit
>     declaration of the discriminator field. It is optional and defauls
>     to the only valid value.
>
>   - For simple unions it is optional, too, and defaults to 'str'.
>
>   - If it is the name of an enum type, that enum type is reused and the
>     declared union branches must match the valid values of the enum.
>
>   - If it is 'str', a new enum is generated, and all the declared union
>     branches are used as valid values.
>
> There's quite some duplication in it where we need to make sure that the
> schema matches in all places, but without an explicit declaration of the
> disriminator field in simple unions, this seems to be the best I can
> come up with.
>
>> But the existing, unused-except-in-testsuite, notion of a simple union
>> with a base class looks like:
>> 
>> [4]
>> { 'type':'Shared', 'data':{'common':'int'}}
>> { 'union':'SimpleWithBase', 'base':'Shared',
>>   'data':{ 'one':'str', 'two':'int' }}
>> 
>> {"execute":"foo", "arguments":{'common':1, 'type':'one', 'data':'hello'}}
>> 
>> If we were to allow the addition of 'discriminator':'EnumType' to a
>> simple union [3], but then add that discriminator to an existing case of
>> a simple union with base [4], it would look like:
>> 
>> { 'type':'Shared', 'data':{'common':'int'}}
>> { 'union':'SimpleWithBaseAndDiscriminator', 'base':'Shared',
>>   'discriminator':'MyEnum',
>>   'data':{ 'one':'str', 'two':'int' }}
>> 
>> Yuck.  That is indistinguishable from flat unions [1], except by whether
>> discriminator names an enum type or a member of the base class.
>
> Which could even be ambiguous, couldn't it?
>
>> > In particular, I can define simple unions in terms of flat ones by
>> > restricting all union cases to a single member named 'data'.  They're
>> > not implemented that way, but that's a historical accident.  Simple
>> > unions are a redundant concept.
>> 
>> Cool.  Or more concretely,
>> 
>> { 'union': 'Simple', 'data': { 'one': 'str', 'two': 'int' } }
>> 
>> is identical on the wire to:
>> 
>> { 'enum': 'MyEnum', 'data': ['one', 'two'] }
>> { 'type': 'Base', 'data': { 'type': 'MyEnum' } }
>> { 'type': 'Branch1', 'data': { 'data': 'str' } }
>> { 'type': 'Branch2', 'data': { 'data': 'int' } }
>> { 'union': 'Flat', 'base': 'Base', 'discriminator': 'type',
>>   'data': { 'one': 'Branch1', 'two': 'Branch2' } }
>
> Perhaps we should expose all unions for schema introspection in a way
> similar to this (possibly not splitting out Branch1 and Branch2 as
> independent types, though).

My current thinking on this is a bit more radical.  I suspect there's a
straightforward object type buried in this mess, struggling to get out:
the good old variant record.  It consists of

* a name

* an optional base type name (must be a object type without variants)

* list of own members (empty okay)

  Effective members = own members + effective base type members

* optionally variants

  - one of the effective members is the discriminator (must be enum)

  - for each discriminator value a list of variant members (empty okay)

All existing type/union types are specializations:

* The complex type (struct type) is an object type without variants.

* The simple union type is an object type

  - without a base type

  - with an implicitly defined own member of an implicitly defined
    enumeration type serving as discriminator

  - with no other own members

  - with a single variant member for each discriminator value

* The flat union type is an object type

  - with a base type

  - without own members

  - with a discriminator

I further suspect lowering these types to the general form will make the
generator simpler, not just the introspection.

>                             We would just have to give a name to
> implicitly generated enums and base types.

Introspection doesn't care whether we defined something implicitly or
explicitly.  Let's make up names to hide that.

I'm trying to get a proof-of-concept introspection patch working this
week.  It'll probably be ugly enough to curdle the milk in your tea.
Eric Blake March 31, 2015, 11:40 p.m. UTC | #14
On 03/30/2015 04:45 PM, Eric Blake wrote:
> On 03/26/2015 09:58 AM, Markus Armbruster wrote:
> 
>>>> /home/armbru/work/qemu/.git/rebase-apply/patch:325: new blank line at EOF.
>>>
>>> Huh. I thought I had git set up to reject me from making commits like
>>> that locally, but obviously not.
>>
>> There's another one in PATCH 13:
>>
>> /home/armbru/work/qemu/.git/rebase-apply/patch:156: new blank line at EOF.
>>
>>> http://wiki.qemu.org/Contribute/SubmitAPatch should probably mention the
>>> magic one-time setup to use to turn this type of checking on...
>>
>> Feel free to add it :)
> 
> The wiki already mentions how:
> *
> [http://blog.vmsplice.net/2011/03/how-to-automatically-run-checkpatchpl.html
> Automate a checkpatch run on commit]
> 
> but I had failed to 'chmod +x .git/hooks/pre-commit' to actually use it.

Actually, scripts/checkpatch.pl does NOT flag blank lines; so I want to
also wire up git's default .git/hooks/pre-commit checker which can flag
that type of damage (Stefan's blog and/or checkpatch.pl may need an update).
Kevin Wolf April 1, 2015, 8:23 a.m. UTC | #15
Am 31.03.2015 um 22:46 hat Markus Armbruster geschrieben:
> Kevin Wolf <kwolf@redhat.com> writes:
> 
> > Am 26.03.2015 um 16:04 hat Eric Blake geschrieben:
> >> On 03/26/2015 07:18 AM, Markus Armbruster wrote:
> >> > Eric Blake <eblake@redhat.com> writes:
> >> >>                     Meanwhile, it would be nice to allow
> >> >> 'discriminator':'EnumType' without 'base' for creating a simple union
> >> >> that is type-safe rather than open-coded to a generated enum (right
> >> >> now, we are only type-safe when using a flat union, but that uses
> >> >> a different syntax of 'discriminator':'member-name' which requires
> >> >> a base class containing a 'member-name' enum field).
> >> > 
> >> > I'm afraid I don't get you.  Can you give examples?
> >> 
> >> Using this common code with the appropriate union for each example:
> >> { 'command':'foo', 'data':UNION }
> >> 
> >> Right now, we have flat unions which are required to be type-safe (all
> >> branches MUST map back to the enum type of the discriminator, enforced
> >> by the generator, so that if the enum later adds a member, the union
> >> must also be updated to match):
> >> 
> >> [1]
> >> { 'union':'Safe', 'base':'Base', 'discriminator':'switch',
> >>   'data':{ 'one':'str', 'two':'int' }}
> >> 
> >> {"execute":"foo", "arguments":{'switch':'one', 'data':'hello'}}
> >> 
> >> and simple unions which cannot be typesafe (the branches of the union
> >> are open-coded - even if they correlate to an existing enum, there is
> >> nothing enforcing that extensions to the enum be reflected into the union):
> >> 
> >> [2]
> >> { 'union':'SimpleButOpenCoded',
> >>   'data':{ 'one': 'str', 'two':'int' }}
> >> 
> >> {"execute":"foo", "arguments":{'type':'one', 'data':'hello'}}
> >> 
> >> I'm hoping to add as a followup series a variant of simple unions that
> >> is type-safe:
> >> 
> >> [3]
> >> { 'union':'SimpleAndSafe', 'discriminator':'MyEnum',
> >>   'data':{ 'one':'str', 'two':'int' }}
> >> 
> >> {"execute":"foo", "arguments":{'type':'one', 'data':'hello'}}
> >
> > This would overload 'discriminator' with two different meanings:
> >
> > * In case [1] it refers to one key in the JSON object which contains the
> >   name of the union branch to select. That is, it is the _name_ of the
> >   key used as a discriminator.
> >
> > * In case [3], the 'type' key is used in the JSON object to select a
> >   union branch. 'discriminator' describes the _valid values_ of it, i.e.
> >   the branch names.
> >
> > We shouldn't mix these meanings. If you need [3], we could call it
> > 'discriminator-type' or something like that. If both options are
> > consistently used for simple and flat unions, you end up with these
> > rules:
> >
> > * 'discriminator' is the key that is used to select the union branch.
> >
> >   - For flat unions it is required and must refer to an explicitly
> >     declared key in the base type.
> >
> >   - For simple unions it is optional and defaults to 'type'. The key is
> >     implicitly created in the union type.
> >
> > * 'discrimiator-type' describes the valid values of 'discriminator',
> >   either by referring to an enum type or to 'str'.
> >
> >   - For flat unions, this must match the type of the explicit
> >     declaration of the discriminator field. It is optional and defauls
> >     to the only valid value.
> >
> >   - For simple unions it is optional, too, and defaults to 'str'.
> >
> >   - If it is the name of an enum type, that enum type is reused and the
> >     declared union branches must match the valid values of the enum.
> >
> >   - If it is 'str', a new enum is generated, and all the declared union
> >     branches are used as valid values.
> >
> > There's quite some duplication in it where we need to make sure that the
> > schema matches in all places, but without an explicit declaration of the
> > disriminator field in simple unions, this seems to be the best I can
> > come up with.
> >
> >> But the existing, unused-except-in-testsuite, notion of a simple union
> >> with a base class looks like:
> >> 
> >> [4]
> >> { 'type':'Shared', 'data':{'common':'int'}}
> >> { 'union':'SimpleWithBase', 'base':'Shared',
> >>   'data':{ 'one':'str', 'two':'int' }}
> >> 
> >> {"execute":"foo", "arguments":{'common':1, 'type':'one', 'data':'hello'}}
> >> 
> >> If we were to allow the addition of 'discriminator':'EnumType' to a
> >> simple union [3], but then add that discriminator to an existing case of
> >> a simple union with base [4], it would look like:
> >> 
> >> { 'type':'Shared', 'data':{'common':'int'}}
> >> { 'union':'SimpleWithBaseAndDiscriminator', 'base':'Shared',
> >>   'discriminator':'MyEnum',
> >>   'data':{ 'one':'str', 'two':'int' }}
> >> 
> >> Yuck.  That is indistinguishable from flat unions [1], except by whether
> >> discriminator names an enum type or a member of the base class.
> >
> > Which could even be ambiguous, couldn't it?
> >
> >> > In particular, I can define simple unions in terms of flat ones by
> >> > restricting all union cases to a single member named 'data'.  They're
> >> > not implemented that way, but that's a historical accident.  Simple
> >> > unions are a redundant concept.
> >> 
> >> Cool.  Or more concretely,
> >> 
> >> { 'union': 'Simple', 'data': { 'one': 'str', 'two': 'int' } }
> >> 
> >> is identical on the wire to:
> >> 
> >> { 'enum': 'MyEnum', 'data': ['one', 'two'] }
> >> { 'type': 'Base', 'data': { 'type': 'MyEnum' } }
> >> { 'type': 'Branch1', 'data': { 'data': 'str' } }
> >> { 'type': 'Branch2', 'data': { 'data': 'int' } }
> >> { 'union': 'Flat', 'base': 'Base', 'discriminator': 'type',
> >>   'data': { 'one': 'Branch1', 'two': 'Branch2' } }
> >
> > Perhaps we should expose all unions for schema introspection in a way
> > similar to this (possibly not splitting out Branch1 and Branch2 as
> > independent types, though).
> 
> My current thinking on this is a bit more radical.  I suspect there's a
> straightforward object type buried in this mess, struggling to get out:
> the good old variant record.  It consists of
> 
> * a name
> 
> * an optional base type name (must be a object type without variants)
> 
> * list of own members (empty okay)
> 
>   Effective members = own members + effective base type members
> 
> * optionally variants
> 
>   - one of the effective members is the discriminator (must be enum)
> 
>   - for each discriminator value a list of variant members (empty okay)
> 
> All existing type/union types are specializations:
> 
> * The complex type (struct type) is an object type without variants.
> 
> * The simple union type is an object type
> 
>   - without a base type
> 
>   - with an implicitly defined own member of an implicitly defined
>     enumeration type serving as discriminator
> 
>   - with no other own members
> 
>   - with a single variant member for each discriminator value
> 
> * The flat union type is an object type
> 
>   - with a base type
> 
>   - without own members
> 
>   - with a discriminator
> 
> I further suspect lowering these types to the general form will make the
> generator simpler, not just the introspection.

That seems to be essentially the --verbose version of what I said above,
except that you also include simple structs. So yes, I think I agree.

Or maybe I'm missing what else you think is different?

> >                             We would just have to give a name to
> > implicitly generated enums and base types.
> 
> Introspection doesn't care whether we defined something implicitly or
> explicitly.  Let's make up names to hide that.

We just need to find a good way to make up names that stay stable and
don't cause clashes. I'm afraid that unlike block device IDs, we might
not have additional characters that could select a different namespace
here?

Not that this would be super hard, but if we need to use the same
namespace for automatically generated names, we need to properly
document which other identifiers are automatically used when you
declare a type (and detect clashes in the generator, of course).

> I'm trying to get a proof-of-concept introspection patch working this
> week.  It'll probably be ugly enough to curdle the milk in your tea.

Who put that milk into my tea?! I never do that! ;-)

Kevin
Markus Armbruster April 1, 2015, 9:14 a.m. UTC | #16
Kevin Wolf <kwolf@redhat.com> writes:

> Am 31.03.2015 um 22:46 hat Markus Armbruster geschrieben:
>> Kevin Wolf <kwolf@redhat.com> writes:
>> 
>> > Am 26.03.2015 um 16:04 hat Eric Blake geschrieben:
>> >> On 03/26/2015 07:18 AM, Markus Armbruster wrote:
>> >> > Eric Blake <eblake@redhat.com> writes:
>> >> >>                     Meanwhile, it would be nice to allow
>> >> >> 'discriminator':'EnumType' without 'base' for creating a simple union
>> >> >> that is type-safe rather than open-coded to a generated enum (right
>> >> >> now, we are only type-safe when using a flat union, but that uses
>> >> >> a different syntax of 'discriminator':'member-name' which requires
>> >> >> a base class containing a 'member-name' enum field).
>> >> > 
>> >> > I'm afraid I don't get you.  Can you give examples?
>> >> 
>> >> Using this common code with the appropriate union for each example:
>> >> { 'command':'foo', 'data':UNION }
>> >> 
>> >> Right now, we have flat unions which are required to be type-safe (all
>> >> branches MUST map back to the enum type of the discriminator, enforced
>> >> by the generator, so that if the enum later adds a member, the union
>> >> must also be updated to match):
>> >> 
>> >> [1]
>> >> { 'union':'Safe', 'base':'Base', 'discriminator':'switch',
>> >>   'data':{ 'one':'str', 'two':'int' }}
>> >> 
>> >> {"execute":"foo", "arguments":{'switch':'one', 'data':'hello'}}
>> >> 
>> >> and simple unions which cannot be typesafe (the branches of the union
>> >> are open-coded - even if they correlate to an existing enum, there is
>> >> nothing enforcing that extensions to the enum be reflected into the union):
>> >> 
>> >> [2]
>> >> { 'union':'SimpleButOpenCoded',
>> >>   'data':{ 'one': 'str', 'two':'int' }}
>> >> 
>> >> {"execute":"foo", "arguments":{'type':'one', 'data':'hello'}}
>> >> 
>> >> I'm hoping to add as a followup series a variant of simple unions that
>> >> is type-safe:
>> >> 
>> >> [3]
>> >> { 'union':'SimpleAndSafe', 'discriminator':'MyEnum',
>> >>   'data':{ 'one':'str', 'two':'int' }}
>> >> 
>> >> {"execute":"foo", "arguments":{'type':'one', 'data':'hello'}}
>> >
>> > This would overload 'discriminator' with two different meanings:
>> >
>> > * In case [1] it refers to one key in the JSON object which contains the
>> >   name of the union branch to select. That is, it is the _name_ of the
>> >   key used as a discriminator.
>> >
>> > * In case [3], the 'type' key is used in the JSON object to select a
>> >   union branch. 'discriminator' describes the _valid values_ of it, i.e.
>> >   the branch names.
>> >
>> > We shouldn't mix these meanings. If you need [3], we could call it
>> > 'discriminator-type' or something like that. If both options are
>> > consistently used for simple and flat unions, you end up with these
>> > rules:
>> >
>> > * 'discriminator' is the key that is used to select the union branch.
>> >
>> >   - For flat unions it is required and must refer to an explicitly
>> >     declared key in the base type.
>> >
>> >   - For simple unions it is optional and defaults to 'type'. The key is
>> >     implicitly created in the union type.
>> >
>> > * 'discrimiator-type' describes the valid values of 'discriminator',
>> >   either by referring to an enum type or to 'str'.
>> >
>> >   - For flat unions, this must match the type of the explicit
>> >     declaration of the discriminator field. It is optional and defauls
>> >     to the only valid value.
>> >
>> >   - For simple unions it is optional, too, and defaults to 'str'.
>> >
>> >   - If it is the name of an enum type, that enum type is reused and the
>> >     declared union branches must match the valid values of the enum.
>> >
>> >   - If it is 'str', a new enum is generated, and all the declared union
>> >     branches are used as valid values.
>> >
>> > There's quite some duplication in it where we need to make sure that the
>> > schema matches in all places, but without an explicit declaration of the
>> > disriminator field in simple unions, this seems to be the best I can
>> > come up with.
>> >
>> >> But the existing, unused-except-in-testsuite, notion of a simple union
>> >> with a base class looks like:
>> >> 
>> >> [4]
>> >> { 'type':'Shared', 'data':{'common':'int'}}
>> >> { 'union':'SimpleWithBase', 'base':'Shared',
>> >>   'data':{ 'one':'str', 'two':'int' }}
>> >> 
>> >> {"execute":"foo", "arguments":{'common':1, 'type':'one', 'data':'hello'}}
>> >> 
>> >> If we were to allow the addition of 'discriminator':'EnumType' to a
>> >> simple union [3], but then add that discriminator to an existing case of
>> >> a simple union with base [4], it would look like:
>> >> 
>> >> { 'type':'Shared', 'data':{'common':'int'}}
>> >> { 'union':'SimpleWithBaseAndDiscriminator', 'base':'Shared',
>> >>   'discriminator':'MyEnum',
>> >>   'data':{ 'one':'str', 'two':'int' }}
>> >> 
>> >> Yuck.  That is indistinguishable from flat unions [1], except by whether
>> >> discriminator names an enum type or a member of the base class.
>> >
>> > Which could even be ambiguous, couldn't it?
>> >
>> >> > In particular, I can define simple unions in terms of flat ones by
>> >> > restricting all union cases to a single member named 'data'.  They're
>> >> > not implemented that way, but that's a historical accident.  Simple
>> >> > unions are a redundant concept.
>> >> 
>> >> Cool.  Or more concretely,
>> >> 
>> >> { 'union': 'Simple', 'data': { 'one': 'str', 'two': 'int' } }
>> >> 
>> >> is identical on the wire to:
>> >> 
>> >> { 'enum': 'MyEnum', 'data': ['one', 'two'] }
>> >> { 'type': 'Base', 'data': { 'type': 'MyEnum' } }
>> >> { 'type': 'Branch1', 'data': { 'data': 'str' } }
>> >> { 'type': 'Branch2', 'data': { 'data': 'int' } }
>> >> { 'union': 'Flat', 'base': 'Base', 'discriminator': 'type',
>> >>   'data': { 'one': 'Branch1', 'two': 'Branch2' } }
>> >
>> > Perhaps we should expose all unions for schema introspection in a way
>> > similar to this (possibly not splitting out Branch1 and Branch2 as
>> > independent types, though).
>> 
>> My current thinking on this is a bit more radical.  I suspect there's a
>> straightforward object type buried in this mess, struggling to get out:
>> the good old variant record.  It consists of
>> 
>> * a name
>> 
>> * an optional base type name (must be a object type without variants)
>> 
>> * list of own members (empty okay)
>> 
>>   Effective members = own members + effective base type members
>> 
>> * optionally variants
>> 
>>   - one of the effective members is the discriminator (must be enum)
>> 
>>   - for each discriminator value a list of variant members (empty okay)
>> 
>> All existing type/union types are specializations:
>> 
>> * The complex type (struct type) is an object type without variants.
>> 
>> * The simple union type is an object type
>> 
>>   - without a base type
>> 
>>   - with an implicitly defined own member of an implicitly defined
>>     enumeration type serving as discriminator
>> 
>>   - with no other own members
>> 
>>   - with a single variant member for each discriminator value
>> 
>> * The flat union type is an object type
>> 
>>   - with a base type
>> 
>>   - without own members
>> 
>>   - with a discriminator
>> 
>> I further suspect lowering these types to the general form will make the
>> generator simpler, not just the introspection.
>
> That seems to be essentially the --verbose version of what I said above,
> except that you also include simple structs. So yes, I think I agree.
>
> Or maybe I'm missing what else you think is different?

Lamport's quip "If you're thinking without writing, you only think
you're thinking" certainly applies to me.  And then I can just as well
send it out, to give you an opportunity to point out holes or
misunderstandings.

For actually coding something, I need to think --verbose.  Computers
have so far stubbornly refused all my attempts to make them DWIM %-}

>> >                             We would just have to give a name to
>> > implicitly generated enums and base types.
>> 
>> Introspection doesn't care whether we defined something implicitly or
>> explicitly.  Let's make up names to hide that.
>
> We just need to find a good way to make up names that stay stable and
> don't cause clashes. I'm afraid that unlike block device IDs, we might
> not have additional characters that could select a different namespace
> here?
>
> Not that this would be super hard, but if we need to use the same
> namespace for automatically generated names, we need to properly
> document which other identifiers are automatically used when you
> declare a type (and detect clashes in the generator, of course).

"[PATCH 21] qapi: Require valid names" makes it easy.  I agree that
introspection documentation should explain automatically generated
identifiers.

>> I'm trying to get a proof-of-concept introspection patch working this
>> week.  It'll probably be ugly enough to curdle the milk in your tea.
>
> Who put that milk into my tea?! I never do that! ;-)

Enjoy!
diff mbox

Patch

diff --git a/tests/Makefile b/tests/Makefile
index 68bc353..f8bc2a8 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -216,10 +216,16 @@  check-qapi-schema-y := $(addprefix tests/qapi-schema/, \
 	qapi-schema-test.json quoted-structural-chars.json \
 	trailing-comma-list.json trailing-comma-object.json \
 	unclosed-list.json unclosed-object.json unclosed-string.json \
-	duplicate-key.json union-invalid-base.json flat-union-no-base.json \
-	flat-union-invalid-discriminator.json \
+	duplicate-key.json union-invalid-base.json union-bad-branch.json \
+	union-optional-branch.json flat-union-optional-discriminator.json \
+	flat-union-no-base.json flat-union-invalid-discriminator.json \
 	flat-union-invalid-branch-key.json flat-union-reverse-define.json \
-	flat-union-string-discriminator.json \
+	flat-union-string-discriminator.json union-base-no-discriminator.json \
+	flat-union-bad-discriminator.json flat-union-bad-base.json \
+	flat-union-base-union.json union-unknown.json union-max.json \
+	alternate-nested.json alternate-unknown.json alternate-clash.json \
+	alternate-good.json alternate-base.json alternate-array.json \
+	alternate-conflict-string.json alternate-conflict-dict.json \
 	include-simple.json include-relpath.json include-format-err.json \
 	include-non-file.json include-no-file.json include-before-err.json \
 	include-nested-err.json include-self-cycle.json include-cycle.json \
diff --git a/tests/qapi-schema/alternate-array.err b/tests/qapi-schema/alternate-array.err
new file mode 100644
index 0000000..e69de29
diff --git a/tests/qapi-schema/alternate-array.exit b/tests/qapi-schema/alternate-array.exit
new file mode 100644
index 0000000..573541a
--- /dev/null
+++ b/tests/qapi-schema/alternate-array.exit
@@ -0,0 +1 @@ 
+0
diff --git a/tests/qapi-schema/alternate-array.json b/tests/qapi-schema/alternate-array.json
new file mode 100644
index 0000000..b399500
--- /dev/null
+++ b/tests/qapi-schema/alternate-array.json
@@ -0,0 +1,8 @@ 
+# FIXME: we do not support array branches of anonymous unions yet
+# FIXME: this should fail as long as we lack support
+{ 'type': 'One',
+  'data': { 'name': 'str' } }
+{ 'union': 'MyUnion',
+  'discriminator': {},
+  'data': { 'one': 'One',
+	    'two': [ 'int' ] } }
diff --git a/tests/qapi-schema/alternate-array.out b/tests/qapi-schema/alternate-array.out
new file mode 100644
index 0000000..90dc22c
--- /dev/null
+++ b/tests/qapi-schema/alternate-array.out
@@ -0,0 +1,4 @@ 
+[OrderedDict([('type', 'One'), ('data', OrderedDict([('name', 'str')]))]),
+ OrderedDict([('union', 'MyUnion'), ('discriminator', OrderedDict()), ('data', OrderedDict([('one', 'One'), ('two', ['int'])]))])]
+[{'enum_name': 'MyUnionKind', 'enum_values': None}]
+[OrderedDict([('type', 'One'), ('data', OrderedDict([('name', 'str')]))])]
diff --git a/tests/qapi-schema/alternate-base.err b/tests/qapi-schema/alternate-base.err
new file mode 100644
index 0000000..e69de29
diff --git a/tests/qapi-schema/alternate-base.exit b/tests/qapi-schema/alternate-base.exit
new file mode 100644
index 0000000..573541a
--- /dev/null
+++ b/tests/qapi-schema/alternate-base.exit
@@ -0,0 +1 @@ 
+0
diff --git a/tests/qapi-schema/alternate-base.json b/tests/qapi-schema/alternate-base.json
new file mode 100644
index 0000000..2d36db1
--- /dev/null
+++ b/tests/qapi-schema/alternate-base.json
@@ -0,0 +1,7 @@ 
+# FIXME: we should reject anonymous union with base type
+{ 'type': 'Base',
+  'data': { 'string': 'str' } }
+{ 'union': 'MyUnion',
+  'base': 'Base',
+  'discriminator': {},
+  'data': { 'number': 'int' } }
diff --git a/tests/qapi-schema/alternate-base.out b/tests/qapi-schema/alternate-base.out
new file mode 100644
index 0000000..7fb31f5
--- /dev/null
+++ b/tests/qapi-schema/alternate-base.out
@@ -0,0 +1,4 @@ 
+[OrderedDict([('type', 'Base'), ('data', OrderedDict([('string', 'str')]))]),
+ OrderedDict([('union', 'MyUnion'), ('base', 'Base'), ('discriminator', OrderedDict()), ('data', OrderedDict([('number', 'int')]))])]
+[{'enum_name': 'MyUnionKind', 'enum_values': None}]
+[OrderedDict([('type', 'Base'), ('data', OrderedDict([('string', 'str')]))])]
diff --git a/tests/qapi-schema/alternate-clash.err b/tests/qapi-schema/alternate-clash.err
new file mode 100644
index 0000000..e69de29
diff --git a/tests/qapi-schema/alternate-clash.exit b/tests/qapi-schema/alternate-clash.exit
new file mode 100644
index 0000000..573541a
--- /dev/null
+++ b/tests/qapi-schema/alternate-clash.exit
@@ -0,0 +1 @@ 
+0
diff --git a/tests/qapi-schema/alternate-clash.json b/tests/qapi-schema/alternate-clash.json
new file mode 100644
index 0000000..7e2ef23
--- /dev/null
+++ b/tests/qapi-schema/alternate-clash.json
@@ -0,0 +1,4 @@ 
+# FIXME: we should detect C enum collisions in an anonymous union
+{ 'union': 'Union1',
+  'discriminator': {},
+  'data': { 'one': 'str', 'ONE': 'int' } }
diff --git a/tests/qapi-schema/alternate-clash.out b/tests/qapi-schema/alternate-clash.out
new file mode 100644
index 0000000..c6687fa
--- /dev/null
+++ b/tests/qapi-schema/alternate-clash.out
@@ -0,0 +1,3 @@ 
+[OrderedDict([('union', 'Union1'), ('discriminator', OrderedDict()), ('data', OrderedDict([('one', 'str'), ('ONE', 'int')]))])]
+[{'enum_name': 'Union1Kind', 'enum_values': None}]
+[]
diff --git a/tests/qapi-schema/alternate-conflict-dict.err b/tests/qapi-schema/alternate-conflict-dict.err
new file mode 100644
index 0000000..e69de29
diff --git a/tests/qapi-schema/alternate-conflict-dict.exit b/tests/qapi-schema/alternate-conflict-dict.exit
new file mode 100644
index 0000000..573541a
--- /dev/null
+++ b/tests/qapi-schema/alternate-conflict-dict.exit
@@ -0,0 +1 @@ 
+0
diff --git a/tests/qapi-schema/alternate-conflict-dict.json b/tests/qapi-schema/alternate-conflict-dict.json
new file mode 100644
index 0000000..70fe089
--- /dev/null
+++ b/tests/qapi-schema/alternate-conflict-dict.json
@@ -0,0 +1,9 @@ 
+# FIXME: we should reject anonymous unions with multiple object branches
+{ 'type': 'One',
+  'data': { 'name': 'str' } }
+{ 'type': 'Two',
+  'data': { 'value': 'int' } }
+{ 'union': 'MyUnion',
+  'discriminator': {},
+  'data': { 'one': 'One',
+	    'two': 'Two' } }
diff --git a/tests/qapi-schema/alternate-conflict-dict.out b/tests/qapi-schema/alternate-conflict-dict.out
new file mode 100644
index 0000000..b9ac945
--- /dev/null
+++ b/tests/qapi-schema/alternate-conflict-dict.out
@@ -0,0 +1,6 @@ 
+[OrderedDict([('type', 'One'), ('data', OrderedDict([('name', 'str')]))]),
+ OrderedDict([('type', 'Two'), ('data', OrderedDict([('value', 'int')]))]),
+ OrderedDict([('union', 'MyUnion'), ('discriminator', OrderedDict()), ('data', OrderedDict([('one', 'One'), ('two', 'Two')]))])]
+[{'enum_name': 'MyUnionKind', 'enum_values': None}]
+[OrderedDict([('type', 'One'), ('data', OrderedDict([('name', 'str')]))]),
+ OrderedDict([('type', 'Two'), ('data', OrderedDict([('value', 'int')]))])]
diff --git a/tests/qapi-schema/alternate-conflict-string.err b/tests/qapi-schema/alternate-conflict-string.err
new file mode 100644
index 0000000..e69de29
diff --git a/tests/qapi-schema/alternate-conflict-string.exit b/tests/qapi-schema/alternate-conflict-string.exit
new file mode 100644
index 0000000..573541a
--- /dev/null
+++ b/tests/qapi-schema/alternate-conflict-string.exit
@@ -0,0 +1 @@ 
+0
diff --git a/tests/qapi-schema/alternate-conflict-string.json b/tests/qapi-schema/alternate-conflict-string.json
new file mode 100644
index 0000000..5fd1a47
--- /dev/null
+++ b/tests/qapi-schema/alternate-conflict-string.json
@@ -0,0 +1,8 @@ 
+# FIXME: we should reject anonymous unions with multiple string-like branches
+{ 'enum': 'Enum',
+  'data': [ 'hello', 'world' ] }
+{ 'union': 'MyUnion',
+  'discriminator': {},
+  'data': { 'one': 'str',
+	    'two': 'Enum' } }
+
diff --git a/tests/qapi-schema/alternate-conflict-string.out b/tests/qapi-schema/alternate-conflict-string.out
new file mode 100644
index 0000000..e7b39a2
--- /dev/null
+++ b/tests/qapi-schema/alternate-conflict-string.out
@@ -0,0 +1,5 @@ 
+[OrderedDict([('enum', 'Enum'), ('data', ['hello', 'world'])]),
+ OrderedDict([('union', 'MyUnion'), ('discriminator', OrderedDict()), ('data', OrderedDict([('one', 'str'), ('two', 'Enum')]))])]
+[{'enum_name': 'Enum', 'enum_values': ['hello', 'world']},
+ {'enum_name': 'MyUnionKind', 'enum_values': None}]
+[]
diff --git a/tests/qapi-schema/alternate-good.err b/tests/qapi-schema/alternate-good.err
new file mode 100644
index 0000000..e69de29
diff --git a/tests/qapi-schema/alternate-good.exit b/tests/qapi-schema/alternate-good.exit
new file mode 100644
index 0000000..573541a
--- /dev/null
+++ b/tests/qapi-schema/alternate-good.exit
@@ -0,0 +1 @@ 
+0
diff --git a/tests/qapi-schema/alternate-good.json b/tests/qapi-schema/alternate-good.json
new file mode 100644
index 0000000..1068e2f
--- /dev/null
+++ b/tests/qapi-schema/alternate-good.json
@@ -0,0 +1,10 @@ 
+# Working example of anonymous union
+{ 'type': 'Data',
+  'data': { '*number': 'int', '*name': 'str' } }
+{ 'enum': 'Enum',
+  'data': [ 'hello', 'world' ] }
+{ 'union': 'MyUnion',
+  'discriminator': {},
+  'data': { 'value': 'int',
+	    'string': 'Enum',
+	    'struct': 'Data' } }
diff --git a/tests/qapi-schema/alternate-good.out b/tests/qapi-schema/alternate-good.out
new file mode 100644
index 0000000..b5117d1
--- /dev/null
+++ b/tests/qapi-schema/alternate-good.out
@@ -0,0 +1,6 @@ 
+[OrderedDict([('type', 'Data'), ('data', OrderedDict([('*number', 'int'), ('*name', 'str')]))]),
+ OrderedDict([('enum', 'Enum'), ('data', ['hello', 'world'])]),
+ OrderedDict([('union', 'MyUnion'), ('discriminator', OrderedDict()), ('data', OrderedDict([('value', 'int'), ('string', 'Enum'), ('struct', 'Data')]))])]
+[{'enum_name': 'Enum', 'enum_values': ['hello', 'world']},
+ {'enum_name': 'MyUnionKind', 'enum_values': None}]
+[OrderedDict([('type', 'Data'), ('data', OrderedDict([('*number', 'int'), ('*name', 'str')]))])]
diff --git a/tests/qapi-schema/alternate-nested.err b/tests/qapi-schema/alternate-nested.err
new file mode 100644
index 0000000..e69de29
diff --git a/tests/qapi-schema/alternate-nested.exit b/tests/qapi-schema/alternate-nested.exit
new file mode 100644
index 0000000..573541a
--- /dev/null
+++ b/tests/qapi-schema/alternate-nested.exit
@@ -0,0 +1 @@ 
+0
diff --git a/tests/qapi-schema/alternate-nested.json b/tests/qapi-schema/alternate-nested.json
new file mode 100644
index 0000000..d5812bf
--- /dev/null
+++ b/tests/qapi-schema/alternate-nested.json
@@ -0,0 +1,7 @@ 
+# FIXME: we should reject a nested anonymous union branch
+{ 'union': 'Union1',
+  'discriminator': {},
+  'data': { 'name': 'str', 'value': 'int' } }
+{ 'union': 'Union2',
+  'discriminator': {},
+  'data': { 'nested': 'Union1' } }
diff --git a/tests/qapi-schema/alternate-nested.out b/tests/qapi-schema/alternate-nested.out
new file mode 100644
index 0000000..0137c1f
--- /dev/null
+++ b/tests/qapi-schema/alternate-nested.out
@@ -0,0 +1,5 @@ 
+[OrderedDict([('union', 'Union1'), ('discriminator', OrderedDict()), ('data', OrderedDict([('name', 'str'), ('value', 'int')]))]),
+ OrderedDict([('union', 'Union2'), ('discriminator', OrderedDict()), ('data', OrderedDict([('nested', 'Union1')]))])]
+[{'enum_name': 'Union1Kind', 'enum_values': None},
+ {'enum_name': 'Union2Kind', 'enum_values': None}]
+[]
diff --git a/tests/qapi-schema/alternate-unknown.err b/tests/qapi-schema/alternate-unknown.err
new file mode 100644
index 0000000..e69de29
diff --git a/tests/qapi-schema/alternate-unknown.exit b/tests/qapi-schema/alternate-unknown.exit
new file mode 100644
index 0000000..573541a
--- /dev/null
+++ b/tests/qapi-schema/alternate-unknown.exit
@@ -0,0 +1 @@ 
+0
diff --git a/tests/qapi-schema/alternate-unknown.json b/tests/qapi-schema/alternate-unknown.json
new file mode 100644
index 0000000..0bab9c2
--- /dev/null
+++ b/tests/qapi-schema/alternate-unknown.json
@@ -0,0 +1,4 @@ 
+# FIXME: we should reject an anonymous union with unknown type in branch
+{ 'union': 'Union',
+  'discriminator': {},
+  'data': { 'unknown': 'MissingType' } }
diff --git a/tests/qapi-schema/alternate-unknown.out b/tests/qapi-schema/alternate-unknown.out
new file mode 100644
index 0000000..0911cdc
--- /dev/null
+++ b/tests/qapi-schema/alternate-unknown.out
@@ -0,0 +1,3 @@ 
+[OrderedDict([('union', 'Union'), ('discriminator', OrderedDict()), ('data', OrderedDict([('unknown', 'MissingType')]))])]
+[{'enum_name': 'UnionKind', 'enum_values': None}]
+[]
diff --git a/tests/qapi-schema/flat-union-bad-base.err b/tests/qapi-schema/flat-union-bad-base.err
new file mode 100644
index 0000000..5962ff4
--- /dev/null
+++ b/tests/qapi-schema/flat-union-bad-base.err
@@ -0,0 +1 @@ 
+tests/qapi-schema/flat-union-bad-base.json:9: Base 'OrderedDict([('enum1', 'TestEnum'), ('kind', 'str')])' is not a valid type
diff --git a/tests/qapi-schema/flat-union-bad-base.exit b/tests/qapi-schema/flat-union-bad-base.exit
new file mode 100644
index 0000000..d00491f
--- /dev/null
+++ b/tests/qapi-schema/flat-union-bad-base.exit
@@ -0,0 +1 @@ 
+1
diff --git a/tests/qapi-schema/flat-union-bad-base.json b/tests/qapi-schema/flat-union-bad-base.json
new file mode 100644
index 0000000..d69168f
--- /dev/null
+++ b/tests/qapi-schema/flat-union-bad-base.json
@@ -0,0 +1,13 @@ 
+# we require the base to be an existing complex type
+# FIXME: should we allow an anonymous inline base type?
+{ 'enum': 'TestEnum',
+  'data': [ 'value1', 'value2' ] }
+{ 'type': 'TestTypeA',
+  'data': { 'string': 'str' } }
+{ 'type': 'TestTypeB',
+  'data': { 'integer': 'int' } }
+{ 'union': 'TestUnion',
+  'base': { 'enum1': 'TestEnum', 'kind': 'str' },
+  'discriminator': 'TestEnum',
+  'data': { 'kind1': 'TestTypeA',
+            'kind2': 'TestTypeB' } }
diff --git a/tests/qapi-schema/flat-union-bad-base.out b/tests/qapi-schema/flat-union-bad-base.out
new file mode 100644
index 0000000..e69de29
diff --git a/tests/qapi-schema/flat-union-bad-discriminator.err b/tests/qapi-schema/flat-union-bad-discriminator.err
new file mode 100644
index 0000000..e69de29
diff --git a/tests/qapi-schema/flat-union-bad-discriminator.exit b/tests/qapi-schema/flat-union-bad-discriminator.exit
new file mode 100644
index 0000000..573541a
--- /dev/null
+++ b/tests/qapi-schema/flat-union-bad-discriminator.exit
@@ -0,0 +1 @@ 
+0
diff --git a/tests/qapi-schema/flat-union-bad-discriminator.json b/tests/qapi-schema/flat-union-bad-discriminator.json
new file mode 100644
index 0000000..1599a59
--- /dev/null
+++ b/tests/qapi-schema/flat-union-bad-discriminator.json
@@ -0,0 +1,14 @@ 
+# FIXME: we should require the discriminator to be a string naming a base-type member
+{ 'enum': 'TestEnum',
+  'data': [ 'value1', 'value2' ] }
+{ 'type': 'TestBase',
+  'data': { 'enum1': 'TestEnum', 'kind': 'str' } }
+{ 'type': 'TestTypeA',
+  'data': { 'string': 'str' } }
+{ 'type': 'TestTypeB',
+  'data': { 'integer': 'int' } }
+{ 'union': 'TestUnion',
+  'base': 'TestBase',
+  'discriminator': [],
+  'data': { 'kind1': 'TestTypeA',
+            'kind2': 'TestTypeB' } }
diff --git a/tests/qapi-schema/flat-union-bad-discriminator.out b/tests/qapi-schema/flat-union-bad-discriminator.out
new file mode 100644
index 0000000..b6ce217
--- /dev/null
+++ b/tests/qapi-schema/flat-union-bad-discriminator.out
@@ -0,0 +1,10 @@ 
+[OrderedDict([('enum', 'TestEnum'), ('data', ['value1', 'value2'])]),
+ OrderedDict([('type', 'TestBase'), ('data', OrderedDict([('enum1', 'TestEnum'), ('kind', 'str')]))]),
+ OrderedDict([('type', 'TestTypeA'), ('data', OrderedDict([('string', 'str')]))]),
+ OrderedDict([('type', 'TestTypeB'), ('data', OrderedDict([('integer', 'int')]))]),
+ OrderedDict([('union', 'TestUnion'), ('base', 'TestBase'), ('discriminator', []), ('data', OrderedDict([('kind1', 'TestTypeA'), ('kind2', 'TestTypeB')]))])]
+[{'enum_name': 'TestEnum', 'enum_values': ['value1', 'value2']},
+ {'enum_name': 'TestUnionKind', 'enum_values': None}]
+[OrderedDict([('type', 'TestBase'), ('data', OrderedDict([('enum1', 'TestEnum'), ('kind', 'str')]))]),
+ OrderedDict([('type', 'TestTypeA'), ('data', OrderedDict([('string', 'str')]))]),
+ OrderedDict([('type', 'TestTypeB'), ('data', OrderedDict([('integer', 'int')]))])]
diff --git a/tests/qapi-schema/flat-union-base-union.err b/tests/qapi-schema/flat-union-base-union.err
new file mode 100644
index 0000000..185bf51
--- /dev/null
+++ b/tests/qapi-schema/flat-union-base-union.err
@@ -0,0 +1 @@ 
+tests/qapi-schema/flat-union-base-union.json:11: Base 'UnionBase' is not a valid type
diff --git a/tests/qapi-schema/flat-union-base-union.exit b/tests/qapi-schema/flat-union-base-union.exit
new file mode 100644
index 0000000..d00491f
--- /dev/null
+++ b/tests/qapi-schema/flat-union-base-union.exit
@@ -0,0 +1 @@ 
+1
diff --git a/tests/qapi-schema/flat-union-base-union.json b/tests/qapi-schema/flat-union-base-union.json
new file mode 100644
index 0000000..bbaa2da
--- /dev/null
+++ b/tests/qapi-schema/flat-union-base-union.json
@@ -0,0 +1,15 @@ 
+# FIXME: the error message needs help: we require the base to be a struct
+{ 'enum': 'TestEnum',
+  'data': [ 'value1', 'value2' ] }
+{ 'type': 'TestTypeA',
+  'data': { 'string': 'str' } }
+{ 'type': 'TestTypeB',
+  'data': { 'integer': 'int' } }
+{ 'union': 'UnionBase',
+  'data': { 'kind1': 'TestTypeA',
+	    'kind2': 'TestTypeB' } }
+{ 'union': 'TestUnion',
+  'base': 'UnionBase',
+  'discriminator': 'type',
+  'data': { 'kind1': 'TestTypeA',
+            'kind2': 'TestTypeB' } }
diff --git a/tests/qapi-schema/flat-union-base-union.out b/tests/qapi-schema/flat-union-base-union.out
new file mode 100644
index 0000000..e69de29
diff --git a/tests/qapi-schema/flat-union-no-base.err b/tests/qapi-schema/flat-union-no-base.err
index a59749e..eaf3592 100644
--- a/tests/qapi-schema/flat-union-no-base.err
+++ b/tests/qapi-schema/flat-union-no-base.err
@@ -1 +1 @@ 
-tests/qapi-schema/flat-union-no-base.json:7: Flat union 'TestUnion' must have a base field
+tests/qapi-schema/flat-union-no-base.json:8: Flat union 'TestUnion' must have a base field
diff --git a/tests/qapi-schema/flat-union-no-base.json b/tests/qapi-schema/flat-union-no-base.json
index 50f2673..6d8dc7f 100644
--- a/tests/qapi-schema/flat-union-no-base.json
+++ b/tests/qapi-schema/flat-union-no-base.json
@@ -1,10 +1,11 @@ 
+# FIXME: we should allow discriminator:type without base on non-flat unions
 { 'type': 'TestTypeA',
   'data': { 'string': 'str' } }
-
 { 'type': 'TestTypeB',
   'data': { 'integer': 'int' } }
-
+{ 'enum': 'Enum',
+  'data': [ 'value1', 'value2' ] }
 { 'union': 'TestUnion',
-  'discriminator': 'enum1',
+  'discriminator': 'Enum',
   'data': { 'value1': 'TestTypeA',
             'value2': 'TestTypeB' } }
diff --git a/tests/qapi-schema/flat-union-optional-discriminator.err b/tests/qapi-schema/flat-union-optional-discriminator.err
new file mode 100644
index 0000000..e69de29
diff --git a/tests/qapi-schema/flat-union-optional-discriminator.exit b/tests/qapi-schema/flat-union-optional-discriminator.exit
new file mode 100644
index 0000000..573541a
--- /dev/null
+++ b/tests/qapi-schema/flat-union-optional-discriminator.exit
@@ -0,0 +1 @@ 
+0
diff --git a/tests/qapi-schema/flat-union-optional-discriminator.json b/tests/qapi-schema/flat-union-optional-discriminator.json
new file mode 100644
index 0000000..4957c72
--- /dev/null
+++ b/tests/qapi-schema/flat-union-optional-discriminator.json
@@ -0,0 +1,9 @@ 
+# FIXME: we should require the discriminator to be non-optional
+{ 'enum': 'Enum', 'data': [ 'one', 'two' ] }
+{ 'type': 'Base',
+  'data': { '*switch': 'Enum' } }
+{ 'union': 'MyUnion',
+  'base': 'Base',
+  'discriminator': '*switch',
+  'data': { 'one': 'int',
+	    'two': 'str' } }
diff --git a/tests/qapi-schema/flat-union-optional-discriminator.out b/tests/qapi-schema/flat-union-optional-discriminator.out
new file mode 100644
index 0000000..f4b6bed
--- /dev/null
+++ b/tests/qapi-schema/flat-union-optional-discriminator.out
@@ -0,0 +1,5 @@ 
+[OrderedDict([('enum', 'Enum'), ('data', ['one', 'two'])]),
+ OrderedDict([('type', 'Base'), ('data', OrderedDict([('*switch', 'Enum')]))]),
+ OrderedDict([('union', 'MyUnion'), ('base', 'Base'), ('discriminator', '*switch'), ('data', OrderedDict([('one', 'int'), ('two', 'str')]))])]
+[{'enum_name': 'Enum', 'enum_values': ['one', 'two']}]
+[OrderedDict([('type', 'Base'), ('data', OrderedDict([('*switch', 'Enum')]))])]
diff --git a/tests/qapi-schema/union-bad-branch.err b/tests/qapi-schema/union-bad-branch.err
new file mode 100644
index 0000000..e69de29
diff --git a/tests/qapi-schema/union-bad-branch.exit b/tests/qapi-schema/union-bad-branch.exit
new file mode 100644
index 0000000..573541a
--- /dev/null
+++ b/tests/qapi-schema/union-bad-branch.exit
@@ -0,0 +1 @@ 
+0
diff --git a/tests/qapi-schema/union-bad-branch.json b/tests/qapi-schema/union-bad-branch.json
new file mode 100644
index 0000000..f7aeae8
--- /dev/null
+++ b/tests/qapi-schema/union-bad-branch.json
@@ -0,0 +1,8 @@ 
+# FIXME: we should reject normal unions where branches would collide in C
+{ 'type': 'One',
+  'data': { 'string': 'str' } }
+{ 'type': 'Two',
+  'data': { 'number': 'int' } }
+{ 'union': 'MyUnion',
+  'data': { 'one': 'One',
+	    'ONE': 'Two' } }
diff --git a/tests/qapi-schema/union-bad-branch.out b/tests/qapi-schema/union-bad-branch.out
new file mode 100644
index 0000000..6baf01b
--- /dev/null
+++ b/tests/qapi-schema/union-bad-branch.out
@@ -0,0 +1,6 @@ 
+[OrderedDict([('type', 'One'), ('data', OrderedDict([('string', 'str')]))]),
+ OrderedDict([('type', 'Two'), ('data', OrderedDict([('number', 'int')]))]),
+ OrderedDict([('union', 'MyUnion'), ('data', OrderedDict([('one', 'One'), ('ONE', 'Two')]))])]
+[{'enum_name': 'MyUnionKind', 'enum_values': None}]
+[OrderedDict([('type', 'One'), ('data', OrderedDict([('string', 'str')]))]),
+ OrderedDict([('type', 'Two'), ('data', OrderedDict([('number', 'int')]))])]
diff --git a/tests/qapi-schema/union-base-no-discriminator.err b/tests/qapi-schema/union-base-no-discriminator.err
new file mode 100644
index 0000000..e69de29
diff --git a/tests/qapi-schema/union-base-no-discriminator.exit b/tests/qapi-schema/union-base-no-discriminator.exit
new file mode 100644
index 0000000..573541a
--- /dev/null
+++ b/tests/qapi-schema/union-base-no-discriminator.exit
@@ -0,0 +1 @@ 
+0
diff --git a/tests/qapi-schema/union-base-no-discriminator.json b/tests/qapi-schema/union-base-no-discriminator.json
new file mode 100644
index 0000000..b5da546
--- /dev/null
+++ b/tests/qapi-schema/union-base-no-discriminator.json
@@ -0,0 +1,14 @@ 
+# FIXME: either allow base in non-flat unions, or diagnose missing discriminator
+{ 'type': 'TestTypeA',
+  'data': { 'string': 'str' } }
+
+{ 'type': 'TestTypeB',
+  'data': { 'integer': 'int' } }
+
+{ 'type': 'Base',
+  'data': { 'string': 'str' } }
+
+{ 'union': 'TestUnion',
+  'base': 'Base',
+  'data': { 'value1': 'TestTypeA',
+            'value2': 'TestTypeB' } }
diff --git a/tests/qapi-schema/union-base-no-discriminator.out b/tests/qapi-schema/union-base-no-discriminator.out
new file mode 100644
index 0000000..505fd57
--- /dev/null
+++ b/tests/qapi-schema/union-base-no-discriminator.out
@@ -0,0 +1,8 @@ 
+[OrderedDict([('type', 'TestTypeA'), ('data', OrderedDict([('string', 'str')]))]),
+ OrderedDict([('type', 'TestTypeB'), ('data', OrderedDict([('integer', 'int')]))]),
+ OrderedDict([('type', 'Base'), ('data', OrderedDict([('string', 'str')]))]),
+ OrderedDict([('union', 'TestUnion'), ('base', 'Base'), ('data', OrderedDict([('value1', 'TestTypeA'), ('value2', 'TestTypeB')]))])]
+[{'enum_name': 'TestUnionKind', 'enum_values': None}]
+[OrderedDict([('type', 'TestTypeA'), ('data', OrderedDict([('string', 'str')]))]),
+ OrderedDict([('type', 'TestTypeB'), ('data', OrderedDict([('integer', 'int')]))]),
+ OrderedDict([('type', 'Base'), ('data', OrderedDict([('string', 'str')]))])]
diff --git a/tests/qapi-schema/union-invalid-base.err b/tests/qapi-schema/union-invalid-base.err
index 938f969..3cc82c0 100644
--- a/tests/qapi-schema/union-invalid-base.err
+++ b/tests/qapi-schema/union-invalid-base.err
@@ -1 +1 @@ 
-tests/qapi-schema/union-invalid-base.json:7: Base 'TestBaseWrong' is not a valid type
+tests/qapi-schema/union-invalid-base.json:8: Base 'int' is not a valid type
diff --git a/tests/qapi-schema/union-invalid-base.json b/tests/qapi-schema/union-invalid-base.json
index 1fa4930..bc5dc8d 100644
--- a/tests/qapi-schema/union-invalid-base.json
+++ b/tests/qapi-schema/union-invalid-base.json
@@ -1,3 +1,4 @@ 
+# a union base type must be a struct
 { 'type': 'TestTypeA',
   'data': { 'string': 'str' } }

@@ -5,6 +6,7 @@ 
   'data': { 'integer': 'int' } }

 { 'union': 'TestUnion',
-  'base': 'TestBaseWrong',
+  'base': 'int',
+  'discriminator': 'int',
   'data': { 'value1': 'TestTypeA',
             'value2': 'TestTypeB' } }
diff --git a/tests/qapi-schema/union-max.err b/tests/qapi-schema/union-max.err
new file mode 100644
index 0000000..e69de29
diff --git a/tests/qapi-schema/union-max.exit b/tests/qapi-schema/union-max.exit
new file mode 100644
index 0000000..573541a
--- /dev/null
+++ b/tests/qapi-schema/union-max.exit
@@ -0,0 +1 @@ 
+0
diff --git a/tests/qapi-schema/union-max.json b/tests/qapi-schema/union-max.json
new file mode 100644
index 0000000..45648c4
--- /dev/null
+++ b/tests/qapi-schema/union-max.json
@@ -0,0 +1,3 @@ 
+# FIXME: we should reject 'max' branch in a union, for collision with C enum
+{ 'union': 'Union',
+  'data': { 'max': 'int' } }
diff --git a/tests/qapi-schema/union-max.out b/tests/qapi-schema/union-max.out
new file mode 100644
index 0000000..2757d36
--- /dev/null
+++ b/tests/qapi-schema/union-max.out
@@ -0,0 +1,3 @@ 
+[OrderedDict([('union', 'Union'), ('data', OrderedDict([('max', 'int')]))])]
+[{'enum_name': 'UnionKind', 'enum_values': None}]
+[]
diff --git a/tests/qapi-schema/union-optional-branch.err b/tests/qapi-schema/union-optional-branch.err
new file mode 100644
index 0000000..e69de29
diff --git a/tests/qapi-schema/union-optional-branch.exit b/tests/qapi-schema/union-optional-branch.exit
new file mode 100644
index 0000000..573541a
--- /dev/null
+++ b/tests/qapi-schema/union-optional-branch.exit
@@ -0,0 +1 @@ 
+0
diff --git a/tests/qapi-schema/union-optional-branch.json b/tests/qapi-schema/union-optional-branch.json
new file mode 100644
index 0000000..c513db7
--- /dev/null
+++ b/tests/qapi-schema/union-optional-branch.json
@@ -0,0 +1,2 @@ 
+# FIXME: union branches cannot be optional
+{ 'union': 'Union', 'data': { '*a': 'int', 'b': 'str' } }
diff --git a/tests/qapi-schema/union-optional-branch.out b/tests/qapi-schema/union-optional-branch.out
new file mode 100644
index 0000000..b03b5d2
--- /dev/null
+++ b/tests/qapi-schema/union-optional-branch.out
@@ -0,0 +1,3 @@ 
+[OrderedDict([('union', 'Union'), ('data', OrderedDict([('*a', 'int'), ('b', 'str')]))])]
+[{'enum_name': 'UnionKind', 'enum_values': None}]
+[]
diff --git a/tests/qapi-schema/union-unknown.err b/tests/qapi-schema/union-unknown.err
new file mode 100644
index 0000000..e69de29
diff --git a/tests/qapi-schema/union-unknown.exit b/tests/qapi-schema/union-unknown.exit
new file mode 100644
index 0000000..573541a
--- /dev/null
+++ b/tests/qapi-schema/union-unknown.exit
@@ -0,0 +1 @@ 
+0
diff --git a/tests/qapi-schema/union-unknown.json b/tests/qapi-schema/union-unknown.json
new file mode 100644
index 0000000..258f1d3
--- /dev/null
+++ b/tests/qapi-schema/union-unknown.json
@@ -0,0 +1,3 @@ 
+# FIXME: we should reject a union with unknown type in branch
+{ 'union': 'Union',
+  'data': { 'unknown': 'MissingType' } }
diff --git a/tests/qapi-schema/union-unknown.out b/tests/qapi-schema/union-unknown.out
new file mode 100644
index 0000000..8223dcf
--- /dev/null
+++ b/tests/qapi-schema/union-unknown.out
@@ -0,0 +1,3 @@ 
+[OrderedDict([('union', 'Union'), ('data', OrderedDict([('unknown', 'MissingType')]))])]
+[{'enum_name': 'UnionKind', 'enum_values': None}]
+[]